[jira] [Work logged] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?focusedWorklogId=652538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652538
 ]

ASF GitHub Bot logged work on HIVE-25535:
-

Author: ASF GitHub Bot
Created on: 18/Sep/21 04:58
Start Date: 18/Sep/21 04:58
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on pull request #2651:
URL: https://github.com/apache/hive/pull/2651#issuecomment-922184779


   @deniskuzZ @mattmccline-microsoft @sankarh Could you guys please review the 
PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652538)
Time Spent: 20m  (was: 10m)

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artefacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');
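For concreteness, here is a minimal usage sketch of the proposed property. Only 
the "NO_CLEANUP" key comes from this ticket; the table name and the quoted 
lowercase values are illustrative assumptions:

{code:sql}
-- Ask the cleaner to leave this table's obsolete directories alone while
-- external readers (e.g. spark-acid) are active.
ALTER TABLE acid_tbl SET TBLPROPERTIES ('NO_CLEANUP'='true');

-- Re-enable automatic cleanup once the external readers are done.
ALTER TABLE acid_tbl SET TBLPROPERTIES ('NO_CLEANUP'='false');
{code}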



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17417017#comment-17417017
 ] 

Ashish Sharma commented on HIVE-25535:
--

[~dkuzmenko] I agree with you. Now that the compactor runs in a transaction, 
problems like FileNotFound will not occur. This config is intended more for 
users on HDP-3.1 and lower versions, where the lock-based Cleaner is still 
running. Backporting the transactional compactor is not straightforward, as it 
requires a metastore schema change. 

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artefacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25343) Create or replace view should clean the old table properties

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25343?focusedWorklogId=652509&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652509
 ]

ASF GitHub Bot logged work on HIVE-25343:
-

Author: ASF GitHub Bot
Created on: 18/Sep/21 00:09
Start Date: 18/Sep/21 00:09
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2492:
URL: https://github.com/apache/hive/pull/2492#issuecomment-922142334


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652509)
Time Spent: 0.5h  (was: 20m)

> Create or replace view should clean the old table properties
> 
>
> Key: HIVE-25343
> URL: https://issues.apache.org/jira/browse/HIVE-25343
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.3, 3.2.0
>Reporter: Lantao Jin
>Assignee: Lantao Jin
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-07-19 at 15.36.29.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In many cases, users use Spark and Hive together. When a user creates a view 
> via Spark, the view's output columns are stored in table properties, as shown 
> in !Screen Shot 2021-07-19 at 15.36.29.png|width=80%!
> After that, if the user runs "create or replace view" via Hive to change the 
> schema, the old table properties added by Spark are not cleaned up by Hive. 
> When users then read the view via Spark, the schema appears unchanged, which 
> is very confusing.
> How to reproduce:
> {code}
> spark-sql> create table lajin_table (a int, b int) stored as parquet;
> spark-sql> create view lajin_view as select * from lajin_table;
> spark-sql> desc lajin_view;
> a   int NULL    NULL
> b   int NULL    NULL
> hive> desc lajin_view;
> a   int 
> b   int
> hive> create or replace view lajin_view as select a, b, 3 as c from 
> lajin_table;
> hive> desc lajin_view;
> a   int 
> b   int 
> c   int
> spark-sql> desc lajin_view; -- not changed
> a   int NULL    NULL
> b   int NULL    NULL
> {code}
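Until Hive cleans these properties on replace, a hedged workaround sketch 
(assuming Spark rewrites its own schema properties when it performs the 
replace) is to re-issue the statement from the Spark side:

{code:sql}
-- Re-running the replace via spark-sql refreshes the Spark-written table
-- properties, so Spark's DESC output matches Hive's again.
spark-sql> CREATE OR REPLACE VIEW lajin_view AS SELECT a, b, 3 AS c FROM lajin_table;
spark-sql> DESC lajin_view; -- now shows columns a, b, c
{code}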



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25270) To create external table without schema should use db schema instead of the metastore default fs

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25270?focusedWorklogId=652508&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652508
 ]

ASF GitHub Bot logged work on HIVE-25270:
-

Author: ASF GitHub Bot
Created on: 18/Sep/21 00:09
Start Date: 18/Sep/21 00:09
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #2468:
URL: https://github.com/apache/hive/pull/2468


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652508)
Time Spent: 40m  (was: 0.5h)

> To create external table without schema should use db schema instead of the 
> metastore default fs
> 
>
> Key: HIVE-25270
> URL: https://issues.apache.org/jira/browse/HIVE-25270
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: shezm
>Assignee: shezm
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Hi,
> When Hive creates an external table without specifying the scheme of the 
> location, as in the following SQL:
> {code:java}
> CREATE EXTERNAL TABLE `user.test_tbl` (
> id string,
> name string
> )
> LOCATION '/user/data/test_tbl'
> {code}
> the default scheme is taken from the metastore's fs.defaultFS configuration.
> But in some cases there are multiple Hadoop namenodes, for example when using 
> Hadoop federation or Hadoop RBF. I think that when an external table is 
> created without specifying a scheme, the scheme of the database's location 
> should be used instead.
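Until such a change lands, a hedged workaround sketch is to spell out the 
scheme and authority in the location so fs.defaultFS is never consulted (the 
nameservice below is illustrative):

{code:sql}
CREATE EXTERNAL TABLE `user.test_tbl` (
  id string,
  name string
)
-- Fully qualified URI: no fallback to the metastore's default filesystem.
LOCATION 'hdfs://nameservice-b/user/data/test_tbl';
{code}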



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25335) Unreasonable setting reduce number, when join big size table(but small row count) and small size table

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25335?focusedWorklogId=652507&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652507
 ]

ASF GitHub Bot logged work on HIVE-25335:
-

Author: ASF GitHub Bot
Created on: 18/Sep/21 00:09
Start Date: 18/Sep/21 00:09
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2490:
URL: https://github.com/apache/hive/pull/2490#issuecomment-922142340


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652507)
Time Spent: 20m  (was: 10m)

> Unreasonable setting reduce number, when join big size table(but small row 
> count) and small size table
> --
>
> Key: HIVE-25335
> URL: https://issues.apache.org/jira/browse/HIVE-25335
> Project: Hive
>  Issue Type: Improvement
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25335.001.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I found an application that is slow in our cluster because each reducer 
> processes a huge number of bytes, yet only two reducers are used. 
> When I debugged it, I found the reason: in this SQL, one table is big in size 
> (about 30 GB) with a small row count (about 3.5M rows), while another table 
> is small in size (about 100 MB) with a larger row count (about 3.6M rows). 
> JoinStatsRule.process uses only the 100 MB to estimate the number of 
> reducers, but in fact we need to process 30 GB.
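While the estimate is off, a hedged workaround sketch is to tune reducer 
parallelism by hand for the affected query (the values are illustrative; both 
settings are standard Hive knobs):

{code:sql}
-- Lower the bytes-per-reducer target so the 30 GB input yields more reducers.
SET hive.exec.reducers.bytes.per.reducer=268435456; -- 256 MB
-- Or pin a constant reducer count for the session.
SET mapreduce.job.reduces=120;
{code}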



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416899#comment-17416899
 ] 

Denys Kuzmenko commented on HIVE-25535:
---

The lock-based Cleaner implementation was required when compaction did not run 
in a transaction. That's not the case anymore; however, HDP-3.1 still relies 
on the locks. 

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artefacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25532) Missing authorization info for KILL QUERY command

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25532?focusedWorklogId=652411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652411
 ]

ASF GitHub Bot logged work on HIVE-25532:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 17:45
Start Date: 17/Sep/21 17:45
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on a change in pull 
request #2649:
URL: https://github.com/apache/hive/pull/2649#discussion_r711241978



##
File path: service/src/java/org/apache/hive/service/server/KillQueryImpl.java
##
@@ -116,6 +119,8 @@ public static void killChildYarnJobs(Configuration conf, String tag, String doAs
 
   private static boolean isAdmin() {
     boolean isAdmin = false;
+    // RANGER-1851
+    HivePrivilegeObject serviceNameObj = new HivePrivilegeObject(
+        HivePrivilegeObject.HivePrivilegeObjectType.SERVICE_NAME, null, "hiveservice");

Review comment:
   Instead of hard-coding the "hiveservice" value, have you thought about 
making this configurable? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652411)
Time Spent: 20m  (was: 10m)

> Missing authorization info for KILL QUERY command
> -
>
> Key: HIVE-25532
> URL: https://issues.apache.org/jira/browse/HIVE-25532
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Abhay
>Assignee: Abhay
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We added authorization for the Kill Query command some time back with the 
> help of Ranger; see https://issues.apache.org/jira/browse/RANGER-1851
> However, we have observed that this hasn't been working as expected. The 
> Ranger service expects Hive to send a privilege object of type SERVICE_NAME, 
> but as can be seen at 
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131]
>  Hive is sending an empty array list. 
> The Ranger service never throws an exception for this, which results in any 
> user being able to kill any query even though they don't have the necessary 
> permissions.
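For context, the statement whose authorization check is at issue is Hive's 
KILL QUERY command; with a working SERVICE_NAME check, only users the 
authorizer deems admins should be able to run it against other users' queries 
(the query id below is illustrative):

{code:sql}
KILL QUERY 'hive_20210917174500_a1b2c3d4';
{code}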



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416713#comment-17416713
 ] 

Ashish Sharma commented on HIVE-25535:
--

[~dkuzmenko] update the use case in description.

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artefacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416713#comment-17416713
 ] 

Ashish Sharma edited comment on HIVE-25535 at 9/17/21, 2:01 PM:


[~dkuzmenko] updated use case in description.


was (Author: ashish-kumar-sharma):
[~dkuzmenko] update the use case in description.

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artefacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25378) Enable removal of old builds on hive ci

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25378:
--
Labels: pull-request-available  (was: )

> Enable removal of old builds on hive ci
> ---
>
> Key: HIVE-25378
> URL: https://issues.apache.org/jira/browse/HIVE-25378
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We are using the GitHub plugin to run builds on PRs.
> However, to remove old builds that plugin needs to have periodic branch 
> scanning enabled - and since we also use the plugin's merge mechanism, this 
> will cause all open PRs to be rediscovered whenever there is a new commit on 
> the target branch. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25378) Enable removal of old builds on hive ci

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25378?focusedWorklogId=652322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652322
 ]

ASF GitHub Bot logged work on HIVE-25378:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 14:00
Start Date: 17/Sep/21 14:00
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #2652:
URL: https://github.com/apache/hive/pull/2652


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652322)
Remaining Estimate: 0h
Time Spent: 10m

> Enable removal of old builds on hive ci
> ---
>
> Key: HIVE-25378
> URL: https://issues.apache.org/jira/browse/HIVE-25378
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We are using the GitHub plugin to run builds on PRs.
> However, to remove old builds that plugin needs to have periodic branch 
> scanning enabled - and since we also use the plugin's merge mechanism, this 
> will cause all open PRs to be rediscovered whenever there is a new commit on 
> the target branch. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
*Use Case* - 

When external tool like [SPARK_ACID |https://github.com/qubole/spark-acid]try 
to access hive metastore directly instead of accessing LLAP or hs2 which lacks 
the ability of take acquires locks on the metastore artefacts. Due to which if 
any spark acid jobs starts and at the same time compaction happens in hive with 
leads to exceptions like *FileNotFound* for delta directory because at time of 
spark acid compilation phase delta files are present but when execution start 
delta files are deleted by compactor. 

Inorder to tackle problem like this I am proposing to add a config "NO_CLEANUP" 
is table properties and partition properties which provide higher control on 
table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243];
 which allow us to delay the deletion of "obsolete directories/files" but it is 
applicable to all the table in metastore where this config will provide table 
and partition level control.

*Solution* - 

Add "NO_CLEANUP" in the table properties enable/disable the table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);

  was:
Use Case - 

When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] try 
to access the Hive metastore directly instead of going through LLAP or HS2, 
they lack the ability to acquire locks on the metastore artefacts. Because of 
this, if a Spark ACID job starts while a compaction happens in Hive at the same 
time, it leads to exceptions like *FileNotFound* for a delta directory, because 
the delta files are present during the Spark ACID compilation phase but have 
been deleted by the compactor by the time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
 which allows us to delay the deletion of "obsolete directories/files", but it 
applies to every table in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artefacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] try 
to access the Hive metastore directly instead of going through LLAP or HS2, 
they lack the ability to acquire locks on the metastore artefacts. Because of 
this, if a Spark ACID job starts while a compaction happens in Hive at the same 
time, it leads to exceptions like *FileNotFound* for a delta directory, because 
the delta files are present during the Spark ACID compilation phase but have 
been deleted by the compactor by the time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
 which allows us to delay the deletion of "obsolete directories/files", but it 
applies to every table in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');

  was:
Use Case - 

When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] try 
to access the Hive metastore directly instead of going through LLAP or HS2, 
they lack the ability to acquire locks on the metastore artefacts. Because of 
this, if a Spark ACID job starts while a compaction happens in Hive at the same 
time, it leads to exceptions like *FileNotFound *for a delta directory, because 
the delta files are present during the Spark ACID compilation phase but have 
been deleted by the compactor by the time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
 which allows us to delay the deletion of "obsolete directories/files", but it 
applies to every table in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artefacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] try 
to access the Hive metastore directly instead of going through LLAP or HS2, 
they lack the ability to acquire locks on the metastore artefacts. Because of 
this, if a Spark ACID job starts while a compaction happens in Hive at the same 
time, it leads to exceptions like *FileNotFound *for a delta directory, because 
the delta files are present during the Spark ACID compilation phase but have 
been deleted by the compactor by the time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
 which allows us to delay the deletion of "obsolete directories/files", but it 
applies to every table in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');

  was:
Use Case - 

When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] try 
to access the Hive metastore directly instead of going through LLAP or HS2, 
they lack the ability to acquire locks on the metastore artifacts. Because of 
this, if a Spark ACID job starts while a compaction happens in Hive at the same 
time, it leads to exceptions like FileNotFound for a delta directory, because 
the delta files are present during the Spark ACID compilation phase but have 
been deleted by the compactor by the time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
 which allows us to delay the deletion of "obsolete directories/files", but it 
applies to every table in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artefacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like *FileNotFound *for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] try 
to access the Hive metastore directly instead of going through LLAP or HS2, 
they lack the ability to acquire locks on the metastore artifacts. Because of 
this, if a Spark ACID job starts while a compaction happens in Hive at the same 
time, it leads to exceptions like FileNotFound for a delta directory, because 
the delta files are present during the Spark ACID compilation phase but have 
been deleted by the compactor by the time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
 which allows us to delay the deletion of "obsolete directories/files", but it 
applies to every table in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');

  was:
Use Case - 

When external tools like SPARK_ACID try to access the Hive metastore directly 
instead of going through LLAP or HS2, they lack the ability to acquire locks 
on the metastore artifacts. Because of this, if a Spark ACID job starts while 
a compaction happens in Hive at the same time, it leads to exceptions like 
FileNotFound for a delta directory, because the delta files are present during 
the Spark ACID compilation phase but have been deleted by the compactor by the 
time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
 which allows us to delay the deletion of "obsolete directories/files", but it 
applies to every table in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When external tools like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> try to access the Hive metastore directly instead of going through LLAP or 
> HS2, they lack the ability to acquire locks on the metastore artifacts. 
> Because of this, if a Spark ACID job starts while a compaction happens in 
> Hive at the same time, it leads to exceptions like FileNotFound for a delta 
> directory, because the delta files are present during the Spark ACID 
> compilation phase but have been deleted by the compactor by the time 
> execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When external tools like SPARK_ACID try to access the Hive metastore directly 
instead of going through LLAP or HS2, they lack the ability to acquire locks 
on the metastore artifacts. Because of this, if a Spark ACID job starts while 
a compaction happens in Hive at the same time, it leads to exceptions like 
FileNotFound for a delta directory, because the delta files are present during 
the Spark ACID compilation phase but have been deleted by the compactor by the 
time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
 which allows us to delay the deletion of "obsolete directories/files", but it 
applies to every table in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');

  was:
Use Case - 

When external tools like SPARK_ACID try to access the Hive metastore directly 
instead of going through LLAP or HS2, they lack the ability to acquire locks 
on the metastore artifacts. Because of this, if a Spark ACID job starts while 
a compaction happens in Hive at the same time, it leads to exceptions like 
FileNotFound for a delta directory, because the delta files are present during 
the Spark ACID compilation phase but have been deleted by the compactor by the 
time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have "HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED", which allows us to 
delay the deletion of "obsolete directories/files", but it applies to every 
table in the metastore, whereas this config provides table- and 
partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When external tools like SPARK_ACID try to access the Hive metastore directly 
> instead of going through LLAP or HS2, they lack the ability to acquire locks 
> on the metastore artifacts. Because of this, if a Spark ACID job starts while 
> a compaction happens in Hive at the same time, it leads to exceptions like 
> FileNotFound for a delta directory, because the delta files are present 
> during the Spark ACID compilation phase but have been deleted by the 
> compactor by the time execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]",
>  which allows us to delay the deletion of "obsolete directories/files", but 
> it applies to every table in the metastore, whereas this config provides 
> table- and partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When external tools like SPARK_ACID try to access the Hive metastore directly 
instead of going through LLAP or HS2, they lack the ability to acquire locks 
on the metastore artifacts. Because of this, if a Spark ACID job starts while 
a compaction happens in Hive at the same time, it leads to exceptions like 
FileNotFound for a delta directory, because the delta files are present during 
the Spark ACID compilation phase but have been deleted by the compactor by the 
time execution starts. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" in the table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have "HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED", which allows us to 
delay the deletion of "obsolete directories/files", but it applies to every 
table in the metastore, whereas this config provides table- and 
partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');

  was:
Add "NO_CLEANUP" in the table properties enable/disable the table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When external tools like SPARK_ACID try to access the Hive metastore directly 
> instead of going through LLAP or HS2, they lack the ability to acquire locks 
> on the metastore artifacts. Because of this, if a Spark ACID job starts while 
> a compaction happens in Hive at the same time, it leads to exceptions like 
> FileNotFound for a delta directory, because the delta files are present 
> during the Spark ACID compilation phase but have been deleted by the 
> compactor by the time execution starts. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" in the table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have "HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED", which allows us to 
> delay the deletion of "obsolete directories/files", but it applies to every 
> table in the metastore, whereas this config provides table- and 
> partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'='FALSE'/'TRUE');



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Add "NO_CLEANUP" in the table properties enable/disable the table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);

  was:
Add "NO_CLEANUP" in the table properties enable/disable the table-level cleanup 
and prevent the cleaner process from automatically cleaning obsolete 
directories/files.

Example - 

ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25535:
--
Labels: pull-request-available  (was: )

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level 
> cleanup and prevent the cleaner process from automatically cleaning obsolete 
> directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?focusedWorklogId=652315&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652315
 ]

ASF GitHub Bot logged work on HIVE-25535:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 13:40
Start Date: 17/Sep/21 13:40
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma opened a new pull request #2651:
URL: https://github.com/apache/hive/pull/2651


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652315)
Remaining Estimate: 0h
Time Spent: 10m

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level 
> cleanup and prevent the cleaner process from automatically cleaning obsolete 
> directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25535:
---
Summary: Control cleaning obsolete directories/files of a table via 
property  (was: Adding table property "NO_CLEANUP")

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level 
> cleanup and prevent the cleaner process from automatically cleaning obsolete 
> directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25534) Error when executing DistCp on file system not supporting XAttrs

2021-09-17 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25534:
---
Summary: Error when executing DistCp on file system not supporting XAttrs  
(was: Don't preserve FileAttribute.XATTR to initialise distcp.)

> Error when executing DistCp on file system not supporting XAttrs
> 
>
> Key: HIVE-25534
> URL: https://issues.apache.org/jira/browse/HIVE-25534
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Remove the preserve-XAttrs option while calling DistCp.
> {code:java}
> 2021-08-23 10:06:18,485 ERROR org.apache.hadoop.tools.DistCp: 
> [HiveServer2-Background-Pool: Thread-73]: XAttrs not supported on at least 
> one file system: 
>  org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
> supported for file system: s3a://hmangla1-dev
>  at 
> org.apache.hadoop.tools.util.DistCpUtils.checkFileSystemXAttrSupport(DistCpUtils.java:513)
>  ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:337) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createJob(DistCp.java:304) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:214) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.execute(DistCp.java:193) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]{code}
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25533) Incorrect results when filtering data from UNION ALL sub-query

2021-09-17 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25533:
---
Description: 
With CBO enabled, querying from a view or CTE with a {{UNION ALL}} clause 
produces wrong results, as the following script shows.
{code:java}
CREATE TABLE n1 (c1 STRING);

INSERT OVERWRITE TABLE n1 VALUES('needn');

CREATE VIEW v1 
AS
SELECT 'maggie'  AS c1 FROM n1
UNION ALL
SELECT c1 FROM n1;
{code}
Incorrect results are returned when using "=" or "IN" with a single element.

For example, the following 2 queries return nothing.
{code:java}
SELECT * FROM v1 WHERE c1 = 'maggie';
SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
 

However, correct results are returned when using "LIKE" or "IN" with multiple 
elements.

For example, the following 2 queries return the expected result.
{code:java}
SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
{code}
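If the regression is CBO-related as described, a hedged diagnostic sketch is to 
re-run the failing filters with CBO disabled for the session 
({{hive.cbo.enable}} is a standard HiveConf flag; this only probes the cause, 
it is not a fix):
{code:sql}
-- Hedged diagnostic sketch: disable CBO for the session, then re-run
-- the failing filter queries and compare the results.
SET hive.cbo.enable=false;
SELECT * FROM v1 WHERE c1 = 'maggie';
SELECT * FROM v1 WHERE c1 IN ('maggie');
{code}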
 

  was:
When querying from a view or CTE which combines 2 tables with "union all", 
such as the following script shows
{code:java}
CREATE TABLE n1 (c1 STRING);

INSERT OVERWRITE TABLE n1 VALUES('needn');

CREATE VIEW v1 
AS
SELECT 'maggie'  AS c1 FROM n1
UNION ALL
SELECT c1 FROM n1;
{code}
Incorrect results are returned when using "=" or "IN" with a single element.

For example, the following 2 queries return nothing.
{code:java}
SELECT * FROM v1 WHERE c1 = 'maggie';
SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
 

However, correct results are returned when using "LIKE" or "IN" with multiple 
elements.

For example, the following 2 queries return the expected result.
{code:java}
SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
{code}
 


> Incorrect results when filtering data from UNION ALL sub-query
> --
>
> Key: HIVE-25533
> URL: https://issues.apache.org/jira/browse/HIVE-25533
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 3.1.0
> Environment: Azure HDInsight 4.1.7.5
> Hive 3.1.0
>Reporter: Needn Yu
>Priority: Critical
> Attachments: hive.png
>
>
> With CBO enabled, querying from a view or CTE with a {{UNION ALL}} clause 
> produces wrong results, as the following script shows.
> {code:java}
> CREATE TABLE n1 (c1 STRING);
> INSERT OVERWRITE TABLE n1 VALUES('needn');
> CREATE VIEW v1 
> AS
> SELECT 'maggie'  AS c1 FROM n1
> UNION ALL
> SELECT c1 FROM n1;
> {code}
> Incorrect results are returned when using "=" or "IN" with a single element.
> For example, the following 2 queries return nothing.
> {code:java}
> SELECT * FROM v1 WHERE c1 = 'maggie';
> SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
>  
> However, correct results are returned when using "LIKE" or "IN" with multiple 
> elements.
> For example, the following 2 queries return the expected result.
> {code:java}
> SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
> SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25533) Incorrect results when filtering data from UNION ALL sub-query

2021-09-17 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25533:
---
Summary: Incorrect results when filtering data from UNION ALL sub-query  
(was: With CBO enabled, Incorrect query result when using where CLAUSE to query 
data from 2 "UNION ALL" parts)

> Incorrect results when filtering data from UNION ALL sub-query
> --
>
> Key: HIVE-25533
> URL: https://issues.apache.org/jira/browse/HIVE-25533
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 3.1.0
> Environment: Azure HDInsight 4.1.7.5
> Hive 3.1.0
>Reporter: Needn Yu
>Priority: Critical
> Attachments: hive.png
>
>
> When querying from a view or CTE which combines 2 tables with "union all", 
> such as the following script shows
> {code:java}
> CREATE TABLE n1 (c1 STRING);
> INSERT OVERWRITE TABLE n1 VALUES('needn');
> CREATE VIEW v1 
> AS
> SELECT 'maggie'  AS c1 FROM n1
> UNION ALL
> SELECT c1 FROM n1;
> {code}
> Incorrect results are returned when using "=" or "IN" with a single element.
> For example, the following 2 queries return nothing.
> {code:java}
> SELECT * FROM v1 WHERE c1 = 'maggie';
> SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
>  
> However, correct results are returned when using "LIKE" or "IN" with multiple 
> elements.
> For example, the following 2 queries return the expected result.
> {code:java}
> SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
> SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25532) Missing authorization info for KILL QUERY command

2021-09-17 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25532:
---
Summary: Missing authorization info for KILL QUERY command  (was: Fix 
authorization support for Kill Query Command)

> Missing authorization info for KILL QUERY command
> -
>
> Key: HIVE-25532
> URL: https://issues.apache.org/jira/browse/HIVE-25532
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Abhay
>Assignee: Abhay
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We added authorization for the Kill Query command some time back with the 
> help of Ranger. Below is the ticket: 
> https://issues.apache.org/jira/browse/RANGER-1851
> However, we have observed that this hasn't been working as expected. The 
> Ranger service expects Hive to send a privilege object of the type 
> SERVICE_NAME, but we can see below
>  
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131]
>  that it is sending an empty array list. 
>  The Ranger service never throws an exception for this, which results in any 
> user being able to kill any query even though they don't have the necessary 
> permissions.
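> A hedged sketch of what populating the privilege objects might look like, 
> assuming the authorization plugin API's HivePrivilegeObjectType.SERVICE_NAME 
> is used (the class and method names below are illustrative, not the actual 
> patch):
> {code:java}
> import java.util.Collections;
> import java.util.List;
> import org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject;
> import org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType;
> 
> public class KillQueryAuthSketch {
>   // Instead of an empty list, pass a SERVICE_NAME privilege object so
>   // Ranger can match its kill-query policy against the service.
>   static List<HivePrivilegeObject> buildServiceObjects(String serviceName) {
>     return Collections.singletonList(
>         new HivePrivilegeObject(HivePrivilegeObjectType.SERVICE_NAME, null, serviceName));
>   }
> }
> {code}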



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25496) hadoop 3.3.1 / hive 3.2.1 / OpenJDK11 compatible?

2021-09-17 Thread Jerome Le Ray (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17416639#comment-17416639
 ] 

Jerome Le Ray commented on HIVE-25496:
--

Hello [~belugabehr]

Finally, I'm using OpenJDK 8 and everything works fine on both the on-prem and 
the Azure configurations.

Here is the link to the OpenJDK 8 build used: 

[https://github.com/adoptium/temurin8-binaries/releases/download/jdk8u302-b08/OpenJDK8U-jdk_x64_linux_hotspot_8u302b08.tar.gz]

Thank you

> hadoop 3.3.1 / hive 3.2.1 / OpenJDK11 compatible?
> -
>
> Key: HIVE-25496
> URL: https://issues.apache.org/jira/browse/HIVE-25496
> Project: Hive
>  Issue Type: Bug
> Environment: Linux VM
>Reporter: Jerome Le Ray
>Assignee: Jerome Le Ray
>Priority: Major
>
> We used the following configuration:
> hadoop 3.2.1
> hive 3.1.2
> Postgres 12
> Java - OracleJDK 8
> For internal reasons, we have to migrate to OpenJDK 11.
> So I've migrated hadoop 3.2.1 to the new version, hadoop 3.3.1.
> When starting the hiveserver2 service, I got the following error:
> which: no hbase in 
> (/usr/local/bin:/bin:/usr/pgsql-12/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/jdk-11.0.10+9/bin:/opt/hivemetastore/hadoop-3.3.1/bin:/opt/hivemetastore/apache-hive-3.1.2-bin/b
> in)
> 2021-09-02 16:48:05: Starting HiveServer2
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/hivemetastore/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hivemetastore/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2021-09-02 16:48:06,744 INFO conf.HiveConf: Found configuration file 
> file:/opt/hivemetastore/apache-hive-3.1.2-bin/conf/hive-site.xml
> 2021-09-02 16:48:07,169 WARN conf.HiveConf: HiveConf of name 
> hive.metastore.local does not exist
> 2021-09-02 16:48:07,169 WARN conf.HiveConf: HiveConf of name 
> hive.metastore.thrift.bind.host does not exist
> 2021-09-02 16:48:07,170 WARN conf.HiveConf: HiveConf of name 
> hive.enforce.bucketing does not exist
> 2021-09-02 16:48:08,414 INFO server.HiveServer2: STARTUP_MSG:
> /
> STARTUP_MSG: Starting HiveServer2
> STARTUP_MSG: host = lhroelcspt1001.enterprisenet.org/10.90.122.159
> STARTUP_MSG: args = [-hiveconf, mapred.job.tracker=local, -hiveconf, 
> fs.default.name=file:///cip-data, -hiveconf, 
> hive.metastore.warehouse.dir=file:cip-data, --hiveconf, hive.server2.thrif
> t.port=1, --hiveconf, hive.root.logger=INFO,console]
> STARTUP_MSG: version = 3.1.2
> (...)
> STARTUP_MSG: build = git://HW13934/Users/gates/tmp/hive-branch-3.1/hive -r 
> 8190d2be7b7165effa62bd21b7d60ef81fb0e4af; compiled by 'gates' on Thu Aug 22 
> 15:01:18 PDT 2019
> /
> 2021-09-02 16:48:08,436 INFO server.HiveServer2: Starting HiveServer2
> 2021-09-02 16:48:08,462 WARN conf.HiveConf: HiveConf of name 
> hive.metastore.local does not exist
> 2021-09-02 16:48:08,463 WARN conf.HiveConf: HiveConf of name 
> hive.metastore.thrift.bind.host does not exist
> 2021-09-02 16:48:08,463 WARN conf.HiveConf: HiveConf of name 
> hive.enforce.bucketing does not exist
> Hive Session ID = 440449ff-99b7-429c-82d9-e20bdcc9b46f
> 2021-09-02 16:48:08,566 INFO SessionState: Hive Session ID = 
> 440449ff-99b7-429c-82d9-e20bdcc9b46f
> 2021-09-02 16:48:08,566 INFO server.HiveServer2: Shutting down HiveServer2
> 2021-09-02 16:48:08,584 INFO server.HiveServer2: Stopping/Disconnecting tez 
> sessions.
> 2021-09-02 16:48:08,585 WARN server.HiveServer2: Error starting HiveServer2 
> on attempt 1, will retry in 6ms
> java.lang.RuntimeException: Error applying authorization policy on hive 
> configuration: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot 
> be cast to class java.net.URLClassLoader (jdk.
> internal.loader.ClassLoaders$AppClassLoader and java.net.URLClassLoader are 
> in module java.base of loader 'bootstrap')
>  at org.apache.hive.service.cli.CLIService.init(CLIService.java:118)
>  at org.apache.hive.service.CompositeService.init(CompositeService.java:59)
>  at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:230)
>  at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1036)
>  at 
> org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:140)
>  at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1305)
>  at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1149)
>  at 
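> The failing cast is the well-known JDK 9+ incompatibility: the application 
> class loader is no longer a java.net.URLClassLoader. A minimal standalone 
> sketch reproducing the ClassCastException under JDK 11 (illustrative demo 
> code, not Hive's own code):
> {code:java}
> import java.net.URLClassLoader;
> 
> public class ClassLoaderCastDemo {
>   public static void main(String[] args) {
>     ClassLoader cl = Thread.currentThread().getContextClassLoader();
>     // Works on JDK 8; throws ClassCastException on JDK 9+ because the
>     // loader is jdk.internal.loader.ClassLoaders$AppClassLoader.
>     URLClassLoader ucl = (URLClassLoader) cl;
>     System.out.println(ucl.getURLs().length + " classpath entries");
>   }
> }
> {code}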

[jira] [Work started] (HIVE-25536) Upgrade to Kafka 2.8

2021-09-17 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25536 started by Viktor Somogyi-Vass.
--
> Upgrade to Kafka 2.8
> 
>
> Key: HIVE-25536
> URL: https://issues.apache.org/jira/browse/HIVE-25536
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Reporter: Viktor Somogyi-Vass
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25536) Upgrade to Kafka 2.8

2021-09-17 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass reassigned HIVE-25536:
--

Assignee: Viktor Somogyi-Vass

> Upgrade to Kafka 2.8
> 
>
> Key: HIVE-25536
> URL: https://issues.apache.org/jira/browse/HIVE-25536
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Reporter: Viktor Somogyi-Vass
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25503) Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries

2021-09-17 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-25503:
-

Assignee: Denys Kuzmenko

> Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries
> --
>
> Key: HIVE-25503
> URL: https://issues.apache.org/jira/browse/HIVE-25503
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Performance improvement. Accumulated entries in COMPLETED_TXN_COMPONENTS can 
> lead to query performance degradation.
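> A hedged SQL sketch for locating such duplicates, assuming the usual CTC_* 
> columns of COMPLETED_TXN_COMPONENTS (the actual cleanup statement in the 
> patch may differ):
> {code:sql}
> -- Find keys that appear more than once; these are cleanup candidates.
> SELECT CTC_TXNID, CTC_DATABASE, CTC_TABLE, CTC_PARTITION, COUNT(*) AS cnt
> FROM COMPLETED_TXN_COMPONENTS
> GROUP BY CTC_TXNID, CTC_DATABASE, CTC_TABLE, CTC_PARTITION
> HAVING COUNT(*) > 1;
> {code}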



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25503) Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries

2021-09-17 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko resolved HIVE-25503.
---
Resolution: Fixed

> Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries
> --
>
> Key: HIVE-25503
> URL: https://issues.apache.org/jira/browse/HIVE-25503
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Performance improvement. Accumulated entries in COMPLETED_TXN_COMPONENTS can 
> lead to query performance degradation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25503) Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25503?focusedWorklogId=652266=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652266
 ]

ASF GitHub Bot logged work on HIVE-25503:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 11:00
Start Date: 17/Sep/21 11:00
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged pull request #2612:
URL: https://github.com/apache/hive/pull/2612


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652266)
Time Spent: 50m  (was: 40m)

> Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries
> --
>
> Key: HIVE-25503
> URL: https://issues.apache.org/jira/browse/HIVE-25503
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Performance improvement. Accumulated entries in COMPLETED_TXN_COMPONENTS can 
> lead to query performance degradation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25535) Adding table property "NO_CLEANUP"

2021-09-17 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17416607#comment-17416607
 ] 

Denys Kuzmenko commented on HIVE-25535:
---

Hi [~ashish-kumar-sharma], could you please elaborate on the use case where 
this feature would be useful?
cc [~klcopp]

> Adding table property "NO_CLEANUP"
> --
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level 
> cleanup and prevent the cleaner process from automatically cleaning obsolete 
> directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25534) Don't preserve FileAttribute.XATTR to initialise distcp.

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25534?focusedWorklogId=652251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652251
 ]

ASF GitHub Bot logged work on HIVE-25534:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 09:59
Start Date: 17/Sep/21 09:59
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2650:
URL: https://github.com/apache/hive/pull/2650#discussion_r710922922



##
File path: 
shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
##
@@ -1131,7 +1131,7 @@ public void setStoragePolicy(Path path, 
StoragePolicyValue policy)
   }
 }
 if (needToAddPreserveOption) {
-  params.add("-pbx");
+  params.add("-pb");  //Only Block Size will be preserved.

Review comment:
   Change done.

##
File path: 
shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
##
@@ -1273,8 +1272,6 @@ public boolean runDistCpWithSnapshots(String oldSnapshot, 
String newSnapshot, Li
   }
 } catch (Exception e) {
   throw new IOException("Cannot execute DistCp process: ", e);
-} finally {
-  conf.setBoolean("mapred.mapper.new-api", false);

Review comment:
   Added




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652251)
Time Spent: 0.5h  (was: 20m)

> Don't preserve FileAttribute.XATTR to initialise distcp.
> 
>
> Key: HIVE-25534
> URL: https://issues.apache.org/jira/browse/HIVE-25534
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Remove the preserve-XAttrs option while calling DistCp.
> {code:java}
> 2021-08-23 10:06:18,485 ERROR org.apache.hadoop.tools.DistCp: 
> [HiveServer2-Background-Pool: Thread-73]: XAttrs not supported on at least 
> one file system: 
>  org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
> supported for file system: s3a://hmangla1-dev
>  at 
> org.apache.hadoop.tools.util.DistCpUtils.checkFileSystemXAttrSupport(DistCpUtils.java:513)
>  ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:337) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createJob(DistCp.java:304) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:214) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.execute(DistCp.java:193) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]{code}
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25346) cleanTxnToWriteIdTable breaks SNAPSHOT isolation

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25346?focusedWorklogId=652245=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652245
 ]

ASF GitHub Bot logged work on HIVE-25346:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 09:54
Start Date: 17/Sep/21 09:54
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2547:
URL: https://github.com/apache/hive/pull/2547#discussion_r710919584



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -1446,70 +1446,75 @@ public void commitTxn(CommitTxnRequest rqst)
 OperationType.UPDATE + "," + OperationType.DELETE + ")";
 
 long tempCommitId = generateTemporaryId();
-if (txnType.get() != TxnType.READ_ONLY
-&& !isReplayedReplTxn
-&& isUpdateOrDelete(stmt, conflictSQLSuffix)) {
-
-  isUpdateDelete = 'Y';
-  //if here it means currently committing txn performed update/delete 
and we should check WW conflict
-  /**
-   * "select distinct" is used below because
-   * 1. once we get to multi-statement txns, we only care to record 
that something was updated once
-   * 2. if {@link #addDynamicPartitions(AddDynamicPartitions)} is 
retried by caller it may create
-   *  duplicate entries in TXN_COMPONENTS
-   * but we want to add a PK on WRITE_SET which won't have unique rows 
w/o this distinct
-   * even if it includes all of its columns
-   *
-   * First insert into write_set using a temporary commitID, which 
will be updated in a separate call,
-   * see: {@link #updateWSCommitIdAndCleanUpMetadata(Statement, long, 
TxnType, Long, long)}}.
-   * This should decrease the scope of the S4U lock on the next_txn_id 
table.
-   */
-  Savepoint undoWriteSetForCurrentTxn = dbConn.setSavepoint();
-  stmt.executeUpdate("INSERT INTO \"WRITE_SET\" (\"WS_DATABASE\", 
\"WS_TABLE\", \"WS_PARTITION\", \"WS_TXNID\", \"WS_COMMIT_ID\", 
\"WS_OPERATION_TYPE\")" +
-  " SELECT DISTINCT \"TC_DATABASE\", \"TC_TABLE\", 
\"TC_PARTITION\", \"TC_TXNID\", " + tempCommitId + ", \"TC_OPERATION_TYPE\" " + 
conflictSQLSuffix);
-
-  /**
-   * This S4U will mutex with other commitTxn() and openTxns().
-   * -1 below makes txn intervals look like [3,3] [4,4] if all txns 
are serial
-   * Note: it's possible to have several txns have the same commit id. 
 Suppose 3 txns start
-   * at the same time and no new txns start until all 3 commit.
-   * We could've incremented the sequence for commitId as well but it 
doesn't add anything functionally.
-   */
-  acquireTxnLock(stmt, false);
-  commitId = getHighWaterMark(stmt);
+if (txnType.get() != TxnType.READ_ONLY && !isReplayedReplTxn && 
txnType.get() != TxnType.COMPACTION) {
+  if (isUpdateOrDelete(stmt, conflictSQLSuffix)) {
+isUpdateDelete = 'Y';
+//if here it means currently committing txn performed 
update/delete and we should check WW conflict
+/**
+ * "select distinct" is used below because
+ * 1. once we get to multi-statement txns, we only care to record 
that something was updated once
+ * 2. if {@link #addDynamicPartitions(AddDynamicPartitions)} is 
retried by caller it may create
+ *  duplicate entries in TXN_COMPONENTS
+ * but we want to add a PK on WRITE_SET which won't have unique 
rows w/o this distinct
+ * even if it includes all of its columns
+ *
+ * First insert into write_set using a temporary commitID, which 
will be updated in a separate call,
+ * see: {@link #updateWSCommitIdAndCleanUpMetadata(Statement, 
long, TxnType, Long, long)}}.
+ * This should decrease the scope of the S4U lock on the 
next_txn_id table.
+ */
+Savepoint undoWriteSetForCurrentTxn = dbConn.setSavepoint();
+stmt.executeUpdate("INSERT INTO \"WRITE_SET\" (\"WS_DATABASE\", 
\"WS_TABLE\", \"WS_PARTITION\", \"WS_TXNID\", \"WS_COMMIT_ID\", 
\"WS_OPERATION_TYPE\")" +
+" SELECT DISTINCT \"TC_DATABASE\", \"TC_TABLE\", 
\"TC_PARTITION\", \"TC_TXNID\", " + tempCommitId + ", \"TC_OPERATION_TYPE\" " + 
conflictSQLSuffix);
 
-  if (!rqst.isExclWriteEnabled()) {
 /**
- * see if there are any overlapping txns that wrote the same 
element, i.e. have a conflict
- * Since entire commit operation is mutexed wrt other start/commit 
ops,
- * committed.ws_commit_id <= current.ws_commit_id for all txns
- * thus if committed.ws_commit_id < current.ws_txnid, 

[jira] [Updated] (HIVE-23760) Upgrading to Kafka 2.5 Clients

2021-09-17 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-23760:
-
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Upgrading to Kafka 2.5 Clients
> --
>
> Key: HIVE-23760
> URL: https://issues.apache.org/jira/browse/HIVE-23760
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Reporter: Andras Katona
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25513) Delta metrics collection may cause NPE

2021-09-17 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-25513:
-
Parent: HIVE-24824
Issue Type: Sub-task  (was: Bug)

> Delta metrics collection may cause NPE
> --
>
> Key: HIVE-25513
> URL: https://issues.apache.org/jira/browse/HIVE-25513
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When collecting metrics about the number of deltas under specific 
> partitions/tables, information about which partitions/tables are being read 
> is stored in the Configuration object under key delta.files.metrics.metadata. 
> This information is retrieved in 
> DeltaFilesMetricsReporter#mergeDeltaFilesStats when collecting the actual 
> information about the number of deltas. But if the information was never 
> stored for some reason, an NPE will be thrown from 
> DeltaFilesMetricsReporter#mergeDeltaFilesStats.
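> A hedged sketch of the defensive check (the method shape and types below are 
> illustrative; the real reporter's internals may differ):
> {code:java}
> import java.util.Map;
> 
> public final class DeltaMetricsGuard {
>   private DeltaMetricsGuard() {}
> 
>   static void mergeDeltaFilesStats(Map<String, Integer> deltaStats) {
>     // If nothing was recorded under delta.files.metrics.metadata, skip
>     // the merge instead of dereferencing null and throwing an NPE.
>     if (deltaStats == null) {
>       return;
>     }
>     deltaStats.forEach((entity, deltaCount) ->
>         System.out.println(entity + " -> " + deltaCount + " deltas"));
>   }
> }
> {code}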



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-25535) Adding table property "NO_CLEANUP"

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25535 started by Ashish Sharma.

> Adding table property "NO_CLEANUP"
> --
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level 
> cleanup and prevent the cleaner process from automatically cleaning obsolete 
> directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25535) Adding table property "NO_CLEANUP"

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25535:



> Adding table property "NO_CLEANUP"
> --
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level 
> cleanup and prevent the cleaner process from automatically cleaning obsolete 
> directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files

2021-09-17 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod resolved HIVE-25529.
---
Resolution: Fixed

> Add tests for reading/writing Iceberg V2 tables with delete files
> -
>
> Key: HIVE-25529
> URL: https://issues.apache.org/jira/browse/HIVE-25529
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since Iceberg V2 tables are now official, we can start testing out whether V2 
> tables can be created/read/written by Hive. While Hive has no delete 
> statement yet on Iceberg tables, we can nonetheless use the Iceberg API to 
> create delete files manually and then check if Hive honors those deletes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files

2021-09-17 Thread Marton Bod (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17416516#comment-17416516
 ] 

Marton Bod commented on HIVE-25529:
---

Pushed to master, thanks for the review [~pvary]!

> Add tests for reading/writing Iceberg V2 tables with delete files
> -
>
> Key: HIVE-25529
> URL: https://issues.apache.org/jira/browse/HIVE-25529
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since Iceberg V2 tables are now official, we can start testing out whether V2 
> tables can be created/read/written by Hive. While Hive has no delete 
> statement yet on Iceberg tables, we can nonetheless use the Iceberg API to 
> create delete files manually and then check if Hive honors those deletes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25534) Don't preserve FileAttribute.XATTR to initialise distcp.

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25534?focusedWorklogId=652122=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652122
 ]

ASF GitHub Bot logged work on HIVE-25534:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 06:12
Start Date: 17/Sep/21 06:12
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2650:
URL: https://github.com/apache/hive/pull/2650#discussion_r710778425



##
File path: 
shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
##
@@ -1273,8 +1272,6 @@ public boolean runDistCpWithSnapshots(String oldSnapshot, 
String newSnapshot, Li
   }
 } catch (Exception e) {
   throw new IOException("Cannot execute DistCp process: ", e);
-} finally {
-  conf.setBoolean("mapred.mapper.new-api", false);

Review comment:
Why is this change a part of this JIRA?

##
File path: 
shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
##
@@ -1273,8 +1272,6 @@ public boolean runDistCpWithSnapshots(String oldSnapshot, 
String newSnapshot, Li
   }
 } catch (Exception e) {
   throw new IOException("Cannot execute DistCp process: ", e);
-} finally {
-  conf.setBoolean("mapred.mapper.new-api", false);

Review comment:
Add a test.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652122)
Time Spent: 20m  (was: 10m)

> Don't preserve FileAttribute.XATTR to initialise distcp.
> 
>
> Key: HIVE-25534
> URL: https://issues.apache.org/jira/browse/HIVE-25534
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Remove the preserve-XAttrs option while calling DistCp.
> {code:java}
> 2021-08-23 10:06:18,485 ERROR org.apache.hadoop.tools.DistCp: 
> [HiveServer2-Background-Pool: Thread-73]: XAttrs not supported on at least 
> one file system: 
>  org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
> supported for file system: s3a://hmangla1-dev
>  at 
> org.apache.hadoop.tools.util.DistCpUtils.checkFileSystemXAttrSupport(DistCpUtils.java:513)
>  ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:337) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createJob(DistCp.java:304) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:214) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.execute(DistCp.java:193) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]{code}
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25534) Don't preserve FileAttribute.XATTR to initialise distcp.

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25534:
--
Labels: pull-request-available  (was: )

> Don't preserve FileAttribute.XATTR to initialise distcp.
> 
>
> Key: HIVE-25534
> URL: https://issues.apache.org/jira/browse/HIVE-25534
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Remove the preserve-XAttrs option while calling DistCp.
> {code:java}
> 2021-08-23 10:06:18,485 ERROR org.apache.hadoop.tools.DistCp: 
> [HiveServer2-Background-Pool: Thread-73]: XAttrs not supported on at least 
> one file system: 
>  org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
> supported for file system: s3a://hmangla1-dev
>  at 
> org.apache.hadoop.tools.util.DistCpUtils.checkFileSystemXAttrSupport(DistCpUtils.java:513)
>  ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:337) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createJob(DistCp.java:304) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:214) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.execute(DistCp.java:193) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]{code}
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25534) Don't preserve FileAttribute.XATTR to initialise distcp.

2021-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25534?focusedWorklogId=652121=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652121
 ]

ASF GitHub Bot logged work on HIVE-25534:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 06:11
Start Date: 17/Sep/21 06:11
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2650:
URL: https://github.com/apache/hive/pull/2650#discussion_r710778268



##
File path: 
shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
##
@@ -1131,7 +1131,7 @@ public void setStoragePolicy(Path path, 
StoragePolicyValue policy)
   }
 }
 if (needToAddPreserveOption) {
-  params.add("-pbx");
+  params.add("-pb");  //Only Block Size will be preserved.

Review comment:
We can't remove it in all the cases. For HDFS we should preserve it.
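   A hedged sketch of the file-system-dependent choice being suggested here 
   (the scheme check is illustrative only, not the actual shim logic):
   {code:java}
   import java.util.ArrayList;
   import java.util.List;
   
   public class DistCpPreserveFlags {
     static List<String> preserveParams(String targetScheme) {
       List<String> params = new ArrayList<>();
       if ("hdfs".equals(targetScheme)) {
         params.add("-pbx"); // HDFS supports XAttrs: preserve block size + XAttrs
       } else {
         params.add("-pb");  // e.g. s3a lacks XAttr support: block size only
       }
       return params;
     }
   }
   {code}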




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652121)
Remaining Estimate: 0h
Time Spent: 10m

> Don't preserve FileAttribute.XATTR to initialise distcp.
> 
>
> Key: HIVE-25534
> URL: https://issues.apache.org/jira/browse/HIVE-25534
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Remove the preserve-XAttrs option while calling DistCp.
> {code:java}
> 2021-08-23 10:06:18,485 ERROR org.apache.hadoop.tools.DistCp: 
> [HiveServer2-Background-Pool: Thread-73]: XAttrs not supported on at least 
> one file system: 
>  org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
> supported for file system: s3a://hmangla1-dev
>  at 
> org.apache.hadoop.tools.util.DistCpUtils.checkFileSystemXAttrSupport(DistCpUtils.java:513)
>  ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:337) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createJob(DistCp.java:304) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:214) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>  at org.apache.hadoop.tools.DistCp.execute(DistCp.java:193) 
> ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]{code}
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)