[jira] [Updated] (HIVE-25178) Reduce number of getPartition calls during loadDynamicPartitions

2021-05-28 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-25178:

Labels: performance  (was: )

> Reduce number of getPartition calls during loadDynamicPartitions
> 
>
> Key: HIVE-25178
> URL: https://issues.apache.org/jira/browse/HIVE-25178
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
>
> When dynamic partitions are loaded, Hive::loadDynamicPartitions fetches all 
> partitions of the table from HMS, which puts heavy load on it. The load gets 
> worse as the number of partitions in the table grows.
> Instead, only the partitions actually being loaded by the dynamic partition 
> insert need to be queried from HMS to check whether they already exist.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2958]
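
The idea in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Hive change: `PartitionSource`, `listAll`, and `getByNames` are hypothetical stand-ins for a metastore client, and partitions are reduced to their names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RelevantPartitionLookup {

  /** Hypothetical stand-in for a metastore client; not the real HMS API. */
  interface PartitionSource {
    List<String> listAll();                       // fetches every partition of the table
    List<String> getByNames(List<String> names);  // fetches only the named partitions that exist
  }

  /** In-memory source that counts how many partition entries it hands back. */
  static class CountingSource implements PartitionSource {
    private final Set<String> existing;
    int returned = 0;

    CountingSource(Set<String> existing) {
      this.existing = existing;
    }

    @Override
    public List<String> listAll() {
      returned += existing.size();
      return new ArrayList<>(existing);
    }

    @Override
    public List<String> getByNames(List<String> names) {
      List<String> found = new ArrayList<>();
      for (String name : names) {
        if (existing.contains(name)) {
          found.add(name);
        }
      }
      returned += found.size();
      return found;
    }
  }

  /** Checks existence for only the partitions written by the current load. */
  static Set<String> existingAmong(PartitionSource source, List<String> loaded) {
    return new HashSet<>(source.getByNames(loaded));
  }

  public static void main(String[] args) {
    Set<String> table = new HashSet<>();
    for (int i = 0; i < 10_000; i++) {
      table.add("dt=" + i);
    }
    CountingSource source = new CountingSource(table);
    // One partition already exists, one is new.
    List<String> loaded = List.of("dt=42", "dt=10001");
    Set<String> existing = existingAmong(source, loaded);
    System.out.println("existing=" + existing + " entriesFetched=" + source.returned);
  }
}
```

With this shape the amount of data fetched scales with the number of partitions touched by the load rather than with the size of the table.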



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25176) Print DAG ID to Console

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25176?focusedWorklogId=603700=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603700
 ]

ASF GitHub Bot logged work on HIVE-25176:
-

Author: ASF GitHub Bot
Created on: 28/May/21 20:44
Start Date: 28/May/21 20:44
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on pull request #2328:
URL: https://github.com/apache/hive/pull/2328#issuecomment-850660916


   I would love to see this happen. It saves a step, namely finding the DAG ID 
in the HS2 logs for a given Hive query ID, which is something I do all the time 
while troubleshooting.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 603700)
Time Spent: 40m  (was: 0.5h)

> Print DAG ID to Console
> ---
>
> Key: HIVE-25176
> URL: https://issues.apache.org/jira/browse/HIVE-25176
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Would be helpful when troubleshooting.





[jira] [Work logged] (HIVE-25176) Print DAG ID to Console

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25176?focusedWorklogId=603692=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603692
 ]

ASF GitHub Bot logged work on HIVE-25176:
-

Author: ASF GitHub Bot
Created on: 28/May/21 20:17
Start Date: 28/May/21 20:17
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request #2328:
URL: https://github.com/apache/hive/pull/2328


   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 603692)
Time Spent: 0.5h  (was: 20m)

> Print DAG ID to Console
> ---
>
> Key: HIVE-25176
> URL: https://issues.apache.org/jira/browse/HIVE-25176
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Would be helpful when troubleshooting.





[jira] [Work logged] (HIVE-25176) Print DAG ID to Console

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25176?focusedWorklogId=603688=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603688
 ]

ASF GitHub Bot logged work on HIVE-25176:
-

Author: ASF GitHub Bot
Created on: 28/May/21 20:15
Start Date: 28/May/21 20:15
Worklog Time Spent: 10m 
  Work Description: belugabehr closed pull request #2328:
URL: https://github.com/apache/hive/pull/2328


   




Issue Time Tracking
---

Worklog Id: (was: 603688)
Time Spent: 20m  (was: 10m)

> Print DAG ID to Console
> ---
>
> Key: HIVE-25176
> URL: https://issues.apache.org/jira/browse/HIVE-25176
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Would be helpful when troubleshooting.





[jira] [Updated] (HIVE-25177) Add Additional Debugging Help for HBase Reader

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25177:
--
Labels: pull-request-available  (was: )

> Add Additional Debugging Help for HBase Reader
> --
>
> Key: HIVE-25177
> URL: https://issues.apache.org/jira/browse/HIVE-25177
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I recently was wishing I had this data available to me.





[jira] [Work logged] (HIVE-25177) Add Additional Debugging Help for HBase Reader

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25177?focusedWorklogId=603633=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603633
 ]

ASF GitHub Bot logged work on HIVE-25177:
-

Author: ASF GitHub Bot
Created on: 28/May/21 17:28
Start Date: 28/May/21 17:28
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request #2329:
URL: https://github.com/apache/hive/pull/2329


   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 603633)
Remaining Estimate: 0h
Time Spent: 10m

> Add Additional Debugging Help for HBase Reader
> --
>
> Key: HIVE-25177
> URL: https://issues.apache.org/jira/browse/HIVE-25177
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I recently was wishing I had this data available to me.





[jira] [Updated] (HIVE-25177) Add Additional Debugging Help for HBase Reader

2021-05-28 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-25177:
--
Description: I recently was wishing I had this data available to me.

> Add Additional Debugging Help for HBase Reader
> --
>
> Key: HIVE-25177
> URL: https://issues.apache.org/jira/browse/HIVE-25177
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>
> I recently was wishing I had this data available to me.





[jira] [Assigned] (HIVE-25177) Add Additional Debugging Help for HBase Reader

2021-05-28 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor reassigned HIVE-25177:
-


> Add Additional Debugging Help for HBase Reader
> --
>
> Key: HIVE-25177
> URL: https://issues.apache.org/jira/browse/HIVE-25177
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=603611=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603611
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 28/May/21 16:28
Start Date: 28/May/21 16:28
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r641674237



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -162,6 +164,10 @@ public void run() {
 setupMsckPathInvalidation();
 Configuration msckConf = Msck.getMsckConf(conf);
 for (Table table : candidateTables) {
+  if (MetaStoreUtils.isDbBeingFailedOver(msc.getDatabase(table.getCatName(), table.getDbName()))) {

Review comment:
   This might not be doable. Say we cache the db failover property, and 
afterwards the repl.failover.enabled property is set to true for that db. How 
would we then figure out that the cached data is outdated and needs to be fetched again?
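
One way to bound the staleness raised above is to give each cached entry a short time-to-live, so that a changed property is picked up at the latest after that interval. The following is a minimal sketch only: the `FailoverFlagCache` class, the loader standing in for the HMS call, and the injectable clock are all assumptions for illustration, and none of these names come from the pull request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

public class FailoverFlagCache {

  private static final class Entry {
    final boolean value;
    final long loadedAtMs;

    Entry(boolean value, long loadedAtMs) {
      this.value = value;
      this.loadedAtMs = loadedAtMs;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();
  private final Predicate<String> loader;  // hypothetical stand-in for the metastore lookup
  private final LongSupplier clockMs;      // injectable so that expiry can be tested
  private final long ttlMs;

  public FailoverFlagCache(Predicate<String> loader, LongSupplier clockMs, long ttlMs) {
    this.loader = loader;
    this.clockMs = clockMs;
    this.ttlMs = ttlMs;
  }

  /** Returns the cached flag, reloading it once the entry is at least ttlMs old. */
  public synchronized boolean isBeingFailedOver(String dbName) {
    long now = clockMs.getAsLong();
    Entry entry = entries.get(dbName);
    if (entry == null || now - entry.loadedAtMs >= ttlMs) {
      entry = new Entry(loader.test(dbName), now);
      entries.put(dbName, entry);
    }
    return entry.value;
  }

  public static void main(String[] args) {
    long[] now = {0};
    boolean[] flag = {false};
    int[] calls = {0};
    FailoverFlagCache cache = new FailoverFlagCache(
        db -> { calls[0]++; return flag[0]; }, () -> now[0], 1000);

    System.out.println(cache.isBeingFailedOver("db_failover")); // first call loads the value
    flag[0] = true;                                             // property changes in the metastore
    System.out.println(cache.isBeingFailedOver("db_failover")); // still served from the cache
    now[0] = 1000;                                              // entry has expired
    System.out.println(cache.isBeingFailedOver("db_failover")); // reloaded
    System.out.println("loader calls: " + calls[0]);
  }
}
```

The trade-off is the one the comment points at: within the time-to-live the task can still act on a stale value, so this only helps if a short window of staleness is acceptable.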






Issue Time Tracking
---

Worklog Id: (was: 603611)
Time Spent: 50m  (was: 40m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=603609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603609
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 28/May/21 16:25
Start Date: 28/May/21 16:25
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r641672545



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestPartitionManagement.java
##
@@ -620,6 +620,59 @@ public void testNoPartitionDiscoveryForReplTable() throws 
Exception {
 assertEquals(3, partitions.size());
   }
 
+  @Test
+  public void testNoPartitionDiscoveryForFailoverDb() throws Exception {
+String dbName = "db_failover";
+String tableName = "tbl_failover";
+Map colMap = buildAllColumns();
+List partKeys = Lists.newArrayList("state", "dt");
+List partKeyTypes = Lists.newArrayList("string", "date");
+List> partVals = Lists.newArrayList(
+Lists.newArrayList("__HIVE_DEFAULT_PARTITION__", "1990-01-01"),
+Lists.newArrayList("CA", "1986-04-28"),
+Lists.newArrayList("MN", "2018-11-31"));
+createMetadata(DEFAULT_CATALOG_NAME, dbName, tableName, partKeys, 
partKeyTypes, partVals, colMap, false);
+Table table = client.getTable(dbName, tableName);
+List partitions = client.listPartitions(dbName, tableName, 
(short) -1);
+assertEquals(3, partitions.size());
+String tableLocation = table.getSd().getLocation();
+URI location = URI.create(tableLocation);
+Path tablePath = new Path(location);
+FileSystem fs = FileSystem.get(location, conf);
+Path newPart1 = new Path(tablePath, "state=WA/dt=2018-12-01");
+Path newPart2 = new Path(tablePath, "state=UT/dt=2018-12-02");
+fs.mkdirs(newPart1);
+fs.mkdirs(newPart2);
+assertEquals(5, fs.listStatus(tablePath).length);
+partitions = client.listPartitions(dbName, tableName, (short) -1);
+assertEquals(3, partitions.size());
+
+// table property is set to true, but the table is marked as replication 
target. The new
+// partitions should not be created
+
table.getParameters().put(PartitionManagementTask.DISCOVER_PARTITIONS_TBLPROPERTY,
 "true");
+Database db = client.getDatabase(table.getDbName());
+db.putToParameters(ReplConst.REPL_FAILOVER_ENABLED, "true");
+client.alterDatabase(table.getDbName(), db);
+client.alter_table(dbName, tableName, table);

Review comment:
   The alter_table call enables the discover-partitions property for the table.






Issue Time Tracking
---

Worklog Id: (was: 603609)
Time Spent: 40m  (was: 0.5h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=603605=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603605
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 28/May/21 16:19
Start Date: 28/May/21 16:19
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r641668822



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
##
@@ -228,6 +230,15 @@ public static boolean isExternalTable(Table table) {
 return isExternal(params);
   }
 
+  public static boolean isDbBeingFailedOver(Database db) {
+    assert (db != null);
+    Map dbParameters = db.getParameters();
+    if ((dbParameters != null) && (dbParameters.containsKey(ReplConst.REPL_FAILOVER_ENABLED))) {
+      return !StringUtils.isEmpty(dbParameters.get(ReplConst.REPL_FAILOVER_ENABLED));

Review comment:
   Done
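
For readers following the hunk above, the check can be exercised in isolation. The sketch below mirrors the quoted logic over a plain `Map<String, String>` instead of a `Database` object. The class name and the property key string are assumptions for illustration, and the trailing `return false` is inferred because the quoted hunk is cut off.

```java
import java.util.HashMap;
import java.util.Map;

public class FailoverCheckSketch {

  // Assumed key string; the hunk refers to it only as ReplConst.REPL_FAILOVER_ENABLED.
  static final String REPL_FAILOVER_ENABLED = "repl.failover.enabled";

  /** Mirrors the quoted logic: the key must be present and its value must be non-empty. */
  static boolean isDbBeingFailedOver(Map<String, String> dbParameters) {
    if (dbParameters != null && dbParameters.containsKey(REPL_FAILOVER_ENABLED)) {
      String value = dbParameters.get(REPL_FAILOVER_ENABLED);
      return value != null && !value.isEmpty();
    }
    return false; // inferred: the quoted hunk ends before this point
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    System.out.println(isDbBeingFailedOver(params));   // key absent
    params.put(REPL_FAILOVER_ENABLED, "");
    System.out.println(isDbBeingFailedOver(params));   // key present but empty
    params.put(REPL_FAILOVER_ENABLED, "true");
    System.out.println(isDbBeingFailedOver(params));   // key present and non-empty
  }
}
```

Note that, as quoted, the check treats any non-empty value as "being failed over", including the string "false".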






Issue Time Tracking
---

Worklog Id: (was: 603605)
Time Spent: 0.5h  (was: 20m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-25102) Cache Iceberg table objects within same query

2021-05-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér resolved HIVE-25102.
--
Resolution: Fixed

> Cache Iceberg table objects within same query
> -
>
> Key: HIVE-25102
> URL: https://issues.apache.org/jira/browse/HIVE-25102
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> We run Catalogs.loadTable(configuration, props) many times, which is 
> costly.
> We should:
>  - Cache the result, maybe even globally, keyed by the queryId
>  - Make sure that a single query uses one snapshot during its whole 
> execution
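
A minimal sketch of the first bullet, assuming a hypothetical `QueryScopedCache` keyed by query ID and table name. The loader stands in for `Catalogs.loadTable(configuration, props)`; this is an illustration of the caching idea, not the code that was merged.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class QueryScopedCache<T> {

  // queryId -> (tableName -> loaded table object)
  private final Map<String, Map<String, T>> byQuery = new ConcurrentHashMap<>();

  /** Loads the table at most once per (queryId, tableName) pair. */
  public T get(String queryId, String tableName, Supplier<T> loader) {
    return byQuery
        .computeIfAbsent(queryId, q -> new ConcurrentHashMap<>())
        .computeIfAbsent(tableName, t -> loader.get());
  }

  /** Drops everything cached for a query once it has finished. */
  public void endQuery(String queryId) {
    byQuery.remove(queryId);
  }

  public static void main(String[] args) {
    int[] loads = {0};
    QueryScopedCache<String> cache = new QueryScopedCache<>();
    Supplier<String> loader = () -> "table-object-" + (++loads[0]);

    String first = cache.get("query-1", "db.tbl", loader);
    String second = cache.get("query-1", "db.tbl", loader);  // served from the cache
    String other = cache.get("query-2", "db.tbl", loader);   // different query, loaded again

    System.out.println(first + " " + second + " " + other + " loads=" + loads[0]);
    cache.endQuery("query-1");
  }
}
```

Reusing one loaded object for the lifetime of a query is also one possible way to approach the second bullet, under the assumption that the loaded object is what pins the snapshot.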





[jira] [Commented] (HIVE-25102) Cache Iceberg table objects within same query

2021-05-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-25102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353437#comment-17353437
 ] 

László Pintér commented on HIVE-25102:
--

Merged into master. Thanks, [~Marton Bod] and [~pvary] for the review!

> Cache Iceberg table objects within same query
> -
>
> Key: HIVE-25102
> URL: https://issues.apache.org/jira/browse/HIVE-25102
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> We run Catalogs.loadTable(configuration, props) many times, which is 
> costly.
> We should:
>  - Cache the result, maybe even globally, keyed by the queryId
>  - Make sure that a single query uses one snapshot during its whole 
> execution





[jira] [Work logged] (HIVE-25102) Cache Iceberg table objects within same query

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25102?focusedWorklogId=603596=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603596
 ]

ASF GitHub Bot logged work on HIVE-25102:
-

Author: ASF GitHub Bot
Created on: 28/May/21 16:07
Start Date: 28/May/21 16:07
Worklog Time Spent: 10m 
  Work Description: lcspinter merged pull request #2261:
URL: https://github.com/apache/hive/pull/2261


   




Issue Time Tracking
---

Worklog Id: (was: 603596)
Time Spent: 9h  (was: 8h 50m)

> Cache Iceberg table objects within same query
> -
>
> Key: HIVE-25102
> URL: https://issues.apache.org/jira/browse/HIVE-25102
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> We run Catalogs.loadTable(configuration, props) many times, which is 
> costly.
> We should:
>  - Cache the result, maybe even globally, keyed by the queryId
>  - Make sure that a single query uses one snapshot during its whole 
> execution





[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=603584=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603584
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 28/May/21 15:51
Start Date: 28/May/21 15:51
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r640356534



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -99,6 +99,15 @@ public boolean isTargetOfReplication(Database db) {
 return false;
   }
 
+  public static boolean isBeingFailovedOver(Database db) {

Review comment:
   We can move this to some util class to avoid duplication

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -162,6 +164,10 @@ public void run() {
 setupMsckPathInvalidation();
 Configuration msckConf = Msck.getMsckConf(conf);
 for (Table table : candidateTables) {
+  if (MetaStoreUtils.isDbBeingFailedOver(msc.getDatabase(table.getCatName(), table.getDbName()))) {

Review comment:
   This is going to be costly. One HMS call per table. Can we maintain a 
cache?
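
A minimal sketch of what such a cache could look like, assuming it lives only for one pass over the candidate tables so that the next run starts from fresh metastore state. `Table`, `lookupFailover`, and `tablesToProcess` are hypothetical simplifications for illustration, not code from the pull request.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class PerRunDbCache {

  /** Simplified table handle; the real object also carries a catalog name. */
  static final class Table {
    final String dbName;
    final String tableName;

    Table(String dbName, String tableName) {
      this.dbName = dbName;
      this.tableName = tableName;
    }
  }

  /**
   * Filters out tables whose database is being failed over, calling
   * lookupFailover (a stand-in for the HMS call) at most once per database.
   */
  static List<Table> tablesToProcess(List<Table> candidates, Predicate<String> lookupFailover) {
    Map<String, Boolean> failoverByDb = new HashMap<>(); // discarded when the run ends
    List<Table> result = new ArrayList<>();
    for (Table table : candidates) {
      boolean beingFailedOver = failoverByDb.computeIfAbsent(table.dbName, lookupFailover::test);
      if (!beingFailedOver) {
        result.add(table);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Table> candidates = List.of(
        new Table("sales", "orders"),
        new Table("sales", "customers"),
        new Table("db_failover", "tbl_failover"));
    int[] calls = {0};
    List<Table> kept = tablesToProcess(candidates, db -> {
      calls[0]++;
      return db.equals("db_failover");
    });
    System.out.println("kept=" + kept.size() + " lookups=" + calls[0]);
  }
}
```

Scoping the map to a single run keeps the number of lookups at one per database instead of one per table, while limiting how long a stale value can survive to the duration of that run.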

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -99,6 +99,15 @@ public boolean isTargetOfReplication(Database db) {
 return false;
   }
 
+  public static boolean isBeingFailovedOver(Database db) {

Review comment:
   nit: Typo

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestPartitionManagement.java
##
@@ -659,6 +712,45 @@ public void testNoPartitionRetentionForReplTarget() throws 
TException, Interrupt
 assertEquals(3, partitions.size());
   }
 
+  @Test
+  public void testNoPartitionRetentionForFailoverDb() throws TException, 
InterruptedException {
+String dbName = "db_failover";
+String tableName = "tbl_failover";
+Map colMap = buildAllColumns();
+List partKeys = Lists.newArrayList("state", "dt");
+List partKeyTypes = Lists.newArrayList("string", "date");
+List> partVals = Lists.newArrayList(
+Lists.newArrayList("__HIVE_DEFAULT_PARTITION__", "1990-01-01"),
+Lists.newArrayList("CA", "1986-04-28"),
+Lists.newArrayList("MN", "2018-11-31"));
+// Check for the existence of partitions 10 seconds after the partition 
retention period has
+// elapsed. Gives enough time for the partition retention task to work.
+long partitionRetentionPeriodMs = 2;
+long waitingPeriodForTest = partitionRetentionPeriodMs + 10 * 1000;
+createMetadata(DEFAULT_CATALOG_NAME, dbName, tableName, partKeys, 
partKeyTypes, partVals, colMap, false);
+Table table = client.getTable(dbName, tableName);
+List partitions = client.listPartitions(dbName, tableName, 
(short) -1);
+assertEquals(3, partitions.size());
+
+
table.getParameters().put(PartitionManagementTask.DISCOVER_PARTITIONS_TBLPROPERTY,
 "true");
+
table.getParameters().put(PartitionManagementTask.PARTITION_RETENTION_PERIOD_TBLPROPERTY,
+partitionRetentionPeriodMs + "ms");
+client.alter_table(dbName, tableName, table);
+Database db = client.getDatabase(table.getDbName());
+db.putToParameters(ReplConst.REPL_FAILOVER_ENABLED, "true");

Review comment:
   Maybe we can cover both cases, with and without 
ReplConst.REPL_FAILOVER_ENABLED, in the same test.

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestPartitionManagement.java
##
@@ -620,6 +620,59 @@ public void testNoPartitionDiscoveryForReplTable() throws 
Exception {
 assertEquals(3, partitions.size());
   }
 
+  @Test
+  public void testNoPartitionDiscoveryForFailoverDb() throws Exception {
+String dbName = "db_failover";
+String tableName = "tbl_failover";
+Map colMap = buildAllColumns();
+List partKeys = Lists.newArrayList("state", "dt");
+List partKeyTypes = Lists.newArrayList("string", "date");
+List> partVals = Lists.newArrayList(
+Lists.newArrayList("__HIVE_DEFAULT_PARTITION__", "1990-01-01"),
+Lists.newArrayList("CA", "1986-04-28"),
+Lists.newArrayList("MN", "2018-11-31"));
+createMetadata(DEFAULT_CATALOG_NAME, dbName, tableName, partKeys, 
partKeyTypes, partVals, colMap, false);
+Table table = client.getTable(dbName, tableName);
+List partitions = client.listPartitions(dbName, tableName, 
(short) -1);
+assertEquals(3, partitions.size());
+String tableLocation = 

[jira] [Work logged] (HIVE-25176) Print DAG ID to Console

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25176?focusedWorklogId=603568=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603568
 ]

ASF GitHub Bot logged work on HIVE-25176:
-

Author: ASF GitHub Bot
Created on: 28/May/21 15:26
Start Date: 28/May/21 15:26
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request #2328:
URL: https://github.com/apache/hive/pull/2328


   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 603568)
Remaining Estimate: 0h
Time Spent: 10m

> Print DAG ID to Console
> ---
>
> Key: HIVE-25176
> URL: https://issues.apache.org/jira/browse/HIVE-25176
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Would be helpful when troubleshooting.





[jira] [Updated] (HIVE-25176) Print DAG ID to Console

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25176:
--
Labels: pull-request-available  (was: )

> Print DAG ID to Console
> ---
>
> Key: HIVE-25176
> URL: https://issues.apache.org/jira/browse/HIVE-25176
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Would be helpful when troubleshooting.





[jira] [Updated] (HIVE-25176) Print DAG ID to Console

2021-05-28 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-25176:
--
Description: Would be helpful when troubleshooting.

> Print DAG ID to Console
> ---
>
> Key: HIVE-25176
> URL: https://issues.apache.org/jira/browse/HIVE-25176
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>
> Would be helpful when troubleshooting.





[jira] [Assigned] (HIVE-25176) Print DAG ID to Console

2021-05-28 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor reassigned HIVE-25176:
-


> Print DAG ID to Console
> ---
>
> Key: HIVE-25176
> URL: https://issues.apache.org/jira/browse/HIVE-25176
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>






[jira] [Comment Edited] (HIVE-287) support count(*) and count distinct on multiple columns

2021-05-28 Thread Dave Seth (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353307#comment-17353307
 ] 

Dave Seth edited comment on HIVE-287 at 5/28/21, 11:58 AM:
---


> support count(*) and count distinct on multiple columns
> ---
>
> Key: HIVE-287
> URL: https://issues.apache.org/jira/browse/HIVE-287
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Namit Jain
>Assignee: Arvind Prabhakar
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, 
> HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch, 
> HIVE-287-6-branch-0.6.patch, HIVE-287-6-trunk.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl





[jira] [Comment Edited] (HIVE-287) support count(*) and count distinct on multiple columns

2021-05-28 Thread Dave Seth (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353307#comment-17353307
 ] 

Dave Seth edited comment on HIVE-287 at 5/28/21, 11:57 AM:
---


> support count(*) and count distinct on multiple columns
> ---
>
> Key: HIVE-287
> URL: https://issues.apache.org/jira/browse/HIVE-287
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Namit Jain
>Assignee: Arvind Prabhakar
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, 
> HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch, 
> HIVE-287-6-branch-0.6.patch, HIVE-287-6-trunk.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl





[jira] [Commented] (HIVE-287) support count(*) and count distinct on multiple columns

2021-05-28 Thread Dave Seth (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353307#comment-17353307
 ] 

Dave Seth commented on HIVE-287:



> support count(*) and count distinct on multiple columns
> ---
>
> Key: HIVE-287
> URL: https://issues.apache.org/jira/browse/HIVE-287
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Namit Jain
>Assignee: Arvind Prabhakar
>Priority: Major
> Fix For: 0.6.0
>
> Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, 
> HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch, 
> HIVE-287-6-branch-0.6.patch, HIVE-287-6-trunk.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl





[jira] [Commented] (HIVE-25169) using coalesce via vector,source column type is int and target column type is bigint,the result of target is zero

2021-05-28 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353306#comment-17353306
 ] 

Panagiotis Garefalakis commented on HIVE-25169:
---

Hey [~junnan.yang], thanks for reporting this! Would it make sense to backport 
the ticket that resolved this from master?
On a general note, it would be much easier to review this with a GitHub PR and a 
test case.

Cheers


> using coalesce via vector,source column type is int and target column type is 
> bigint,the result of target is zero
> -
>
> Key: HIVE-25169
> URL: https://issues.apache.org/jira/browse/HIVE-25169
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.2
>Reporter: junnan.yang
>Priority: Major
> Attachments: HIVE-25169.01.patch
>
>
> sourceTable:
>     product_id int;
> ###
> targetTable:
>     product_id bigint;
> ##
> sql: 
>     insert overwrite table targetTable
>     select 
>     ..
>      coalesce(product_id,-1),
>     ..
>     from sourceTable;
> ##
> explain sql :
>      UDFToLong(COALESCE(product_id,-1)) (type: bigint)
> ##
> result :
>      the product_id column in targetTable is zero, which is the wrong result
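
For reference, the row-by-row result one would expect from `UDFToLong(COALESCE(product_id,-1))` can be sketched in plain Java. This is only an illustration of the expected semantics, not Hive's vectorized implementation; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class CoalesceWidening {
    /**
     * Expected semantics of UDFToLong(COALESCE(productId, fallback)):
     * take the first non-null int, then widen it to long.
     * Hypothetical helper for illustration only.
     */
    public static long coalesceToLong(Integer productId, int fallback) {
        int chosen = (productId != null) ? productId : fallback;
        return (long) chosen; // widening int -> long preserves the value
    }

    public static void main(String[] args) {
        // Sample int values standing in for sourceTable.product_id
        List<Integer> source = Arrays.asList(42, null, 7);
        for (Integer id : source) {
            System.out.println(coalesceToLong(id, -1));
        }
    }
}
```

Under these semantics a non-null `product_id` keeps its value and a null becomes -1, so a zero in `targetTable` would only be correct for a source value of zero.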





[jira] [Work logged] (HIVE-25174) HiveMetastoreAuthorizer didn't check URI permission for AlterTableEvent

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25174?focusedWorklogId=603418=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603418
 ]

ASF GitHub Bot logged work on HIVE-25174:
-

Author: ASF GitHub Bot
Created on: 28/May/21 07:39
Start Date: 28/May/21 07:39
Worklog Time Spent: 10m 
  Work Description: symious opened a new pull request #2327:
URL: https://github.com/apache/hive/pull/2327


   ### What changes were proposed in this pull request?
   When using Ranger on Hive MetaStore, we hit an issue where users without 
permission to the table's HDFS path succeeded in running "msck repair table 
TABLENAME".
   
   This command is rejected when we use `StorageBasedAuthorizer`. After 
checking the code, we found that `StorageBasedAuthorizer` checks the permission 
of the table's HDFS path, while `HiveMetastoreAuthorizer`, used by Ranger, does 
not when handling an `AlterTableEvent`.
   
   This ticket adds the URI permission check on `AlterTableEvent` for 
`HiveMetastoreAuthorizer`.
   
   
   ### Why are the changes needed?
   When using `StorageBasedAuthorizer`, the `msck repair table` command 
fails if the user doesn't have write permission to the table's path. But 
when using `HiveMetastoreAuthorizer` with Ranger, the command succeeds 
even if the user doesn't have write permission to the table's path.
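
A minimal sketch of the kind of check described above, assuming a simplified model: `UriPolicy`, `AlterTableRequest`, and the HDFS path are hypothetical stand-ins for illustration and are not the actual Hive or Ranger classes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class AlterTableUriCheck {
    /** Hypothetical stand-in for a policy engine (Ranger in the PR); not a real API. */
    public interface UriPolicy {
        boolean canWrite(String user, String uri);
    }

    /** Hypothetical, stripped-down alter-table event carrying only what the check needs. */
    public static final class AlterTableRequest {
        final String user;
        final String tableLocation;

        public AlterTableRequest(String user, String tableLocation) {
            this.user = user;
            this.tableLocation = tableLocation;
        }
    }

    /** Rejects the request unless the user may write to the table's storage path. */
    public static void authorize(AlterTableRequest request, UriPolicy policy) {
        if (!policy.canWrite(request.user, request.tableLocation)) {
            throw new SecurityException(
                "User " + request.user + " has no write permission on " + request.tableLocation);
        }
    }

    public static void main(String[] args) {
        // Made-up grants: only alice may write to the table's path.
        Map<String, Set<String>> grants = new HashMap<>();
        grants.put("alice", Collections.singleton("hdfs://nn/warehouse/yiyang_people"));
        UriPolicy policy = (user, uri) ->
            grants.getOrDefault(user, Collections.<String>emptySet()).contains(uri);

        authorize(new AlterTableRequest("alice", "hdfs://nn/warehouse/yiyang_people"), policy);
        System.out.println("alice: allowed");
        try {
            authorize(new AlterTableRequest("bob", "hdfs://nn/warehouse/yiyang_people"), policy);
            System.out.println("bob: allowed");
        } catch (SecurityException e) {
            System.out.println("bob: denied");
        }
    }
}
```

The point of the sketch is only that the storage path is checked as part of authorizing the alter-table event, so a user without write permission on the path is rejected before the metadata change is applied.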
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Can be manually tested with the `alter table` command. Ranger needs to be 
set as the authorizer for Hive MetaStore. Before the test, ensure the test user 
doesn't have write permission on the table's path.
   * before applying patch
   ```
   spark-sql> alter table yiyang_people add columns(id int);
   Time taken: 2.379 seconds
   21/05/28 15:33:17 INFO SparkSQLCLIDriver: Time taken: 2.379 seconds
   spark-sql>
   ```
   * after applying patch
   ```
   spark-sql> alter table yiyang_people add columns(id int);
   21/05/28 15:30:59 WARN HiveExternalCatalog: Could not alter schema of table 
`default`.`yiyang_people` in a Hive compatible way. Updating Hive metastore in 
Spark SQL specific format.
   java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at org.apache.spark.sql.hive.client.Shim_v0_12.alterTable(HiveShim.scala:400)
   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTableDataSchema$1.apply$mcV$sp(HiveClientImpl.scala:536)
   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTableDataSchema$1.apply(HiveClientImpl.scala:515)
   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$alterTableDataSchema$1.apply(HiveClientImpl.scala:515)
   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:277)
   at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:215)
   at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:214)
   at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:260)
   at org.apache.spark.sql.hive.client.HiveClientImpl.alterTableDataSchema(HiveClientImpl.scala:515)
   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$alterTableDataSchema$1.apply$mcV$sp(HiveExternalCatalog.scala:664)
   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$alterTableDataSchema$1.apply(HiveExternalCatalog.scala:650)
   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$alterTableDataSchema$1.apply(HiveExternalCatalog.scala:650)
   at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
   at org.apache.spark.sql.hive.HiveExternalCatalog.alterTableDataSchema(HiveExternalCatalog.scala:650)
   at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.alterTableDataSchema(ExternalCatalogWithListener.scala:124)
   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.alterTableDataSchema(SessionCatalog.scala:391)
   at org.apache.spark.sql.execution.command.AlterTableAddColumnsCommand.run(tables.scala:203)
   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   at 

[jira] [Updated] (HIVE-25174) HiveMetastoreAuthorizer didn't check URI permission for AlterTableEvent

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25174:
--
Labels: pull-request-available  (was: )

> HiveMetastoreAuthorizer didn't check URI permission for AlterTableEvent
> ---
>
> Key: HIVE-25174
> URL: https://issues.apache.org/jira/browse/HIVE-25174
> Project: Hive
>  Issue Type: Improvement
>Reporter: Janus Chow
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When using Ranger on Hive MetaStore, we hit an issue where users without 
> permission to the table's HDFS path succeeded in running "msck repair table 
> TABLENAME".
> This command is rejected when we use `StorageBasedAuthorizer`. After 
> checking the code, we found that `StorageBasedAuthorizer` checks the 
> permission of the table's HDFS path, while `HiveMetastoreAuthorizer`, used by 
> Ranger, does not when handling an `AlterTableEvent`.
> This ticket adds the URI permission check on `AlterTableEvent` for 
> `HiveMetastoreAuthorizer`.





[jira] [Work logged] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm

2021-05-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25173?focusedWorklogId=603404=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603404
 ]

ASF GitHub Bot logged work on HIVE-25173:
-

Author: ASF GitHub Bot
Created on: 28/May/21 06:01
Start Date: 28/May/21 06:01
Worklog Time Spent: 10m 
  Work Description: iwasakims commented on pull request #2326:
URL: https://github.com/apache/hive/pull/2326#issuecomment-850159676


   There are 3 test failures.
   
   * Testing / split-18 / PostProcess / 
testForcedLocalityMultiplePreemptionsSameHost1 – 
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService
   * Testing / split-10 / PostProcess / testExternalDefaultPaths – 
org.apache.hadoop.hive.ql.TestWarehouseExternalDir
   * Testing / split-10 / PostProcess / – 
org.apache.hadoop.hive.ql.TestWarehouseExternalDir
   
   I could not reproduce the failures locally. They look unrelated to the 
patch.
   
   ```
   $ cd ~/srcs/hive/llap-tez
   $ mvn test -Dtest=TestLlapTaskSchedulerService
   ...
   [INFO] ---
   [INFO]  T E S T S
   [INFO] ---
   [INFO] Running org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService
   [INFO] Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.162 s - in org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService
   ```
   ```
   $ cd ~/srcs/hive/itests
   $ mvn install -DskipTests
   $ cd hive-unit
   $ mvn test -Dtest=TestWarehouseExternalDir
   ...
   [INFO] Running org.apache.hadoop.hive.ql.TestWarehouseExternalDir
   [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.955 s - in org.apache.hadoop.hive.ql.TestWarehouseExternalDir
   ```
   




Issue Time Tracking
---

Worklog Id: (was: 603404)
Time Spent: 20m  (was: 10m)

> Fix build failure of hive-pre-upgrade due to missing dependency on 
> pentaho-aggdesigner-algorithm
> 
>
> Key: HIVE-25173
> URL: https://issues.apache.org/jira/browse/HIVE-25173
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.2
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> [ERROR] Failed to execute goal on project hive-pre-upgrade: Could not resolve 
> dependencies for project org.apache.hive:hive-pre-upgrade:jar:4.0.0-SNAPSHOT: 
> Failure to find org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced
> {noformat}


