[jira] [Work logged] (HIVE-23935) Fetching primaryKey through beeline fails with NPE

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23935?focusedWorklogId=504411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504411
 ]

ASF GitHub Bot logged work on HIVE-23935:
-

Author: ASF GitHub Bot
Created on: 24/Oct/20 04:35
Start Date: 24/Oct/20 04:35
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on a change in pull request #1605:
URL: https://github.com/apache/hive/pull/1605#discussion_r511318476



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
##
@@ -1124,6 +1125,40 @@ public void testEmptyTrustStoreProps() {
 setAndCheckSSLProperties(true, "", "", "jks");
   }
 
+  /**
+   * Tests getPrimaryKeys() when db_name isn't specified.
+   */
+  @Test
+  public void testGetPrimaryKeys() throws Exception {

Review comment:
   The change itself is in ObjectStore, and the API is in ObjectStore too. 
`TestPrimaryKey` is a parametrised test, and for one of its cases this issue 
doesn't occur. I think the test is better placed here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504411)
Time Spent: 1.5h  (was: 1h 20m)

> Fetching primaryKey through beeline fails with NPE
> --
>
> Key: HIVE-23935
> URL: https://issues.apache.org/jira/browse/HIVE-23935
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Fetching the primary key of a table through the Beeline !primarykeys command fails with an NPE
> {noformat}
> 0: jdbc:hive2://localhost:1> !primarykeys Persons
> Error: MetaException(message:java.lang.NullPointerException) (state=,code=0)
> org.apache.hive.service.cli.HiveSQLException: 
> MetaException(message:java.lang.NullPointerException)
>   at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:360)
>   at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:351)
>   at 
> org.apache.hive.jdbc.HiveDatabaseMetaData.getPrimaryKeys(HiveDatabaseMetaData.java:573)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hive.beeline.Reflector.invoke(Reflector.java:89)
>   at org.apache.hive.beeline.Commands.metadata(Commands.java:125)
>   at org.apache.hive.beeline.Commands.primarykeys(Commands.java:231)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:57)
>   at 
> org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1465)
>   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1504)
>   at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1364)
>   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1134)
>   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1082)
>   at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546)
>   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23935) Fetching primaryKey through beeline fails with NPE

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23935?focusedWorklogId=504410&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504410
 ]

ASF GitHub Bot logged work on HIVE-23935:
-

Author: ASF GitHub Bot
Created on: 24/Oct/20 04:07
Start Date: 24/Oct/20 04:07
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on a change in pull request #1605:
URL: https://github.com/apache/hive/pull/1605#discussion_r511305357



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -10833,7 +10833,8 @@ public FileMetadataHandler getFileMetadataHandler(FileMetadataExprType type) {
  final String db_name_input,
  final String tbl_name_input)
   throws MetaException, NoSuchObjectException {
-final String db_name = normalizeIdentifier(db_name_input);
+final String db_name =

Review comment:
   Thanx @ashish-kumar-sharma 
   1. I will address this.
   2. Why for catName? It isn't accessed anywhere that could trigger an NPE.
   3. I didn't catch this one. I haven't introduced any new variable myself.
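The diff above only shows the start of the changed line, so as a minimal, hypothetical sketch of the null guard being discussed (the `normalizeIdentifier` body here is a stand-in for the metastore helper, assumed to lower-case and trim an identifier):

```java
// Sketch of the guarded normalization discussed in this review thread.
// normalizeIdentifier below is a local stand-in, NOT the actual
// MetaStoreUtils implementation; only the null-guard pattern is the point.
public class NullSafeNormalize {

    // Stand-in helper; assumes trim + lower-case semantics.
    static String normalizeIdentifier(String identifier) {
        return identifier.trim().toLowerCase();
    }

    // Guarded form: only normalize when a db name was actually supplied,
    // so a missing db_name no longer triggers an NPE inside the helper.
    static String normalizeOrNull(String dbNameInput) {
        return dbNameInput != null ? normalizeIdentifier(dbNameInput) : null;
    }

    public static void main(String[] args) {
        System.out.println(normalizeOrNull("Default")); // default
        System.out.println(normalizeOrNull(null));      // null
    }
}
```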





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504410)
Time Spent: 1h 20m  (was: 1h 10m)

> Fetching primaryKey through beeline fails with NPE
> --
>
> Key: HIVE-23935
> URL: https://issues.apache.org/jira/browse/HIVE-23935
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Fetching the primary key of a table through the Beeline !primarykeys command fails with an NPE



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23935) Fetching primaryKey through beeline fails with NPE

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23935?focusedWorklogId=504409&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504409
 ]

ASF GitHub Bot logged work on HIVE-23935:
-

Author: ASF GitHub Bot
Created on: 24/Oct/20 03:53
Start Date: 24/Oct/20 03:53
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1605:
URL: https://github.com/apache/hive/pull/1605#discussion_r511299604



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -10833,7 +10833,8 @@ public FileMetadataHandler getFileMetadataHandler(FileMetadataExprType type) {
  final String db_name_input,
  final String tbl_name_input)
   throws MetaException, NoSuchObjectException {
-final String db_name = normalizeIdentifier(db_name_input);
+final String db_name =

Review comment:
   1. Can we use StringUtils.isNotBlank(db_name_input) instead of 
(db_name_input != null)?
   2. Can we also have the same check on catName?
   3. Can we use a unified camel-case naming convention across variable names?
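For clarity, the difference between the two checks in point 1 can be sketched as follows. `isNotBlank` here re-implements the semantics of `org.apache.commons.lang3.StringUtils.isNotBlank` locally so the example is self-contained: it rejects null, the empty string, and whitespace-only strings, whereas a plain null check lets `""` and `"   "` through to normalization.

```java
// Contrast of (s != null) vs an isNotBlank-style check.
// isNotBlank mirrors commons-lang3 semantics but is a local re-implementation.
public class BlankCheck {

    static boolean isNotBlank(CharSequence cs) {
        if (cs == null || cs.length() == 0) {
            return false;
        }
        for (int i = 0; i < cs.length(); i++) {
            if (!Character.isWhitespace(cs.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isNotBlank(null));     // false
        System.out.println(isNotBlank("   "));    // false: a bare null check would pass this
        System.out.println(isNotBlank("hivedb")); // true
    }
}
```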

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
##
@@ -1124,6 +1125,40 @@ public void testEmptyTrustStoreProps() {
 setAndCheckSSLProperties(true, "", "", "jks");
   }
 
+  /**
+   * Tests getPrimaryKeys() when db_name isn't specified.
+   */
+  @Test
+  public void testGetPrimaryKeys() throws Exception {

Review comment:
   Please add this Test to class TestPrimaryKey.java





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504409)
Time Spent: 1h 10m  (was: 1h)

> Fetching primaryKey through beeline fails with NPE
> --
>
> Key: HIVE-23935
> URL: https://issues.apache.org/jira/browse/HIVE-23935
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Fetching the primary key of a table through the Beeline !primarykeys command fails with an NPE

[jira] [Updated] (HIVE-24310) Allow specified number of deserialize errors to be ignored

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24310:
--
Labels: pull-request-available  (was: )

> Allow specified number of deserialize errors to be ignored
> --
>
> Key: HIVE-24310
> URL: https://issues.apache.org/jira/browse/HIVE-24310
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Sometimes a user's raw data contains corrupted records, for example a single 
> corrupted record in a file of thousands. Today the user must either give up 
> all of the records or replay the whole dataset to run successfully on Hive, 
> so we should provide a way to ignore such corrupted records. 
>  
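The behaviour proposed above can be sketched as a simple error budget: tolerate up to a configured number of deserialization failures before failing the task. This is a hypothetical illustration; the record type, threshold name, and exception are placeholders, not the actual Hive SerDe classes or config keys.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "ignore up to N deserialize errors": corrupted
// records are skipped while the error budget lasts, then the job fails.
public class TolerantDeserializer {
    private final int maxAllowedErrors;
    private int errorCount = 0;

    TolerantDeserializer(int maxAllowedErrors) {
        this.maxAllowedErrors = maxAllowedErrors;
    }

    // Returns the parsed value, or null for a corrupted record while the
    // budget lasts; throws once the budget is exhausted.
    Integer deserialize(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            errorCount++;
            if (errorCount > maxAllowedErrors) {
                throw new RuntimeException(
                    "Exceeded " + maxAllowedErrors + " deserialize errors", e);
            }
            return null; // skip the corrupted record
        }
    }

    public static void main(String[] args) {
        TolerantDeserializer d = new TolerantDeserializer(1);
        List<Integer> ok = new ArrayList<>();
        for (String raw : new String[]{"1", "oops", "3"}) {
            Integer v = d.deserialize(raw);
            if (v != null) ok.add(v);
        }
        System.out.println(ok); // [1, 3]
    }
}
```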



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24310) Allow specified number of deserialize errors to be ignored

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24310?focusedWorklogId=504402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504402
 ]

ASF GitHub Bot logged work on HIVE-24310:
-

Author: ASF GitHub Bot
Created on: 24/Oct/20 02:44
Start Date: 24/Oct/20 02:44
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1607:
URL: https://github.com/apache/hive/pull/1607


   
   
   ### What changes were proposed in this pull request?
   Allow specified number of deserialize errors to be ignored
   
   
   
   ### Why are the changes needed?
   Sometimes a user's raw data contains corrupted records, for example a single 
corrupted record in a file of thousands. Today the user must either give up all 
of the records or replay the whole dataset to run successfully on Hive, so we 
should provide a way to ignore such corrupted records. 
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   
   ### How was this patch tested?
   unit tests
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504402)
Remaining Estimate: 0h
Time Spent: 10m

> Allow specified number of deserialize errors to be ignored
> --
>
> Key: HIVE-24310
> URL: https://issues.apache.org/jira/browse/HIVE-24310
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Sometimes a user's raw data contains corrupted records, for example a single 
> corrupted record in a file of thousands. Today the user must either give up 
> all of the records or replay the whole dataset to run successfully on Hive, 
> so we should provide a way to ignore such corrupted records. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24310) Allow specified number of deserialize errors to be ignored

2020-10-23 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-24310:
--


> Allow specified number of deserialize errors to be ignored
> --
>
> Key: HIVE-24310
> URL: https://issues.apache.org/jira/browse/HIVE-24310
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>
> Sometimes a user's raw data contains corrupted records, for example a single 
> corrupted record in a file of thousands. Today the user must either give up 
> all of the records or replay the whole dataset to run successfully on Hive, 
> so we should provide a way to ignore such corrupted records. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24032) Remove hadoop shims dependency and use FileSystem Api directly from standalone metastore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24032?focusedWorklogId=504388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504388
 ]

ASF GitHub Bot logged work on HIVE-24032:
-

Author: ASF GitHub Bot
Created on: 24/Oct/20 00:58
Start Date: 24/Oct/20 00:58
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1396:
URL: https://github.com/apache/hive/pull/1396#issuecomment-715647392


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504388)
Time Spent: 1.5h  (was: 1h 20m)

> Remove hadoop shims dependency and use FileSystem Api directly from 
> standalone metastore
> 
>
> Key: HIVE-24032
> URL: https://issues.apache.org/jira/browse/HIVE-24032
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24032.01.patch, HIVE-24032.02.patch, 
> HIVE-24032.03.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Remove hadoop shims dependency from standalone metastore. 
> Rename hive.repl.data.copy.lazy hive conf to 
> hive.repl.run.data.copy.tasks.on.target



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23926) Flaky test TestTableLevelReplicationScenarios.testRenameTableScenariosWithReplacePolicyDMLOperattion

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23926?focusedWorklogId=504386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504386
 ]

ASF GitHub Bot logged work on HIVE-23926:
-

Author: ASF GitHub Bot
Created on: 24/Oct/20 00:58
Start Date: 24/Oct/20 00:58
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1420:
URL: https://github.com/apache/hive/pull/1420#issuecomment-715647384


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504386)
Time Spent: 20m  (was: 10m)

> Flaky test 
> TestTableLevelReplicationScenarios.testRenameTableScenariosWithReplacePolicyDMLOperattion
> 
>
> Key: HIVE-23926
> URL: https://issues.apache.org/jira/browse/HIVE-23926
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23926.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/123/testReport/org.apache.hadoop.hive.ql.parse/TestTableLevelReplicationScenarios/Testing___split_18___Archive___testRenameTableScenariosWithReplacePolicyDMLOperattion/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24304) Query containing UNION fails with OOM

2020-10-23 Thread Vineet Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg resolved HIVE-24304.

Fix Version/s: 4.0.0
   Resolution: Fixed

Pushed to master.

> Query containing UNION fails with OOM
> -
>
> Key: HIVE-24304
> URL: https://issues.apache.org/jira/browse/HIVE-24304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24304) Query containing UNION fails with OOM

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24304?focusedWorklogId=504378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504378
 ]

ASF GitHub Bot logged work on HIVE-24304:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 23:54
Start Date: 23/Oct/20 23:54
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 merged pull request #1600:
URL: https://github.com/apache/hive/pull/1600


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504378)
Time Spent: 0.5h  (was: 20m)

> Query containing UNION fails with OOM
> -
>
> Key: HIVE-24304
> URL: https://issues.apache.org/jira/browse/HIVE-24304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24294) TezSessionPool sessions can throw AssertionError

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24294?focusedWorklogId=504344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504344
 ]

ASF GitHub Bot logged work on HIVE-24294:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 22:06
Start Date: 23/Oct/20 22:06
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on pull request #1596:
URL: https://github.com/apache/hive/pull/1596#issuecomment-715610964


   LGTM +1



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504344)
Time Spent: 20m  (was: 10m)

> TezSessionPool sessions can throw AssertionError
> 
>
> Key: HIVE-24294
> URL: https://issues.apache.org/jira/browse/HIVE-24294
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Whenever default TezSessionPool sessions are reopened for some reason, we set 
> dagResources to null before close() and set it back in open():
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L498-L503
> If sessionState.close() throws an exception, we do not restore dagResources 
> but still move the session back to the TezSessionPool. For example, the 
> exception trace when sessionState.close() failed:
> {code:java}
> 2020-10-15T09:20:28,749 INFO  [HiveServer2-Background-Pool: Thread-25451]: 
> client.TezClient (:()) - Failed to shutdown Tez Session via proxy
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1602093123456_12345, yarnApplicationState=FINISHED, 
> finalApplicationStatus=SUCCEEDED, 
> trackingUrl=http://localhost:8088/proxy/application_1602093123456_12345/, 
> diagnostics=Session timed out, lastDAGCompletionTime=1602997683786 ms, 
> sessionTimeoutInterval=60 ms
> Session stats:submittedDAGs=2, successfulDAGs=2, failedDAGs=0, killedDAGs=0   
>  at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) 
> at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1060) 
> at org.apache.tez.client.TezClient.stop(TezClient.java:743) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:789)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:756)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.close(TezSessionPoolSession.java:111)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopenInternal(TezSessionPoolManager.java:496)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopen(TezSessionPoolManager.java:487)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.reopen(TezSessionPoolSession.java:228)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.getNewTezSessionOnError(TezTask.java:531)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:546) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:221){code}
> Because of this, all new queries using the corrupted session fail with the 
> exception below
> {code:java}
> Caused by: java.lang.AssertionError: Ensure called on an unitialized (or 
> closed) session 41774265-b7da-4d58-84a8-1bedfd597aec
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:685){code}
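The restore-on-failure pattern implied by the report can be sketched as follows. `SessionState` and `dagResources` here are simplified stand-ins for the Tez session classes, not the actual Hive implementation: the point is that a field cleared before `close()` must be put back in a `finally` block, so a throwing `close()` cannot return a corrupted session to the pool.

```java
// Sketch: restore dagResources even when close() throws, so the pooled
// session is never left in the "uninitialized" state described above.
public class ReopenSketch {

    static class SessionState {
        Object dagResources = new Object();
        boolean closeThrows;

        void close() {
            if (closeThrows) {
                throw new IllegalStateException("Application not running");
            }
        }
    }

    static void reopen(SessionState session) {
        Object saved = session.dagResources;
        session.dagResources = null;
        try {
            session.close();
        } finally {
            // Restore unconditionally: a failed close() must not leave the
            // session without its DAG resources.
            session.dagResources = saved;
        }
    }

    public static void main(String[] args) {
        SessionState s = new SessionState();
        s.closeThrows = true;
        try {
            reopen(s);
        } catch (IllegalStateException expected) {
            // close() failed, but the resources were still restored
        }
        System.out.println(s.dagResources != null); // true
    }
}
```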



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504332
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 21:36
Start Date: 23/Oct/20 21:36
Worklog Time Spent: 10m 
  Work Description: nareshpr commented on pull request #1577:
URL: https://github.com/apache/hive/pull/1577#issuecomment-715600895


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504332)
Time Spent: 1h 40m  (was: 1.5h)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In a cloud environment, scratchdir cleanup at the end of a query may take a 
> long time. This causes the client to hang for up to a minute even after the 
> results have been streamed back; during this time the client just waits for 
> cleanup to finish. Cleanup can instead take place in the background in 
> HiveServer.
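The improvement described above can be sketched with a background executor: the query-completion path returns as soon as the deletion task is handed off. This is an illustrative sketch, not the actual HiveServer code; `deleteScratchDir` is a placeholder for the real filesystem call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: scratchdir deletion handed to a background thread so the client
// is released immediately after results are streamed.
public class AsyncCleanup {
    private static final ExecutorService CLEANER =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "scratchdir-cleanup");
            t.setDaemon(true); // don't block server shutdown
            return t;
        });

    static final AtomicBoolean cleaned = new AtomicBoolean(false);

    static void deleteScratchDir() {
        cleaned.set(true); // placeholder for fs.delete(scratchDir, true)
    }

    // Query completion returns immediately; deletion proceeds in background.
    static void onQueryComplete() {
        CLEANER.submit(AsyncCleanup::deleteScratchDir);
    }

    public static void main(String[] args) throws Exception {
        onQueryComplete();
        CLEANER.shutdown();
        CLEANER.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(cleaned.get()); // true
    }
}
```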



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22912) Support native submission of Hive queries to a Kubernetes Cluster

2020-10-23 Thread Viacheslav Avramenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219943#comment-17219943
 ] 

Viacheslav Avramenko commented on HIVE-22912:
-

I agree with Surbhi and Michel. What about Kubernetes support as an 
open-source project?

> Support native submission of Hive queries to a Kubernetes Cluster
> -
>
> Key: HIVE-22912
> URL: https://issues.apache.org/jira/browse/HIVE-22912
> Project: Hive
>  Issue Type: New Feature
>Reporter: Surbhi Aggarwal
>Priority: Major
>
> So many big data applications are already integrated, or trying to natively 
> integrate, with the Kubernetes engine. Should we not work together to support 
> Hive on this engine?
> If effort is already being spent on this, please point me to it. Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24066) Hive query on parquet data should identify if column is not present in file schema and show NULL value instead of Exception

2020-10-23 Thread Jainik Vora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jainik Vora updated HIVE-24066:
---
Description: 
I created a Hive table containing columns with struct data types:
  
{code:java}
CREATE EXTERNAL TABLE test_dwh.sample_parquet_table (
  `context` struct<
`app`: struct<
`build`: string,
`name`: string,
`namespace`: string,
`version`: string
>,
`device`: struct<
`adtrackingenabled`: boolean,
`advertisingid`: string,
`id`: string,
`manufacturer`: string,
`model`: string,
`type`: string
>,
`locale`: string,
`library`: struct<
`name`: string,
`version`: string
>,
`os`: struct<
`name`: string,
`version`: string
>,
`screen`: struct<
`height`: bigint,
`width`: bigint
>,
`network`: struct<
`carrier`: string,
`cellular`: boolean,
`wifi`: boolean
 >,
`timezone`: string,
`userAgent`: string
>
) PARTITIONED BY (day string)
STORED as PARQUET
LOCATION 's3://xyz/events'{code}
 
 All columns are nullable, so the Parquet files read by the table don't always 
contain every column. If any file in a partition lacks the "context.os" struct 
and "context.os.name" is queried, Hive throws the exception below. The same 
happens for "context.screen".
  
{code:java}
2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 main([])]: 
CliDriver (SessionState.java:printError(1126)) - Failed with exception 
java.io.IOException:java.lang.RuntimeException: Primitive type osshould not 
doesn't match typeos[name]

2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 main([])]: 
CliDriver (SessionState.java:printError(1126)) - Failed with exception 
java.io.IOException:java.lang.RuntimeException: Primitive type osshould not 
doesn't match typeos[name]java.io.IOException: java.lang.RuntimeException: 
Primitive type osshould not doesn't match typeos[name] 
  at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
  at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
  at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
  at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2208)
  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
  at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787)
  at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
  at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498) at 
org.apache.hadoop.util.RunJar.run(RunJar.java:239)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Caused by: java.lang.RuntimeException: Primitive type osshould not doesn't 
match typeos[name] 
  at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:330)
 
  at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:322)
 
  at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getProjectedSchema(DataWritableReadSupport.java:249)
  at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:379)
 
  at 
org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:84)
  at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.&lt;init&gt;(ParquetRecordReaderWrapper.java:75)
  at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.&lt;init&gt;(ParquetRecordReaderWrapper.java:60)
  at 
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:75)
  at 
org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:695)
  at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:333)
  at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459) 
... 16 more{code}
 
 Querying "context.os" by itself returns NULL:
{code:java}
hive> select context.os from test_dwh.sample_parquet_table where day='01' limit 
5;
OK
NULL
NULL
NULL
NULL
NULL
{code}
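The struct resolving to NULL is the expected behavior; the failure only occurs when a leaf under the missing struct is projected. Below is a minimal stand-in sketch of null-tolerant leaf projection, using plain Java nested maps in place of the Parquet schema types (`project` is a hypothetical helper, not the real DataWritableReadSupport code):

```java
import java.util.Map;

public class LeafProjection {
  // Walk a nested schema (maps stand in for structs, Strings for
  // primitive leaf types). Returns the leaf type, or null when any
  // ancestor struct is absent from this particular file's schema,
  // instead of throwing like projectLeafTypes() does today.
  @SuppressWarnings("unchecked")
  static Object project(Map<String, Object> fileSchema, String... path) {
    Object node = fileSchema;
    for (String step : path) {
      if (!(node instanceof Map)) {
        return null;            // ancestor is a primitive, not a struct
      }
      node = ((Map<String, Object>) node).get(step);
      if (node == null) {
        return null;            // field absent in this file
      }
    }
    return node;
  }

  public static void main(String[] args) {
    // A file written without the "os" struct under "context".
    Map<String, Object> context = Map.of("app", Map.of("name", "string"));
    Map<String, Object> schema = Map.of("context", context);
    System.out.println(project(schema, "context", "app", "name")); // string
    System.out.println(project(schema, "context", "os", "name"));  // null
  }
}
```

This mirrors how the query above already behaves for `context.os` itself: missing data surfaces as NULL rather than an exception.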

[jira] [Commented] (HIVE-21737) Upgrade Avro to version 1.10.0

2020-10-23 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219858#comment-17219858
 ] 

Chao Sun commented on HIVE-21737:
-

[~iemejia] instead of upgrading Avro in Hive, I think we can alternatively 
replace the usage of the API that was removed in Avro 1.9 by 
[AVRO-1605|https://issues.apache.org/jira/browse/AVRO-1605] (it had been 
marked deprecated since Avro 1.8) - in particular, 
{{JsonProperties#getJsonProp}}. This could be an easier approach.
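For illustration, the migration pattern looks roughly like the sketch below. This does not use the real Avro classes (the `props` map and local `getObjectProp` are stand-ins); the point is that `JsonProperties#getObjectProp`, the replacement introduced by AVRO-1605, returns plain Java objects where the removed `getJsonProp` returned Jackson `JsonNode`s:

```java
public class AvroPropMigration {
  // Stand-in for a schema's extra properties, as JsonProperties holds them.
  static final java.util.Map<String, Object> props =
      java.util.Map.of("precision", 10, "logicalType", "decimal");

  // Post-1.9 style accessor: plain Object, no Jackson types in the API.
  static Object getObjectProp(String name) {
    return props.get(name);
  }

  public static void main(String[] args) {
    // Before: JsonNode n = schema.getJsonProp("precision"); int p = n.asInt();
    // After:  downcast the plain object instead of touching Jackson.
    int precision = (Integer) getObjectProp("precision");
    System.out.println(precision);   // 10
  }
}
```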

> Upgrade Avro to version 1.10.0
> --
>
> Key: HIVE-21737
> URL: https://issues.apache.org/jira/browse/HIVE-21737
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Ismaël Mejía
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Attachments: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.2.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Avro >= 1.9.x brings a lot of fixes, including a leaner version of Avro without 
> Jackson in the public API or Guava as a dependency. Worth the update.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24304) Query containing UNION fails with OOM

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24304?focusedWorklogId=504257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504257
 ]

ASF GitHub Bot logged work on HIVE-24304:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 17:11
Start Date: 23/Oct/20 17:11
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1600:
URL: https://github.com/apache/hive/pull/1600#discussion_r511024068



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdExpressionLineage.java
##
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.optimizer.calcite.stats;
+
+
+import org.apache.calcite.rel.core.Union;
+import org.apache.calcite.rel.metadata.BuiltInMetadata;
+import org.apache.calcite.rel.metadata.MetadataDef;
+import org.apache.calcite.rel.metadata.MetadataHandler;
+import org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider;
+import org.apache.calcite.rel.metadata.RelMetadataProvider;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveUnion;
+
+import java.util.Set;
+
+public final class HiveRelMdExpressionLineage
+    implements MetadataHandler<BuiltInMetadata.ExpressionLineage> {
+  public static final RelMetadataProvider SOURCE =
+      ReflectiveRelMetadataProvider.reflectiveSource(
+          BuiltInMethod.EXPRESSION_LINEAGE.method, new HiveRelMdExpressionLineage());
+
+  //~ Constructors ---
+
+  private HiveRelMdExpressionLineage() {}
+
+  //~ Methods 
+
+  public MetadataDef<BuiltInMetadata.ExpressionLineage> getDef() {
+    return BuiltInMetadata.ExpressionLineage.DEF;
+  }
+
+  public Set<RexNode> getExpressionLineage(HiveUnion rel, RelMetadataQuery mq,
+      RexNode outputExpression) {
+    return null;

Review comment:
   Can we add a comment based on the JIRA discussion on why we are 
returning null for Union operator (it will help us recall in case we revisit 
this code in the future)?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504257)
Time Spent: 20m  (was: 10m)

> Query containing UNION fails with OOM
> -
>
> Key: HIVE-24304
> URL: https://issues.apache.org/jira/browse/HIVE-24304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-20273) Spark jobs aren't cancelled if getSparkJobInfo or getSparkStagesInfo

2020-10-23 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-20273:
---

Assignee: (was: Sahil Takiar)

> Spark jobs aren't cancelled if getSparkJobInfo or getSparkStagesInfo
> 
>
> Key: HIVE-20273
> URL: https://issues.apache.org/jira/browse/HIVE-20273
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20273.1.patch, HIVE-20273.2.patch
>
>
> HIVE-19053 and HIVE-19733 added handling of {{InterruptedException}} to 
> {{RemoteSparkJobStatus#getSparkJobInfo}} and 
> {{RemoteSparkJobStatus#getSparkStagesInfo}}. Now, these methods catch 
> {{InterruptedException}} and wrap the exception in a {{HiveException}} and 
> then throw the new {{HiveException}}.
> This new {{HiveException}} is then caught in 
> {{RemoteSparkJobMonitor#startMonitor}} which then looks for exceptions that 
> match the condition:
> {code:java}
> if (e instanceof InterruptedException ||
>     (e instanceof HiveException && e.getCause() instanceof InterruptedException))
> {code}
> If this condition is met (in this case it is), the exception will again be 
> wrapped in another {{HiveException}} and is thrown again. So the final 
> exception is a {{HiveException}} that wraps a {{HiveException}} that wraps an 
> {{InterruptedException}}.
> The double nesting of hive exception causes the logic in 
> {{SparkTask#setSparkException}} to break, and doesn't cause {{killJob}} to 
> get triggered.
> This causes interrupted Hive queries to not kill their corresponding Spark 
> jobs.
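A defensive fix for the double-wrapping described above is to walk the entire cause chain instead of inspecting only a single wrapper level. A minimal sketch in plain Java (`isInterrupt` is a hypothetical helper, not the actual SparkTask code):

```java
public class CauseChain {
  // Returns true if any throwable in the cause chain is an
  // InterruptedException, no matter how many HiveException-style
  // wrappers surround it.
  static boolean isInterrupt(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (cur instanceof InterruptedException) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Double-nested wrapper, as produced by RemoteSparkJobMonitor.
    Exception doubleWrapped =
        new RuntimeException(new RuntimeException(new InterruptedException()));
    System.out.println(isInterrupt(doubleWrapped));           // true
    System.out.println(isInterrupt(new RuntimeException()));  // false
  }
}
```

With chain-walking detection, the depth of nesting no longer matters and `killJob` can be triggered reliably.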



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-20519) Remove 30m min value for hive.spark.session.timeout

2020-10-23 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-20519:
---

Assignee: (was: Sahil Takiar)

> Remove 30m min value for hive.spark.session.timeout
> ---
>
> Key: HIVE-20519
> URL: https://issues.apache.org/jira/browse/HIVE-20519
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20519.1.patch, HIVE-20519.2.patch, 
> HIVE-20519.3.patch
>
>
> In HIVE-14162 we added the config {{hive.spark.session.timeout}} which 
> provided a way to time out Spark sessions that are active for a long period 
> of time. The config has a lower bound of 30m which we should remove. It 
> should be possible for users to configure this value so the HoS session is 
> closed as soon as the query is complete.
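With the lower bound removed, a user could set the timeout low enough that the HoS session closes shortly after the query finishes. A hypothetical hive-site.xml fragment (the 5m/60s values are illustrative; `hive.spark.session.timeout.period` is assumed here to be the companion check-interval config added alongside the timeout in HIVE-14162):

```
<!-- Close an idle Hive-on-Spark session after 5 minutes;
     currently rejected because of the 30m minimum. -->
<property>
  <name>hive.spark.session.timeout</name>
  <value>5m</value>
</property>
<property>
  <!-- How often the timeout check runs. -->
  <name>hive.spark.session.timeout.period</name>
  <value>60s</value>
</property>
```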



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-20828) Upgrade to Spark 2.4.0

2020-10-23 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-20828:
---

Assignee: (was: Sahil Takiar)

> Upgrade to Spark 2.4.0
> --
>
> Key: HIVE-20828
> URL: https://issues.apache.org/jira/browse/HIVE-20828
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20828.1.patch, HIVE-20828.2.patch
>
>
> The Spark community is in the process of releasing Spark 2.4.0. We should do 
> some testing with the release candidates and then upgrade once the release is 
> finalized.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-19821) Distributed HiveServer2

2020-10-23 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-19821:
---

Assignee: (was: Sahil Takiar)

> Distributed HiveServer2
> ---
>
> Key: HIVE-19821
> URL: https://issues.apache.org/jira/browse/HIVE-19821
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Reporter: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19821.1.WIP.patch, HIVE-19821.2.WIP.patch, 
> HIVE-19821_ Distributed HiveServer2.pdf
>
>
> HS2 deployments often hit OOM issues due to a number of factors: (1) too many 
> concurrent connections, (2) queries that scan a large number of partitions have 
> to pull a lot of metadata into memory (e.g. a query reading thousands of 
> partitions requires loading thousands of partitions into memory), (3) very 
> large queries can take up a lot of heap space, especially during query 
> parsing. There are a number of other factors that cause HiveServer2 to run 
> out of memory; these are just some of the more common ones.
> Distributed HS2 proposes to do all query parsing, compilation, planning, and 
> execution coordination inside a dedicated container. This should 
> significantly decrease memory pressure on HS2 and allow HS2 to scale to a 
> larger number of concurrent users.
> For HoS (and I think Hive-on-Tez) this just requires moving all query 
> compilation, planning, etc. inside the application master for the 
> corresponding Hive session.
> The main benefit here is isolation. A poorly written Hive query cannot bring 
> down an entire HiveServer2 instance and force all other queries to fail.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504250&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504250
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:56
Start Date: 23/Oct/20 16:56
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on pull request #1577:
URL: https://github.com/apache/hive/pull/1577#issuecomment-715459208


   @kgyrtkirk @nareshpr I significantly changed the patch. Please let me know 
if you have further concerns.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504250)
Time Spent: 1.5h  (was: 1h 20m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In cloud environments, scratchdir cleanup at the end of a query may take a 
> long time. This causes the client to hang for up to a minute even after the 
> results have been streamed back. During this time the client just waits for 
> cleanup to finish. Cleanup can instead take place in the background in 
> HiveServer.
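The shape of the change can be sketched with a plain JDK executor: submit the delete to a background thread so the query path returns immediately. This is a stand-in sketch, not the actual HiveServer2 patch (`ScratchDirCleaner` and `deleteAsync` are hypothetical names):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScratchDirCleaner {
  // Single background thread, so query completion never blocks on
  // filesystem deletes (slow on object stores like S3).
  private static final ExecutorService CLEANER =
      Executors.newSingleThreadExecutor();

  static Future<?> deleteAsync(Path scratchDir) {
    return CLEANER.submit(() -> {
      try {
        // Reverse-sorted walk deletes children before their parents.
        Files.walk(scratchDir)
            .sorted(Comparator.reverseOrder())
            .forEach(p -> p.toFile().delete());
      } catch (Exception e) {
        // Best effort: a failed cleanup should not fail the query.
      }
    });
  }

  public static void main(String[] args) throws Exception {
    Path dir = Files.createTempDirectory("scratch");
    Files.createFile(dir.resolve("part-00000"));
    deleteAsync(dir).get();                  // block only for the demo
    System.out.println(Files.exists(dir));   // false
    CLEANER.shutdown();
  }
}
```

In the real server the `get()` would be omitted; the future exists only so a shutdown hook or test can wait for pending cleanups.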



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504248
 ]

ASF GitHub Bot logged work on HIVE-24258:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:52
Start Date: 23/Oct/20 16:52
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1587:
URL: https://github.com/apache/hive/pull/1587#discussion_r511014101



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -2846,31 +2846,28 @@ public SQLAllTableConstraints getAllTableConstraints(String catName, String dbNa
 return sqlAllTableConstraints;
   }
 
-  @Override public List<String> createTableWithConstraints(Table tbl, List<SQLPrimaryKey> primaryKeys,
-      List<SQLForeignKey> foreignKeys, List<SQLUniqueConstraint> uniqueConstraints,
-      List<SQLNotNullConstraint> notNullConstraints, List<SQLDefaultConstraint> defaultConstraints,
-      List<SQLCheckConstraint> checkConstraints) throws InvalidObjectException, MetaException {
-    List<String> constraintNames = rawStore
-        .createTableWithConstraints(tbl, primaryKeys, foreignKeys, uniqueConstraints, notNullConstraints,
-            defaultConstraints, checkConstraints);
+  @Override public SQLAllTableConstraints createTableWithConstraints(Table tbl, SQLAllTableConstraints constraints) throws InvalidObjectException, MetaException {

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504248)
Time Spent: 1h 10m  (was: 1h)

> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Description
> Objects like table name, db name, column name etc. are case-insensitive per the 
> HIVE contract, but the standalone metastore cachedstore is case-sensitive. As a 
> result, the rawstore output and the cachedstore output don't match.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>
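The fix amounts to normalizing identifiers before they are used as cache keys or compared against rawstore results. A minimal stand-in sketch in plain Java (`normalize` is a hypothetical helper mirroring what `StringUtils.normalizeIdentifier` does; the real metastore utility may differ in details):

```java
import java.util.Locale;
import java.util.Objects;

public class IdentifierNormalizer {
  // Hive identifiers (db, table, column, constraint names) are
  // case-insensitive, so a cache must normalize them before keying
  // or comparing cached vs. raw results.
  static String normalize(String identifier) {
    return identifier == null ? null
        : identifier.trim().toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    // The mismatch from the issue: pk_name "Pk1" in the cache vs "pk1" raw.
    System.out.println(Objects.equals("Pk1", "pk1"));                       // false
    System.out.println(Objects.equals(normalize("Pk1"), normalize("pk1"))); // true
  }
}
```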



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24309) Simplify ConvertJoinMapJoin logic

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24309:
--
Labels: pull-request-available  (was: )

> Simplify ConvertJoinMapJoin logic 
> --
>
> Key: HIVE-24309
> URL: https://issues.apache.org/jira/browse/HIVE-24309
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ConvertMapJoin logic can be further simplified:
> [https://github.com/pgaref/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L92]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504229&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504229
 ]

ASF GitHub Bot logged work on HIVE-24258:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:14
Start Date: 23/Oct/20 16:14
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1587:
URL: https://github.com/apache/hive/pull/1587#discussion_r510993015



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
##
@@ -1568,12 +1568,7 @@ public void testPrimaryKeys() {
 List<SQLPrimaryKey> cachedKeys = sharedCache.listCachedPrimaryKeys(
 DEFAULT_CATALOG_NAME, tbl.getDbName(), tbl.getTableName());
 
-Assert.assertEquals(cachedKeys.size(), 1);
-Assert.assertEquals(cachedKeys.get(0).getPk_name(), "pk1");
-Assert.assertEquals(cachedKeys.get(0).getTable_db(), "db");
-Assert.assertEquals(cachedKeys.get(0).getTable_name(), tbl.getTableName());
-Assert.assertEquals(cachedKeys.get(0).getColumn_name(), "col1");
-Assert.assertEquals(cachedKeys.get(0).getCatName(), DEFAULT_CATALOG_NAME);
+Assert.assertEquals(origKeys,cachedKeys);

Review comment:
   Done

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
##
@@ -1499,20 +1499,11 @@ SQLAllTableConstraints getAllTableConstraints(String catName, String dbName, Str
   /**
* Create a table with constraints
* @param tbl table definition
-   * @param primaryKeys primary key definition, or null
-   * @param foreignKeys foreign key definition, or null
-   * @param uniqueConstraints unique constraints definition, or null
-   * @param notNullConstraints not null constraints definition, or null
-   * @param defaultConstraints default values definition, or null
* @return list of constraint names

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504229)
Time Spent: 1h  (was: 50m)

> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Description
> Objects like table name, db name, column name etc. are case-insensitive per the 
> HIVE contract, but the standalone metastore cachedstore is case-sensitive. As a 
> result, the rawstore output and the cachedstore output don't match.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24309) Simplify ConvertJoinMapJoin logic

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24309?focusedWorklogId=504230&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504230
 ]

ASF GitHub Bot logged work on HIVE-24309:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:14
Start Date: 23/Oct/20 16:14
Worklog Time Spent: 10m 
  Work Description: pgaref opened a new pull request #1606:
URL: https://github.com/apache/hive/pull/1606


   Change-Id: I89865b6ebc102fa63a99beb94a89771b779cc300
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504230)
Remaining Estimate: 0h
Time Spent: 10m

> Simplify ConvertJoinMapJoin logic 
> --
>
> Key: HIVE-24309
> URL: https://issues.apache.org/jira/browse/HIVE-24309
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ConvertMapJoin logic can be further simplified:
> [https://github.com/pgaref/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L92]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504227&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504227
 ]

ASF GitHub Bot logged work on HIVE-24258:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:13
Start Date: 23/Oct/20 16:13
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1587:
URL: https://github.com/apache/hive/pull/1587#discussion_r510992591



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -2785,31 +2696,23 @@ public void add_not_null_constraint(AddNotNullConstraintRequest req)
 @Override
 public void add_default_constraint(AddDefaultConstraintRequest req)
 throws MetaException, InvalidObjectException {
-  List<SQLDefaultConstraint> defaultConstraintCols = req.getDefaultConstraintCols();
-  String constraintName = (defaultConstraintCols != null && defaultConstraintCols.size() > 0) ?
-      defaultConstraintCols.get(0).getDc_name() : "null";
+  List<SQLDefaultConstraint> defaultConstraints = req.getDefaultConstraintCols();

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504227)
Time Spent: 40m  (was: 0.5h)

> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Description
> Objects like table name, db name, column name etc. are case-insensitive per the 
> HIVE contract, but the standalone metastore cachedstore is case-sensitive. As a 
> result, the rawstore output and the cachedstore output don't match.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504228&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504228
 ]

ASF GitHub Bot logged work on HIVE-24258:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:13
Start Date: 23/Oct/20 16:13
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1587:
URL: https://github.com/apache/hive/pull/1587#discussion_r510992922



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
##
@@ -58,14 +59,11 @@ public static String buildDbKeyWithDelimiterSuffix(String catName, String dbName
*
*/
   public static String buildPartitionCacheKey(List<String> partVals) {
-    if (partVals == null || partVals.isEmpty()) {
-      return "";
-    }
-    return String.join(delimit, partVals);
+    return CollectionUtils.isNotEmpty(partVals) ? String.join(delimit, partVals) : "";
   }
 
   public static String buildTableKey(String catName, String dbName, String tableName) {
-    return buildKey(catName.toLowerCase(), dbName.toLowerCase(), tableName.toLowerCase());
+    return buildKey(StringUtils.normalizeIdentifier(catName), StringUtils.normalizeIdentifier(dbName), StringUtils.normalizeIdentifier(tableName));

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504228)
Time Spent: 50m  (was: 40m)

> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Description
> Objects like table name, db name, column name etc. are case-insensitive per the 
> HIVE contract, but the standalone metastore cachedstore is case-sensitive. As a 
> result, the rawstore output and the cachedstore output don't match.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504226&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504226
 ]

ASF GitHub Bot logged work on HIVE-24258:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:12
Start Date: 23/Oct/20 16:12
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1587:
URL: https://github.com/apache/hive/pull/1587#discussion_r510992400



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -2255,121 +2257,61 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req)
   tbl.putToParameters(hive_metastoreConstants.DDL_TIME, 
Long.toString(time));
 }
 
-    if (primaryKeys == null && foreignKeys == null
-        && uniqueConstraints == null && notNullConstraints == null && defaultConstraints == null
-        && checkConstraints == null) {
+    if (CollectionUtils.isEmpty(constraints.getPrimaryKeys()) && CollectionUtils.isEmpty(constraints.getForeignKeys())
+        && CollectionUtils.isEmpty(constraints.getUniqueConstraints()) && CollectionUtils.isEmpty(constraints.getNotNullConstraints())
+        && CollectionUtils.isEmpty(constraints.getDefaultConstraints()) && CollectionUtils.isEmpty(constraints.getCheckConstraints())) {
       ms.createTable(tbl);
     } else {
       // Check that constraints have catalog name properly set first
-      if (primaryKeys != null && !primaryKeys.isEmpty() && !primaryKeys.get(0).isSetCatName()) {
-        for (SQLPrimaryKey pkcol : primaryKeys) pkcol.setCatName(tbl.getCatName());
+      if (CollectionUtils.isNotEmpty(constraints.getPrimaryKeys()) && !constraints.getPrimaryKeys().get(0).isSetCatName()) {
+        for (SQLPrimaryKey pkcol : constraints.getPrimaryKeys()) pkcol.setCatName(tbl.getCatName());

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504226)
Time Spent: 0.5h  (was: 20m)

> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description
> Objects like table name, db name, column name etc. are case-insensitive per the 
> HIVE contract, but the standalone metastore cachedstore is case-sensitive. As a 
> result, the rawstore output and the cachedstore output don't match.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>
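The pk_name mismatch above (pk1 vs Pk1) is the classic symptom of a cache that compares names case-sensitively while the rest of the metastore treats them as case-insensitive. A minimal sketch of the normalize-on-write/normalize-on-read approach; the map and method names are illustrative stand-ins, not the actual CachedStore API:

```java
import java.util.HashMap;
import java.util.Map;

public class CaseInsensitiveCacheDemo {
    // Hypothetical stand-in for a constraint cache keyed by pk_name.
    // Normalizing once on write and once on read makes lookups honor the
    // case-insensitive contract regardless of how the name was stored.
    static final Map<String, String> cache = new HashMap<>();

    static void put(String pkName, String value) {
        cache.put(pkName.toLowerCase(), value);
    }

    static String get(String pkName) {
        return cache.get(pkName.toLowerCase());
    }

    public static void main(String[] args) {
        put("Pk1", "constraint-on-col1");       // cached with mixed case
        System.out.println(get("pk1"));         // constraint-on-col1
    }
}
```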



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504225&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504225
 ]

ASF GitHub Bot logged work on HIVE-24258:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:12
Start Date: 23/Oct/20 16:12
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1587:
URL: https://github.com/apache/hive/pull/1587#discussion_r510992214



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStoreUpdateUsingEvents.java
##
@@ -419,18 +412,14 @@ public void testConstraintsForUpdateUsingEvents() throws Exception {
   public void assertRawStoreAndCachedStoreConstraint(String catName, String dbName, String tblName)
       throws MetaException, NoSuchObjectException {
     SQLAllTableConstraints rawStoreConstraints = rawStore.getAllTableConstraints(catName, dbName, tblName);
-    List<SQLPrimaryKey> primaryKeys = sharedCache.listCachedPrimaryKeys(catName, dbName, tblName);
-    List<SQLNotNullConstraint> notNullConstraints = sharedCache.listCachedNotNullConstraints(catName, dbName, tblName);
-    List<SQLUniqueConstraint> uniqueConstraints = sharedCache.listCachedUniqueConstraint(catName, dbName, tblName);
-    List<SQLDefaultConstraint> defaultConstraints = sharedCache.listCachedDefaultConstraint(catName, dbName, tblName);
-    List<SQLCheckConstraint> checkConstraints = sharedCache.listCachedCheckConstraint(catName, dbName, tblName);
-    List<SQLForeignKey> foreignKeys = sharedCache.listCachedForeignKeys(catName, dbName, tblName, null, null);
-    Assert.assertEquals(rawStoreConstraints.getPrimaryKeys(), primaryKeys);
-    Assert.assertEquals(rawStoreConstraints.getNotNullConstraints(), notNullConstraints);
-    Assert.assertEquals(rawStoreConstraints.getUniqueConstraints(), uniqueConstraints);
-    Assert.assertEquals(rawStoreConstraints.getDefaultConstraints(), defaultConstraints);
-    Assert.assertEquals(rawStoreConstraints.getCheckConstraints(), checkConstraints);
-    Assert.assertEquals(rawStoreConstraints.getForeignKeys(), foreignKeys);
+    SQLAllTableConstraints cachedStoreConstraints = new SQLAllTableConstraints();
+    cachedStoreConstraints.setPrimaryKeys(sharedCache.listCachedPrimaryKeys(catName, dbName, tblName));
+    cachedStoreConstraints.setForeignKeys(sharedCache.listCachedForeignKeys(catName, dbName, tblName, null, null));
+    cachedStoreConstraints.setNotNullConstraints(sharedCache.listCachedNotNullConstraints(catName, dbName, tblName));
+    cachedStoreConstraints.setDefaultConstraints(sharedCache.listCachedDefaultConstraint(catName, dbName, tblName));
+    cachedStoreConstraints.setCheckConstraints(sharedCache.listCachedCheckConstraint(catName, dbName, tblName));
+    cachedStoreConstraints.setUniqueConstraints(sharedCache.listCachedUniqueConstraint(catName, dbName, tblName));
+    Assert.assertEquals(rawStoreConstraints, cachedStoreConstraints);

Review comment:
   Done
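The refactor above collapses six per-constraint assertEquals calls into a single comparison of two SQLAllTableConstraints objects, which relies on the container implementing value-based equals (Thrift-generated classes do). A minimal sketch of the idea, using a hypothetical stand-in container:

```java
import java.util.List;
import java.util.Objects;

public class ConstraintContainerDemo {
    // Hypothetical stand-in for SQLAllTableConstraints; the real Thrift class
    // likewise implements field-by-field equals() over every constraint list.
    static final class AllConstraints {
        final List<String> primaryKeys;
        final List<String> foreignKeys;

        AllConstraints(List<String> pk, List<String> fk) {
            this.primaryKeys = pk;
            this.foreignKeys = fk;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof AllConstraints)) {
                return false;
            }
            AllConstraints other = (AllConstraints) o;
            return Objects.equals(primaryKeys, other.primaryKeys)
                && Objects.equals(foreignKeys, other.foreignKeys);
        }

        @Override
        public int hashCode() {
            return Objects.hash(primaryKeys, foreignKeys);
        }
    }

    public static void main(String[] args) {
        AllConstraints fromRawStore = new AllConstraints(List.of("pk1"), List.of());
        AllConstraints fromCachedStore = new AllConstraints(List.of("pk1"), List.of());
        // One equality check replaces N per-field assertions.
        System.out.println(fromRawStore.equals(fromCachedStore)); // true
    }
}
```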







Issue Time Tracking
---

Worklog Id: (was: 504225)
Time Spent: 20m  (was: 10m)

> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Description
> Objects like table name, db name, column name, etc. are case-insensitive per the 
> HIVE contract, but the standalone metastore cachedstore is case-sensitive. As a 
> result, the rawstore output and the cachedstore output can mismatch.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>





[jira] [Work logged] (HIVE-24308) FIX conditions used for DPHJ conversion

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24308?focusedWorklogId=504213&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504213
 ]

ASF GitHub Bot logged work on HIVE-24308:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 15:18
Start Date: 23/Oct/20 15:18
Worklog Time Spent: 10m 
  Work Description: pgaref opened a new pull request #1604:
URL: https://github.com/apache/hive/pull/1604


   FIX conditions used for DPHJ conversion
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 504213)
Time Spent: 0.5h  (was: 20m)

> FIX conditions used for DPHJ conversion  
> -
>
> Key: HIVE-24308
> URL: https://issues.apache.org/jira/browse/HIVE-24308
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
>  When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is 
> lower than expected the code returns a MJ because of the condition above!
> In general, I believe the ShuffleSize check: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
>  should be part of the shuffleJoin DPHJ conversion.
> And the preferred conversion would be: MJ > DPHJ > SMB





[jira] [Work logged] (HIVE-24308) FIX conditions used for DPHJ conversion

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24308?focusedWorklogId=504214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504214
 ]

ASF GitHub Bot logged work on HIVE-24308:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 15:18
Start Date: 23/Oct/20 15:18
Worklog Time Spent: 10m 
  Work Description: pgaref closed pull request #1604:
URL: https://github.com/apache/hive/pull/1604


   





Issue Time Tracking
---

Worklog Id: (was: 504214)
Time Spent: 40m  (was: 0.5h)

> FIX conditions used for DPHJ conversion  
> -
>
> Key: HIVE-24308
> URL: https://issues.apache.org/jira/browse/HIVE-24308
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
>  When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is 
> lower than expected the code returns a MJ because of the condition above!
> In general, I believe the ShuffleSize check: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
>  should be part of the shuffleJoin DPHJ conversion.
> And the preferred conversion would be: MJ > DPHJ > SMB





[jira] [Work logged] (HIVE-23935) Fetching primaryKey through beeline fails with NPE

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23935?focusedWorklogId=504211&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504211
 ]

ASF GitHub Bot logged work on HIVE-23935:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 15:05
Start Date: 23/Oct/20 15:05
Worklog Time Spent: 10m 
  Work Description: ayushtkn opened a new pull request #1605:
URL: https://github.com/apache/hive/pull/1605


   https://issues.apache.org/jira/browse/HIVE-23935
   
   Entire Trace -
   
   0: jdbc:hive2://localhost:1> !primarykeys Persons
   Error: MetaException(message:java.lang.NullPointerException) (state=,code=0)
   org.apache.hive.service.cli.HiveSQLException: 
MetaException(message:java.lang.NullPointerException)
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:360)
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:351)
at 
org.apache.hive.jdbc.HiveDatabaseMetaData.getPrimaryKeys(HiveDatabaseMetaData.java:573)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.beeline.Reflector.invoke(Reflector.java:89)
at org.apache.hive.beeline.Commands.metadata(Commands.java:125)
at org.apache.hive.beeline.Commands.primarykeys(Commands.java:231)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:57)
at 
org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1465)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1504)
at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1364)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1134)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1082)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
   Caused by: org.apache.hive.service.cli.HiveSQLException: 
MetaException(message:java.lang.NullPointerException)
at 
org.apache.hive.service.cli.operation.GetPrimaryKeysOperation.runInternal(GetPrimaryKeysOperation.java:120)
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:277)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.getPrimaryKeys(HiveSessionImpl.java:997)
at 
org.apache.hive.service.cli.CLIService.getPrimaryKeys(CLIService.java:416)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.GetPrimaryKeys(ThriftCLIService.java:838)
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetPrimaryKeys.getResult(TCLIService.java:1717)
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetPrimaryKeys.getResult(TCLIService.java:1702)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
   Caused by: MetaException(message:java.lang.NullPointerException)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:7921)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.throwMetaException(HiveMetaStore.java:9105)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_primary_keys(HiveMetaStore.java:9067)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  

[jira] [Resolved] (HIVE-24113) NPE in GenericUDFToUnixTimeStamp

2020-10-23 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér resolved HIVE-24113.
--
Resolution: Fixed

> NPE in GenericUDFToUnixTimeStamp
> 
>
> Key: HIVE-24113
> URL: https://issues.apache.org/jira/browse/HIVE-24113
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.2
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>
> The following query will trigger the getPartitionsByExpr call at HMS. HMS will 
> try to evaluate the filter through the PartitionExpressionForMetastore proxy; 
> this proxy uses the QL packages to evaluate the filter and calls 
> GenericUDFToUnixTimeStamp.
> select * from table_name where hour between 
> from_unixtime(unix_timestamp('2020090120', 'yyyyMMddHH') - 1*60*60, 'yyyyMMddHH') 
> and from_unixtime(unix_timestamp('2020090122', 'yyyyMMddHH') + 2*60*60, 'yyyyMMddHH');
> I think SessionState in this code path will always be NULL; that's why it hits 
> the NPE.
> {code:java}
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initializeInput(GenericUDFToUnixTimeStamp.java:126)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initialize(GenericUDFToUnixTimeStamp.java:75)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:148)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:146)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.prepareExpr(PartExprEvalUtils.java:119)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prunePartitionNames(PartitionPruner.java:551)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.filterPartitionsByExpr(PartitionExpressionForMetastore.java:82)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesPrunedByExprNoTxn(ObjectStore.java:3527)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.access$1400(ObjectStore.java:252)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3493)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3464)
>  ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3764)
>  [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3499)
>  [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:3452)
>  [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
> at com.sun.proxy.$Proxy28.getPartitionsByExpr(Unknown Source) [?:?]
> at 
> 
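For reference, the between-clause in the HIVE-24113 query above just shifts an hour-formatted timestamp by whole hours through epoch arithmetic. A hedged sketch of the equivalent computation in plain Java, assuming the intended Hive pattern is 'yyyyMMddHH' (matching the ten-digit hour literals) and evaluating in UTC:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class HourWindowDemo {
    // Equivalent of Hive's from_unixtime(unix_timestamp(s, fmt) + delta, fmt),
    // evaluated in UTC. Assumes the pattern "yyyyMMddHH".
    static String shift(String s, long deltaSeconds) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHH");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            long epochSeconds = fmt.parse(s).getTime() / 1000L;
            return fmt.format(new Date((epochSeconds + deltaSeconds) * 1000L));
        } catch (ParseException e) {
            throw new IllegalArgumentException("bad timestamp: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(shift("2020090120", -1 * 60 * 60)); // 2020090119
        System.out.println(shift("2020090122",  2 * 60 * 60)); // 2020090200
    }
}
```

Note that the second shift crosses a day boundary, which the epoch arithmetic handles for free.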

[jira] [Assigned] (HIVE-24309) Simplify ConvertJoinMapJoin logic

2020-10-23 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-24309:
-


> Simplify ConvertJoinMapJoin logic 
> --
>
> Key: HIVE-24309
> URL: https://issues.apache.org/jira/browse/HIVE-24309
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> ConvertMapJoin logic can be further simplified:
> [https://github.com/pgaref/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L92]





[jira] [Work logged] (HIVE-24308) FIX conditions used for DPHJ conversion

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24308?focusedWorklogId=504139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504139
 ]

ASF GitHub Bot logged work on HIVE-24308:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 12:49
Start Date: 23/Oct/20 12:49
Worklog Time Spent: 10m 
  Work Description: pgaref commented on pull request #1604:
URL: https://github.com/apache/hive/pull/1604#issuecomment-715320363


   @rbalamohan @jesus Can you please take a look?





Issue Time Tracking
---

Worklog Id: (was: 504139)
Time Spent: 20m  (was: 10m)

> FIX conditions used for DPHJ conversion  
> -
>
> Key: HIVE-24308
> URL: https://issues.apache.org/jira/browse/HIVE-24308
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
>  When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is 
> lower than expected the code returns a MJ because of the condition above!
> In general, I believe the ShuffleSize check: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
>  should be part of the shuffleJoin DPHJ conversion.
> And the preferred conversion would be: MJ > DPHJ > SMB





[jira] [Updated] (HIVE-24308) FIX conditions used for DPHJ conversion

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24308:
--
Labels: pull-request-available  (was: )

> FIX conditions used for DPHJ conversion  
> -
>
> Key: HIVE-24308
> URL: https://issues.apache.org/jira/browse/HIVE-24308
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
>  When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is 
> lower than expected the code returns a MJ because of the condition above!
> In general, I believe the ShuffleSize check: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
>  should be part of the shuffleJoin DPHJ conversion.
> And the preferred conversion would be: MJ > DPHJ > SMB





[jira] [Work logged] (HIVE-24308) FIX conditions used for DPHJ conversion

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24308?focusedWorklogId=504138&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504138
 ]

ASF GitHub Bot logged work on HIVE-24308:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 12:44
Start Date: 23/Oct/20 12:44
Worklog Time Spent: 10m 
  Work Description: pgaref opened a new pull request #1604:
URL: https://github.com/apache/hive/pull/1604


   Change-Id: Iaa1d4a5c857b6c494aa220c6c96d7659a2a68aa4
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 504138)
Remaining Estimate: 0h
Time Spent: 10m

> FIX conditions used for DPHJ conversion  
> -
>
> Key: HIVE-24308
> URL: https://issues.apache.org/jira/browse/HIVE-24308
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
>  When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is 
> lower than expected the code returns a MJ because of the condition above!
> In general, I believe the ShuffleSize check: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
>  should be part of the shuffleJoin DPHJ conversion.
> And the preferred conversion would be: MJ > DPHJ > SMB





[jira] [Updated] (HIVE-24308) FIX conditions used for DPHJ conversion

2020-10-23 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-24308:
--
Description: 
Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
 When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is lower 
than expected the code returns a MJ because of the condition above!

In general, I believe the ShuffleSize check: 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
 should be part of the shuffleJoin DPHJ conversion.

And the preferred conversion would be: MJ > DPHJ > SMB
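The preference order MJ > DPHJ > SMB described above can be sketched as a first-feasible-wins chain. The predicates below are hypothetical stand-ins for ConvertJoinMapJoin's real checks (hash-table memory fit, the DPHJ shuffle-size check, SMB bucketing prerequisites), not Hive's actual code:

```java
public class JoinConversionDemo {
    enum JoinAlgo { MAP_JOIN, DPHJ, SMB_JOIN, SHUFFLE_JOIN }

    // Hypothetical predicates: hashTableFits ~ the small side fits in the
    // hash-table memory budget, shuffleSizeOk ~ the shuffle-size check the
    // ticket argues should gate the DPHJ conversion, bucketed ~ SMB join
    // prerequisites hold.
    static JoinAlgo choose(boolean hashTableFits, boolean shuffleSizeOk, boolean bucketed) {
        if (hashTableFits) {
            return JoinAlgo.MAP_JOIN;   // preferred
        }
        if (shuffleSizeOk) {
            return JoinAlgo.DPHJ;       // next best
        }
        if (bucketed) {
            return JoinAlgo.SMB_JOIN;   // last resort before a plain shuffle join
        }
        return JoinAlgo.SHUFFLE_JOIN;
    }

    public static void main(String[] args) {
        // The buggy scenario from the ticket: keys do not fit in memory AND the
        // shuffle-size check fails, so the planner must not return a map join here.
        System.out.println(choose(false, false, true)); // SMB_JOIN
        System.out.println(choose(true, false, false)); // MAP_JOIN
    }
}
```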

  was:
Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
 When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is lower 
than expected the code returns a MJ because of the condition above!

In general, I believe the ShuffleSize check: 
[https://github.com/apache/hive/blob/052c9da958f5cf3998091a7eb4b24192a5bb61e9/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
 should be part of the shuffleJoin DPHJ conversion.

And the preferred conversion would be: MJ > DPHJ > SMB


> FIX conditions used for DPHJ conversion  
> -
>
> Key: HIVE-24308
> URL: https://issues.apache.org/jira/browse/HIVE-24308
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
>  When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is 
> lower than expected the code returns a MJ because of the condition above!
> In general, I believe the ShuffleSize check: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
>  should be part of the shuffleJoin DPHJ conversion.
> And the preferred conversion would be: MJ > DPHJ > SMB





[jira] [Assigned] (HIVE-24308) FIX conditions used for DPHJ conversion

2020-10-23 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-24308:
-


> FIX conditions used for DPHJ conversion  
> -
>
> Key: HIVE-24308
> URL: https://issues.apache.org/jira/browse/HIVE-24308
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> Found a weird scenario when looking at the ConvertJoinMapJoin logic: 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198]
>  When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is 
> lower than expected the code returns a MJ because of the condition above!
> In general, I believe the ShuffleSize check: 
> [https://github.com/apache/hive/blob/052c9da958f5cf3998091a7eb4b24192a5bb61e9/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624]
>  should be part of the shuffleJoin DPHJ conversion.
> And the preferred conversion would be: MJ > DPHJ > SMB





[jira] [Work logged] (HIVE-24307) Beeline with property-file and -e parameter is failing

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24307?focusedWorklogId=504128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504128
 ]

ASF GitHub Bot logged work on HIVE-24307:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 11:46
Start Date: 23/Oct/20 11:46
Worklog Time Spent: 10m 
  Work Description: ayushtkn opened a new pull request #1603:
URL: https://github.com/apache/hive/pull/1603


   https://issues.apache.org/jira/browse/HIVE-24307





Issue Time Tracking
---

Worklog Id: (was: 504128)
Remaining Estimate: 0h
Time Spent: 10m

> Beeline with property-file and -e parameter is failing
> --
>
> Key: HIVE-24307
> URL: https://issues.apache.org/jira/browse/HIVE-24307
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Beeline query with a property file specified along with the -e parameter fails with:
> {noformat}
> Cannot run commands specified using -e. No current connection
> {noformat}





[jira] [Updated] (HIVE-24307) Beeline with property-file and -e parameter is failing

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24307:
--
Labels: pull-request-available  (was: )

> Beeline with property-file and -e parameter is failing
> --
>
> Key: HIVE-24307
> URL: https://issues.apache.org/jira/browse/HIVE-24307
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Beeline query with a property file specified along with the -e parameter fails with:
> {noformat}
> Cannot run commands specified using -e. No current connection
> {noformat}





[jira] [Moved] (HIVE-24307) Beeline with property-file and -e parameter is failing

2020-10-23 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena moved HDFS-15647 to HIVE-24307:


Key: HIVE-24307  (was: HDFS-15647)
Project: Hive  (was: Hadoop HDFS)

> Beeline with property-file and -e parameter is failing
> --
>
> Key: HIVE-24307
> URL: https://issues.apache.org/jira/browse/HIVE-24307
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>
> Beeline query with a property file specified along with the -e parameter fails with:
> {noformat}
> Cannot run commands specified using -e. No current connection
> {noformat}





[jira] [Updated] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24258:
--
Labels: pull-request-available  (was: )

> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Description
> Objects like table name, db name, column name, etc. are case insensitive as per 
> the HIVE contract, but the standalone metastore cachedstore is case sensitive. As 
> a result, there is a mismatch between the rawstore output and the cachedstore output.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>
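The mismatch above (pk_name `pk1` vs `Pk1`) comes from the cache comparing names case-sensitively while the raw store canonicalizes them. A minimal standalone sketch of the fix pattern, normalizing names to lower case on both write and read of the cache (class and method names here are hypothetical, not Hive's actual CachedStore API):

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch only: a cache that canonicalizes metadata names before using them
// as keys or values, so "Pk1" and "pk1" refer to the same entry.
class CaseInsensitiveCacheSketch {
  private final Map<String, String> pkNameByTable = new HashMap<>();

  // Hive treats db/table/column/constraint names as case insensitive,
  // so every name is lower-cased before it touches the cache.
  static String normalize(String name) {
    return name == null ? null : name.toLowerCase(Locale.ROOT);
  }

  void put(String dbName, String tblName, String pkName) {
    pkNameByTable.put(normalize(dbName) + "." + normalize(tblName), normalize(pkName));
  }

  String get(String dbName, String tblName) {
    return pkNameByTable.get(normalize(dbName) + "." + normalize(tblName));
  }

  public static void main(String[] args) {
    CaseInsensitiveCacheSketch cache = new CaseInsensitiveCacheSketch();
    cache.put("Test_Table_Ops", "Tbl", "Pk1");
    // Mixed-case lookup still hits the entry, and the stored pk_name now
    // matches what a case-normalizing raw store would return.
    System.out.println(cache.get("test_table_ops", "TBL"));  // prints "pk1"
  }
}
```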





[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504124
 ]

ASF GitHub Bot logged work on HIVE-24258:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 11:28
Start Date: 23/Oct/20 11:28
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1587:
URL: https://github.com/apache/hive/pull/1587#discussion_r510806059



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStoreUpdateUsingEvents.java
##
@@ -419,18 +412,14 @@ public void testConstraintsForUpdateUsingEvents() throws Exception {
   public void assertRawStoreAndCachedStoreConstraint(String catName, String dbName, String tblName)
       throws MetaException, NoSuchObjectException {
     SQLAllTableConstraints rawStoreConstraints = rawStore.getAllTableConstraints(catName, dbName, tblName);
-    List<SQLPrimaryKey> primaryKeys = sharedCache.listCachedPrimaryKeys(catName, dbName, tblName);
-    List<SQLNotNullConstraint> notNullConstraints = sharedCache.listCachedNotNullConstraints(catName, dbName, tblName);
-    List<SQLUniqueConstraint> uniqueConstraints = sharedCache.listCachedUniqueConstraint(catName, dbName, tblName);
-    List<SQLDefaultConstraint> defaultConstraints = sharedCache.listCachedDefaultConstraint(catName, dbName, tblName);
-    List<SQLCheckConstraint> checkConstraints = sharedCache.listCachedCheckConstraint(catName, dbName, tblName);
-    List<SQLForeignKey> foreignKeys = sharedCache.listCachedForeignKeys(catName, dbName, tblName, null, null);
-    Assert.assertEquals(rawStoreConstraints.getPrimaryKeys(), primaryKeys);
-    Assert.assertEquals(rawStoreConstraints.getNotNullConstraints(), notNullConstraints);
-    Assert.assertEquals(rawStoreConstraints.getUniqueConstraints(), uniqueConstraints);
-    Assert.assertEquals(rawStoreConstraints.getDefaultConstraints(), defaultConstraints);
-    Assert.assertEquals(rawStoreConstraints.getCheckConstraints(), checkConstraints);
-    Assert.assertEquals(rawStoreConstraints.getForeignKeys(), foreignKeys);
+    SQLAllTableConstraints cachedStoreConstraints = new SQLAllTableConstraints();
+    cachedStoreConstraints.setPrimaryKeys(sharedCache.listCachedPrimaryKeys(catName, dbName, tblName));
+    cachedStoreConstraints.setForeignKeys(sharedCache.listCachedForeignKeys(catName, dbName, tblName, null, null));
+    cachedStoreConstraints.setNotNullConstraints(sharedCache.listCachedNotNullConstraints(catName, dbName, tblName));
+    cachedStoreConstraints.setDefaultConstraints(sharedCache.listCachedDefaultConstraint(catName, dbName, tblName));
+    cachedStoreConstraints.setCheckConstraints(sharedCache.listCachedCheckConstraint(catName, dbName, tblName));
+    cachedStoreConstraints.setUniqueConstraints(sharedCache.listCachedUniqueConstraint(catName, dbName, tblName));
+    Assert.assertEquals(rawStoreConstraints,cachedStoreConstraints);

Review comment:
   nit: Add space after ,
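The test refactor in the hunk above replaces six field-by-field assertions with one equality check on a composite object. A minimal standalone sketch of that pattern, using a hypothetical `AllConstraints` value class as a stand-in for Hive's `SQLAllTableConstraints` (the real class is Thrift-generated; this is illustration only):

```java
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in: a value object whose equals() covers every field,
// so the raw-store side and the cache side can be compared in one step.
class AllConstraints {
  final List<String> primaryKeys;
  final List<String> foreignKeys;

  AllConstraints(List<String> pks, List<String> fks) {
    this.primaryKeys = pks;
    this.foreignKeys = fks;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof AllConstraints)) return false;
    AllConstraints other = (AllConstraints) o;
    return Objects.equals(primaryKeys, other.primaryKeys)
        && Objects.equals(foreignKeys, other.foreignKeys);
  }

  @Override
  public int hashCode() {
    return Objects.hash(primaryKeys, foreignKeys);
  }

  public static void main(String[] args) {
    AllConstraints fromRawStore = new AllConstraints(List.of("pk1"), List.of());
    AllConstraints fromCache = new AllConstraints(List.of("pk1"), List.of());
    // One comparison covers every constraint field; any future field added
    // to the value object is automatically included in the check.
    System.out.println(fromRawStore.equals(fromCache));  // prints "true"
  }
}
```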

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -2255,121 +2257,61 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req)
       tbl.putToParameters(hive_metastoreConstants.DDL_TIME, Long.toString(time));
     }
 
-    if (primaryKeys == null && foreignKeys == null
-        && uniqueConstraints == null && notNullConstraints == null && defaultConstraints == null
-        && checkConstraints == null) {
+    if (CollectionUtils.isEmpty(constraints.getPrimaryKeys()) && CollectionUtils.isEmpty(constraints.getForeignKeys())
+        && CollectionUtils.isEmpty(constraints.getUniqueConstraints())&& CollectionUtils.isEmpty(constraints.getNotNullConstraints())&& CollectionUtils.isEmpty(constraints.getDefaultConstraints())
+        && CollectionUtils.isEmpty(constraints.getCheckConstraints())) {
       ms.createTable(tbl);
     } else {
       // Check that constraints have catalog name properly set first
-      if (primaryKeys != null && !primaryKeys.isEmpty() && !primaryKeys.get(0).isSetCatName()) {
-        for (SQLPrimaryKey pkcol : primaryKeys) pkcol.setCatName(tbl.getCatName());
+      if (CollectionUtils.isNotEmpty(constraints.getPrimaryKeys()) && !constraints.getPrimaryKeys().get(0).isSetCatName()) {
+        for (SQLPrimaryKey pkcol : constraints.getPrimaryKeys()) pkcol.setCatName(tbl.getCatName());

Review comment:
   nit: Use for () { ... } even for a single statement.
   Or use this instead: constraints.getPrimaryKeys().forEach(pk -> pk.setCatName(tbl.getCatName()));
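The two idioms touched by this review can be sketched standalone: a null-safe emptiness check like Commons Collections' `CollectionUtils.isEmpty` (reimplemented here for self-containment), and `forEach` as the suggested alternative to a braceless single-statement loop. The `Key` class is a hypothetical stand-in for `SQLPrimaryKey`:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class ConstraintLoopSketch {
  static class Key {
    String catName;
    Key(String catName) { this.catName = catName; }
  }

  // Null-safe emptiness check: callers no longer need a separate
  // "list != null" guard, matching CollectionUtils.isEmpty semantics.
  static boolean isEmpty(Collection<?> c) {
    return c == null || c.isEmpty();
  }

  public static void main(String[] args) {
    List<Key> primaryKeys = Arrays.asList(new Key(null), new Key(null));

    if (!isEmpty(primaryKeys)) {
      // forEach keeps the single statement compact without dropping braces,
      // which is the style nit raised in the review above.
      primaryKeys.forEach(pk -> pk.catName = "hive");
    }
    System.out.println(primaryKeys.get(0).catName);  // prints "hive"
  }
}
```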

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -2785,31 +2696,23 @@ public void add_not_null_constraint(AddNotNullConstraintRequest req)
 @Override

[jira] [Updated] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-24258:

Description: 
Description

Objects like table name, db name, column name etc are case insensitive as per 
HIVE contract but standalone metastore cachedstore is case sensitive.  As 
result of which there is mismatch in rawstore output and cachedstore output.

Example - 

expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
validate_cstr:false, rely_cstr:false, catName:hive)]> 

but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
validate_cstr:false, rely_cstr:false, catName:hive)]>

  was:
Description

Objects like table name, db name, column name etc are case insensitive as per 
HIVE contract but standalone metastore cachedstore is case sensitive.  As 
result of which there is miss match in rawstore output and cachedstore output.

Example - 

expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
validate_cstr:false, rely_cstr:false, catName:hive)]> 

but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
validate_cstr:false, rely_cstr:false, catName:hive)]>


> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>
> Description
> Objects like table name, db name, column name etc are case insensitive as per 
> HIVE contract but standalone metastore cachedstore is case sensitive.  As 
> result of which there is mismatch in rawstore output and cachedstore output.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>





[jira] [Updated] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-24258:

Description: 
Description

Objects like table name, db name, column name etc are case insensitive as per 
HIVE contract but standalone metastore cachedstore is case sensitive.  As 
result of which there is miss match in rawstore output and cachedstore output.

Example - 

expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
validate_cstr:false, rely_cstr:false, catName:hive)]> 

but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
validate_cstr:false, rely_cstr:false, catName:hive)]>

  was:
Description

Objects like table name, db name, column name etc are case incentives as per 
HIVE contract but standalone metastore cachedstore is case sensitive.  As 
result of which there is miss match in rawstore output and cachedstore output.

Example - 

expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
validate_cstr:false, rely_cstr:false, catName:hive)]> 

but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
validate_cstr:false, rely_cstr:false, catName:hive)]>


> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>
> Description
> Objects like table name, db name, column name etc are case insensitive as per 
> HIVE contract but standalone metastore cachedstore is case sensitive.  As 
> result of which there is miss match in rawstore output and cachedstore output.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>





[jira] [Assigned] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore

2020-10-23 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-24258:
---

Assignee: Sankar Hariappan  (was: Ashish Sharma)

> [CachedStore] Data miss match between cachedstore and rawstore
> --
>
> Key: HIVE-24258
> URL: https://issues.apache.org/jira/browse/HIVE-24258
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Sankar Hariappan
>Priority: Major
>
> Description
> Objects like table name, db name, column name etc are case incentives as per 
> HIVE contract but standalone metastore cachedstore is case sensitive.  As 
> result of which there is miss match in rawstore output and cachedstore output.
> Example - 
> expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]> 
> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, 
> column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, 
> validate_cstr:false, rely_cstr:false, catName:hive)]>





[jira] [Work logged] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24165?focusedWorklogId=504048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504048
 ]

ASF GitHub Bot logged work on HIVE-24165:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 06:42
Start Date: 23/Oct/20 06:42
Worklog Time Spent: 10m 
  Work Description: loudongfeng closed pull request #1597:
URL: https://github.com/apache/hive/pull/1597


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504048)
Time Spent: 20m  (was: 10m)

> CBO: Query fails after multiple count distinct rewrite 
> ---
>
> Key: HIVE-24165
> URL: https://issues.apache.org/jira/browse/HIVE-24165
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24165.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> One way to reproduce:
>  
> {code:sql}
>  CREATE TABLE test(
>  `device_id` string, 
>  `level` string, 
>  `site_id` string, 
>  `user_id` string, 
>  `first_date` string, 
>  `last_date` string,
>  `dt` string) ;
>  set hive.execution.engine=tez;
>  set hive.optimize.distinct.rewrite=true;
>  set hive.cli.print.header=true;
>  select 
>  dt,
>  site_id,
>  count(DISTINCT t1.device_id) as device_tol_cnt,
>  count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else 
> null end) as device_add_cnt 
>  from test t1 where dt='2020-09-15' 
>  group by
>  dt,
>  site_id
>  ;
> {code}
>  
> Error log:  
> {code:java}
> Exception in thread "main" java.lang.AssertionError: Cannot add expression of 
> different type to set:
> set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
> "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
> expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT 
> $f3_0) NOT NULL
> set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, 
> 3},agg#0=count($0),agg#1=count($1))
> expression is HiveProject#95
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
>   at 
> org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
>   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
>   at 
> 

[jira] [Commented] (HIVE-18537) [Calcite-CBO] Queries with a nested distinct clause and a windowing function seem to fail with calcite Assertion error

2020-10-23 Thread Nemon Lou (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-18537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219498#comment-17219498
 ] 

Nemon Lou commented on HIVE-18537:
--

This issue got fixed after upgrading Calcite to 1.17.0 or higher.

The issue can no longer be reproduced on the master branch.

> [Calcite-CBO] Queries with a nested distinct clause and a windowing function 
> seem to fail with calcite Assertion error
> --
>
> Key: HIVE-18537
> URL: https://issues.apache.org/jira/browse/HIVE-18537
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0, 2.3.2, 3.1.2
>Reporter: Amruth Sampath
>Priority: Critical
>
> Sample test case to re-produce the issue. The issue does not occur if 
> *hive.cbo.enable=false*
> {code:java}
> create table test_cbo (
>  `a` BIGINT,
>  `b` STRING,
>  `c` TIMESTAMP,
>  `d` STRING
>  );
> SELECT 1
>  FROM
>  (SELECT
>  DISTINCT
>  a AS a_,
>  b AS b_,
>  rank() over (partition BY a ORDER BY c DESC) AS c_,
>  d AS d_
>  FROM test_cbo
>  WHERE b = 'some_filter' ) n
>  WHERE c_ = 1;
> {code}
> Fails with, 
> {code:java}
> Exception in thread "main" java.lang.AssertionError: Internal error: Cannot 
> add expression of different type to set:
> set type is RecordType(BIGINT a_, INTEGER c_, VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" d_) NOT NULL
> expression type is RecordType(BIGINT a_, VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" c_, INTEGER d_) NOT NULL
> set is rel#112:HiveAggregate.HIVE.[](input=HepRelVertex#121,group={0, 2, 3})
> expression is HiveProject#123{code}
> This might be related to https://issues.apache.org/jira/browse/CALCITE-1868.
>   





[jira] [Resolved] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite

2020-10-23 Thread Nemon Lou (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou resolved HIVE-24165.
--
Resolution: Invalid

> CBO: Query fails after multiple count distinct rewrite 
> ---
>
> Key: HIVE-24165
> URL: https://issues.apache.org/jira/browse/HIVE-24165
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24165.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> One way to reproduce:
>  
> {code:sql}
>  CREATE TABLE test(
>  `device_id` string, 
>  `level` string, 
>  `site_id` string, 
>  `user_id` string, 
>  `first_date` string, 
>  `last_date` string,
>  `dt` string) ;
>  set hive.execution.engine=tez;
>  set hive.optimize.distinct.rewrite=true;
>  set hive.cli.print.header=true;
>  select 
>  dt,
>  site_id,
>  count(DISTINCT t1.device_id) as device_tol_cnt,
>  count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else 
> null end) as device_add_cnt 
>  from test t1 where dt='2020-09-15' 
>  group by
>  dt,
>  site_id
>  ;
> {code}
>  
> Error log:  
> {code:java}
> Exception in thread "main" java.lang.AssertionError: Cannot add expression of 
> different type to set:
> set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
> "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
> expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT 
> $f3_0) NOT NULL
> set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, 
> 3},agg#0=count($0),agg#1=count($1))
> expression is HiveProject#95
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
>   at 
> org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
>   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> 

[jira] [Commented] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite

2020-10-23 Thread Nemon Lou (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219496#comment-17219496
 ] 

Nemon Lou commented on HIVE-24165:
--

Not able to reproduce on the master branch.

After upgrading Calcite from 1.16.0 to 1.17.0, this bug is also gone for branch-3 with the multiple count distinct rewrite.

It may have been fixed by CALCITE-2232.

> CBO: Query fails after multiple count distinct rewrite 
> ---
>
> Key: HIVE-24165
> URL: https://issues.apache.org/jira/browse/HIVE-24165
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24165.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> One way to reproduce:
>  
> {code:sql}
>  CREATE TABLE test(
>  `device_id` string, 
>  `level` string, 
>  `site_id` string, 
>  `user_id` string, 
>  `first_date` string, 
>  `last_date` string,
>  `dt` string) ;
>  set hive.execution.engine=tez;
>  set hive.optimize.distinct.rewrite=true;
>  set hive.cli.print.header=true;
>  select 
>  dt,
>  site_id,
>  count(DISTINCT t1.device_id) as device_tol_cnt,
>  count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else 
> null end) as device_add_cnt 
>  from test t1 where dt='2020-09-15' 
>  group by
>  dt,
>  site_id
>  ;
> {code}
>  
> Error log:  
> {code:java}
> Exception in thread "main" java.lang.AssertionError: Cannot add expression of 
> different type to set:
> set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
> "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
> expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT 
> $f3_0) NOT NULL
> set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, 
> 3},agg#0=count($0),agg#1=count($1))
> expression is HiveProject#95
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
>   at 
> org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
>   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
>   at