[jira] [Work logged] (HIVE-23935) Fetching primaryKey through beeline fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23935?focusedWorklogId=504411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504411 ] ASF GitHub Bot logged work on HIVE-23935: - Author: ASF GitHub Bot Created on: 24/Oct/20 04:35 Start Date: 24/Oct/20 04:35 Worklog Time Spent: 10m Work Description: ayushtkn commented on a change in pull request #1605: URL: https://github.com/apache/hive/pull/1605#discussion_r511318476 ## File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java ## @@ -1124,6 +1125,40 @@ public void testEmptyTrustStoreProps() { setAndCheckSSLProperties(true, "", "", "jks"); } + /** + * Tests getPrimaryKeys() when db_name isn't specified. + */ + @Test + public void testGetPrimaryKeys() throws Exception { Review comment: The change itself is in ObjectStore, and the API too is in ObjectStore. The `TestPrimaryKey` is a parametrised test, and for one case this issue doesn't happen. I think the test is better here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504411) Time Spent: 1.5h (was: 1h 20m) > Fetching primaryKey through beeline fails with NPE > -- > > Key: HIVE-23935 > URL: https://issues.apache.org/jira/browse/HIVE-23935 > Project: Hive > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Fetching PrimaryKey of a table through Beeline !primarykey fails with NPE > {noformat} > 0: jdbc:hive2://localhost:1> !primarykeys Persons > Error: MetaException(message:java.lang.NullPointerException) (state=,code=0) > org.apache.hive.service.cli.HiveSQLException: > MetaException(message:java.lang.NullPointerException) > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:360) > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:351) > at > org.apache.hive.jdbc.HiveDatabaseMetaData.getPrimaryKeys(HiveDatabaseMetaData.java:573) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hive.beeline.Reflector.invoke(Reflector.java:89) > at org.apache.hive.beeline.Commands.metadata(Commands.java:125) > at org.apache.hive.beeline.Commands.primarykeys(Commands.java:231) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:57) > at > org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1465) > 
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1504) > at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1364) > at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1134) > at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1082) > at > org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546) > at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:323) > at org.apache.hadoop.util.RunJar.main(RunJar.java:236){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
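The diff under review replaces an unconditional `normalizeIdentifier(db_name_input)` in ObjectStore with a null guard, so that `!primarykeys` without a database name no longer trips an NPE. A minimal sketch of the idea, where `normalizeIdentifier` is a stand-in (the real Hive helper may behave differently):

```java
public class IdentifierGuard {
    // Stand-in for Hive's normalizeIdentifier; illustrative only.
    static String normalizeIdentifier(String id) {
        return id.trim().toLowerCase();  // throws NPE if id is null -- the original bug
    }

    // Guarded version matching the spirit of the HIVE-23935 fix:
    // only normalize when the caller actually supplied a db name.
    static String normalizeIfPresent(String dbNameInput) {
        return (dbNameInput != null) ? normalizeIdentifier(dbNameInput) : null;
    }

    public static void main(String[] args) {
        System.out.println(normalizeIfPresent("Default"));
        System.out.println(normalizeIfPresent(null));  // no NPE
    }
}
```

With the guard in place, a null input simply propagates as null to the downstream lookup instead of crashing in normalization.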
[jira] [Work logged] (HIVE-23935) Fetching primaryKey through beeline fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23935?focusedWorklogId=504410&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504410 ] ASF GitHub Bot logged work on HIVE-23935: - Author: ASF GitHub Bot Created on: 24/Oct/20 04:07 Start Date: 24/Oct/20 04:07 Worklog Time Spent: 10m Work Description: ayushtkn commented on a change in pull request #1605: URL: https://github.com/apache/hive/pull/1605#discussion_r511305357 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java ## @@ -10833,7 +10833,8 @@ public FileMetadataHandler getFileMetadataHandler(FileMetadataExprType type) { final String db_name_input, final String tbl_name_input) throws MetaException, NoSuchObjectException { -final String db_name = normalizeIdentifier(db_name_input); +final String db_name = Review comment: Thanks @ashish-kumar-sharma. 1. I will address this. 2. Why for catName? It isn't accessed anywhere that could trigger an NPE. 3. I didn't catch this one; I haven't introduced any new variable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504410) Time Spent: 1h 20m (was: 1h 10m) > Fetching primaryKey through beeline fails with NPE > -- > > Key: HIVE-23935 > URL: https://issues.apache.org/jira/browse/HIVE-23935 > Project: Hive > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Fetching PrimaryKey of a table through Beeline !primarykey fails with NPE > {noformat} > 0: jdbc:hive2://localhost:1> !primarykeys Persons > Error: MetaException(message:java.lang.NullPointerException) (state=,code=0) > org.apache.hive.service.cli.HiveSQLException: > MetaException(message:java.lang.NullPointerException) > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:360) > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:351) > at > org.apache.hive.jdbc.HiveDatabaseMetaData.getPrimaryKeys(HiveDatabaseMetaData.java:573) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hive.beeline.Reflector.invoke(Reflector.java:89) > at org.apache.hive.beeline.Commands.metadata(Commands.java:125) > at org.apache.hive.beeline.Commands.primarykeys(Commands.java:231) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:57) > at > 
org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1465) > at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1504) > at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1364) > at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1134) > at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1082) > at > org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546) > at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:323) > at org.apache.hadoop.util.RunJar.main(RunJar.java:236){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23935) Fetching primaryKey through beeline fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23935?focusedWorklogId=504409&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504409 ] ASF GitHub Bot logged work on HIVE-23935: - Author: ASF GitHub Bot Created on: 24/Oct/20 03:53 Start Date: 24/Oct/20 03:53 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1605: URL: https://github.com/apache/hive/pull/1605#discussion_r511299604 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java ## @@ -10833,7 +10833,8 @@ public FileMetadataHandler getFileMetadataHandler(FileMetadataExprType type) { final String db_name_input, final String tbl_name_input) throws MetaException, NoSuchObjectException { -final String db_name = normalizeIdentifier(db_name_input); +final String db_name = Review comment: 1. Can we use StringUtils.isNotBlank(db_name_input) instead of (db_name_input != null)? 2. Also, can we have the same check on catName? 3. Can we use a unified camel-case naming convention across variable names? ## File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java ## @@ -1124,6 +1125,40 @@ public void testEmptyTrustStoreProps() { setAndCheckSSLProperties(true, "", "", "jks"); } + /** + * Tests getPrimaryKeys() when db_name isn't specified. + */ + @Test + public void testGetPrimaryKeys() throws Exception { Review comment: Please add this test to the class TestPrimaryKey.java. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504409) Time Spent: 1h 10m (was: 1h) > Fetching primaryKey through beeline fails with NPE > -- > > Key: HIVE-23935 > URL: https://issues.apache.org/jira/browse/HIVE-23935 > Project: Hive > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Fetching PrimaryKey of a table through Beeline !primarykey fails with NPE > {noformat} > 0: jdbc:hive2://localhost:1> !primarykeys Persons > Error: MetaException(message:java.lang.NullPointerException) (state=,code=0) > org.apache.hive.service.cli.HiveSQLException: > MetaException(message:java.lang.NullPointerException) > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:360) > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:351) > at > org.apache.hive.jdbc.HiveDatabaseMetaData.getPrimaryKeys(HiveDatabaseMetaData.java:573) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hive.beeline.Reflector.invoke(Reflector.java:89) > at org.apache.hive.beeline.Commands.metadata(Commands.java:125) > at org.apache.hive.beeline.Commands.primarykeys(Commands.java:231) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:57) > at > org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1465) > 
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1504) > at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1364) > at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1134) > at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1082) > at > org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546) > at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at >
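The reviewer's first point contrasts a bare null check with commons-lang's `StringUtils.isNotBlank`, which additionally rejects empty and whitespace-only strings. A self-contained sketch of the difference, re-implementing the check locally so the example runs without the commons-lang jar (the real method is `org.apache.commons.lang3.StringUtils.isNotBlank`):

```java
public class BlankCheck {
    // Local equivalent of org.apache.commons.lang3.StringUtils.isNotBlank.
    static boolean isNotBlank(CharSequence cs) {
        if (cs == null || cs.length() == 0) {
            return false;
        }
        for (int i = 0; i < cs.length(); i++) {
            if (!Character.isWhitespace(cs.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A plain (s != null) check lets "" and "   " through; isNotBlank does not.
        System.out.println(isNotBlank(null));
        System.out.println(isNotBlank("   "));
        System.out.println(isNotBlank("db1"));
    }
}
```

Whether the stricter check is right depends on whether an empty db name should fall back to the default database or be treated as an error, which is exactly the trade-off the review thread is weighing.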
[jira] [Updated] (HIVE-24310) Allow specified number of deserialize errors to be ignored
[ https://issues.apache.org/jira/browse/HIVE-24310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24310: -- Labels: pull-request-available (was: ) > Allow specified number of deserialize errors to be ignored > -- > > Key: HIVE-24310 > URL: https://issues.apache.org/jira/browse/HIVE-24310 > Project: Hive > Issue Type: Improvement > Components: Operators >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Sometimes we see corrupted records in a user's raw data, for example a single > corrupted record in a file containing thousands of records. The user currently > has to either give up all the records or replay the whole dataset in order to > run successfully on Hive, so we should provide a way to ignore such corrupted > records. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24310) Allow specified number of deserialize errors to be ignored
[ https://issues.apache.org/jira/browse/HIVE-24310?focusedWorklogId=504402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504402 ] ASF GitHub Bot logged work on HIVE-24310: - Author: ASF GitHub Bot Created on: 24/Oct/20 02:44 Start Date: 24/Oct/20 02:44 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1607: URL: https://github.com/apache/hive/pull/1607 ### What changes were proposed in this pull request? Allow a specified number of deserialize errors to be ignored. ### Why are the changes needed? Sometimes we see corrupted records in a user's raw data, for example a single corrupted record in a file containing thousands of records. The user currently has to either give up all the records or replay the whole dataset in order to run successfully on Hive, so we should provide a way to ignore such corrupted records. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? unit tests This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504402) Remaining Estimate: 0h Time Spent: 10m > Allow specified number of deserialize errors to be ignored > -- > > Key: HIVE-24310 > URL: https://issues.apache.org/jira/browse/HIVE-24310 > Project: Hive > Issue Type: Improvement > Components: Operators >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Sometimes we see corrupted records in a user's raw data, for example a single > corrupted record in a file containing thousands of records. The user currently > has to either give up all the records or replay the whole dataset in order to > run successfully on Hive, so we should provide a way to ignore such corrupted > records. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
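The improvement proposed in HIVE-24310 amounts to tolerating up to N bad records before failing the query. A hedged sketch of that pattern (class and threshold names are illustrative; they are not Hive's actual classes or configuration keys):

```java
public class DeserializeErrorBudget {
    private final long maxErrors;
    private long errors;

    DeserializeErrorBudget(long maxErrors) {
        this.maxErrors = maxErrors;
    }

    /** Record one corrupted row; throw only once the budget is exhausted. */
    void recordError(Exception cause) {
        errors++;
        if (errors > maxErrors) {
            throw new RuntimeException(
                "Exceeded " + maxErrors + " deserialize errors", cause);
        }
        // Otherwise skip the row and keep reading.
    }

    long errorsSoFar() {
        return errors;
    }
}
```

The operator's read loop would call `recordError` from its deserialization catch block instead of rethrowing immediately, so a handful of corrupt rows in a file of thousands no longer aborts the whole job.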
[jira] [Assigned] (HIVE-24310) Allow specified number of deserialize errors to be ignored
[ https://issues.apache.org/jira/browse/HIVE-24310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng reassigned HIVE-24310: -- > Allow specified number of deserialize errors to be ignored > -- > > Key: HIVE-24310 > URL: https://issues.apache.org/jira/browse/HIVE-24310 > Project: Hive > Issue Type: Improvement > Components: Operators >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > > Sometimes we see corrupted records in a user's raw data, for example a single > corrupted record in a file containing thousands of records. The user currently > has to either give up all the records or replay the whole dataset in order to > run successfully on Hive, so we should provide a way to ignore such corrupted > records. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24032) Remove hadoop shims dependency and use FileSystem Api directly from standalone metastore
[ https://issues.apache.org/jira/browse/HIVE-24032?focusedWorklogId=504388=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504388 ] ASF GitHub Bot logged work on HIVE-24032: - Author: ASF GitHub Bot Created on: 24/Oct/20 00:58 Start Date: 24/Oct/20 00:58 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1396: URL: https://github.com/apache/hive/pull/1396#issuecomment-715647392 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504388) Time Spent: 1.5h (was: 1h 20m) > Remove hadoop shims dependency and use FileSystem Api directly from > standalone metastore > > > Key: HIVE-24032 > URL: https://issues.apache.org/jira/browse/HIVE-24032 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24032.01.patch, HIVE-24032.02.patch, > HIVE-24032.03.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > Remove hadoop shims dependency from standalone metastore. > Rename hive.repl.data.copy.lazy hive conf to > hive.repl.run.data.copy.tasks.on.target -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23926) Flaky test TestTableLevelReplicationScenarios.testRenameTableScenariosWithReplacePolicyDMLOperattion
[ https://issues.apache.org/jira/browse/HIVE-23926?focusedWorklogId=504386=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504386 ] ASF GitHub Bot logged work on HIVE-23926: - Author: ASF GitHub Bot Created on: 24/Oct/20 00:58 Start Date: 24/Oct/20 00:58 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1420: URL: https://github.com/apache/hive/pull/1420#issuecomment-715647384 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504386) Time Spent: 20m (was: 10m) > Flaky test > TestTableLevelReplicationScenarios.testRenameTableScenariosWithReplacePolicyDMLOperattion > > > Key: HIVE-23926 > URL: https://issues.apache.org/jira/browse/HIVE-23926 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Arko Sharma >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23926.01.patch > > Time Spent: 20m > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-precommit/job/master/123/testReport/org.apache.hadoop.hive.ql.parse/TestTableLevelReplicationScenarios/Testing___split_18___Archive___testRenameTableScenariosWithReplacePolicyDMLOperattion/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24304) Query containing UNION fails with OOM
[ https://issues.apache.org/jira/browse/HIVE-24304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg resolved HIVE-24304. Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master. > Query containing UNION fails with OOM > - > > Key: HIVE-24304 > URL: https://issues.apache.org/jira/browse/HIVE-24304 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24304) Query containing UNION fails with OOM
[ https://issues.apache.org/jira/browse/HIVE-24304?focusedWorklogId=504378=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504378 ] ASF GitHub Bot logged work on HIVE-24304: - Author: ASF GitHub Bot Created on: 23/Oct/20 23:54 Start Date: 23/Oct/20 23:54 Worklog Time Spent: 10m Work Description: vineetgarg02 merged pull request #1600: URL: https://github.com/apache/hive/pull/1600 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504378) Time Spent: 0.5h (was: 20m) > Query containing UNION fails with OOM > - > > Key: HIVE-24304 > URL: https://issues.apache.org/jira/browse/HIVE-24304 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24294) TezSessionPool sessions can throw AssertionError
[ https://issues.apache.org/jira/browse/HIVE-24294?focusedWorklogId=504344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504344 ] ASF GitHub Bot logged work on HIVE-24294: - Author: ASF GitHub Bot Created on: 23/Oct/20 22:06 Start Date: 23/Oct/20 22:06 Worklog Time Spent: 10m Work Description: mustafaiman commented on pull request #1596: URL: https://github.com/apache/hive/pull/1596#issuecomment-715610964 LGTM +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504344) Time Spent: 20m (was: 10m) > TezSessionPool sessions can throw AssertionError > > > Key: HIVE-24294 > URL: https://issues.apache.org/jira/browse/HIVE-24294 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Whenever default TezSessionPool sessions are reopened for some reason, we set > dagResources to null before close() and set it back in open(). > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L498-L503 > If there is an exception in sessionState.close(), we do not restore > dagResources but still move the session back to the TezSessionPool. E.g., exception > trace when sessionState.close() failed: > {code:java} > 2020-10-15T09:20:28,749 INFO [HiveServer2-Background-Pool: Thread-25451]: > client.TezClient (:()) - Failed to shutdown Tez Session via proxy > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1602093123456_12345, 
yarnApplicationState=FINISHED, > finalApplicationStatus=SUCCEEDED, > trackingUrl=http://localhost:8088/proxy/application_1602093123456_12345/, > diagnostics=Session timed out, lastDAGCompletionTime=1602997683786 ms, > sessionTimeoutInterval=60 ms > Session stats:submittedDAGs=2, successfulDAGs=2, failedDAGs=0, killedDAGs=0 > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1060) > at org.apache.tez.client.TezClient.stop(TezClient.java:743) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:789) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:756) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.close(TezSessionPoolSession.java:111) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopenInternal(TezSessionPoolManager.java:496) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopen(TezSessionPoolManager.java:487) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.reopen(TezSessionPoolSession.java:228) > > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.getNewTezSessionOnError(TezTask.java:531) > > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:546) > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:221){code} > Because of this, all new queries using this corrupted sessions are failing > with below exception > {code:java} > Caused by: java.lang.AssertionError: Ensure called on an unitialized (or > closed) session 41774265-b7da-4d58-84a8-1bedfd597aecCaused by: > java.lang.AssertionError: Ensure called on an unitialized (or closed) session > 41774265-b7da-4d58-84a8-1bedfd597aec at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:685){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
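The root cause described in HIVE-24294 is state lost on an exception path: `dagResources` is cleared before `close()` and only restored if `close()` succeeds. The standard remedy is a try/finally around the close. A minimal sketch under that assumption (the field and method names mirror the report; the class itself is illustrative, not the actual TezSessionPoolManager code):

```java
public class SessionReopenSketch {
    Object dagResources = new Object();
    boolean closeShouldFail = false;

    void close() {
        if (closeShouldFail) {
            throw new RuntimeException("Session timed out");
        }
    }

    /** Reopen pattern that restores dagResources even when close() throws. */
    void reopen() {
        Object saved = dagResources;
        dagResources = null;
        try {
            close();
        } finally {
            // Without this finally, a failed close() leaves dagResources null
            // and later queries die with the AssertionError from the report.
            dagResources = saved;
        }
    }
}
```

The key property is that the session returned to the pool is never left with null resources, regardless of whether `close()` succeeded.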
[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background
[ https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504332=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504332 ] ASF GitHub Bot logged work on HIVE-24270: - Author: ASF GitHub Bot Created on: 23/Oct/20 21:36 Start Date: 23/Oct/20 21:36 Worklog Time Spent: 10m Work Description: nareshpr commented on pull request #1577: URL: https://github.com/apache/hive/pull/1577#issuecomment-715600895 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504332) Time Spent: 1h 40m (was: 1.5h) > Move scratchdir cleanup to background > - > > Key: HIVE-24270 > URL: https://issues.apache.org/jira/browse/HIVE-24270 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > In cloud environment, scratchdir cleaning at the end of the query may take > long time. This causes client to hang up to 1 minute even after the results > were streamed back. During this time client just waits for cleanup to finish. > Cleanup can take place in the background in HiveServer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
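Moving cleanup off the client-facing path, as HIVE-24270 proposes, typically means handing the delete to a background executor so the query can return immediately. A sketch under that assumption (this is the general pattern, not the actual HiveServer2 implementation):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ScratchDirCleaner {
    // Single daemon thread so cleanup never blocks server shutdown.
    private final ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "scratchdir-cleanup");
        t.setDaemon(true);
        return t;
    });

    /** Schedule deletion and return immediately instead of blocking the caller. */
    Future<?> cleanAsync(Path scratchDir) {
        return pool.submit(() -> {
            try {
                Files.deleteIfExists(scratchDir);
            } catch (Exception e) {
                // In a server this would be logged; it must not fail the query.
            }
        });
    }

    /** Convenience: schedule, wait briefly, report whether the path is gone. */
    boolean cleanAndWait(Path scratchDir) {
        try {
            cleanAsync(scratchDir).get(10, TimeUnit.SECONDS);
        } catch (Exception e) {
            return false;
        }
        return !Files.exists(scratchDir);
    }
}
```

The trade-off the ticket describes is latency versus certainty: the client no longer waits up to a minute for slow cloud-storage deletes, at the cost of cleanup completing asynchronously after results have streamed back.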
[jira] [Commented] (HIVE-22912) Support native submission of Hive queries to a Kubernetes Cluster
[ https://issues.apache.org/jira/browse/HIVE-22912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219943#comment-17219943 ] Viacheslav Avramenko commented on HIVE-22912: - I agree with Surbhi and Michel; what about adding Kubernetes support as an open-source project? > Support native submission of Hive queries to a Kubernetes Cluster > - > > Key: HIVE-22912 > URL: https://issues.apache.org/jira/browse/HIVE-22912 > Project: Hive > Issue Type: New Feature >Reporter: Surbhi Aggarwal >Priority: Major > > So many big data applications are already integrated, or trying to natively > integrate, with the Kubernetes engine. Should we not work together to support Hive > with this engine? > If efforts are already being spent on this, please point me to it. Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24066) Hive query on parquet data should identify if column is not present in file schema and show NULL value instead of Exception
[ https://issues.apache.org/jira/browse/HIVE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jainik Vora updated HIVE-24066: --- Description: I created a hive table containing columns with struct data type {code:java} CREATE EXTERNAL TABLE test_dwh.sample_parquet_table ( `context` struct< `app`: struct< `build`: string, `name`: string, `namespace`: string, `version`: string >, `device`: struct< `adtrackingenabled`: boolean, `advertisingid`: string, `id`: string, `manufacturer`: string, `model`: string, `type`: string >, `locale`: string, `library`: struct< `name`: string, `version`: string >, `os`: struct< `name`: string, `version`: string >, `screen`: struct< `height`: bigint, `width`: bigint >, `network`: struct< `carrier`: string, `cellular`: boolean, `wifi`: boolean >, `timezone`: string, `userAgent`: string > ) PARTITIONED BY (day string) STORED as PARQUET LOCATION 's3://xyz/events'{code} All columns are nullable hence the parquet files read by the table don't always contain all columns. If any file in a partition doesn't have "context.os" struct and if "context.os.name" is queried, Hive throws an exception as below. Same for "context.screen" as well. 
{code:java} 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with exception java.io.IOException:java.lang.RuntimeException: Primitive type osshould not doesn't match typeos[name] 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with exception java.io.IOException:java.lang.RuntimeException: Primitive type osshould not doesn't match typeos[name]java.io.IOException: java.lang.RuntimeException: Primitive type osshould not doesn't match typeos[name] at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2208) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:239) at org.apache.hadoop.util.RunJar.main(RunJar.java:153) Caused by: java.lang.RuntimeException: Primitive type osshould not doesn't match typeos[name] at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:330) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:322) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getProjectedSchema(DataWritableReadSupport.java:249) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:379) at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:84) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:75) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:60) at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:75) at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:695) at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:333) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459) ... 16 more{code} Querying context.os shows as null {code:java} hive> select context.os from test_dwh.sample_parquet_table where day='01' limit 5; OK NULL NULL NULL NULL NULL
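The behavior HIVE-24066 asks for is that a projection over a nested field absent from a file's schema should yield NULL rather than throw. Hive's actual Parquet schema-projection logic in DataWritableReadSupport is considerably more involved; the sketch below only illustrates the desired null-propagation semantics, using nested maps as a stand-in for file records:

```java
import java.util.HashMap;
import java.util.Map;

public class MissingFieldLookup {
    /** Resolve a dotted path like "context.os.name"; any missing level yields null. */
    @SuppressWarnings("unchecked")
    static Object resolve(Map<String, Object> record, String dottedPath) {
        Object current = record;
        for (String part : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;  // the file's schema lacks this struct level
            }
            current = ((Map<String, Object>) current).get(part);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("context", new HashMap<String, Object>());  // no "os" struct
        // A file without context.os should produce NULL, not an exception.
        System.out.println(resolve(record, "context.os.name"));
    }
}
```

In reader terms: the projection step checks whether each requested leaf exists in the file schema before descending, and substitutes NULL for the missing subtree instead of raising the "Primitive type ... doesn't match type ..." error shown above.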
[jira] [Commented] (HIVE-21737) Upgrade Avro to version 1.10.0
[ https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219858#comment-17219858 ] Chao Sun commented on HIVE-21737: - [~iemejia] instead of upgrading Avro in Hive, I think we can alternatively replace usage of the API that was removed in Avro 1.9 by [AVRO-1605|https://issues.apache.org/jira/browse/AVRO-1605] (and had been marked deprecated since Avro 1.8) - in particular, {{JsonProperties#getJsonProp}}. This could be an easier approach. > Upgrade Avro to version 1.10.0 > -- > > Key: HIVE-21737 > URL: https://issues.apache.org/jira/browse/HIVE-21737 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ismaël Mejía >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Attachments: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.2.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Avro >= 1.9.x brings a lot of fixes, including a leaner Avro without > Jackson in the public API or Guava as a dependency. Worth the update. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24304) Query containing UNION fails with OOM
[ https://issues.apache.org/jira/browse/HIVE-24304?focusedWorklogId=504257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504257 ] ASF GitHub Bot logged work on HIVE-24304: - Author: ASF GitHub Bot Created on: 23/Oct/20 17:11 Start Date: 23/Oct/20 17:11 Worklog Time Spent: 10m Work Description: jcamachor commented on a change in pull request #1600: URL: https://github.com/apache/hive/pull/1600#discussion_r511024068 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdExpressionLineage.java ## @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+package org.apache.hadoop.hive.ql.optimizer.calcite.stats;
+
+
+import org.apache.calcite.rel.core.Union;
+import org.apache.calcite.rel.metadata.BuiltInMetadata;
+import org.apache.calcite.rel.metadata.MetadataDef;
+import org.apache.calcite.rel.metadata.MetadataHandler;
+import org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider;
+import org.apache.calcite.rel.metadata.RelMetadataProvider;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveUnion;
+
+import java.util.Set;
+
+public final class HiveRelMdExpressionLineage
+    implements MetadataHandler<BuiltInMetadata.ExpressionLineage> {
+  public static final RelMetadataProvider SOURCE =
+      ReflectiveRelMetadataProvider.reflectiveSource(
+          BuiltInMethod.EXPRESSION_LINEAGE.method, new HiveRelMdExpressionLineage());
+
+  //~ Constructors ---
+
+  private HiveRelMdExpressionLineage() {}
+
+  //~ Methods
+
+  public MetadataDef<BuiltInMetadata.ExpressionLineage> getDef() {
+    return BuiltInMetadata.ExpressionLineage.DEF;
+  }
+
+  public Set<RexNode> getExpressionLineage(HiveUnion rel, RelMetadataQuery mq,
+      RexNode outputExpression) {
+    return null;
Review comment: Can we add a comment based on the JIRA discussion on why we are returning null for Union operator (it will help us recall in case we revisit this code in the future)? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504257) Time Spent: 20m (was: 10m) > Query containing UNION fails with OOM > - > > Key: HIVE-24304 > URL: https://issues.apache.org/jira/browse/HIVE-24304 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-20273) Spark jobs aren't cancelled if getSparkJobInfo or getSparkStagesInfo
[ https://issues.apache.org/jira/browse/HIVE-20273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-20273: --- Assignee: (was: Sahil Takiar) > Spark jobs aren't cancelled if getSparkJobInfo or getSparkStagesInfo > > > Key: HIVE-20273 > URL: https://issues.apache.org/jira/browse/HIVE-20273 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Priority: Major > Attachments: HIVE-20273.1.patch, HIVE-20273.2.patch > > > HIVE-19053 and HIVE-19733 added handling of {{InterruptedException}} to > {{RemoteSparkJobStatus#getSparkJobInfo}} and > {{RemoteSparkJobStatus#getSparkStagesInfo}}. Now, these methods catch > {{InterruptedException}} and wrap the exception in a {{HiveException}} and > then throw the new {{HiveException}}. > This new {{HiveException}} is then caught in > {{RemoteSparkJobMonitor#startMonitor}} which then looks for exceptions that > match the condition: > {code:java} > if (e instanceof InterruptedException || > (e instanceof HiveException && e.getCause() instanceof > InterruptedException)) > {code} > If this condition is met (in this case it is), the exception will again be > wrapped in another {{HiveException}} and is thrown again. So the final > exception is a {{HiveException}} that wraps a {{HiveException}} that wraps an > {{InterruptedException}}. > The double nesting of {{HiveException}} breaks the logic in > {{SparkTask#setSparkException}}, so {{killJob}} is never triggered. > As a result, interrupted Hive queries do not kill their corresponding Spark > jobs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
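The root cause described in HIVE-20273 is that the monitor's `instanceof` check only unwraps one level of nesting, so a doubly wrapped `InterruptedException` slips through. A minimal sketch of a check that walks the whole cause chain instead (the class and method names here are hypothetical, not Hive's actual code):

```java
public final class InterruptionUtils {
  // Walk the entire cause chain, so a HiveException wrapping a HiveException
  // wrapping an InterruptedException is still recognized -- the one-level
  // e.getCause() instanceof InterruptedException check misses that case.
  public static boolean isInterruption(Throwable e) {
    while (e != null) {
      if (e instanceof InterruptedException) {
        return true;
      }
      e = e.getCause();
    }
    return false;
  }
}
```

With this shape of check, the depth of wrapping no longer matters, and the cancellation path (`killJob`) can be triggered regardless of how many times the original exception was re-wrapped.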
[jira] [Assigned] (HIVE-20519) Remove 30m min value for hive.spark.session.timeout
[ https://issues.apache.org/jira/browse/HIVE-20519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-20519: --- Assignee: (was: Sahil Takiar) > Remove 30m min value for hive.spark.session.timeout > --- > > Key: HIVE-20519 > URL: https://issues.apache.org/jira/browse/HIVE-20519 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Priority: Major > Attachments: HIVE-20519.1.patch, HIVE-20519.2.patch, > HIVE-20519.3.patch > > > In HIVE-14162 we added the config \{{hive.spark.session.timeout}} which > provided a way to time out Spark sessions that are active for a long period > of time. The config has a lower bound of 30m which we should remove. It > should be possible for users to configure this value so the HoS session is > closed as soon as the query is complete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
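To illustrate why the lower bound matters, here is a hypothetical validator in the spirit of Hive's time-validated configs (the names and signature are assumptions, not the real `TimeValidator`): with a 30m minimum a short timeout is rejected, and with the bound removed any positive value is accepted.

```java
import java.util.concurrent.TimeUnit;

public final class TimeoutConf {
  // Hypothetical stand-in for a config validator with an optional lower bound.
  // Passing minMs == null models the bound being removed, as HIVE-20519 proposes.
  public static long validate(long value, TimeUnit unit, Long minMs) {
    long ms = unit.toMillis(value);
    if (minMs != null && ms < minMs) {
      throw new IllegalArgumentException(
          value + " " + unit + " is below the minimum of " + minMs + " ms");
    }
    return ms;
  }
}
```

Under the old behavior, `validate(5, TimeUnit.MINUTES, TimeUnit.MINUTES.toMillis(30))` throws; with the bound removed, `validate(5, TimeUnit.MINUTES, null)` returns 300000 ms, letting users close the session as soon as a query completes.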
[jira] [Assigned] (HIVE-20828) Upgrade to Spark 2.4.0
[ https://issues.apache.org/jira/browse/HIVE-20828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-20828: --- Assignee: (was: Sahil Takiar) > Upgrade to Spark 2.4.0 > -- > > Key: HIVE-20828 > URL: https://issues.apache.org/jira/browse/HIVE-20828 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Priority: Major > Attachments: HIVE-20828.1.patch, HIVE-20828.2.patch > > > The Spark community is in the process of releasing Spark 2.4.0. We should do > some testing with the RC candidates and then upgrade once the release is > finalized. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-19821) Distributed HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-19821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-19821: --- Assignee: (was: Sahil Takiar) > Distributed HiveServer2 > --- > > Key: HIVE-19821 > URL: https://issues.apache.org/jira/browse/HIVE-19821 > Project: Hive > Issue Type: New Feature > Components: HiveServer2 >Reporter: Sahil Takiar >Priority: Major > Attachments: HIVE-19821.1.WIP.patch, HIVE-19821.2.WIP.patch, > HIVE-19821_ Distributed HiveServer2.pdf > > > HS2 deployments often hit OOM issues due to a number of factors: (1) too many > concurrent connections, (2) queries that scan a large number of partitions > have to pull a lot of metadata into memory (e.g. a query reading thousands of > partitions requires loading thousands of partitions into memory), (3) very > large queries can take up a lot of heap space, especially during query > parsing. There are a number of other factors that cause HiveServer2 to run > out of memory; these are just some of the more common ones. > Distributed HS2 proposes to do all query parsing, compilation, planning, and > execution coordination inside a dedicated container. This should > significantly decrease memory pressure on HS2 and allow HS2 to scale to a > larger number of concurrent users. > For HoS (and I think Hive-on-Tez) this just requires moving all query > compilation, planning, etc. inside the application master for the > corresponding Hive session. > The main benefit here is isolation. A poorly written Hive query cannot bring > down an entire HiveServer2 instance and force all other queries to fail. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background
[ https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504250=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504250 ] ASF GitHub Bot logged work on HIVE-24270: - Author: ASF GitHub Bot Created on: 23/Oct/20 16:56 Start Date: 23/Oct/20 16:56 Worklog Time Spent: 10m Work Description: mustafaiman commented on pull request #1577: URL: https://github.com/apache/hive/pull/1577#issuecomment-715459208 @kgyrtkirk @nareshpr I significantly changed the patch. Please let me know if you have further concerns. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504250) Time Spent: 1.5h (was: 1h 20m) > Move scratchdir cleanup to background > - > > Key: HIVE-24270 > URL: https://issues.apache.org/jira/browse/HIVE-24270 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > In cloud environment, scratchdir cleaning at the end of the query may take > long time. This causes the client to hang for up to a minute even after the > results were streamed back. During this time the client just waits for > cleanup to finish. > Cleanup can take place in the background in HiveServer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
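Moving the scratchdir delete off the query path can be sketched with a single background executor. This is a simplified illustration only (class and method names are hypothetical, and it uses `java.nio.file` where Hive's actual patch works against the Hadoop `FileSystem` API):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

public final class BackgroundCleanup {
  // Single daemon thread: cleanup never blocks the query completion path,
  // and the JVM can still exit if the server shuts down.
  private static final ExecutorService CLEANUP_POOL = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "scratchdir-cleanup");
    t.setDaemon(true);
    return t;
  });

  // Submit a recursive delete and return immediately; the server can stream
  // results back to the client while the scratch dir is removed in background.
  public static Future<?> deleteAsync(Path scratchDir) {
    return CLEANUP_POOL.submit(() -> {
      try (Stream<Path> paths = Files.walk(scratchDir)) {
        paths.sorted(Comparator.reverseOrder()) // delete children before parents
             .forEach(p -> {
               try {
                 Files.deleteIfExists(p);
               } catch (IOException e) {
                 throw new UncheckedIOException(e);
               }
             });
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    });
  }
}
```

The returned `Future` is only there for shutdown hooks or tests; on the query path nothing waits on it, which is exactly what removes the up-to-a-minute hang described in the issue.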
[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504248=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504248 ] ASF GitHub Bot logged work on HIVE-24258: - Author: ASF GitHub Bot Created on: 23/Oct/20 16:52 Start Date: 23/Oct/20 16:52 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1587: URL: https://github.com/apache/hive/pull/1587#discussion_r511014101 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java ## @@ -2846,31 +2846,28 @@ public SQLAllTableConstraints getAllTableConstraints(String catName, String dbNa return sqlAllTableConstraints; } - @Override public List createTableWithConstraints(Table tbl, List primaryKeys, - List foreignKeys, List uniqueConstraints, - List notNullConstraints, List defaultConstraints, - List checkConstraints) throws InvalidObjectException, MetaException { -List constraintNames = rawStore -.createTableWithConstraints(tbl, primaryKeys, foreignKeys, uniqueConstraints, notNullConstraints, -defaultConstraints, checkConstraints); + @Override public SQLAllTableConstraints createTableWithConstraints(Table tbl, SQLAllTableConstraints constraints) throws InvalidObjectException, MetaException { Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504248) Time Spent: 1h 10m (was: 1h) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name etc are case insensitive as per > HIVE contract but standalone metastore cachedstore is case sensitive. As > result of which there is mismatch in rawstore output and cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
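The case-insensitivity contract behind the HIVE-24258 fix can be sketched as follows. This is a simplified stand-in for the metastore's `StringUtils.normalizeIdentifier` (the exact normalization Hive applies may differ): identifiers are normalized once, before cache keys are built, so that `Pk1` and `pk1` resolve to the same cached entry and CachedStore stays consistent with the rawstore.

```java
import java.util.Locale;

public final class IdentifierNorm {
  // Hive identifiers (db, table, column, constraint names) are case-insensitive,
  // so normalize at the edge rather than comparing case-sensitively in the cache.
  public static String normalizeIdentifier(String id) {
    return id == null ? null : id.trim().toLowerCase(Locale.ROOT);
  }

  // Cache keys built from normalized parts make lookups case-insensitive,
  // matching the rawstore's behavior. The delimiter choice is illustrative.
  public static String buildTableKey(String catName, String dbName, String tableName) {
    return String.join("\u0001",
        normalizeIdentifier(catName), normalizeIdentifier(dbName), normalizeIdentifier(tableName));
  }
}
```

Without this normalization, a constraint stored as `Pk1` and looked up as `pk1` produces exactly the rawstore/cachedstore mismatch shown in the issue's expected-vs-actual output.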
[jira] [Updated] (HIVE-24309) Simplify ConvertJoinMapJoin logic
[ https://issues.apache.org/jira/browse/HIVE-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24309: -- Labels: pull-request-available (was: ) > Simplify ConvertJoinMapJoin logic > -- > > Key: HIVE-24309 > URL: https://issues.apache.org/jira/browse/HIVE-24309 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > ConvertMapJoin logic can be further simplified: > [https://github.com/pgaref/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L92] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504229 ] ASF GitHub Bot logged work on HIVE-24258: - Author: ASF GitHub Bot Created on: 23/Oct/20 16:14 Start Date: 23/Oct/20 16:14 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1587: URL: https://github.com/apache/hive/pull/1587#discussion_r510993015 ## File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java ## @@ -1568,12 +1568,7 @@ public void testPrimaryKeys() { List cachedKeys = sharedCache.listCachedPrimaryKeys( DEFAULT_CATALOG_NAME, tbl.getDbName(), tbl.getTableName()); -Assert.assertEquals(cachedKeys.size(), 1); -Assert.assertEquals(cachedKeys.get(0).getPk_name(), "pk1"); -Assert.assertEquals(cachedKeys.get(0).getTable_db(), "db"); -Assert.assertEquals(cachedKeys.get(0).getTable_name(), tbl.getTableName()); -Assert.assertEquals(cachedKeys.get(0).getColumn_name(), "col1"); -Assert.assertEquals(cachedKeys.get(0).getCatName(), DEFAULT_CATALOG_NAME); +Assert.assertEquals(origKeys,cachedKeys); Review comment: Done ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java ## @@ -1499,20 +1499,11 @@ SQLAllTableConstraints getAllTableConstraints(String catName, String dbName, Str /** * Create a table with constraints * @param tbl table definition - * @param primaryKeys primary key definition, or null - * @param foreignKeys foreign key definition, or null - * @param uniqueConstraints unique constraints definition, or null - * @param notNullConstraints not null constraints definition, or null - * @param defaultConstraints default values definition, or null * @return list of constraint names Review comment: Done This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504229) Time Spent: 1h (was: 50m) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name etc are case insensitive as per > HIVE contract but standalone metastore cachedstore is case sensitive. As > result of which there is mismatch in rawstore output and cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24309) Simplify ConvertJoinMapJoin logic
[ https://issues.apache.org/jira/browse/HIVE-24309?focusedWorklogId=504230=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504230 ] ASF GitHub Bot logged work on HIVE-24309: - Author: ASF GitHub Bot Created on: 23/Oct/20 16:14 Start Date: 23/Oct/20 16:14 Worklog Time Spent: 10m Work Description: pgaref opened a new pull request #1606: URL: https://github.com/apache/hive/pull/1606 Change-Id: I89865b6ebc102fa63a99beb94a89771b779cc300 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504230) Remaining Estimate: 0h Time Spent: 10m > Simplify ConvertJoinMapJoin logic > -- > > Key: HIVE-24309 > URL: https://issues.apache.org/jira/browse/HIVE-24309 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > ConvertMapJoin logic can be further simplified: > [https://github.com/pgaref/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L92] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504227=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504227 ] ASF GitHub Bot logged work on HIVE-24258: - Author: ASF GitHub Bot Created on: 23/Oct/20 16:13 Start Date: 23/Oct/20 16:13 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1587: URL: https://github.com/apache/hive/pull/1587#discussion_r510992591 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ## @@ -2785,31 +2696,23 @@ public void add_not_null_constraint(AddNotNullConstraintRequest req) @Override public void add_default_constraint(AddDefaultConstraintRequest req) throws MetaException, InvalidObjectException { - List defaultConstraintCols= req.getDefaultConstraintCols(); - String constraintName = (defaultConstraintCols != null && defaultConstraintCols.size() > 0) ? - defaultConstraintCols.get(0).getDc_name() : "null"; + List defaultConstraints= req.getDefaultConstraintCols(); Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504227) Time Spent: 40m (was: 0.5h) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name etc are case insensitive as per > HIVE contract but standalone metastore cachedstore is case sensitive. As > result of which there is mismatch in rawstore output and cachedstore output. 
> Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504228=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504228 ] ASF GitHub Bot logged work on HIVE-24258: - Author: ASF GitHub Bot Created on: 23/Oct/20 16:13 Start Date: 23/Oct/20 16:13 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1587: URL: https://github.com/apache/hive/pull/1587#discussion_r510992922 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java ## @@ -58,14 +59,11 @@ public static String buildDbKeyWithDelimiterSuffix(String catName, String dbName * */ public static String buildPartitionCacheKey(List partVals) { -if (partVals == null || partVals.isEmpty()) { - return ""; -} -return String.join(delimit, partVals); +return CollectionUtils.isNotEmpty(partVals) ? String.join(delimit, partVals) : ""; } public static String buildTableKey(String catName, String dbName, String tableName) { -return buildKey(catName.toLowerCase(), dbName.toLowerCase(), tableName.toLowerCase()); +return buildKey(StringUtils.normalizeIdentifier(catName),StringUtils.normalizeIdentifier(dbName),StringUtils.normalizeIdentifier(tableName)); Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504228) Time Spent: 50m (was: 40m) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name etc are case insensitive as per > HIVE contract but standalone metastore cachedstore is case sensitive. As > result of which there is mismatch in rawstore output and cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504226=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504226 ] ASF GitHub Bot logged work on HIVE-24258: - Author: ASF GitHub Bot Created on: 23/Oct/20 16:12 Start Date: 23/Oct/20 16:12 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1587: URL: https://github.com/apache/hive/pull/1587#discussion_r510992400 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ## @@ -2255,121 +2257,61 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req) tbl.putToParameters(hive_metastoreConstants.DDL_TIME, Long.toString(time)); } -if (primaryKeys == null && foreignKeys == null -&& uniqueConstraints == null && notNullConstraints == null && defaultConstraints == null -&& checkConstraints == null) { +if (CollectionUtils.isEmpty(constraints.getPrimaryKeys()) && CollectionUtils.isEmpty(constraints.getForeignKeys()) +&& CollectionUtils.isEmpty(constraints.getUniqueConstraints())&& CollectionUtils.isEmpty(constraints.getNotNullConstraints())&& CollectionUtils.isEmpty(constraints.getDefaultConstraints()) +&& CollectionUtils.isEmpty(constraints.getCheckConstraints())) { ms.createTable(tbl); } else { // Check that constraints have catalog name properly set first - if (primaryKeys != null && !primaryKeys.isEmpty() && !primaryKeys.get(0).isSetCatName()) { -for (SQLPrimaryKey pkcol : primaryKeys) pkcol.setCatName(tbl.getCatName()); + if (CollectionUtils.isNotEmpty(constraints.getPrimaryKeys()) && !constraints.getPrimaryKeys().get(0).isSetCatName()) { +for (SQLPrimaryKey pkcol : constraints.getPrimaryKeys()) pkcol.setCatName(tbl.getCatName()); Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504226) Time Spent: 0.5h (was: 20m) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name etc are case insensitive as per > HIVE contract but standalone metastore cachedstore is case sensitive. As > result of which there is mismatch in rawstore output and cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504225=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504225 ] ASF GitHub Bot logged work on HIVE-24258: - Author: ASF GitHub Bot Created on: 23/Oct/20 16:12 Start Date: 23/Oct/20 16:12 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #1587: URL: https://github.com/apache/hive/pull/1587#discussion_r510992214 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStoreUpdateUsingEvents.java ## @@ -419,18 +412,14 @@ public void testConstraintsForUpdateUsingEvents() throws Exception { public void assertRawStoreAndCachedStoreConstraint(String catName, String dbName, String tblName) throws MetaException, NoSuchObjectException { SQLAllTableConstraints rawStoreConstraints = rawStore.getAllTableConstraints(catName, dbName, tblName); -List primaryKeys = sharedCache.listCachedPrimaryKeys(catName, dbName, tblName); -List notNullConstraints = sharedCache.listCachedNotNullConstraints(catName, dbName, tblName); -List uniqueConstraints = sharedCache.listCachedUniqueConstraint(catName, dbName, tblName); -List defaultConstraints = sharedCache.listCachedDefaultConstraint(catName, dbName, tblName); -List checkConstraints = sharedCache.listCachedCheckConstraint(catName, dbName, tblName); -List foreignKeys = sharedCache.listCachedForeignKeys(catName, dbName, tblName, null, null); -Assert.assertEquals(rawStoreConstraints.getPrimaryKeys(), primaryKeys); -Assert.assertEquals(rawStoreConstraints.getNotNullConstraints(), notNullConstraints); -Assert.assertEquals(rawStoreConstraints.getUniqueConstraints(), uniqueConstraints); -Assert.assertEquals(rawStoreConstraints.getDefaultConstraints(), defaultConstraints); -Assert.assertEquals(rawStoreConstraints.getCheckConstraints(), checkConstraints); -Assert.assertEquals(rawStoreConstraints.getForeignKeys(), foreignKeys); +SQLAllTableConstraints cachedStoreConstraints 
= new SQLAllTableConstraints(); + cachedStoreConstraints.setPrimaryKeys(sharedCache.listCachedPrimaryKeys(catName, dbName, tblName)); + cachedStoreConstraints.setForeignKeys(sharedCache.listCachedForeignKeys(catName, dbName, tblName, null, null)); + cachedStoreConstraints.setNotNullConstraints(sharedCache.listCachedNotNullConstraints(catName, dbName, tblName)); + cachedStoreConstraints.setDefaultConstraints(sharedCache.listCachedDefaultConstraint(catName, dbName, tblName)); + cachedStoreConstraints.setCheckConstraints(sharedCache.listCachedCheckConstraint(catName, dbName, tblName)); + cachedStoreConstraints.setUniqueConstraints(sharedCache.listCachedUniqueConstraint(catName, dbName, tblName)); +Assert.assertEquals(rawStoreConstraints,cachedStoreConstraints); Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504225) Time Spent: 20m (was: 10m) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name etc are case insensitive as per > HIVE contract but standalone metastore cachedstore is case sensitive. As > result of which there is mismatch in rawstore output and cachedstore output. 
> Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
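The expected/actual mismatch above comes down to one character of case: pk_name:pk1 versus pk_name:Pk1. Hive treats db, table, and column identifiers as case-insensitive, so a cache that keys or compares them case-sensitively diverges from the raw store. A minimal sketch of the normalization idea, with hypothetical class and method names (this is not the real SharedCache API):

```java
// Illustrative sketch: lower-case identifiers before using them as cache keys,
// mirroring Hive's case-insensitive contract for db/table/column names.
// ConstraintKeyNormalizer and cacheKey are hypothetical names, not Hive code.
import java.util.Locale;

public class ConstraintKeyNormalizer {
    // Build a cache key from catalog, db and table names, lower-cased so that
    // differently-cased spellings resolve to the same cached entry.
    static String cacheKey(String catName, String dbName, String tblName) {
        return (catName + "." + dbName + "." + tblName).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String a = cacheKey("hive", "Test_Table_Ops", "Tbl");
        String b = cacheKey("hive", "test_table_ops", "tbl");
        // Both spellings map to the same key, so cached and raw lookups agree.
        if (!a.equals(b)) {
            throw new AssertionError("keys should match: " + a + " vs " + b);
        }
        System.out.println(a); // hive.test_table_ops.tbl
    }
}
```

The same normalization would apply to constraint names such as pk_name, which is exactly the field that differs in the failure above.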
[jira] [Work logged] (HIVE-24308) FIX conditions used for DPHJ conversion
[ https://issues.apache.org/jira/browse/HIVE-24308?focusedWorklogId=504213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504213 ] ASF GitHub Bot logged work on HIVE-24308: - Author: ASF GitHub Bot Created on: 23/Oct/20 15:18 Start Date: 23/Oct/20 15:18 Worklog Time Spent: 10m Work Description: pgaref opened a new pull request #1604: URL: https://github.com/apache/hive/pull/1604 FIX conditions used for DPHJ conversion ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504213) Time Spent: 0.5h (was: 20m) > FIX conditions used for DPHJ conversion > - > > Key: HIVE-24308 > URL: https://issues.apache.org/jira/browse/HIVE-24308 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Found a weird scenario when looking at the ConvertJoinMapJoin logic: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] > When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is > lower than expected the code returns a MJ because of the condition above! > In general, I believe the ShuffleSize check: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] > should be part of the shuffleJoin DPHJ conversion. > And the preferred conversion would be: MJ > DPHJ > SMB -- This message was sent by Atlassian Jira (v8.3.4#803005)
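The preference order described in the issue (MJ > DPHJ > SMB, with the shuffle-size check folded into the DPHJ conversion) can be sketched as a small decision function. The names and thresholds below are illustrative stand-ins for Hive's real estimates in ConvertJoinMapJoin, not its actual code:

```java
// Illustrative sketch of the join-conversion preference MJ > DPHJ > SMB.
// The memory and shuffle-size checks stand in for Hive's real size estimates;
// JoinConversionSketch and chooseJoin are hypothetical names.
public class JoinConversionSketch {
    enum JoinType { MAP_JOIN, DYNAMIC_PARTITION_HASH_JOIN, SORT_MERGE_BUCKET_JOIN }

    static JoinType chooseJoin(long smallTableSize, long memoryLimit,
                               long shuffleSize, long shuffleLimit) {
        if (smallTableSize <= memoryLimit) {
            return JoinType.MAP_JOIN;                     // small side fits in memory
        }
        if (shuffleSize <= shuffleLimit) {
            return JoinType.DYNAMIC_PARTITION_HASH_JOIN;  // shuffle check lives here
        }
        return JoinType.SORT_MERGE_BUCKET_JOIN;           // last resort
    }

    public static void main(String[] args) {
        System.out.println(chooseJoin(10, 100, 50, 100));   // MAP_JOIN
        System.out.println(chooseJoin(200, 100, 50, 100));  // DYNAMIC_PARTITION_HASH_JOIN
        System.out.println(chooseJoin(200, 100, 500, 100)); // SORT_MERGE_BUCKET_JOIN
    }
}
```

The bug report's "weird scenario" is the case where the first check fails but a MJ is still returned; in this sketch that path cannot happen because each branch is tried strictly in preference order.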
[jira] [Work logged] (HIVE-24308) FIX conditions used for DPHJ conversion
[ https://issues.apache.org/jira/browse/HIVE-24308?focusedWorklogId=504214=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504214 ] ASF GitHub Bot logged work on HIVE-24308: - Author: ASF GitHub Bot Created on: 23/Oct/20 15:18 Start Date: 23/Oct/20 15:18 Worklog Time Spent: 10m Work Description: pgaref closed pull request #1604: URL: https://github.com/apache/hive/pull/1604 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504214) Time Spent: 40m (was: 0.5h) > FIX conditions used for DPHJ conversion > - > > Key: HIVE-24308 > URL: https://issues.apache.org/jira/browse/HIVE-24308 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Found a weird scenario when looking at the ConvertJoinMapJoin logic: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] > When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is > lower than expected the code returns a MJ because of the condition above! > In general, I believe the ShuffleSize check: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] > should be part of the shuffleJoin DPHJ conversion. > And the preferred conversion would be: MJ > DPHJ > SMB -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23935) Fetching primaryKey through beeline fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23935?focusedWorklogId=504211=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504211 ] ASF GitHub Bot logged work on HIVE-23935: - Author: ASF GitHub Bot Created on: 23/Oct/20 15:05 Start Date: 23/Oct/20 15:05 Worklog Time Spent: 10m Work Description: ayushtkn opened a new pull request #1605: URL: https://github.com/apache/hive/pull/1605 https://issues.apache.org/jira/browse/HIVE-23935 Entire Trace - 0: jdbc:hive2://localhost:1> !primarykeys Persons Error: MetaException(message:java.lang.NullPointerException) (state=,code=0) org.apache.hive.service.cli.HiveSQLException: MetaException(message:java.lang.NullPointerException) at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:360) at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:351) at org.apache.hive.jdbc.HiveDatabaseMetaData.getPrimaryKeys(HiveDatabaseMetaData.java:573) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hive.beeline.Reflector.invoke(Reflector.java:89) at org.apache.hive.beeline.Commands.metadata(Commands.java:125) at org.apache.hive.beeline.Commands.primarykeys(Commands.java:231) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:57) at org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1465) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1504) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1364) at 
org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1134) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1082) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) Caused by: org.apache.hive.service.cli.HiveSQLException: MetaException(message:java.lang.NullPointerException) at org.apache.hive.service.cli.operation.GetPrimaryKeysOperation.runInternal(GetPrimaryKeysOperation.java:120) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:277) at org.apache.hive.service.cli.session.HiveSessionImpl.getPrimaryKeys(HiveSessionImpl.java:997) at org.apache.hive.service.cli.CLIService.getPrimaryKeys(CLIService.java:416) at org.apache.hive.service.cli.thrift.ThriftCLIService.GetPrimaryKeys(ThriftCLIService.java:838) at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetPrimaryKeys.getResult(TCLIService.java:1717) at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetPrimaryKeys.getResult(TCLIService.java:1702) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: 
MetaException(message:java.lang.NullPointerException) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:7921) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.throwMetaException(HiveMetaStore.java:9105) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_primary_keys(HiveMetaStore.java:9067) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
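The trace bottoms out in HMSHandler.get_primary_keys, and the Jira summary says the failure occurs when no db_name is supplied: beeline's !primarykeys sends a null schema which is then dereferenced. One plausible guard is to default a null or empty database name before the lookup; the sketch below uses hypothetical names, not Hive's actual ObjectStore code:

```java
// Illustrative sketch: default a null/empty database name to the session's
// current database before the metadata lookup, avoiding the NPE above.
// PrimaryKeyLookupSketch and normalizeDbName are hypothetical names.
public class PrimaryKeyLookupSketch {
    static String normalizeDbName(String dbName, String currentDb) {
        // beeline's !primarykeys can send a null schema; fall back to current db
        return (dbName == null || dbName.isEmpty()) ? currentDb : dbName;
    }

    public static void main(String[] args) {
        System.out.println(normalizeDbName(null, "default"));    // default
        System.out.println(normalizeDbName("sales", "default")); // sales
    }
}
```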
[jira] [Resolved] (HIVE-24113) NPE in GenericUDFToUnixTimeStamp
[ https://issues.apache.org/jira/browse/HIVE-24113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Pintér resolved HIVE-24113. -- Resolution: Fixed > NPE in GenericUDFToUnixTimeStamp > > > Key: HIVE-24113 > URL: https://issues.apache.org/jira/browse/HIVE-24113 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.2 >Reporter: Rajkumar Singh >Assignee: Rajkumar Singh >Priority: Major > > Following query will trigger the getPartitionsByExpr call at HMS, HMS will > try to evaluate the filter based on the PartitionExpressionForMetastore > proxy, this proxy uses the QL packages to evaluate the filter and call > GenericUDFToUnixTimeStamp. > select * from table_name where hour between > from_unixtime(unix_timestamp('2020090120', 'MMddHH') - 1*60*60, > 'MMddHH') and from_unixtime(unix_timestamp('2020090122', 'MMddHH') + > 2*60*60, 'MMddHH'); > I think SessionState in the code path will always be NULL thats why it hit > the NPE. > {code:java} > java.lang.NullPointerException: null > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initializeInput(GenericUDFToUnixTimeStamp.java:126) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initialize(GenericUDFToUnixTimeStamp.java:75) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:148) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:146) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > 
~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.prepareExpr(PartExprEvalUtils.java:119) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prunePartitionNames(PartitionPruner.java:551) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.filterPartitionsByExpr(PartitionExpressionForMetastore.java:82) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesPrunedByExprNoTxn(ObjectStore.java:3527) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.access$1400(ObjectStore.java:252) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3493) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3464) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3764) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3499) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:3452) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) > ~[?:1.8.0_112] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_112] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_112] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at com.sun.proxy.$Proxy28.getPartitionsByExpr(Unknown Source) [?:?] > at >
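The report's key observation is that SessionState is always null on this code path: the UDF runs inside the metastore (via PartitionExpressionForMetastore), where no session is attached to the thread, so an unguarded get() dereference throws NPE. A minimal sketch of the failure mode and the null-safe alternative, using a plain ThreadLocal as a hypothetical stand-in for SessionState (not Hive's actual API):

```java
// Illustrative sketch of the NPE pattern above: session-scoped state fetched
// from a thread-local is null when the code runs inside the metastore, so any
// unguarded dereference blows up. SessionStateSketch is a hypothetical name.
public class SessionStateSketch {
    static final ThreadLocal<String> SESSION_TZ = new ThreadLocal<>();

    // Unsafe: NPE when no session is attached to this thread (the HMS case).
    static int timeZoneLengthUnsafe() {
        return SESSION_TZ.get().length();
    }

    // Safe: check for an absent session and fall back to a default.
    static String timeZoneSafe() {
        String tz = SESSION_TZ.get();
        return tz != null ? tz : "UTC";
    }

    public static void main(String[] args) {
        System.out.println(timeZoneSafe()); // UTC - no session on this thread
        SESSION_TZ.set("GMT+5");
        System.out.println(timeZoneSafe()); // GMT+5
    }
}
```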
[jira] [Assigned] (HIVE-24309) Simplify ConvertJoinMapJoin logic
[ https://issues.apache.org/jira/browse/HIVE-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-24309: - > Simplify ConvertJoinMapJoin logic > -- > > Key: HIVE-24309 > URL: https://issues.apache.org/jira/browse/HIVE-24309 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > ConvertMapJoin logic can be further simplified: > [https://github.com/pgaref/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L92] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24308) FIX conditions used for DPHJ conversion
[ https://issues.apache.org/jira/browse/HIVE-24308?focusedWorklogId=504139=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504139 ] ASF GitHub Bot logged work on HIVE-24308: - Author: ASF GitHub Bot Created on: 23/Oct/20 12:49 Start Date: 23/Oct/20 12:49 Worklog Time Spent: 10m Work Description: pgaref commented on pull request #1604: URL: https://github.com/apache/hive/pull/1604#issuecomment-715320363 @rbalamohan @jesus Can you please take a look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504139) Time Spent: 20m (was: 10m) > FIX conditions used for DPHJ conversion > - > > Key: HIVE-24308 > URL: https://issues.apache.org/jira/browse/HIVE-24308 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Found a weird scenario when looking at the ConvertJoinMapJoin logic: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] > When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is > lower than expected the code returns a MJ because of the condition above! > In general, I believe the ShuffleSize check: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] > should be part of the shuffleJoin DPHJ conversion. > And the preferred conversion would be: MJ > DPHJ > SMB -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24308) FIX conditions used for DPHJ conversion
[ https://issues.apache.org/jira/browse/HIVE-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24308: -- Labels: pull-request-available (was: ) > FIX conditions used for DPHJ conversion > - > > Key: HIVE-24308 > URL: https://issues.apache.org/jira/browse/HIVE-24308 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Found a weird scenario when looking at the ConvertJoinMapJoin logic: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] > When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is > lower than expected the code returns a MJ because of the condition above! > In general, I believe the ShuffleSize check: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] > should be part of the shuffleJoin DPHJ conversion. > And the preferred conversion would be: MJ > DPHJ > SMB -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24308) FIX conditions used for DPHJ conversion
[ https://issues.apache.org/jira/browse/HIVE-24308?focusedWorklogId=504138=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504138 ] ASF GitHub Bot logged work on HIVE-24308: - Author: ASF GitHub Bot Created on: 23/Oct/20 12:44 Start Date: 23/Oct/20 12:44 Worklog Time Spent: 10m Work Description: pgaref opened a new pull request #1604: URL: https://github.com/apache/hive/pull/1604 Change-Id: Iaa1d4a5c857b6c494aa220c6c96d7659a2a68aa4 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504138) Remaining Estimate: 0h Time Spent: 10m > FIX conditions used for DPHJ conversion > - > > Key: HIVE-24308 > URL: https://issues.apache.org/jira/browse/HIVE-24308 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Found a weird scenario when looking at the ConvertJoinMapJoin logic: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] > When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is > lower than expected the code returns a MJ because of the condition above! > In general, I believe the ShuffleSize check: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] > should be part of the shuffleJoin DPHJ conversion. > And the preferred conversion would be: MJ > DPHJ > SMB -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24308) FIX conditions used for DPHJ conversion
[ https://issues.apache.org/jira/browse/HIVE-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-24308: -- Description: Found a weird scenario when looking at the ConvertJoinMapJoin logic: [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is lower than expected the code returns a MJ because of the condition above! In general, I believe the ShuffleSize check: [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] should be part of the shuffleJoin DPHJ conversion. And the preferred conversion would be: MJ > DPHJ > SMB was: Found a weird scenario when looking at the ConvertJoinMapJoin logic: [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is lower than expected the code returns a MJ because of the condition above! In general, I believe the ShuffleSize check: [https://github.com/apache/hive/blob/052c9da958f5cf3998091a7eb4b24192a5bb61e9/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] should be part of the shuffleJoin DPHJ conversion. 
And the preferred conversion would be: MJ > DPHJ > SMB > FIX conditions used for DPHJ conversion > - > > Key: HIVE-24308 > URL: https://issues.apache.org/jira/browse/HIVE-24308 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > Found a weird scenario when looking at the ConvertJoinMapJoin logic: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] > When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is > lower than expected the code returns a MJ because of the condition above! > In general, I believe the ShuffleSize check: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] > should be part of the shuffleJoin DPHJ conversion. > And the preferred conversion would be: MJ > DPHJ > SMB -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24308) FIX conditions used for DPHJ conversion
[ https://issues.apache.org/jira/browse/HIVE-24308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-24308: - > FIX conditions used for DPHJ conversion > - > > Key: HIVE-24308 > URL: https://issues.apache.org/jira/browse/HIVE-24308 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > Found a weird scenario when looking at the ConvertJoinMapJoin logic: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1198] > When the distinct keys cannot fit in memory AND the DPHJ ShuffleSize is > lower than expected the code returns a MJ because of the condition above! > In general, I believe the ShuffleSize check: > [https://github.com/apache/hive/blob/052c9da958f5cf3998091a7eb4b24192a5bb61e9/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L1624] > should be part of the shuffleJoin DPHJ conversion. > And the preferred conversion would be: MJ > DPHJ > SMB -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24307) Beeline with property-file and -e parameter is failing
[ https://issues.apache.org/jira/browse/HIVE-24307?focusedWorklogId=504128=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504128 ] ASF GitHub Bot logged work on HIVE-24307: - Author: ASF GitHub Bot Created on: 23/Oct/20 11:46 Start Date: 23/Oct/20 11:46 Worklog Time Spent: 10m Work Description: ayushtkn opened a new pull request #1603: URL: https://github.com/apache/hive/pull/1603 https://issues.apache.org/jira/browse/HIVE-24307 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504128) Remaining Estimate: 0h Time Spent: 10m > Beeline with property-file and -e parameter is failing > -- > > Key: HIVE-24307 > URL: https://issues.apache.org/jira/browse/HIVE-24307 > Project: Hive > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Beeline query with property file specified with -e parameter fails with : > {noformat} > Cannot run commands specified using -e. No current connection > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24307) Beeline with property-file and -e parameter is failing
[ https://issues.apache.org/jira/browse/HIVE-24307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24307: -- Labels: pull-request-available (was: ) > Beeline with property-file and -e parameter is failing > -- > > Key: HIVE-24307 > URL: https://issues.apache.org/jira/browse/HIVE-24307 > Project: Hive > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Beeline query with property file specified with -e parameter fails with : > {noformat} > Cannot run commands specified using -e. No current connection > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Moved] (HIVE-24307) Beeline with property-file and -e parameter is failing
[ https://issues.apache.org/jira/browse/HIVE-24307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena moved HDFS-15647 to HIVE-24307: Key: HIVE-24307 (was: HDFS-15647) Project: Hive (was: Hadoop HDFS) > Beeline with property-file and -e parameter is failing > -- > > Key: HIVE-24307 > URL: https://issues.apache.org/jira/browse/HIVE-24307 > Project: Hive > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > > Beeline query with property file specified with -e parameter fails with : > {noformat} > Cannot run commands specified using -e. No current connection > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24258: -- Labels: pull-request-available (was: ) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name etc are case insensitive as per > HIVE contract but standalone metastore cachedstore is case sensitive. As > result of which there is mismatch in rawstore output and cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504124 ] ASF GitHub Bot logged work on HIVE-24258: - Author: ASF GitHub Bot Created on: 23/Oct/20 11:28 Start Date: 23/Oct/20 11:28 Worklog Time Spent: 10m Work Description: sankarh commented on a change in pull request #1587: URL: https://github.com/apache/hive/pull/1587#discussion_r510806059 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStoreUpdateUsingEvents.java ## @@ -419,18 +412,14 @@ public void testConstraintsForUpdateUsingEvents() throws Exception {
 public void assertRawStoreAndCachedStoreConstraint(String catName, String dbName, String tblName)
     throws MetaException, NoSuchObjectException {
   SQLAllTableConstraints rawStoreConstraints = rawStore.getAllTableConstraints(catName, dbName, tblName);
-  List<SQLPrimaryKey> primaryKeys = sharedCache.listCachedPrimaryKeys(catName, dbName, tblName);
-  List<SQLNotNullConstraint> notNullConstraints = sharedCache.listCachedNotNullConstraints(catName, dbName, tblName);
-  List<SQLUniqueConstraint> uniqueConstraints = sharedCache.listCachedUniqueConstraint(catName, dbName, tblName);
-  List<SQLDefaultConstraint> defaultConstraints = sharedCache.listCachedDefaultConstraint(catName, dbName, tblName);
-  List<SQLCheckConstraint> checkConstraints = sharedCache.listCachedCheckConstraint(catName, dbName, tblName);
-  List<SQLForeignKey> foreignKeys = sharedCache.listCachedForeignKeys(catName, dbName, tblName, null, null);
-  Assert.assertEquals(rawStoreConstraints.getPrimaryKeys(), primaryKeys);
-  Assert.assertEquals(rawStoreConstraints.getNotNullConstraints(), notNullConstraints);
-  Assert.assertEquals(rawStoreConstraints.getUniqueConstraints(), uniqueConstraints);
-  Assert.assertEquals(rawStoreConstraints.getDefaultConstraints(), defaultConstraints);
-  Assert.assertEquals(rawStoreConstraints.getCheckConstraints(), checkConstraints);
-  Assert.assertEquals(rawStoreConstraints.getForeignKeys(), foreignKeys);
+  SQLAllTableConstraints cachedStoreConstraints = new SQLAllTableConstraints();
+  cachedStoreConstraints.setPrimaryKeys(sharedCache.listCachedPrimaryKeys(catName, dbName, tblName));
+  cachedStoreConstraints.setForeignKeys(sharedCache.listCachedForeignKeys(catName, dbName, tblName, null, null));
+  cachedStoreConstraints.setNotNullConstraints(sharedCache.listCachedNotNullConstraints(catName, dbName, tblName));
+  cachedStoreConstraints.setDefaultConstraints(sharedCache.listCachedDefaultConstraint(catName, dbName, tblName));
+  cachedStoreConstraints.setCheckConstraints(sharedCache.listCachedCheckConstraint(catName, dbName, tblName));
+  cachedStoreConstraints.setUniqueConstraints(sharedCache.listCachedUniqueConstraint(catName, dbName, tblName));
+  Assert.assertEquals(rawStoreConstraints,cachedStoreConstraints);
Review comment: nit: Add space after , ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ## @@ -2255,121 +2257,61 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req)
   tbl.putToParameters(hive_metastoreConstants.DDL_TIME, Long.toString(time));
 }
-if (primaryKeys == null && foreignKeys == null
-    && uniqueConstraints == null && notNullConstraints == null && defaultConstraints == null
-    && checkConstraints == null) {
+if (CollectionUtils.isEmpty(constraints.getPrimaryKeys()) && CollectionUtils.isEmpty(constraints.getForeignKeys())
+    && CollectionUtils.isEmpty(constraints.getUniqueConstraints()) && CollectionUtils.isEmpty(constraints.getNotNullConstraints()) && CollectionUtils.isEmpty(constraints.getDefaultConstraints())
+    && CollectionUtils.isEmpty(constraints.getCheckConstraints())) {
   ms.createTable(tbl);
 } else {
   // Check that constraints have catalog name properly set first
-  if (primaryKeys != null && !primaryKeys.isEmpty() && !primaryKeys.get(0).isSetCatName()) {
-    for (SQLPrimaryKey pkcol : primaryKeys) pkcol.setCatName(tbl.getCatName());
+  if (CollectionUtils.isNotEmpty(constraints.getPrimaryKeys()) && !constraints.getPrimaryKeys().get(0).isSetCatName()) {
+    for (SQLPrimaryKey pkcol : constraints.getPrimaryKeys()) pkcol.setCatName(tbl.getCatName());
Review comment: nit: Use for () { ... } even for a single statement. Or use this instead: constraints.getPrimaryKeys().forEach(pk -> pk.setCatName(tbl.getCatName())); ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ## @@ -2785,31 +2696,23 @@ public void add_not_null_constraint(AddNotNullConstraintRequest req) @Override
[jira] [Updated] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-24258: Description: Description Objects like table name, db name, column name etc are case insensitive as per HIVE contract but standalone metastore cachedstore is case sensitive. As result of which there is mismatch in rawstore output and cachedstore output. Example - expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, validate_cstr:false, rely_cstr:false, catName:hive)]> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, validate_cstr:false, rely_cstr:false, catName:hive)]> was: Description Objects like table name, db name, column name etc are case insensitive as per HIVE contract but standalone metastore cachedstore is case sensitive. As result of which there is miss match in rawstore output and cachedstore output. Example - expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, validate_cstr:false, rely_cstr:false, catName:hive)]> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, validate_cstr:false, rely_cstr:false, catName:hive)]> > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > > Description > Objects like table name, db name, column name etc are case insensitive as per > HIVE contract but standalone metastore cachedstore is case sensitive. As > result of which there is mismatch in rawstore output and cachedstore output. 
> Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
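The mismatch above comes down to the cache keying on identifiers exactly as spelled, while Hive treats them as case insensitive. A minimal, hypothetical sketch of the fix idea (this is not Hive's actual CachedStore code; the class and method names are invented for illustration) is to normalize identifier case once at the cache boundary:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Hive's CachedStore: identifiers are lower-cased
// once at the cache boundary so lookups succeed regardless of caller spelling.
public class CaseInsensitiveCacheSketch {
    private final Map<String, String> cache = new HashMap<>();

    // Hive identifiers are case insensitive per contract, so normalize them
    // before using them as cache keys (Locale.ROOT avoids locale surprises).
    private static String normalize(String dbName, String tblName) {
        return dbName.toLowerCase(Locale.ROOT) + "." + tblName.toLowerCase(Locale.ROOT);
    }

    public void put(String dbName, String tblName, String pkName) {
        cache.put(normalize(dbName, tblName), pkName);
    }

    public String get(String dbName, String tblName) {
        return cache.get(normalize(dbName, tblName));
    }

    public static void main(String[] args) {
        CaseInsensitiveCacheSketch c = new CaseInsensitiveCacheSketch();
        c.put("Test_Table_Ops", "Tbl", "pk1");
        // A lookup with different case still hits the same entry.
        System.out.println(c.get("test_table_ops", "TBL")); // prints pk1
    }
}
```

Without such normalization at the write path, cached values can also carry the original case (pk_name:Pk1 vs pk_name:pk1), which is exactly the assertion failure quoted above.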
[jira] [Updated] (HIVE-24258) [CachedStore] Data mismatch between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-24258: Description: Description Objects like table name, db name, column name etc. are case insensitive per the HIVE contract, but the standalone metastore cachedstore is case sensitive. As a result, there is a mismatch between rawstore output and cachedstore output. Example - expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, validate_cstr:false, rely_cstr:false, catName:hive)]> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, validate_cstr:false, rely_cstr:false, catName:hive)]> was: Description Objects like table name, db name, column name etc are case incentives as per HIVE contract but standalone metastore cachedstore is case sensitive. As result of which there is miss match in rawstore output and cachedstore output. Example - expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, validate_cstr:false, rely_cstr:false, catName:hive)]> but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, validate_cstr:false, rely_cstr:false, catName:hive)]> > [CachedStore] Data mismatch between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > > Description > Objects like table name, db name, column name etc. are case insensitive per the HIVE contract, but the standalone metastore cachedstore is case sensitive. As a result, there is a mismatch between rawstore output and cachedstore output. 
> Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24258) [CachedStore] Data mismatch between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan reassigned HIVE-24258: --- Assignee: Sankar Hariappan (was: Ashish Sharma) > [CachedStore] Data mismatch between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Sankar Hariappan >Priority: Major > > Description > Objects like table name, db name, column name etc. are case insensitive per the HIVE contract, but the standalone metastore cachedstore is case sensitive. As a result, there is a mismatch between rawstore output and cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite
[ https://issues.apache.org/jira/browse/HIVE-24165?focusedWorklogId=504048=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504048 ] ASF GitHub Bot logged work on HIVE-24165: - Author: ASF GitHub Bot Created on: 23/Oct/20 06:42 Start Date: 23/Oct/20 06:42 Worklog Time Spent: 10m Work Description: loudongfeng closed pull request #1597: URL: https://github.com/apache/hive/pull/1597 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504048) Time Spent: 20m (was: 10m) > CBO: Query fails after multiple count distinct rewrite > --- > > Key: HIVE-24165 > URL: https://issues.apache.org/jira/browse/HIVE-24165 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24165.patch > > Time Spent: 20m > Remaining Estimate: 0h > > One way to reproduce: > > {code:sql} > CREATE TABLE test( > `device_id` string, > `level` string, > `site_id` string, > `user_id` string, > `first_date` string, > `last_date` string, > `dt` string) ; > set hive.execution.engine=tez; > set hive.optimize.distinct.rewrite=true; > set hive.cli.print.header=true; > select > dt, > site_id, > count(DISTINCT t1.device_id) as device_tol_cnt, > count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else > null end) as device_add_cnt > from test t1 where dt='2020-09-15' > group by > dt, > site_id > ; > {code} > > Error log: > {code:java} > Exception in thread "main" java.lang.AssertionError: Cannot add expression of > different type to set: > set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE > "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > COLLATE 
"ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL > expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT > $f3_0) NOT NULL > set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, > 3},agg#0=count($0),agg#1=count($1)) > expression is HiveProject#95 > at > org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411) > at > org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234) > at > org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186) > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317) > at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) > at > org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280) > at > org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74) > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) > at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609) > at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052) > at 
org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) > at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450) > at >
[jira] [Commented] (HIVE-18537) [Calcite-CBO] Queries with a nested distinct clause and a windowing function seem to fail with calcite Assertion error
[ https://issues.apache.org/jira/browse/HIVE-18537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219498#comment-17219498 ] Nemon Lou commented on HIVE-18537: -- This issue got fixed after upgrading Calcite to 1.17.0 or higher. The issue can no longer be reproduced on the master branch. > [Calcite-CBO] Queries with a nested distinct clause and a windowing function > seem to fail with calcite Assertion error > -- > > Key: HIVE-18537 > URL: https://issues.apache.org/jira/browse/HIVE-18537 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0, 2.3.2, 3.1.2 >Reporter: Amruth Sampath >Priority: Critical > > Sample test case to reproduce the issue. The issue does not occur if > *hive.cbo.enable=false* > {code:java} > create table test_cbo ( > `a` BIGINT, > `b` STRING, > `c` TIMESTAMP, > `d` STRING > ); > SELECT 1 > FROM > (SELECT > DISTINCT > a AS a_, > b AS b_, > rank() over (partition BY a ORDER BY c DESC) AS c_, > d AS d_ > FROM test_cbo > WHERE b = 'some_filter' ) n > WHERE c_ = 1; > {code} > Fails with, > {code:java} > Exception in thread "main" java.lang.AssertionError: Internal error: Cannot > add expression of different type to set: > set type is RecordType(BIGINT a_, INTEGER c_, VARCHAR(2147483647) CHARACTER > SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" d_) NOT NULL > expression type is RecordType(BIGINT a_, VARCHAR(2147483647) CHARACTER SET > "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" c_, INTEGER d_) NOT NULL > set is rel#112:HiveAggregate.HIVE.[](input=HepRelVertex#121,group={0, 2, 3}) > expression is HiveProject#123{code} > This might be related to https://issues.apache.org/jira/browse/CALCITE-1868. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
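The AssertionError quoted in these reports comes from a planner sanity check: after a rule rewrites a relational expression, the new row type must match the original field for field, or the rewrite is rejected. A simplified, hypothetical sketch of that check (this is not Calcite's actual RelOptUtil.verifyTypeEquivalence, which compares RelDataType instances rather than strings; names here are invented for illustration):

```java
import java.util.Arrays;
import java.util.List;

// Simplified illustration of Calcite-style type-equivalence verification:
// a rewritten expression whose field types differ from the original row
// type is rejected with the "Cannot add expression of different type to
// set" error seen in the stack traces above.
public class TypeEquivalenceSketch {
    static void verifyTypeEquivalence(List<String> setType, List<String> exprType) {
        if (!setType.equals(exprType)) {
            throw new AssertionError("Cannot add expression of different type to set: "
                    + "set type is " + setType + ", expression type is " + exprType);
        }
    }

    public static void main(String[] args) {
        // Field types mirroring the HIVE-18537 log: the rewrite swapped the
        // types of columns c_ and d_, so the check fails.
        List<String> setType = Arrays.asList("BIGINT a_", "INTEGER c_", "VARCHAR d_");
        List<String> exprType = Arrays.asList("BIGINT a_", "VARCHAR c_", "INTEGER d_");
        try {
            verifyTypeEquivalence(setType, exprType);
        } catch (AssertionError e) {
            System.out.println("rewrite rejected: " + e.getMessage());
        }
    }
}
```

The real check guards every HepPlanner rule application, which is why a bad column permutation introduced by one rule (here, AggregateProjectPullUpConstantsRule after the count-distinct rewrite) aborts the whole compilation.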
[jira] [Resolved] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite
[ https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou resolved HIVE-24165. -- Resolution: Invalid > CBO: Query fails after multiple count distinct rewrite > --- > > Key: HIVE-24165 > URL: https://issues.apache.org/jira/browse/HIVE-24165 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24165.patch > > Time Spent: 10m > Remaining Estimate: 0h > > One way to reproduce: > > {code:sql} > CREATE TABLE test( > `device_id` string, > `level` string, > `site_id` string, > `user_id` string, > `first_date` string, > `last_date` string, > `dt` string) ; > set hive.execution.engine=tez; > set hive.optimize.distinct.rewrite=true; > set hive.cli.print.header=true; > select > dt, > site_id, > count(DISTINCT t1.device_id) as device_tol_cnt, > count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else > null end) as device_add_cnt > from test t1 where dt='2020-09-15' > group by > dt, > site_id > ; > {code} > > Error log: > {code:java} > Exception in thread "main" java.lang.AssertionError: Cannot add expression of > different type to set: > set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE > "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL > expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT > $f3_0) NOT NULL > set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, > 3},agg#0=count($0),agg#1=count($1)) > expression is HiveProject#95 > at > org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411) > at > org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57) > at > 
org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234) > at > org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186) > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317) > at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) > at > org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280) > at > org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74) > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) > at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609) > at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052) > at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) > at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164) > at > 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) > at >
[jira] [Commented] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite
[ https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219496#comment-17219496 ] Nemon Lou commented on HIVE-24165: -- Not able to reproduce on the master branch. After upgrading Calcite from 1.16.0 to 1.17.0, this bug is also gone on branch-3 with the multiple count distinct rewrite. It may have been fixed by CALCITE-2232. > CBO: Query fails after multiple count distinct rewrite > --- > > Key: HIVE-24165 > URL: https://issues.apache.org/jira/browse/HIVE-24165 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24165.patch > > Time Spent: 10m > Remaining Estimate: 0h > > One way to reproduce: > > {code:sql} > CREATE TABLE test( > `device_id` string, > `level` string, > `site_id` string, > `user_id` string, > `first_date` string, > `last_date` string, > `dt` string) ; > set hive.execution.engine=tez; > set hive.optimize.distinct.rewrite=true; > set hive.cli.print.header=true; > select > dt, > site_id, > count(DISTINCT t1.device_id) as device_tol_cnt, > count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else > null end) as device_add_cnt > from test t1 where dt='2020-09-15' > group by > dt, > site_id > ; > {code} > > Error log: > {code:java} > Exception in thread "main" java.lang.AssertionError: Cannot add expression of > different type to set: > set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE > "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL > expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" > COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT > $f3_0) NOT NULL > set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, > 3},agg#0=count($0),agg#1=count($1)) > expression is HiveProject#95 > 
at > org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411) > at > org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234) > at > org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186) > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317) > at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) > at > org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280) > at > org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74) > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) > at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609) > at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052) > at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) > at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430) > at > 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773) > at