[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive
[ https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970231#comment-15970231 ]

Marcelo Vanzin commented on HIVE-15302:
---------------------------------------

Livy doesn't figure out what spark.yarn.archive or spark.yarn.jars should be. It assumes the user has a valid configuration.

If you're going to manage the list of jars for the user, the best way is to use maven, as I said. Have a module that is "Hive's packaging of Spark" and have it create a zip with all the needed jars or something, and use that, instead of manually figuring out lists of jars.

> Relax the requirement that HoS needs Spark built w/o Hive
> ---------------------------------------------------------
>
> Key: HIVE-15302
> URL: https://issues.apache.org/jira/browse/HIVE-15302
> Project: Hive
> Issue Type: Improvement
> Reporter: Rui Li
> Assignee: Rui Li
>
> This requirement becomes more and more unacceptable as SparkSQL becomes
> widely adopted. Let's use this JIRA to find out how we can relax the
> limitation.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-16287) Alter table partition rename with location - moves partition back to hive warehouse
[ https://issues.apache.org/jira/browse/HIVE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970230#comment-15970230 ]

Rui Li commented on HIVE-16287:
-------------------------------

[~vihangk1], since this issue exists in 1.x, could you provide a patch for branch-1 too? Thanks.

> Alter table partition rename with location - moves partition back to hive warehouse
> -----------------------------------------------------------------------------------
>
> Key: HIVE-16287
> URL: https://issues.apache.org/jira/browse/HIVE-16287
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 2.1.0
> Environment: RHEL 6.8
> Reporter: Ying Chen
> Assignee: Vihang Karajgaonkar
> Priority: Minor
> Attachments: HIVE-16287.01.patch, HIVE-16287.02.patch, HIVE-16287.03.patch, HIVE-16287.04.patch
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> I was renaming a partition in a table that I created using the location
> clause, and noticed that after the rename is completed, my partition is
> moved to the hive warehouse (hive.metastore.warehouse.dir).
> {quote}
> create table test_local_part (col1 int) partitioned by (col2 int) location '/tmp/testtable/test_local_part';
> insert into test_local_part partition (col2=1) values (1),(3);
> insert into test_local_part partition (col2=2) values (3);
> alter table test_local_part partition (col2='1') rename to partition (col2='4');
> {quote}
> Running:
>    describe formatted test_local_part partition (col2='2')
> # Detailed Partition Information
> Partition Value: [2]
> Database: default
> Table: test_local_part
> CreateTime: Mon Mar 20 13:25:28 PDT 2017
> LastAccessTime: UNKNOWN
> Protect Mode: None
> Location: *hdfs://my.server.com:8020/tmp/testtable/test_local_part/col2=2*
> Running:
>    describe formatted test_local_part partition (col2='4')
> # Detailed Partition Information
> Partition Value: [4]
> Database: default
> Table: test_local_part
> CreateTime: Mon Mar 20 13:24:53 PDT 2017
> LastAccessTime: UNKNOWN
> Protect Mode: None
> Location: *hdfs://my.server.com:8020/apps/hive/warehouse/test_local_part/col2=4*
> ---
> Per Sergio's comment - "The rename should create the new partition name in
> the same location of the table."
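To make the expected behaviour concrete, here is a minimal sketch of the rule Sergio describes: the renamed partition's directory is derived from the table's own location rather than from hive.metastore.warehouse.dir. The class and method names are hypothetical and this is not the metastore code or any of the attached patches; it only shows the path that the second describe above would be expected to report.

```java
class PartitionLocationSketch {

    /**
     * Builds the directory for a renamed partition under the table's own location.
     * tableLocation and partitionName are illustrative parameters, not metastore API.
     */
    static String renamedPartitionLocation(String tableLocation, String partitionName) {
        // Drop a single trailing slash so the result never contains "//".
        String base = tableLocation.endsWith("/")
                ? tableLocation.substring(0, tableLocation.length() - 1)
                : tableLocation;
        return base + "/" + partitionName;
    }

    public static void main(String[] args) {
        String table = "hdfs://my.server.com:8020/tmp/testtable/test_local_part";
        // Prints hdfs://my.server.com:8020/tmp/testtable/test_local_part/col2=4
        System.out.println(renamedPartitionLocation(table, "col2=4"));
    }
}
```

With the table from the report, this yields a path under /tmp/testtable/test_local_part, not under /apps/hive/warehouse.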
[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive
[ https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970228#comment-15970228 ]

Rui Li commented on HIVE-15302:
-------------------------------

Thanks [~vanzin] for the suggestions. I'm trying to figure out the minimal set of jars that needs to go into {{spark.yarn.archive}}. The goal is to avoid conflicts and potentially improve performance. Could you please explain how you determined these jars in your work on Livy? It isn't obvious to me.
[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969952#comment-15969952 ]

Edward Capriolo commented on HIVE-16029:
----------------------------------------

Code looks good, but some of the q test files run the explain command:

https://builds.apache.org/job/PreCommit-HIVE-Build/4704/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_udaf_collect_set_/

You need to update the .q.out files so they do not fail.

> COLLECT_SET and COLLECT_LIST does not return NULL in the result
> ---------------------------------------------------------------
>
> Key: HIVE-16029
> URL: https://issues.apache.org/jira/browse/HIVE-16029
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.1.1
> Reporter: Eric Lin
> Assignee: Eric Lin
> Priority: Minor
> Attachments: HIVE-16029.2.patch, HIVE-16029.patch
>
>
> See the test case below:
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from collect_set_test;
> +---------------------+
> | collect_set_test.a  |
> +---------------------+
> | 1                   |
> | 2                   |
> | NULL                |
> | 4                   |
> | NULL                |
> +---------------------+
> 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test;
> +----------+
> | _c0      |
> +----------+
> | [1,2,4]  |
> +----------+
> {code}
> The correct result should be:
> {code}
> 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test;
> +---------------+
> | _c0           |
> +---------------+
> | [1,2,null,4]  |
> +---------------+
> {code}
[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969949#comment-15969949 ]

Hive QA commented on HIVE-16029:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12863552/HIVE-16029.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10579 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_collect_set] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_order_null] (batchId=27)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[udaf_collect_set_2] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[udaf_collect_set] (batchId=102)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4704/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4704/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4704/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12863552 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969931#comment-15969931 ]

Eric Lin commented on HIVE-16029:
---------------------------------

The review has also been updated: https://reviews.apache.org/r/57009/. Please review it and let me know if any other changes are required.
[jira] [Updated] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result
[ https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated HIVE-16029:
----------------------------
Attachment: HIVE-16029.2.patch

Attaching a new patch so that COLLECT_SET takes two arguments: the first is the same as before, and the second is a boolean (true or false), as suggested by Edward.
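As a rough illustration of the two-argument form, the sketch below models the aggregation in plain Java. It assumes the boolean controls whether NULLs are kept in the result; the message above does not spell out the flag's meaning, so that reading, the name keepNulls, and the class itself are hypothetical and are not taken from the patch or from Hive's UDAF code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CollectSetSketch {

    /**
     * Collects the distinct values of a column.
     * keepNulls is a hypothetical name for the boolean second argument:
     * false drops NULLs (the behaviour reported in the issue), true keeps one NULL.
     */
    static <T> List<T> collectSet(List<T> values, boolean keepNulls) {
        // LinkedHashSet permits a null element; insertion order is kept only
        // so the output is easy to compare with the issue description.
        Set<T> seen = new LinkedHashSet<>();
        for (T value : values) {
            if (value == null && !keepNulls) {
                continue;
            }
            seen.add(value);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Integer> column = Arrays.asList(1, 2, null, 4, null);
        System.out.println(collectSet(column, false)); // [1, 2, 4]
        System.out.println(collectSet(column, true));  // [1, 2, null, 4]
    }
}
```

Under this reading, passing false reproduces the current output and passing true produces the result the reporter expects.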
[jira] [Commented] (HIVE-16451) Race condition between HiveStatement.getQueryLog and HiveStatement.runAsyncOnServer
[ https://issues.apache.org/jira/browse/HIVE-16451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969882#comment-15969882 ]

Hive QA commented on HIVE-16451:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12863540/HIVE-16451.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10579 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_order_null] (batchId=27)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct] (batchId=109)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4703/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4703/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4703/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12863540 - PreCommit-HIVE-Build

> Race condition between HiveStatement.getQueryLog and HiveStatement.runAsyncOnServer
> -----------------------------------------------------------------------------------
>
> Key: HIVE-16451
> URL: https://issues.apache.org/jira/browse/HIVE-16451
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Affects Versions: 3.0.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-16451.02.patch, HIVE-16451.03.patch, HIVE-16451.patch
>
>
> During the BeeLineDriver testing I encountered the following race condition:
> - Run the query asynchronously through BeeLine
> - Query the logs in BeeLine
> In the following code:
> {code:title=HiveStatement.runAsyncOnServer}
>   private void runAsyncOnServer(String sql) throws SQLException {
>     checkConnection("execute");
>     closeClientOperation();
>     initFlags();
> [..]
>   }
> {code}
> {code:title=HiveStatement.getQueryLog}
>   public List getQueryLog(boolean incremental, int fetchSize)
>       throws SQLException, ClosedOrCancelledStatementException {
> [..]
>     try {
>       if (stmtHandle != null) {
> [..]
>       } else {
>         if (isQueryClosed) {
>           throw new ClosedOrCancelledStatementException("Method getQueryLog() failed. The " +
>               "statement has been closed or cancelled.");
>         } else {
>           return logs;
>         }
>       }
>     } catch (SQLException e) {
> [..]
>     }
> [..]
>   }
> {code}
> In runAsyncOnServer, {{closeClientOperation}} sets the {{isQueryClosed}} flag to true:
> {code:title=HiveStatement.closeClientOperation}
>   void closeClientOperation() throws SQLException {
> [..]
>     isQueryClosed = true;
>     isExecuteStatementFailed = false;
>     stmtHandle = null;
>   }
> {code}
> The {{initFlags}} call then sets it back to false:
> {code}
>   private void initFlags() {
>     isCancelled = false;
>     isQueryClosed = false;
>     isLogBeingGenerated = true;
>     isExecuteStatementFailed = false;
>     isOperationComplete = false;
>   }
> {code}
> If {{getQueryLog}} is called after {{closeClientOperation}} but before
> {{initFlags}}, then we get the following warning when verbose mode is set
> to true in BeeLine:
> {code}
> Warning: org.apache.hive.jdbc.ClosedOrCancelledStatementException: Method getQueryLog() failed. The statement has been closed or cancelled. (state=,code=0)
> {code}
> This caused the following failure:
> https://builds.apache.org/job/PreCommit-HIVE-Build/4691/testReport/org.apache.hadoop.hive.cli/TestBeeLineDriver/testCliDriver_smb_mapjoin_11_/
> {code}
> Error Message
> Client result comparison failed with error code = 1 while executing fname=smb_mapjoin_11
> 16a17
> > Warning: org.apache.hive.jdbc.ClosedOrCancelledStatementException: Method getQueryLog() failed. The statement has been closed or cancelled. (state=,code=0)
> {code}
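The window described in the issue can be replayed deterministically with a stripped-down stand-in for the statement class. The sketch below is not HiveStatement and does not reflect what the attached patches do (their contents are not shown here); the class name, the resetForNewQuery() method, and the use of IllegalStateException in place of ClosedOrCancelledStatementException are all assumptions made for illustration. It shows one possible remedy, performing the close and the flag reset under the same monitor that getQueryLog() takes.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the flag handling described above; not the real HiveStatement. */
class StatementFlagsSketch {
    private boolean isQueryClosed = false;
    private Object stmtHandle = null;
    private final List<String> logs = new ArrayList<>();

    // The two steps from the description. Each is harmless on its own;
    // the problem is the window between them.
    void closeClientOperation() {
        isQueryClosed = true;
        stmtHandle = null;
    }

    void initFlags() {
        isQueryClosed = false;
    }

    // Hypothetical remedy: reset under the same monitor as getQueryLog(), so a
    // concurrent reader cannot observe isQueryClosed == true in the middle of the reset.
    synchronized void resetForNewQuery() {
        closeClientOperation();
        initFlags();
    }

    synchronized List<String> getQueryLog() {
        if (stmtHandle == null && isQueryClosed) {
            throw new IllegalStateException("statement has been closed or cancelled");
        }
        return new ArrayList<>(logs);
    }

    public static void main(String[] args) {
        StatementFlagsSketch stmt = new StatementFlagsSketch();

        // Replay the bad interleaving by hand: the log poll lands between the two steps.
        stmt.closeClientOperation();
        try {
            stmt.getQueryLog();
            System.out.println("no exception");
        } catch (IllegalStateException e) {
            System.out.println("spurious failure: " + e.getMessage());
        }
        stmt.initFlags();

        // With the atomic reset there is no intermediate state to observe.
        stmt.resetForNewQuery();
        System.out.println("logs after atomic reset: " + stmt.getQueryLog().size());
    }
}
```

The first half reproduces the spurious "closed or cancelled" failure without relying on thread timing; the second half shows that once the two steps run as one unit, a log poll sees either the old state or the fully reset one.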
[jira] [Updated] (HIVE-16451) Race condition between HiveStatement.getQueryLog and HiveStatement.runAsyncOnServer
[ https://issues.apache.org/jira/browse/HIVE-16451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Vary updated HIVE-16451:
------------------------------
Attachment: HIVE-16451.03.patch

Retriggering the precommit with the same file, to check again.