[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive

2017-04-15 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970231#comment-15970231
 ] 

Marcelo Vanzin commented on HIVE-15302:
---

Livy doesn't figure out what spark.yarn.archive or spark.yarn.jars should be. 
It assumes the user has a valid configuration.

If you're going to manage the list of jars for the user, the best way is to use 
maven, as I said. Have a module that is "Hive's packaging of Spark" and have it 
create a zip with all the needed jars or something, and use that, instead of 
manually figuring out lists of jars.

> Relax the requirement that HoS needs Spark built w/o Hive
> -
>
> Key: HIVE-15302
> URL: https://issues.apache.org/jira/browse/HIVE-15302
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
>
> This requirement becomes more and more unacceptable as SparkSQL becomes 
> widely adopted. Let's use this JIRA to find out how we can relax the 
> limitation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16287) Alter table partition rename with location - moves partition back to hive warehouse

2017-04-15 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970230#comment-15970230
 ] 

Rui Li commented on HIVE-16287:
---

[~vihangk1], since this issue exists in 1.x, could you provide a patch for 
branch-1 too? Thanks.

> Alter table partition rename with location - moves partition back to hive 
> warehouse
> ---
>
> Key: HIVE-16287
> URL: https://issues.apache.org/jira/browse/HIVE-16287
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.0
> Environment: RHEL 6.8 
>Reporter: Ying Chen
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16287.01.patch, HIVE-16287.02.patch, 
> HIVE-16287.03.patch, HIVE-16287.04.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> I renamed a partition in a table that I had created using the location 
> clause, and noticed that after the rename completed, the partition had 
> been moved to the Hive warehouse (hive.metastore.warehouse.dir).
> {quote}
> create table test_local_part (col1 int) partitioned by (col2 int) location 
> '/tmp/testtable/test_local_part';
> insert into test_local_part  partition (col2=1) values (1),(3);
> insert into test_local_part  partition (col2=2) values (3);
> alter table test_local_part partition (col2='1') rename to partition 
> (col2='4');
> {quote}
> Running: 
>    describe formatted test_local_part partition (col2='2')
> # Detailed Partition Information
> Partition Value:    [2]
> Database:           default
> Table:              test_local_part
> CreateTime:         Mon Mar 20 13:25:28 PDT 2017
> LastAccessTime:     UNKNOWN
> Protect Mode:       None
> Location:           *hdfs://my.server.com:8020/tmp/testtable/test_local_part/col2=2*
> Running: 
>    describe formatted test_local_part partition (col2='4')
> # Detailed Partition Information
> Partition Value:    [4]
> Database:           default
> Table:              test_local_part
> CreateTime:         Mon Mar 20 13:24:53 PDT 2017
> LastAccessTime:     UNKNOWN
> Protect Mode:       None
> Location:           *hdfs://my.server.com:8020/apps/hive/warehouse/test_local_part/col2=4*
> ---
> Per Sergio's comment - "The rename should create the new partition name in 
> the same location of the table. "
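
The expected behaviour quoted above can be illustrated with a small sketch. This is not the Metastore code and not any of the attached patches: PartitionLocationSketch and renamedLocation are hypothetical names, and the sketch assumes that the renamed partition directory should simply be a sibling of the old one under the table's own location. The old location for col2=1 is also assumed, by analogy with the col2=2 location shown in the description.

```java
// Hypothetical helper, not Hive's Metastore API: derives the location a renamed
// partition would get if it stays under the same parent directory as before.
class PartitionLocationSketch {

    // oldLocation: current partition directory, e.g. .../test_local_part/col2=1
    // newPartName: new partition directory name, e.g. col2=4
    static String renamedLocation(String oldLocation, String newPartName) {
        // Drop a single trailing slash so the last path segment is the partition name.
        String trimmed = oldLocation.endsWith("/")
                ? oldLocation.substring(0, oldLocation.length() - 1)
                : oldLocation;
        int lastSlash = trimmed.lastIndexOf('/');
        if (lastSlash < 0) {
            throw new IllegalArgumentException("Not a directory path: " + oldLocation);
        }
        // Keep everything up to and including the parent directory, swap the last segment.
        return trimmed.substring(0, lastSlash + 1) + newPartName;
    }

    public static void main(String[] args) {
        // Assumed old location, by analogy with the col2=2 partition in the report.
        String oldLoc = "hdfs://my.server.com:8020/tmp/testtable/test_local_part/col2=1";
        System.out.println(renamedLocation(oldLoc, "col2=4"));
    }
}
```

Under these assumptions the renamed partition ends up under /tmp/testtable/test_local_part rather than under the warehouse directory.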





[jira] [Commented] (HIVE-15302) Relax the requirement that HoS needs Spark built w/o Hive

2017-04-15 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970228#comment-15970228
 ] 

Rui Li commented on HIVE-15302:
---

Thanks [~vanzin] for the suggestions. I'm trying to figure out the minimal set 
of jars required for {{spark.yarn.archive}}. The purpose is to avoid conflicts 
and potentially improve performance. Could you please explain more about how 
you figured out these jars in your work for Livy? It doesn't seem obvious to me.






[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result

2017-04-15 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969952#comment-15969952
 ] 

Edward Capriolo commented on HIVE-16029:


Code looks good, but some of the q test files run the explain command:
https://builds.apache.org/job/PreCommit-HIVE-Build/4704/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_udaf_collect_set_/

You need to update the .q.out files so they do not fail.


> COLLECT_SET and COLLECT_LIST does not return NULL in the result
> ---
>
> Key: HIVE-16029
> URL: https://issues.apache.org/jira/browse/HIVE-16029
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Eric Lin
>Assignee: Eric Lin
>Priority: Minor
> Attachments: HIVE-16029.2.patch, HIVE-16029.patch
>
>
> See the test case below:
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from collect_set_test;
> +---------------------+
> | collect_set_test.a  |
> +---------------------+
> | 1                   |
> | 2                   |
> | NULL                |
> | 4                   |
> | NULL                |
> +---------------------+
> 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test;
> +----------+
> | _c0      |
> +----------+
> | [1,2,4]  |
> +----------+
> {code}
> The correct result should be:
> {code}
> 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test;
> +---------------+
> | _c0           |
> +---------------+
> | [1,2,null,4]  |
> +---------------+
> {code}





[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result

2017-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969949#comment-15969949
 ] 

Hive QA commented on HIVE-16029:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12863552/HIVE-16029.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10579 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_collect_set] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_order_null] (batchId=27)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[udaf_collect_set_2] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[udaf_collect_set] (batchId=102)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4704/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4704/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4704/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12863552 - PreCommit-HIVE-Build






[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result

2017-04-15 Thread Eric Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969931#comment-15969931
 ] 

Eric Lin commented on HIVE-16029:
-

Review is also updated: https://reviews.apache.org/r/57009/.

Please help review it and let me know whether any other changes are required.






[jira] [Updated] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result

2017-04-15 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated HIVE-16029:

Attachment: HIVE-16029.2.patch

Attaching a new patch so that COLLECT_SET takes two arguments: the first is the 
same as before, and the second is a boolean (true or false), as suggested by 
Edward.
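
A minimal sketch of the two-argument idea, in plain Java rather than Hive's UDAF classes. It is not the attached patch: CollectSetSketch and collectSet are hypothetical names, and the meaning given to the flag here (true keeps NULL in the result, false drops it) is an assumption, since the comment above does not say which value does what.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: plain Java, not Hive's GenericUDAF implementation.
class CollectSetSketch {

    // Returns the distinct values in first-seen order. The meaning of the flag
    // (true keeps NULLs, false drops them) is an assumption of this sketch.
    static <T> List<T> collectSet(List<T> values, boolean keepNulls) {
        Set<T> seen = new LinkedHashSet<>(); // LinkedHashSet accepts a single null element
        for (T v : values) {
            if (v == null && !keepNulls) {
                continue; // behaviour reported in the issue: NULL is skipped
            }
            seen.add(v);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Same rows as the collect_set_test table in the issue description.
        List<Integer> a = Arrays.asList(1, 2, null, 4, null);
        System.out.println(collectSet(a, false)); // [1, 2, 4]
        System.out.println(collectSet(a, true));  // [1, 2, null, 4]
    }
}
```

Keeping the one-argument behaviour as the default and making NULL handling opt-in would leave existing queries unchanged, which is presumably why a second argument was suggested instead of changing the result outright.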






[jira] [Commented] (HIVE-16451) Race condition between HiveStatement.getQueryLog and HiveStatement.runAsyncOnServer

2017-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969882#comment-15969882
 ] 

Hive QA commented on HIVE-16451:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12863540/HIVE-16451.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10579 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_order_null] (batchId=27)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct] (batchId=109)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4703/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4703/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4703/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12863540 - PreCommit-HIVE-Build

> Race condition between HiveStatement.getQueryLog and 
> HiveStatement.runAsyncOnServer
> ---
>
> Key: HIVE-16451
> URL: https://issues.apache.org/jira/browse/HIVE-16451
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16451.02.patch, HIVE-16451.03.patch, 
> HIVE-16451.patch
>
>
> During BeeLineDriver testing I encountered a race condition between these 
> two actions:
> - Running a query asynchronously through BeeLine
> - Querying the logs in BeeLine
> The relevant code is the following:
> {code:title=HiveStatement.runAsyncOnServer}
>   private void runAsyncOnServer(String sql) throws SQLException {
>     checkConnection("execute");
>     closeClientOperation();
>     initFlags();
>     [..]
>   }
> {code}
> {code:title=HiveStatement.getQueryLog}
>   public List<String> getQueryLog(boolean incremental, int fetchSize)
>       throws SQLException, ClosedOrCancelledStatementException {
>     [..]
>     try {
>       if (stmtHandle != null) {
>         [..]
>       } else {
>         if (isQueryClosed) {
>           throw new ClosedOrCancelledStatementException("Method getQueryLog() failed. The " +
>               "statement has been closed or cancelled.");
>         } else {
>           return logs;
>         }
>       }
>     } catch (SQLException e) {
>       [..]
>     }
>     [..]
>   }
> {code}
> In {{runAsyncOnServer}}, the {{closeClientOperation}} call sets the 
> {{isQueryClosed}} flag to true:
> {code:title=HiveStatement.closeClientOperation}
>   void closeClientOperation() throws SQLException {
>     [..]
>     isQueryClosed = true;
>     isExecuteStatementFailed = false;
>     stmtHandle = null;
>   }
> {code}
> The {{initFlags}} sets it to false:
> {code}
>   private void initFlags() {
>     isCancelled = false;
>     isQueryClosed = false;
>     isLogBeingGenerated = true;
>     isExecuteStatementFailed = false;
>     isOperationComplete = false;
>   }
> {code}
> If {{getQueryLog}} is called after {{closeClientOperation}} but before 
> {{initFlags}}, the following warning is shown when verbose mode is set to 
> true in BeeLine:
> {code}
> Warning: org.apache.hive.jdbc.ClosedOrCancelledStatementException: Method 
> getQueryLog() failed. The statement has been closed or cancelled. 
> (state=,code=0)
> {code}
> This caused the following test failure:
> https://builds.apache.org/job/PreCommit-HIVE-Build/4691/testReport/org.apache.hadoop.hive.cli/TestBeeLineDriver/testCliDriver_smb_mapjoin_11_/
> {code}
> Error Message
> Client result comparison failed with error code = 1 while executing 
> fname=smb_mapjoin_11
> 16a17
> > Warning: org.apache.hive.jdbc.ClosedOrCancelledStatementException: Method 
> > getQueryLog() failed. The statement has been closed or cancelled. 
> > (state=,code=0)
> {code}
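
The window described above can be sketched with a simplified stand-in for HiveStatement. The field and method names follow the snippets in the description; everything else (the class name, the lock, resetForNewQuery, and the use of IllegalStateException in place of ClosedOrCancelledStatementException) is hypothetical. The mitigation shown, performing close and re-initialisation under a lock shared with getQueryLog, is only one possible approach and is not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for HiveStatement; not the real class and not the attached patch.
class StatementFlagsSketch {
    private final Object lock = new Object();
    private boolean isQueryClosed = false;
    private Object stmtHandle = null;
    private final List<String> logs = new ArrayList<>();

    // Mirrors closeClientOperation(): marks the previous query as closed.
    void closeClientOperation() {
        synchronized (lock) {
            isQueryClosed = true;
            stmtHandle = null;
        }
    }

    // Mirrors initFlags(): clears the closed flag for the next query.
    void initFlags() {
        synchronized (lock) {
            isQueryClosed = false;
        }
    }

    // Hypothetical mitigation: run both steps while holding the lock, so a
    // concurrent getQueryLog() never observes the intermediate "closed" state.
    void resetForNewQuery() {
        synchronized (lock) { // Java monitors are reentrant, so the nested locking is safe
            closeClientOperation();
            initFlags();
        }
    }

    // Mirrors the else-branch of getQueryLog(): fails if the statement looks closed.
    List<String> getQueryLog() {
        synchronized (lock) {
            if (stmtHandle == null && isQueryClosed) {
                throw new IllegalStateException(
                        "Method getQueryLog() failed. The statement has been closed or cancelled.");
            }
            return new ArrayList<>(logs);
        }
    }

    public static void main(String[] args) {
        StatementFlagsSketch stmt = new StatementFlagsSketch();

        // Reproduce the window sequentially: a log fetch lands between the two calls.
        stmt.closeClientOperation();
        try {
            stmt.getQueryLog();
        } catch (IllegalStateException e) {
            System.out.println("Warning: " + e.getMessage());
        }
        stmt.initFlags();

        // With the combined reset there is no point at which the fetch can fail.
        stmt.resetForNewQuery();
        System.out.println("logs fetched: " + stmt.getQueryLog().size());
    }
}
```

An alternative with the same effect would be to have getQueryLog treat the state between the two calls as "no logs yet" instead of "closed". Which option fits HiveStatement better depends on code not shown in the description.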





[jira] [Updated] (HIVE-16451) Race condition between HiveStatement.getQueryLog and HiveStatement.runAsyncOnServer

2017-04-15 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-16451:
--
Attachment: HIVE-16451.03.patch

Retriggering the precommit with the same file, to check again.





