[jira] [Commented] (HIVE-1010) Implement INFORMATION_SCHEMA in Hive

2017-02-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867399#comment-15867399
 ] 

Gunther Hagleitner commented on HIVE-1010:
--

WIP. Based on patch in HIVE-1555. SCHEMATA, TABLES, and COLUMNS are there.

> Implement INFORMATION_SCHEMA in Hive
> 
>
> Key: HIVE-1010
> URL: https://issues.apache.org/jira/browse/HIVE-1010
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Server Infrastructure
>Reporter: Jeff Hammerbacher
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1010.1.patch
>
>
> INFORMATION_SCHEMA is part of the SQL92 standard and would be useful to 
> implement using our metastore.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-1010) Implement INFORMATION_SCHEMA in Hive

2017-02-14 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1010:
-
Attachment: HIVE-1010.1.patch

> Implement INFORMATION_SCHEMA in Hive
> 
>
> Key: HIVE-1010
> URL: https://issues.apache.org/jira/browse/HIVE-1010
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Server Infrastructure
>Reporter: Jeff Hammerbacher
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1010.1.patch
>
>
> INFORMATION_SCHEMA is part of the SQL92 standard and would be useful to 
> implement using our metastore.





[jira] [Assigned] (HIVE-1010) Implement INFORMATION_SCHEMA in Hive

2017-02-14 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner reassigned HIVE-1010:


Assignee: Gunther Hagleitner

> Implement INFORMATION_SCHEMA in Hive
> 
>
> Key: HIVE-1010
> URL: https://issues.apache.org/jira/browse/HIVE-1010
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Server Infrastructure
>Reporter: Jeff Hammerbacher
>Assignee: Gunther Hagleitner
>
> INFORMATION_SCHEMA is part of the SQL92 standard and would be useful to 
> implement using our metastore.





[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson

2017-02-14 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867375#comment-15867375
 ] 

Zoltan Haindrich commented on HIVE-15161:
-

[~pxiong] could you please take a look?

> migrate ColumnStats to use jackson
> --
>
> Key: HIVE-15161
> URL: https://issues.apache.org/jira/browse/HIVE-15161
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 2.2.0
>
> Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, 
> HIVE-15161.3.patch, HIVE-15161.4.patch, HIVE-15161.4.patch, 
> HIVE-15161.5.patch, HIVE-15161.5.patch, HIVE-15161.6.patch
>
>
> * json.org has license issues
> * jackson can provide a fully compatible alternative to it
> * there are a few flakiness issues caused by the order of the map entries of 
> the columns... this can be addressed; the org.json API was unfriendly in this 
> respect ;)
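
The ordering flakiness mentioned in the last bullet can be illustrated with a minimal sketch. It uses Python's standard json module as a stand-in, not jackson or the patch's actual code, and the function name and sample column names are hypothetical. Emitting map entries in sorted key order makes the serialized form independent of insertion order:

```python
import json

def serialize_column_stats(stats):
    """Serialize a column-name -> value mapping with keys in sorted order.

    Hypothetical helper: sorting the keys makes the output stable no matter
    in which order the entries were added to the map.
    """
    return json.dumps(stats, sort_keys=True, separators=(",", ":"))

# Two maps with the same entries, built in different orders.
first = {"b_col": "true", "a_col": "true"}
second = {"a_col": "true", "b_col": "true"}

# Both serialize to the same string, so comparisons against a golden file
# no longer depend on map iteration order.
same_output = serialize_column_stats(first) == serialize_column_stats(second)
```

With a serializer that preserves whatever order the map happens to iterate in, the two strings could differ even though the contents are equal, which is one way a test can become flaky.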





[jira] [Commented] (HIVE-15710) HS2 Stopped when running in background

2017-02-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867372#comment-15867372
 ] 

Rui Li commented on HIVE-15710:
---

Thanks [~Ferd] for your suggestions. I guess the redundancy is not a big deal. 
More importantly, we need to prevent users from bypassing the fix by just 
calling {{hive --service}}. So maybe adding it to {{ext/hiveserver2.sh}} is better?

> HS2 Stopped when running in background
> --
>
> Key: HIVE-15710
> URL: https://issues.apache.org/jira/browse/HIVE-15710
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-15710.1.patch
>
>
> To reproduce, start HS2 in background like {{hive --service hiveserver2 &}}, 
> and run some query from beeline using Tez or Spark as engine.
> I think it's similar to HIVE-6758: HS2 uses jline for the in-place progress 
> update, and receives SIGTTOU when trying to call stty.





[jira] [Updated] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2017-02-14 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-6590:
---
Attachment: HIVE-6590.2.patch

patch.2
* enable sort results for qtest
* schema_evol tests contained a string to boolean conversion 

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>Assignee: Zoltan Haindrich
> Attachments: HIVE-6590.1.patch, HIVE-6590.2.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_table(int_col int) partitioned by(bool_col boolean);
> -- This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_table" returns the correct results, but if you apply a 
> filter on the bool partition key column Hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc.) mapping to the literal value 'FALSE'. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/
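
The second problem described above can be sketched as follows. This is only an illustration of the idea of normalizing the literal before building the partition path. It is not taken from the attached patches, the helper names are hypothetical, and the choice of lowercase as the canonical form is an assumption:

```python
def normalize_bool_literal(literal):
    """Map any spelling of a boolean literal to one canonical form.

    Assumption: lowercase "true"/"false" is the canonical spelling.
    """
    text = literal.strip().lower()
    if text not in ("true", "false"):
        raise ValueError(f"not a boolean literal: {literal!r}")
    return text

def partition_path(table_dir, column, literal):
    """Build the partition directory from the normalized literal."""
    return f"{table_dir}/{column}={normalize_bool_literal(literal)}/"

# The three spellings from the description collapse to a single directory.
paths = {
    partition_path("/test-warehouse/bool_table", "bool_col", spelling)
    for spelling in ("FALSE", "false", "False")
}
```

Using the raw string form of the literal instead of the normalized value is what yields three distinct directories for one logical value.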





[jira] [Commented] (HIVE-15891) Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867361#comment-15867361
 ] 

Hive QA commented on HIVE-15891:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852720/HIVE-15891.2.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10237 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[auto_sortmerge_join_16]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[external_table_with_space_in_location_path]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap3]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[infer_bucket_sort_map_operators]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[parallel_orderby]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[remote_script]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[schemeAuthority]
 (batchId=161)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3559/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3559/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3559/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852720 - PreCommit-HIVE-Build

> Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast
> ---
>
> Key: HIVE-15891
> URL: https://issues.apache.org/jira/browse/HIVE-15891
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15891.1.patch, HIVE-15891.2.patch
>
>
> Currently ACID UpdateDeleteSemanticAnalyzer directly manipulates the AST tree 
> but it's different from the general approach of modifying the token stream 
> and thus will cause AST tree mismatch if there is any rewrite happening after 
> UpdateDeleteSemanticAnalyzer.
> The long term solution will be to rewrite the AST handling logic in 
> UpdateDeleteSemanticAnalyzer, to make it consistent with the general approach.
> This ticket will for now detect the error prone cases and fail early. 





[jira] [Commented] (HIVE-15710) HS2 Stopped when running in background

2017-02-14 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867359#comment-15867359
 ] 

Ferdinand Xu commented on HIVE-15710:
-

At this point, only HiveServer2 and Beeline require 
"-Djline.terminal=jline.UnsupportedTerminal". I prefer to add it to HiveServer2 
instead of bin/hive. To avoid redundancy, you may use some utility function and 
call it under ext. Any ideas, [~mohitsabharwal]?
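
The suggestion above (one shared helper that adds the option only for the services that need it) could look roughly like the sketch below. It is written in Python purely for illustration. The real scripts are shell, and the function name and the way options are passed around are assumptions, not the contents of the patch:

```python
JLINE_FLAG = "-Djline.terminal=jline.UnsupportedTerminal"

def jvm_opts_for_service(service, existing_opts=""):
    """Return the JVM options string for a service.

    Hypothetical helper: the jline option is appended only for hiveserver2,
    and only if it is not already present, so calling it twice is harmless.
    """
    opts = existing_opts.split()
    if service == "hiveserver2" and JLINE_FLAG not in opts:
        opts.append(JLINE_FLAG)
    return " ".join(opts)

hs2_opts = jvm_opts_for_service("hiveserver2", "-Xmx1g")
metastore_opts = jvm_opts_for_service("metastore", "-Xmx1g")
```

Keeping the logic in one helper means the per-service scripts only decide whether to call it, which avoids repeating the option string in several places.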

> HS2 Stopped when running in background
> --
>
> Key: HIVE-15710
> URL: https://issues.apache.org/jira/browse/HIVE-15710
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-15710.1.patch
>
>
> To reproduce, start HS2 in background like {{hive --service hiveserver2 &}}, 
> and run some query from beeline using Tez or Spark as engine.
> I think it's similar to HIVE-6758: HS2 uses jline for the in-place progress 
> update, and receives SIGTTOU when trying to call stty.





[jira] [Updated] (HIVE-15710) HS2 Stopped when running in background

2017-02-14 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15710:
--
Status: Patch Available  (was: Open)

> HS2 Stopped when running in background
> --
>
> Key: HIVE-15710
> URL: https://issues.apache.org/jira/browse/HIVE-15710
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-15710.1.patch
>
>
> To reproduce, start HS2 in background like {{hive --service hiveserver2 &}}, 
> and run some query from beeline using Tez or Spark as engine.
> I think it's similar to HIVE-6758: HS2 uses jline for the in-place progress 
> update, and receives SIGTTOU when trying to call stty.





[jira] [Updated] (HIVE-15710) HS2 Stopped when running in background

2017-02-14 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15710:
--
Attachment: HIVE-15710.1.patch

Moved the check to bin/hive in patch v1.
Also pinging [~Ferd] for review.

> HS2 Stopped when running in background
> --
>
> Key: HIVE-15710
> URL: https://issues.apache.org/jira/browse/HIVE-15710
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-15710.1.patch
>
>
> To reproduce, start HS2 in background like {{hive --service hiveserver2 &}}, 
> and run some query from beeline using Tez or Spark as engine.
> I think it's similar to HIVE-6758: HS2 uses jline for the in-place progress 
> update, and receives SIGTTOU when trying to call stty.





[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive

2017-02-14 Thread Simanchal Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simanchal Das updated HIVE-15229:
-
Attachment: HIVE-15229.4.patch

> 'like any' and 'like all' operators in hive
> ---
>
> Key: HIVE-15229
> URL: https://issues.apache.org/jira/browse/HIVE-15229
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Reporter: Simanchal Das
>Assignee: Simanchal Das
>Priority: Minor
> Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, 
> HIVE-15229.3.patch, HIVE-15229.4.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used when 
> matching a text field against a number of patterns.
> The 'like any' and 'like all' operators are equivalent to multiple like 
> operators, as in the example below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like condition 
> select col1 from table1 where col2 like '%accountant%' or col2 like 
> '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like 
> '%insurance%' ;
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like operator 
> select col1 from table1 where col2 like '%accountant%' and col2 like 
> '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like 
> '%insurance%' ;
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata 
> to Hive.
> Data engineers and business analysts are always looking for these two 
> operators.
> If we introduce these two operators in Hive, many scripts can be 
> migrated smoothly instead of converting these operators to multiple like 
> operators.
> Result:
> 1. The 'LIKE ANY' operator returns true if a text (column value) matches any 
> pattern.
> 2. The 'LIKE ALL' operator returns true if a text (column value) matches all 
> patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the 
> left-hand side is NULL, but also if one of the patterns in the list is NULL.
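
The three result rules above can be modelled with a minimal sketch. It is not the operator implementation from the attached patches. The function names are hypothetical, None stands in for SQL NULL, and escape characters in LIKE patterns are deliberately ignored to keep the example short:

```python
import re

def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression.

    '%' matches any run of characters and '_' matches exactly one character.
    Escape sequences are not handled in this sketch.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def sql_like(value, pattern):
    """Return True if the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None

def like_any(value, patterns):
    """Rules 1 and 3: NULL if any input is NULL, else true on any match."""
    if value is None or any(p is None for p in patterns):
        return None
    return any(sql_like(value, p) for p in patterns)

def like_all(value, patterns):
    """Rules 2 and 3: NULL if any input is NULL, else true only if all match."""
    if value is None or any(p is None for p in patterns):
        return None
    return all(sql_like(value, p) for p in patterns)
```

For example, `like_any("retail bank", ["%accountant%", "%bank%"])` is true because one pattern matches, while `like_all` with the same arguments is false because '%accountant%' does not match.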





[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive

2017-02-14 Thread Simanchal Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simanchal Das updated HIVE-15229:
-
Status: Patch Available  (was: Open)

> 'like any' and 'like all' operators in hive
> ---
>
> Key: HIVE-15229
> URL: https://issues.apache.org/jira/browse/HIVE-15229
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Reporter: Simanchal Das
>Assignee: Simanchal Das
>Priority: Minor
> Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, 
> HIVE-15229.3.patch, HIVE-15229.4.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used when 
> matching a text field against a number of patterns.
> The 'like any' and 'like all' operators are equivalent to multiple like 
> operators, as in the example below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like condition 
> select col1 from table1 where col2 like '%accountant%' or col2 like 
> '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like 
> '%insurance%' ;
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like operator 
> select col1 from table1 where col2 like '%accountant%' and col2 like 
> '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like 
> '%insurance%' ;
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata 
> to Hive.
> Data engineers and business analysts are always looking for these two 
> operators.
> If we introduce these two operators in Hive, many scripts can be 
> migrated smoothly instead of converting these operators to multiple like 
> operators.
> Result:
> 1. The 'LIKE ANY' operator returns true if a text (column value) matches any 
> pattern.
> 2. The 'LIKE ALL' operator returns true if a text (column value) matches all 
> patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the 
> left-hand side is NULL, but also if one of the patterns in the list is NULL.





[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive

2017-02-14 Thread Simanchal Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simanchal Das updated HIVE-15229:
-
Attachment: (was: HIVE-15229.4.patch)

> 'like any' and 'like all' operators in hive
> ---
>
> Key: HIVE-15229
> URL: https://issues.apache.org/jira/browse/HIVE-15229
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Reporter: Simanchal Das
>Assignee: Simanchal Das
>Priority: Minor
> Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, 
> HIVE-15229.3.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used when 
> matching a text field against a number of patterns.
> The 'like any' and 'like all' operators are equivalent to multiple like 
> operators, as in the example below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like condition 
> select col1 from table1 where col2 like '%accountant%' or col2 like 
> '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like 
> '%insurance%' ;
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like operator 
> select col1 from table1 where col2 like '%accountant%' and col2 like 
> '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like 
> '%insurance%' ;
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata 
> to Hive.
> Data engineers and business analysts are always looking for these two 
> operators.
> If we introduce these two operators in Hive, many scripts can be 
> migrated smoothly instead of converting these operators to multiple like 
> operators.
> Result:
> 1. The 'LIKE ANY' operator returns true if a text (column value) matches any 
> pattern.
> 2. The 'LIKE ALL' operator returns true if a text (column value) matches all 
> patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the 
> left-hand side is NULL, but also if one of the patterns in the list is NULL.





[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive

2017-02-14 Thread Simanchal Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simanchal Das updated HIVE-15229:
-
Status: Open  (was: Patch Available)

> 'like any' and 'like all' operators in hive
> ---
>
> Key: HIVE-15229
> URL: https://issues.apache.org/jira/browse/HIVE-15229
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Reporter: Simanchal Das
>Assignee: Simanchal Das
>Priority: Minor
> Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, 
> HIVE-15229.3.patch, HIVE-15229.4.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used when 
> matching a text field against a number of patterns.
> The 'like any' and 'like all' operators are equivalent to multiple like 
> operators, as in the example below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like condition 
> select col1 from table1 where col2 like '%accountant%' or col2 like 
> '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like 
> '%insurance%' ;
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like operator 
> select col1 from table1 where col2 like '%accountant%' and col2 like 
> '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like 
> '%insurance%' ;
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata 
> to Hive.
> Data engineers and business analysts are always looking for these two 
> operators.
> If we introduce these two operators in Hive, many scripts can be 
> migrated smoothly instead of converting these operators to multiple like 
> operators.
> Result:
> 1. The 'LIKE ANY' operator returns true if a text (column value) matches any 
> pattern.
> 2. The 'LIKE ALL' operator returns true if a text (column value) matches all 
> patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the 
> left-hand side is NULL, but also if one of the patterns in the list is NULL.





[jira] [Commented] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867321#comment-15867321
 ] 

Hive QA commented on HIVE-15881:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852686/HIVE-15881.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3558/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3558/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3558/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852686 - PreCommit-HIVE-Build

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> --
>
> Key: HIVE-15881
> URL: https://issues.apache.org/jira/browse/HIVE-15881
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-15881.1.patch
>
>
> The Utilities class has two methods, {{getInputSummary}} and 
> {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} 
> to get the summary of a list of input locations in parallel. These methods 
> are Hive related, but the variable name does not look like it is specific to Hive.
> Also, the above variable is neither in HiveConf nor used anywhere else. I just 
> found a reference in the Hadoop MR1 code.
> I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, 
> and use a different variable name, such as 
> {{hive.get.input.listing.num.threads}}, that reflects the intention of the 
> variable. The removal of the old variable might happen on Hive 3.x
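
The proposed deprecation can be sketched as a lookup that prefers the new name and falls back to the old one with a warning. This is an illustration only, not the code in the attached patch. A plain dict stands in for the configuration object, the new key is the name the description offers as an example, and the default of 0 is a hypothetical placeholder:

```python
import warnings

NEW_KEY = "hive.get.input.listing.num.threads"   # name suggested in the description
OLD_KEY = "mapred.dfsclient.parallelism.max"     # name proposed for deprecation

def listing_thread_count(conf, default=0):
    """Resolve the thread count, preferring the new key over the old one.

    `conf` is a plain dict in this sketch. The default of 0 is an assumption,
    not a value taken from Hive.
    """
    if NEW_KEY in conf:
        return int(conf[NEW_KEY])
    if OLD_KEY in conf:
        warnings.warn(
            f"{OLD_KEY} is deprecated; use {NEW_KEY} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return int(conf[OLD_KEY])
    return default
```

Reading both names during the transition keeps existing configurations working, and dropping the fallback branch later completes the removal of the old variable.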





[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables

2017-02-14 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
--
Component/s: Transactions
 llap

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.1.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember ACID logic is embedded inside ORC format; we need to 
> refactor it to be on top of some interface, if practical; or just port it to 
> LLAP read path.
> Another consideration is how the logic will work with cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache merged representation in future.





[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables

2017-02-14 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
--
Status: Patch Available  (was: Open)

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.1.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember ACID logic is embedded inside ORC format; we need to 
> refactor it to be on top of some interface, if practical; or just port it to 
> LLAP read path.
> Another consideration is how the logic will work with cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache merged representation in future.





[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables

2017-02-14 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867303#comment-15867303
 ] 

Teddy Choi commented on HIVE-12631:
---

This draft patch implements the basic idea. OrcAcidEncodedDataConsumer merges 
base from LLAP and delta from files before consuming. The methods and classes 
that are shared between OrcAcidEncodedDataConsumer and 
VectorizedOrcAcidRowBatchReader are now in AcidMergeUtils. AcidMergeUtils 
handles not only VectorizedRowBatch but also ColumnVectorBatch.

However, this patch doesn't cache delta data on LLAP. I will try to cache it in 
the next patch.
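
The general idea of merging a base with delta events before consuming rows can be sketched in a heavily simplified form. This is not the logic of OrcAcidEncodedDataConsumer or AcidMergeUtils. The function name, the use of a single integer row id, and the three event kinds are all assumptions made for illustration:

```python
def merge_base_and_deltas(base_rows, delta_events):
    """Apply delta events, in order, on top of the base rows.

    base_rows:    iterable of (row_id, value) pairs read from the base.
    delta_events: iterable of (row_id, op, value) with op in
                  {"insert", "update", "delete"} (hypothetical event model).
    Returns the merged rows sorted by row id.
    """
    merged = dict(base_rows)
    for row_id, op, value in delta_events:
        if op == "delete":
            merged.pop(row_id, None)   # later events override the base
        else:
            merged[row_id] = value     # insert or update
    return sorted(merged.items())

base = [(1, "a"), (2, "b"), (3, "c")]
deltas = [(2, "update", "B"), (3, "delete", None), (4, "insert", "d")]
merged_rows = merge_base_and_deltas(base, deltas)
```

In this model the base could come from a cache while the deltas are read from files, which matches the split described in the comment above only at the level of the overall data flow.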

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.1.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember ACID logic is embedded inside ORC format; we need to 
> refactor it to be on top of some interface, if practical; or just port it to 
> LLAP read path.
> Another consideration is how the logic will work with cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache merged representation in future.





[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables

2017-02-14 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
--
Attachment: HIVE-12631.1.patch

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.1.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember ACID logic is embedded inside ORC format; we need to 
> refactor it to be on top of some interface, if practical; or just port it to 
> LLAP read path.
> Another consideration is how the logic will work with cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache merged representation in future.





[jira] [Commented] (HIVE-15921) Re-order the slider stop command to avoid a force if possible

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867285#comment-15867285
 ] 

Hive QA commented on HIVE-15921:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852725/HIVE-15921.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3557/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3557/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3557/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852725 - PreCommit-HIVE-Build

> Re-order the slider stop command to avoid a force if possible
> -
>
> Key: HIVE-15921
> URL: https://issues.apache.org/jira/browse/HIVE-15921
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15921.01.patch, HIVE-15921.02.patch
>
>
> A graceful stop is required for slider --service llapstatus to work properly





[jira] [Commented] (HIVE-15892) Vectorization: Fast Hash tables need to do bounds checking during expand

2017-02-14 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867264#comment-15867264
 ] 

Matt McCline commented on HIVE-15892:
-

Yes, it will fail the query.  I had a long conversation with Gopal and we think 
this is the best thing in the short/medium term.  Very large hash tables are 
not very efficient.  We need to encourage people to have smaller hash tables, 
though a runtime error is obviously not ideal.

> Vectorization: Fast Hash tables need to do bounds checking during expand
> 
>
> Key: HIVE-15892
> URL: https://issues.apache.org/jira/browse/HIVE-15892
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15892.01.patch, HIVE-15892.02.patch
>
>
> VectorMapJoinFastLongHashTable line 165 gets NegativeArraySizeException:
> {code}
> long[] newSlotPairs = new long[newSlotPairArraySize];
> {code}
> We need to add a size check... Java math for this wrapped around to negative:
> {code}
> int newSlotPairArraySize = newLogicalHashBucketCount * 2;
> {code}
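
The wrap-around described above can be guarded with an explicit bounds check before the array is allocated. The following is only a minimal sketch of that idea, not the code from the attached patches: the class SlotPairSizing, the method slotPairArraySize, and the constant MAX_LOGICAL_BUCKET_COUNT are hypothetical names introduced for illustration.

```java
// Hypothetical helper for illustration; not taken from the HIVE-15892 patch.
final class SlotPairSizing {
    // Largest bucket count whose doubled size still fits in a positive int.
    static final int MAX_LOGICAL_BUCKET_COUNT = Integer.MAX_VALUE / 2;

    // Returns the slot pair array size (two longs per bucket), or throws
    // instead of silently wrapping around to a negative value.
    static int slotPairArraySize(int newLogicalHashBucketCount) {
        if (newLogicalHashBucketCount < 0
                || newLogicalHashBucketCount > MAX_LOGICAL_BUCKET_COUNT) {
            throw new IllegalStateException(
                "Hash table too large: bucket count " + newLogicalHashBucketCount);
        }
        return newLogicalHashBucketCount * 2;
    }

    public static void main(String[] args) {
        int buckets = 1 << 30;
        // Unchecked int math wraps around: 2^30 * 2 overflows to a negative int.
        System.out.println(buckets * 2);
        // A small bucket count passes the check and is simply doubled.
        System.out.println(slotPairArraySize(1 << 20));
    }
}
```

With a check like this the query still fails for oversized tables, but with a descriptive error rather than a NegativeArraySizeException.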





[jira] [Commented] (HIVE-15489) Alternatively use table scan stats for HoS

2017-02-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867249#comment-15867249
 ] 

Xuefu Zhang commented on HIVE-15489:


+1, Patch looks good. One minor suggestion: it would be great if we can add 
some map-join test cases with the introduced configuration set to true to serve 
as a sanity check. Otherwise, the new code path will not be exercised in ptest.

> Alternatively use table scan stats for HoS
> --
>
> Key: HIVE-15489
> URL: https://issues.apache.org/jira/browse/HIVE-15489
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark, Statistics
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-15489.1.patch, HIVE-15489.2.patch, 
> HIVE-15489.3.patch, HIVE-15489.6.patch, HIVE-15489.wip.patch
>
>
> For MapJoin in HoS, we should provide an option to only use stats in the TS 
> rather than the populated stats in each of the join branches. This could be 
> pretty conservative but more reliable.





[jira] [Commented] (HIVE-15902) Select query involving date throwing Hive 2 Internal error: unsupported conversion from type: date

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867228#comment-15867228
 ] 

Hive QA commented on HIVE-15902:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852677/HIVE-15902.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10239 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3555/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3555/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3555/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852677 - PreCommit-HIVE-Build

> Select query involving date throwing Hive 2 Internal error: unsupported 
> conversion from type: date
> --
>
> Key: HIVE-15902
> URL: https://issues.apache.org/jira/browse/HIVE-15902
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Jason Dere
> Attachments: HIVE-15902.1.patch
>
>
> The following query is throwing Hive 2 Internal error: unsupported conversion 
> from type: date
> Query:
> create table table_one (ts timestamp, dt date) stored as orc;
> insert into table_one values ('2034-08-04 17:42:59','2038-07-01');
> insert into table_one values ('2031-02-07 13:02:38','2072-10-19');
> create table table_two (ts timestamp, dt date) stored as orc;
> insert into table_two values ('2069-04-01 09:05:54','1990-10-12');
> insert into table_two values ('2031-02-07 13:02:38','2072-10-19');
> create table table_three as
> select count(*) from table_one
> group by ts,dt
> having dt in (select dt from table_two);
> Error while running task ( failure ) : 
> attempt_1486991777989_0184_18_02_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> 

[jira] [Commented] (HIVE-15710) HS2 Stopped when running in background

2017-02-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867218#comment-15867218
 ] 

Xuefu Zhang commented on HIVE-15710:


Hi [~lirui], your proposal looks good to me. Thanks for working on this.

> HS2 Stopped when running in background
> --
>
> Key: HIVE-15710
> URL: https://issues.apache.org/jira/browse/HIVE-15710
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>
> To reproduce, start HS2 in background like {{hive --service hiveserver2 &}}, 
> and run some query from beeline using Tez or Spark as engine.
> I think it's similar to HIVE-6758: HS2 uses jline for the in-place progress 
> update, and receives SIGTTOU when trying to call stty.





[jira] [Commented] (HIVE-15912) Executor kill task and Failed to get spark memory/core info

2017-02-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867198#comment-15867198
 ] 

Rui Li commented on HIVE-15912:
---

The failure to get mem/core info is just a warning. So it's not related to 
executor killing tasks. You should check the AM log (if in yarn-cluster mode) 
to see why the driver commands a shutdown.

> Executor kill task and Failed to get spark memory/core info
> ---
>
> Key: HIVE-15912
> URL: https://issues.apache.org/jira/browse/HIVE-15912
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark2.0.2
> Hive2.2
>Reporter: KaiXu
>
> Hive on Spark, failed with error:
> Starting Spark Job = 12a8cb8c-ed0d-4049-ae06-8d32d13fe285
> Failed to monitor Job[ 6] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> Hive's log:
> 2017-02-14T19:03:09,147  INFO [stderr-redir-1] client.SparkClientImpl: 
> 17/02/14 19:03:09 INFO yarn.Client: Application report for 
> application_1486905599813_0403 (state: ACCEPTED)
> 2017-02-14T19:03:10,817  WARN [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] 
> spark.SetSparkReducerParallelism: Failed to get spark memory/core info
> java.util.concurrent.TimeoutException
> at 
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) 
> ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:155)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:165)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getMemoryAndCores(SparkSessionImpl.java:77)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:119)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runJoinOptimizations(SparkCompiler.java:291)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:120)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11085)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:279)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:510) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1302) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1442) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1222) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) 
> ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) 
> ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> 

[jira] [Commented] (HIVE-15904) select query throwing Null Pointer Exception from org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan

2017-02-14 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867193#comment-15867193
 ] 

Deepak Jaiswal commented on HIVE-15904:
---

Apache rb link,

https://reviews.apache.org/r/56695/

> select query throwing Null Pointer Exception from 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan
> --
>
> Key: HIVE-15904
> URL: https://issues.apache.org/jira/browse/HIVE-15904
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15904.1.patch, HIVE-15904.2.patch, 
> HIVE-15904.3.patch, table_18.q, table_1.q
>
>
> The following query fails with a NullPointerException from 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan
> Attaching create table statements for table_1 and table_18
> Query:
> SELECT
> COALESCE(498, LEAD(COALESCE(-973, -684, 515)) OVER (PARTITION BY 
> (t2.int_col_10 + t1.smallint_col_50) ORDER BY (t2.int_col_10 + 
> t1.smallint_col_50), FLOOR(t1.double_col_16) DESC), 524) AS int_col,
> (t2.int_col_10) + (t1.smallint_col_50) AS int_col_1,
> FLOOR(t1.double_col_16) AS float_col,
> COALESCE(SUM(COALESCE(62, -380, -435)) OVER (PARTITION BY (t2.int_col_10 + 
> t1.smallint_col_50) ORDER BY (t2.int_col_10 + t1.smallint_col_50) DESC, 
> FLOOR(t1.double_col_16) DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 48 
> FOLLOWING), 704) AS int_col_2
> FROM table_1 t1
> INNER JOIN table_18 t2 ON (((t2.tinyint_col_15) = (t1.bigint_col_7)) AND
> ((t2.decimal2709_col_9) = (t1.decimal2016_col_26))) AND
> ((t2.tinyint_col_20) = (t1.tinyint_col_3))
> WHERE (t2.smallint_col_19) IN (SELECT
> COALESCE(-92, -994) AS int_col
> FROM table_1 tt1
> INNER JOIN table_18 tt2 ON (tt2.decimal1911_col_16) = (tt1.decimal2612_col_77)
> WHERE (t1.timestamp_col_9) = (tt2.timestamp_col_18));
> Error Stack:
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: NullPointerException null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:387)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:193)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:276)
>  
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:324) 
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:507)
>  
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:495)
>  
> at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:308)
>  
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:506)
>  
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
>  
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
>  
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:599)
>  
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan(DynamicPartitionPruningOptimization.java:402)
>  
> at 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.process(DynamicPartitionPruningOptimization.java:226)
>  
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>  
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>  
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>  
> at 
> org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:74) 
> at 
> 

[jira] [Commented] (HIVE-3040) Hive Exec jar should not include Thrift

2017-02-14 Thread Andrew Muraco (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867166#comment-15867166
 ] 

Andrew Muraco commented on HIVE-3040:
-

So what's the current work-around right now?

I'm on CDH-5.5.1-1.cdh5.5.1.p0.11 (hive 1.1.0-cdh5.5.1 jars) and when I 
schedule a Hive or HiveServer2 step in oozie, it fails or succeeds depending on 
whether the hive-exec jar appears after libthrift (failure) or before it (success).


> Hive Exec jar should not include Thrift
> ---
>
> Key: HIVE-3040
> URL: https://issues.apache.org/jira/browse/HIVE-3040
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.8.1
>Reporter: Bhushan Mandhani
>Assignee: Namit Jain
>Priority: Minor
> Fix For: 0.8.1
>
> Attachments: HIVE-3040.1.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> hive-exec jar includes Thrift classes even though it does not need them. This 
> can create problems because it can load some wrong version of Thrift and 
> other  jars that need Thrift get stuck with the wrong version. We will remove 
> Thrift from this jar.
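
When jar ordering decides which copy of a class wins, it can help to ask the JVM where a class was actually loaded from. The following is a small diagnostic sketch using only standard JDK calls; the class name WhichJar and the method locationOf are hypothetical and not part of Hive. Passing a Thrift class name as the first argument is an assumption about how one might use it; that class must be on the classpath for Class.forName to succeed.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Hive or Thrift.
final class WhichJar {
    // Returns the jar or directory a class was loaded from, or a placeholder
    // when the JVM reports no code source (typical for core JDK classes).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(no code source reported)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Default to a JDK class so the sketch runs without extra jars.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
    }
}
```

Running this with the same classpath as the failing step would show whether a shared class comes from hive-exec or from libthrift.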





[jira] [Commented] (HIVE-15900) Beeline prints tez job progress in stdout instead of stderr

2017-02-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867165#comment-15867165
 ] 

Thejas M Nair commented on HIVE-15900:
--

As part of adding a test for the fix, I also refactored 
TestBeeLineWithArgs.java so that it checks for the expected strings in a 
specific output stream (stdout vs stderr).
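
The idea of checking each stream separately can be sketched as follows. This is only an illustration under stated assumptions, not the code in TestBeeLineWithArgs.java: StreamSeparationSketch, runQuery, and capture are hypothetical names, and the progress and result strings are made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical stand-in for a CLI; these are not Beeline's actual classes.
final class StreamSeparationSketch {
    // Writes progress to the error stream and query results to the output stream.
    static void runQuery(PrintStream out, PrintStream err) {
        err.println("Map 1: 0/1");      // progress message (made-up text)
        out.println("col_a\tcol_b");    // result row (made-up text)
    }

    // Runs the query against in-memory streams and returns {stdout, stderr}.
    static String[] capture() {
        ByteArrayOutputStream outBuf = new ByteArrayOutputStream();
        ByteArrayOutputStream errBuf = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(outBuf, true);
        PrintStream err = new PrintStream(errBuf, true);
        runQuery(out, err);
        out.flush();
        err.flush();
        return new String[] { outBuf.toString(), errBuf.toString() };
    }

    public static void main(String[] args) {
        String[] streams = capture();
        System.out.println("stdout has progress: " + streams[0].contains("Map 1"));
        System.out.println("stderr has progress: " + streams[1].contains("Map 1"));
    }
}
```

A test built this way can assert that a progress string appears in stderr and is absent from stdout, which is the behavior the fix is after.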


> Beeline prints tez job progress in stdout instead of stderr
> ---
>
> Key: HIVE-15900
> URL: https://issues.apache.org/jira/browse/HIVE-15900
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Thejas M Nair
> Attachments: HIVE-15900.1.patch, std_out
>
>
> Tez job progress messages are getting updated to stdout instead of stderr
> Attaching output file for below command, with the tez job status printed
> $HIVE_HOME/bin/beeline -n  -p  -u " --outputformat=tsv -e "analyze table studenttab10k compute statistics;" > 
> stdout





[jira] [Updated] (HIVE-15900) Beeline prints tez job progress in stdout instead of stderr

2017-02-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15900:
-
Attachment: HIVE-15900.1.patch

> Beeline prints tez job progress in stdout instead of stderr
> ---
>
> Key: HIVE-15900
> URL: https://issues.apache.org/jira/browse/HIVE-15900
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Thejas M Nair
> Attachments: HIVE-15900.1.patch, std_out
>
>
> Tez job progress messages are getting updated to stdout instead of stderr
> Attaching output file for below command, with the tez job status printed
> $HIVE_HOME/bin/beeline -n  -p  -u " --outputformat=tsv -e "analyze table studenttab10k compute statistics;" > 
> stdout





[jira] [Assigned] (HIVE-15900) Beeline prints tez job progress in stdout instead of stderr

2017-02-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reassigned HIVE-15900:


Assignee: Thejas M Nair

> Beeline prints tez job progress in stdout instead of stderr
> ---
>
> Key: HIVE-15900
> URL: https://issues.apache.org/jira/browse/HIVE-15900
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Thejas M Nair
> Attachments: std_out
>
>
> Tez job progress messages are getting updated to stdout instead of stderr
> Attaching output file for below command, with the tez job status printed
> $HIVE_HOME/bin/beeline -n  -p  -u " --outputformat=tsv -e "analyze table studenttab10k compute statistics;" > 
> stdout





[jira] [Commented] (HIVE-15900) Beeline prints tez job progress in stdout instead of stderr

2017-02-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867159#comment-15867159
 ] 

Thejas M Nair commented on HIVE-15900:
--

[~daijy] [~anishek] Can you please review?


> Beeline prints tez job progress in stdout instead of stderr
> ---
>
> Key: HIVE-15900
> URL: https://issues.apache.org/jira/browse/HIVE-15900
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aswathy Chellammal Sreekumar
> Attachments: std_out
>
>
> Tez job progress messages are getting updated to stdout instead of stderr
> Attaching output file for below command, with the tez job status printed
> $HIVE_HOME/bin/beeline -n  -p  -u " --outputformat=tsv -e "analyze table studenttab10k compute statistics;" > 
> stdout



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15900) Beeline prints tez job progress in stdout instead of stderr

2017-02-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867158#comment-15867158
 ] 

ASF GitHub Bot commented on HIVE-15900:
---

GitHub user thejasmn opened a pull request:

https://github.com/apache/hive/pull/148

HIVE-15900



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thejasmn/hive HIVE-15900

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/148.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #148


commit 73a2b4be8e30d3dfa8c5f974540f6c8a43da4327
Author: Thejas M Nair 
Date:   2017-02-14T06:43:50Z

refactor tests to specifically test stdout/stderr

commit 45f7b9347fedf9b5f0c213c493f0a4d2e729e5c9
Author: Thejas M Nair 
Date:   2017-02-14T19:45:45Z

improve minihs2 tez type usage

commit 06751edf48c2c81f04a479ad0f000c5d6b370d32
Author: Thejas M Nair 
Date:   2017-02-15T03:38:12Z

the beeline fix and test updates




> Beeline prints tez job progress in stdout instead of stderr
> ---
>
> Key: HIVE-15900
> URL: https://issues.apache.org/jira/browse/HIVE-15900
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aswathy Chellammal Sreekumar
> Attachments: std_out
>
>
> Tez job progress messages are getting updated to stdout instead of stderr
> Attaching output file for below command, with the tez job status printed
> $HIVE_HOME/bin/beeline -n  -p  -u " --outputformat=tsv -e "analyze table studenttab10k compute statistics;" > 
> stdout





[jira] [Commented] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867157#comment-15867157
 ] 

Hive QA commented on HIVE-15882:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852678/HIVE-15882.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3554/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3554/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3554/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852678 - PreCommit-HIVE-Build

> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-15882.01.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be greatly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we may likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are heavily duplicated, since for each partition each 
> concurrently running query creates its own copy of Partition, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.
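
The interning/deduplication suggested in points 1 and 3 can be sketched as follows. This is an illustration only, not the change in HIVE-15882.01.patch: PropertiesInterner and its methods are hypothetical names. It assumes that Properties objects are not modified after being interned (mutating a map key would break lookups) and it ignores cache eviction, which a real implementation would need to consider.

```java
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical deduplication helper; not taken from the Hive code base.
final class PropertiesInterner {
    // Canonical instances, keyed by content equality of the Properties object.
    private static final ConcurrentMap<Properties, Properties> CACHE =
        new ConcurrentHashMap<>();

    // Returns one shared instance for all Properties with equal contents.
    // Assumes callers never mutate a Properties object after interning it.
    static Properties intern(Properties props) {
        Properties existing = CACHE.putIfAbsent(props, props);
        return existing != null ? existing : props;
    }

    // Deduplicates a string through the JVM string pool (null-safe).
    static String internString(String s) {
        return s == null ? null : s.intern();
    }

    public static void main(String[] args) {
        Properties a = new Properties();
        a.setProperty("serialization.format", "1");
        Properties b = new Properties();
        b.setProperty("serialization.format", "1");
        // Equal contents resolve to the same canonical object.
        System.out.println(intern(a) == intern(b));
    }
}
```

With 50 queries each holding a copy per partition, sharing one canonical instance per distinct content is where the estimated saving would come from.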





[jira] [Commented] (HIVE-15886) LLAP: Provide logs URL for in-progress and completed task attemtps

2017-02-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867153#comment-15867153
 ] 

Siddharth Seth commented on HIVE-15886:
---

+1. Ideally we should check for nulls during taskAttempt to containerId 
conversion. However, Tez does not fail queries if there is an exception while 
constructing the log URL.

> LLAP: Provide logs URL for in-progress and completed task attemtps
> --
>
> Key: HIVE-15886
> URL: https://issues.apache.org/jira/browse/HIVE-15886
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15886.1.patch
>
>
> YARN provides a webservice to access logs with YARN-6011. This can be used to 
> populate the in-progress and completed task attempts logs. 





[jira] [Commented] (HIVE-15887) could not get APP ID and cause failed to connect to spark driver on yarn-client mode

2017-02-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867150#comment-15867150
 ] 

Rui Li commented on HIVE-15887:
---

The symptom is quite similar to that of HIVE-12650. Is this a busy cluster?

> could not get APP ID and cause failed to connect to spark driver on 
> yarn-client mode
> 
>
> Key: HIVE-15887
> URL: https://issues.apache.org/jira/browse/HIVE-15887
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: Hive2.2
> Spark2.0.2
> hadoop2.7.1
>Reporter: KaiXu
>
> When I run Hive queries on Spark, I get the error below in the console. After 
> checking the container's log, I found that it failed to connect to the Spark 
> driver. I have set hive.spark.job.monitor.timeout=3600s, so the log said 'Job 
> hasn't been submitted after 3601s'. It is implausible that no resources were 
> available during such a long period, and I did not see any network-related 
> issues, so the cause is not clear from the message "Possible reasons include 
> network issues, errors in remote driver or the cluster has no available 
> resources, etc.".
> From Hive's log, it failed to get the APP ID, so this might be why the 
> driver did not start up.
> console log:
> Starting Spark Job = e9ce42c8-ff20-4ac8-803f-7668678c2a00
> Job hasn't been submitted after 3601s. Aborting it.
> Possible reasons include network issues, errors in remote driver or the 
> cluster has no available resources, etc.
> Please check YARN or Spark driver's logs for further information.
> Status: SENT
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> container's log:
> 17/02/13 05:05:54 INFO yarn.ApplicationMaster: Preparing Local resources
> 17/02/13 05:05:54 INFO yarn.ApplicationMaster: Prepared Local resources 
> Map(__spark_libs__ -> resource { scheme: "hdfs" host: "hsx-node1" port: 8020 
> file: 
> "/user/root/.sparkStaging/application_1486905599813_0046/__spark_libs__6842484649003444330.zip"
>  } size: 153484072 timestamp: 1486926551130 type: ARCHIVE visibility: 
> PRIVATE, __spark_conf__ -> resource { scheme: "hdfs" host: "hsx-node1" port: 
> 8020 file: 
> "/user/root/.sparkStaging/application_1486905599813_0046/__spark_conf__.zip" 
> } size: 116245 timestamp: 1486926551318 type: ARCHIVE visibility: PRIVATE)
> 17/02/13 05:05:54 INFO yarn.ApplicationMaster: ApplicationAttemptId: 
> appattempt_1486905599813_0046_02
> 17/02/13 05:05:54 INFO spark.SecurityManager: Changing view acls to: root
> 17/02/13 05:05:54 INFO spark.SecurityManager: Changing modify acls to: root
> 17/02/13 05:05:54 INFO spark.SecurityManager: Changing view acls groups to: 
> 17/02/13 05:05:54 INFO spark.SecurityManager: Changing modify acls groups to: 
> 17/02/13 05:05:54 INFO spark.SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users  with view permissions: Set(root); groups 
> with view permissions: Set(); users  with modify permissions: Set(root); 
> groups with modify permissions: Set()
> 17/02/13 05:05:54 INFO yarn.ApplicationMaster: Waiting for Spark driver to be 
> reachable.
> 17/02/13 05:05:54 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:54 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:54 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:55 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 05:05:56 ERROR yarn.ApplicationMaster: Failed to connect to driver 
> at 192.168.1.1:43656, retrying ...
> 17/02/13 

[jira] [Commented] (HIVE-15921) Re-order the slider stop command to avoid a force if possible

2017-02-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867140#comment-15867140
 ] 

Siddharth Seth commented on HIVE-15921:
---

I believe stop is not synchronous. Not sure why it's always worked. cc [~gsaha]

Will change the last timeout to 30.


> Re-order the slider stop command to avoid a force if possible
> -
>
> Key: HIVE-15921
> URL: https://issues.apache.org/jira/browse/HIVE-15921
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15921.01.patch, HIVE-15921.02.patch
>
>
> A graceful stop is required for slider --service llapstatus to work properly





[jira] [Updated] (HIVE-15921) Re-order the slider stop command to avoid a force if possible

2017-02-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15921:
--
Attachment: HIVE-15921.02.patch

> Re-order the slider stop command to avoid a force if possible
> -
>
> Key: HIVE-15921
> URL: https://issues.apache.org/jira/browse/HIVE-15921
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15921.01.patch, HIVE-15921.02.patch
>
>
> A graceful stop is required for slider --service llapstatus to work properly





[jira] [Commented] (HIVE-15917) incorrect error handling from BackgroundWork can cause beeline query to hang

2017-02-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867134#comment-15867134
 ] 

Siddharth Seth commented on HIVE-15917:
---

I'm not sure when this gets invoked in Hive. For example, if a query is 
submitted and there's no session available, will this get invoked up front, or 
only after a session is obtained?
If it is after a session is obtained, then this mostly looks good.
You may want to cap timeOutMs at some value instead of shifting it each time 
(it will go negative at some point, and reach a really large value before 
that). This is in case someone decides to change the timeout in the future.
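The capping suggestion can be sketched as below. This is only an illustration of the idea, not the code in the patch: the class CappedTimeout, the method nextTimeoutMs, and the maxMs parameter are hypothetical, and the sketch assumes the timeout is a positive long that is doubled on each retry. A positive long that is shifted left on every retry eventually has its sign bit set and becomes negative, which the cap avoids.

```java
// Hypothetical helper, for illustration only; not part of Hive.
final class CappedTimeout {
    // Doubles currentMs but never exceeds maxMs and never overflows.
    static long nextTimeoutMs(long currentMs, long maxMs) {
        if (currentMs <= 0 || maxMs <= 0) {
            throw new IllegalArgumentException("timeouts must be positive");
        }
        // If doubling would exceed the cap, return the cap. Comparing
        // against maxMs / 2 avoids computing an overflowing value.
        if (currentMs > maxMs / 2) {
            return maxMs;
        }
        return currentMs << 1;
    }
}
```

For example, starting from 1000 ms with a 60000 ms cap, successive calls would yield 2000, 4000, and so on until the value settles at 60000 and stays there.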

> incorrect error handling from BackgroundWork can cause beeline query to hang
> 
>
> Key: HIVE-15917
> URL: https://issues.apache.org/jira/browse/HIVE-15917
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15917.01.patch, HIVE-15917.patch
>
>






[jira] [Commented] (HIVE-11208) Can not drop a default partition __HIVE_DEFAULT_PARTITION__ which is not a "string" type

2017-02-14 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867125#comment-15867125
 ] 

Aihua Xu commented on HIVE-11208:
-

[~sershe] Yeah. That seems to be a problem. Please go ahead and revert the 
patch; we can rethink the approach for this JIRA so that we have a clean 
implementation.

> Can not drop a default partition __HIVE_DEFAULT_PARTITION__ which is not a 
> "string" type
> 
>
> Key: HIVE-11208
> URL: https://issues.apache.org/jira/browse/HIVE-11208
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0
>Reporter: Yongzhi Chen
>Assignee: Aihua Xu
> Fix For: 2.2.0
>
> Attachments: HIVE-11208.2.patch, HIVE-11208.3.patch
>
>
> When the partition column is not a string type (for example, an int type), 
> dropping the default partition __HIVE_DEFAULT_PARTITION__ fails with:
> SemanticException Unexpected unknown partitions
> Reproduce:
> {noformat}
> SET hive.exec.dynamic.partition=true;
> SET hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions.pernode=1;
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (col1 string) PARTITIONED BY (p1 int) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '\001' STORED AS TEXTFILE;
> INSERT OVERWRITE TABLE test PARTITION (p1) SELECT code, IF(salary > 600, 100, 
> null) as p1 FROM jsmall;
> hive> SHOW PARTITIONS test;
> OK
> p1=100
> p1=__HIVE_DEFAULT_PARTITION__
> Time taken: 0.124 seconds, Fetched: 2 row(s)
> hive> ALTER TABLE test DROP partition (p1 = '__HIVE_DEFAULT_PARTITION__');
> FAILED: SemanticException Unexpected unknown partitions for (p1 = null)
> {noformat}





[jira] [Commented] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867122#comment-15867122
 ] 

Hive QA commented on HIVE-6590:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852671/HIVE-6590.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10240 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_boolean] 
(batchId=21)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_primitive]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_primitive]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_primitive]
 (batchId=150)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3553/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3553/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3553/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852671 - PreCommit-HIVE-Build

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>Assignee: Zoltan Haindrich
> Attachments: HIVE-6590.1.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_part(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the bool partition key column hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc.) mapping to the literal value 'FALSE'. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/
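One way to picture a remedy for the second problem is to canonicalize the boolean literal before it is used in the partition name, so that every spelling maps to the same directory. The sketch below is an assumption about what such normalization could look like, not the approach taken in HIVE-6590.1.patch; the class BooleanPartitionValue and both of its methods are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Hive.
final class BooleanPartitionValue {
    // Maps any casing of "true"/"false" to its lowercase canonical form
    // and rejects anything else instead of silently accepting it.
    static String canonicalize(String literal) {
        String v = literal.trim().toLowerCase(Locale.ROOT);
        if (v.equals("true") || v.equals("false")) {
            return v;
        }
        throw new IllegalArgumentException("Not a boolean literal: " + literal);
    }

    // Builds the partition directory component, e.g. "bool_col=false".
    static String partitionDir(String column, String literal) {
        return column + "=" + canonicalize(literal);
    }
}
```

Under this sketch, FALSE, false, and False would all resolve to the single directory component bool_col=false rather than three separate ones.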





[jira] [Updated] (HIVE-15923) Hive default partition causes errors in get partitions

2017-02-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15923:

Attachment: HIVE-15923.patch

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.2.0
>
> Attachments: HIVE-15923.patch
>
>
> This is the ORM error, direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Updated] (HIVE-15923) Hive default partition causes errors in get partitions

2017-02-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15923:

Attachment: (was: HIVE-15923.patch)

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.2.0
>
>
> This is the ORM error, direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Updated] (HIVE-15923) Hive default partition causes errors in get partitions

2017-02-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15923:

Status: Patch Available  (was: Open)

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.2.0
>
>
> This is the ORM error, direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Commented] (HIVE-11208) Can not drop a default partition __HIVE_DEFAULT_PARTITION__ which is not a "string" type

2017-02-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867098#comment-15867098
 ] 

Sergey Shelukhin commented on HIVE-11208:
-

See HIVE-15923; I think it would be better to revert this patch, otherwise I'll 
fix it there. A test is attached there. The patch breaks any filter on partition 
columns that have the default partition and that uses an operator other than =/!=.
The fundamental problem is that exprs are not supposed to be exposed to UDFs; 
exprs should be evaluated to comparable Hive data objects.

> Can not drop a default partition __HIVE_DEFAULT_PARTITION__ which is not a 
> "string" type
> 
>
> Key: HIVE-11208
> URL: https://issues.apache.org/jira/browse/HIVE-11208
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0
>Reporter: Yongzhi Chen
>Assignee: Aihua Xu
> Fix For: 2.2.0
>
> Attachments: HIVE-11208.2.patch, HIVE-11208.3.patch
>
>
> When the partition column is not a string type (for example, an int type), 
> dropping the default partition __HIVE_DEFAULT_PARTITION__ fails with:
> SemanticException Unexpected unknown partitions
> Reproduce:
> {noformat}
> SET hive.exec.dynamic.partition=true;
> SET hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions.pernode=1;
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (col1 string) PARTITIONED BY (p1 int) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '\001' STORED AS TEXTFILE;
> INSERT OVERWRITE TABLE test PARTITION (p1) SELECT code, IF(salary > 600, 100, 
> null) as p1 FROM jsmall;
> hive> SHOW PARTITIONS test;
> OK
> p1=100
> p1=__HIVE_DEFAULT_PARTITION__
> Time taken: 0.124 seconds, Fetched: 2 row(s)
> hive> ALTER TABLE test DROP partition (p1 = '__HIVE_DEFAULT_PARTITION__');
> FAILED: SemanticException Unexpected unknown partitions for (p1 = null)
> {noformat}





[jira] [Updated] (HIVE-15923) Hive default partition causes errors in get partitions

2017-02-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15923:

Attachment: HIVE-15923.patch

Preliminary patch that fixes the main use case.
The approach in HIVE-11208 is fundamentally broken; it mixes levels by using 
exprs where exprs should already have been evaluated.
I am going to look into how to do it properly, probably by supporting is-null 
in the parser if that's simple, because that would be the minimal code change 
from existing master.
It may be better to just remove the constant thing and make a drop-specific 
change (UDF modes?).

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.2.0
>
> Attachments: HIVE-15923.patch
>
>
> This is the ORM error, direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15904) select query throwing Null Pointer Exception from org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867086#comment-15867086
 ] 

Hive QA commented on HIVE-15904:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852669/HIVE-15904.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3552/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3552/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3552/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852669 - PreCommit-HIVE-Build

> select query throwing Null Pointer Exception from 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan
> --
>
> Key: HIVE-15904
> URL: https://issues.apache.org/jira/browse/HIVE-15904
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15904.1.patch, HIVE-15904.2.patch, 
> HIVE-15904.3.patch, table_18.q, table_1.q
>
>
> The following query fails with a NullPointerException from 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan
> The create table statements for table_1 and table_18 are attached.
> Query:
> SELECT
> COALESCE(498, LEAD(COALESCE(-973, -684, 515)) OVER (PARTITION BY 
> (t2.int_col_10 + t1.smallint_col_50) ORDER BY (t2.int_col_10 + 
> t1.smallint_col_50), FLOOR(t1.double_col_16) DESC), 524) AS int_col,
> (t2.int_col_10) + (t1.smallint_col_50) AS int_col_1,
> FLOOR(t1.double_col_16) AS float_col,
> COALESCE(SUM(COALESCE(62, -380, -435)) OVER (PARTITION BY (t2.int_col_10 + 
> t1.smallint_col_50) ORDER BY (t2.int_col_10 + t1.smallint_col_50) DESC, 
> FLOOR(t1.double_col_16) DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 48 
> FOLLOWING), 704) AS int_col_2
> FROM table_1 t1
> INNER JOIN table_18 t2 ON (((t2.tinyint_col_15) = (t1.bigint_col_7)) AND
> ((t2.decimal2709_col_9) = (t1.decimal2016_col_26))) AND
> ((t2.tinyint_col_20) = (t1.tinyint_col_3))
> WHERE (t2.smallint_col_19) IN (SELECT
> COALESCE(-92, -994) AS int_col
> FROM table_1 tt1
> INNER JOIN table_18 tt2 ON (tt2.decimal1911_col_16) = (tt1.decimal2612_col_77)
> WHERE (t1.timestamp_col_9) = (tt2.timestamp_col_18));
> Error Stack:
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: NullPointerException null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:387)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:193)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:276)
>  
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:324) 
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:507)
>  
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:495)
>  
> at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:308)
>  
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:506)
>  
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
>  
> at 
> 

[jira] [Updated] (HIVE-15891) Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast

2017-02-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15891:
-
Attachment: HIVE-15891.2.patch

> Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast
> ---
>
> Key: HIVE-15891
> URL: https://issues.apache.org/jira/browse/HIVE-15891
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15891.1.patch, HIVE-15891.2.patch
>
>
> Currently ACID UpdateDeleteSemanticAnalyzer directly manipulates the AST tree 
> but it's different from the general approach of modifying the token stream 
> and thus will cause AST tree mismatch if there is any rewrite happening after 
> UpdateDeleteSemanticAnalyzer.
> The long term solution will be to rewrite the AST handling logic in 
> UpdateDeleteSemanticAnalyzer, to make it consistent with the general approach.
> For now, this ticket will detect the error-prone cases and fail early.





[jira] [Updated] (HIVE-15891) Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast

2017-02-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15891:
-
Status: Open  (was: Patch Available)

> Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast
> ---
>
> Key: HIVE-15891
> URL: https://issues.apache.org/jira/browse/HIVE-15891
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15891.1.patch, HIVE-15891.2.patch
>
>
> Currently ACID UpdateDeleteSemanticAnalyzer directly manipulates the AST tree 
> but it's different from the general approach of modifying the token stream 
> and thus will cause AST tree mismatch if there is any rewrite happening after 
> UpdateDeleteSemanticAnalyzer.
> The long term solution will be to rewrite the AST handling logic in 
> UpdateDeleteSemanticAnalyzer, to make it consistent with the general approach.
> For now, this ticket will detect the error-prone cases and fail early.





[jira] [Updated] (HIVE-15891) Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast

2017-02-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15891:
-
Status: Patch Available  (was: Open)

> Detect query rewrite scenario for UPDATE/DELETE/MERGE and fail fast
> ---
>
> Key: HIVE-15891
> URL: https://issues.apache.org/jira/browse/HIVE-15891
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15891.1.patch, HIVE-15891.2.patch
>
>
> Currently ACID UpdateDeleteSemanticAnalyzer directly manipulates the AST tree 
> but it's different from the general approach of modifying the token stream 
> and thus will cause AST tree mismatch if there is any rewrite happening after 
> UpdateDeleteSemanticAnalyzer.
> The long term solution will be to rewrite the AST handling logic in 
> UpdateDeleteSemanticAnalyzer, to make it consistent with the general approach.
> For now, this ticket will detect the error-prone cases and fail early.





[jira] [Assigned] (HIVE-15923) Hive default partition causes errors in get partitions

2017-02-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-15923:
---

Assignee: Sergey Shelukhin

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.2.0
>
>
> This is the ORM error; direct SQL also fails before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Updated] (HIVE-15905) Inefficient plan for correlated subqueries

2017-02-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15905:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Vineet!

> Inefficient plan for correlated subqueries
> --
>
> Key: HIVE-15905
> URL: https://issues.apache.org/jira/browse/HIVE-15905
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 2.2.0
>
> Attachments: HIVE-15905.1.patch, HIVE-15905.2.patch
>
>
> Currently Calcite produces an unnecessary join to generate correlated values 
> for the inner query. More details are at CALCITE-1494.





[jira] [Commented] (HIVE-15892) Vectorization: Fast Hash tables need to do bounds checking during expand

2017-02-14 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867032#comment-15867032
 ] 

Jason Dere commented on HIVE-15892:
---

Looks like this throws a more user-friendly error message - but this condition 
will still fail the query? Is there a way to get the query to not fail? Or is 
it that this table should not have been selected for hash join in the first 
place?

> Vectorization: Fast Hash tables need to do bounds checking during expand
> 
>
> Key: HIVE-15892
> URL: https://issues.apache.org/jira/browse/HIVE-15892
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15892.01.patch, HIVE-15892.02.patch
>
>
> VectorMapJoinFastLongHashTable line 165 gets NegativeArraySizeException:
> {code}
> long[] newSlotPairs = new long[newSlotPairArraySize];
> {code}
> We need to add a size check... Java math for this wrapped around to negative:
> {code}
> int newSlotPairArraySize = newLogicalHashBucketCount * 2;
> {code}
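
For illustration only, here is a minimal sketch of the kind of size check the description asks for. The class and method names are hypothetical and this is not the actual VectorMapJoinFastLongHashTable code or the attached patch; it only shows that Math.multiplyExact can turn the silent wrap-around into an explicit failure.

```java
// Hypothetical helper, not Hive code: the variable name mirrors the quoted snippet.
final class SlotPairSizing {
    private SlotPairSizing() {}

    /**
     * Returns newLogicalHashBucketCount * 2, or throws instead of letting
     * int arithmetic wrap around to a negative array size.
     */
    static int newSlotPairArraySize(int newLogicalHashBucketCount) {
        if (newLogicalHashBucketCount < 0) {
            throw new IllegalArgumentException("bucket count must be non-negative");
        }
        try {
            // multiplyExact throws ArithmeticException on int overflow.
            return Math.multiplyExact(newLogicalHashBucketCount, 2);
        } catch (ArithmeticException e) {
            throw new IllegalStateException(
                "hash table cannot expand beyond " + newLogicalHashBucketCount + " buckets", e);
        }
    }
}
```

With a check like this, the caller sees a descriptive exception at the point of expansion rather than a NegativeArraySizeException from the array allocation.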





[jira] [Commented] (HIVE-15918) Add some debug messages to identify an issue getting runtimeInfo from tez

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867028#comment-15867028
 ] 

Hive QA commented on HIVE-15918:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852666/HIVE-15918.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10224 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3551/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3551/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3551/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852666 - PreCommit-HIVE-Build

> Add some debug messages to identify an issue getting runtimeInfo from tez
> -
>
> Key: HIVE-15918
> URL: https://issues.apache.org/jira/browse/HIVE-15918
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15918.01.patch
>
>






[jira] [Commented] (HIVE-15905) Inefficient plan for correlated subqueries

2017-02-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867021#comment-15867021
 ] 

Ashutosh Chauhan commented on HIVE-15905:
-

+1

> Inefficient plan for correlated subqueries
> --
>
> Key: HIVE-15905
> URL: https://issues.apache.org/jira/browse/HIVE-15905
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15905.1.patch, HIVE-15905.2.patch
>
>
> Currently Calcite produces an unnecessary join to generate correlated values 
> for the inner query. More details are at CALCITE-1494.





[jira] [Commented] (HIVE-11208) Can not drop a default partition __HIVE_DEFAULT_PARTITION__ which is not a "string" type

2017-02-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867020#comment-15867020
 ] 

Sergey Shelukhin commented on HIVE-11208:
-

Why does this add special handling to only 2 UDFs? In fact, why should special 
handling be necessary? It looks like the evaluators are invalid; see HIVE-15923, 
where it fails in a different UDF.
Evaluators should instead return null for this case; otherwise, every UDF would 
need this magic.
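
To make the suggestion concrete, a rough sketch of "return null for this case" might look like the following. Everything here is hypothetical (class, method, and the idea that the raw partition value arrives as a string); it is not taken from the Hive evaluators or from any patch, and only illustrates mapping the default partition marker to null before any type-specific conversion happens.

```java
// Hypothetical sketch, not Hive code: treat the default partition marker as SQL NULL
// in one place, so individual UDFs never see a value of the wrong type.
final class DefaultPartitionSketch {
    static final String DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__";

    private DefaultPartitionSketch() {}

    /** Converts a raw partition value to Long, returning null for the default partition. */
    static Long partitionValueAsLong(String raw) {
        if (raw == null || DEFAULT_PARTITION_NAME.equals(raw)) {
            return null;
        }
        return Long.valueOf(raw);
    }
}
```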

> Can not drop a default partition __HIVE_DEFAULT_PARTITION__ which is not a 
> "string" type
> 
>
> Key: HIVE-11208
> URL: https://issues.apache.org/jira/browse/HIVE-11208
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0
>Reporter: Yongzhi Chen
>Assignee: Aihua Xu
> Fix For: 2.2.0
>
> Attachments: HIVE-11208.2.patch, HIVE-11208.3.patch
>
>
> When the partition column is not a string type (for example, an int), dropping 
> the default partition __HIVE_DEFAULT_PARTITION__ fails with:
> SemanticException Unexpected unknown partitions
> Reproduce:
> {noformat}
> SET hive.exec.dynamic.partition=true;
> SET hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions.pernode=1;
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (col1 string) PARTITIONED BY (p1 int) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY '\001' STORED AS TEXTFILE;
> INSERT OVERWRITE TABLE test PARTITION (p1) SELECT code, IF(salary > 600, 100, 
> null) as p1 FROM jsmall;
> hive> SHOW PARTITIONS test;
> OK
> p1=100
> p1=__HIVE_DEFAULT_PARTITION__
> Time taken: 0.124 seconds, Fetched: 2 row(s)
> hive> ALTER TABLE test DROP partition (p1 = '__HIVE_DEFAULT_PARTITION__');
> FAILED: SemanticException Unexpected unknown partitions for (p1 = null)
> {noformat}





[jira] [Commented] (HIVE-15489) Alternatively use table scan stats for HoS

2017-02-14 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866986#comment-15866986
 ] 

Chao Sun commented on HIVE-15489:
-

[~xuefuz] can you take another look at this :) ?

> Alternatively use table scan stats for HoS
> --
>
> Key: HIVE-15489
> URL: https://issues.apache.org/jira/browse/HIVE-15489
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark, Statistics
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-15489.1.patch, HIVE-15489.2.patch, 
> HIVE-15489.3.patch, HIVE-15489.6.patch, HIVE-15489.wip.patch
>
>
> For MapJoin in HoS, we should provide an option to use only the stats in the TS 
> rather than the populated stats in each of the join branches. This could be 
> pretty conservative but more reliable.





[jira] [Commented] (HIVE-15917) incorrect error handling from BackgroundWork can cause beeline query to hang

2017-02-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866981#comment-15866981
 ] 

Sergey Shelukhin commented on HIVE-15917:
-

Test failures are unrelated... Spark tests have some RPC issues.

> incorrect error handling from BackgroundWork can cause beeline query to hang
> 
>
> Key: HIVE-15917
> URL: https://issues.apache.org/jira/browse/HIVE-15917
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15917.01.patch, HIVE-15917.patch
>
>






[jira] [Commented] (HIVE-13533) Remove AST dump

2017-02-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866977#comment-15866977
 ] 

Sergey Shelukhin commented on HIVE-13533:
-

AST dump is useful in some cases... perhaps there should be a separate explain 
that would output it?

> Remove AST dump
> ---
>
> Key: HIVE-13533
> URL: https://issues.apache.org/jira/browse/HIVE-13533
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.1.0
>
> Attachments: HIVE-13533.patch, HIVE-13533.patch
>
>
> For very large queries, dumping the AST can lead to OOM errors. Currently 
> there are two places where we dump the AST:
> - CalcitePlanner if we are running in DEBUG mode (line 300).
> - ExplainTask if we use extended explain (line 179).
> I guess the original reason to add the dump was to check whether the AST 
> conversion from CBO was working properly, but I think we are past that stage 
> now.
> We will remove the logic to dump the AST in explain extended. For debug mode 
> in CalcitePlanner, we will lower the level to LOG.TRACE.





[jira] [Commented] (HIVE-15916) Add blobstore tests for CTAS

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866968#comment-15866968
 ] 

Hive QA commented on HIVE-15916:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852655/HIVE-15916.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10240 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3550/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3550/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3550/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852655 - PreCommit-HIVE-Build

> Add blobstore tests for CTAS
> 
>
> Key: HIVE-15916
> URL: https://issues.apache.org/jira/browse/HIVE-15916
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Affects Versions: 2.1.1
>Reporter: Juan Rodríguez Hortalá
>Assignee: Juan Rodríguez Hortalá
> Attachments: HIVE-15916.patch
>
>
> This patch covers 3 tests checking CTAS operations against blobstore 
> filesystems. The tests check we can create a table with a CTAS statement from 
> another table, for the source-target combinations blobstore-blobstore, 
> blobstore-hdfs, hdfs-blobstore, and for two target tables, one in the same 
> default database as the source, and another in a new database.





[jira] [Commented] (HIVE-15909) HiveMetaStoreChecker::checkTable can be expensive in ObjectStores

2017-02-14 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866942#comment-15866942
 ] 

Vihang Karajgaonkar commented on HIVE-15909:


Thanks [~rajesh.balamohan]

> HiveMetaStoreChecker::checkTable can be expensive in ObjectStores
> -
>
> Key: HIVE-15909
> URL: https://issues.apache.org/jira/browse/HIVE-15909
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> With object stores like S3, HiveMetaStoreChecker::checkTable can be expensive 
> for partitioned datasets.





[jira] [Assigned] (HIVE-15909) HiveMetaStoreChecker::checkTable can be expensive in ObjectStores

2017-02-14 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-15909:
--

Assignee: Vihang Karajgaonkar

> HiveMetaStoreChecker::checkTable can be expensive in ObjectStores
> -
>
> Key: HIVE-15909
> URL: https://issues.apache.org/jira/browse/HIVE-15909
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> With object stores like S3, HiveMetaStoreChecker::checkTable can be expensive 
> for partitioned datasets.





[jira] [Commented] (HIVE-15924) move ORC PPD failure message caused by a dynamic value to DEBUG level

2017-02-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866918#comment-15866918
 ] 

Sergey Shelukhin commented on HIVE-15924:
-

+1

> move ORC PPD failure message caused by a dynamic value to DEBUG level
> -
>
> Key: HIVE-15924
> URL: https://issues.apache.org/jira/browse/HIVE-15924
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15924.1.patch
>
>
> Several WARN messages like the ones below are observed when running LLAP with 
> the default configuration:
> {code}
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-3 
> (1484282558103_6753_2_05_57_0)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_13_item_ss_item_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-3 
> (1484282558103_6753_2_05_57_0)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_13_item_ss_item_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> {code}
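
As a rough illustration of what the title proposes, the sketch below logs the expected failure at debug level and falls back to reading the data without predicate pushdown. It is an assumption-laden stand-in, not the RecordReaderImpl code or the attached patch: the class, the RowGroupPredicate interface, and the use of java.util.logging (with FINE standing in for DEBUG) are all hypothetical choices made to keep the example self-contained.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch, not ORC/Hive code.
final class PpdFallbackSketch {
    private static final Logger LOG = Logger.getLogger(PpdFallbackSketch.class.getName());

    private PpdFallbackSketch() {}

    /** Stand-in for a predicate whose evaluation may depend on a dynamic value. */
    interface RowGroupPredicate {
        boolean mayContainMatches();
    }

    /** Returns whether the row group should be read; never fails the query on a missing dynamic value. */
    static boolean shouldReadRowGroup(RowGroupPredicate predicate) {
        try {
            return predicate.mayContainMatches();
        } catch (IllegalStateException e) {
            // Expected when the dynamic value is not available: log quietly (FINE ~ DEBUG)
            // and skip pruning, which means the row group is read.
            LOG.log(Level.FINE, "Skipping PPD: {0}", e.getMessage());
            return true;
        }
    }
}
```

The design point is that skipping pushdown is a safe fallback (only an optimization is lost), which is why a WARN per row group is noisier than the situation deserves.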





[jira] [Commented] (HIVE-15909) HiveMetaStoreChecker::checkTable can be expensive in ObjectStores

2017-02-14 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866913#comment-15866913
 ] 

Rajesh Balamohan commented on HIVE-15909:
-

Thanks [~vihangk1]. Please feel free to assign this to yourself.

> HiveMetaStoreChecker::checkTable can be expensive in ObjectStores
> -
>
> Key: HIVE-15909
> URL: https://issues.apache.org/jira/browse/HIVE-15909
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> With object stores like S3, HiveMetaStoreChecker::checkTable can be expensive 
> for partitioned datasets.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15924) move ORC PPD failure message caused by a dynamic value to DEBUG level

2017-02-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15924:
-
Attachment: HIVE-15924.1.patch

Attaching patch for branch-2.x. 

> move ORC PPD failure message caused by a dynamic value to DEBUG level
> -
>
> Key: HIVE-15924
> URL: https://issues.apache.org/jira/browse/HIVE-15924
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15924.1.patch
>
>
> Several WARN messages like the ones below are observed when running LLAP with 
> the default configuration:
> {code}
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-3 
> (1484282558103_6753_2_05_57_0)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_13_item_ss_item_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-3 
> (1484282558103_6753_2_05_57_0)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_13_item_ss_item_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> {code}





[jira] [Assigned] (HIVE-15924) move ORC PPD failure message caused by a dynamic value to DEBUG level

2017-02-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-15924:



> move ORC PPD failure message caused by a dynamic value to DEBUG level
> -
>
> Key: HIVE-15924
> URL: https://issues.apache.org/jira/browse/HIVE-15924
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
>
> Several WARN messages like the ones below are observed when running LLAP with 
> the default configuration:
> {code}
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-3 
> (1484282558103_6753_2_05_57_0)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_13_item_ss_item_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-3 
> (1484282558103_6753_2_05_57_0)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_13_item_ss_item_sk_min StatsType: 
> Long PredicateType: null
> 2017-02-14T17:42:06,665  WARN [IO-Elevator-Thread-8 
> (1484282558103_6753_2_05_30_2)] impl.RecordReaderImpl: 
> IllegalStateException when evaluating predicate. Skipping ORC PPD. Exception: 
> Failed to retrieve dynamic value for RS_19_store_ss_store_sk_min StatsType: 
> Long PredicateType: null
> {code}





[jira] [Commented] (HIVE-15917) incorrect error handling from BackgroundWork can cause beeline query to hang

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866900#comment-15866900
 ] 

Hive QA commented on HIVE-15917:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852644/HIVE-15917.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_12]
 (batchId=109)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby3_map_multi_distinct]
 (batchId=109)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_bigdata] 
(batchId=109)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[innerjoin] 
(batchId=109)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[metadata_only_queries]
 (batchId=109)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_6] 
(batchId=109)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3549/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3549/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3549/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852644 - PreCommit-HIVE-Build

> incorrect error handling from BackgroundWork can cause beeline query to hang
> 
>
> Key: HIVE-15917
> URL: https://issues.apache.org/jira/browse/HIVE-15917
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15917.01.patch, HIVE-15917.patch
>
>






[jira] [Resolved] (HIVE-15922) SchemaEvolution must guarantee that getFileIncluded is not null

2017-02-14 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HIVE-15922.
--
   Resolution: Invalid
Fix Version/s: (was: 2.1.2)

Sorry for the noise.

> SchemaEvolution must guarantee that getFileIncluded is not null
> ---
>
> Key: HIVE-15922
> URL: https://issues.apache.org/jira/browse/HIVE-15922
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> This only impacts branch-2.1, because it is already fixed in master by 
> HIVE-14007.





[jira] [Commented] (HIVE-15909) HiveMetaStoreChecker::checkTable can be expensive in ObjectStores

2017-02-14 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866887#comment-15866887
 ] 

Vihang Karajgaonkar commented on HIVE-15909:


Hi [~rajesh.balamohan], thanks for creating this JIRA. Seems like a useful 
improvement to have. If you are not planning to take it up, I can help and take 
it up. Please let me know. Thanks!

> HiveMetaStoreChecker::checkTable can be expensive in ObjectStores
> -
>
> Key: HIVE-15909
> URL: https://issues.apache.org/jira/browse/HIVE-15909
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> With objectstores like s3, HiveMetaStoreChecker::checkTable can be expensive 
> with partitioned dataset.





[jira] [Updated] (HIVE-15923) Hive default partition causes errors in get partitions

2017-02-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15923:

Fix Version/s: 2.2.0

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.2.0
>
>
> This is the ORM error, direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Updated] (HIVE-15923) Hive default partition causes errors in get partitions

2017-02-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15923:

Target Version/s: 2.2.0

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> This is the ORM error, direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Updated] (HIVE-15923) Hive default partition causes errors in get partitions

2017-02-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15923:

Priority: Blocker  (was: Major)

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> This is the ORM error, direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Updated] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max

2017-02-14 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15881:
---
Status: Patch Available  (was: Open)

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> --
>
> Key: HIVE-15881
> URL: https://issues.apache.org/jira/browse/HIVE-15881
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-15881.1.patch
>
>
> The Utilities class has two methods, {{getInputSummary}} and 
> {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} 
> to get the summary of a list of input locations in parallel. These methods 
> are Hive-related, but the variable name does not look specific to Hive.
> Also, the variable is not defined in HiveConf, nor is it used anywhere else; 
> I only found a reference in the Hadoop MR1 code.
> I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, 
> and use a different variable name, such as 
> {{hive.get.input.listing.num.threads}}, that reflects the intention of the 
> variable. The removal of the old variable might happen on Hive 3.x





[jira] [Updated] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max

2017-02-14 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15881:
---
Attachment: HIVE-15881.1.patch

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> --
>
> Key: HIVE-15881
> URL: https://issues.apache.org/jira/browse/HIVE-15881
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-15881.1.patch
>
>
> The Utilities class has two methods, {{getInputSummary}} and 
> {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} 
> to get the summary of a list of input locations in parallel. These methods 
> are Hive-related, but the variable name does not look specific to Hive.
> Also, the variable is not defined in HiveConf, nor is it used anywhere else; 
> I only found a reference in the Hadoop MR1 code.
> I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, 
> and use a different variable name, such as 
> {{hive.get.input.listing.num.threads}}, that reflects the intention of the 
> variable. The removal of the old variable might happen on Hive 3.x





[jira] [Commented] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-14 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866879#comment-15866879
 ] 

Misha Dmitriev commented on HIVE-15882:
---

For convenience, I've created a code review here: 
https://reviews.apache.org/r/56687/

> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-15882.01.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be greatly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we can likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are heavily duplicated, since for each Partition each 
> concurrently running query creates its own copy of Partition, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.
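
A minimal sketch of the interning idea from point 1, assuming the per-partition properties can be modeled as a plain Map<String, String>. The class and method names (StringDedupSketch, internAll) are hypothetical and are not taken from the attached patch; the real code paths work on Partition, PartitionDesc and java.util.Properties objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: not part of Hive.
final class StringDedupSketch {

    // Returns a copy of the map in which every non-null key and value has been
    // passed through String.intern(), so equal strings held by different
    // copies of the map end up sharing one String instance.
    static Map<String, String> internAll(Map<String, String> props) {
        Map<String, String> out = new HashMap<>(props.size() * 2);
        for (Map.Entry<String, String> e : props.entrySet()) {
            String k = e.getKey();
            String v = e.getValue();
            out.put(k == null ? null : k.intern(), v == null ? null : v.intern());
        }
        return out;
    }
}
```

The map contents are unchanged; only the identity of the strings differs. In the scenario above (50 queries * 2,000 partitions), each distinct string would then be held once instead of once per copy, which is where the saving would be expected to come from.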





[jira] [Updated] (HIVE-15905) Inefficient plan for correlated subqueries

2017-02-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15905:
---
Status: Patch Available  (was: Open)

> Inefficient plan for correlated subqueries
> --
>
> Key: HIVE-15905
> URL: https://issues.apache.org/jira/browse/HIVE-15905
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15905.1.patch, HIVE-15905.2.patch
>
>
> Currently Calcite produces an unnecessary join to generate correlated values 
> for the inner query. More details are at CALCITE-1494.





[jira] [Assigned] (HIVE-15922) SchemaEvolution must guarantee that getFileIncluded is not null

2017-02-14 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned HIVE-15922:


Assignee: Owen O'Malley

> SchemaEvolution must guarantee that getFileIncluded is not null
> ---
>
> Key: HIVE-15922
> URL: https://issues.apache.org/jira/browse/HIVE-15922
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.1.2
>
>
> This only impacts branch-2.1, because it is already fixed in master by 
> HIVE-14007.





[jira] [Updated] (HIVE-15905) Inefficient plan for correlated subqueries

2017-02-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15905:
---
Attachment: HIVE-15905.2.patch

> Inefficient plan for correlated subqueries
> --
>
> Key: HIVE-15905
> URL: https://issues.apache.org/jira/browse/HIVE-15905
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15905.1.patch, HIVE-15905.2.patch
>
>
> Currently Calcite produces an unnecessary join to generate correlated values 
> for the inner query. More details are at CALCITE-1494.





[jira] [Updated] (HIVE-15905) Inefficient plan for correlated subqueries

2017-02-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15905:
---
Status: Open  (was: Patch Available)

> Inefficient plan for correlated subqueries
> --
>
> Key: HIVE-15905
> URL: https://issues.apache.org/jira/browse/HIVE-15905
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15905.1.patch
>
>
> Currently Calcite produces an unnecessary join to generate correlated values 
> for the inner query. More details are at CALCITE-1494.





[jira] [Commented] (HIVE-15921) Re-order the slider stop command to avoid a force if possible

2017-02-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866855#comment-15866855
 ] 

Sergey Shelukhin commented on HIVE-15921:
-

As far as I understand, stop is synchronous by default. How did it always work 
before?
If not, the last wait should be a large value.

> Re-order the slider stop command to avoid a force if possible
> -
>
> Key: HIVE-15921
> URL: https://issues.apache.org/jira/browse/HIVE-15921
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15921.01.patch
>
>
> A graceful stop is required for slider --service llapstatus to work properly





[jira] [Updated] (HIVE-15921) Re-order the slider stop command to avoid a force if possible

2017-02-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15921:
--
Attachment: HIVE-15921.01.patch

[~sershe] - can you please take a look.

The wait is added to make stop pseudo-synchronous. Destroy will fail if the 
stop has not succeeded.

> Re-order the slider stop command to avoid a force if possible
> -
>
> Key: HIVE-15921
> URL: https://issues.apache.org/jira/browse/HIVE-15921
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15921.01.patch
>
>
> A graceful stop is required for slider --service llapstatus to work properly





[jira] [Updated] (HIVE-15921) Re-order the slider stop command to avoid a force if possible

2017-02-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15921:
--
Status: Patch Available  (was: Open)

> Re-order the slider stop command to avoid a force if possible
> -
>
> Key: HIVE-15921
> URL: https://issues.apache.org/jira/browse/HIVE-15921
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15921.01.patch
>
>
> A graceful stop is required for slider --service llapstatus to work properly





[jira] [Updated] (HIVE-15920) Implement a blocking version of a command to compact

2017-02-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15920:
--
Description: 
currently 
{noformat}
alter table AcidTable compact 'major'
{noformat} 
is supported which enqueues a msg to compact.

Would be nice for testing and script building to support 
{noformat} 
alter table AcidTable compact 'major' blocking
{noformat} 
perhaps another variation is to block until either compaction is done or until 
cleaning is finished.


DDLTask.compact() gets a request id back so it can then just block and wait for 
it using some new API


  was:
currently 
{noformat}
alter table AcidTable compact 'major'
{noformat} 
is supported which enqueues a msg to compact.

Would be nice for testing and script building to support 
{noformat} 
alter table AcidTable compact 'major' blocking
{noformat} 
perhaps another variation is to block until either compaction is done or until 
cleaning is finished.



> Implement a blocking version of a command to compact
> 
>
> Key: HIVE-15920
> URL: https://issues.apache.org/jira/browse/HIVE-15920
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> currently 
> {noformat}
> alter table AcidTable compact 'major'
> {noformat} 
> is supported which enqueues a msg to compact.
> Would be nice for testing and script building to support 
> {noformat} 
> alter table AcidTable compact 'major' blocking
> {noformat} 
> perhaps another variation is to block until either compaction is done or 
> until cleaning is finished.
> DDLTask.compact() gets a request id back so it can then just block and wait 
> for it using some new API
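
A rough sketch of what "block and wait" on the returned request id could look like. The issue does not specify the new API, so everything here is an assumption: the class name, the waitFor method, the lookup callback standing in for a status call, and the state strings "succeeded", "failed" and "timeout" are all hypothetical.

```java
import java.util.function.LongFunction;

// Hypothetical helper, for illustration only: not part of Hive.
final class BlockingCompactSketch {

    // Polls lookup(requestId) until it reports a terminal state or the
    // timeout elapses. Returns the terminal state, "timeout", or
    // "interrupted" if the waiting thread was interrupted.
    static String waitFor(long requestId, LongFunction<String> lookup,
                          long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            String state = lookup.apply(requestId);
            if ("succeeded".equals(state) || "failed".equals(state)) {
                return state;
            }
            if (System.nanoTime() - deadline >= 0) {
                return "timeout";
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return "interrupted";
            }
        }
    }
}
```

Whether "done" means compaction finished or cleaning finished would be decided by which states the lookup treats as terminal, matching the two variations mentioned in the description.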





[jira] [Assigned] (HIVE-15921) Re-order the slider stop command to avoid a force if possible

2017-02-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned HIVE-15921:
-


> Re-order the slider stop command to avoid a force if possible
> -
>
> Key: HIVE-15921
> URL: https://issues.apache.org/jira/browse/HIVE-15921
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>
> A graceful stop is required for slider --service llapstatus to work properly





[jira] [Assigned] (HIVE-15920) Implement a blocking version of a command to compact

2017-02-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-15920:
-


> Implement a blocking version of a command to compact
> 
>
> Key: HIVE-15920
> URL: https://issues.apache.org/jira/browse/HIVE-15920
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> currently 
> {noformat}
> alter table AcidTable compact 'major'
> {noformat} 
> is supported which enqueues a msg to compact.
> Would be nice for testing and script building to support 
> {noformat} 
> alter table AcidTable compact 'major' blocking
> {noformat} 
> perhaps another variation is to block until either compaction is done or 
> until cleaning is finished.





[jira] [Commented] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max

2017-02-14 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866824#comment-15866824
 ] 

Sergio Peña commented on HIVE-15881:


Great, thanks for the suggestions. [~poeppt] Although I like your idea of using 
0 to mean 'maximum number allowable', I think 0 should continue to mean that 
only one thread does the work. The other thread-related configuration variables 
use 0 to disable the use of threads. Let's keep it consistent.

I will submit a patch with the following:
- New variable name {{hive.exec.input.listing.max.threads}} for getInputSummary 
and getInputPaths
- Mark {{mapred.dfsclient.parallelism.max}} as deprecated, but continue using 
it.
- Default the value of {{hive.exec.input.listing.max.threads}} to 0 (no 
threads or just one thread). I think we should keep it disabled because on HDFS 
there is no benefit to using threads, and we can end up with multiple RPC 
connections to the namenode.
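
A minimal sketch of the lookup order this plan implies, assuming the configuration can be modeled as a plain Map<String, String>. The class and method names are hypothetical, and giving the new key precedence over the deprecated one is an assumption; the comment only says the old name keeps working.

```java
import java.util.Map;

// Hypothetical helper, for illustration only: not part of Hive.
final class ListingThreadsSketch {
    static final String NEW_KEY = "hive.exec.input.listing.max.threads";
    static final String OLD_KEY = "mapred.dfsclient.parallelism.max"; // deprecated

    // Resolves the thread count: new key first (assumed precedence), then the
    // deprecated key, then the default of 0, meaning no extra threads.
    // Negative values are treated as 0.
    static int resolveThreads(Map<String, String> conf) {
        String raw = conf.get(NEW_KEY);
        if (raw == null) {
            raw = conf.get(OLD_KEY);
        }
        if (raw == null) {
            return 0;
        }
        return Math.max(Integer.parseInt(raw.trim()), 0);
    }
}
```

A caller would then run the listing on the calling thread when the result is 0 and only create a thread pool for larger values.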

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> --
>
> Key: HIVE-15881
> URL: https://issues.apache.org/jira/browse/HIVE-15881
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
>
> The Utilities class has two methods, {{getInputSummary}} and 
> {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} 
> to get the summary of a list of input locations in parallel. These methods 
> are Hive-related, but the variable name does not look specific to Hive.
> Also, the variable is not defined in HiveConf, nor is it used anywhere else; 
> I only found a reference in the Hadoop MR1 code.
> I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, 
> and use a different variable name, such as 
> {{hive.get.input.listing.num.threads}}, that reflects the intention of the 
> variable. The removal of the old variable might happen on Hive 3.x





[jira] [Commented] (HIVE-15917) incorrect error handling from BackgroundWork can cause beeline query to hang

2017-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866822#comment-15866822
 ] 

Hive QA commented on HIVE-15917:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852644/HIVE-15917.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3548/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3548/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3548/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852644 - PreCommit-HIVE-Build

> incorrect error handling from BackgroundWork can cause beeline query to hang
> 
>
> Key: HIVE-15917
> URL: https://issues.apache.org/jira/browse/HIVE-15917
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15917.01.patch, HIVE-15917.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-14 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-15882:
--
Status: Patch Available  (was: In Progress)

The supplied patch de-dupes most of the duplicate strings.

> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-15882.01.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be greatly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we can likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are highly duplicated, since for each Partition each 
> concurrently running query creates its own copy of Partition, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.
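
The interning idea in points 1 and 3 above can be sketched roughly as follows. This is not the attached patch: PropertiesInterner is a hypothetical class, and the sketch assumes the pooled Properties objects are treated as read-only after interning, since mutating an object used as a map key would break the pool.

```java
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch (not Hive code): shares one instance per distinct Properties content. */
final class PropertiesInterner {
    private final ConcurrentMap<Properties, Properties> pool = new ConcurrentHashMap<>();

    /** Returns the canonical instance equal to props; callers must not modify it afterwards. */
    Properties intern(Properties props) {
        Properties existing = pool.putIfAbsent(props, props);
        return existing != null ? existing : props;
    }

    /** Copies props so that every key and value String is the JVM-wide interned instance. */
    static Properties withInternedStrings(Properties props) {
        Properties copy = new Properties();
        for (String name : props.stringPropertyNames()) {
            copy.setProperty(name.intern(), props.getProperty(name).intern());
        }
        return copy;
    }

    public static void main(String[] args) {
        PropertiesInterner interner = new PropertiesInterner();
        Properties a = new Properties();
        a.setProperty("serialization.format", "1");
        Properties b = new Properties();
        b.setProperty("serialization.format", "1");
        // Two equal-content objects collapse to a single shared instance.
        System.out.println(interner.intern(a) == interner.intern(b));
    }
}
```

The pool relies on Properties comparing by content, which it inherits from its map behavior; default values attached to a Properties object are not part of that comparison.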



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15919) Row count mismatch for count * query

2017-02-14 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-15919:

Description: 
The following query is returning different output when run against hive and 
postgres.

Query:

SELECT COUNT (*)
FROM
(SELECT LAG(COALESCE(t2.int_col_14, t1.int_col_80),22) OVER (ORDER BY 
t1.tinyint_col_52 DESC) AS int_col
FROM table_6 t1
INNER JOIN table_14 t2 ON ((t2.decimal0101_col_55) = (t1.decimal0101_col_9))) 
AS FOO;

From hive: 0
From postgres: 66903279

Attaching ddl files for the tables.

  was:
The following query is returning different output when run against hive and 
postgres.

Query:

SELECT COUNT (*)
FROM
(SELECT LAG(COALESCE(t2.int_col_14, t1.int_col_80),22) OVER (ORDER BY 
t1.tinyint_col_52 DESC) AS int_col
FROM table_6 t1
INNER JOIN table_14 t2 ON ((t2.decimal0101_col_55) = (t1.decimal0101_col_9))) 
AS FOO;

From hive: 0
From postgres: 66903279

Attaching ddl and data files for the tables.


> Row count mismatch for count * query
> 
>
> Key: HIVE-15919
> URL: https://issues.apache.org/jira/browse/HIVE-15919
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Aswathy Chellammal Sreekumar
> Attachments: table_14.q, table_6.q
>
>
> The following query is returning different output when run against hive and 
> postgres.
> Query:
> SELECT COUNT (*)
> FROM
> (SELECT LAG(COALESCE(t2.int_col_14, t1.int_col_80),22) OVER (ORDER BY 
> t1.tinyint_col_52 DESC) AS int_col
> FROM table_6 t1
> INNER JOIN table_14 t2 ON ((t2.decimal0101_col_55) = (t1.decimal0101_col_9))) 
> AS FOO;
> From hive: 0
> From postgres: 66903279
> Attaching ddl files for the tables.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15919) Row count mismatch for count * query

2017-02-14 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-15919:

Attachment: table_6.q
table_14.q

> Row count mismatch for count * query
> 
>
> Key: HIVE-15919
> URL: https://issues.apache.org/jira/browse/HIVE-15919
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Aswathy Chellammal Sreekumar
> Attachments: table_14.q, table_6.q
>
>
> The following query is returning different output when run against hive and 
> postgres.
> Query:
> SELECT COUNT (*)
> FROM
> (SELECT LAG(COALESCE(t2.int_col_14, t1.int_col_80),22) OVER (ORDER BY 
> t1.tinyint_col_52 DESC) AS int_col
> FROM table_6 t1
> INNER JOIN table_14 t2 ON ((t2.decimal0101_col_55) = (t1.decimal0101_col_9))) 
> AS FOO;
> From hive: 0
> From postgres: 66903279
> Attaching ddl and data files for the tables.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15902) Select query involving date throwing Hive 2 Internal error: unsupported conversion from type: date

2017-02-14 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-15902:
--
Attachment: HIVE-15902.1.patch

Vectorized BETWEEN with Date dynamic values requires its own generated class 
since there is no implicit conversion from date to long.
Also added test case. [~mmccline] can you review?
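
To illustrate why a date needs an explicit conversion before a long-based BETWEEN can evaluate it, here is a small sketch. It is not the generated class from the patch. It assumes dates are compared as a count of days since 1970-01-01, and DateBetweenSketch with its methods is a hypothetical name.

```java
import java.time.LocalDate;

/** Hypothetical sketch (not Hive code): BETWEEN on dates via an explicit date-to-long step. */
final class DateBetweenSketch {
    /** Explicit conversion: days since 1970-01-01. */
    static long toEpochDays(LocalDate date) {
        return date.toEpochDay();
    }

    /** Inclusive range check on a column value already stored as epoch days. */
    static boolean between(long valueDays, LocalDate lower, LocalDate upper) {
        long lo = toEpochDays(lower);
        long hi = toEpochDays(upper);
        return lo <= valueDays && valueDays <= hi;
    }

    public static void main(String[] args) {
        long value = toEpochDays(LocalDate.of(2038, 7, 1));
        System.out.println(between(value, LocalDate.of(1990, 10, 12), LocalDate.of(2072, 10, 19)));
    }
}
```

The point of the sketch is only that the bounds must be converted deliberately; a generic "get me a long" call on a date value is what failed in the reported stack trace.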

> Select query involving date throwing Hive 2 Internal error: unsupported 
> conversion from type: date
> --
>
> Key: HIVE-15902
> URL: https://issues.apache.org/jira/browse/HIVE-15902
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Jason Dere
> Attachments: HIVE-15902.1.patch
>
>
> The following query is throwing Hive 2 Internal error: unsupported conversion 
> from type: date
> Query:
> create table table_one (ts timestamp, dt date) stored as orc;
> insert into table_one values ('2034-08-04 17:42:59','2038-07-01');
> insert into table_one values ('2031-02-07 13:02:38','2072-10-19');
> create table table_two (ts timestamp, dt date) stored as orc;
> insert into table_two values ('2069-04-01 09:05:54','1990-10-12');
> insert into table_two values ('2031-02-07 13:02:38','2072-10-19');
> create table table_three as
> select count(*) from table_one
> group by ts,dt
> having dt in (select dt from table_two);
> Error while running task ( failure ) : 
> attempt_1486991777989_0184_18_02_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:420)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>   ... 15 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:883)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
>   ... 18 more
> Caused by: java.lang.RuntimeException: Hive 2 Internal error: unsupported 
> conversion from type: date
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getLong(PrimitiveObjectInspectorUtils.java:770)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterLongColumnBetweenDynamicValue.evaluate(FilterLongColumnBetweenDynamicValue.java:82)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:39)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:112)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:883)
>   at 
> 

[jira] [Updated] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-14 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-15882:
--
Attachment: HIVE-15882.01.patch

> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-15882.01.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be greatly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we can likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are highly duplicated, since for each Partition each 
> concurrently running query creates its own copy of Partition, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15902) Select query involving date throwing Hive 2 Internal error: unsupported conversion from type: date

2017-02-14 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-15902:
--
Status: Patch Available  (was: Open)

> Select query involving date throwing Hive 2 Internal error: unsupported 
> conversion from type: date
> --
>
> Key: HIVE-15902
> URL: https://issues.apache.org/jira/browse/HIVE-15902
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Jason Dere
> Attachments: HIVE-15902.1.patch
>
>
> The following query is throwing Hive 2 Internal error: unsupported conversion 
> from type: date
> Query:
> create table table_one (ts timestamp, dt date) stored as orc;
> insert into table_one values ('2034-08-04 17:42:59','2038-07-01');
> insert into table_one values ('2031-02-07 13:02:38','2072-10-19');
> create table table_two (ts timestamp, dt date) stored as orc;
> insert into table_two values ('2069-04-01 09:05:54','1990-10-12');
> insert into table_two values ('2031-02-07 13:02:38','2072-10-19');
> create table table_three as
> select count(*) from table_one
> group by ts,dt
> having dt in (select dt from table_two);
> Error while running task ( failure ) : 
> attempt_1486991777989_0184_18_02_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:420)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>   ... 15 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:883)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
>   ... 18 more
> Caused by: java.lang.RuntimeException: Hive 2 Internal error: unsupported 
> conversion from type: date
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getLong(PrimitiveObjectInspectorUtils.java:770)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterLongColumnBetweenDynamicValue.evaluate(FilterLongColumnBetweenDynamicValue.java:82)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:39)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:112)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:883)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:783)
>   ... 19 more




[jira] [Commented] (HIVE-15918) Add some debug messages to identify an issue getting runtimeInfo from tez

2017-02-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866790#comment-15866790
 ] 

Prasanth Jayachandran commented on HIVE-15918:
--

+1

> Add some debug messages to identify an issue getting runtimeInfo from tez
> -
>
> Key: HIVE-15918
> URL: https://issues.apache.org/jira/browse/HIVE-15918
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15918.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15916) Add blobstore tests for CTAS

2017-02-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-15916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866781#comment-15866781
 ] 

Juan Rodríguez Hortalá commented on HIVE-15916:
---

https://reviews.apache.org/r/56684/

> Add blobstore tests for CTAS
> 
>
> Key: HIVE-15916
> URL: https://issues.apache.org/jira/browse/HIVE-15916
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Affects Versions: 2.1.1
>Reporter: Juan Rodríguez Hortalá
>Assignee: Juan Rodríguez Hortalá
> Attachments: HIVE-15916.patch
>
>
> This patch covers 3 tests checking CTAS operations against blobstore 
> filesystems. The tests check that we can create a table with a CTAS statement 
> from another table, for the source-target combinations blobstore-blobstore, 
> blobstore-hdfs, hdfs-blobstore, and for two target tables, one in the same 
> default database as the source, and another in a new database.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-14 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-15882 started by Misha Dmitriev.
-
> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be greatly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we can likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are highly duplicated, since for each Partition each 
> concurrently running query creates its own copy of Partition, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2017-02-14 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-6590:
---
Status: Patch Available  (was: Open)

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>Assignee: Zoltan Haindrich
> Attachments: HIVE-6590.1.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_part(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the bool partition key column hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc) mapping to the literal value 'FALSE'. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/
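
The second problem can be reproduced in miniature with plain string handling. The sketch below is an illustration only, not Hive's path-building code: BooleanPartitionPathSketch is a hypothetical class, and the choice of lower-case "false" as the canonical spelling is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch (not Hive code): raw versus canonical boolean partition paths. */
final class BooleanPartitionPathSketch {
    /** Builds the directory from the literal exactly as typed, as the report describes. */
    static String rawPath(String table, String column, String literal) {
        return "/test-warehouse/" + table + "/" + column + "=" + literal + "/";
    }

    /** Parses the literal as a boolean first, so every spelling yields one directory. */
    static String canonicalPath(String table, String column, String literal) {
        // Boolean.parseBoolean ignores case; anything other than "true" becomes false.
        boolean value = Boolean.parseBoolean(literal);
        return rawPath(table, column, String.valueOf(value));
    }

    public static void main(String[] args) {
        Set<String> raw = new LinkedHashSet<>();
        Set<String> canonical = new LinkedHashSet<>();
        for (String literal : new String[] {"FALSE", "false", "False"}) {
            raw.add(rawPath("bool_table", "bool_col", literal));
            canonical.add(canonicalPath("bool_table", "bool_col", literal));
        }
        System.out.println(raw);       // three distinct directories
        System.out.println(canonical); // a single directory
    }
}
```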



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15914) Fix issues with druid-handler pom file

2017-02-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866760#comment-15866760
 ] 

Ashutosh Chauhan commented on HIVE-15914:
-

+1

> Fix issues with druid-handler pom file
> --
>
> Key: HIVE-15914
> URL: https://issues.apache.org/jira/browse/HIVE-15914
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15914.patch
>
>
> Patch fixes multiple issues, including warnings when Hive is compiled due to 
> multiple definitions of the same dependency (joda-time).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15894) Add logical semijoin config in sqlstd safe list

2017-02-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15894:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Add logical semijoin config in sqlstd safe list 
> 
>
> Key: HIVE-15894
> URL: https://issues.apache.org/jira/browse/HIVE-15894
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 2.2.0
>
> Attachments: HIVE-15894.2.patch, HIVE-15894.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2017-02-14 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-6590:
---
Attachment: HIVE-6590.1.patch

The serde treated all non-zero-length strings as true.
I've changed this to only consider strings starting with 't' or 'T' to be true.
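
The difference between the old and new rules can be written out as follows. This is a sketch of the two rules as described in this comment, not the code in the patch; BooleanTextSketch and its method names are hypothetical.

```java
/** Hypothetical sketch (not Hive code): old and new string-to-boolean rules. */
final class BooleanTextSketch {
    /** Old rule as described: any non-empty string counts as true. */
    static boolean oldRule(String text) {
        return text != null && !text.isEmpty();
    }

    /** New rule as described: only strings starting with 't' or 'T' count as true. */
    static boolean newRule(String text) {
        if (text == null || text.isEmpty()) {
            return false;
        }
        char first = text.charAt(0);
        return first == 't' || first == 'T';
    }

    public static void main(String[] args) {
        for (String text : new String[] {"false", "FALSE", "true", "True", ""}) {
            System.out.println("'" + text + "' old=" + oldRule(text) + " new=" + newRule(text));
        }
    }
}
```

Under the old rule the string "false" evaluates to true, which would be consistent with the wrong results reported for filters on boolean partition columns.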

> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>Assignee: Zoltan Haindrich
> Attachments: HIVE-6590.1.patch
>
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_part(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the bool partition key column hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc) mapping to the literal value 'FALSE'. For example, 
> you can add three partitions in Hive for the same logical value "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15904) select query throwing Null Pointer Exception from org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan

2017-02-14 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-15904:
--
Attachment: HIVE-15904.3.patch

Correct patch uploaded.

> select query throwing Null Pointer Exception from 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan
> --
>
> Key: HIVE-15904
> URL: https://issues.apache.org/jira/browse/HIVE-15904
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15904.1.patch, HIVE-15904.2.patch, 
> HIVE-15904.3.patch, table_18.q, table_1.q
>
>
> The following query fails with a NullPointerException from 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan.
> Attaching create table statements for table_1 and table_18.
> Query:
> SELECT
> COALESCE(498, LEAD(COALESCE(-973, -684, 515)) OVER (PARTITION BY 
> (t2.int_col_10 + t1.smallint_col_50) ORDER BY (t2.int_col_10 + 
> t1.smallint_col_50), FLOOR(t1.double_col_16) DESC), 524) AS int_col,
> (t2.int_col_10) + (t1.smallint_col_50) AS int_col_1,
> FLOOR(t1.double_col_16) AS float_col,
> COALESCE(SUM(COALESCE(62, -380, -435)) OVER (PARTITION BY (t2.int_col_10 + 
> t1.smallint_col_50) ORDER BY (t2.int_col_10 + t1.smallint_col_50) DESC, 
> FLOOR(t1.double_col_16) DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 48 
> FOLLOWING), 704) AS int_col_2
> FROM table_1 t1
> INNER JOIN table_18 t2 ON (((t2.tinyint_col_15) = (t1.bigint_col_7)) AND
> ((t2.decimal2709_col_9) = (t1.decimal2016_col_26))) AND
> ((t2.tinyint_col_20) = (t1.tinyint_col_3))
> WHERE (t2.smallint_col_19) IN (SELECT
> COALESCE(-92, -994) AS int_col
> FROM table_1 tt1
> INNER JOIN table_18 tt2 ON (tt2.decimal1911_col_16) = (tt1.decimal2612_col_77)
> WHERE (t1.timestamp_col_9) = (tt2.timestamp_col_18));
> Error Stack:
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: NullPointerException null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:387)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:193)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:276)
>  
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:324) 
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:507)
>  
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:495)
>  
> at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:308)
>  
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:506)
>  
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
>  
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
>  
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:599)
>  
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan(DynamicPartitionPruningOptimization.java:402)
>  
> at 
> org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.process(DynamicPartitionPruningOptimization.java:226)
>  
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>  
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>  
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>  
> at 
> org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:74) 
> at 
> 
