[jira] [Commented] (HIVE-15478) Add file + checksum list for create table/partition during notification creation (whenever relevant)
[ https://issues.apache.org/jira/browse/HIVE-15478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829506#comment-15829506 ] Hive QA commented on HIVE-15478: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12847769/HIVE-15478.3.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 77 failed/errored test(s), 10959 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
[jira] [Commented] (HIVE-15651) LLAP: llap status tool enhancements
[ https://issues.apache.org/jira/browse/HIVE-15651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829520#comment-15829520 ]

Siddharth Seth commented on HIVE-15651:
---

Instead of accepting a valid "State" - i.e. RUNNING_PARTIAL - can we treat this flag as usable only for RUNNING_PARTIAL/RUNNING_ALL? Either:
1) the flag does not accept a parameter, and applies only to the final states RUNNING_ALL/RUNNING_PARTIAL, or
2) the flag accepts RUNNING as a parameter (and waits for RUNNING_PARTIAL or RUNNING_ALL); 100% means RUNNING_ALL.

Other than this, looks good.

> LLAP: llap status tool enhancements
> ---
>
>              Key: HIVE-15651
>              URL: https://issues.apache.org/jira/browse/HIVE-15651
>          Project: Hive
>       Issue Type: Bug
>       Components: llap
> Affects Versions: 2.2.0
>         Reporter: Prasanth Jayachandran
>         Assignee: Prasanth Jayachandran
>      Attachments: HIVE-15651.1.patch
>
> Per [~sseth], the following enhancements can be made to the llap status tool:
> 1) If the state changes from an ACTIVE state to STOPPED, terminate the script immediately (fail fast).
> 2) Add a threshold for what is acceptable in terms of the running state - RUNNING_PARTIAL may be OK if, for example, 80% of the nodes are up.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
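The wait condition Siddharth describes - accept a fraction of live nodes for RUNNING_PARTIAL, with 100% equivalent to requiring RUNNING_ALL, and fail fast on STOPPED - can be sketched as a pure helper. This is illustrative only; the class, method, and enum names below are hypothetical and are not taken from the HIVE-15651 patch.

```java
// Hypothetical sketch of the llap status tool's wait condition; not Hive's code.
public class LlapStatusThreshold {
    enum State { LAUNCHING, RUNNING_PARTIAL, RUNNING_ALL, STOPPED }

    // Returns true once the cluster satisfies the requested running threshold.
    // A STOPPED state never satisfies it; the caller should instead fail fast.
    static boolean satisfied(State state, int liveInstances, int desiredInstances,
                             double runningThreshold) {
        if (state == State.RUNNING_ALL) {
            return true;
        }
        if (state == State.RUNNING_PARTIAL) {
            // A threshold of 1.0 (100%) means only RUNNING_ALL is acceptable.
            return runningThreshold < 1.0
                && liveInstances >= Math.ceil(runningThreshold * desiredInstances);
        }
        return false;
    }

    public static void main(String[] args) {
        // 4 of 5 nodes up with an 80% threshold -> acceptable.
        System.out.println(satisfied(State.RUNNING_PARTIAL, 4, 5, 0.8)); // true
        // 100% requested: RUNNING_PARTIAL can never satisfy it.
        System.out.println(satisfied(State.RUNNING_PARTIAL, 5, 5, 1.0)); // false
    }
}
```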
[jira] [Updated] (HIVE-15616) Improve contents of qfile test output
[ https://issues.apache.org/jira/browse/HIVE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Barna Zsombor Klara updated HIVE-15616:
---
    Attachment: HIVE-15616.1.patch

> Improve contents of qfile test output
> -
>
>              Key: HIVE-15616
>              URL: https://issues.apache.org/jira/browse/HIVE-15616
>          Project: Hive
>       Issue Type: Improvement
>       Components: Tests
> Affects Versions: 2.1.1
>         Reporter: Barna Zsombor Klara
>         Assignee: Barna Zsombor Klara
>         Priority: Minor
>      Attachments: HIVE-15616.1.patch, HIVE-15616.patch
>
> The current output of the failed qtests has a less-than-ideal signal-to-noise ratio.
> We have stack traces and messages duplicated between the error message, the stack trace, and the error output.
> For diff errors, the actual difference is missing from the error message and can be found only in the standard out.
> I would like to simplify this output by removing duplications and moving the relevant information to the top.
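The cleanup HIVE-15616 describes - collapsing sections that repeat verbatim across the error message, stack trace, and error output of a failed qtest - could look roughly like the helper below. The class and method names are hypothetical; this is not Hive's actual test harness code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: keep each distinct output section once, in
// first-seen order, so duplicated stack traces collapse to one copy.
public class QTestOutput {
    static String dedupSections(String... sections) {
        Set<String> seen = new LinkedHashSet<>();  // preserves insertion order
        for (String s : sections) {
            if (s != null && !s.trim().isEmpty()) {
                seen.add(s.trim());                // identical blocks collapse
            }
        }
        return String.join("\n---\n", seen);
    }
}
```

A real implementation would also hoist the diff for diff failures into the error message, per the issue description.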
[jira] [Commented] (HIVE-15621) Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP
[ https://issues.apache.org/jira/browse/HIVE-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829738#comment-15829738 ] Hive QA commented on HIVE-15621: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848107/HIVE-15621.4.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10960 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
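HIVE-15621 above concerns replacing Hadoop's JvmPauseMonitor, which detects GC or swap pauses by sleeping for a fixed interval and measuring how much longer the sleep actually took. A minimal illustration of that detection idea follows; the names are hypothetical, and this is neither Hive's nor Hadoop's actual class.

```java
// Hypothetical illustration of pause detection: sleep a fixed interval and
// treat any excess elapsed time beyond a threshold as a JVM pause
// (GC, swapping, etc.). Not Hive's or Hadoop's implementation.
public class PauseDetector {
    // Returns the detected pause in ms, or 0 if within the threshold.
    static long detectPauseMs(long sleepMs, long observedElapsedMs, long warnThresholdMs) {
        long extra = observedElapsedMs - sleepMs;
        return extra > warnThresholdMs ? extra : 0;
    }

    public static void main(String[] args) throws InterruptedException {
        // A real monitor runs this in a daemon-thread loop; one iteration here.
        final long sleepMs = 200, warnMs = 100;
        long start = System.nanoTime();
        Thread.sleep(sleepMs);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("detected pause: " + detectPauseMs(sleepMs, elapsedMs, warnMs) + " ms");
    }
}
```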
[jira] [Commented] (HIVE-15582) Druid CTAS should support BYTE/SHORT/INT types
[ https://issues.apache.org/jira/browse/HIVE-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829568#comment-15829568 ]

Lefty Leverenz commented on HIVE-15582:
---

Does this need to be documented in the wiki?
* [Druid Integration | https://cwiki.apache.org/confluence/display/Hive/Druid+Integration]

> Druid CTAS should support BYTE/SHORT/INT types
> --
>
>              Key: HIVE-15582
>              URL: https://issues.apache.org/jira/browse/HIVE-15582
>          Project: Hive
>       Issue Type: Sub-task
>       Components: Druid integration
> Affects Versions: 2.2.0
>         Reporter: Jesus Camacho Rodriguez
>         Assignee: Jesus Camacho Rodriguez
>          Fix For: 2.2.0
>
>      Attachments: HIVE-15582.02.patch, HIVE-15582.patch
>
> Currently these types are not recognized and we throw an exception when we try to create a table with them.
> {noformat}
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: Unknown type: INT
>   at org.apache.hadoop.hive.druid.serde.DruidSerDe.serialize(DruidSerDe.java:414)
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:715)
>   ... 22 more
> {noformat}
[jira] [Commented] (HIVE-15579) Support HADOOP_PROXY_USER for secure impersonation in hive metastore client
[ https://issues.apache.org/jira/browse/HIVE-15579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829651#comment-15829651 ] Hive QA commented on HIVE-15579: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848102/HIVE-15579.003.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10960 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829567#comment-15829567 ] Hive QA commented on HIVE-15580: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848236/HIVE-15580.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 83 failed/errored test(s), 10946 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
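The memory concern behind HIVE-15580 above is that Spark's groupByKey materializes every value for a key in memory at once. A sort-based alternative - assuming rows arrive sorted by key, for example after a shuffle that sorts within partitions - holds at most one group at a time. The following is a simplified, hypothetical stand-in for that idea, not the actual patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Simplified stand-in for bounded-memory grouping: with input sorted by
// key, each group is handed to the consumer and dropped as soon as the
// key changes, so memory is bounded by the largest single group.
public class SortedGrouping {
    static <K, V> void forEachGroup(Iterator<? extends Map.Entry<K, V>> sorted,
                                    BiConsumer<K, List<V>> consumer) {
        K current = null;
        List<V> bucket = new ArrayList<>();
        while (sorted.hasNext()) {
            Map.Entry<K, V> e = sorted.next();
            if (current != null && !e.getKey().equals(current)) {
                consumer.accept(current, bucket);   // emit the finished group
                bucket = new ArrayList<>();         // previous group is now collectable
            }
            current = e.getKey();
            bucket.add(e.getValue());
        }
        if (current != null) {
            consumer.accept(current, bucket);       // emit the final group
        }
    }
}
```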
[jira] [Commented] (HIVE-15582) Druid CTAS should support BYTE/SHORT/INT types
[ https://issues.apache.org/jira/browse/HIVE-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829571#comment-15829571 ]

Jesus Camacho Rodriguez commented on HIVE-15582:
---

[~leftylev], thanks for pointing that out. I think the documentation for HIVE-15277 might include a reference to the column types, but there is no need to specifically label this issue.

> Druid CTAS should support BYTE/SHORT/INT types
> --
>
>              Key: HIVE-15582
>              URL: https://issues.apache.org/jira/browse/HIVE-15582
>          Project: Hive
>       Issue Type: Sub-task
>       Components: Druid integration
> Affects Versions: 2.2.0
>         Reporter: Jesus Camacho Rodriguez
>         Assignee: Jesus Camacho Rodriguez
>          Fix For: 2.2.0
>
>      Attachments: HIVE-15582.02.patch, HIVE-15582.patch
>
> Currently these types are not recognized and we throw an exception when we try to create a table with them.
> {noformat}
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: Unknown type: INT
>   at org.apache.hadoop.hive.druid.serde.DruidSerDe.serialize(DruidSerDe.java:414)
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:715)
>   ... 22 more
> {noformat}
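The stack trace in the HIVE-15582 description shows DruidSerDe.serialize rejecting INT outright. Since Druid's integral columns are long-backed, one natural direction is to widen the narrower Hive integral types instead of throwing. The sketch below is only an illustration of that idea with hypothetical names; it is not the committed fix.

```java
// Illustrative sketch only; the real HIVE-15582 change lives in
// DruidSerDe.serialize. This stand-in just shows the widening idea.
public class DruidTypeWidening {
    // Druid stores integral metrics as longs, so the narrower Hive
    // integral types can be widened rather than rejected with
    // "SerDeException: Unknown type".
    static long widenToLong(String hiveType, Number value) {
        switch (hiveType.toUpperCase()) {
            case "TINYINT":   // Hive BYTE
            case "SMALLINT":  // Hive SHORT
            case "INT":
            case "BIGINT":
                return value.longValue();
            default:
                throw new IllegalArgumentException("Unknown type: " + hiveType);
        }
    }
}
```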
[jira] [Updated] (HIVE-15616) Improve contents of qfile test output
[ https://issues.apache.org/jira/browse/HIVE-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Barna Zsombor Klara updated HIVE-15616:
---
    Attachment: (was: HIVE-15616.1.patch)

> Improve contents of qfile test output
> -
>
>              Key: HIVE-15616
>              URL: https://issues.apache.org/jira/browse/HIVE-15616
>          Project: Hive
>       Issue Type: Improvement
>       Components: Tests
> Affects Versions: 2.1.1
>         Reporter: Barna Zsombor Klara
>         Assignee: Barna Zsombor Klara
>         Priority: Minor
>      Attachments: HIVE-15616.1.patch, HIVE-15616.patch
>
> The current output of the failed qtests has a less-than-ideal signal-to-noise ratio.
> We have stack traces and messages duplicated between the error message, the stack trace, and the error output.
> For diff errors, the actual difference is missing from the error message and can be found only in the standard out.
> I would like to simplify this output by removing duplications and moving the relevant information to the top.
[jira] [Commented] (HIVE-15269) Dynamic Min-Max runtime-filtering for Tez
[ https://issues.apache.org/jira/browse/HIVE-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829794#comment-15829794 ] Hive QA commented on HIVE-15269: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848110/HIVE-15269.13.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3036/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3036/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3036/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-01-19 12:07:43.835 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-3036/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-01-19 12:07:43.838 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at ef33237 IVE-15297: Hive should not split semicolon within quoted string literals (Pengcheng Xiong, reviewed by Ashutosh Chauhan) (addendum I) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at ef33237 IVE-15297: Hive should not split semicolon within quoted string literals (Pengcheng Xiong, reviewed by Ashutosh Chauhan) (addendum I) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-01-19 12:07:45.002 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: No such file or directory error: a/itests/src/test/resources/testconfiguration.properties: No such file or directory error: a/orc/src/test/org/apache/orc/impl/TestRecordReaderImpl.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeColumnEvaluator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantDefaultEvaluator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantEvaluator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluator.java: No such file or directory 
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorFactory.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorHead.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorRef.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeFieldEvaluator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ObjectCache.java: No such file or directory error:
[jira] [Commented] (HIVE-15472) JDBC: Standalone jar is missing ZK dependencies
[ https://issues.apache.org/jira/browse/HIVE-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829879#comment-15829879 ] Hive QA commented on HIVE-15472: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848112/HIVE-15472.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10946 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
[jira] [Commented] (HIVE-15629) Set DDLTask’s exception with its subtask’s exception
[ https://issues.apache.org/jira/browse/HIVE-15629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829790#comment-15829790 ] Hive QA commented on HIVE-15629: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848178/HIVE-15629.000.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 80 failed/errored test(s), 10946 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
[jira] [Commented] (HIVE-15666) Select query with view adds base table partition as direct input in spark engine
[ https://issues.apache.org/jira/browse/HIVE-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829892#comment-15829892 ] Niklaus Xiao commented on HIVE-15666: - Add a test case to reproduce the issue. cc [~aihuaxu] & [~navis] similar issue https://issues.apache.org/jira/browse/HIVE-14805, https://issues.apache.org/jira/browse/HIVE-10875 > Select query with view adds base table partition as direct input in spark > engine > > > Key: HIVE-15666 > URL: https://issues.apache.org/jira/browse/HIVE-15666 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.3.0 >Reporter: Niklaus Xiao > Attachments: TestViewEntityInSparkEngine.patch > > > repro steps: > {code} > set hive.execution.engine=spark; > create table base(id int) partitioned by (dt string); > alter table base add partition(dt='2017'); > create view view1 as select * from base where id < 10; > select * from view1; > {code} > it requires the access not only for view1 but also for base@dt=2017 > partition, which should not be required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15666) Select query with view adds base table partition as direct input in spark engine
[ https://issues.apache.org/jira/browse/HIVE-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niklaus Xiao updated HIVE-15666: Attachment: TestViewEntityInSparkEngine.patch > Select query with view adds base table partition as direct input in spark > engine > > > Key: HIVE-15666 > URL: https://issues.apache.org/jira/browse/HIVE-15666 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.3.0 >Reporter: Niklaus Xiao > Attachments: TestViewEntityInSparkEngine.patch > > > repo steps: > {code} > set hive.execution.engine=spark; > create table base(id int) partitioned by (dt string); > alter table base add partition(dt='2017'); > create view view1 as select * from base where id < 10; > select * from view1; > {code} > it requires the access not only for view1 but also for base@dt=2017 > partition, which should not be required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15666) Select query with view adds base table partition as direct input in spark engine
[ https://issues.apache.org/jira/browse/HIVE-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niklaus Xiao updated HIVE-15666: Description: repro steps: {code} set hive.execution.engine=spark; create table base(id int) partitioned by (dt string); alter table base add partition(dt='2017'); create view view1 as select * from base where id < 10; select * from view1; {code} it requires the access not only for view1 but also for base@dt=2017 partition, which should not be required. was: repo steps: {code} set hive.execution.engine=spark; create table base(id int) partitioned by (dt string); alter table base add partition(dt='2017'); create view view1 as select * from base where id < 10; select * from view1; {code} it requires the access not only for view1 but also for base@dt=2017 partition, which should not be required. > Select query with view adds base table partition as direct input in spark > engine > > > Key: HIVE-15666 > URL: https://issues.apache.org/jira/browse/HIVE-15666 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.3.0 >Reporter: Niklaus Xiao > Attachments: TestViewEntityInSparkEngine.patch > > > repro steps: > {code} > set hive.execution.engine=spark; > create table base(id int) partitioned by (dt string); > alter table base add partition(dt='2017'); > create view view1 as select * from base where id < 10; > select * from view1; > {code} > it requires the access not only for view1 but also for base@dt=2017 > partition, which should not be required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15659) StackOverflowError when ClassLoader.loadClass for Spark
[ https://issues.apache.org/jira/browse/HIVE-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829905#comment-15829905 ] Xuefu Zhang commented on HIVE-15659: [~csun], do you know whether the SOFE exception happens at the driver or executor? Secondly, I'm not sure if Spark will load additional jars for each input file. To me, it seems to be "per task". > StackOverflowError when ClassLoader.loadClass for Spark > --- > > Key: HIVE-15659 > URL: https://issues.apache.org/jira/browse/HIVE-15659 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.2.0 >Reporter: Chao Sun > > Sometimes a query needs to process a large number of input files, which could > cause the following error: > {code} > 17/01/15 09:31:52 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 > (TID 0, hadoopworker1344-sjc1.prod.uber.internal): > java.lang.StackOverflowError > at > java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:1535) > at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:463) > at java.lang.ClassLoader.loadClass(ClassLoader.java:404) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at 
java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > at java.lang.ClassLoader.loadClass(ClassLoader.java:411) > {code} > The cause, I think, is that for each input file we may need to load > additional jars to the class loader of the current thread. This accumulates > with the number of input files. When adding a new class loader, the old class > loader will be used as the parent of the new one. > See > [Utilities#getBaseWork|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L388] > for more details. > One possible solution is to detect duplicated jar paths before creating the > new class loader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
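The delegation chain described above grows because each call wraps the current loader as the parent of a new one. A minimal sketch of the proposed fix (the `addToClassPath` helper here is hypothetical, not Hive's actual `Utilities` code): skip creating a new `URLClassLoader` when every requested jar is already reachable through the existing chain, so the chain depth is bounded by the number of distinct jar sets rather than the number of input files.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class ClassLoaderDedup {
    // Hypothetical helper: only wrap the current loader in a new
    // URLClassLoader when the request actually adds unseen jar paths.
    static ClassLoader addToClassPath(ClassLoader current, URL[] newJars) {
        Set<URL> unseen = new LinkedHashSet<>(Arrays.asList(newJars));
        // Walk the existing delegation chain and drop jars already present.
        for (ClassLoader cl = current; cl != null; cl = cl.getParent()) {
            if (cl instanceof URLClassLoader) {
                unseen.removeAll(Arrays.asList(((URLClassLoader) cl).getURLs()));
            }
        }
        if (unseen.isEmpty()) {
            return current; // nothing new: reuse, keeping the chain shallow
        }
        return new URLClassLoader(unseen.toArray(new URL[0]), current);
    }

    public static void main(String[] args) throws Exception {
        URL jar = new URL("file:///tmp/example.jar");
        ClassLoader base = ClassLoaderDedup.class.getClassLoader();
        ClassLoader first = addToClassPath(base, new URL[] { jar });
        // Adding the same jar again must NOT grow the delegation chain.
        ClassLoader second = addToClassPath(first, new URL[] { jar });
        System.out.println(first != base);   // true: the jar was new
        System.out.println(second == first); // true: the duplicate was skipped
    }
}
```

Because `ClassLoader.loadClass` recurses once per parent in the chain, keeping the chain shallow directly bounds the stack depth seen in the trace above.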
[jira] [Updated] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-15580: --- Attachment: HIVE-15580.5.patch > Replace Spark's groupByKey operator with something with bounded memory > -- > > Key: HIVE-15580 > URL: https://issues.apache.org/jira/browse/HIVE-15580 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-15580.1.patch, HIVE-15580.1.patch, > HIVE-15580.2.patch, HIVE-15580.2.patch, HIVE-15580.3.patch, > HIVE-15580.4.patch, HIVE-15580.5.patch, HIVE-15580.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggressively
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829978#comment-15829978 ] Hive QA commented on HIVE-13014: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848115/HIVE-13014.06.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10963 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
[jira] [Commented] (HIVE-15666) Select query with view adds base table partition as direct input in spark engine
[ https://issues.apache.org/jira/browse/HIVE-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830038#comment-15830038 ] Aihua Xu commented on HIVE-15666: - Thanks [~niklaus.xiao] to report this. I will take a look. So does this case work with MR but not with Spark? > Select query with view adds base table partition as direct input in spark > engine > > > Key: HIVE-15666 > URL: https://issues.apache.org/jira/browse/HIVE-15666 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.3.0 >Reporter: Niklaus Xiao >Assignee: Aihua Xu > Attachments: TestViewEntityInSparkEngine.patch > > > repro steps: > {code} > set hive.execution.engine=spark; > create table base(id int) partitioned by (dt string); > alter table base add partition(dt='2017'); > create view view1 as select * from base where id < 10; > select * from view1; > {code} > it requires the access not only for view1 but also for base@dt=2017 > partition, which should not be required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-15666) Select query with view adds base table partition as direct input in spark engine
[ https://issues.apache.org/jira/browse/HIVE-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reassigned HIVE-15666: --- Assignee: Aihua Xu > Select query with view adds base table partition as direct input in spark > engine > > > Key: HIVE-15666 > URL: https://issues.apache.org/jira/browse/HIVE-15666 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.3.0 >Reporter: Niklaus Xiao >Assignee: Aihua Xu > Attachments: TestViewEntityInSparkEngine.patch > > > repro steps: > {code} > set hive.execution.engine=spark; > create table base(id int) partitioned by (dt string); > alter table base add partition(dt='2017'); > create view view1 as select * from base where id < 10; > select * from view1; > {code} > it requires the access not only for view1 but also for base@dt=2017 > partition, which should not be required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15617) Improve the avg performance for Range based window
[ https://issues.apache.org/jira/browse/HIVE-15617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15617: Status: In Progress (was: Patch Available) > Improve the avg performance for Range based window > -- > > Key: HIVE-15617 > URL: https://issues.apache.org/jira/browse/HIVE-15617 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Affects Versions: 1.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-15617.1.patch > > > Similar to HIVE-15520, we need to improve the performance for avg(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-14194) Investigate optimizing the query compilation of long or-list in where statement
[ https://issues.apache.org/jira/browse/HIVE-14194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reopened HIVE-14194: - Assignee: Aihua Xu > Investigate optimizing the query compilation of long or-list in where > statement > > > Key: HIVE-14194 > URL: https://issues.apache.org/jira/browse/HIVE-14194 > Project: Hive > Issue Type: Improvement >Reporter: Aihua Xu >Assignee: Aihua Xu > > The following query will take long time to compile if the where statement has > a long list of 'or'. Investigate if we can optimize it. > select * from src > where key = 1 > or key =2 > or -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15617) Improve the avg performance for Range based window
[ https://issues.apache.org/jira/browse/HIVE-15617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15617: Attachment: (was: HIVE-15617.1.patch) > Improve the avg performance for Range based window > -- > > Key: HIVE-15617 > URL: https://issues.apache.org/jira/browse/HIVE-15617 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Affects Versions: 1.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-15617.1.patch > > > Similar to HIVE-15520, we need to improve the performance for avg(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15617) Improve the avg performance for Range based window
[ https://issues.apache.org/jira/browse/HIVE-15617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15617: Attachment: HIVE-15617.1.patch > Improve the avg performance for Range based window > -- > > Key: HIVE-15617 > URL: https://issues.apache.org/jira/browse/HIVE-15617 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Affects Versions: 1.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-15617.1.patch > > > Similar to HIVE-15520, we need to improve the performance for avg(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15617) Improve the avg performance for Range based window
[ https://issues.apache.org/jira/browse/HIVE-15617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15617: Status: Patch Available (was: In Progress) > Improve the avg performance for Range based window > -- > > Key: HIVE-15617 > URL: https://issues.apache.org/jira/browse/HIVE-15617 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Affects Versions: 1.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-15617.1.patch > > > Similar to HIVE-15520, we need to improve the performance for avg(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15622) Remove HWI component from Hive
[ https://issues.apache.org/jira/browse/HIVE-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830062#comment-15830062 ] Hive QA commented on HIVE-15622: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848120/HIVE-15622.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 80 failed/errored test(s), 10959 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) TestHWIServer - did not produce a TEST-*.xml file (likely timed out) (batchId=281) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
[jira] [Updated] (HIVE-14194) Investigate optimizing the query compilation of long or-list in where statement
[ https://issues.apache.org/jira/browse/HIVE-14194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-14194: Description: If a query has a long OR-list, when possible, the OR-list can be optimized with IN-list to have better performance during compilation (resolved by HIVE-11424). {noformat} select * from src where key = 1 or key =2 or {noformat} But for the cases which can't be converted, we will still have a performance issue during compilation. Investigate if we can optimize it. was: The following query will take long time to compile if the where statement has a long list of 'or'. Investigate if we can optimize it. select * from src where key = 1 or key =2 or > Investigate optimizing the query compilation of long or-list in where > statement > > > Key: HIVE-14194 > URL: https://issues.apache.org/jira/browse/HIVE-14194 > Project: Hive > Issue Type: Improvement >Reporter: Aihua Xu >Assignee: Aihua Xu > > If a query has a long OR-list, when possible, the OR-list can be optimized > with IN-list to have better performance during compilation (resolved by > HIVE-11424). > {noformat} > select * from src > where key = 1 > or key =2 > or > {noformat} > But for the cases which can't be converted, we will still have a performance > issue during compilation. Investigate if we can optimize it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
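The OR-to-IN rewrite that HIVE-11424 applies can be illustrated with a toy sketch. This is a string-level illustration only, not Hive's actual AST-based optimizer: it collapses a disjunction of equality predicates on one column into a single IN-list, and leaves the predicate untouched when the disjuncts cannot all be converted (which is the remaining case this issue tracks).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OrToInRewrite {
    // Toy sketch: collapse "col = v1 or col = v2 or ..." into
    // "col in (v1, v2, ...)". Returns the input unchanged when the
    // disjuncts are not all simple equalities on the same column.
    static String rewrite(String predicate) {
        String[] parts = predicate.split("(?i)\\bor\\b");
        Set<String> cols = new LinkedHashSet<>();
        List<String> values = new ArrayList<>();
        for (String part : parts) {
            String[] kv = part.split("=");
            if (kv.length != 2) {
                return predicate; // not a plain equality disjunct
            }
            cols.add(kv[0].trim());
            values.add(kv[1].trim());
        }
        if (cols.size() != 1) {
            return predicate; // disjuncts reference different columns
        }
        return cols.iterator().next() + " in (" + String.join(", ", values) + ")";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("key = 1 or key =2 or key = 3"));
        // key in (1, 2, 3)
        System.out.println(rewrite("key = 1 or id = 2"));
        // key = 1 or id = 2   (mixed columns: left as-is)
    }
}
```

The second case is the one the issue is about: when the rewrite does not apply, compilation still walks the full disjunction.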
[jira] [Updated] (HIVE-13150) When multiple queries are running in the same session, they are sharing the same HMS Client.
[ https://issues.apache.org/jira/browse/HIVE-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13150: Description: It seems we should create a different HMSClient for each query when multiple queries are executing asynchronously in the same session at the same time, to get better performance. Right now, we unnecessarily use one HMSClient and have to make HMS calls in sync among different queries. was: Seems we should create different HMSClient for different queries if multiple queries are executing in the same session in async at the same time to have better performance. Right now, we are unnecessarily to use one HMSClient and we have to make HMS calls in sync among different queries. > When multiple queries are running in the same session, they are sharing the > same HMS Client. > > > Key: HIVE-13150 > URL: https://issues.apache.org/jira/browse/HIVE-13150 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > > It seems we should create a different HMSClient for each query when multiple > queries are executing asynchronously in the same session at the same time, to get > better performance. > Right now, we unnecessarily use one HMSClient and have to make HMS > calls in sync among different queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15269) Dynamic Min-Max runtime-filtering for Tez
[ https://issues.apache.org/jira/browse/HIVE-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-15269: -- Attachment: HIVE-15269.14.patch > Dynamic Min-Max runtime-filtering for Tez > - > > Key: HIVE-15269 > URL: https://issues.apache.org/jira/browse/HIVE-15269 > Project: Hive > Issue Type: New Feature >Reporter: Jason Dere >Assignee: Deepak Jaiswal > Attachments: HIVE-15269.10.patch, HIVE-15269.11.patch, > HIVE-15269.12.patch, HIVE-15269.13.patch, HIVE-15269.14.patch, > HIVE-15269.1.patch, HIVE-15269.2.patch, HIVE-15269.3.patch, > HIVE-15269.4.patch, HIVE-15269.5.patch, HIVE-15269.6.patch, > HIVE-15269.7.patch, HIVE-15269.8.patch, HIVE-15269.9.patch > > > If a dimension table and fact table are joined: > {noformat} > select * > from store join store_sales on (store.id = store_sales.store_id) > where store.s_store_name = 'My Store' > {noformat} > One optimization that can be done is to get the min/max store id values that > come out of the scan/filter of the store table, and send this min/max value > (via Tez edge) to the task which is scanning the store_sales table. > We can add a BETWEEN(min, max) predicate to the store_sales TableScan, where > this predicate can be pushed down to the storage handler (for example for ORC > formats). Pushing a min/max predicate to the ORC reader would allow us to > avoid having to read entire row groups during the table scan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
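The feature described above can be modeled with a minimal sketch: the dimension-side scan produces a [min, max] range, which the fact-side scan applies as a BETWEEN predicate before the join. The row layout and function names here are hypothetical; the real feature ships the range over a Tez edge and can push the predicate into the ORC reader to skip whole row groups.

```python
# Illustrative sketch of dynamic min/max runtime filtering (not Hive code).

def dimension_scan(store_rows, name):
    # Scan/filter the dimension table and compute the min/max of the join key.
    ids = [r["id"] for r in store_rows if r["s_store_name"] == name]
    return (min(ids), max(ids)) if ids else None

def fact_scan(sales_rows, id_range):
    # Apply the BETWEEN(min, max) predicate before the join, as the
    # pushed-down runtime filter would.
    if id_range is None:
        return []  # dimension filter matched nothing: prune everything
    lo, hi = id_range
    return [r for r in sales_rows if lo <= r["store_id"] <= hi]

stores = [{"id": 7, "s_store_name": "My Store"},
          {"id": 9, "s_store_name": "My Store"},
          {"id": 3, "s_store_name": "Other"}]
sales = [{"store_id": i} for i in range(1, 12)]

rng = dimension_scan(stores, "My Store")   # (7, 9)
print(len(fact_scan(sales, rng)))          # 3 rows survive instead of 11
```

The payoff is that the fact-side scan touches only rows (or, with ORC statistics, only row groups) whose key falls inside the range produced by the selective dimension filter.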
[jira] [Updated] (HIVE-15653) Some ALTER TABLE commands drop table stats
[ https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15653: --- Status: Patch Available (was: Open) > Some ALTER TABLE commands drop table stats > -- > > Key: HIVE-15653 > URL: https://issues.apache.org/jira/browse/HIVE-15653 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Alexander Behm >Assignee: Chaoyu Tang >Priority: Critical > Attachments: HIVE-15653.patch > > > Some ALTER TABLE commands drop the table stats. That may make sense for some > ALTER TABLE operations, but certainly not for others. Personally, I think > ALTER TABLE should only change what was requested by the user without any > side effects that may be unclear to users. In particular, collecting stats > can be an expensive operation so it's rather inconvenient for users if they > get wiped accidentally. > Repro: > {code} > create table t (i int); > insert into t values(1); > analyze table t compute statistics; > alter table t set tblproperties('test'='test'); > hive> describe formatted t; > OK > # col_name data_type comment > > i int > > # Detailed Table Information > Database: default > Owner: abehm > CreateTime: Tue Jan 17 18:13:34 PST 2017 > LastAccessTime: UNKNOWN > Protect Mode: None > Retention: 0 > Location: hdfs://localhost:20500/test-warehouse/t > Table Type: MANAGED_TABLE > Table Parameters: > COLUMN_STATS_ACCURATE false > last_modified_by abehm > last_modified_time 1484705748 > numFiles 1 > numRows -1 > rawDataSize -1 > test test > totalSize 2 > transient_lastDdlTime 1484705748 > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > Storage Desc Params: > serialization.format 1 > Time taken: 0.169 seconds, Fetched: 
34 row(s) > {code} > The same behavior can be observed with several other ALTER TABLE commands. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-15653) Some ALTER TABLE commands drop table stats
[ https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830719#comment-15830719 ] Alexander Behm edited comment on HIVE-15653 at 1/19/17 10:28 PM: - [~ctang.ma] Impala calls the Metastore API alter_table(). I tried the following alterations in Impala and those did wipe the table stats: ALTER TABLE ADD COLUMNS ALTER TABLE CHANGE COLUMN ALTER TABLE SET TBLPROPERTIES ALTER TABLE SET SERDEPROPERTIES ALTER TABLE SET LOCATION ALTER TABLE SET FILEFORMAT ALTER TABLE SET CACHED So I would say most ALTER commands do wipe the stats (from Impala). Just trying to make sure the fix on the Hive side is complete, i.e. the alter_table() API call on the Metastore is fixed and not just the Hive DDL commands. The ALTER TABLE RENAME command worked fine (preserved table stats). was (Author: alex.behm): [~ctang.ma] Impala calls the Metastore API alter_table(). I tried the following alterations and those did wipe the table stats: ALTER TABLE ADD COLUMNS ALTER TABLE CHANGE COLUMN ALTER TABLE SET TBLPROPERTIES ALTER TABLE SET SERDEPROPERTIES ALTER TABLE SET LOCATION ALTER TABLE SET FILEFORMAT ALTER TABLE SET CACHED So I would say most ALTER commands do wipe the stats. Just trying to make sure the fix on the Hive side is complete (i.e. the alter_table() API call on the Metastore). The ALTER TABLE RENAME command worked fine (preserved table stats). > Some ALTER TABLE commands drop table stats > -- > > Key: HIVE-15653 > URL: https://issues.apache.org/jira/browse/HIVE-15653 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Alexander Behm >Assignee: Chaoyu Tang >Priority: Critical > Attachments: HIVE-15653.patch > > > Some ALTER TABLE commands drop the table stats. That may make sense for some > ALTER TABLE operations, but certainly not for others. 
Personally, I think > ALTER TABLE should only change what was requested by the user without any > side effects that may be unclear to users. In particular, collecting stats > can be an expensive operation so it's rather inconvenient for users if they > get wiped accidentally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15653) Some ALTER TABLE commands drop table stats
[ https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830719#comment-15830719 ] Alexander Behm commented on HIVE-15653: --- [~ctang.ma] Impala calls the Metastore API alter_table(). I tried the following alterations and those did wipe the table stats: ALTER TABLE ADD COLUMNS ALTER TABLE CHANGE COLUMN ALTER TABLE SET TBLPROPERTIES ALTER TABLE SET SERDEPROPERTIES ALTER TABLE SET LOCATION ALTER TABLE SET FILEFORMAT ALTER TABLE SET CACHED So I would say most ALTER commands do wipe the stats. Just trying to make sure the fix on the Hive side is complete (i.e. the alter_table() API call on the Metastore). The ALTER TABLE RENAME command worked fine (preserved table stats). > Some ALTER TABLE commands drop table stats > -- > > Key: HIVE-15653 > URL: https://issues.apache.org/jira/browse/HIVE-15653 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Alexander Behm >Assignee: Chaoyu Tang >Priority: Critical > Attachments: HIVE-15653.patch > > > Some ALTER TABLE commands drop the table stats. That may make sense for some > ALTER TABLE operations, but certainly not for others. Personally, I think > ALTER TABLE should only change what was requested by the user without any > side effects that may be unclear to users. In particular, collecting stats > can be an expensive operation so it's rather inconvenient for users if they > get wiped accidentally. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15630) add operation handle before operation.run instead of after operation.run
[ https://issues.apache.org/jira/browse/HIVE-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830744#comment-15830744 ] Hive QA commented on HIVE-15630: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848177/HIVE-15630.000.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10949 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=117) [join39.q,bucketsortoptimize_insert_7.q,vector_distinct_2.q,bucketmapjoin10.q,join11.q,union13.q,auto_sortmerge_join_16.q,windowing.q,union_remove_3.q,skewjoinopt7.q,stats7.q,annotate_stats_join.q,multi_insert_lateral_view.q,ptf_streaming.q,join_1to1.q] org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=219) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=231) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=222) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=225) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic]
[jira] [Commented] (HIVE-15653) Some ALTER TABLE commands drop table stats
[ https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830768#comment-15830768 ] Chaoyu Tang commented on HIVE-15653: [~alex.behm] I think the "ALTER TABLE SET LOCATION" should change table stats as expected since the underlying table files could change. For others (except SET CACHED, which I need to verify), this patch has fixed the issue and they won't change table stats. Yes, the ALTER TABLE RENAME worked before. > Some ALTER TABLE commands drop table stats > -- > > Key: HIVE-15653 > URL: https://issues.apache.org/jira/browse/HIVE-15653 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Alexander Behm >Assignee: Chaoyu Tang >Priority: Critical > Attachments: HIVE-15653.patch > > > Some ALTER TABLE commands drop the table stats. That may make sense for some > ALTER TABLE operations, but certainly not for others. Personally, I think > ALTER TABLE should only change what was requested by the user without any > side effects that may be unclear to users. In particular, collecting stats > can be an expensive operation so it's rather inconvenient for users if they > get wiped accidentally. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15297) Hive should not split semicolon within quoted string literals
[ https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830790#comment-15830790 ] Pengcheng Xiong commented on HIVE-15297: [~sershe], i am fixing it now. > Hive should not split semicolon within quoted string literals > - > > Key: HIVE-15297 > URL: https://issues.apache.org/jira/browse/HIVE-15297 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-15297.01.patch, HIVE-15297.02.patch, > HIVE-15297.03.patch, HIVE-15297.04.patch, HIVE-15297.05.patch > > > String literals in query cannot have reserved symbols. The same set of query > works fine in mysql and postgresql. > {code} > hive> CREATE TABLE ts(s varchar(550)); > OK > Time taken: 0.075 seconds > hive> INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0'); > MismatchedTokenException(14!=326) > at > org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617) > at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valueRowConstructor(HiveParser_FromClauseParser.java:7271) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesTableConstructor(HiveParser_FromClauseParser.java:7370) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesClause(HiveParser_FromClauseParser.java:7510) > at > org.apache.hadoop.hive.ql.parse.HiveParser.valuesClause(HiveParser.java:51854) > at > org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45432) > at > org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44578) > at > org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:8) > at > org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1694) > at > 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1176) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > FAILED: ParseException line 1:31 mismatched input '/' expecting ) near > 'Mozilla' in value row constructor > hive> > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
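The core idea of the fix for HIVE-15297 — a semicolon inside a quoted string literal is data, not a statement separator — can be modeled with a small quote-aware splitter. This is an illustrative sketch, not CliDriver's actual implementation:

```python
# Hypothetical sketch: split a command buffer into statements on ';',
# ignoring semicolons that appear inside single- or double-quoted literals.

def split_statements(buf):
    stmts, cur, quote = [], [], None
    i = 0
    while i < len(buf):
        ch = buf[i]
        if quote:
            cur.append(ch)
            if ch == "\\" and i + 1 < len(buf):   # keep escaped char verbatim
                cur.append(buf[i + 1]); i += 1
            elif ch == quote:                      # closing quote of same kind
                quote = None
        elif ch in "'\"":
            quote = ch; cur.append(ch)            # opening quote
        elif ch == ";":
            stmts.append("".join(cur).strip()); cur = []
        else:
            cur.append(ch)
        i += 1
    if "".join(cur).strip():
        stmts.append("".join(cur).strip())
    return stmts

print(split_statements("select 'a;b' from t; select 1"))
# ["select 'a;b' from t", 'select 1'] -- the quoted ';' is not a split point
```

A naive split on every semicolon would break the INSERT in the repro above ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0') into two malformed fragments, which is exactly the parse failure the stack trace shows.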
[jira] [Updated] (HIVE-15653) Some ALTER TABLE commands drop table stats
[ https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15653: --- Attachment: HIVE-15653.patch For most alter table operations, like table rename, add columns, change column type, etc. (besides setting table properties), the table stats status should not change. But for some other operations, like updating statistics or changing the location, the basic stats status should change. [~pxiong] could you review the patch? > Some ALTER TABLE commands drop table stats > -- > > Key: HIVE-15653 > URL: https://issues.apache.org/jira/browse/HIVE-15653 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Alexander Behm >Assignee: Chaoyu Tang >Priority: Critical > Attachments: HIVE-15653.patch > > > Some ALTER TABLE commands drop the table stats. That may make sense for some > ALTER TABLE operations, but certainly not for others. Personally, I think > ALTER TABLE should only change what was requested by the user without any > side effects that may be unclear to users. In particular, collecting stats > can be an expensive operation so it's rather inconvenient for users if they > get wiped accidentally. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15661) Add security error logging to LLAP
[ https://issues.apache.org/jira/browse/HIVE-15661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830761#comment-15830761 ] Jason Dere commented on HIVE-15661: --- +1 > Add security error logging to LLAP > -- > > Key: HIVE-15661 > URL: https://issues.apache.org/jira/browse/HIVE-15661 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15661.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15667) TestBlobstoreCliDriver tests are failing due to output differences
[ https://issues.apache.org/jira/browse/HIVE-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830703#comment-15830703 ] Thomas Poepping commented on HIVE-15667: LGTM, pending Jenkins run. Nonbinding +1 For those curious, like I was, it looks like HIVE-15297 made the change requiring this Jira. > TestBlobstoreCliDriver tests are failing due to output differences > -- > > Key: HIVE-15667 > URL: https://issues.apache.org/jira/browse/HIVE-15667 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 2.2.0 >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-15667.1.patch > > > All itests/hive-blobstore are failing and their .q.out files need to be > updated. > CC: [~poeppt] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15661) Add security error logging to LLAP
[ https://issues.apache.org/jira/browse/HIVE-15661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830786#comment-15830786 ] Sergey Shelukhin commented on HIVE-15661: - Tests are broken by HIVE-15297 > Add security error logging to LLAP > -- > > Key: HIVE-15661 > URL: https://issues.apache.org/jira/browse/HIVE-15661 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15661.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15653) Some ALTER TABLE commands drop table stats
[ https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830771#comment-15830771 ] Alexander Behm commented on HIVE-15653: --- Good to know, thanks. Regarding ALTER TABLE SET LOCATION: I suppose it's arguable. Imo, we should not add side effects to ALTER TABLE, especially when it comes to stats, because they are expensive to compute. This is more of a product question, so maybe [~grahn] or [~skumar] can weigh in. > Some ALTER TABLE commands drop table stats > -- > > Key: HIVE-15653 > URL: https://issues.apache.org/jira/browse/HIVE-15653 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Alexander Behm >Assignee: Chaoyu Tang >Priority: Critical > Attachments: HIVE-15653.patch > > > Some ALTER TABLE commands drop the table stats. That may make sense for some > ALTER TABLE operations, but certainly not for others. Personally, I think > ALTER TABLE should only change what was requested by the user without any > side effects that may be unclear to users. In particular, collecting stats > can be an expensive operation so it's rather inconvenient for users if they > get wiped accidentally. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15297) Hive should not split semicolon within quoted string literals
[ https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830785#comment-15830785 ] Sergey Shelukhin commented on HIVE-15297: - [~pxiong] this seems to have broken many tests under HiveQA. Can you please fix or revert? > Hive should not split semicolon within quoted string literals > - > > Key: HIVE-15297 > URL: https://issues.apache.org/jira/browse/HIVE-15297 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-15297.01.patch, HIVE-15297.02.patch, > HIVE-15297.03.patch, HIVE-15297.04.patch, HIVE-15297.05.patch > > > String literals in query cannot have reserved symbols. The same set of query > works fine in mysql and postgresql. > {code} > hive> CREATE TABLE ts(s varchar(550)); > OK > Time taken: 0.075 seconds > hive> INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0'); > MismatchedTokenException(14!=326) > at > org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617) > at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valueRowConstructor(HiveParser_FromClauseParser.java:7271) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesTableConstructor(HiveParser_FromClauseParser.java:7370) > at > org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesClause(HiveParser_FromClauseParser.java:7510) > at > org.apache.hadoop.hive.ql.parse.HiveParser.valuesClause(HiveParser.java:51854) > at > org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45432) > at > org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44578) > at > org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:8) > at > org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1694) > at > 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1176) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > FAILED: ParseException line 1:31 mismatched input '/' expecting ) near > 'Mozilla' in value row constructor > hive> > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15546) Optimize Utilities.getInputPaths() so each listStatus of a partition is done in parallel
[ https://issues.apache.org/jira/browse/HIVE-15546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830381#comment-15830381 ] Hive QA commented on HIVE-15546: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848161/HIVE-15546.5.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 82 failed/errored test(s), 10965 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=219) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=222) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=225) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139)
[jira] [Updated] (HIVE-15580) Eliminate unbounded memory usage for orderBy and groupBy in Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-15580: --- Description: Currently, orderBy (sortBy) and groupBy in Hive on Spark use unbounded memory. For orderBy, Hive accumulates key groups using an ArrayList (described in HIVE-15527). For groupBy, Hive currently uses Spark's groupByKey operator, which has the shortcoming of not being able to spill to disk within a key group. Thus, for a large key group, memory usage is also unbounded. It's likely that this will impact performance. We will profile and optimize afterwards. We could also make this change configurable. was:Currently, orderBy (sortBy) and groupBy in Hive on Spark uses unbounded memory. For orderBy, Hive accumulates key groups using ArrayList (described in HIVE-15527). For groupBy, Hive currently uses Spark's groupByKey operator, which has a shortcoming of not being able to spill to disk within a key group. Thus, for large key group, memory usage is also unbounded. > Eliminate unbounded memory usage for orderBy and groupBy in Hive on Spark > - > > Key: HIVE-15580 > URL: https://issues.apache.org/jira/browse/HIVE-15580 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-15580.1.patch, HIVE-15580.1.patch, > HIVE-15580.2.patch, HIVE-15580.2.patch, HIVE-15580.3.patch, > HIVE-15580.4.patch, HIVE-15580.5.patch, HIVE-15580.patch > > > Currently, orderBy (sortBy) and groupBy in Hive on Spark use unbounded > memory. For orderBy, Hive accumulates key groups using an ArrayList (described > in HIVE-15527). For groupBy, Hive currently uses Spark's groupByKey operator, > which has the shortcoming of not being able to spill to disk within a key > group. Thus, for a large key group, memory usage is also unbounded. > It's likely that this will impact performance. We will profile and optimize > afterwards. We could also make this change configurable. 
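The memory problem HIVE-15580 describes comes from materializing an entire key group before aggregating it. A sort-based alternative holds only one running accumulator per group; the sketch below is illustrative Python, not the Hive-on-Spark operator code:

```python
from itertools import groupby

def streaming_group_aggregate(sorted_pairs, combine, zero):
    """Aggregate (key, value) pairs that arrive sorted by key.

    Sketch of why sort-based aggregation bounds memory where Spark's
    groupByKey does not: each value is folded into a single accumulator
    as it streams by, instead of the whole group being buffered in an
    ArrayList. Function names are illustrative, not Hive-on-Spark APIs.
    """
    for key, group in groupby(sorted_pairs, key=lambda kv: kv[0]):
        acc = zero
        for _, value in group:
            # Constant memory per group: fold, don't collect.
            acc = combine(acc, value)
        yield key, acc
```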
[jira] [Commented] (HIVE-10562) Add version column to NOTIFICATION_LOG table and DbNotificationListener
[ https://issues.apache.org/jira/browse/HIVE-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830481#comment-15830481 ] Hive QA commented on HIVE-10562: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848165/HIVE-10562.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 81 failed/errored test(s), 10949 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=122) [auto_sortmerge_join_13.q,join4.q,join35.q,udf_percentile.q,join_reorder3.q,subquery_in.q,auto_join19.q,stats14.q,vectorization_15.q,union7.q,vectorization_nested_udf.q,vector_groupby_3.q,vectorized_ptf.q,auto_join2.q,groupby1_map_skew.q] org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=219) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=231) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=222) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=225) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135)
[jira] [Updated] (HIVE-15668) change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword
[ https://issues.apache.org/jira/browse/HIVE-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-15668: Attachment: HIVE-15668.patch Attaching patch. > change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword > - > > Key: HIVE-15668 > URL: https://issues.apache.org/jira/browse/HIVE-15668 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-15668.patch > > > Currently, REPL DUMP syntax goes: > {noformat} > REPL DUMP [[.]] [FROM [BATCH ]] > {noformat} > The BATCH directive says that, when doing an event dump, no more than > _batchSize_ events should be dumped. However, there is a clearer keyword for > the same effect, and that is LIMIT. Thus, rephrasing the syntax as follows > makes it clearer: > {noformat} > REPL DUMP [[.]] [FROM [LIMIT ]] > {noformat}
[jira] [Commented] (HIVE-15586) Make Insert and Create statement Transactional
[ https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830650#comment-15830650 ] Hive QA commented on HIVE-15586: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848167/HIVE-15586.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 81 failed/errored test(s), 10964 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=219) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=222) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=225) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139)
[jira] [Assigned] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar
[ https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra reassigned HIVE-15036: - Assignee: slim bouguerra (was: Jesus Camacho Rodriguez) > Druid code recently included in Hive pulls in GPL jar > - > > Key: HIVE-15036 > URL: https://issues.apache.org/jira/browse/HIVE-15036 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Alan Gates >Assignee: slim bouguerra >Priority: Blocker > > Druid pulls in a jar annotation-2.3.jar. According to its pom file it is > licensed under GPL. We cannot ship a binary distribution that includes this > jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar
[ https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-15036: -- Status: Patch Available (was: Open) > Druid code recently included in Hive pulls in GPL jar > - > > Key: HIVE-15036 > URL: https://issues.apache.org/jira/browse/HIVE-15036 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Alan Gates >Assignee: slim bouguerra >Priority: Blocker > > Druid pulls in a jar annotation-2.3.jar. According to its pom file it is > licensed under GPL. We cannot ship a binary distribution that includes this > jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar
[ https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-15036: -- Attachment: HIVE-15036.patch Excluded the concerned jar and added a banning plugin to enforce the rule even on transitive dependencies. > Druid code recently included in Hive pulls in GPL jar > - > > Key: HIVE-15036 > URL: https://issues.apache.org/jira/browse/HIVE-15036 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Alan Gates >Assignee: slim bouguerra >Priority: Blocker > Attachments: HIVE-15036.patch > > > Druid pulls in a jar annotation-2.3.jar. According to its pom file it is > licensed under GPL. We cannot ship a binary distribution that includes this > jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar
[ https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830510#comment-15830510 ] slim bouguerra commented on HIVE-15036: --- [~alangates] Thanks very much; please check this patch. As an add-on, I have added a banning plugin to enforce the rule over the entire project; this will allow it to catch transitive dependencies. http://maven.apache.org/enforcer/enforcer-rules/bannedDependencies.html > Druid code recently included in Hive pulls in GPL jar > - > > Key: HIVE-15036 > URL: https://issues.apache.org/jira/browse/HIVE-15036 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Alan Gates >Assignee: slim bouguerra >Priority: Blocker > Attachments: HIVE-15036.patch > > > Druid pulls in a jar annotation-2.3.jar. According to its pom file it is > licensed under GPL. We cannot ship a binary distribution that includes this > jar.
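For reference, a banning rule of the kind described can be sketched with the maven-enforcer-plugin's bannedDependencies rule. The banned coordinates below (javax.annotation:annotation, for annotation-2.3.jar) are an assumption for illustration, not copied from the actual HIVE-15036 patch:

```xml
<!-- Sketch: ban a GPL-licensed jar project-wide, including when it
     arrives as a transitive dependency. Coordinates are assumed. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>enforce-banned-dependencies</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <bannedDependencies>
            <excludes>
              <!-- assumed coordinates of annotation-2.3.jar -->
              <exclude>javax.annotation:annotation</exclude>
            </excludes>
            <!-- also fail the build on transitive pulls of the jar -->
            <searchTransitive>true</searchTransitive>
          </bannedDependencies>
        </rules>
        <failOnError>true</failOnError>
      </configuration>
    </execution>
  </executions>
</plugin>
```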
[jira] [Updated] (HIVE-14707) ACID: Insert shuffle sort-merges on blank KEY
[ https://issues.apache.org/jira/browse/HIVE-14707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14707: -- Attachment: HIVE-14707.24.patch > ACID: Insert shuffle sort-merges on blank KEY > - > > Key: HIVE-14707 > URL: https://issues.apache.org/jira/browse/HIVE-14707 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Eugene Koifman > Attachments: HIVE-14707.01.patch, HIVE-14707.02.patch, > HIVE-14707.03.patch, HIVE-14707.04.patch, HIVE-14707.05.patch, > HIVE-14707.06.patch, HIVE-14707.08.patch, HIVE-14707.09.patch, > HIVE-14707.10.patch, HIVE-14707.11.patch, HIVE-14707.13.patch, > HIVE-14707.14.patch, HIVE-14707.16.patch, HIVE-14707.17.patch, > HIVE-14707.18.patch, HIVE-14707.19.patch, HIVE-14707.19.patch, > HIVE-14707.20.patch, HIVE-14707.21.patch, HIVE-14707.22.patch, > HIVE-14707.23.patch, HIVE-14707.24.patch > > > The ACID insert codepath uses a sorted shuffle, while the key used for > the shuffle is always 0 bytes long. > {code} > hive (sales_acid)> explain insert into sales values(1, 2, > '3400---009', 1, null); > STAGE PLANS: > Stage: Stage-1 > Tez > DagId: gopal_20160906172626_80261c4c-79cc-4e02-87fe-3133be404e55:2 > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: values__tmp__table__2 > Statistics: Num rows: 1 Data size: 28 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: tmp_values_col1 (type: string), > tmp_values_col2 (type: string), tmp_values_col3 (type: string), > tmp_values_col4 (type: string), tmp_values_col5 (type: string) > outputColumnNames: _col0, _col1, _col2, _col3, _col4 > Statistics: Num rows: 1 Data size: 28 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Map-reduce partition columns: UDFToLong(_col1) (type: > bigint) > Statistics: Num rows: 1 Data size: 28 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string), _col1 (type: > string), _col2 (type: string), _col3 (type: string), _col4 (type: string) > Execution mode: vectorized, llap > LLAP IO: no inputs > {code} > Note the missing "+" / "-" in the Sort Order fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14949) Enforce that target:source is not 1:N
[ https://issues.apache.org/jira/browse/HIVE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14949: -- Attachment: HIVE-14949.01.patch > Enforce that target:source is not 1:N > - > > Key: HIVE-14949 > URL: https://issues.apache.org/jira/browse/HIVE-14949 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-14949.01.patch > > > If > 1 row on source side matches the same row on target side that means that > we are forced to update (or delete) the same row in target more than once as > part of the same SQL statement. This should raise an error per SQL Spec > ISO/IEC 9075-2:2011(E) > Section 14.2 under "General Rules" Item 6/Subitem a/Subitem 2/Subitem B > There is no sure way to do this via static analysis of the query. > Can we add something to ROJ operator to pay attention to ROW__ID of target > side row and compare it with ROW__ID of target side of previous row output? > If they are the same, that means > 1 source row matched. > Or perhaps just mark each row in the hash table that it matched. And if it > matches again, throw an error.
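The last idea in the HIVE-14949 description (mark each target row in the hash table once it matches, and raise an error on a second match) can be sketched as follows. The names here (`row_id` standing in for Hive's ROW__ID) are illustrative, not Hive's operator code:

```python
def join_target_to_source(target_rows, source_rows, key):
    """Hash-join target rows to source rows, failing on a 1:N match.

    Sketch of the 'mark matched rows' idea from HIVE-14949: in a MERGE,
    a target row may be updated/deleted at most once, so a second source
    row matching the same target row is an error per SQL:2011 14.2.
    """
    # Build the hash table on the target side.
    table = {}
    for row in target_rows:
        table.setdefault(key(row), []).append(row)
    matched_ids = set()
    output = []
    for src in source_rows:
        for tgt in table.get(key(src), []):
            row_id = tgt["row_id"]
            if row_id in matched_ids:
                # Second match on the same target row: 1:N detected.
                raise ValueError(
                    "target row %r matched more than one source row" % (row_id,))
            matched_ids.add(row_id)
            output.append((tgt, src))
    return output
```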
[jira] [Updated] (HIVE-14949) Enforce that target:source is not 1:N
[ https://issues.apache.org/jira/browse/HIVE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14949: -- Status: Patch Available (was: Open) > Enforce that target:source is not 1:N > - > > Key: HIVE-14949 > URL: https://issues.apache.org/jira/browse/HIVE-14949 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-14949.01.patch > > > If > 1 row on source side matches the same row on target side that means that > we are forced to update (or delete) the same row in target more than once as > part of the same SQL statement. This should raise an error per SQL Spec > ISO/IEC 9075-2:2011(E) > Section 14.2 under "General Rules" Item 6/Subitem a/Subitem 2/Subitem B > There is no sure way to do this via static analysis of the query. > Can we add something to ROJ operator to pay attention to ROW__ID of target > side row and compare it with ROW__ID of target side of previous row output? > If they are the same, that means > 1 source row matched. > Or perhaps just mark each row in the hash table that it matched. And if it > matches again, throw an error.
[jira] [Commented] (HIVE-15623) Use customized version of netty for llap
[ https://issues.apache.org/jira/browse/HIVE-15623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830569#comment-15830569 ] Hive QA commented on HIVE-15623: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848164/HIVE-15623.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10964 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=219) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=222) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=225) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.
[ https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-15439: -- Attachment: HIVE-15439.2.patch > Support INSERT OVERWRITE for internal druid datasources. > > > Key: HIVE-15439 > URL: https://issues.apache.org/jira/browse/HIVE-15439 > Project: Hive > Issue Type: Sub-task > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: slim bouguerra >Assignee: slim bouguerra > Attachments: HIVE-15439.2.patch, HIVE-15439.patch, HIVE-15439.patch, > HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, > HIVE-15439.patch, HIVE-15439.patch > > > Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table. > In order to add this support, we will need to add a new post-insert hook to update > the Druid metadata. Creation of the segment will be the same as for CTAS.
[jira] [Updated] (HIVE-15586) Make Insert and Create statement Transactional
[ https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-15586: -- Attachment: HIVE-15586.2.patch > Make Insert and Create statement Transactional > -- > > Key: HIVE-15586 > URL: https://issues.apache.org/jira/browse/HIVE-15586 > Project: Hive > Issue Type: Sub-task > Components: Druid integration >Reporter: slim bouguerra >Assignee: slim bouguerra > Attachments: HIVE-15586.2.patch, HIVE-15586.patch, HIVE-15586.patch, > HIVE-15586.patch > > > Currently insert/create will return the handle to the user without waiting for > the data to be loaded by the Druid cluster. To avoid that, we will add a > passive wait until the segments are loaded by the historical nodes, in case the > coordinator is up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar
[ https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830590#comment-15830590 ] Alan Gates commented on HIVE-15036: --- +1 > Druid code recently included in Hive pulls in GPL jar > - > > Key: HIVE-15036 > URL: https://issues.apache.org/jira/browse/HIVE-15036 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Alan Gates >Assignee: slim bouguerra >Priority: Blocker > Attachments: HIVE-15036.patch > > > Druid pulls in a jar annotation-2.3.jar. According to its pom file it is > licensed under GPL. We cannot ship a binary distribution that includes this > jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8244) INSERT/UPDATE/DELETE should return count of rows affected
[ https://issues.apache.org/jira/browse/HIVE-8244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830618#comment-15830618 ] Kumar commented on HIVE-8244: - Hello, is this request being considered as an enhancement at all? The Hive CLI provides only a crude way to identify whether rows have been added for an INSERT, for example. > INSERT/UPDATE/DELETE should return count of rows affected > - > > Key: HIVE-8244 > URL: https://issues.apache.org/jira/browse/HIVE-8244 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 0.14.0 >Reporter: Eugene Koifman > > it's common in SQL and the JDBC > [API|http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeUpdate(java.lang.String)] > to return the count of affected rows. > Hive should do the same (it does not as of 9/24/2014) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15147) LLAP: use LLAP cache for non-columnar formats in a somewhat general way
[ https://issues.apache.org/jira/browse/HIVE-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15147: Fix Version/s: 2.2.0 > LLAP: use LLAP cache for non-columnar formats in a somewhat general way > --- > > Key: HIVE-15147 > URL: https://issues.apache.org/jira/browse/HIVE-15147 > Project: Hive > Issue Type: New Feature >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-15147.01.patch, HIVE-15147.patch, > HIVE-15147.WIP.noout.patch, perf-top-cache.png, pre-cache.svg, > writerimpl-addrow.png > > > The primary goal for the first pass is caching text files. Nothing would > prevent other formats from using the same path, in principle, although, as > was originally done with ORC, it may be better to have native caching support > optimized for each particular format. > Given that caching pure text is not smart, and we already have ORC-encoded > cache that is columnar due to ORC file structure, we will transform data into > columnar ORC. > The general idea is to treat all the data in the world as merely ORC that was > compressed with some poor compression codec, such as csv. Using the original > IF and serde, as well as an ORC writer (with some heavyweight optimizations > disabled, potentially), we can "uncompress" the csv/whatever data into its > "original" ORC representation, then cache it efficiently, by column, and also > reuse a lot of the existing code. > Various other points: > 1) Caching granularity will have to be somehow determined (i.e. how do we > slice the file horizontally, to avoid caching entire columns). As with ORC > uncompressed files, the specific offsets don't really matter as long as they > are consistent between reads. The problem is that the file offsets will > actually need to be propagated to the new reader from the original > inputformat. 
Row counts are easier to use but there's a problem of how to > actually map them to missing ranges to read from disk. > 2) Obviously, for row-based formats, if any one column that is to be read has > been evicted or is otherwise missing, "all the columns" have to be read for > the corresponding slice to cache and read that one column. The vague plan is > to handle this implicitly, similarly to how ORC reader handles CB-RG overlaps > - it will just so happen that a missing column in disk range list to retrieve > will expand the disk-range-to-read into the whole horizontal slice of the > file. > 3) Granularity/etc. won't work for gzipped text. If anything at all is > evicted, the entire file has to be re-read. Gzipped text is a ridiculous > feature, so this is by design. > 4) In future, it would be possible to also build some form or > metadata/indexes for this cached data to do PPD, etc. This is out of the > scope for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
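Point (2) above can be illustrated with a toy column cache. This is a minimal sketch under stated assumptions: `SliceCache`, `loadSlice`, and all other names here are hypothetical, not LLAP's actual API. The idea it demonstrates is only that, for a row-based format, a single missing column forces re-reading the entire horizontal slice.

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical sketch (not Hive/LLAP code): row-format data cached by column,
// where one evicted column expands the read into the whole slice.
public class SliceCache {
    // sliceId -> (columnName -> that column's data for the slice)
    private final Map<Integer, Map<String, List<String>>> cache = new HashMap<>();

    public void put(int sliceId, Map<String, List<String>> columns) {
        cache.put(sliceId, new HashMap<>(columns));  // defensive copy
    }

    public void evictColumn(int sliceId, String column) {
        Map<String, List<String>> slice = cache.get(sliceId);
        if (slice != null) slice.remove(column);
    }

    /** Returns the requested columns, re-reading the entire slice if any one is missing. */
    public Map<String, List<String>> read(int sliceId, List<String> wanted,
            Function<Integer, Map<String, List<String>>> loadSlice) {
        Map<String, List<String>> slice = cache.get(sliceId);
        boolean miss = slice == null || !slice.keySet().containsAll(wanted);
        if (miss) {
            // Row format: a single column cannot be read in isolation,
            // so the whole slice (all columns) is reloaded and re-cached.
            slice = loadSlice.apply(sliceId);
            cache.put(sliceId, new HashMap<>(slice));
        }
        Map<String, List<String>> result = new HashMap<>();
        for (String c : wanted) result.put(c, slice.get(c));
        return result;
    }
}
```

The real mechanism described in the comment works implicitly through disk-range expansion in the ORC reader, not through an explicit cache class like this.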
[jira] [Commented] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage
[ https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830629#comment-15830629 ] Sergey Shelukhin commented on HIVE-15665: - There's JIRA somewhere to move metadata cache off-heap, using the same allocator and storing ORC buffers as is (or concatenated). That should be pretty simple design-wise. > LLAP: OrcFileMetadata objects in cache can impact heap usage > > > Key: HIVE-15665 > URL: https://issues.apache.org/jira/browse/HIVE-15665 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Rajesh Balamohan > > OrcFileMetadata internally has filestats, stripestats etc which are allocated > in heap. On large data sets, this could have an impact on the heap usage and > the memory usage by different executors in LLAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15546) Optimize Utilities.getInputPaths() so each listStatus of a partition is done in parallel
[ https://issues.apache.org/jira/browse/HIVE-15546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830635#comment-15830635 ] Sergey Shelukhin commented on HIVE-15546: - [~stakiar] wrt listStatus, when NullScanFS is used, we run listStatus on that, which is a no-op. We do need to examine the FS during split generation, that is unavoidable. > Optimize Utilities.getInputPaths() so each listStatus of a partition is done > in parallel > > > Key: HIVE-15546 > URL: https://issues.apache.org/jira/browse/HIVE-15546 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15546.1.patch, HIVE-15546.2.patch, > HIVE-15546.3.patch, HIVE-15546.4.patch, HIVE-15546.5.patch > > > When running on blobstores (like S3) where metadata operations (like > listStatus) are costly, Utilities.getInputPaths() can add significant > overhead when setting up the input paths for an MR / Spark / Tez job. > The method performs a listStatus on all input paths in order to check if the > path is empty. If the path is empty, a dummy file is created for the given > partition. This is all done sequentially. This can be really slow when there > are a lot of empty partitions. Even when all partitions have input data, this > can take a long time. > We should either: > (1) Just remove the logic to check if each input path is empty, and handle > any edge cases accordingly. > (2) Multi-thread the listStatus calls -- This message was sent by Atlassian JIRA (v6.3.4#6332)
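Option (2) above, multi-threading the per-partition emptiness checks, can be sketched as follows. This is an illustration only, not the actual `Utilities.getInputPaths()` code: the `Predicate` stands in for a real `FileSystem.listStatus(path)` call, and the thread count is an arbitrary example value.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Predicate;

// Sketch: run the costly per-path check (listStatus on a blobstore) in
// parallel instead of sequentially. Names here are illustrative.
public class ParallelPathCheck {
    public static Map<String, Boolean> findEmptyPaths(List<String> paths,
            Predicate<String> isEmpty, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Boolean>> futures = new LinkedHashMap<>();
            for (String p : paths) {
                // One "listStatus" per task; all tasks run concurrently.
                futures.put(p, pool.submit(() -> isEmpty.test(p)));
            }
            Map<String, Boolean> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Boolean>> e : futures.entrySet()) {
                result.put(e.getKey(), e.getValue().get());  // collect in input order
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

With many empty partitions, total latency drops from (paths × per-call latency) to roughly (paths / threads × per-call latency), which is exactly the win the JIRA describes for S3-like stores.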
[jira] [Commented] (HIVE-15541) Hive OOM when ATSHook enabled and ATS goes down
[ https://issues.apache.org/jira/browse/HIVE-15541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830639#comment-15830639 ] Sergey Shelukhin commented on HIVE-15541: - fireAndForget does not need conf. Not sure if Java is smart enough to remove it from the Runnable capture; I hope so (since it's not final and it works anyway), but removing it would be harmless. +1, can be fixed on commit. > Hive OOM when ATSHook enabled and ATS goes down > --- > > Key: HIVE-15541 > URL: https://issues.apache.org/jira/browse/HIVE-15541 > Project: Hive > Issue Type: Bug > Components: Hooks >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15541.1.patch, HIVE-15541.2.patch, > HIVE-15541.3.patch, HIVE-15541.4.patch > > > The ATS API used by the Hive ATSHook is a blocking call; if ATS goes down > this can block the ATSHook executor, while the hook continues to submit work > to the executor with each query. > Over time the buildup of queued items can cause OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
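The failure mode above (an unbounded executor queue growing while the downstream call blocks) has a standard mitigation: give the executor a bounded queue and a rejection policy that drops or counts overflow instead of accumulating it. The sketch below assumes illustrative values; the capacity, pool size, and drop policy are not the actual values chosen in the HIVE-15541 patch.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a hook executor whose queue cannot grow without limit while the
// downstream service (e.g. ATS) blocks the single worker thread.
// Capacity 4 and "count-and-drop" are illustrative choices.
public class BoundedHookExecutor {
    final AtomicInteger dropped = new AtomicInteger();
    final ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<>(4),                // bounded queue: no OOM from buildup
        (r, e) -> dropped.incrementAndGet());        // overflow is dropped, not queued

    public void submit(Runnable event) {
        pool.execute(event);
    }
}
```

Dropping hook events is lossy, so a real implementation would also log the drop; the point is that bounded loss is preferable to an unbounded heap.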
[jira] [Updated] (HIVE-15661) Add security error logging to LLAP
[ https://issues.apache.org/jira/browse/HIVE-15661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15661: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master; thanks for the review! > Add security error logging to LLAP > -- > > Key: HIVE-15661 > URL: https://issues.apache.org/jira/browse/HIVE-15661 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.2.0 > > Attachments: HIVE-15661.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15544) Support scalar subqueries
[ https://issues.apache.org/jira/browse/HIVE-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-15544: --- Status: Open (was: Patch Available) > Support scalar subqueries > - > > Key: HIVE-15544 > URL: https://issues.apache.org/jira/browse/HIVE-15544 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Vineet Garg >Assignee: Vineet Garg > Labels: sub-query > Attachments: HIVE-15544.1.patch, HIVE-15544.2.patch, > HIVE-15544.3.patch, HIVE-15544.4.patch, HIVE-15544.5.patch > > > Currently HIVE only support IN/EXISTS/NOT IN/NOT EXISTS subqueries. HIVE > doesn't allow sub-queries such as: > {code} > explain select a.ca_state state, count(*) cnt > from customer_address a > ,customer c > ,store_sales s > ,date_dim d > ,item i > where a.ca_address_sk = c.c_current_addr_sk > and c.c_customer_sk = s.ss_customer_sk > and s.ss_sold_date_sk = d.d_date_sk > and s.ss_item_sk = i.i_item_sk > and d.d_month_seq = >(select distinct (d_month_seq) > from date_dim >where d_year = 2000 > and d_moy = 2 ) > and i.i_current_price > 1.2 * > (select avg(j.i_current_price) >from item j >where j.i_category = i.i_category) > group by a.ca_state > having count(*) >= 10 > order by cnt > limit 100; > {code} > We initially plan to support such scalar subqueries in filter i.e. WHERE and > HAVING -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15544) Support scalar subqueries
[ https://issues.apache.org/jira/browse/HIVE-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-15544: --- Attachment: HIVE-15544.5.patch Addressed review comments. > Support scalar subqueries > - > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read
[ https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15649: Status: Patch Available (was: Open) > LLAP IO may NPE on all-column read > -- > > Key: HIVE-15649 > URL: https://issues.apache.org/jira/browse/HIVE-15649 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15649.01.patch, HIVE-15649.02.patch, > HIVE-15649.patch > > > It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP > IO doesn't account for that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15544) Support scalar subqueries
[ https://issues.apache.org/jira/browse/HIVE-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-15544: --- Status: Patch Available (was: Open) > Support scalar subqueries > - > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read
[ https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15649: Attachment: HIVE-15649.02.patch Rebased the patch > LLAP IO may NPE on all-column read > -- > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15544) Support scalar subqueries
[ https://issues.apache.org/jira/browse/HIVE-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830870#comment-15830870 ] Ashutosh Chauhan commented on HIVE-15544: - +1 > Support scalar subqueries > - > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15651) LLAP: llap status tool enhancements
[ https://issues.apache.org/jira/browse/HIVE-15651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830871#comment-15830871 ] Prasanth Jayachandran commented on HIVE-15651: -- That makes sense. I also don't see any use in supporting all states. I am updating the patch to support watch mode only for running states (no value needs to be specified for the state). Also adding another option for a threshold. So when watch mode is enabled, the default behaviour is to wait for all nodes to come up (threshold 1.0). When a threshold is explicitly specified, it will break when the threshold is met. Will post a patch shortly after testing. > LLAP: llap status tool enhancements > --- > > Key: HIVE-15651 > URL: https://issues.apache.org/jira/browse/HIVE-15651 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-15651.1.patch > > > Per [~sseth], the following enhancements can be made to the llap status tool > 1) If state changes from an ACTIVE state to STOPPED - terminate the script > immediately (fail fast) > 2) Add a threshold of what is acceptable in terms of the running state - > RUNNING_PARTIAL may be OK if 80% of nodes are up, for example. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
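The watch-mode behaviour described in the comment reduces to a polling loop that stops once the live-node fraction reaches a threshold (default 1.0, i.e. wait for all nodes). The sketch below is an illustration of that logic under assumed names; it is not the llapstatus tool's actual implementation, and the polling interval and retry bound are arbitrary.

```java
import java.util.function.IntSupplier;

// Sketch: poll a live-node count until live/desired >= threshold,
// or give up after maxPolls attempts. All names are illustrative.
public class WatchUntilThreshold {
    public static int watch(IntSupplier liveNodes, int desiredNodes,
            double threshold, long pollMillis, int maxPolls) throws InterruptedException {
        for (int i = 0; i < maxPolls; i++) {
            int live = liveNodes.getAsInt();
            if ((double) live / desiredNodes >= threshold) {
                return live;  // threshold met: stop watching
            }
            Thread.sleep(pollMillis);
        }
        return -1;  // gave up before the threshold was met
    }
}
```

With threshold 1.0 this waits for every node; with, say, 0.8 it returns as soon as 80% of the desired nodes report up, matching the RUNNING_PARTIAL acceptance case in the issue description.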
[jira] [Updated] (HIVE-15622) Remove HWI component from Hive
[ https://issues.apache.org/jira/browse/HIVE-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-15622: - Attachment: HIVE-15622.3.patch > Remove HWI component from Hive > -- > > Key: HIVE-15622 > URL: https://issues.apache.org/jira/browse/HIVE-15622 > Project: Hive > Issue Type: Task > Components: Web UI >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15622.1.patch, HIVE-15622.2.patch, > HIVE-15622.3.patch > > > This component seems to be obsolete, as it didn't get any meaningful update > since 2012. And we don't see people discussing or complaining issues about > this. Moreover, it caused a number of ptest issues which can be avoided. > We should remove this component as a cleanup effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15666) Select query with view adds base table partition as direct input in spark engine
[ https://issues.apache.org/jira/browse/HIVE-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830909#comment-15830909 ] Niklaus Xiao commented on HIVE-15666: - Yes, MR engine works fine. > Select query with view adds base table partition as direct input in spark > engine > > > Key: HIVE-15666 > URL: https://issues.apache.org/jira/browse/HIVE-15666 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.3.0 >Reporter: Niklaus Xiao >Assignee: Aihua Xu > Attachments: TestViewEntityInSparkEngine.patch > > > repro steps: > {code} > set hive.execution.engine=spark; > create table base(id int) partitioned by (dt string); > alter table base add partition(dt='2017'); > create view view1 as select * from base where id < 10; > select * from view1; > {code} > it requires access not only to view1 but also to the base@dt=2017 > partition, which should not be required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15621) Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP
[ https://issues.apache.org/jira/browse/HIVE-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830931#comment-15830931 ] Wei Zheng commented on HIVE-15621: -- offset_limit_ppd_optimizer is the only test failure with age==1, which runs fine locally. [~sershe] Can you take another look please? > Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP > -- > > Key: HIVE-15621 > URL: https://issues.apache.org/jira/browse/HIVE-15621 > Project: Hive > Issue Type: Task > Components: llap >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15621.1.patch, HIVE-15621.2.patch, > HIVE-15621.3.patch, HIVE-15621.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14083) ALTER INDEX in Tez causes NullPointerException
[ https://issues.apache.org/jira/browse/HIVE-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830811#comment-15830811 ] Thomas Poepping edited comment on HIVE-14083 at 1/20/17 1:01 AM: - Sorry to jump on this so late, but why are we okay with accepting that indexing just doesn't work? Are there no follow-up Jira issues to attack the problem and work on supporting indexes with Tez? was (Author: poeppt): Sorry to jump on this so late, buy why are we okay with accepting that indexing just doesn't work? Are there no follow-up Jira issues to attack the problem and work on supporting indexes with Tez? > ALTER INDEX in Tez causes NullPointerException > -- > > Key: HIVE-14083 > URL: https://issues.apache.org/jira/browse/HIVE-14083 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14083.patch > > > ALTER INDEX causes a NullPointerException when run under TEZ execution > engine. Query runs without issue when submitted using MR execution mode. > To reproduce: > 1. CREATE INDEX sample_08_index ON TABLE sample_08 (code) AS 'COMPACT' WITH > DEFERRED REBUILD; > 2. 
ALTER INDEX sample_08_index ON sample_08 REBUILD; > *Stacktrace from Hive 1.2.1* > {code:java} > ERROR : Vertex failed, vertexName=Map 1, > vertexId=vertex_1460577396252_0005_1_00, diagnostics=[Task failed, > taskId=task_1460577396252_0005_1_00_00, diagnostics=[TaskAttempt 0 > failed, info=[Error: Failure while running task:java.lang.RuntimeException: > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196) > at > 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.(TezGroupedSplitsInputFormat.java:135) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80) > at > org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:650) > at > org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:621) > at > org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145) > at > org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:390) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:128) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > ... 14 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:269) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:233) > at >
[jira] [Updated] (HIVE-15269) Dynamic Min-Max runtime-filtering for Tez
[ https://issues.apache.org/jira/browse/HIVE-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-15269: -- Attachment: HIVE-15269.15.patch > Dynamic Min-Max runtime-filtering for Tez > - > > Key: HIVE-15269 > URL: https://issues.apache.org/jira/browse/HIVE-15269 > Project: Hive > Issue Type: New Feature >Reporter: Jason Dere >Assignee: Deepak Jaiswal > Attachments: HIVE-15269.10.patch, HIVE-15269.11.patch, > HIVE-15269.12.patch, HIVE-15269.13.patch, HIVE-15269.14.patch, > HIVE-15269.15.patch, HIVE-15269.1.patch, HIVE-15269.2.patch, > HIVE-15269.3.patch, HIVE-15269.4.patch, HIVE-15269.5.patch, > HIVE-15269.6.patch, HIVE-15269.7.patch, HIVE-15269.8.patch, HIVE-15269.9.patch > > > If a dimension table and fact table are joined: > {noformat} > select * > from store join store_sales on (store.id = store_sales.store_id) > where store.s_store_name = 'My Store' > {noformat} > One optimization that can be done is to get the min/max store id values that > come out of the scan/filter of the store table, and send this min/max value > (via Tez edge) to the task which is scanning the store_sales table. > We can add a BETWEEN(min, max) predicate to the store_sales TableScan, where > this predicate can be pushed down to the storage handler (for example for ORC > formats). Pushing a min/max predicate to the ORC reader would allow us to > avoid having to read entire row groups during the table scan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
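The optimization in the issue description can be sketched with plain collections standing in for the Tez edge and the ORC reader's predicate pushdown. This is an illustration of the idea only, not Hive's implementation: compute min/max of the join key on the filtered dimension side, then apply a BETWEEN(min, max) predicate on the fact side before the join.

```java
import java.util.*;
import java.util.stream.*;

// Sketch: dimension-side min/max of the join key filters the fact scan.
// Plain lists stand in for the store / store_sales tables.
public class MinMaxRuntimeFilter {
    public static List<int[]> filteredFactRows(List<Integer> dimStoreIds,
            List<int[]> factRows /* each row: [storeId, amount] */) {
        IntSummaryStatistics stats = dimStoreIds.stream()
            .mapToInt(Integer::intValue).summaryStatistics();
        int min = stats.getMin(), max = stats.getMax();   // "sent" over the Tez edge
        return factRows.stream()
            .filter(r -> r[0] >= min && r[0] <= max)      // BETWEEN(min, max) on the fact scan
            .collect(Collectors.toList());
    }
}
```

In the real feature the BETWEEN predicate is pushed to the ORC reader, so whole row groups whose min/max statistics fall outside the range are skipped without being read at all.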
[jira] [Commented] (HIVE-14083) ALTER INDEX in Tez causes NullPointerException
[ https://issues.apache.org/jira/browse/HIVE-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830811#comment-15830811 ] Thomas Poepping commented on HIVE-14083: Sorry to jump on this so late, but why are we okay with accepting that indexing just doesn't work? Are there no follow-up Jira issues to attack the problem and work on supporting indexes with Tez? > ALTER INDEX in Tez causes NullPointerException > -- > > Key: HIVE-14083 > URL: https://issues.apache.org/jira/browse/HIVE-14083 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14083.patch > > > ALTER INDEX causes a NullPointerException when run under TEZ execution > engine. Query runs without issue when submitted using MR execution mode. > To reproduce: > 1. CREATE INDEX sample_08_index ON TABLE sample_08 (code) AS 'COMPACT' WITH > DEFERRED REBUILD; > 2. ALTER INDEX sample_08_index ON sample_08 REBUILD; > *Stacktrace from Hive 1.2.1* > {code:java} > ERROR : Vertex failed, vertexName=Map 1, > vertexId=vertex_1460577396252_0005_1_00, diagnostics=[Task failed, > taskId=task_1460577396252_0005_1_00_00, diagnostics=[TaskAttempt 0 > failed, info=[Error: Failure while running task:java.lang.RuntimeException: > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.(TezGroupedSplitsInputFormat.java:135) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80) > at > org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:650) > at > org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:621) > at > org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145) > at > org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:390) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:128) > at > 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) > ... 14 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:269) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:233) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193) > ... 25 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15662) check startTime in SparkTask to make sure startTime is not less than submitTime
[ https://issues.apache.org/jira/browse/HIVE-15662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830836#comment-15830836 ] Hive QA commented on HIVE-15662: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848175/HIVE-15662.000.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 78 failed/errored test(s), 10964 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=219) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=222) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=225) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) 
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
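The HIVE-15662 patch above enforces a timing invariant: a task's recorded start time should never precede its submit time, otherwise clock skew or reordered events can yield a negative queue time. A minimal sketch of the clamping idea; the class and field names below are hypothetical illustrations, not Hive's actual SparkTask API:

```python
class TaskTimes:
    """Illustrative sketch: clamp startTime so it is never less than
    submitTime (the invariant HIVE-15662 checks). Names are hypothetical,
    not Hive's actual SparkTask fields."""

    def __init__(self, submit_time_ms):
        self.submit_time_ms = submit_time_ms
        self.start_time_ms = None

    def record_start(self, reported_start_ms):
        # Guard against clock skew / reordered events: never let the
        # recorded start time fall below the submit time.
        self.start_time_ms = max(reported_start_ms, self.submit_time_ms)

    def queue_time_ms(self):
        # With the clamp above, queue time can never be negative.
        return self.start_time_ms - self.submit_time_ms
```

With the clamp, a start event reported 100 ms "before" submission is recorded as a zero-length queue wait instead of a negative one.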
[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read
[ https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15649: Status: Open (was: Patch Available) > LLAP IO may NPE on all-column read > -- > > Key: HIVE-15649 > URL: https://issues.apache.org/jira/browse/HIVE-15649 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15649.01.patch, HIVE-15649.patch > > > It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP > IO doesn't account for that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
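The HIVE-15649 description points at a classic failure mode: when a caller requests all columns, the explicit "included columns" setting may be absent, and code that unconditionally dereferences that list throws an NPE. A hedged sketch of the null-safe pattern; the config keys below are illustrative stand-ins, not LLAP's actual property names:

```python
def resolve_included_columns(conf, schema_width):
    """Return the list of column indexes to read.

    If the read-all-columns flag is set, or no explicit column list is
    present, fall back to every column in the schema instead of
    dereferencing a missing list (the NPE described in HIVE-15649).
    Config keys here are illustrative, not Hive's real property names.
    """
    if conf.get("read.all.columns", False) or conf.get("included.columns") is None:
        return list(range(schema_width))
    return [int(c) for c in conf["included.columns"].split(",")]
```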
[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read
[ https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15649: Attachment: (was: HIVE-15649.02.patch) > LLAP IO may NPE on all-column read > -- > > Key: HIVE-15649 > URL: https://issues.apache.org/jira/browse/HIVE-15649 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15649.01.patch, HIVE-15649.patch > > > It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP > IO doesn't account for that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-15663) Add more interval tests to HivePerfCliDriver
[ https://issues.apache.org/jira/browse/HIVE-15663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong resolved HIVE-15663. Resolution: Fixed > Add more interval tests to HivePerfCliDriver > > > Key: HIVE-15663 > URL: https://issues.apache.org/jira/browse/HIVE-15663 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > > following HIVE-13557 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15646) Column level lineage is not available for table Views
[ https://issues.apache.org/jira/browse/HIVE-15646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830970#comment-15830970 ] Hive QA commented on HIVE-15646: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848184/HIVE-15646.02.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3054/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3054/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3054/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-01-20 01:17:20.707 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-3054/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-01-20 01:17:20.710 + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive b569b49..9209082 master -> origin/master + git reset --hard HEAD HEAD is now at b569b49 HIVE-15661 : Add security error logging to LLAP (Sergey Shelukhin, reviewed by Jason Dere) + git clean -f -d + git checkout master Already on 'master' Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded. (use "git pull" to update your local branch) + git reset --hard origin/master HEAD is now at 9209082 HIVE-15297: Hive should not split semicolon within quoted string literals (Pengcheng Xiong, reviewed by Ashutosh Chauhan) (addendum IV) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-01-20 01:17:22.149 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: No such file or directory error: a/ql/src/java/org/apache/hadoop/hive/ql/session/LineageState.java: No such file or directory error: a/ql/src/test/results/clientpositive/alter_view_as_select.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/alter_view_rename.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_8.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_cli_createtab.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_cli_createtab_noauthzapi.q.out: No such file or directory error: 
a/ql/src/test/results/clientpositive/authorization_owner_actions.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_view_1.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_view_2.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_view_3.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_view_4.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_view_disable_cbo_1.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_view_disable_cbo_2.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_view_disable_cbo_3.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/authorization_view_disable_cbo_4.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/cbo_const.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/cbo_subq_exists.q.out: No such file or directory error: a/ql/src/test/results/clientpositive/cbo_union_view.q.out: No such file
[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read
[ https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15649: Attachment: HIVE-15649.02.patch Not sure why it didn't apply.. trying again > LLAP IO may NPE on all-column read > -- > > Key: HIVE-15649 > URL: https://issues.apache.org/jira/browse/HIVE-15649 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15649.01.patch, HIVE-15649.02.patch, > HIVE-15649.patch > > > It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP > IO doesn't account for that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15591) Hive can not use "," in quoted column name
[ https://issues.apache.org/jira/browse/HIVE-15591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830851#comment-15830851 ] Ashutosh Chauhan commented on HIVE-15591: - +1 > Hive can not use "," in quoted column name > -- > > Key: HIVE-15591 > URL: https://issues.apache.org/jira/browse/HIVE-15591 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15591.01.patch > > > As reported by [~cartershanklin] > hive> create table test (`x,y` int); > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: > MetaException(message:org.apache.hadoop.hive.serde2.SerDeException > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe: columns has 2 elements > while columns.types has 1 elements!) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
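The quoted SerDeException shows the mechanism: column names and column types travel as two comma-separated properties, so a comma inside a quoted column name like `x,y` splits into two names while its single type stays one element. A toy reproduction of the mismatch, independent of Hive's actual LazySimpleSerDe code:

```python
def naive_parse(columns_prop, types_prop):
    """Mimic a SerDe that splits both properties on ','.

    A column name containing a comma (e.g. 'x,y') splits into two
    names while its single type stays one element -- the
    "columns has 2 elements while columns.types has 1 elements"
    mismatch quoted in HIVE-15591.
    """
    names = columns_prop.split(",")
    types = types_prop.split(",")
    if len(names) != len(types):
        raise ValueError(
            f"columns has {len(names)} elements while "
            f"columns.types has {len(types)} elements!")
    return dict(zip(names, types))
```

The fix direction in such cases is to use a delimiter that cannot appear in identifiers (or to escape commas) when serializing the column list, rather than splitting naively.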
[jira] [Commented] (HIVE-15651) LLAP: llap status tool enhancements
[ https://issues.apache.org/jira/browse/HIVE-15651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830885#comment-15830885 ] Prasanth Jayachandran commented on HIVE-15651: -- [~sseth] Can you please take another look? > LLAP: llap status tool enhancements > --- > > Key: HIVE-15651 > URL: https://issues.apache.org/jira/browse/HIVE-15651 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-15651.1.patch, HIVE-15651.2.patch > > > Per [~sseth] following enhancements can be made to llap status tool > 1) If state changes from an ACTIVE state to STOPPED - terminate the script > immediately (fail fast) > 2) Add a threshold of what is acceptable in terms of the running state - > RUNNING_PARTIAL may be ok if 80% nodes are up for example. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15651) LLAP: llap status tool enhancements
[ https://issues.apache.org/jira/browse/HIVE-15651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15651: - Attachment: HIVE-15651.2.patch Following updates are made - Watch mode accepts no arguments. By default watches/waits until RUNNING_ALL state is reached. - If threshold is specified, watch mode waits until RUNNING_PARTIAL state is reached and the threshold for running nodes is met > LLAP: llap status tool enhancements > --- > > Key: HIVE-15651 > URL: https://issues.apache.org/jira/browse/HIVE-15651 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-15651.1.patch, HIVE-15651.2.patch > > > Per [~sseth] following enhancements can be made to llap status tool > 1) If state changes from an ACTIVE state to STOPPED - terminate the script > immediately (fail fast) > 2) Add a threshold of what is acceptable in terms of the running state - > RUNNING_PARTIAL may be ok if 80% nodes are up for example. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
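The two behaviors described above (fail fast when the cluster transitions to STOPPED, and accept RUNNING_PARTIAL once a fraction of nodes is up) amount to a polling loop. The sketch below is illustrative only, not the actual llapstatus implementation; state names are taken from the comment, everything else is assumed:

```python
import time

def watch(poll_state, threshold=None, interval_s=1.0, max_polls=None):
    """Poll cluster state until a target state is reached.

    poll_state() returns (state, running_fraction). With no threshold,
    wait for RUNNING_ALL; with a threshold, RUNNING_PARTIAL is accepted
    once running_fraction >= threshold. A STOPPED state terminates the
    loop immediately (fail fast).
    """
    state = None
    polls = 0
    while max_polls is None or polls < max_polls:
        state, frac = poll_state()
        if state == "STOPPED":
            return ("failed", state)
        if state == "RUNNING_ALL":
            return ("ok", state)
        if threshold is not None and state == "RUNNING_PARTIAL" and frac >= threshold:
            return ("ok", state)
        polls += 1
        time.sleep(interval_s)
    return ("timeout", state)
```

For example, with `threshold=0.8` the loop returns success as soon as 80% of nodes report running, matching the "RUNNING_PARTIAL may be ok if 80% nodes are up" enhancement.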
[jira] [Commented] (HIVE-15622) Remove HWI component from Hive
[ https://issues.apache.org/jira/browse/HIVE-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830908#comment-15830908 ] Wei Zheng commented on HIVE-15622: -- [~leftylev] Thanks for the note. I've updated the relevant wiki docs. > Remove HWI component from Hive > -- > > Key: HIVE-15622 > URL: https://issues.apache.org/jira/browse/HIVE-15622 > Project: Hive > Issue Type: Task > Components: Web UI >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Fix For: 2.2.0 > > Attachments: HIVE-15622.1.patch, HIVE-15622.2.patch, > HIVE-15622.3.patch > > > This component seems to be obsolete, as it didn't get any meaningful update > since 2012, and we don't see people discussing it or complaining about issues with > it. Moreover, it caused a number of ptest issues which can be avoided. > We should remove this component as a cleanup effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15669) LLAP: Improve aging in shortest job first scheduler
[ https://issues.apache.org/jira/browse/HIVE-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15669: - Reporter: Rajesh Balamohan (was: Prasanth Jayachandran) > LLAP: Improve aging in shortest job first scheduler > --- > > Key: HIVE-15669 > URL: https://issues.apache.org/jira/browse/HIVE-15669 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Prasanth Jayachandran > > Under high concurrency, some jobs can get starved for a long time when > hive.llap.task.scheduler.locality.delay is set to -1 (infinitely wait for > locality). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15669) LLAP: Improve aging in shortest job first scheduler
[ https://issues.apache.org/jira/browse/HIVE-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830937#comment-15830937 ] Prasanth Jayachandran commented on HIVE-15669: -- cc [~rajesh.balamohan] > LLAP: Improve aging in shortest job first scheduler > --- > > Key: HIVE-15669 > URL: https://issues.apache.org/jira/browse/HIVE-15669 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Prasanth Jayachandran > > Under high concurrency, some jobs can get starved for a long time when > hive.llap.task.scheduler.locality.delay is set to -1 (infinitely wait for > locality). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
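Aging in a shortest-job-first queue means a job's effective priority improves the longer it waits, so a long job is not starved indefinitely by a stream of newly arriving short ones. A generic sketch of the idea (this is textbook aging, not LLAP's actual scheduler code; all names and the aging factor are assumptions):

```python
def effective_priority(estimated_cost, wait_time, aging_factor=0.1):
    """Lower value = scheduled sooner. Cost dominates initially, but the
    accumulated wait steadily discounts it, so a starved long job
    eventually outranks freshly enqueued short jobs."""
    return estimated_cost - aging_factor * wait_time

def pick_next(jobs, now, aging_factor=0.1):
    # jobs: list of (name, estimated_cost, enqueue_time) tuples.
    return min(
        jobs,
        key=lambda j: effective_priority(j[1], now - j[2], aging_factor),
    )[0]
```

With two jobs enqueued at the same moment, the short one wins as in plain SJF; once the long job has waited far longer than a newly enqueued short job, aging lets it go first.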
[jira] [Commented] (HIVE-15621) Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP
[ https://issues.apache.org/jira/browse/HIVE-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830949#comment-15830949 ] Sergey Shelukhin commented on HIVE-15621: - +1 > Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP > -- > > Key: HIVE-15621 > URL: https://issues.apache.org/jira/browse/HIVE-15621 > Project: Hive > Issue Type: Task > Components: llap >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15621.1.patch, HIVE-15621.2.patch, > HIVE-15621.3.patch, HIVE-15621.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15572) Improve the response time for query canceling when it happens during acquiring locks
[ https://issues.apache.org/jira/browse/HIVE-15572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831015#comment-15831015 ] Yongzhi Chen commented on HIVE-15572: - The failures are not related. > Improve the response time for query canceling when it happens during > acquiring locks > > > Key: HIVE-15572 > URL: https://issues.apache.org/jira/browse/HIVE-15572 > Project: Hive > Issue Type: Improvement >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-15572.1.patch, HIVE-15572.2.patch > > > When query canceling command sent during Hive Acquire locks (from zookeeper), > hive will finish acquiring all the locks and release them. As it is shown in > the following log: > It took 165 s to finish acquire the lock,then spend 81s to release them. > We can improve the performance by not acquiring any more locks and releasing > held locks when the query canceling command is received. > {noformat} > Background-Pool: Thread-224]: from=org.apache.hadoop.hive.ql.Driver> > 2017-01-03 10:50:35,413 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [HiveServer2-Background-Pool: Thread-224]: method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver> > 2017-01-03 10:51:00,671 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [HiveServer2-Background-Pool: Thread-218]: method=acquireReadWriteLocks start=1483469295080 end=1483469460671 > duration=165591 from=org.apache.hadoop.hive.ql.Driver> > 2017-01-03 10:51:00,672 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [HiveServer2-Background-Pool: Thread-218]: from=org.apache.hadoop.hive.ql.Driver> > 2017-01-03 10:51:00,672 ERROR org.apache.hadoop.hive.ql.Driver: > [HiveServer2-Background-Pool: Thread-218]: FAILED: query select count(*) from > manyparttbl has been cancelled > 2017-01-03 10:51:00,673 INFO org.apache.hadoop.hive.ql.log.PerfLogger: > [HiveServer2-Background-Pool: Thread-218]: from=org.apache.hadoop.hive.ql.Driver> > 2017-01-03 10:51:40,755 INFO 
org.apache.hadoop.hive.ql.log.PerfLogger: > [HiveServer2-Background-Pool: Thread-215]: start=1483469419487 end=1483469500755 duration=81268 > from=org.apache.hadoop.hive.ql.Driver> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
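The improvement described in HIVE-15572 is to check the cancellation flag between individual lock acquisitions rather than only after all locks are obtained, and to release already-held locks immediately on cancel. A sketch of that pattern (illustrative only, not Hive's Driver/ZooKeeper lock code; all names are assumptions):

```python
class QueryCancelledError(Exception):
    pass

def acquire_all(locks, is_cancelled):
    """Acquire locks one by one, aborting early on cancellation.

    On cancel (or any failure) every lock already held is released
    before re-raising, so a cancelled query does not sit on locks --
    avoiding the 165s acquire + 81s release pattern in the log above.
    """
    held = []
    try:
        for lock in locks:
            if is_cancelled():
                raise QueryCancelledError("query cancelled during lock acquisition")
            lock.acquire()
            held.append(lock)
        return held
    except BaseException:
        # Release in reverse acquisition order before propagating.
        for lock in reversed(held):
            lock.release()
        raise
```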
[jira] [Updated] (HIVE-15622) Remove HWI component from Hive
[ https://issues.apache.org/jira/browse/HIVE-15622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-15622: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed patch 3 to master > Remove HWI component from Hive > -- > > Key: HIVE-15622 > URL: https://issues.apache.org/jira/browse/HIVE-15622 > Project: Hive > Issue Type: Task > Components: Web UI >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Fix For: 2.2.0 > > Attachments: HIVE-15622.1.patch, HIVE-15622.2.patch, > HIVE-15622.3.patch > > > This component seems to be obsolete, as it didn't get any meaningful update > since 2012, and we don't see people discussing it or complaining about issues with > it. Moreover, it caused a number of ptest issues which can be avoided. > We should remove this component as a cleanup effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15541) Hive OOM when ATSHook enabled and ATS goes down
[ https://issues.apache.org/jira/browse/HIVE-15541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830942#comment-15830942 ] Hive QA commented on HIVE-15541: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12848181/HIVE-15541.4.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 50 failed/errored test(s), 10964 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=219) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=219) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=231) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=222) org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=222) 
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=222) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=225) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=225) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226) org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] 
(batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple] (batchId=152) org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[bucket_num_reducers2] (batchId=84) org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[bucket_num_reducers] (batchId=84) org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[index_bitmap3] (batchId=83)
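An OOM from a hook when its downstream service (here, ATS) is unreachable typically means events queue up without bound while the sender stalls. The usual fix direction is a bounded buffer that counts and drops events when full instead of accumulating them on the heap. A generic sketch, not the actual HIVE-15541 patch; all names are assumptions:

```python
import queue

class BoundedEventSink:
    """Drop-on-full event buffer: when the downstream consumer stalls
    (e.g. a timeline service outage), new events are counted and
    discarded instead of accumulating until the JVM heap is exhausted."""

    def __init__(self, capacity):
        self.events = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def offer(self, event):
        # Non-blocking enqueue: never stalls the query thread and never
        # grows beyond the configured capacity.
        try:
            self.events.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False
```

Tracking the `dropped` counter also gives operators a signal that the downstream service has been unavailable.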
[jira] [Commented] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830959#comment-15830959 ] Wei Zheng commented on HIVE-13282: -- [~mmccline] I'm wondering about the status of this issue. I saw a similar backtrace, but this is with bucket map join disabled - set hive.convert.join.bucket.mapjoin.tez=false; {code} Status: Failed Vertex failed, vertexName=Reducer 2, vertexId=vertex_1484780661793_0061_1_02, diagnostics=[Task failed, taskId=task_1484780661793_0061_1_02_02, diagnostics=[TaskAttempt 0 failed, info=[Error: at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1668) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":-2147185208},"value":null} at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":-2147185208},"value":null} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292) ... 16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{" at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:419) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:387) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:212) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838) at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:821) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695) at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361) ... 
17 more Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":-2147270511},"value":null} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:411) ... 25 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":-2147270511},"value":null} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292) ... 26 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:708) at
[jira] [Comment Edited] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830959#comment-15830959 ] Wei Zheng edited comment on HIVE-13282 at 1/20/17 1:14 AM:
---
[~mmccline] I'm wondering about the status of this issue. I saw a similar backtrace
{code}
Status: Failed
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1484780661793_0061_1_02, diagnostics=[Task failed, taskId=task_1484780661793_0061_1_02_02, diagnostics=[TaskAttempt 0 failed, info=[Error:
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1668)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":-2147185208},"value":null}
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":-2147185208},"value":null}
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
	... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"
	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:419)
	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:387)
	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:212)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:821)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361)
	... 17 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":-2147270511},"value":null}
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:411)
	... 25 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":-2147270511},"value":null}
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
	... 26 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:708)
	at
[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.
[ https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-15439:
----------------------------------
    Attachment: (was: HIVE-15439.patch)

> Support INSERT OVERWRITE for internal druid datasources.
>
>                 Key: HIVE-15439
>                 URL: https://issues.apache.org/jira/browse/HIVE-15439
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Druid integration
>    Affects Versions: 2.2.0
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>         Attachments: HIVE-15439.3.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table. In order to add this support, we will need to add a new post-insert hook to update the Druid metadata. Creation of the segment will be the same as for CTAS.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.
[ https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-15439:
----------------------------------
    Attachment: HIVE-15439.3.patch

> Support INSERT OVERWRITE for internal druid datasources.
>
>                 Key: HIVE-15439
>                 URL: https://issues.apache.org/jira/browse/HIVE-15439
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Druid integration
>    Affects Versions: 2.2.0
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>         Attachments: HIVE-15439.3.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table. In order to add this support, we will need to add a new post-insert hook to update the Druid metadata. Creation of the segment will be the same as for CTAS.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.
[ https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-15439:
----------------------------------
    Attachment: (was: HIVE-15439.patch)

> Support INSERT OVERWRITE for internal druid datasources.
>
>                 Key: HIVE-15439
>                 URL: https://issues.apache.org/jira/browse/HIVE-15439
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Druid integration
>    Affects Versions: 2.2.0
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>         Attachments: HIVE-15439.3.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table. In order to add this support, we will need to add a new post-insert hook to update the Druid metadata. Creation of the segment will be the same as for CTAS.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15671) RPCServer.registerClient() erroneously uses server/client handshake timeout for connection timeout
[ https://issues.apache.org/jira/browse/HIVE-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831072#comment-15831072 ] Xuefu Zhang commented on HIVE-15671:
[~vanzin]/[~lirui], could you please review? cc: [~csun]

> RPCServer.registerClient() erroneously uses server/client handshake timeout for connection timeout
>
>                 Key: HIVE-15671
>                 URL: https://issues.apache.org/jira/browse/HIVE-15671
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 1.1.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-15671.patch
>
> {code}
>   /**
>    * Tells the RPC server to expect a connection from a new client.
>    * ...
>    */
>   public Future registerClient(final String clientId, String secret,
>       RpcDispatcher serverDispatcher) {
>     return registerClient(clientId, secret, serverDispatcher,
>         config.getServerConnectTimeoutMs());
>   }
> {code}
> config.getServerConnectTimeoutMs() returns the value of hive.spark.client.server.connect.timeout, which is meant as the timeout for the handshake between the Hive client and the remote Spark driver. Instead, the timeout used here should be hive.spark.client.connect.timeout, which is the timeout for the remote Spark driver connecting back to the Hive client.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
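The mix-up described in HIVE-15671 can be sketched in isolation. The class and accessor names below are illustrative stand-ins, not the real hive spark-client API; only the two configuration keys quoted in the issue are taken from the source.

```java
// Illustrative sketch of the HIVE-15671 timeout mix-up. RpcConfig and the
// method names here are hypothetical; the two property names come from the
// issue description. The values are arbitrary placeholders.
class RpcConfig {
    // hive.spark.client.server.connect.timeout: handshake timeout
    long serverConnectTimeoutMs() { return 90_000L; }

    // hive.spark.client.connect.timeout: driver connect-back timeout
    long connectTimeoutMs() { return 1_000L; }
}

class RpcServerSketch {
    private final RpcConfig config = new RpcConfig();

    // Buggy behavior: registerClient() waits for the driver to connect back
    // using the (much longer) handshake timeout.
    long registerClientTimeoutBuggy() {
        return config.serverConnectTimeoutMs();
    }

    // Intended behavior per the issue: use the connection timeout instead.
    long registerClientTimeoutFixed() {
        return config.connectTimeoutMs();
    }
}
```

The practical effect is that a driver which never connects back is only detected after the long handshake timeout rather than the short connection timeout.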
[jira] [Updated] (HIVE-15669) LLAP: Improve aging in shortest job first scheduler
[ https://issues.apache.org/jira/browse/HIVE-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15669:
-----------------------------------------
    Attachment: HIVE-15669.1.patch

[~rajesh.balamohan]/[~gopalv]/[~sseth] can someone please take a look?

> LLAP: Improve aging in shortest job first scheduler
>
>                 Key: HIVE-15669
>                 URL: https://issues.apache.org/jira/browse/HIVE-15669
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Rajesh Balamohan
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-15669.1.patch
>
> Under high concurrency, some jobs can get starved for a long time when hive.llap.task.scheduler.locality.delay is set to -1 (wait indefinitely for locality).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
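The general technique named in the issue title can be sketched as follows. This is not the actual LLAP task scheduler code; it is a minimal illustration of "aging" in a shortest-job-first policy, with an arbitrary aging factor.

```java
// Minimal sketch of aging in a shortest-job-first (SJF) queue. Lower score
// means "schedule sooner". Without the aging term, a steady stream of short
// jobs can starve a long-waiting large job indefinitely, which is the
// symptom HIVE-15669 describes under high concurrency.
class AgingSjf {
    // Weight converting wait time into a priority boost. The value is an
    // illustrative placeholder, not a Hive configuration default.
    static final double AGING_FACTOR = 0.5;

    static double score(long estimatedWorkMs, long waitedMs) {
        // Shortest job first, discounted by how long the job has waited.
        return estimatedWorkMs - AGING_FACTOR * waitedMs;
    }
}
```

With aging, a 10 s job that has already waited 30 s scores 10000 - 15000 = -5000 and beats a freshly submitted 100 ms job (score 100), so it is eventually scheduled instead of being starved forever.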
[jira] [Updated] (HIVE-15669) LLAP: Improve aging in shortest job first scheduler
[ https://issues.apache.org/jira/browse/HIVE-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15669:
-----------------------------------------
    Status: Patch Available  (was: Open)

> LLAP: Improve aging in shortest job first scheduler
>
>                 Key: HIVE-15669
>                 URL: https://issues.apache.org/jira/browse/HIVE-15669
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Rajesh Balamohan
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-15669.1.patch
>
> Under high concurrency, some jobs can get starved for a long time when hive.llap.task.scheduler.locality.delay is set to -1 (wait indefinitely for locality).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)