[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861

2016-10-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14917:
---
Attachment: HIVE-14917.01.patch

> explainanalyze_2.q fails after HIVE-14861
> -
>
> Key: HIVE-14917
> URL: https://issues.apache.org/jira/browse/HIVE-14917
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-14917.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861

2016-10-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14917:
---
Status: Patch Available  (was: Open)

> explainanalyze_2.q fails after HIVE-14861
> -
>
> Key: HIVE-14917
> URL: https://issues.apache.org/jira/browse/HIVE-14917
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-14917.01.patch
>
>






[jira] [Commented] (HIVE-14858) Analyze command should support custom input formats

2016-10-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557206#comment-15557206
 ] 

Lefty Leverenz commented on HIVE-14858:
---

Okay, thanks.

> Analyze command should support custom input formats
> ---
>
> Key: HIVE-14858
> URL: https://issues.apache.org/jira/browse/HIVE-14858
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Chao Sun
> Assignee: Chao Sun
> Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-14858.1.patch
>
>
> Currently analyze command with partialscan or noscan only applies to 
> OrcInputFormat and MapredParquetInputFormat. However, if custom input formats 
> extend these two they should also be able to use the same command to collect 
> stats.



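One way to read the HIVE-14858 description is that a check comparing the input format class by exact identity rejects subclasses, whereas a subclass-aware check accepts them. The Python sketch below only illustrates that distinction. The class names are stand-ins for the Hive Java classes, `CustomOrcInputFormat` and both helper functions are hypothetical, and none of this is taken from the attached patch.

```python
class OrcInputFormat:
    """Stand-in for Hive's ORC input format class."""


class MapredParquetInputFormat:
    """Stand-in for Hive's Parquet input format class."""


class CustomOrcInputFormat(OrcInputFormat):
    """Hypothetical user-defined format that extends the ORC one."""


class TextInputFormat:
    """Stand-in for an unrelated input format."""


SUPPORTED = (OrcInputFormat, MapredParquetInputFormat)


def supported_exact(fmt):
    # Exact class match: a subclass of a supported format is rejected.
    return fmt in SUPPORTED


def supported_with_subclasses(fmt):
    # Subclass-aware match: anything extending a supported format is accepted.
    return issubclass(fmt, SUPPORTED)


for fmt in (OrcInputFormat, CustomOrcInputFormat, TextInputFormat):
    print(fmt.__name__, supported_exact(fmt), supported_with_subclasses(fmt))
```

Under these assumptions only the subclass-aware check accepts the custom format, while the unrelated format is rejected by both.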


[jira] [Commented] (HIVE-14913) Add new unit tests

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557087#comment-15557087
 ] 

Hive QA commented on HIVE-14913:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832212/HIVE-14913.2.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 10663 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_deep_filters]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_add_column]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cast1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[folder_predicate]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_local_directory_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_ppd_is_null]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_create]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[rcfile_createas1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[regex_col]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[view]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing]
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[alter_merge_orc]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_0]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1432/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1432/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1432/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832212 - PreCommit-HIVE-Build

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
> Issue Type: Task
> Components: Tests
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Updated] (HIVE-14861) Support precedence for set operator using parentheses

2016-10-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14861:
---
Affects Version/s: 2.1.0

> Support precedence for set operator using parentheses
> -
>
> Key: HIVE-14861
> URL: https://issues.apache.org/jira/browse/HIVE-14861
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 2.1.0
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14861.01.patch, HIVE-14861.02.patch
>
>
> We should support precedence for set operators by using parentheses. For 
> example:
> {code}
> select * from src union all (select * from src union select * from src);
> {code}



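To see why the parentheses in the HIVE-14861 example matter, the following Python sketch models UNION ALL and UNION over a small list of rows. It assumes that the unparenthesized form would be evaluated left to right, which is the usual SQL rule for set operators of equal precedence; the message itself does not state Hive's behavior. The helper names and the sample rows are illustrative only.

```python
def union_all(left, right):
    """UNION ALL: concatenate rows, keeping duplicates."""
    return left + right


def union(left, right):
    """UNION (distinct): concatenate rows, then drop duplicates."""
    return list(dict.fromkeys(left + right))


# Sample rows standing in for the `src` table (note the duplicate 1).
src = [1, 1, 2]

# select * from src union all (select * from src union select * from src)
parenthesized = union_all(src, union(src, src))

# Assumed left-to-right reading of the same query without parentheses.
left_to_right = union(union_all(src, src), src)

print(len(parenthesized), len(left_to_right))  # prints "5 2"
```

With parentheses, only the inner operands are de-duplicated before being appended to `src`; read left to right, the final UNION de-duplicates everything.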


[jira] [Updated] (HIVE-14861) Support precedence for set operator using parentheses

2016-10-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14861:
---
Fix Version/s: 2.2.0

> Support precedence for set operator using parentheses
> -
>
> Key: HIVE-14861
> URL: https://issues.apache.org/jira/browse/HIVE-14861
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 2.1.0
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14861.01.patch, HIVE-14861.02.patch
>
>
> We should support precedence for set operators by using parentheses. For 
> example:
> {code}
> select * from src union all (select * from src union select * from src);
> {code}





[jira] [Updated] (HIVE-14861) Support precedence for set operator using parentheses

2016-10-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14861:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Support precedence for set operator using parentheses
> -
>
> Key: HIVE-14861
> URL: https://issues.apache.org/jira/browse/HIVE-14861
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 2.1.0
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14861.01.patch, HIVE-14861.02.patch
>
>
> We should support precedence for set operators by using parentheses. For 
> example:
> {code}
> select * from src union all (select * from src union select * from src);
> {code}





[jira] [Commented] (HIVE-14861) Support precedence for set operator using parentheses

2016-10-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557004#comment-15557004
 ] 

Pengcheng Xiong commented on HIVE-14861:


Hi [~sseth], the 18 test failures are mostly due to golden file updates. 
When I committed, I also updated the golden files. I will run explainanalyze_2 
again.

> Support precedence for set operator using parentheses
> -
>
> Key: HIVE-14861
> URL: https://issues.apache.org/jira/browse/HIVE-14861
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-14861.01.patch, HIVE-14861.02.patch
>
>
> We should support precedence for set operators by using parentheses. For 
> example:
> {code}
> select * from src union all (select * from src union select * from src);
> {code}





[jira] [Updated] (HIVE-14855) test patch

2016-10-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14855:
--
Attachment: (was: HIVE-14883.5.patch)

> test patch
> --
>
> Key: HIVE-14855
> URL: https://issues.apache.org/jira/browse/HIVE-14855
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-14855.2.patch, HIVE-14855.3.patch, 
> HIVE-14855.4.patch, HIVE-14855.5.patch, HIVE-14855.patch
>
>






[jira] [Updated] (HIVE-14855) test patch

2016-10-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14855:
--
Attachment: HIVE-14883.5.patch

> test patch
> --
>
> Key: HIVE-14855
> URL: https://issues.apache.org/jira/browse/HIVE-14855
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-14855.2.patch, HIVE-14855.3.patch, 
> HIVE-14855.4.patch, HIVE-14855.5.patch, HIVE-14855.patch
>
>






[jira] [Updated] (HIVE-14855) test patch

2016-10-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14855:
--
Attachment: HIVE-14855.5.patch

> test patch
> --
>
> Key: HIVE-14855
> URL: https://issues.apache.org/jira/browse/HIVE-14855
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-14855.2.patch, HIVE-14855.3.patch, 
> HIVE-14855.4.patch, HIVE-14855.5.patch, HIVE-14855.patch
>
>






[jira] [Commented] (HIVE-14912) Fix the test failures for 2.1.1 caused by HIVE-13409

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556992#comment-15556992
 ] 

Hive QA commented on HIVE-14912:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832200/HIVE-14912.1-branch-2.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 122 failed/errored test(s), 10461 tests 
executed
*Failed tests:*
{noformat}
249_TestHWISessionManager - did not produce a TEST-*.xml file
392_TestMsgBusConnection - did not produce a TEST-*.xml file
783_TestHiveDruidQueryBasedInputFormat - did not produce a TEST-*.xml file
784_TestDruidSerDe - did not produce a TEST-*.xml file
820_TestOperationLoggingAPIWithTez - did not produce a TEST-*.xml file
828_TestJdbcWithMiniHA - did not produce a TEST-*.xml file
842_TestJdbcWithMiniMr - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_table_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table_use_metadata
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_json_serde1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_int_type_promotion
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_schema_evol_3a
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_null_optimizer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_str_to_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert

[jira] [Commented] (HIVE-14887) Reduce the memory requirements for tests

2016-10-07 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556948#comment-15556948
 ] 

Ferdinand Xu commented on HIVE-14887:
-

HIVE-14916 was filed to address reducing memory for Spark tests.

> Reduce the memory requirements for tests
> 
>
> Key: HIVE-14887
> URL: https://issues.apache.org/jira/browse/HIVE-14887
> Project: Hive
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: HIVE-14887.01.patch
>
>
> The clusters that we spin up end up requiring 16GB at times. Also the maven 
> arguments seem a little heavyweight.
> Reducing this will allow for additional ptest drones per box, which should 
> bring down the runtime.





[jira] [Commented] (HIVE-14861) Support precedence for set operator using parentheses

2016-10-07 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556865#comment-15556865
 ] 

Siddharth Seth commented on HIVE-14861:
---

This seems to be committed, but the jira is still open. Also, I believe 
TestMiniTezCliDriver.explainanalyze_2 has started failing consistently after 
this commit. Can we please revert this until there's a good test run? There's a 
consistent set of tests which fail, and this shouldn't be committed at 18 test 
failures.

> Support precedence for set operator using parentheses
> -
>
> Key: HIVE-14861
> URL: https://issues.apache.org/jira/browse/HIVE-14861
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-14861.01.patch, HIVE-14861.02.patch
>
>
> We should support precedence for set operators by using parentheses. For 
> example:
> {code}
> select * from src union all (select * from src union select * from src);
> {code}





[jira] [Commented] (HIVE-14721) Fix TestJdbcWithMiniHS2 runtime

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556778#comment-15556778
 ] 

Hive QA commented on HIVE-14721:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832161/HIVE-14721.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10665 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testParallelCompilation2
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testURIDatabaseName
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1430/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1430/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1430/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832161 - PreCommit-HIVE-Build

> Fix TestJdbcWithMiniHS2 runtime
> ---
>
> Key: HIVE-14721
> URL: https://issues.apache.org/jira/browse/HIVE-14721
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 2.1.0
> Reporter: Vaibhav Gumashta
> Assignee: Vaibhav Gumashta
> Attachments: HIVE-14721.1.patch, HIVE-14721.2.patch, 
> HIVE-14721.3.patch, HIVE-14721.3.patch, HIVE-14721.3.patch, 
> HIVE-14721.4.patch, HIVE-14721.4.patch, HIVE-14721.5.patch
>
>
> Currently 450s





[jira] [Commented] (HIVE-14745) Remove jira user/password from profiles by using another command to submit results to jira

2016-10-07 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556759#comment-15556759
 ] 

Siddharth Seth commented on HIVE-14745:
---

[~spena] - I think you've summarized this nicely. Option #1 is the cleanest and 
easiest to manage, but takes a little more effort. I'm assuming this will invoke 
the JiraService class from the jenkins script.
Option #3 is much quicker though, so it just comes down to time.

Option #1 requires access to jenkins to modify whatever is required. Would this 
be a problem if we move to a custom jenkins job again? (Apache build servers 
take a long time: jobs sit in the queue waiting for executors for the longest 
time.) Option #3 requires access to the node.

> Remove jira user/password from profiles by using another command to submit 
> results to jira
> --
>
> Key: HIVE-14745
> URL: https://issues.apache.org/jira/browse/HIVE-14745
> Project: Hive
> Issue Type: Sub-task
> Components: Hive, Testing Infrastructure
> Reporter: Sergio Peña
>
> Hive ptest uses some properties files per branch that contain information 
> about how to execute the tests.
> This profile includes the user & password to submit the results to JIRA. We 
> should get rid of this sensitive information from the profile by moving the 
> jira submission task to another command or script executed directly by 
> Jenkins.





[jira] [Updated] (HIVE-14915) Add an option to skip log collection for successful tests

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14915:
--
Status: Patch Available  (was: Open)

> Add an option to skip log collection for successful tests
> -
>
> Key: HIVE-14915
> URL: https://issues.apache.org/jira/browse/HIVE-14915
> Project: Hive
> Issue Type: Sub-task
> Components: Testing Infrastructure
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: HIVE-14915.01.patch
>
>
> We generate multiple gigs of tests at the moment. An option to skip log 
> collection for successful tests could be useful.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-14915) Add an option to skip log collection for successful tests

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14915:
--
Description: 
We generate multiple gigs of tests at the moment. An option to skip log 
collection for successful tests could be useful.

NO PRECOMMIT TESTS

  was:We generate multiple gigs of tests at the moment. An option to skip log 
collection for successful tests could be useful.


> Add an option to skip log collection for successful tests
> -
>
> Key: HIVE-14915
> URL: https://issues.apache.org/jira/browse/HIVE-14915
> Project: Hive
> Issue Type: Sub-task
> Components: Testing Infrastructure
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: HIVE-14915.01.patch
>
>
> We generate multiple gigs of tests at the moment. An option to skip log 
> collection for successful tests could be useful.
> NO PRECOMMIT TESTS





[jira] [Comment Edited] (HIVE-14915) Add an option to skip log collection for successful tests

2016-10-07 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556728#comment-15556728
 ] 

Siddharth Seth edited comment on HIVE-14915 at 10/8/16 12:22 AM:
-

Patch which adds a parameter for this, and changes the structure of the 
downloaded directories.

Applies on top of HIVE-14914, HIVE-14915

cc [~prasanth_j], [~spena] for review.
For reference: Downloaded size with this enabled - 240MB (14 failed tests), 6GB 
without.

ptest tests pass.


was (Author: sseth):
Patch which adds a parameter for this, and changes the structure of the 
downloaded directories.

Applies on top of HIVE-14914, HIVE-14915

cc [~prasanth_j], [~spena] for review.
For reference: Downloaded size with this enabled - 240MB (14 failed tests), 6GB 
without.

> Add an option to skip log collection for successful tests
> -
>
> Key: HIVE-14915
> URL: https://issues.apache.org/jira/browse/HIVE-14915
> Project: Hive
> Issue Type: Sub-task
> Components: Testing Infrastructure
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: HIVE-14915.01.patch
>
>
> We generate multiple gigs of tests at the moment. An option to skip log 
> collection for successful tests could be useful.





[jira] [Commented] (HIVE-14858) Analyze command should support custom input formats

2016-10-07 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556729#comment-15556729
 ] 

Chao Sun commented on HIVE-14858:
-

Thanks [~leftylev] for mentioning this. I don't think we need to change the 
wiki, since this is about internal implementation details.

> Analyze command should support custom input formats
> ---
>
> Key: HIVE-14858
> URL: https://issues.apache.org/jira/browse/HIVE-14858
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Chao Sun
> Assignee: Chao Sun
> Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-14858.1.patch
>
>
> Currently analyze command with partialscan or noscan only applies to 
> OrcInputFormat and MapredParquetInputFormat. However, if custom input formats 
> extend these two they should also be able to use the same command to collect 
> stats.





[jira] [Updated] (HIVE-14915) Add an option to skip log collection for successful tests

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14915:
--
Attachment: HIVE-14915.01.patch

Patch which adds a parameter for this, and changes the structure of the 
downloaded directories.

Applies on top of HIVE-14914, HIVE-14915

cc [~prasanth_j], [~spena] for review.
For reference: Downloaded size with this enabled - 240MB (14 failed tests), 6GB 
without.

> Add an option to skip log collection for successful tests
> -
>
> Key: HIVE-14915
> URL: https://issues.apache.org/jira/browse/HIVE-14915
> Project: Hive
> Issue Type: Sub-task
> Components: Testing Infrastructure
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: HIVE-14915.01.patch
>
>
> We generate multiple gigs of tests at the moment. An option to skip log 
> collection for successful tests could be useful.





[jira] [Commented] (HIVE-14476) Fix logging issue for branch-1

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556512#comment-15556512
 ] 

Hive QA commented on HIVE-14476:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12822637/HIVE-14476.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1429/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1429/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1429/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-10-07 22:54:15.052
+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1429/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-10-07 22:54:15.054
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2435e70 HIVE-14861: Support precedence for set operator using 
parentheses (Pengcheng Xiong, reviewed by Ashutosh Chauhan)
+ git clean -f -d
Removing b/
Removing ql/src/test/results/clientpositive/llap/autoColumnStats_1.q.out
Removing ql/src/test/results/clientpositive/llap/autoColumnStats_2.q.out
Removing ql/src/test/results/clientpositive/llap/bucket_groupby.q.out
Removing 
ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_2.q.out
Removing ql/src/test/results/clientpositive/llap/cbo_rp_gby.q.out
Removing ql/src/test/results/clientpositive/llap/cbo_rp_join.q.out
Removing ql/src/test/results/clientpositive/llap/cbo_rp_lineage2.q.out
Removing ql/src/test/results/clientpositive/llap/cbo_rp_semijoin.q.out
Removing ql/src/test/results/clientpositive/llap/cbo_rp_subq_not_in.q.out
Removing ql/src/test/results/clientpositive/llap/cbo_rp_unionDistinct_2.q.out
Removing ql/src/test/results/clientpositive/llap/cbo_rp_windowing_2.q.out
Removing ql/src/test/results/clientpositive/llap/correlationoptimizer2.q.out
Removing ql/src/test/results/clientpositive/llap/correlationoptimizer4.q.out
Removing ql/src/test/results/clientpositive/llap/correlationoptimizer6.q.out
Removing 
ql/src/test/results/clientpositive/llap/dynpart_sort_optimization_acid.q.out
Removing ql/src/test/results/clientpositive/llap/escape1.q.out
Removing ql/src/test/results/clientpositive/llap/escape2.q.out
Removing ql/src/test/results/clientpositive/llap/global_limit.q.out
Removing ql/src/test/results/clientpositive/llap/groupby_sort_1_23.q.out
Removing ql/src/test/results/clientpositive/llap/groupby_sort_skew_1_23.q.out
Removing ql/src/test/results/clientpositive/llap/insert_into_with_schema.q.out
Removing ql/src/test/results/clientpositive/llap/join43.q.out
Removing ql/src/test/results/clientpositive/llap/join_filters.q.out
Removing ql/src/test/results/clientpositive/llap/join_nulls.q.out
Removing ql/src/test/results/clientpositive/llap/limit_join_transpose.q.out
Removing ql/src/test/results/clientpositive/llap/lineage2.q.out
Removing ql/src/test/results/clientpositive/llap/lineage3.q.out
Removing ql/src/test/results/clientpositive/llap/llap_partitioned.q.out
Removing ql/src/test/results/clientpositive/llap/load_dyn_part5.q.out
Removing ql/src/test/results/clientpositive/llap/multiMapJoin1.q.out
Removing ql/src/test/results/clientpositive/llap/multiMapJoin2.q.out
Removing ql/src/test/results/clientpositive/llap/multi_insert_move_tasks_share_dependencies.q.out
Removing ql/src/test/results/clientpositive/llap/orc_ppd_date.q.out
Removing ql/src/test/results/clientpositive/llap/orc_ppd_decimal.q.out
Removing ql/src/test/results/clientpositive/llap/orc_ppd_timestamp.q.out
Removing ql/src/test/results/clientpositive/llap/parquet_ppd_decimal.q.out
Removing ql/src/test/results/clientpositive/llap/partition_multilevels.q.out
Removing ql/src/test/results/clientpositive/llap/rcfile_createas1.q.out
Removing 

[jira] [Updated] (HIVE-11957) SHOW TRANSACTIONS should show queryID/agent id of the creator

2016-10-07 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-11957:
-
Attachment: HIVE-11957.2.patch

patch 2 changes from int32 to int64 for the two new timestamp fields

> SHOW TRANSACTIONS should show queryID/agent id of the creator
> -
>
> Key: HIVE-11957
> URL: https://issues.apache.org/jira/browse/HIVE-11957
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-11957.1.patch, HIVE-11957.2.patch
>
>
> this would be very useful for debugging
> should also include heartbeat/create timestamps
> would be nice to support some filtering/sorting options, like sort by create 
> time, agent id. filter by table, database, etc





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Patch Available  (was: Open)

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Assignee: Vineet Garg

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Attachment: HIVE-14913.2.patch

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Open  (was: Patch Available)

Missed a few output files.

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Commented] (HIVE-14877) Move slow CliDriver tests to MiniLlap

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556463#comment-15556463
 ] 

Hive QA commented on HIVE-14877:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832187/HIVE-14877.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10633 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_partitioned]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_lineage2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1428/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1428/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1428/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832187 - PreCommit-HIVE-Build

> Move slow CliDriver tests to MiniLlap
> -
>
> Key: HIVE-14877
> URL: https://issues.apache.org/jira/browse/HIVE-14877
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14877.1.patch, HIVE-14877.2.patch, 
> HIVE-14877.3.patch, HIVE-14877.4.patch, HIVE-14877.5.patch
>
>
> When analyzing the test runtimes, many CliDriver tests show up as slow 
> stragglers. Most of these tests are not really testing the execution engine. 
> For example, special_character_in_tabnames_1.q is the slowest test case: it 
> takes 419s in CliDriver but only 62s in MiniLlap. Many other test cases could 
> similarly benefit from faster runtimes. We should consider moving the tests 
> that are not testing the execution engine to MiniLlap (assuming it provides 
> a significant performance benefit).
> Here is the list of the top 100 slowest tests based on build #1055:
> ||QFiles||TestCliDriver elapsed time||
> |special_character_in_tabnames_1.q|419.229|
> |unionDistinct_1.q|278.583|
> |vector_leftsemi_mapjoin.q|232.313|
> |join_filters.q|172.436|
> |escape2.q|167.503|
> |archive_excludeHadoop20.q|163.522|
> |escape1.q|130.217|
> |lineage3.q|110.935|
> |insert_into_with_schema.q|107.345|
> |auto_join_filters.q|104.331|
> |windowing.q|99.622|
> |index_compact_binary_search.q|97.637|
> |cbo_rp_windowing_2.q|95.108|
> |vectorized_ptf.q|93.397|
> |dynpart_sort_optimization_acid.q|91.831|
> |partition_multilevels.q|90.392|
> |ptf.q|89.115|
> |sample_islocalmode_hook.q|88.293|
> |udaf_collect_set_2.q|84.725|
> |skewjoin.q|84.588|
> |lineage2.q|84.187|
> |correlationoptimizer1.q|80.367|
> |dynpart_sort_optimization.q|77.07|
> |orc_ppd_decimal.q|75.523|
> |orc_ppd_schema_evol_3a.q|75.352|
> |groupby_sort_skew_1_23.q|75.342|
> |cbo_rp_lineage2.q|75.283|
> |parquet_ppd_decimal.q|74.063|
> |sample_islocalmode_hook_use_metadata.q|73.988|
> |orc_analyze.q|73.803|
> |join_nulls.q|72.417|
> |semijoin.q|70.403|
> |correlationoptimizer6.q|69.151|
> |table_access_keys_stats.q|68.699|
> |autoColumnStats_2.q|68.632|
> |cbo_join.q|68.325|
> |cbo_rp_join.q|68.317|
> |sample10.q|64.513|
> |mergejoin.q|63.647|
> |multi_insert_move_tasks_share_dependencies.q|62.079|
> |union_view.q|61.772|
> |autoColumnStats_1.q|61.246|
> |groupby_sort_1_23.q|61.129|
> |pcr.q|59.546|
> |vectorization_short_regress.q|58.775|
> |auto_sortmerge_join_9.q|58.3|
> |correlationoptimizer2.q|56.591|
> |alter_merge_stats_orc.q|55.202|
> |vector_join30.q|54.85|
> |selectDistinctStar.q|53.981|
> |vector_decimal_udf.q|53.879|
> |auto_join30.q|53.762|
> |subquery_notin.q|52.879|
> |cbo_rp_subq_not_in.q|52.609|
> |cbo_rp_gby.q|51.866|
> |cbo_subq_not_in.q|51.672|
> |cbo_gby.q|50.361|
> |infer_bucket_sort.q|49.158|
> |ptf_streaming.q|48.484|
> |join_1to1.q|48.268|
> |load_dyn_part5.q|47.796|
> |limit_join_transpose.q|47.517|
> |ppd_windowing2.q|47.318|
> |dynpart_sort_opt_vectorization.q|47.208|
> |vector_number_compare_projection.q|47.024|
> |correlationoptimizer4.q|45.472|
> |orc_ppd_date.q|45.19|
> |global_limit.q|44.438|
> |union_top_level.q|44.229|
> |llap_partitioned.q|44.139|
> |orc_ppd_timestamp.q|43.617|
> |parquet_ppd_date.q|43.539|
> |multiMapJoin2.q|43.036|
> |parquet_ppd_timestamp.q|42.665|
> 

[jira] [Updated] (HIVE-14914) Improve the 'TestClass' did not produce a TEST-*.xml file message

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14914:
--
Attachment: HIVE-14914.01.patch

Fairly straightforward patch. [~prasanth_j], [~spena] - could you please take a 
look?
The ptest unit tests pass locally.

> Improve the 'TestClass' did not produce a TEST-*.xml file message
> -
>
> Key: HIVE-14914
> URL: https://issues.apache.org/jira/browse/HIVE-14914
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14914.01.patch
>
>
> For timed-out unit test batches, this report may not be generated correctly.
> Also, there's no differentiation between a batch with 0 tests and an actual 
> missing report.
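
The distinction asked for above can be sketched roughly as follows. This is a minimal illustration in Python, not the ptest code: the function name classify_batch, its parameters, and the message strings are all hypothetical. The only thing taken from the issue is that a batch is expected to leave behind TEST-*.xml files.

```python
import glob
import os


def classify_batch(report_dir, expected_tests, timed_out):
    """Return a message describing the report state of one test batch.

    All names here are illustrative; this is not the ptest implementation.
    """
    reports = glob.glob(os.path.join(report_dir, "TEST-*.xml"))
    if reports:
        return "report found"
    if timed_out:
        # The batch was cut short, so a report may be absent or incomplete.
        return "batch timed out before producing a TEST-*.xml file"
    if expected_tests == 0:
        # Nothing was scheduled, so no report is expected at all.
        return "batch contained 0 tests"
    # Tests were expected and the batch finished, yet no report exists.
    return "batch did not produce a TEST-*.xml file"
```

Keeping the three cases as separate messages is the point of the sketch: a reader of the build output can then tell an empty batch from a timeout from a genuinely missing report.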





[jira] [Updated] (HIVE-14914) Improve the 'TestClass' did not produce a TEST-*.xml file message

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14914:
--
Status: Patch Available  (was: Open)

> Improve the 'TestClass' did not produce a TEST-*.xml file message
> -
>
> Key: HIVE-14914
> URL: https://issues.apache.org/jira/browse/HIVE-14914
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14914.01.patch
>
>
> For timed-out unit test batches, this report may not be generated correctly.
> Also, there's no differentiation between a batch with 0 tests and an actual 
> missing report.





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Patch Available  (was: Open)

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
> Attachments: HIVE-14913.1.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Attachment: HIVE-14913.1.patch

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
> Attachments: HIVE-14913.1.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Updated] (HIVE-14912) Fix the test failures for 2.1.1 caused by HIVE-13409

2016-10-07 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14912:

Status: Patch Available  (was: Open)

> Fix the test failures for 2.1.1 caused by HIVE-13409
> 
>
> Key: HIVE-14912
> URL: https://issues.apache.org/jira/browse/HIVE-14912
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-14912.1-branch-2.1.patch
>
>






[jira] [Updated] (HIVE-14912) Fix the test failures for 2.1.1 caused by HIVE-13409

2016-10-07 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14912:

Attachment: HIVE-14912.1-branch-2.1.patch

> Fix the test failures for 2.1.1 caused by HIVE-13409
> 
>
> Key: HIVE-14912
> URL: https://issues.apache.org/jira/browse/HIVE-14912
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-14912.1-branch-2.1.patch
>
>






[jira] [Resolved] (HIVE-14689) Failing test: TestCliDriver explainuser_3

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-14689.
---
Resolution: Done

Looks like this got fixed somewhere.

> Failing test: TestCliDriver explainuser_3
> -
>
> Key: HIVE-14689
> URL: https://issues.apache.org/jira/browse/HIVE-14689
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>
> Consistent failures for quite a while.





[jira] [Updated] (HIVE-14476) Fix logging issue for branch-1

2016-10-07 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14476:
--
Status: Patch Available  (was: Open)

> Fix logging issue for branch-1
> --
>
> Key: HIVE-14476
> URL: https://issues.apache.org/jira/browse/HIVE-14476
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14476.1.patch
>
>
> This issue is in the branch-1 code that decides whether a log entry is an 
> operational log (operational logs are visible to the client). The code checks 
> the logging mode at the beginning of the decide() method, but the logging 
> mode is only updated after that check. As a result, an operational log can 
> be filtered out if it is the very first entry checked by this method, and 
> that particular entry does not show up for the end user.
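
The ordering problem described above can be illustrated with a toy filter. This is a sketch in Python, not the branch-1 Java code: the class name OperationLogFilter, the "execution" mode value, and the filtering rule are invented for illustration. Only the check-then-update ordering inside decide() comes from the description.

```python
class OperationLogFilter:
    """Toy filter: operational entries pass only while mode is 'execution'."""

    def __init__(self, fix_applied):
        self.fix_applied = fix_applied
        self.mode = None  # not known until the first entry has been seen

    def _refresh_mode(self, current_mode):
        self.mode = current_mode

    def decide(self, entry_is_operational, current_mode):
        if self.fix_applied:
            # Fixed order: update the mode first, then check it.
            self._refresh_mode(current_mode)
            return self.mode == "execution" and entry_is_operational
        # Buggy order: check the stale mode, then update it.
        accepted = self.mode == "execution" and entry_is_operational
        self._refresh_mode(current_mode)
        return accepted
```

With fix_applied=False the very first operational entry is rejected because the mode is still unset when it is checked, while later entries pass. With fix_applied=True the first entry passes as well.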





[jira] [Commented] (HIVE-14721) Fix TestJdbcWithMiniHS2 runtime

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556093#comment-15556093
 ] 

Hive QA commented on HIVE-14721:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832161/HIVE-14721.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 10663 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1427/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1427/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1427/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832161 - PreCommit-HIVE-Build

> Fix TestJdbcWithMiniHS2 runtime
> ---
>
> Key: HIVE-14721
> URL: https://issues.apache.org/jira/browse/HIVE-14721
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-14721.1.patch, HIVE-14721.2.patch, 
> HIVE-14721.3.patch, HIVE-14721.3.patch, HIVE-14721.3.patch, 
> HIVE-14721.4.patch, HIVE-14721.4.patch, HIVE-14721.5.patch
>
>
> Currently 450s





[jira] [Commented] (HIVE-14476) Fix logging issue for branch-1

2016-10-07 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556084#comment-15556084
 ] 

Thejas M Nair commented on HIVE-14476:
--

Can you click on Submit Patch so that it gets run through the unit tests?

> Fix logging issue for branch-1
> --
>
> Key: HIVE-14476
> URL: https://issues.apache.org/jira/browse/HIVE-14476
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14476.1.patch
>
>
> This issue is in the branch-1 code that decides whether a log entry is an 
> operational log (operational logs are visible to the client). The code checks 
> the logging mode at the beginning of the decide() method, but the logging 
> mode is only updated after that check. As a result, an operational log can 
> be filtered out if it is the very first entry checked by this method, and 
> that particular entry does not show up for the end user.





[jira] [Commented] (HIVE-14476) Fix logging issue for branch-1

2016-10-07 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556082#comment-15556082
 ] 

Thejas M Nair commented on HIVE-14476:
--

+1

> Fix logging issue for branch-1
> --
>
> Key: HIVE-14476
> URL: https://issues.apache.org/jira/browse/HIVE-14476
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14476.1.patch
>
>
> This issue is in the branch-1 code that decides whether a log entry is an 
> operational log (operational logs are visible to the client). The code checks 
> the logging mode at the beginning of the decide() method, but the logging 
> mode is only updated after that check. As a result, an operational log can 
> be filtered out if it is the very first entry checked by this method, and 
> that particular entry does not show up for the end user.





[jira] [Updated] (HIVE-14877) Move slow CliDriver tests to MiniLlap

2016-10-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14877:
-
Attachment: HIVE-14877.5.patch

Explicitly setting the stats to avoid flakiness in the test. 

> Move slow CliDriver tests to MiniLlap
> -
>
> Key: HIVE-14877
> URL: https://issues.apache.org/jira/browse/HIVE-14877
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14877.1.patch, HIVE-14877.2.patch, 
> HIVE-14877.3.patch, HIVE-14877.4.patch, HIVE-14877.5.patch
>
>
> When analyzing the test runtimes, many CliDriver tests show up as slow 
> stragglers. Most of these tests are not really testing the execution engine. 
> For example, special_character_in_tabnames_1.q is the slowest test case: it 
> takes 419s in CliDriver but only 62s in MiniLlap. Many other test cases could 
> similarly benefit from faster runtimes. We should consider moving the tests 
> that are not testing the execution engine to MiniLlap (assuming it provides 
> a significant performance benefit).
> Here is the list of the top 100 slowest tests based on build #1055:
> ||QFiles||TestCliDriver elapsed time||
> |special_character_in_tabnames_1.q|419.229|
> |unionDistinct_1.q|278.583|
> |vector_leftsemi_mapjoin.q|232.313|
> |join_filters.q|172.436|
> |escape2.q|167.503|
> |archive_excludeHadoop20.q|163.522|
> |escape1.q|130.217|
> |lineage3.q|110.935|
> |insert_into_with_schema.q|107.345|
> |auto_join_filters.q|104.331|
> |windowing.q|99.622|
> |index_compact_binary_search.q|97.637|
> |cbo_rp_windowing_2.q|95.108|
> |vectorized_ptf.q|93.397|
> |dynpart_sort_optimization_acid.q|91.831|
> |partition_multilevels.q|90.392|
> |ptf.q|89.115|
> |sample_islocalmode_hook.q|88.293|
> |udaf_collect_set_2.q|84.725|
> |skewjoin.q|84.588|
> |lineage2.q|84.187|
> |correlationoptimizer1.q|80.367|
> |dynpart_sort_optimization.q|77.07|
> |orc_ppd_decimal.q|75.523|
> |orc_ppd_schema_evol_3a.q|75.352|
> |groupby_sort_skew_1_23.q|75.342|
> |cbo_rp_lineage2.q|75.283|
> |parquet_ppd_decimal.q|74.063|
> |sample_islocalmode_hook_use_metadata.q|73.988|
> |orc_analyze.q|73.803|
> |join_nulls.q|72.417|
> |semijoin.q|70.403|
> |correlationoptimizer6.q|69.151|
> |table_access_keys_stats.q|68.699|
> |autoColumnStats_2.q|68.632|
> |cbo_join.q|68.325|
> |cbo_rp_join.q|68.317|
> |sample10.q|64.513|
> |mergejoin.q|63.647|
> |multi_insert_move_tasks_share_dependencies.q|62.079|
> |union_view.q|61.772|
> |autoColumnStats_1.q|61.246|
> |groupby_sort_1_23.q|61.129|
> |pcr.q|59.546|
> |vectorization_short_regress.q|58.775|
> |auto_sortmerge_join_9.q|58.3|
> |correlationoptimizer2.q|56.591|
> |alter_merge_stats_orc.q|55.202|
> |vector_join30.q|54.85|
> |selectDistinctStar.q|53.981|
> |vector_decimal_udf.q|53.879|
> |auto_join30.q|53.762|
> |subquery_notin.q|52.879|
> |cbo_rp_subq_not_in.q|52.609|
> |cbo_rp_gby.q|51.866|
> |cbo_subq_not_in.q|51.672|
> |cbo_gby.q|50.361|
> |infer_bucket_sort.q|49.158|
> |ptf_streaming.q|48.484|
> |join_1to1.q|48.268|
> |load_dyn_part5.q|47.796|
> |limit_join_transpose.q|47.517|
> |ppd_windowing2.q|47.318|
> |dynpart_sort_opt_vectorization.q|47.208|
> |vector_number_compare_projection.q|47.024|
> |correlationoptimizer4.q|45.472|
> |orc_ppd_date.q|45.19|
> |global_limit.q|44.438|
> |union_top_level.q|44.229|
> |llap_partitioned.q|44.139|
> |orc_ppd_timestamp.q|43.617|
> |parquet_ppd_date.q|43.539|
> |multiMapJoin2.q|43.036|
> |parquet_ppd_timestamp.q|42.665|
> |vector_partitioned_date_time.q|42.511|
> |auto_sortmerge_join_8.q|42.377|
> |create_view.q|42.23|
> |windowing_windowspec2.q|42.202|
> |multiMapJoin1.q|41.176|
> |vector_decimal_2.q|41.026|
> |bucket_groupby.q|40.565|
> |rcfile_merge2.q|39.782|
> |index_compact_2.q|39.765|
> |join_nullsafe.q|39.698|
> |vector_join_filters.q|39.343|
> |cbo_rp_auto_join1.q|39.308|
> |vector_auto_smb_mapjoin_14.q|39.17|
> |vector_udf1.q|38.988|
> |rcfile_createas1.q|38.932|
> |cbo_rp_semijoin.q|38.675|
> |auto_join_nulls.q|38.519|
> |cbo_rp_unionDistinct_2.q|37.815|
> |union_remove_26.q|37.672|
> |rcfile_merge3.q|37.373|
> |rcfile_merge4.q|37.194|
> |bucketsortoptimize_insert_2.q|37.187|
> |cbo_limit.q|37.038|
> |auto_sortmerge_join_6.q|36.663|
> |join43.q|36.656|
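
For anyone who wants to work with the numbers in the table above, here is a small sketch in Python. The helper names parse_rows and speedup are hypothetical, and the only MiniLlap timing available is the single 62s figure quoted in the description, which gives a speedup of roughly 6.8x for special_character_in_tabnames_1.q.

```python
def parse_rows(table_text):
    """Parse '|name.q|seconds|' rows, skipping the '||header||' line."""
    rows = {}
    for line in table_text.splitlines():
        # Drop the mail quote prefix ("> ") and surrounding whitespace.
        line = line.lstrip("> ").strip()
        if not line.startswith("|") or line.startswith("||"):
            continue
        name, seconds = line.strip("|").split("|")
        rows[name] = float(seconds)
    return rows


def speedup(cli_seconds, llap_seconds):
    """Ratio of CliDriver time to MiniLlap time for one test."""
    return cli_seconds / llap_seconds
```

Calling sum(parse_rows(text).values()) on the full table gives the total CliDriver time spent in these tests, which is the budget the proposed move is trying to shrink.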





[jira] [Commented] (HIVE-14745) Remove jira user/password from profiles by using another command to submit results to jira

2016-10-07 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1933#comment-1933
 ] 

Sergio Peña commented on HIVE-14745:


We can set environment variables on Jenkins, but those won't be accessible by 
PTest because they're running in different instances. 
Jenkins is the client which calls PTestClient (this has access to the env. 
variables), then PTestClient makes an HTTP call to PTest with the patch to test.

> Remove jira user/password from profiles by using another command to submit 
> results to jira
> --
>
> Key: HIVE-14745
> URL: https://issues.apache.org/jira/browse/HIVE-14745
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Testing Infrastructure
>Reporter: Sergio Peña
>
> Hive ptest uses some properties files per branch that contain information 
> about how to execute the tests.
> This profile includes the user & password to submit the results to JIRA. We 
> should get rid of this sensitive information from the profile by moving the 
> jira submission task to another command or script executed directly by 
> Jenkins.





[jira] [Commented] (HIVE-14745) Remove jira user/password from profiles by using another command to submit results to jira

2016-10-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1919#comment-1919
 ] 

Prasanth Jayachandran commented on HIVE-14745:
--

Correct me if I am wrong. Can Jenkins put the information in some env 
variable accessible only to the hiveptest user? PTest could then read it from 
that env variable and use it to publish to JIRA.

> Remove jira user/password from profiles by using another command to submit 
> results to jira
> --
>
> Key: HIVE-14745
> URL: https://issues.apache.org/jira/browse/HIVE-14745
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Testing Infrastructure
>Reporter: Sergio Peña
>
> Hive ptest uses some properties files per branch that contain information 
> about how to execute the tests.
> This profile includes the user & password to submit the results to JIRA. We 
> should get rid of this sensitive information from the profile by moving the 
> jira submission task to another command or script executed directly by 
> Jenkins.





[jira] [Commented] (HIVE-14861) Support precedence for set operator using parentheses

2016-10-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1908#comment-1908
 ] 

Pengcheng Xiong commented on HIVE-14861:


Note that this will change the current Hive behavior:
{code}
create table empsalary (depname varchar(10), salary int);

-- Q1
SELECT sum(salary) OVER w, avg(salary) OVER w
  FROM empsalary
  WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);

-- Q2
SELECT sum(salary) OVER w as s, avg(salary) OVER w as a
  FROM empsalary
  WINDOW w AS (PARTITION BY depname ORDER BY salary DESC)
  order by s;

-- Q3
SELECT sum(salary) OVER w as s, avg(salary) OVER w as a
  FROM empsalary
  order by s
  WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);
{code}

Current Hive (before this patch) succeeds on Q1 and Q3 and fails on Q2. 
Postgres succeeds on Q1 and Q2 and fails on Q3. Oracle sqlplus fails on Q1, Q2, 
and Q3 because it does not support a separate window definition. After this 
patch, Hive will follow what Postgres does.

> Support precedence for set operator using parentheses
> -
>
> Key: HIVE-14861
> URL: https://issues.apache.org/jira/browse/HIVE-14861
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14861.01.patch, HIVE-14861.02.patch
>
>
> We should support precedence for set operators by using parentheses. For 
> example:
> {code}
> select * from src union all (select * from src union select * from src);
> {code}





[jira] [Commented] (HIVE-14745) Remove jira user/password from profiles by using another command to submit results to jira

2016-10-07 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1861#comment-1861
 ] 

Sergio Peña commented on HIVE-14745:


[~sseth] [~prasanth_j] I investigated this. The JIRA comment is published at 
the end of the test run in PTest.java. All JIRA information is read from the 
profile file.

I'm thinking about 3 options. Which one do you think is better?

Option #1 (preferred, as it separates testing from JIRA interaction)
- Remove JIRA submission from PTest, and write JIRA comments to a file instead.
- Download the comments file to Jenkins as part of the test results.
- Publish the comments (read from the file) to JIRA from 
jenkins-execute-build.sh (this script will have the JIRA user and password).

Option #2 (I don't like this much, as we would send the user/password in an HTTP 
request)
- Send JIRA information to the server through PTestClient parameters.
- PTest will read those values from TestStartRequest and use them in 
JIRAService.

Option #3 (the easiest way to do it, but it is another file to manage on the server)
- Store JIRA information in a separate properties file kept on the ptest 
server.
- PTest will read the properties file and use the values in JIRAService.
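
Option #1 above amounts to a write-then-publish handoff, which can be sketched as follows. This is an illustration in Python under stated assumptions, not the ptest or Jenkins code: the function names write_comment_file and publish_comment_file, and the post callable that stands in for the real JIRA call, are all hypothetical.

```python
def write_comment_file(path, comment):
    """Test-server side: save the would-be JIRA comment instead of posting it."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(comment)


def publish_comment_file(path, post):
    """Jenkins side: read the saved comment and hand it to `post`.

    `post` is any callable that actually talks to JIRA, so only this
    side needs the credentials.
    """
    with open(path, encoding="utf-8") as f:
        comment = f.read()
    if comment.strip():
        post(comment)
        return True
    # Nothing to publish (empty file), so JIRA is not contacted.
    return False
```

The design point is that the test server never sees the user and password: it only produces a file, and the script run by Jenkins is the single place where the credentials are used.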

> Remove jira user/password from profiles by using another command to submit 
> results to jira
> --
>
> Key: HIVE-14745
> URL: https://issues.apache.org/jira/browse/HIVE-14745
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Testing Infrastructure
>Reporter: Sergio Peña
>
> Hive ptest uses some properties files per branch that contain information 
> about how to execute the tests.
> This profile includes the user & password to submit the results to JIRA. We 
> should get rid of this sensitive information from the profile by moving the 
> jira submission task to another command or script executed directly by 
> Jenkins.





[jira] [Resolved] (HIVE-14811) Failing test: TestCliDriver ctas

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-14811.
---
Resolution: Duplicate

Fixed by HIVE-14896

> Failing test: TestCliDriver ctas
> 
>
> Key: HIVE-14811
> URL: https://issues.apache.org/jira/browse/HIVE-14811
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>






[jira] [Resolved] (HIVE-14812) Failing test: TestCliDriver acid_mapjoin

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-14812.
---
Resolution: Duplicate

Fixed by HIVE-14896

> Failing test: TestCliDriver acid_mapjoin
> 
>
> Key: HIVE-14812
> URL: https://issues.apache.org/jira/browse/HIVE-14812
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>






[jira] [Updated] (HIVE-14665) vector_join_part_col_char.q failure

2016-10-07 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14665:
--
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Fixed by HIVE-14896

> vector_join_part_col_char.q failure
> ---
>
> Key: HIVE-14665
> URL: https://issues.apache.org/jira/browse/HIVE-14665
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14665.1.patch
>
>
> Happens 100% of the time. Looks like a missed golden file update from 
> HIVE-14502.





[jira] [Updated] (HIVE-14721) Fix TestJdbcWithMiniHS2 runtime

2016-10-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14721:

Attachment: HIVE-14721.5.patch

> Fix TestJdbcWithMiniHS2 runtime
> ---
>
> Key: HIVE-14721
> URL: https://issues.apache.org/jira/browse/HIVE-14721
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-14721.1.patch, HIVE-14721.2.patch, 
> HIVE-14721.3.patch, HIVE-14721.3.patch, HIVE-14721.3.patch, 
> HIVE-14721.4.patch, HIVE-14721.4.patch, HIVE-14721.5.patch
>
>
> Currently 450s





[jira] [Commented] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1699#comment-1699
 ] 

Pengcheng Xiong commented on HIVE-14908:


Hi [~kgyrtkirk], if you have time, you can open a JIRA and try upgrading to 
v4, and then we can see whether it helps. I expect the major work will be 
rewriting the uses of "->" for v4. :)

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14908.01.patch
>
>
> Antlr v4 is also available but it does not support "->" which is widely used 
> in our grammar. Antlr 3.5.2 is the latest v3 version. It will reduce the code 
> size:
> {code}
> Here is a summary of the current parser code size:
>  422345 HiveLexer.java
> 2436601 HiveParser.java
>  814184 HiveParser_FromClauseParser.java
> 2705920 HiveParser_IdentifiersParser.java
>  777665 HiveParser_SelectClauseParser.java
> After the change, it will become:
>  319589 HiveLexer.java
> 1853104 HiveParser.java
>  574156 HiveParser_FromClauseParser.java
> 1799195 HiveParser_IdentifiersParser.java
>  587305 HiveParser_SelectClauseParser.java
> {code}
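
For a rough sense of scale, the figures quoted in the description can be totalled. The following is only a sketch: ParserSizeSummary and its methods are made-up names for illustration, and the source does not state the unit of the sizes, so none is assumed here.

```java
import java.util.Arrays;

// Illustrative only: ParserSizeSummary is a hypothetical helper, not part of Hive.
final class ParserSizeSummary {
    // Sizes exactly as quoted in the issue description (unit not stated there).
    static final long[] BEFORE = {422345L, 2436601L, 814184L, 2705920L, 777665L};
    static final long[] AFTER = {319589L, 1853104L, 574156L, 1799195L, 587305L};

    // Sum of all file sizes in one listing.
    static long total(long[] sizes) {
        return Arrays.stream(sizes).sum();
    }

    // Percentage by which the total shrinks going from "before" to "after".
    static double reductionPercent(long[] before, long[] after) {
        long b = total(before);
        return 100.0 * (b - total(after)) / b;
    }

    public static void main(String[] args) {
        System.out.println("before total: " + total(BEFORE));
        System.out.println("after total:  " + total(AFTER));
        System.out.println("reduction %:  " + reductionPercent(BEFORE, AFTER));
    }
}
```

Summed this way, the five generated files shrink from 7156715 to 5133349 in total, a reduction of roughly 28%.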





[jira] [Commented] (HIVE-14877) Move slow CliDriver tests to MiniLlap

2016-10-07 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1686#comment-1686
 ] 

Siddharth Seth commented on HIVE-14877:
---

[~prasanth_j] - I was seeing the same with some other test; would pass 
individually, but fail when run as a group. Had to eventually add it as an 
isolated test.

> Move slow CliDriver tests to MiniLlap
> -
>
> Key: HIVE-14877
> URL: https://issues.apache.org/jira/browse/HIVE-14877
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14877.1.patch, HIVE-14877.2.patch, 
> HIVE-14877.3.patch, HIVE-14877.4.patch
>
>
> When analyzing the test runtimes, many CliDriver tests show up as slow 
> stragglers. Most of these tests are not really testing the 
> execution engine. For example, special_character_in_tabnames_1.q is the 
> slowest test case: it takes 419s in CliDriver but only 62s in MiniLlap. 
> Many other test cases could similarly benefit from faster runtimes. We 
> should consider moving the tests that are not testing the execution engine to 
> MiniLlap (assuming it provides a significant performance benefit).
> Here is the list of the top 100 slow tests based on build #1055:
> ||QFiles||TestCliDriver elapsed time (s)||
> |special_character_in_tabnames_1.q|419.229|
> |unionDistinct_1.q|278.583|
> |vector_leftsemi_mapjoin.q|232.313|
> |join_filters.q|172.436|
> |escape2.q|167.503|
> |archive_excludeHadoop20.q|163.522|
> |escape1.q|130.217|
> |lineage3.q|110.935|
> |insert_into_with_schema.q|107.345|
> |auto_join_filters.q|104.331|
> |windowing.q|99.622|
> |index_compact_binary_search.q|97.637|
> |cbo_rp_windowing_2.q|95.108|
> |vectorized_ptf.q|93.397|
> |dynpart_sort_optimization_acid.q|91.831|
> |partition_multilevels.q|90.392|
> |ptf.q|89.115|
> |sample_islocalmode_hook.q|88.293|
> |udaf_collect_set_2.q|84.725|
> |skewjoin.q|84.588|
> |lineage2.q|84.187|
> |correlationoptimizer1.q|80.367|
> |dynpart_sort_optimization.q|77.07|
> |orc_ppd_decimal.q|75.523|
> |orc_ppd_schema_evol_3a.q|75.352|
> |groupby_sort_skew_1_23.q|75.342|
> |cbo_rp_lineage2.q|75.283|
> |parquet_ppd_decimal.q|74.063|
> |sample_islocalmode_hook_use_metadata.q|73.988|
> |orc_analyze.q|73.803|
> |join_nulls.q|72.417|
> |semijoin.q|70.403|
> |correlationoptimizer6.q|69.151|
> |table_access_keys_stats.q|68.699|
> |autoColumnStats_2.q|68.632|
> |cbo_join.q|68.325|
> |cbo_rp_join.q|68.317|
> |sample10.q|64.513|
> |mergejoin.q|63.647|
> |multi_insert_move_tasks_share_dependencies.q|62.079|
> |union_view.q|61.772|
> |autoColumnStats_1.q|61.246|
> |groupby_sort_1_23.q|61.129|
> |pcr.q|59.546|
> |vectorization_short_regress.q|58.775|
> |auto_sortmerge_join_9.q|58.3|
> |correlationoptimizer2.q|56.591|
> |alter_merge_stats_orc.q|55.202|
> |vector_join30.q|54.85|
> |selectDistinctStar.q|53.981|
> |vector_decimal_udf.q|53.879|
> |auto_join30.q|53.762|
> |subquery_notin.q|52.879|
> |cbo_rp_subq_not_in.q|52.609|
> |cbo_rp_gby.q|51.866|
> |cbo_subq_not_in.q|51.672|
> |cbo_gby.q|50.361|
> |infer_bucket_sort.q|49.158|
> |ptf_streaming.q|48.484|
> |join_1to1.q|48.268|
> |load_dyn_part5.q|47.796|
> |limit_join_transpose.q|47.517|
> |ppd_windowing2.q|47.318|
> |dynpart_sort_opt_vectorization.q|47.208|
> |vector_number_compare_projection.q|47.024|
> |correlationoptimizer4.q|45.472|
> |orc_ppd_date.q|45.19|
> |global_limit.q|44.438|
> |union_top_level.q|44.229|
> |llap_partitioned.q|44.139|
> |orc_ppd_timestamp.q|43.617|
> |parquet_ppd_date.q|43.539|
> |multiMapJoin2.q|43.036|
> |parquet_ppd_timestamp.q|42.665|
> |vector_partitioned_date_time.q|42.511|
> |auto_sortmerge_join_8.q|42.377|
> |create_view.q|42.23|
> |windowing_windowspec2.q|42.202|
> |multiMapJoin1.q|41.176|
> |vector_decimal_2.q|41.026|
> |bucket_groupby.q|40.565|
> |rcfile_merge2.q|39.782|
> |index_compact_2.q|39.765|
> |join_nullsafe.q|39.698|
> |vector_join_filters.q|39.343|
> |cbo_rp_auto_join1.q|39.308|
> |vector_auto_smb_mapjoin_14.q|39.17|
> |vector_udf1.q|38.988|
> |rcfile_createas1.q|38.932|
> |cbo_rp_semijoin.q|38.675|
> |auto_join_nulls.q|38.519|
> |cbo_rp_unionDistinct_2.q|37.815|
> |union_remove_26.q|37.672|
> |rcfile_merge3.q|37.373|
> |rcfile_merge4.q|37.194|
> |bucketsortoptimize_insert_2.q|37.187|
> |cbo_limit.q|37.038|
> |auto_sortmerge_join_6.q|36.663|
> |join43.q|36.656|





[jira] [Updated] (HIVE-14494) Add support for BUILD DEFERRED

2016-10-07 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14494:
---
Description: 
This is an important feature, as it allows materialized views to be declared 
without being materialized until they are first used or a REBUILD 
statement is executed. The extension for the CREATE MATERIALIZED VIEW syntax 
should be as follows:

{code:sql}
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
  [BUILD DEFERRED] -- NEW!
  [COMMENT materialized_view_comment]
  [
   [ROW FORMAT row_format] 
   [STORED AS file_format]
 | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]
  AS select_statement;
{code}

  was:
This is an important feature, as it allows to declare materialized views but do 
not materialize them till they are used for the first use, or a REBUILD 
statement is executed. The extension for the CREATE MATERIALIZED VIEW syntax 
should be as follows:

{code:sql}
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
  [BUILD DEFERRED] -- NEW!
  [COMMENT materialized_view_comment]
  [
   [ROW FORMAT row_format] 
   [STORED AS file_format]
 | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]
  AS select_statement;
{code}


> Add support for BUILD DEFERRED
> --
>
> Key: HIVE-14494
> URL: https://issues.apache.org/jira/browse/HIVE-14494
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>
> This is an important feature, as it allows materialized views to be declared 
> without being materialized until they are first used or a REBUILD 
> statement is executed. The extension for the CREATE MATERIALIZED VIEW syntax 
> should be as follows:
> {code:sql}
> CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
>   [BUILD DEFERRED] -- NEW!
>   [COMMENT materialized_view_comment]
>   [
>    [ROW FORMAT row_format]
>    [STORED AS file_format]
>  | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
>   ]
>   [LOCATION hdfs_path]
>   [TBLPROPERTIES (property_name=property_value, ...)]
>   AS select_statement;
> {code}





[jira] [Assigned] (HIVE-14497) Fine control for using materialized views in rewriting

2016-10-07 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-14497:
--

Assignee: Jesus Camacho Rodriguez

> Fine control for using materialized views in rewriting
> --
>
> Key: HIVE-14497
> URL: https://issues.apache.org/jira/browse/HIVE-14497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up of HIVE-14495. Since the number of materialized views in the system 
> might grow very large, and query rewriting using materialized views might be 
> very expensive, we need to include a mechanism to enable/disable materialized 
> views for query rewriting.
> Thus, we should extend the CREATE MATERIALIZED VIEW statement as follows:
> {code:sql}
> CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
>   [BUILD DEFERRED]
>   [ENABLE REWRITE] -- NEW!
>   [COMMENT materialized_view_comment]
>   [
>    [ROW FORMAT row_format]
>    [STORED AS file_format]
>  | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
>   ]
>   [LOCATION hdfs_path]
>   [TBLPROPERTIES (property_name=property_value, ...)]
>   AS select_statement;
> {code}
> Further, we should extend the ALTER statement in case we want to change the 
> behavior of the materialized view after we have created it.
> {code:sql}
> ALTER MATERIALIZED VIEW [db_name.]materialized_view_name DISABLE REWRITE;
> {code}
> {code:sql}
> ALTER MATERIALIZED VIEW [db_name.]materialized_view_name ENABLE REWRITE;
> {code}





[jira] [Commented] (HIVE-14877) Move slow CliDriver tests to MiniLlap

2016-10-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1646#comment-1646
 ] 

Prasanth Jayachandran commented on HIVE-14877:
--

dynpart_sort_optimization_acid.q failure looks flaky. It doesn't fail on my 
local run. Will see if I can repro it running along with other tests.

> Move slow CliDriver tests to MiniLlap
> -
>
> Key: HIVE-14877
> URL: https://issues.apache.org/jira/browse/HIVE-14877
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14877.1.patch, HIVE-14877.2.patch, 
> HIVE-14877.3.patch, HIVE-14877.4.patch
>
>
> When analyzing the test runtimes, many CliDriver tests show up as slow 
> stragglers. Most of these tests are not really testing the 
> execution engine. For example, special_character_in_tabnames_1.q is the 
> slowest test case: it takes 419s in CliDriver but only 62s in MiniLlap. 
> Many other test cases could similarly benefit from faster runtimes. We 
> should consider moving the tests that are not testing the execution engine to 
> MiniLlap (assuming it provides a significant performance benefit).
> Here is the list of the top 100 slow tests based on build #1055:
> ||QFiles||TestCliDriver elapsed time (s)||
> |special_character_in_tabnames_1.q|419.229|
> |unionDistinct_1.q|278.583|
> |vector_leftsemi_mapjoin.q|232.313|
> |join_filters.q|172.436|
> |escape2.q|167.503|
> |archive_excludeHadoop20.q|163.522|
> |escape1.q|130.217|
> |lineage3.q|110.935|
> |insert_into_with_schema.q|107.345|
> |auto_join_filters.q|104.331|
> |windowing.q|99.622|
> |index_compact_binary_search.q|97.637|
> |cbo_rp_windowing_2.q|95.108|
> |vectorized_ptf.q|93.397|
> |dynpart_sort_optimization_acid.q|91.831|
> |partition_multilevels.q|90.392|
> |ptf.q|89.115|
> |sample_islocalmode_hook.q|88.293|
> |udaf_collect_set_2.q|84.725|
> |skewjoin.q|84.588|
> |lineage2.q|84.187|
> |correlationoptimizer1.q|80.367|
> |dynpart_sort_optimization.q|77.07|
> |orc_ppd_decimal.q|75.523|
> |orc_ppd_schema_evol_3a.q|75.352|
> |groupby_sort_skew_1_23.q|75.342|
> |cbo_rp_lineage2.q|75.283|
> |parquet_ppd_decimal.q|74.063|
> |sample_islocalmode_hook_use_metadata.q|73.988|
> |orc_analyze.q|73.803|
> |join_nulls.q|72.417|
> |semijoin.q|70.403|
> |correlationoptimizer6.q|69.151|
> |table_access_keys_stats.q|68.699|
> |autoColumnStats_2.q|68.632|
> |cbo_join.q|68.325|
> |cbo_rp_join.q|68.317|
> |sample10.q|64.513|
> |mergejoin.q|63.647|
> |multi_insert_move_tasks_share_dependencies.q|62.079|
> |union_view.q|61.772|
> |autoColumnStats_1.q|61.246|
> |groupby_sort_1_23.q|61.129|
> |pcr.q|59.546|
> |vectorization_short_regress.q|58.775|
> |auto_sortmerge_join_9.q|58.3|
> |correlationoptimizer2.q|56.591|
> |alter_merge_stats_orc.q|55.202|
> |vector_join30.q|54.85|
> |selectDistinctStar.q|53.981|
> |vector_decimal_udf.q|53.879|
> |auto_join30.q|53.762|
> |subquery_notin.q|52.879|
> |cbo_rp_subq_not_in.q|52.609|
> |cbo_rp_gby.q|51.866|
> |cbo_subq_not_in.q|51.672|
> |cbo_gby.q|50.361|
> |infer_bucket_sort.q|49.158|
> |ptf_streaming.q|48.484|
> |join_1to1.q|48.268|
> |load_dyn_part5.q|47.796|
> |limit_join_transpose.q|47.517|
> |ppd_windowing2.q|47.318|
> |dynpart_sort_opt_vectorization.q|47.208|
> |vector_number_compare_projection.q|47.024|
> |correlationoptimizer4.q|45.472|
> |orc_ppd_date.q|45.19|
> |global_limit.q|44.438|
> |union_top_level.q|44.229|
> |llap_partitioned.q|44.139|
> |orc_ppd_timestamp.q|43.617|
> |parquet_ppd_date.q|43.539|
> |multiMapJoin2.q|43.036|
> |parquet_ppd_timestamp.q|42.665|
> |vector_partitioned_date_time.q|42.511|
> |auto_sortmerge_join_8.q|42.377|
> |create_view.q|42.23|
> |windowing_windowspec2.q|42.202|
> |multiMapJoin1.q|41.176|
> |vector_decimal_2.q|41.026|
> |bucket_groupby.q|40.565|
> |rcfile_merge2.q|39.782|
> |index_compact_2.q|39.765|
> |join_nullsafe.q|39.698|
> |vector_join_filters.q|39.343|
> |cbo_rp_auto_join1.q|39.308|
> |vector_auto_smb_mapjoin_14.q|39.17|
> |vector_udf1.q|38.988|
> |rcfile_createas1.q|38.932|
> |cbo_rp_semijoin.q|38.675|
> |auto_join_nulls.q|38.519|
> |cbo_rp_unionDistinct_2.q|37.815|
> |union_remove_26.q|37.672|
> |rcfile_merge3.q|37.373|
> |rcfile_merge4.q|37.194|
> |bucketsortoptimize_insert_2.q|37.187|
> |cbo_limit.q|37.038|
> |auto_sortmerge_join_6.q|36.663|
> |join43.q|36.656|





[jira] [Commented] (HIVE-7926) long-lived daemons for query fragment execution, I/O and caching

2016-10-07 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1612#comment-1612
 ] 

Siddharth Seth commented on HIVE-7926:
--

bq. by this I assume a tez vertex interacts with the LLAP daemon to ensure the 
combined resources do not exceed what is allocated to either one of them, 
right?
I don't quite understand the statement. Tez interacts with LLAP daemons to 
submit work - both the Tez AM and individual LLAP daemons work to ensure that 
individual daemons stay within their allocated resources. Let me know if more 
clarifications are required.

bq. Given that LLAP caches data sets to be used by multiple tasks, I am 
interested to know which application (assuming LLAP is deployed as a 
YARN app) these resources would be charged against.
The resources are charged against the LLAP application in YARN - and not to 
individual queries / users submitting the queries.

> long-lived daemons for query fragment execution, I/O and caching
> 
>
> Key: HIVE-7926
> URL: https://issues.apache.org/jira/browse/HIVE-7926
> Project: Hive
>  Issue Type: New Feature
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: LLAPdesigndocument.pdf
>
>
> We are proposing a new execution model for Hive that is a combination of 
> existing process-based tasks and long-lived daemons running on worker nodes. 
> These nodes can take care of efficient I/O, caching and query fragment 
> execution, while heavy lifting like most joins, ordering, etc. can be handled 
> by tasks.
> The proposed model is not a 2-system solution for small and large queries; 
> nor is it a separate execution engine like MR or Tez. It can be used by 
> any Hive execution engine, if support is added; in future even external 
> products (e.g. Pig) can use it.
> The document with high-level design we are proposing will be attached shortly.





[jira] [Commented] (HIVE-7926) long-lived daemons for query fragment execution, I/O and caching

2016-10-07 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1594#comment-1594
 ] 

Arun Suresh commented on HIVE-7926:
---

Thanks for the clarification [~sseth].

bq.  LLAP makes use of co-operative scheduling to manage concurrency.
By this I assume a tez vertex interacts with the LLAP daemon to ensure the 
combined resources do not exceed what is allocated to either one of them, 
right?
Given that LLAP caches data sets to be used by multiple tasks, I am 
interested to know which application (assuming LLAP is deployed as a 
YARN app) these resources would be charged against.


> long-lived daemons for query fragment execution, I/O and caching
> 
>
> Key: HIVE-7926
> URL: https://issues.apache.org/jira/browse/HIVE-7926
> Project: Hive
>  Issue Type: New Feature
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: LLAPdesigndocument.pdf
>
>
> We are proposing a new execution model for Hive that is a combination of 
> existing process-based tasks and long-lived daemons running on worker nodes. 
> These nodes can take care of efficient I/O, caching and query fragment 
> execution, while heavy lifting like most joins, ordering, etc. can be handled 
> by tasks.
> The proposed model is not a 2-system solution for small and large queries; 
> nor is it a separate execution engine like MR or Tez. It can be used by 
> any Hive execution engine, if support is added; in future even external 
> products (e.g. Pig) can use it.
> The document with high-level design we are proposing will be attached shortly.





[jira] [Commented] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1535#comment-1535
 ] 

Ashutosh Chauhan commented on HIVE-14908:
-

yeah.. benefits are worth the upgrade. 
Patch LGTM +1

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14908.01.patch
>
>
> Antlr v4 is also available but it does not support "->" which is widely used 
> in our grammar. Antlr 3.5.2 is the latest v3 version. It will reduce the code 
> size:
> {code}
> Here is a summary of the current parser code size:
>  422345 HiveLexer.java
> 2436601 HiveParser.java
>  814184 HiveParser_FromClauseParser.java
> 2705920 HiveParser_IdentifiersParser.java
>  777665 HiveParser_SelectClauseParser.java
> After the change, it will become:
>  319589 HiveLexer.java
> 1853104 HiveParser.java
>  574156 HiveParser_FromClauseParser.java
> 1799195 HiveParser_IdentifiersParser.java
>  587305 HiveParser_SelectClauseParser.java
> {code}





[jira] [Commented] (HIVE-14839) Improve the stability of TestSessionManagerMetrics

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1418#comment-1418
 ] 

Hive QA commented on HIVE-14839:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832137/HIVE-14839.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10656 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1426/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1426/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1426/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832137 - PreCommit-HIVE-Build

> Improve the stability of TestSessionManagerMetrics
> --
>
> Key: HIVE-14839
> URL: https://issues.apache.org/jira/browse/HIVE-14839
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Minor
> Attachments: HIVE-14839.patch
>
>
> The TestSessionManagerMetrics fails occasionally with the following error: 
> {noformat}
> org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
>   at 
> org.apache.hive.service.cli.session.TestSessionManagerMetrics.testThreadPoolMetrics(TestSessionManagerMetrics.java:98)
> Failed tests: 
>   TestSessionManagerMetrics.testThreadPoolMetrics:98 expected:<[0]> but 
> was:<[1]>
> {noformat}
> This test starts four background threads with a "wait" call in their run 
> method. The threads use the common "barrier" object as a lock. 
> The expected behaviour is that two threads will be in the async pool (because 
> hive.server2.async.exec.threads is set to 2) and the other two threads 
> will be waiting in the queue. This condition is checked like this:
> {noformat}
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE, 
> MetricsConstant.EXEC_ASYNC_POOL_SIZE, 2);
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE, 
> MetricsConstant.EXEC_ASYNC_QUEUE_SIZE, 2);
> {noformat}
>   
> Then a notifyAll is called on the lock object, so the two threads in the pool 
> should "wake up" and complete and the other two threads should go from the 
> queue to the pool. This is checked like this in the test:
> {noformat}
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE, 
> MetricsConstant.EXEC_ASYNC_POOL_SIZE, 2);
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE, 
> MetricsConstant.EXEC_ASYNC_QUEUE_SIZE, 0);
> {noformat}
> 
> There are two scenarios that can cause this test to fail:
> # The notifyAll call happens before both threads in the pool are up and 
> running and in the "wait" phase.
> In this case the thread that is not up in time will be stuck in the pool, so 
> the other two threads cannot move from the queue to the pool. 
> # After the notifyAll call, the threads in the pool "wake up" with some 
> delay. They have not yet completed and been removed from the pool, and the 
> other two threads have not yet moved from the queue to the pool, when the 
> metrics are checked. Therefore the check fails, since the queue is not empty.
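
One way to close both windows described above is to make each thread signal that it is running before the test inspects the pool. The sketch below is not the attached patch and does not use Hive's session manager or metrics classes. It swaps wait/notifyAll for java.util.concurrent.CountDownLatch, and PoolRaceSketch, blockingTask and awaitOrFail are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; not the code from HIVE-14839.patch.
final class PoolRaceSketch {

    // A task that announces it is running, then blocks until released.
    static Runnable blockingTask(CountDownLatch started, CountDownLatch release) {
        return () -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    // Waits on a latch with a timeout so a broken run fails instead of hanging.
    static void awaitOrFail(CountDownLatch latch, String what) {
        try {
            if (!latch.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for " + what);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted waiting for " + what, e);
        }
    }

    // Returns {activeBefore, queuedBefore, activeAfter, queuedAfter}.
    static int[] run() {
        // Two worker threads and an unbounded queue, mirroring an async pool of size 2.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch firstStarted = new CountDownLatch(2);
        CountDownLatch secondStarted = new CountDownLatch(2);
        CountDownLatch releaseFirst = new CountDownLatch(1);
        CountDownLatch releaseSecond = new CountDownLatch(1);
        try {
            pool.execute(blockingTask(firstStarted, releaseFirst));
            pool.execute(blockingTask(firstStarted, releaseFirst));
            pool.execute(blockingTask(secondStarted, releaseSecond));
            pool.execute(blockingTask(secondStarted, releaseSecond));

            // Guard for scenario 1: both pool threads are really running before we look.
            awaitOrFail(firstStarted, "first two tasks to start");
            int activeBefore = pool.getActiveCount();
            int queuedBefore = pool.getQueue().size();

            releaseFirst.countDown();

            // Guard for scenario 2: wait until the queued tasks have actually started.
            awaitOrFail(secondStarted, "queued tasks to start");
            int activeAfter = pool.getActiveCount();
            int queuedAfter = pool.getQueue().size();
            return new int[] {activeBefore, queuedBefore, activeAfter, queuedAfter};
        } finally {
            // Release everything so the pool can shut down cleanly.
            releaseFirst.countDown();
            releaseSecond.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

The point of the design is that every check happens only after an explicit signal from the threads involved, so the outcome no longer depends on how quickly they are scheduled.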





[jira] [Commented] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1365#comment-1365
 ] 

Eugene Koifman commented on HIVE-14908:
---

Given how much this upgrade reduced the generated file sizes, it seems worth 
doing regardless of what we decide with respect to reserved words.

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14908.01.patch
>
>
> Antlr v4 is also available but it does not support "->" which is widely used 
> in our grammar. Antlr 3.5.2 is the latest v3 version. It will reduce the code 
> size:
> {code}
> Here is a summary of the current parser code size:
>  422345 HiveLexer.java
> 2436601 HiveParser.java
>  814184 HiveParser_FromClauseParser.java
> 2705920 HiveParser_IdentifiersParser.java
>  777665 HiveParser_SelectClauseParser.java
> After the change, it will become:
>  319589 HiveLexer.java
> 1853104 HiveParser.java
>  574156 HiveParser_FromClauseParser.java
> 1799195 HiveParser_IdentifiersParser.java
>  587305 HiveParser_SelectClauseParser.java
> {code}





[jira] [Updated] (HIVE-14889) Beeline leaks sensitive environment variables of HiveServer2 when you type set;

2016-10-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-14889:
---
   Resolution: Fixed
Fix Version/s: 2.1.1
   2.2.0
   Status: Resolved  (was: Patch Available)

Thanks [~vihangk1]. I committed the patch to master and branch-2.1

> Beeline leaks sensitive environment variables of HiveServer2 when you type 
> set;
> ---
>
> Key: HIVE-14889
> URL: https://issues.apache.org/jira/browse/HIVE-14889
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14889.1.patch, HIVE-14889.2.patch
>
>
> When you type set; beeline prints all the environment variables, including 
> passwords, which could be a major security risk. E.g., HADOOP_CREDSTORE_PASSWORD 
> below is leaked.
> {noformat}
> | env:HADOOP_CREDSTORE_PASSWORD=password |
> | env:HADOOP_DATANODE_OPTS=-Dhadoop.security.logger=ERROR,RFAS  |
> | env:HADOOP_HOME_WARN_SUPPRESS=true |
> | env:HADOOP_IDENT_STRING=vihang |
> | env:HADOOP_PID_DIR=|
> {noformat}
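
As a rough illustration of the kind of filtering that would avoid this leak, the sketch below masks the values of environment variables whose names look sensitive. It is not the change committed for this issue. EnvRedactor, the marker list and the "***" mask are all assumptions made for the example, and a real fix might hide such variables entirely rather than mask them.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not the change made in HIVE-14889.
final class EnvRedactor {
    // Assumed markers of a sensitive variable name; a real list would need review.
    private static final String[] SENSITIVE_MARKERS = {"PASSWORD", "SECRET", "TOKEN"};

    // True if the variable name contains any of the assumed markers.
    static boolean isSensitive(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        for (String marker : SENSITIVE_MARKERS) {
            if (upper.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    // Returns a sorted copy of the environment with sensitive values masked.
    static Map<String, String> redact(Map<String, String> env) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            out.put(e.getKey(), isSensitive(e.getKey()) ? "***" : e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Print the current process environment in the "env:NAME=value" form shown above.
        redact(System.getenv()).forEach((k, v) -> System.out.println("env:" + k + "=" + v));
    }
}
```

With the listing above as input, HADOOP_CREDSTORE_PASSWORD would be printed with a masked value while HADOOP_IDENT_STRING would pass through unchanged.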





[jira] [Commented] (HIVE-14907) Hive Metastore should use repeatable-read consistency level

2016-10-07 Thread Lenni Kuff (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1247#comment-1247
 ] 

Lenni Kuff commented on HIVE-14907:
---

+[~mohitsabharwal] - FYI 

> Hive Metastore should use repeatable-read consistency level
> ---
>
> Key: HIVE-14907
> URL: https://issues.apache.org/jira/browse/HIVE-14907
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Lenni Kuff
>
> Currently HMS uses the "read-committed" consistency level, which is the 
> default for DataNucleus. This could cause problems because each transaction 
> can see updates committed by other transactions while it is running, 
> so it is very difficult to reason about any code that reads multiple pieces 
> of data.
> Instead it should use "repeatable-read" consistency, which guarantees that a 
> transaction sees only the state at its beginning plus any 
> updates made within that transaction.
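
The difference between the two levels can be shown with a toy model. This sketch does not use DataNucleus or a real database. It only mimics the visibility rules as the description states them, and every class name and key in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model for illustration only; it does not use DataNucleus or a real database.
final class IsolationSketch {

    // Committed state shared by all transactions.
    static final class Store {
        private final Map<String, String> committed = new HashMap<>();

        synchronized void commit(String key, String value) {
            committed.put(key, value);
        }

        synchronized String get(String key) {
            return committed.get(key);
        }

        synchronized Map<String, String> snapshot() {
            return new HashMap<>(committed);
        }
    }

    // Read-committed: every read sees whatever is committed at that moment.
    static final class ReadCommittedTxn {
        private final Store store;

        ReadCommittedTxn(Store store) {
            this.store = store;
        }

        String read(String key) {
            return store.get(key);
        }
    }

    // Repeatable-read as described above: state at start plus the transaction's own updates.
    static final class RepeatableReadTxn {
        private final Map<String, String> view;

        RepeatableReadTxn(Store store) {
            this.view = store.snapshot();
        }

        String read(String key) {
            return view.get(key);
        }

        void write(String key, String value) {
            view.put(key, value);
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.commit("tbl.location", "/warehouse/a");

        ReadCommittedTxn rc = new ReadCommittedTxn(store);
        RepeatableReadTxn rr = new RepeatableReadTxn(store);
        String rcFirst = rc.read("tbl.location");
        String rrFirst = rr.read("tbl.location");

        // Another transaction commits between the two reads.
        store.commit("tbl.location", "/warehouse/b");

        System.out.println("read-committed:  " + rcFirst + " then " + rc.read("tbl.location"));
        System.out.println("repeatable-read: " + rrFirst + " then " + rr.read("tbl.location"));
    }
}
```

In this model the read-committed transaction gets two different answers for the same key, which is exactly what makes code that reads several related pieces of data hard to reason about. The repeatable-read transaction keeps seeing the value from when it began.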





[jira] [Commented] (HIVE-14839) Improve the stability of TestSessionManagerMetrics

2016-10-07 Thread Marta Kuczora (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1164#comment-1164
 ] 

Marta Kuczora commented on HIVE-14839:
--

The patch is attached. 
Created review on Review Board.

> Improve the stability of TestSessionManagerMetrics
> --
>
> Key: HIVE-14839
> URL: https://issues.apache.org/jira/browse/HIVE-14839
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Minor
> Attachments: HIVE-14839.patch
>
>
> The TestSessionManagerMetrics fails occasionally with the following error: 
> {noformat}
> org.junit.ComparisonFailure: expected:<[0]> but was:<[1]>
>   at 
> org.apache.hive.service.cli.session.TestSessionManagerMetrics.testThreadPoolMetrics(TestSessionManagerMetrics.java:98)
> Failed tests: 
>   TestSessionManagerMetrics.testThreadPoolMetrics:98 expected:<[0]> but 
> was:<[1]>
> {noformat}
> This test starts four background threads with a "wait" call in their run 
> method. The threads use the common "barrier" object as a lock. 
> The expected behaviour is that two threads will be in the async pool (because 
> hive.server2.async.exec.threads is set to 2) and the other two threads 
> will be waiting in the queue. This condition is checked like this:
> {noformat}
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE, 
> MetricsConstant.EXEC_ASYNC_POOL_SIZE, 2);
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE, 
> MetricsConstant.EXEC_ASYNC_QUEUE_SIZE, 2);
> {noformat}
>   
> Then a notifyAll is called on the lock object, so the two threads in the pool 
> should "wake up" and complete and the other two threads should go from the 
> queue to the pool. This is checked like this in the test:
> {noformat}
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE, 
> MetricsConstant.EXEC_ASYNC_POOL_SIZE, 2);
> MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.GAUGE, 
> MetricsConstant.EXEC_ASYNC_QUEUE_SIZE, 0);
> {noformat}
> 
> There are two scenarios that can cause this test to fail:
> # The notifyAll call happens before both threads in the pool are up and 
> running and in the "wait" phase.
> In this case the thread that is not up in time will be stuck in the pool, so 
> the other two threads cannot move from the queue to the pool. 
> # After the notifyAll call, the threads in the pool "wake up" with some 
> delay. They have not yet completed and been removed from the pool, and the 
> other two threads have not yet moved from the queue to the pool, when the 
> metrics are checked. Therefore the check fails, since the queue is not empty.





[jira] [Updated] (HIVE-14839) Improve the stability of TestSessionManagerMetrics

2016-10-07 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-14839:
-
Status: Patch Available  (was: Open)

> Improve the stability of TestSessionManagerMetrics
> --
>
> Key: HIVE-14839
> URL: https://issues.apache.org/jira/browse/HIVE-14839
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Minor
> Attachments: HIVE-14839.patch
>
>





[jira] [Updated] (HIVE-14839) Improve the stability of TestSessionManagerMetrics

2016-10-07 Thread Marta Kuczora (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-14839:
-
Attachment: HIVE-14839.patch

> Improve the stability of TestSessionManagerMetrics
> --
>
> Key: HIVE-14839
> URL: https://issues.apache.org/jira/browse/HIVE-14839
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Minor
> Attachments: HIVE-14839.patch
>
>





[jira] [Comment Edited] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554436#comment-15554436
 ] 

Zoltan Haindrich edited comment on HIVE-14908 at 10/7/16 1:22 PM:
--

I've also tried 3.5.2 while I experimented with interval fixes... it didn't 
help with that either ;)

But I was not sure about ANTLR upgrade preferences...
So I think I will look into it, but I'm not sure how hard it will be. Now that 
I've tried it, there are large differences.
But ANTLR 4 has much better dev tooling support, which would be useful during 
development.

Although the removal of the optional reservation of these keywords will help, 
I think identifierparser should be split up.
 [~pxiong] 


was (Author: kgyrtkirk):
I've also tried 3.5.2 while I experimented with interval fixes... it didn't 
help with that either ;)

But I was not sure about ANTLR upgrade preferences... AFAIK ANTLR 4 does syn 
rules automatically.
So I think it might be worth a try; I think I will try it.
ANTLR 4 has much better dev tooling support, which would be useful during 
development.

Although the removal of the optional reservation of these keywords will help, 
I think identifierparser should be split up.
 [~pxiong] 

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14908.01.patch
>
>
> Antlr v4 is also available, but it does not support "->", which is widely used 
> in our grammar. Antlr 3.5.2 is the latest v3 version. It will reduce the code 
> size:
> {code}
> Here is a summary of the current parser code size:
>  422345 HiveLexer.java
> 2436601 HiveParser.java
>  814184 HiveParser_FromClauseParser.java
> 2705920 HiveParser_IdentifiersParser.java
>  777665 HiveParser_SelectClauseParser.java
> After the change, it will become:
>  319589 HiveLexer.java
> 1853104 HiveParser.java
>  574156 HiveParser_FromClauseParser.java
> 1799195 HiveParser_IdentifiersParser.java
>  587305 HiveParser_SelectClauseParser.java
> {code}
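
To put the figures above in perspective, the relative reduction can be computed per file and in total. The sketch below only does arithmetic on the numbers quoted in the description; the class name ParserSizeReduction is made up for illustration, and the description does not state the unit of the sizes.

```java
/** Illustrative arithmetic on the sizes quoted in the issue description. */
final class ParserSizeReduction {

  /** Percentage by which {@code after} is smaller than {@code before}. */
  static double reductionPercent(long before, long after) {
    return 100.0 * (before - after) / before;
  }

  /** Sum of all entries. */
  static long sum(long[] values) {
    long total = 0;
    for (long v : values) {
      total += v;
    }
    return total;
  }

  public static void main(String[] args) {
    String[] files = {
      "HiveLexer.java", "HiveParser.java", "HiveParser_FromClauseParser.java",
      "HiveParser_IdentifiersParser.java", "HiveParser_SelectClauseParser.java"
    };
    long[] before = {422345, 2436601, 814184, 2705920, 777665};
    long[] after = {319589, 1853104, 574156, 1799195, 587305};

    for (int i = 0; i < files.length; i++) {
      System.out.printf("%-36s %5.1f%%%n", files[i], reductionPercent(before[i], after[i]));
    }
    System.out.printf("%-36s %5.1f%%%n", "total", reductionPercent(sum(before), sum(after)));
  }
}
```

Summed over the five files, the generated code shrinks from 7156715 to 5133349, a reduction of roughly 28%.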





[jira] [Commented] (HIVE-14555) JDBC:ClassNotFoundException when executing a map join query with UDF

2016-10-07 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1045#comment-1045
 ] 

Aihua Xu commented on HIVE-14555:
-

OK. The issue should have been fixed by HIVE-12302 which will set the 
classLoader to the new thread's one.

> JDBC:ClassNotFoundException when executing a map join query with UDF
> 
>
> Key: HIVE-14555
> URL: https://issues.apache.org/jira/browse/HIVE-14555
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.1.0
>Reporter: hizero
>Assignee: hizero
> Fix For: 1.1.0
>
> Attachments: HIVE-14555.patch
>
>
> When I submit a map join query with a UDF using JDBC, it sometimes throws:
> Error while compiling statement: FAILED: SemanticException Generate Map Join 
> Task Error: Unable to find class: com.kingnetdc.hive.udf.FilterByMap 
> Serialization trace: genericUDF 
> (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) colExprMap 
> (org.apache.hadoop.hive.ql.exec.SelectOperator) childOperators 
> (org.apache.hadoop.hive.ql.exec.FilterOperator) childOperators 
> (org.apache.hadoop.hive.ql.exec.JoinOperator) reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork) reduceWork 
> (org.apache.hadoop.hive.ql.plan.MapredWork)
> I have found that it fails while cloning the plan, when invoking 
> Utilities.deserializePlan.
> An existing thread handles the query, and its static thread-local 
> variable, cloningQueryPlanKryo, is initialized at most once per thread. When 
> this thread registers a UDF set in aux_jar_paths, it won't reinitialize 
> cloningQueryPlanKryo.
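
The staleness described above is a general hazard of caching a class-loader-dependent object in a ThreadLocal on a reused thread. A minimal sketch of one possible guard, assuming the cached object can be rebuilt cheaply from a class loader: remember which loader the value was built with and rebuild when the thread's context class loader changes. PerThreadCache is a hypothetical class written for this illustration; it is not Hive's Utilities code, and it is not claimed to be what HIVE-12302 or the attached patch does.

```java
import java.util.function.Function;

/** Hypothetical per-thread cache that is rebuilt when the context class loader changes. */
final class PerThreadCache<T> {

  /** Cached value together with the class loader it was built for. */
  private static final class Entry<V> {
    final ClassLoader loader;
    final V value;

    Entry(ClassLoader loader, V value) {
      this.loader = loader;
      this.value = value;
    }
  }

  private final ThreadLocal<Entry<T>> cache = new ThreadLocal<>();
  private final Function<ClassLoader, T> factory;

  PerThreadCache(Function<ClassLoader, T> factory) {
    this.factory = factory;
  }

  /** Returns the cached value, rebuilding it if this thread's class loader has changed. */
  T get() {
    ClassLoader current = Thread.currentThread().getContextClassLoader();
    Entry<T> entry = cache.get();
    if (entry == null || entry.loader != current) {
      entry = new Entry<>(current, factory.apply(current));
      cache.set(entry);
    }
    return entry.value;
  }
}
```

In this sketch, a thread that later gets a new context class loader (for example, after extra jars are registered) would receive a freshly built value on its next call to get().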





[jira] [Commented] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554804#comment-15554804
 ] 

Hive QA commented on HIVE-14908:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832091/HIVE-14908.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10656 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[columnstats_partlvl_multiple_part_clause]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1425/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1425/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1425/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832091 - PreCommit-HIVE-Build

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14908.01.patch
>
>





[jira] [Commented] (HIVE-14877) Move slow CliDriver tests to MiniLlap

2016-10-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554507#comment-15554507
 ] 

Hive QA commented on HIVE-14877:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832073/HIVE-14877.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10626 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1424/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1424/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1424/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832073 - PreCommit-HIVE-Build

> Move slow CliDriver tests to MiniLlap
> -
>
> Key: HIVE-14877
> URL: https://issues.apache.org/jira/browse/HIVE-14877
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14877.1.patch, HIVE-14877.2.patch, 
> HIVE-14877.3.patch, HIVE-14877.4.patch
>
>
> When analyzing the test runtimes, there are many CliDriver tests that show 
> up as stragglers and are slow. Most of these tests are not really testing the 
> execution engine. For example, special_character_in_tabnames_1.q is the 
> slowest test case: it takes 419s in CliDriver but only 62s in MiniLlap. 
> Similarly, there are many test cases that can benefit from faster runtimes. We 
> should consider moving the tests that are not testing the execution engine to 
> MiniLlap (assuming it provides a significant performance benefit).
> Here is the list of the top 100 slow tests based on build #1055:
> ||QFiles||TestCliDriver elapsed time||
> |special_character_in_tabnames_1.q|419.229|
> |unionDistinct_1.q|278.583|
> |vector_leftsemi_mapjoin.q|232.313|
> |join_filters.q|172.436|
> |escape2.q|167.503|
> |archive_excludeHadoop20.q|163.522|
> |escape1.q|130.217|
> |lineage3.q|110.935|
> |insert_into_with_schema.q|107.345|
> |auto_join_filters.q|104.331|
> |windowing.q|99.622|
> |index_compact_binary_search.q|97.637|
> |cbo_rp_windowing_2.q|95.108|
> |vectorized_ptf.q|93.397|
> |dynpart_sort_optimization_acid.q|91.831|
> |partition_multilevels.q|90.392|
> |ptf.q|89.115|
> |sample_islocalmode_hook.q|88.293|
> |udaf_collect_set_2.q|84.725|
> |skewjoin.q|84.588|
> |lineage2.q|84.187|
> |correlationoptimizer1.q|80.367|
> |dynpart_sort_optimization.q|77.07|
> |orc_ppd_decimal.q|75.523|
> |orc_ppd_schema_evol_3a.q|75.352|
> |groupby_sort_skew_1_23.q|75.342|
> |cbo_rp_lineage2.q|75.283|
> |parquet_ppd_decimal.q|74.063|
> |sample_islocalmode_hook_use_metadata.q|73.988|
> |orc_analyze.q|73.803|
> |join_nulls.q|72.417|
> |semijoin.q|70.403|
> |correlationoptimizer6.q|69.151|
> |table_access_keys_stats.q|68.699|
> |autoColumnStats_2.q|68.632|
> |cbo_join.q|68.325|
> |cbo_rp_join.q|68.317|
> |sample10.q|64.513|
> |mergejoin.q|63.647|
> |multi_insert_move_tasks_share_dependencies.q|62.079|
> |union_view.q|61.772|
> |autoColumnStats_1.q|61.246|
> |groupby_sort_1_23.q|61.129|
> |pcr.q|59.546|
> |vectorization_short_regress.q|58.775|
> |auto_sortmerge_join_9.q|58.3|
> |correlationoptimizer2.q|56.591|
> |alter_merge_stats_orc.q|55.202|
> |vector_join30.q|54.85|
> |selectDistinctStar.q|53.981|
> |vector_decimal_udf.q|53.879|
> |auto_join30.q|53.762|
> |subquery_notin.q|52.879|
> |cbo_rp_subq_not_in.q|52.609|
> |cbo_rp_gby.q|51.866|
> |cbo_subq_not_in.q|51.672|
> |cbo_gby.q|50.361|
> |infer_bucket_sort.q|49.158|
> |ptf_streaming.q|48.484|
> |join_1to1.q|48.268|
> |load_dyn_part5.q|47.796|
> |limit_join_transpose.q|47.517|
> |ppd_windowing2.q|47.318|
> |dynpart_sort_opt_vectorization.q|47.208|
> |vector_number_compare_projection.q|47.024|
> |correlationoptimizer4.q|45.472|
> |orc_ppd_date.q|45.19|
> |global_limit.q|44.438|
> |union_top_level.q|44.229|
> |llap_partitioned.q|44.139|
> |orc_ppd_timestamp.q|43.617|
> |parquet_ppd_date.q|43.539|
> |multiMapJoin2.q|43.036|
> |parquet_ppd_timestamp.q|42.665|
> |vector_partitioned_date_time.q|42.511|
> |auto_sortmerge_join_8.q|42.377|
> |create_view.q|42.23|
> |windowing_windowspec2.q|42.202|
> |multiMapJoin1.q|41.176|
> 

[jira] [Updated] (HIVE-14909) Preserve the "parent location" of the table when an "alter table rename to " is submitted (the case when the db location is not specified and the Hive de

2016-10-07 Thread Adriano (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adriano updated HIVE-14909:
---
Assignee: Chaoyu Tang

> Preserve the "parent location" of the table when an "alter table  
> rename to " is submitted (the case when the db location is not 
> specified and the Hive default db is outside the same encrypted zone).
> --
>
> Key: HIVE-14909
> URL: https://issues.apache.org/jira/browse/HIVE-14909
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Adriano
>Assignee: Chaoyu Tang
>
> Alter Table operation for db_enc.rename_test failed to move data due to: 
> '/hdfs/encrypted_path/db_enc/rename_test can't be moved from an encryption 
> zone.'
> When Hive renames a managed table, it always creates the new renamed table 
> directory under its database directory in order to keep a db/table hierarchy. 
> In this case, the renamed table directory is created under the "default" db 
> directory, typically set as /hive/warehouse/. 
> This error doesn't appear if you first create a database which points to a 
> directory outside /hive/warehouse/, say '/hdfs/encrypted_path'. For example: 
> create database db_enc location '/hdfs/encrypted_path/db_enc'; 
> use db_enc; 
> create table rename_test (...) location 
> '/hdfs/encrypted_path/db_enc/rename_test'; 
> alter table rename_test rename to test_rename; 
> The renamed test_rename directory is created under 
> /hdfs/encrypted_path/db_enc. 
> Encrypting a filesystem is part of the evolution and hardening of a system 
> (where the system and the data it contains may already exist). A db may 
> already have been created without a location set (because it is not strictly 
> required), and the default db may be outside the same encryption zone (or in 
> a no-encryption zone); in that case the alter table rename operation will fail.
> Improvement:
> Preserve the "parent location" of the table when an "alter table  
> rename to " is submitted (the case when the db location is not 
> specified and the Hive default db is outside the same encrypted zone).
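
The difference between the current behaviour and the requested improvement comes down to how the target directory of the rename is derived. A minimal sketch using plain string paths, assuming absolute paths with "/" separators: RenameLocation and its methods are hypothetical names for illustration and are not Hive metastore code.

```java
/** Hypothetical illustration of the two ways to pick the renamed table's directory. */
final class RenameLocation {

  /** Behaviour described in the issue: the new directory goes under the database directory. */
  static String underDatabaseDir(String databaseDir, String newTableName) {
    return stripTrailingSlashes(databaseDir) + "/" + newTableName;
  }

  /** Requested improvement: keep the parent directory of the table's current location. */
  static String preservingParent(String currentTableDir, String newTableName) {
    String dir = stripTrailingSlashes(currentTableDir);
    int slash = dir.lastIndexOf('/');
    String parent = slash <= 0 ? "" : dir.substring(0, slash);
    return parent + "/" + newTableName;
  }

  /** Removes trailing "/" characters, but leaves a bare "/" alone. */
  private static String stripTrailingSlashes(String path) {
    String p = path;
    while (p.length() > 1 && p.endsWith("/")) {
      p = p.substring(0, p.length() - 1);
    }
    return p;
  }
}
```

For the table in the example, preserving the parent keeps the renamed directory at /hdfs/encrypted_path/db_enc/test_rename, inside the same encryption zone, whereas deriving it from a default database directory such as /hive/warehouse/ would move it outside.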





[jira] [Commented] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554436#comment-15554436
 ] 

Zoltan Haindrich commented on HIVE-14908:
-

I've also tried 3.5.2 while I experimented with interval fixes... it didn't 
help with that either ;)

But I was not sure about ANTLR upgrade preferences... AFAIK ANTLR 4 does syn 
rules automatically.
So I think it might be worth a try; I think I will try it.
ANTLR 4 has much better dev tooling support, which would be useful during 
development.

Although the removal of the optional reservation of these keywords will help, 
I think identifierparser should be split up.
 [~pxiong] 

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14908.01.patch
>
>





[jira] [Commented] (HIVE-14099) Hive security authorization can be disabled by users

2016-10-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554428#comment-15554428
 ] 

Lefty Leverenz commented on HIVE-14099:
---

Doc note:  This changes the default value of *hive.conf.restricted.list* so the 
wiki needs to be updated.  (It's out of date anyway:  the default changed in 
0.13.0 with HIVE-5953, 0.14.0 with HIVE-6437, and 2.1.0 with HIVE-13853.)

* [Configuration Properties -- hive.conf.restricted.list | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.conf.restricted.list]

Added a TODOC2.2 label.

> Hive security authorization can be disabled by users
> 
>
> Key: HIVE-14099
> URL: https://issues.apache.org/jira/browse/HIVE-14099
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Affects Versions: 0.13.1
>Reporter: Prashant Kumar Singh
>Assignee: Aihua Xu
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14099.1.patch
>
>
> In case we enable:
> hive.security.authorization.enabled=true in hive-site.xml
> this setting can be disabled by users at their hive prompt. There should be a 
> hardcoded setting in the configs.
> The other thing is that once we enable authorization, the tables that were 
> created before enabling it lose access, as they don't have authorization 
> defined. How can this situation be tackled in Hive?
> Note that this issue does not affect the SQL standard or Ranger authorization 
> plugins.
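
The idea behind a restricted list such as hive.conf.restricted.list is that some keys may be set in the site configuration but not overridden per session. A minimal sketch of that mechanism, assuming a comma-separated list of key names: RestrictedConf is a hypothetical class written for illustration, not Hive's HiveConf, and the exception type and message are invented.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical configuration holder that rejects runtime changes to restricted keys. */
final class RestrictedConf {
  private final Map<String, String> values = new HashMap<>();
  private final Set<String> restricted = new HashSet<>();

  /**
   * @param siteValues values loaded from the site configuration
   * @param restrictedList comma-separated keys that sessions may not modify
   */
  RestrictedConf(Map<String, String> siteValues, String restrictedList) {
    values.putAll(siteValues);
    for (String key : restrictedList.split(",")) {
      String trimmed = key.trim();
      if (!trimmed.isEmpty()) {
        restricted.add(trimmed);
      }
    }
  }

  /** Session-level override, as a user would attempt from the prompt. */
  void setAtRuntime(String key, String value) {
    if (restricted.contains(key)) {
      throw new IllegalArgumentException("Cannot modify " + key + " at runtime");
    }
    values.put(key, value);
  }

  String get(String key) {
    return values.get(key);
  }
}
```

In this sketch, once hive.security.authorization.enabled is on the restricted list, a session-level attempt to set it to false is rejected and the site value stays in effect.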





[jira] [Updated] (HIVE-14099) Hive security authorization can be disabled by users

2016-10-07 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14099:
--
Labels: TODOC2.2  (was: )

> Hive security authorization can be disabled by users
> 
>
> Key: HIVE-14099
> URL: https://issues.apache.org/jira/browse/HIVE-14099
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Affects Versions: 0.13.1
>Reporter: Prashant Kumar Singh
>Assignee: Aihua Xu
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14099.1.patch
>
>





[jira] [Updated] (HIVE-14866) Set hive.limit.optimize.enable to true by default

2016-10-07 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14866:
---
Description: 
Currently, we set up the global limit for the query in two different places 
through two different variables: in SemanticAnalyzer and through an optimization 
rule, GlobalLimitOptimizer (the latter is off by default).

This leads to several problems that I have observed:
- The global limit might not be set for very simple queries, e.g., if the query 
does not contain a RS. GlobalLimitOptimizer would set the limit in this case, 
but as stated above, it is off by default.
- Some other optimizations do not check both variables, thus missing 
opportunities.
- The variable set by SemanticAnalyzer does not take into account the offset of 
the query, which I think might lead to incorrect results if FetchOptimizer kicks 
in (not verified yet). GlobalLimitOptimizer does take the offset into account.

This issue is to set hive.limit.optimize.enable to _true_ by default, i.e., to 
use GlobalLimitOptimizer, thus getting rid of the variable set by 
SemanticAnalyzer. There may be some gaps (cases covered by the SemanticAnalyzer 
alternative and not covered by GlobalLimitOptimizer) that we will need to work 
on.

  was:
Currently, we set up the global limit for the query in two different places 
through two different variables: SemanticAnalyzer and through an optimization 
rule GlobalLimitOptimizer (the latest is off by default).

This leads to several problems that I have observed:
- Global limit might not be set for very simple queries, e.g., if the query 
does not contain a RS). GlobalLimitOptimizer would set the limit in this case, 
but as stated above, it is off by default.
- Some other optimizations are not checking both variables, thus missing 
opportunities.
- The variable set by SemanticAnalyzer does not take into account offset of the 
query, which I think might lead to incorrect results if FetchOptimizer kicks in 
(not verified yet). GlobalLimitOptimizer does take into account offset of query.

This issue is to set hive.limit.optimize.enable to _true_ by default, i.e., use 
GlobalLimitOptimizer, and thus getting rid of the variable set by 
SemanticAnalyzer. Maybe there are some gaps (cases covered by SemanticAnalyzer 
alternative and not covered by GlobalLimitOptimizer) that we will need to work 
on.


> Set hive.limit.optimize.enable to true by default
> -
>
> Key: HIVE-14866
> URL: https://issues.apache.org/jira/browse/HIVE-14866
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14866.patch
>
>
> Currently, we set up the global limit for the query in two different places 
> through two different variables: in SemanticAnalyzer and through an 
> optimization rule, GlobalLimitOptimizer (the latter is off by default).
> This leads to several problems that I have observed:
> - The global limit might not be set for very simple queries, e.g., if the 
> query does not contain a RS. GlobalLimitOptimizer would set the limit in this 
> case, but as stated above, it is off by default.
> - Some other optimizations do not check both variables, thus missing 
> opportunities.
> - The variable set by SemanticAnalyzer does not take into account the offset 
> of the query, which I think might lead to incorrect results if FetchOptimizer 
> kicks in (not verified yet). GlobalLimitOptimizer does take the offset into 
> account.
> This issue is to set hive.limit.optimize.enable to _true_ by default, i.e., 
> to use GlobalLimitOptimizer, thus getting rid of the variable set by 
> SemanticAnalyzer. There may be some gaps (cases covered by the 
> SemanticAnalyzer alternative and not covered by GlobalLimitOptimizer) that we 
> will need to work on.
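
The concern about the offset can be illustrated independently of Hive. If an upstream step is told to stop after only "limit" rows while the query also has an offset, the rows that should be returned may never be produced; the upstream bound has to be limit + offset. The sketch below is a generic illustration with a hypothetical GlobalLimit class. It does not reproduce Hive's SemanticAnalyzer, FetchOptimizer or GlobalLimitOptimizer, and the description itself marks the suspected incorrect results as not yet verified.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of why a global row bound must include the offset. */
final class GlobalLimit {

  /** Rows an upstream step must produce so that OFFSET rows can be skipped and LIMIT rows returned. */
  static long rowsToFetch(long limit, long offset) {
    return Math.addExact(limit, offset);
  }

  /** Applies OFFSET then LIMIT to rows that have already been fetched. */
  static List<Integer> apply(List<Integer> fetched, int limit, int offset) {
    int from = Math.min(offset, fetched.size());
    int to = Math.min(from + limit, fetched.size());
    return new ArrayList<>(fetched.subList(from, to));
  }
}
```

For ten rows numbered 1 to 10 with LIMIT 3 OFFSET 5, fetching only 3 rows upstream leaves nothing after the offset is applied, while fetching rowsToFetch(3, 5) = 8 rows yields rows 6, 7 and 8.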





[jira] [Commented] (HIVE-14866) Set hive.limit.optimize.enable to true by default

2016-10-07 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554301#comment-15554301
 ] 

Jesus Camacho Rodriguez commented on HIVE-14866:


Yes, still need to take a look...

> Set hive.limit.optimize.enable to true by default
> -
>
> Key: HIVE-14866
> URL: https://issues.apache.org/jira/browse/HIVE-14866
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14866.patch
>
>





[jira] [Commented] (HIVE-14474) Create datasource in Druid from Hive

2016-10-07 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554295#comment-15554295
 ] 

Jesus Camacho Rodriguez commented on HIVE-14474:


[~ashutoshc], it is up-to-date; it is just that the initial commit was 24 days 
ago, and then I just amended it... :)

> Create datasource in Druid from Hive
> 
>
> Key: HIVE-14474
> URL: https://issues.apache.org/jira/browse/HIVE-14474
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14474.01.patch, HIVE-14474.02.patch, 
> HIVE-14474.03.patch, HIVE-14474.04.patch, HIVE-14474.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In the initial implementation proposed in this issue, we will write the 
> results of the query to HDFS (or the location specified in the CTAS 
> statement), and submit a HadoopIndexing task to the Druid overlord. The task 
> will contain the path where the data was stored; it will read the data and 
> create the segments in Druid. Once this is done, the results are removed from 
> Hive.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "my_query_based_datasource")
> AS ;
> {code}
> This statement stores the results of the CTAS query in a Druid datasource 
> named 'my_query_based_datasource'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention that is used in Druid: there needs to be a column named 
> '\_\_time' in the result of the executed query, which will act as the time 
> dimension column in Druid. Currently, the time dimension column needs to be 
> a 'timestamp' type column.
> This initial implementation interacts with the Druid API as it is currently 
> exposed to the user. In a follow-up issue, we should propose an 
> implementation that integrates more tightly with Druid. In particular, we 
> would like to store segments directly in Druid from Hive, thus avoiding the 
> overhead of writing Hive results to HDFS and then launching an MR job that 
> basically reads them again to create the segments.
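
For illustration only, the flow described above (results written to a path, then an indexing task that points at that path) can be sketched as the construction of a task payload. This is a minimal sketch, not the code in the attached patches: the function name, the example path, the dimension names, the interval, and the choice of JSON as the intermediate format are all hypothetical, and the field layout is an assumption based on the general shape of Druid's Hadoop batch ingestion spec, so it should be checked against the Druid version in use.

```python
import json


def build_hadoop_index_task(datasource, input_path, dimensions, intervals):
    """Return a dict shaped like a Druid Hadoop batch indexing task.

    All argument values are hypothetical; the field layout is an assumption
    based on Druid's batch ingestion spec, not on the Hive patch itself.
    """
    return {
        "type": "index_hadoop",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "parser": {
                    "type": "hadoopyString",
                    "parseSpec": {
                        # Assumes the query results were written as JSON.
                        "format": "json",
                        # The issue requires a result column named __time.
                        "timestampSpec": {"column": "__time", "format": "auto"},
                        "dimensionsSpec": {"dimensions": list(dimensions)},
                    },
                },
                "metricsSpec": [],
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "DAY",
                    "queryGranularity": "NONE",
                    "intervals": list(intervals),
                },
            },
            # The task carries the path where the query results were stored.
            "ioConfig": {
                "type": "hadoop",
                "inputSpec": {"type": "static", "paths": input_path},
            },
            "tuningConfig": {"type": "hadoop"},
        },
    }


task = build_hadoop_index_task(
    datasource="my_query_based_datasource",
    input_path="/tmp/hive_ctas_output",      # hypothetical location
    dimensions=["dim1", "dim2"],             # hypothetical columns
    intervals=["2016-10-01/2016-10-08"],     # hypothetical interval
)
payload = json.dumps(task, indent=2)
print(payload)
```

The payload is only built and printed here; submitting it to the overlord and cleaning up the intermediate results are left out because the source does not show how the patch does either.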





[jira] [Updated] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14908:
---
Attachment: HIVE-14908.01.patch

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14908.01.patch
>
>
> Antlr v4 is also available but it does not support "->" which is widely used 
> in our grammar. Antlr 3.5.2 is the latest v3 version. It will reduce the code 
> size:
> {code}
> Here is a summary of the current parser code size:
>  422345  HiveLexer.java
> 2436601  HiveParser.java
>  814184  HiveParser_FromClauseParser.java
> 2705920  HiveParser_IdentifiersParser.java
>  777665  HiveParser_SelectClauseParser.java
> After the change, it will become:
>  319589  HiveLexer.java
> 1853104  HiveParser.java
>  574156  HiveParser_FromClauseParser.java
> 1799195  HiveParser_IdentifiersParser.java
>  587305  HiveParser_SelectClauseParser.java
> {code}
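
To put the figures quoted above in perspective, the totals and the relative reduction can be worked out directly. The sketch below only does arithmetic on the numbers listed in the description; the units are not stated in the source, so none are assumed, and the variable names are illustrative. The totals come to 7156715 before and 5133349 after, a reduction of roughly 28%.

```python
# Sizes copied from the issue description (units not stated in the source).
before = {
    "HiveLexer.java": 422345,
    "HiveParser.java": 2436601,
    "HiveParser_FromClauseParser.java": 814184,
    "HiveParser_IdentifiersParser.java": 2705920,
    "HiveParser_SelectClauseParser.java": 777665,
}
after = {
    "HiveLexer.java": 319589,
    "HiveParser.java": 1853104,
    "HiveParser_FromClauseParser.java": 574156,
    "HiveParser_IdentifiersParser.java": 1799195,
    "HiveParser_SelectClauseParser.java": 587305,
}

total_before = sum(before.values())
total_after = sum(after.values())
# Overall reduction as a percentage of the original total.
reduction_pct = 100.0 * (total_before - total_after) / total_before

# Per-file reduction, same formula applied to each pair of figures.
for name in before:
    pct = 100.0 * (before[name] - after[name]) / before[name]
    print(f"{name}: {before[name]} -> {after[name]} ({pct:.1f}% smaller)")

print(f"total: {total_before} -> {total_after} ({reduction_pct:.1f}% smaller)")
```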





[jira] [Updated] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14908:
---
Status: Patch Available  (was: Open)

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14908.01.patch
>
>
> Antlr v4 is also available but it does not support "->" which is widely used 
> in our grammar. Antlr 3.5.2 is the latest v3 version. It will reduce the code 
> size:
> {code}
> Here is a summary of the current parser code size:
>  422345  HiveLexer.java
> 2436601  HiveParser.java
>  814184  HiveParser_FromClauseParser.java
> 2705920  HiveParser_IdentifiersParser.java
>  777665  HiveParser_SelectClauseParser.java
> After the change, it will become:
>  319589  HiveLexer.java
> 1853104  HiveParser.java
>  574156  HiveParser_FromClauseParser.java
> 1799195  HiveParser_IdentifiersParser.java
>  587305  HiveParser_SelectClauseParser.java
> {code}





[jira] [Commented] (HIVE-14908) Upgrade ANTLR to 3.5.2

2016-10-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15554267#comment-15554267
 ] 

Pengcheng Xiong commented on HIVE-14908:


Thanks to [~ekoifman] for the suggestion. However, it is still not enough if 
we would like to support set operators without the "distinct" keyword. We 
still need to drop the configuration that allows SQL2011 keywords to be used 
as identifiers. Cc'ing [~ashutoshc] and [~alangates].

> Upgrade ANTLR to 3.5.2
> --
>
> Key: HIVE-14908
> URL: https://issues.apache.org/jira/browse/HIVE-14908
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Antlr v4 is also available but it does not support "->" which is widely used 
> in our grammar. Antlr 3.5.2 is the latest v3 version. It will reduce the code 
> size:
> {code}
> Here is a summary of the current parser code size:
>  422345  HiveLexer.java
> 2436601  HiveParser.java
>  814184  HiveParser_FromClauseParser.java
> 2705920  HiveParser_IdentifiersParser.java
>  777665  HiveParser_SelectClauseParser.java
> After the change, it will become:
>  319589  HiveLexer.java
> 1853104  HiveParser.java
>  574156  HiveParser_FromClauseParser.java
> 1799195  HiveParser_IdentifiersParser.java
>  587305  HiveParser_SelectClauseParser.java
> {code}


