[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-04-29 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15265038#comment-15265038
 ] 

Lefty Leverenz commented on HIVE-12963:
---

Doc note:  This adds *hive.groupby.limit.extrastep* to HiveConf.java, so it 
needs to be documented in the wiki for release 2.1.0.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

> LIMIT statement with SORT BY creates additional MR job with hardcoded only 
> one reducer
> --
>
> Key: HIVE-12963
> URL: https://issues.apache.org/jira/browse/HIVE-12963
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.0.0, 1.2.1, 0.13
> Reporter: Alina Abramova
> Assignee: Alina Abramova
> Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12963.1.patch, HIVE-12963.2.patch, 
> HIVE-12963.3.patch, HIVE-12963.4.patch, HIVE-12963.6.patch
>
>
> I execute the query:
> hive> select age from test1 sort by age.age limit 10;
> Total jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks not specified. Estimated from input data size: 1
> Launching Job 2 out of 2
> Number of reduce tasks determined at compile time: 1
> When I have a large number of rows, the last stage of the job takes a 
> long time. I think we could let the user choose the number of reducers for 
> the last job, or skip the extra MR job.
> I observed the same behavior with this query:
> hive> create table new_test as select age from test1 group by age.age limit 10;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-04-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262677#comment-15262677
 ] 

Sergey Shelukhin commented on HIVE-12963:
-

Sorry, forgot about this... the test failed in the above QA run, and it passes 
for other JIRAs. I'll run it locally to see if it passes and commit if it does.



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-04-28 Thread Alina Abramova (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261766#comment-15261766
 ] 

Alina Abramova commented on HIVE-12963:
---

[~sershe] Sorry, is this failed test related to the fix?



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-03-21 Thread Alina Abramova (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15204073#comment-15204073
 ] 

Alina Abramova commented on HIVE-12963:
---

Hi!
What about this test? Does it not work as it should?



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-03-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15191352#comment-15191352
 ] 

Sergey Shelukhin commented on HIVE-12963:
-

The groupby1_limit failure might be related. I will try it locally, and commit 
on Monday if it works and there are no objections.



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-03-11 Thread Alina Abramova (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190963#comment-15190963
 ] 

Alina Abramova commented on HIVE-12963:
---

Does anybody have comments on this issue?



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-02-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159920#comment-15159920
 ] 

Hive QA commented on HIVE-12963:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12789230/HIVE-12963.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9803 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_limit
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7071/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7071/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7071/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12789230 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-02-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142481#comment-15142481
 ] 

Hive QA commented on HIVE-12963:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787156/HIVE-12963.4.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9758 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver-ppd_union.q-udf_var_samp.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_limit_extrastep
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_extrastep
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_limit_extrastep
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown_extrastep
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_pushdown
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6940/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6940/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6940/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787156 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143204#comment-15143204
 ] 

Sergey Shelukhin commented on HIVE-12963:
-

[~ashutoshc] can you comment? I am not very familiar with this code. Do we have 
a good test for this?



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-02-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15135736#comment-15135736
 ] 

Hive QA commented on HIVE-12963:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12786299/HIVE-12963.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 80 failed/errored test(s), 10052 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_SortUnionTransposeRule
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_union1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input11_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input14_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input1_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input3_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_noalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_onview
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_join_transpose
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown_negative
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonreserved_keywords_insert_into1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_offset_limit_ppd_optimizer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_predicate_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_mixed_partition_formats2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_predicate_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_script_pipe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_explode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_top_level
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_varchar_union1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_simple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_partitioned_date_time
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_varchar_simple
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_constprog_dpp
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_constprog_dpp
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ctas
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_script_pipe
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_char_simple
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_partitioned_date_time

[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-02-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127705#comment-15127705
 ] 

Hive QA commented on HIVE-12963:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12785501/HIVE-12963.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10018 tests 
executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-orc_merge5.q-vectorization_limit.q-tez_dynpart_hashjoin_1.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testAddPartitions
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.createTable
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testLockRetryLimit
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.updateSelectUpdate
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6832/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6832/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6832/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12785501 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125272#comment-15125272
 ] 

Hive QA commented on HIVE-12963:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12785214/HIVE-12963.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6812/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6812/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6812/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] 
[INFO] 
[INFO] Building Hive ORC 2.1.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-orc ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/orc/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/orc 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-orc ---
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-orc ---
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/orc/src/gen/protobuf-java 
added.
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-orc ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-orc 
---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/orc/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-orc ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-orc ---
[INFO] Compiling 60 source files to 
/data/hive-ptest/working/apache-github-source-source/orc/target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
hive-orc ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/orc/src/test/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-orc ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/orc/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/orc/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/orc/target/tmp/conf
 [copy] Copying 16 files to 
/data/hive-ptest/working/apache-github-source-source/orc/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-orc ---
[INFO] Compiling 12 source files to 
/data/hive-ptest/working/apache-github-source-source/orc/target/test-classes
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/orc/src/test/org/apache/orc/impl/TestRunLengthIntegerReader.java:
 Some input files use or override a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/orc/src/test/org/apache/orc/impl/TestRunLengthIntegerReader.java:
 Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-orc ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-orc ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/orc/target/hive-orc-2.1.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
hive-orc ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-orc ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/orc/target/hive-orc-2.1.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-orc/2.1.0-SNAPSHOT/hive-orc-2.1.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/orc/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/hive-orc/2.1.0-SNAPSHOT/hive-orc-2.1.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Common 2.1.0-SNAPSHOT
[INFO] 
[INFO] 

[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-01-30 Thread Alina Abramova (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124823#comment-15124823
 ] 

Alina Abramova commented on HIVE-12963:
---

But I see that if the line with genReduceSinkPlan in the method 
genLimitMapRedPlan is commented out, the final result set is still sorted. 
Doesn't that mean we could skip creating the extra job and do the sorting in 
the same MR job?



[jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer

2016-01-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124000#comment-15124000
 ] 

Sergey Shelukhin commented on HIVE-12963:
-

I believe it's caused by the fact that Hive doesn't perform the sort itself and 
relies on MR to sort the data, which means that any job with order by has to 
have one reducer at some point so that all the data is sorted together. On 
non-MR engines like Tez it's less of a problem.
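
To illustrate the point above, here is a minimal Python sketch (not Hive code) of a two-stage sort with a limit. The function names `local_sort_limit` and `final_limit` are hypothetical, and round-robin partitioning is an assumption standing in for whatever partitioner the real job uses. Each first-stage reducer sorts only its own share of the rows, so a single final reducer is needed to merge the partial results into one globally sorted, limited output.

```python
import heapq
import random


def local_sort_limit(rows, num_reducers, limit):
    """Stage 1: spread rows across reducers; each sorts its share and keeps `limit` rows."""
    partitions = [[] for _ in range(num_reducers)]
    for i, row in enumerate(rows):
        # Round-robin assignment is an assumption made for this sketch.
        partitions[i % num_reducers].append(row)
    return [sorted(part)[:limit] for part in partitions]


def final_limit(partials, limit):
    """Stage 2: one reducer merges the sorted partial results and applies the limit again."""
    return list(heapq.merge(*partials))[:limit]


# 200 distinct values in a fixed pseudo-random order.
rows = random.Random(0).sample(range(1000), 200)

# Four reducers each produce a locally sorted top 10.
partials = local_sort_limit(rows, num_reducers=4, limit=10)

# The single-reducer stage turns them into the global top 10.
top10 = final_limit(partials, limit=10)
```

With `num_reducers=1`, the first stage alone already yields the final answer, which may be why the result still looked sorted when the first job in the description ran with a single reducer.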
