[jira] [Commented] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292078#comment-15292078
 ] 

Jesus Camacho Rodriguez commented on HIVE-13750:


Fails are unrelated, except 
{{org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver_udf_row_sequence}}
 for which I regenerated the .q file.

Pushed to master, thanks for reviewing [~ashutoshc]!

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.1.0
>
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291912#comment-15291912
 ] 

Hive QA commented on HIVE-13750:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12804916/HIVE-13750.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 88 failed/errored test(s), 10026 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_join1.q-schema_evol_text_vec_mapwork_part_all_complex.q-vector_complex_join.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver_udf_row_sequence
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_3
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_3
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_4
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_5
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llapdecider
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dml
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_tests
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_joins_explain
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_smb_main
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_merge1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_ppd_basic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_smb_cache
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_transform_ppr2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join0
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorWithinDagPriority
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks

[jira] [Commented] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291490#comment-15291490
 ] 

Ashutosh Chauhan commented on HIVE-13750:
-

+1

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290128#comment-15290128
 ] 

Ashutosh Chauhan commented on HIVE-13750:
-

Compiler side changes look good. I left some comments on RB. 
But I wonder if this breaks any assumption for FS operator about order in which 
it expects rows to arrive to be written out. Since earlier all rows for a 
corresponding partition in a Reducer needs to come sorted in a single batch, 
but now they may come sorted but in multiple batches. [~prasanth_j] Can you 
also please take a look at patch and comment?

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-18 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289902#comment-15289902
 ] 

Jesus Camacho Rodriguez commented on HIVE-13750:


Regenerated two q files in new patch:
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
{noformat}

[~ashutoshc], could you take a look? thanks!

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289795#comment-15289795
 ] 

Hive QA commented on HIVE-13750:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12804385/HIVE-13750.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 50 failed/errored test(s), 9228 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - 
did not produce a TEST-*.xml file
TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_coalesce.q-cbo_windowing.q-tez_join.q-and-12-more - 
did not produce a TEST-*.xml file
TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-bucketmapjoin10.q-join_rc.q-skewjoinopt13.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-groupby2_noskew_multi_distinct.q-vectorization_10.q-list_bucket_dml_2.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join9.q-join_casesensitive.q-filter_join_breaktask.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_multi_insert
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ptf_streaming
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_runtime_skewjoin_mapjoin_spark
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf
org.apache.hadoop.hive.metastore.TestMetaStoreEndFunctionListener.testEndFunctionListener
org.apache.hadoop.hive.metastore.TestMetaStoreEventListenerOnlyOnCommit.testEventStatus
org.apache.hadoop.hive.metastore.TestMetaStoreInitListener.testMetaStoreInitListener
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAddPartitionWithValidPartVal
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithCommas
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithUnicode
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithValidCharacters
org.apache.hadoop.hive.ql.security.TestExtendedAcls.org.apache.hadoop.hive.ql.security.TestExtendedAcls
org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener.org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure

[jira] [Commented] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285853#comment-15285853
 ] 

Hive QA commented on HIVE-13750:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12804193/HIVE-13750.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/302/testReport
Console output: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/302/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-302/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-302/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at e738914 Remove unintended import that caused build failure for 
JDK 8 in commit 4533d21b0be487e1f11fcc95578a2ba103e72a64 HIVE-13682: 
EOFException with fast hashtable (Matt McCline, reviewed by Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at e738914 Remove unintended import that caused build failure for 
JDK 8 in commit 4533d21b0be487e1f11fcc95578a2ba103e72a64 HIVE-13682: 
EOFException with fast hashtable (Matt McCline, reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12804193 - PreCommit-HIVE-MASTER-Build

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not 

[jira] [Commented] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283737#comment-15283737
 ] 

Hive QA commented on HIVE-13750:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12803733/HIVE-13750.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/274/testReport
Console output: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/274/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-274/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-274/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at fbeee62 HIVE-13686 : TestRecordReaderImpl is deleting target/tmp 
causing all the tests after it to fail (Rajat Khandelwal via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at fbeee62 HIVE-13686 : TestRecordReaderImpl is deleting target/tmp 
causing all the tests after it to fail (Rajat Khandelwal via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12803733 - PreCommit-HIVE-MASTER-Build

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)