[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2018-12-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718413#comment-16718413
 ] 

Hive QA commented on HIVE-17935:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908851/HIVE-17935.8.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15270/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15270/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15270/

Messages:
{noformat}
 This message was trimmed, see log for full details 
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-15270/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-12-12 02:54:29.157
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at b650083 HIVE-16100: Dynamic Sorted Partition optimizer loses 
sibling operators (Vineet Garg, Gopal V reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at b650083 HIVE-16100: Dynamic Sorted Partition optimizer loses 
sibling operators (Vineet Garg, Gopal V reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-12-12 02:54:29.717
+ rm -rf ../yetus_PreCommit-HIVE-Build-15270
+ mkdir ../yetus_PreCommit-HIVE-Build-15270
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-15270
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-15270/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/test/results/clientpositive/spark/load_dyn_part1.q.out:61
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/load_dyn_part1.q.out' with conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/load_dyn_part10.q.out:49
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/load_dyn_part10.q.out' with conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/load_dyn_part14.q.out:79
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/load_dyn_part14.q.out' with conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/load_dyn_part3.q.out:47
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/load_dyn_part3.q.out' with conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/load_dyn_part4.q.out:57
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/load_dyn_part4.q.out' with conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/load_dyn_part5.q.out:34
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/load_dyn_part5.q.out' with conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/load_dyn_part8.q.out:53
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/load_dyn_part8.q.out' with conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/load_dyn_part9.q.out:49
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/load_dyn_part9.q.out' with conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/orc_merge2.q.out:37
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/orc_merge2.q.out' 
with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/stats2.q.out:19
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/stats2.q.out' with 
conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/union14.q.out:122
Falling back to three-way merge...

[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2018-12-11 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718295#comment-16718295
 ] 

Vineet Garg commented on HIVE-17935:


[~asherman] Since this optimization is now turned on by default (HIVE-20703 & 
HIVE-20915), I don't believe we need this JIRA anymore. Is it OK to close it?

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch, 
> HIVE-17935.3.patch, HIVE-17935.4.patch, HIVE-17935.5.patch, 
> HIVE-17935.6.patch, HIVE-17935.7.patch, HIVE-17935.8.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in the case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, setting 
> hive.optimize.sort.dynamic.partition=true has been used to solve problems where 
> dynamic partitioning produces (1) too many small files on HDFS, which is bad for 
> the cluster and can increase overhead for future Hive queries over those 
> partitions, and (2) OOM issues in the map tasks because they try to simultaneously 
> write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.
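
For readers unfamiliar with the feature, here is a minimal sketch of the kind of 
dynamic-partition insert this setting targets. The table and column names are 
hypothetical and not taken from the patch; only the two SET options are the real 
Hive configuration properties discussed above.
{noformat}
-- Illustrative only: sales_by_day, staging_sales and their columns are hypothetical.
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.optimize.sort.dynamic.partition=true;

-- Dynamic-partition insert: the partition value (ds) comes from the data itself,
-- so a single query may write to many partitions at once.
CREATE TABLE sales_by_day (id BIGINT, amount DOUBLE)
PARTITIONED BY (ds STRING)
STORED AS ORC;

INSERT OVERWRITE TABLE sales_by_day PARTITION (ds)
SELECT id, amount, ds FROM staging_sales;

-- With the optimization on, rows are shuffled and sorted on ds before the file
-- writers run, so each reducer keeps only one record writer open at a time.
{noformat}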



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2018-02-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349763#comment-16349763
 ] 

Hive QA commented on HIVE-17935:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908851/HIVE-17935.8.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 92 failed/errored test(s), 12965 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_part] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_1] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_2] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_6] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_8] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_all_partitioned] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extrapolate_part_stats_partial]
 (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[implicit_cast_during_insert]
 (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_into6] 
(batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid_fast] 
(batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part10] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part14] 
(batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part1] 
(batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part3] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part4] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part8] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part9] 
(batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge3] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge4] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition3]
 (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition4]
 (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition5]
 (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_int_type_promotion] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge2] (batchId=91)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats2] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats4] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_empty_dyn_part] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_15] 
(batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_16] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_17] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_18] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_25] 
(batchId=89)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_partitioned] 
(batchId=52)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_stats] 
(batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge2] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_partitioned]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dp_counter_mm]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dp_counter_non_mm]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[extrapolate_part_stats_partial_ndv]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] 
(batchId=163)

[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2018-02-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349723#comment-16349723
 ] 

Hive QA commented on HIVE-17935:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 17s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 32b8994 |
| Default Java | 1.8.0_111 |
| modules | C: common ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8980/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch, 
> HIVE-17935.3.patch, HIVE-17935.4.patch, HIVE-17935.5.patch, 
> HIVE-17935.6.patch, HIVE-17935.7.patch, HIVE-17935.8.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in the case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, setting 
> hive.optimize.sort.dynamic.partition=true has been used to solve problems where 
> dynamic partitioning produces (1) too many small files on HDFS, which is bad for 
> the cluster and can increase overhead for future Hive queries over those 
> partitions, and (2) OOM issues in the 

[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256625#comment-16256625
 ] 

Hive QA commented on HIVE-17935:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12898098/HIVE-17935.7.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7876/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7876/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7876/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-11-17 08:13:54.847
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7876/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-11-17 08:13:54.850
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 987d130 HIVE-16756 : Vectorization: LongColModuloLongColumn 
throws java.lang.ArithmeticException: / by zero (Vihang Karajgaonkar, reviewed 
by Matt McCline)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 987d130 HIVE-16756 : Vectorization: LongColModuloLongColumn 
throws java.lang.ArithmeticException: / by zero (Vihang Karajgaonkar, reviewed 
by Matt McCline)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-11-17 08:13:59.272
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/test/results/clientpositive/llap/ppd_union_view.q.out:258
error: ql/src/test/results/clientpositive/llap/ppd_union_view.q.out: patch does 
not apply
error: patch failed: ql/src/test/results/clientpositive/llap/sysdb.q.out:2190
error: ql/src/test/results/clientpositive/llap/sysdb.q.out: patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12898098 - PreCommit-HIVE-Build

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch, 
> HIVE-17935.3.patch, HIVE-17935.4.patch, HIVE-17935.5.patch, 
> HIVE-17935.6.patch, HIVE-17935.7.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in the case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, setting 
> hive.optimize.sort.dynamic.partition=true has been used to solve problems where 
> dynamic partitioning produces (1) too many small files on HDFS, which is bad for 
> the cluster and can increase overhead for future Hive queries over those 
> partitions, and (2) OOM issues in the map tasks because they try to 
> simultaneously write 

[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16250477#comment-16250477
 ] 

Hive QA commented on HIVE-17935:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12897347/HIVE-17935.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11380 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_partitioned]
 (batchId=160)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.TestTxnCommands2.testDynamicPartitionsMerge2 
(batchId=274)
org.apache.hadoop.hive.ql.TestTxnCommands2.testMultiInsert (batchId=274)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge2
 (batchId=284)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsert
 (batchId=284)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningUpdate
 (batchId=221)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=233)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=233)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsMultiInsert
 (batchId=230)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7793/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7793/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7793/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12897347 - PreCommit-HIVE-Build

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch, 
> HIVE-17935.3.patch, HIVE-17935.4.patch, HIVE-17935.5.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in the case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, setting 
> hive.optimize.sort.dynamic.partition=true has been used to solve problems where 
> dynamic partitioning produces (1) too many small files on HDFS, which is bad for 
> the cluster and can increase overhead for future Hive queries over those 
> partitions, and (2) OOM issues in the map tasks because they try to simultaneously 
> write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16245285#comment-16245285
 ] 

Hive QA commented on HIVE-17935:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896740/HIVE-17935.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 39 failed/errored test(s), 11372 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_all_partitioned] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid_fast] 
(batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_partitioned] 
(batchId=50)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge2] 
(batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_partitioned]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dp_counter_mm]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dp_counter_non_mm]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[extrapolate_part_stats_partial_ndv]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] 
(batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[load_dyn_part1]
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[load_dyn_part3]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[load_dyn_part5]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_part_update]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_dml] 
(batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_partitioned]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time]
 (batchId=165)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge2]
 (batchId=176)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[infer_bucket_sort_dyn_part]
 (batchId=89)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.TestTxnCommands2.testDynamicPartitionsMerge2 
(batchId=274)
org.apache.hadoop.hive.ql.TestTxnCommands2.testMultiInsert (batchId=274)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge2
 (batchId=284)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsert
 (batchId=284)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningUpdate
 (batchId=221)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=230)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7727/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7727/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7727/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 39 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 

[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234703#comment-16234703
 ] 

Prasanth Jayachandran commented on HIVE-17935:
--

bq. Is it fair to say that the gains from changing the default are potentially 
large while the losses are comparatively small?
For cases where it is beneficial, this is definitely a huge gain; there are many 
gains from this optimization. The point I was trying to make is that users should 
be aware of the regression in some cases until the optimizer can make this 
decision automatically.

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in the case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, setting 
> hive.optimize.sort.dynamic.partition=true has been used to solve problems where 
> dynamic partitioning produces (1) too many small files on HDFS, which is bad for 
> the cluster and can increase overhead for future Hive queries over those 
> partitions, and (2) OOM issues in the map tasks because they try to simultaneously 
> write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234701#comment-16234701
 ] 

Prasanth Jayachandran commented on HIVE-17935:
--

bq. Do you think the possible performance regression for some jobs to be large? 
Unfortunately, it is not quantifiable. The overhead is essentially sort + shuffle + 
spin-up of new reduce tasks. If the partition column count is low and the data size 
is small, the regression factor will be completely different than with a large 
data set. 

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in the case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, setting 
> hive.optimize.sort.dynamic.partition=true has been used to solve problems where 
> dynamic partitioning produces (1) too many small files on HDFS, which is bad for 
> the cluster and can increase overhead for future Hive queries over those 
> partitions, and (2) OOM issues in the map tasks because they try to simultaneously 
> write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234652#comment-16234652
 ] 

Andrew Sherman commented on HIVE-17935:
---

Thanks [~prasanth_j] for the helpful comments. Do you think the possible 
performance regression for some jobs to be large? Is it fair to say that the 
gains from changing the default are potentially large while the losses are 
comparatively small?

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in the case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, setting 
> hive.optimize.sort.dynamic.partition=true has been used to solve problems where 
> dynamic partitioning produces (1) too many small files on HDFS, which is bad for 
> the cluster and can increase overhead for future Hive queries over those 
> partitions, and (2) OOM issues in the map tasks because they try to simultaneously 
> write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234570#comment-16234570
 ] 

Prasanth Jayachandran commented on HIVE-17935:
--

The thing to note is that this might cause a performance regression for some 
jobs. Jobs with partition column values on the order of tens will regress, since 
such a job may otherwise run as a map-only job, and this feature forces a reducer 
stage even for small jobs. In some cases reducer deduplication can bring gains, 
but where there is an extra reducer and a small partition count this will slow 
things down. The optimization is really beneficial when there are lots of 
partitions, which can cause queries to OOM or create GC pressure. In all cases it 
also results in an optimal file structure (concurrent writers for ORC can produce 
too many small stripes per file, which is suboptimal). So there are good and bad 
sides to this optimization. Ideally we want the optimizer to make a smart 
decision during planning about whether to enable it, based on column stats 
from the source table. cc/ [~ashutoshc]
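
To make the trade-off concrete, one way to see it is to compare the plans for a 
small dynamic-partition insert with the flag toggled. The sketch below is 
illustrative only; the table and column names are hypothetical, and the comments 
describe the expected plan shape rather than output from a real run.
{noformat}
-- Rough sketch; events_by_day and staging_events are hypothetical tables.
SET hive.optimize.sort.dynamic.partition=false;
EXPLAIN
INSERT OVERWRITE TABLE events_by_day PARTITION (ds)
SELECT id, payload, ds FROM staging_events;
-- For a small, non-bucketed input this can plan as a map-only job that writes a
-- handful of partitions directly.

SET hive.optimize.sort.dynamic.partition=true;
EXPLAIN
INSERT OVERWRITE TABLE events_by_day PARTITION (ds)
SELECT id, payload, ds FROM staging_events;
-- Now the plan gains a shuffle/reduce stage keyed and sorted on ds: pure overhead
-- when there are only a few partitions, but it keeps each task from holding many
-- open writers when there are thousands.
{noformat}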

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in the case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time, thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then, setting 
> hive.optimize.sort.dynamic.partition=true has been used to solve problems where 
> dynamic partitioning produces (1) too many small files on HDFS, which is bad for 
> the cluster and can increase overhead for future Hive queries over those 
> partitions, and (2) OOM issues in the map tasks because they try to simultaneously 
> write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)