[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-18 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330994#comment-16330994
 ] 

Xuefu Zhang commented on HIVE-17257:


Thanks for the update, [~csun]. I also verified with the patch and it fixed the 
problem for both MR and Spark. Will commit the patch shortly.

> Hive should merge empty files
> -----------------------------
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
> Issue Type: Bug
> Reporter: Chao Sun
> Assignee: Chao Sun
> Priority: Major
> Attachments: HIVE-17257.0.patch, HIVE-17257.1.patch, 
> HIVE-17257.2.patch, HIVE-17257.3.patch
>
>
> Currently, if the file-merging option is turned on and the dest dir contains a 
> large number of empty files, Hive will not trigger the merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty 
> files into one.
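
Editorial sketch (not taken from the attached patches): the quoted method returns -1 as soon as the total size is 0, so a directory holding only empty files never reaches the average-size check. The standalone Java below reproduces that arithmetic with plain parameters instead of Hive's FileSystem, Path and AverageSize types. The class name MergeSizeSketch, both method names, and the 16,000,000-byte threshold are assumptions made for illustration. The "relaxed" variant, which rejects only a negative total size, is one possible adjustment and may differ from what was actually committed.

```java
public class MergeSizeSketch {

  /** Mirrors the check quoted in the issue: a total size of 0 returns -1 (no merge). */
  static long currentMergeSize(long totalSize, int numFiles, long avgSize) {
    if (totalSize <= 0) {
      return -1;
    }
    if (numFiles <= 1) {
      return -1;
    }
    if (totalSize / numFiles < avgSize) {
      return totalSize;
    }
    return -1;
  }

  /** Hypothetical variant: only a negative total size is rejected up front. */
  static long relaxedMergeSize(long totalSize, int numFiles, long avgSize) {
    if (totalSize < 0) {
      return -1;
    }
    if (numFiles <= 1) {
      return -1;
    }
    if (totalSize / numFiles < avgSize) {
      return totalSize;
    }
    return -1;
  }

  public static void main(String[] args) {
    long avgSize = 16_000_000L; // assumed small-file threshold, for illustration only
    // 100 empty files: total size 0, average size 0.
    System.out.println(currentMergeSize(0L, 100, avgSize)); // prints -1
    System.out.println(relaxedMergeSize(0L, 100, avgSize)); // prints 0
  }
}
```

In this sketch the relaxed variant returns 0 for the all-empty directory. Whether a return value of 0 actually triggers a merge depends on how the caller interprets it, which the quoted snippet does not show.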



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-18 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330983#comment-16330983
 ] 

Chao Sun commented on HIVE-17257:
-

In the latest test run, most test failures are not new except the following 3:
{code:java}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_mapjoin_15]
{code}
Tested locally, and I couldn't reproduce the failures - the output is the same 
with or without my patch (and llap_smb generates a different q.out file 
even without the patch).



[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330454#comment-16330454
 ] 

Hive QA commented on HIVE-17257:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12906544/HIVE-17257.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11613 tests executed
*Failed tests:*
{noformat}
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=149)

[intersect_all.q,unionDistinct_1.q,orc_ppd_schema_evol_3a.q,table_nonprintable.q,tez_union_dynamic_partition.q,tez_union_dynamic_partition_2.q,temp_table_external.q,global_limit.q,llap_udf.q,schemeAuthority.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,parallel_colstats.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_mapjoin_15] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=160)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=121)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=254)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=232)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=232)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=232)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8676/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8676/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8676/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12906544 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330408#comment-16330408
 ] 

Hive QA commented on HIVE-17257:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m  5s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 80e6f7b |
| Default Java | 1.8.0_111 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8676/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330393#comment-16330393
 ] 

Hive QA commented on HIVE-17257:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12906544/HIVE-17257.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11628 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=160)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=121)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=254)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=232)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=232)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=232)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8675/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8675/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8675/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12906544 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330349#comment-16330349
 ] 

Hive QA commented on HIVE-17257:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m  8s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 80e6f7b |
| Default Java | 1.8.0_111 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8675/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-17 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329074#comment-16329074
 ] 

Chao Sun commented on HIVE-17257:
-

{quote}+1 for the patch. However, I'm not sure if those test failures are 
related.
{quote}
The last test result has been removed so I'm not sure. I'm waiting for the 
latest patch to be triggered by Jenkins.



[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329018#comment-16329018
 ] 

Xuefu Zhang commented on HIVE-17257:


+1 for the patch. However, I'm not sure if those test failures are related.



[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323901#comment-16323901
 ] 

Hive QA commented on HIVE-17257:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12905762/HIVE-17257.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11567 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=159)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1] (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=247)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8590/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8590/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8590/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12905762 - PreCommit-HIVE-Build

> Hive should merge empty files
> -----------------------------
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
> Issue Type: Bug
> Reporter: Chao Sun
> Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch, HIVE-17257.1.patch, 
> HIVE-17257.2.patch
>
>
> Currently, if the file-merging option is turned on and the dest dir contains a 
> large number of empty files, Hive will not trigger the merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty 
> files into one.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17257) Hive should merge empty files

2018-01-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323859#comment-16323859
 ] 

Hive QA commented on HIVE-17257:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 54s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / fd4e222 |
| Default Java | 1.8.0_111 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8590/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119381#comment-16119381
 ] 

Hive QA commented on HIVE-17257:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880932/HIVE-17257.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10999 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move] (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only] (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_4] (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=75)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6313/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6313/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6313/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880932 - PreCommit-HIVE-Build

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch, HIVE-17257.1.patch
>
>
> Currently, if the file-merge option is turned on and the destination directory
> contains a large number of empty files, Hive will not trigger a merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty
> files into one.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2017-08-07 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117565#comment-16117565
 ] 

Chao Sun commented on HIVE-17257:
-

[~kellyzly]: the empty files may be generated if the result set is empty and
you have multiple mappers/reducers with a file sink. Example:
{code}
set hive.execution.engine=spark;
set hive.auto.convert.join=false;
set mapreduce.job.reduces=1000;
create table dummy (a string);
insert overwrite directory '/tmp/test' select src.key from src join dummy on 
src.key = dummy.a;
{code}
The above will generate 1000 empty files in /tmp/test.

[~xuefuz]: I need to revise the patch. There's an issue where HoS won't launch
a task for the final merge job since the input data is empty.
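
To make the effect of the quoted getMergeSize checks on a directory of empty files concrete, here is a minimal Java sketch. It takes the total size and file count as plain arguments instead of reading them from a FileSystem. The class name MergeSizeSketch, both method names, and the threshold value are hypothetical. The "relaxed" variant is only one possible adjustment for illustration and is not necessarily what the attached patches do.

```java
public class MergeSizeSketch {
    // Mirrors the checks quoted in the issue description, with the totals
    // passed in directly (assumption: no Hadoop FileSystem involved here).
    static long currentMergeSize(long totalSize, int numFiles, long avgSize) {
        if (totalSize <= 0) {
            return -1; // a directory holding only empty files exits here
        }
        if (numFiles <= 1) {
            return -1;
        }
        if (totalSize / numFiles < avgSize) {
            return totalSize;
        }
        return -1;
    }

    // Hypothetical variant (an assumption, not the attached patch): only a
    // negative total means "do not merge", so many empty files still qualify.
    static long relaxedMergeSize(long totalSize, int numFiles, long avgSize) {
        if (totalSize < 0) {
            return -1;
        }
        if (numFiles <= 1) {
            return -1;
        }
        if (totalSize / numFiles < avgSize) {
            return totalSize;
        }
        return -1;
    }

    public static void main(String[] args) {
        long avgSize = 16_000_000L; // illustrative threshold only
        // 1000 empty files: total size 0, as in the /tmp/test example.
        System.out.println(currentMergeSize(0L, 1000, avgSize)); // prints -1
        System.out.println(relaxedMergeSize(0L, 1000, avgSize)); // prints 0
    }
}
```

With the relaxed check the returned merge size is 0, so a caller would have to treat 0 as a valid size rather than as "nothing to merge". As noted above, that alone is not enough on HoS, where the final merge job gets no task when the input data is empty.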

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch
>
>
> Currently, if the file-merge option is turned on and the destination directory
> contains a large number of empty files, Hive will not trigger a merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty
> files into one.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2017-08-07 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116933#comment-16116933
 ] 

Xuefu Zhang commented on HIVE-17257:


Patch looks simple and good to me. Is it possible to have a test case on this?
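
As a rough idea of how a test could set up the scenario, the following sketch creates a directory of zero-byte files and computes the two inputs the merge check depends on (total size and file count). It uses plain java.nio on the local filesystem as a stand-in for Hadoop's FileSystem and is not a Hive q-file test. The class name EmptyFilesFixture, its method names, and the file naming pattern are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class EmptyFilesFixture {
    // Creates `count` zero-byte files in a fresh temporary directory.
    static Path createEmptyFiles(int count) {
        try {
            Path dir = Files.createTempDirectory("hive17257-sketch");
            for (int i = 0; i < count; i++) {
                Files.createFile(dir.resolve(String.format("%06d_0", i)));
            }
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns {totalSize, numFiles} for regular files directly under dir.
    static long[] totalSizeAndCount(Path dir) {
        long total = 0;
        long files = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    total += Files.size(p);
                    files++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] {total, files};
    }

    public static void main(String[] args) {
        Path dir = createEmptyFiles(1000);
        long[] stats = totalSizeAndCount(dir);
        // Prints: totalSize=0, numFiles=1000
        System.out.println("totalSize=" + stats[0] + ", numFiles=" + stats[1]);
    }
}
```

The sketch leaves the temporary directory in place. A real test would clean it up and would assert on whether a merge task is planned rather than on these raw numbers.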

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch
>
>
> Currently, if the file-merge option is turned on and the destination directory
> contains a large number of empty files, Hive will not trigger a merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty
> files into one.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2017-08-07 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116172#comment-16116172
 ] 

liyunzhang_intel commented on HIVE-17257:
-

[~csun]: I have run into empty files before when using Parquet files. The
reason is that Hive reads the Parquet metadata to construct a
ParquetInputSplit in ParquetRecordReaderBase#getSplit.
[ParquetRecordReaderBase#getSplit|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L106]
 sometimes returns NULL, which results in an empty file.

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch
>
>
> Currently, if the file-merge option is turned on and the destination directory
> contains a large number of empty files, Hive will not trigger a merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty
> files into one.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2017-08-06 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116144#comment-16116144
 ] 

liyunzhang_intel commented on HIVE-17257:
-

Why are there empty files? Is the raw data empty, or are the empty files
generated after loading?

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch
>
>
> Currently, if the file-merge option is turned on and the destination directory
> contains a large number of empty files, Hive will not trigger a merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty
> files into one.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2017-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116004#comment-16116004
 ] 

Hive QA commented on HIVE-17257:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880576/HIVE-17257.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10990 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=239)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=239)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] (batchId=56)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6274/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6274/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6274/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880576 - PreCommit-HIVE-Build

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch
>
>
> Currently, if the file-merge option is turned on and the destination directory
> contains a large number of empty files, Hive will not trigger a merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty
> files into one.





[jira] [Commented] (HIVE-17257) Hive should merge empty files

2017-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115932#comment-16115932
 ] 

Hive QA commented on HIVE-17257:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880576/HIVE-17257.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10989 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=239)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat6] (batchId=7)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6272/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6272/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6272/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880576 - PreCommit-HIVE-Build

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch
>
>
> Currently, if the file-merge option is turned on and the destination directory
> contains a large number of empty files, Hive will not trigger a merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
>     AverageSize averageSize = getAverageSize(inpFs, dirPath);
>     if (averageSize.getTotalSize() <= 0) {
>       return -1;
>     }
>     if (averageSize.getNumFiles() <= 1) {
>       return -1;
>     }
>     if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>       return averageSize.getTotalSize();
>     }
>     return -1;
>   }
> {code}
> This logic doesn't seem right, as it seems better to combine these empty
> files into one.


