[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-09 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319249#comment-16319249
 ] 

Prasanth Jayachandran commented on HIVE-18269:
--

lgtm, +1

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.02.patch, 
> HIVE-18269.03.patch, HIVE-18269.1.patch, HIVE-18269.bad.patch, Screen Shot 
> 2017-12-13 at 1.15.16 AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319218#comment-16319218
 ] 

Sergey Shelukhin commented on HIVE-18269:
-

Looks like llap_acid test is failing on most recent runs. [~prasanth_j] 
[~jdere] can you review the update?

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.02.patch, 
> HIVE-18269.03.patch, HIVE-18269.1.patch, HIVE-18269.bad.patch, Screen Shot 
> 2017-12-13 at 1.15.16 AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317820#comment-16317820
 ] 

Hive QA commented on HIVE-18269:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12905175/HIVE-18269.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 11549 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8514/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8514/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8514/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12905175 - PreCommit-HIVE-Build

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.02.patch, 
> HIVE-18269.03.patch, HIVE-18269.1.patch, HIVE-18269.bad.patch, Screen Shot 
> 2017-12-13 at 1.15.16 AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317742#comment-16317742
 ] 

Hive QA commented on HIVE-18269:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
32s{color} | {color:red} ql: The patch generated 6 new + 118 unchanged - 8 
fixed = 124 total (was 126) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} llap-server: The patch generated 4 new + 250 unchanged 
- 4 fixed = 254 total (was 254) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 8412748 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8514/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8514/yetus/diff-checkstyle-llap-server.txt
 |
| modules | C: common ql llap-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8514/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.02.patch, 
> HIVE-18269.03.patch, HIVE-18269.1.patch, HIVE-18269.bad.patch, Screen Shot 
> 2017-12-13 at 1.15.16 AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--

[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314554#comment-16314554
 ] 

Hive QA commented on HIVE-18269:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904863/HIVE-18269.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 11549 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[materialized_view_authorization_create_no_grant]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_publisher_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementParallel
 (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8481/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8481/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8481/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904863 - PreCommit-HIVE-Build

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.02.patch, 
> HIVE-18269.1.patch, HIVE-18269.bad.patch, Screen Shot 2017-12-13 at 1.15.16 
> AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16314543#comment-16314543
 ] 

Hive QA commented on HIVE-18269:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
33s{color} | {color:red} ql: The patch generated 6 new + 118 unchanged - 8 
fixed = 124 total (was 126) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} llap-server: The patch generated 4 new + 250 unchanged 
- 4 fixed = 254 total (was 254) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / a6b88d9 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8481/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8481/yetus/diff-checkstyle-llap-server.txt
 |
| modules | C: common ql llap-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8481/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.02.patch, 
> HIVE-18269.1.patch, HIVE-18269.bad.patch, Screen Shot 2017-12-13 at 1.15.16 
> AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent 

[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313672#comment-16313672
 ] 

Hive QA commented on HIVE-18269:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904313/HIVE-18269.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 282 failed/errored test(s), 11547 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid_fast] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_reader] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_text] (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_decimal] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_nullscan] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] 
(batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge10] 
(batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge4] 
(batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge_diff_fs]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_types_vectorization]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_globallimit]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_vectorization_original]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_merge_2_orc]
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_merge_orc]
 (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_1] 
(batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_non_partitioned]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_partitioned]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_tmp_table]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_no_match]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_non_partitioned]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_partitioned]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_whole_partition]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_partition_pruning]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_sw]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2]
 

[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313577#comment-16313577
 ] 

Hive QA commented on HIVE-18269:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
32s{color} | {color:red} ql: The patch generated 7 new + 119 unchanged - 7 
fixed = 126 total (was 126) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} llap-server: The patch generated 5 new + 250 unchanged 
- 4 fixed = 255 total (was 254) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / b0e653a |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8463/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8463/yetus/diff-checkstyle-llap-server.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8463/yetus/whitespace-eol.txt 
|
| modules | C: common ql llap-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8463/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.1.patch, 
> HIVE-18269.bad.patch, Screen Shot 2017-12-13 at 1.15.16 AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that 

[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-04 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312290#comment-16312290
 ] 

Jason Dere commented on HIVE-18269:
---

Looks ok I think .. can you submit the patch so we can see precommit test 
results?

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.1.patch, 
> HIVE-18269.bad.patch, Screen Shot 2017-12-13 at 1.15.16 AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312155#comment-16312155
 ] 

Sergey Shelukhin commented on HIVE-18269:
-

[~prasanth_j] [~jdere] [~gopalv] can someone please review this patch? thnx

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.1.patch, 
> HIVE-18269.bad.patch, Screen Shot 2017-12-13 at 1.15.16 AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2017-12-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302114#comment-16302114
 ] 

Sergey Shelukhin commented on HIVE-18269:
-


Update: we've seen large queue also producing GC problems without getting close 
to OOM with many decimal columns. The temp patch to see if the limit works 
performed well with queue size of 3-10, which I suspect will be insufficient 
for a cloud FS like S3 if IO thread is blocked - if pipeline can process 10 
VRBs rapidly, it will have to wait for a while until the unblocked S3 reader 
produces more data and blocks, then process it quickly again and block, etc. 
This might require some testing. There are 3 possible approaches that I see:
1) Don't block physical reads from FS, but only block the decoding/etc. that 
produces java objects. That may be a complex threading change and/or would 
require separate throttle for the buffers (that may be more forgiving) lest 
they cause OOM.
2) Determine queue size dynamically based on speed of processing - e.g. start 
high, then see how fast next calls are coming and how fast IO is putting stuff 
in queue, and adjust down if IO is much faster; or start low  (~10) and expand 
aggressively every time the next() waits (meaning IO is not keeping up). This 
is rather complex although may be the best long term solution.
3) Determine queue size per fragment (vertex, really) based on schema. 
Configure a high default limit (e.g. 10k to prevent OOMs), and the lower bound 
of the limit (e.g. 10). Then, at init time start with the limit as the high 
boundary, and reduce it based on the number and type of VRB vectors (reduce 
proportionally assuming the maximum limit is for a single INT vector, and it 
can never go below the minimum). This is hand wavy but easy to implement and 
reason about, and as a fail safe one can always set min=max to fix the queue 
size.

I think we can start with 3 and consider 2 later. 1 is only good if we decide 
to separate FS and decoding threads that was a plan long time ago that was not 
implemented.

[~gopalv] [~prasanth_j] [~hagleitn] any input?

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18269.1.patch, Screen Shot 2017-12-13 at 1.15.16 
> AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2017-12-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290247#comment-16290247
 ] 

Hive QA commented on HIVE-18269:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12901931/HIVE-18269.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 39 failed/errored test(s), 10764 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=39)

[unionall_join_nullconstant.q,tez_join.q,cbo_rp_windowing.q,orc_merge11.q,udf_float.q,udf_sentences.q,bucketmapjoin13.q,udf_split.q,load_dyn_part9.q,auto_join16.q,vector_reduce2.q,tez_joins_explain.q,udf_replace.q,create_or_replace_view.q,alter_partition_clusterby_sortby.q,exchange_partition2.q,vector_aggregate_9.q,udf_greaterthan.q,exim_15_external_part.q,delete_orig_table.q,index_auto_unused.q,groupby_position.q,llap_acid_fast.q,acid_subquery.q,nullformatCTAS.q,decimal_join2.q,join21.q,cbo_rp_groupby3_noskew_multi_distinct.q,transform1.q,delete_where_partitioned.q]
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=58)

[load_dyn_part2.q,llap_uncompressed.q,smb_mapjoin_7.q,mapjoin46.q,temp_table_external.q,ctas_colname.q,index_auto_empty.q,index_in_db.q,subquery_in_having.q,vectorized_string_funcs.q,vectorization_1.q,stats_ppr_all.q,join0.q,timestamptz_1.q,decimal_6.q,udf_sign.q,alter_file_format.q,vector_udf1.q,select_unquote_not.q,join14_hadoop20.q,constprog_when_case.q,druid_timeseries.q,avro_change_schema.q,create_udaf.q,array_size_estimation.q,merge3.q,lateral_view_onview.q,groupby4_map_skew.q,ppd_constant_expr.q,drop_table_with_stats.q]
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=74)

[auto_join24.q,parquet_schema_evolution.q,udf_to_string.q,vectorized_distinct_gby.q,mapreduce8.q,constantfolding.q,groupby8.q,serde_opencsv.q,druidmini_test1.q,tez_vector_dynpart_hashjoin_1.q,groupby_multi_insert_common_distinct.q,join6.q,expr_cached.q,script_pipe.q,udf_bitwise_or.q,multiMapJoin2.q,filter_join_breaktask.q,udf_regexp.q,udf_xpath_long.q,ppd_multi_insert.q,alter_merge_2_orc.q,join_thrift.q,pointlookup4.q,union4.q,load_fs2.q,llap_text.q,input42.q,udf_mask.q,dynamic_semijoin_reduction_3.q,stats_aggregator_error_1.q]
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=8)

[dp_counter_mm.q,llap_reader.q,columnstats_tbllvl.q,insert_into_with_schema.q,groupby_map_ppr.q,input_part1.q,convert_enum_to_string.q,union14.q,subquery_unqual_corr_expr.q,annotate_stats_filter.q,sort_merge_join_desc_8.q,udf_format_number.q,dynamic_semijoin_reduction_sw.q,alter_change_db_location.q,udf_minute.q,groupby_sort_test_1.q,authorization_update.q,authorization_cli_createtab_noauthzapi.q,tez_insert_overwrite_local_directory_1.q,testSetQueryString.q,parquet_ppd_partition.q,nested_complex.q,alter_table_serde.q,drop_view.q,exim_09_part_spec_nonoverlap.q,delimiter.q,udaf_collect_set.q,authorization_view_4.q,groupby_sort_skew_1_23.q,skewjoinopt12.q]
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=80)

[groupby6_map.q,tez_union_group_by.q,llap_acid.q,groupby_nullvalues.q,join15.q,msck_repair_0.q,msck_repair_1.q,udf_round_2.q,setop_no_distinct.q,authorization_reset.q,vectorization_decimal_date.q,windowing_columnPruning.q,create_nested_type.q,stats13.q,stats_publisher_error_1.q,groupby_sort_3.q,partInit.q,auto_join13.q,partition_decode_name.q,date_1.q,join_acid_non_acid.q,udf9.q,vector_groupby_grouping_window.q,auto_join21.q,join_view.q,input_lazyserde2.q,encryption_insert_partition_dynamic.q,crtseltbl_serdeprops.q,fold_eq_with_case_when.q,dynamic_partition_skip_default.q]
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=147)

[mapreduce2.q,orc_llap_counters1.q,bucket6.q,insert_into1.q,empty_dir_in_table.q,orc_merge1.q,parquet_types_vectorization.q,orc_merge_diff_fs.q,llap_stats.q,llapdecider.q,load_hdfs_file_with_space_in_the_name.q,llap_nullscan.q,orc_ppd_basic.q,rcfile_merge4.q,orc_merge3.q]
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=148)
[acid_bucket_pruning.q]
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=149)

[intersect_all.q,unionDistinct_1.q,orc_ppd_schema_evol_3a.q,table_nonprintable.q,tez_union_dynamic_partition.q,tez_union_dynamic_partition_2.q,temp_table_external.q,global_limit.q,llap_udf.q,schemeAuthority.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,parallel_colstats.q]
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=150)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2017-12-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290173#comment-16290173
 ] 

Hive QA commented on HIVE-18269:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
17s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
18s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 18s{color} 
| {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
18s{color} | {color:red} common: The patch generated 2 new + 931 unchanged - 0 
fixed = 933 total (was 931) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} llap-server: The patch generated 0 new + 31 
unchanged - 1 fixed = 31 total (was 32) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 8ab523b |
| Default Java | 1.8.0_111 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8229/yetus/patch-mvninstall-llap-server.txt
 |
| compile | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8229/yetus/patch-compile-llap-server.txt
 |
| javac | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8229/yetus/patch-compile-llap-server.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8229/yetus/diff-checkstyle-common.txt
 |
| modules | C: common llap-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8229/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18269.1.patch, Screen Shot 2017-12-13 at 1.15.16 
> AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> 

[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2017-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289886#comment-16289886
 ] 

Sergey Shelukhin commented on HIVE-18269:
-

Is this actually going to work? Seems like the sync block inside which take and 
put are happening will block each other, so if one blocks the other cannot 
enter and unblock the first.

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18269.1.patch, Screen Shot 2017-12-13 at 1.15.16 
> AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2017-12-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289779#comment-16289779
 ] 

Prasanth Jayachandran commented on HIVE-18269:
--

[~sershe] can  you please take a look?

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18269.1.patch, Screen Shot 2017-12-13 at 1.15.16 
> AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may grow 
> indefinitely when Llap IO is faster than processing pipeline. Since we don't 
> have backpressure to slow down the IO, this can lead to indefinite growth of 
> pending data leading to severe GC pressure and eventually lead to OOM.
> This specific instance of LLAP was running on HDFS on top of EBS volume 
> backed by SSD. The query that triggered this is issue was ANALYZE STATISTICS 
> .. FOR COLUMNS which also gather bitvectors. Fast IO and Slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2017-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289635#comment-16289635
 ] 

Sergey Shelukhin commented on HIVE-18269:
-

Interesting... CV memory use may be hard to estimate. Maybe the backpressure 
can be based on list length, start at relatively low value, and then if 
backpressure was triggered before and the list has emptied (causing operators 
to wait) the value would go up?

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: Screen Shot 2017-12-13 at 1.15.16 AM.png
>
>
> pendingData linked list in Llap IO elevator (LlapRecordReader.java) may have 
> grow indefinitely when Llap IO is faster than processing pipeline. Since we 
> don't have backpressure to slow down the IO, this can lead to indefinite 
> growth of pending data leading to severe GC pressure and eventually lead to 
> OOM.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)