[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-26 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802192#comment-16802192
 ] 

Prasanth Jayachandran commented on HIVE-21305:
--

[~gopalv] LLAP sets ROWS_EMITTED as VRB batch size, whereas tez counts VRB as 1 
record, hence the difference. 

[~rajesh.balamohan] This disables read through cache behavior for ETL queries 
reading from text table and writing elsewhere (which could be temp table as 
well). 

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch, HIVE-21305.2.patch, 
> HIVE-21305.3.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-24 Thread Rajesh Balamohan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800184#comment-16800184
 ] 

Rajesh Balamohan commented on HIVE-21305:
-

Would this skip temp table creation which may not be part of ETL as well by any 
chance?

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch, HIVE-21305.2.patch, 
> HIVE-21305.3.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-24 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800137#comment-16800137
 ] 

Gopal V commented on HIVE-21305:


{code}
diff --git a/ql/src/test/results/clientpositive/llap/tez_input_counters.q.out 
b/ql/src/test/results/clientpositive/llap/tez_input_counters.q.out
index 16a45feb8b..9aa6a211d8 100644
--- a/ql/src/test/results/clientpositive/llap/tez_input_counters.q.out
+++ b/ql/src/test/results/clientpositive/llap/tez_input_counters.q.out
@@ -1815,7 +1815,7 @@ Stage-1 HIVE COUNTERS:
CREATED_DYNAMIC_PARTITIONS: 74
CREATED_FILES: 76
DESERIALIZE_ERRORS: 0
-   RECORDS_IN_Map_1: 240
+   RECORDS_IN_Map_1: 1
{code}

Unrelated bug?

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch, HIVE-21305.2.patch, 
> HIVE-21305.3.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-24 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800135#comment-16800135
 ] 

Gopal V commented on HIVE-21305:


LGTM - +1

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch, HIVE-21305.2.patch, 
> HIVE-21305.3.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799966#comment-16799966
 ] 

Hive QA commented on HIVE-21305:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12963542/HIVE-21305.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15837 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16659/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16659/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16659/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12963542 - PreCommit-HIVE-Build

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch, HIVE-21305.2.patch, 
> HIVE-21305.3.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799955#comment-16799955
 ] 

Hive QA commented on HIVE-21305:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
9s{color} | {color:blue} ql in master has 2255 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} common: The patch generated 1 new + 430 unchanged - 0 
fixed = 431 total (was 430) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 2 new + 456 unchanged - 0 
fixed = 458 total (was 456) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16659/dev-support/hive-personality.sh
 |
| git revision | master / 2fa22bf |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16659/yetus/diff-checkstyle-common.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16659/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16659/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch, HIVE-21305.2.patch, 
> HIVE-21305.3.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This 

[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799942#comment-16799942
 ] 

Hive QA commented on HIVE-21305:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12963540/HIVE-21305.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15837 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_io_etl] 
(batchId=153)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16658/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16658/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16658/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12963540 - PreCommit-HIVE-Build

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch, HIVE-21305.2.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-24 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799934#comment-16799934
 ] 

Hive QA commented on HIVE-21305:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
11s{color} | {color:blue} ql in master has 2255 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} common: The patch generated 1 new + 430 unchanged - 0 
fixed = 431 total (was 430) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 2 new + 456 unchanged - 0 
fixed = 458 total (was 456) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16658/dev-support/hive-personality.sh
 |
| git revision | master / 2fa22bf |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16658/yetus/diff-checkstyle-common.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16658/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16658/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch, HIVE-21305.2.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by 

[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799579#comment-16799579
 ] 

Hive QA commented on HIVE-21305:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12963472/HIVE-21305.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 15838 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_io_etl] (batchId=25)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge1] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge3] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge4] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_input_counters]
 (batchId=181)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=190)
org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthHttp.testAuthorization1 
(batchId=276)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16649/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16649/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16649/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12963472 - PreCommit-HIVE-Build

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799573#comment-16799573
 ] 

Hive QA commented on HIVE-21305:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
11s{color} | {color:blue} ql in master has 2255 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} common: The patch generated 1 new + 430 unchanged - 0 
fixed = 431 total (was 430) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 2 new + 456 unchanged - 0 
fixed = 458 total (was 456) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16649/dev-support/hive-personality.sh
 |
| git revision | master / 2fa22bf |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16649/yetus/diff-checkstyle-common.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16649/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16649/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-03-22 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799507#comment-16799507
 ] 

Prasanth Jayachandran commented on HIVE-21305:
--

[~gopalv] can you please take a look?

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21305.1.patch
>
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-02-25 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776779#comment-16776779
 ] 

Peter Vary commented on HIVE-21305:
---

[~gopalv]:
 * Read through cache - ok - got it :)
 * Consider the following query:
{code:java}
insert into ETL_1 values
select fact.id, fact.value, dim.value from fact, dim where 
fact.dim_id=dim.id;
{code}
We might want to cache the dim table, since that might be reused in another 
query, but we might not want to cache the fact table.

 * Small tables vs. big tables cache: I might be wrong but my assumption was 
that reading files has some constant access time like overhead and then a size 
based reading time. If my assumption is correct we might be better of caching 
the small tables (provided they are reused later) since this can save us the 
constant access time. Since they would have smaller memory footprint we can 
store more of them in the cache, so the size is not that much of a factor.

Disclaimer: All of that above is based only on limited data - you have more 
experience here :)

 

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Priority: Major
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-02-22 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775544#comment-16775544
 ] 

Gopal V commented on HIVE-21305:


bq. We decide if the query inserts into a table then we do not add entries to 
the cache, but we still use the existing cache elements?

The cache does the read through, so the cache is in charge of reading data into 
itself - the items are not read and then placed into the cache.

bq. We might be better off caching the small tables but skipping the big ones.

Once you decide not to cache in a scenario, the smallest tables are the least 
worth caching - the improvement in performance is going to be smaller as the 
tables get smaller.

A more granular decision might be helpful, but this is a "too obvious" general 
ticket & not the final version (we will learn as we implement and deploy).

The original customer case was for the SerDeEncodedDataReader (which burns CPU 
to do intermediate transforms, not just caching data), not the ORC cache.

And the real issue was displacement as well (the existing "hot data"  getting 
displaced by this) - in the most common scenario, the text data is never going 
to be read again.

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Priority: Major
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-02-22 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774884#comment-16774884
 ] 

Peter Vary commented on HIVE-21305:
---

[~prasanth_j]: We decide if the query inserts into a table then we do not add 
entries to the cache, but we still use the existing cache elements?

What do you think about using the row number statistics for the input tables? 
ETL queries still might use/and reuse some smaller tables in joins. We might be 
better off caching the small tables but skipping the big ones.

Thanks,

Peter 

 

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Priority: Major
>
> To avoid ETL queries from polluting the cache, would be good to detect such 
> queries at compile time and optional skip llap io for such queries. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way  to catch ETL queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)