[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047856#comment-17047856 ]

Slim Bouguerra commented on HIVE-22583:
---

Sorry, I think I missed the fact that that file is not yours. +1

> LLAP cache always misses with non-vectorized serde readers such as OpenCSV
> --
>
>                 Key: HIVE-22583
>                 URL: https://issues.apache.org/jira/browse/HIVE-22583
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Ádám Szita
>            Assignee: Ádám Szita
>            Priority: Major
>         Attachments: HIVE-22583.0.patch, HIVE-22583.1.patch, HIVE-22583.2.patch, HIVE-22583.3.patch
>
>
> Although the LLAP cache stores the data of tables that do not use the
> LazySimple serde after the first read, the stored data is never used by
> subsequent queries, causing a full cache miss and re-read each time.
> The root cause is that SerdeEncodedDataReader#cacheFileData does not create
> an entry for the root/struct column of the table. This is only handled when a
> vectorized reader is used _(e.g. LazySimpleSerde's LazySimpleDeserializeRead)_,
> in which case SerdeEncodedDataReader#processAsyncCacheData takes care of it.
> This can be reproduced either by using a custom serde such as OpenCSV, or by
> using LazySimpleSerde with _hive.llap.io.encode.vector.serde.enabled_ turned off.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
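The root cause described in the issue can be pictured with a toy model. The sketch below is not Hive code: `ToyColumnCache`, `cache_file_data`, `read`, the `include_root` switch and the choice of index 0 for the root/struct column are all hypothetical names invented for illustration. It assumes that a reader asks the cache for the root column together with the data columns, and that one missing entry makes the whole lookup a miss.

```python
# Toy model only: every name here is hypothetical and is NOT Hive's API.
ROOT_COLUMN = 0  # assumed index of the root/struct column in this sketch


class ToyColumnCache:
    """Maps (file_key, column_index) to cached bytes."""

    def __init__(self):
        self._entries = {}

    def put(self, file_key, column_index, data):
        self._entries[(file_key, column_index)] = data

    def get_all(self, file_key, column_indexes):
        """Return cached data only if every requested column is present."""
        wanted = [(file_key, index) for index in column_indexes]
        if all(key in self._entries for key in wanted):
            return [self._entries[key] for key in wanted]
        return None  # one missing entry turns the lookup into a full miss


def cache_file_data(cache, file_key, columns, include_root):
    """Store the data columns; store the root/struct entry only if asked to."""
    if include_root:
        cache.put(file_key, ROOT_COLUMN, b"")
    for index, data in columns.items():
        cache.put(file_key, index, data)


def read(cache, file_key, data_columns):
    """A reader that always requests the root column plus the data columns."""
    return cache.get_all(file_key, [ROOT_COLUMN] + sorted(data_columns))


columns = {1: b"a,b", 2: b"c,d"}

buggy = ToyColumnCache()
cache_file_data(buggy, "t.csv", columns, include_root=False)
print(read(buggy, "t.csv", columns))  # None: the data is cached but never used

fixed = ToyColumnCache()
cache_file_data(fixed, "t.csv", columns, include_root=True)
print(read(fixed, "t.csv", columns))  # [b'', b'a,b', b'c,d']: served from cache
```

In this model the data columns are stored in both cases; only the missing root entry makes the cached data unreachable, which mirrors the "stored but never used" symptom in the description.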
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047846#comment-17047846 ]

Slim Bouguerra commented on HIVE-22583:
---

[~szita] can you please remove the use of the counters from the q files? I thought we had added an option to crash if the cache has a miss?
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043889#comment-17043889 ]

Hive QA commented on HIVE-22583:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994307/HIVE-22583.3.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

SUCCESS: +1 due to 18058 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/20806/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20806/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20806/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994307 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043852#comment-17043852 ]

Hive QA commented on HIVE-22583:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 2m 0s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 19s | master passed |
| +1 | compile | 1m 28s | master passed |
| +1 | checkstyle | 0m 56s | master passed |
| 0 | findbugs | 4m 4s | ql in master has 1530 extant Findbugs warnings. |
| 0 | findbugs | 0m 48s | llap-server in master has 90 extant Findbugs warnings. |
| +1 | javadoc | 1m 15s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 52s | the patch passed |
| +1 | compile | 1m 25s | the patch passed |
| +1 | javac | 1m 25s | the patch passed |
| +1 | checkstyle | 0m 58s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 51s | the patch passed |
| +1 | javadoc | 1m 16s | the patch passed |
|| || || || Other Tests ||
| -1 | asflicense | 0m 16s | The patch generated 1 ASF License warnings. |
| | | 30m 43s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-20806/dev-support/hive-personality.sh |
| git revision | master / 92d9f7d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-20806/yetus/patch-asflicense-problems.txt |
| modules | C: ql llap-server itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-20806/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043391#comment-17043391 ]

Ádám Szita commented on HIVE-22583:
---

Thanks for the tip, [~bslim]. I incorporated your suggestion of making a query read from the cache only in HIVE-22721, and amended my patch here so that it can make use of it. Can you take a quick peek at it, please?
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011048#comment-17011048 ]

Slim Bouguerra commented on HIVE-22583:
---

[~szita] I think that might be the same thing. In fact, the Tez counters depend on the HDFS counters, which are tied to the file format; the format can change, and thus the byte counts can change. Think of it this way: the bytes read or missed by the cache are relative to the ORC file format. As I said, I think for now we can avoid this test case, which can be flaky, and work on a query that can run against the cache only; that's more robust IMO.
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994821#comment-16994821 ]

Ádám Szita commented on HIVE-22583:
---

Thanks for looking into this, [~bslim]. I was thinking that maybe we could do an {{llap cache -purge}} at the end of the test case. That would imprint the number of bytes that were cached in the q test result file. I guess it would not interfere with other test cases, as ptest runs them sequentially within a batch. What's your opinion?
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993009#comment-16993009 ]

Slim Bouguerra commented on HIVE-22583:
---

The fix looks good to me. I try to avoid using Tez counters as part of test validation, since they can change quite drastically if the plan changes, but I understand that this is the only way for now to test the data in the cache. One idea I want to implement is a query flag that tells LLAP to run a query against the cache content only; that would enable such tests and avoid the use of the Tez counters.
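The "cache content only" idea can be sketched with a small toy reader. Nothing below is LLAP's API: `ToyReader`, `CacheMissError`, the `cache_only` attribute and the file path are hypothetical and exist only to illustrate the testing approach. The point is that a test can then assert "the second read succeeds in cache-only mode" instead of comparing byte counters whose values depend on the plan and the file format.

```python
class CacheMissError(Exception):
    """Raised when a cache-only read would have to fall back to storage."""


class ToyReader:
    """Hypothetical reader with a cache-only switch (not Hive/LLAP code)."""

    def __init__(self, storage):
        self._storage = storage           # path -> bytes, stands in for HDFS
        self._cache = {}                  # path -> bytes
        self.cache_only = False           # the assumed "query flag"
        self.bytes_read_from_storage = 0  # counter-style metric, for contrast

    def read(self, path):
        if path in self._cache:
            return self._cache[path]
        if self.cache_only:
            # Fail loudly instead of silently re-reading from storage.
            raise CacheMissError(path)
        data = self._storage[path]
        self.bytes_read_from_storage += len(data)
        self._cache[path] = data
        return data


storage = {"/warehouse/t/000000_0": b"1,2,3\n"}
reader = ToyReader(storage)

reader.read("/warehouse/t/000000_0")  # first read warms the cache
reader.cache_only = True
# A cache-only read either succeeds or raises; no byte counts are compared.
assert reader.read("/warehouse/t/000000_0") == b"1,2,3\n"
```

A check of this shape stays valid if the byte counts change, because it depends only on whether the data is served from the cache.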
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991696#comment-16991696 ]

Hive QA commented on HIVE-22583:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12988307/HIVE-22583.2.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

SUCCESS: +1 due to 17763 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19832/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19832/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19832/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12988307 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991641#comment-16991641 ]

Hive QA commented on HIVE-22583:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 46s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 28s | master passed |
| +1 | compile | 1m 26s | master passed |
| +1 | checkstyle | 0m 53s | master passed |
| 0 | findbugs | 4m 4s | ql in master has 1532 extant Findbugs warnings. |
| 0 | findbugs | 0m 44s | llap-server in master has 90 extant Findbugs warnings. |
| +1 | javadoc | 1m 15s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 26s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 47s | the patch passed |
| +1 | compile | 1m 32s | the patch passed |
| +1 | javac | 1m 32s | the patch passed |
| +1 | checkstyle | 0m 53s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 5m 14s | the patch passed |
| +1 | javadoc | 1m 18s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 29m 47s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19832/dev-support/hive-personality.sh |
| git revision | master / d7a193b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql llap-server itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19832/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990832#comment-16990832 ]

Hive QA commented on HIVE-22583:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12987827/HIVE-22583.1.patch

ERROR: -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19814/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19814/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19814/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12987827/HIVE-22583.1.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12987827 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990777#comment-16990777 ]

Hive QA commented on HIVE-22583:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12987827/HIVE-22583.1.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

ERROR: -1 due to 1 failed/errored test(s), 17759 tests executed
*Failed tests:*
{noformat}
TestStatsReplicationScenariosACIDNoAutogather - did not produce a TEST-*.xml file (likely timed out) (batchId=257)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19810/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19810/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19810/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12987827 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990766#comment-16990766 ]

Hive QA commented on HIVE-22583:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 45s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 39s | master passed |
| +1 | compile | 1m 31s | master passed |
| +1 | checkstyle | 0m 53s | master passed |
| 0 | findbugs | 4m 9s | ql in master has 1532 extant Findbugs warnings. |
| 0 | findbugs | 0m 43s | llap-server in master has 90 extant Findbugs warnings. |
| +1 | javadoc | 1m 19s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 29s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 54s | the patch passed |
| +1 | compile | 1m 28s | the patch passed |
| +1 | javac | 1m 28s | the patch passed |
| +1 | checkstyle | 0m 54s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 5m 11s | the patch passed |
| +1 | javadoc | 1m 16s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 30m 11s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19810/dev-support/hive-personality.sh |
| git revision | master / a245e79 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql llap-server itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19810/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990385#comment-16990385 ]

Hive QA commented on HIVE-22583:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12987578/HIVE-22583.0.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19788/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19788/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19788/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12987578/HIVE-22583.0.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12987578 - PreCommit-HIVE-Build

> LLAP cache always misses with non-vectorized serde readers such as OpenCSV
> --
>
> Key: HIVE-22583
> URL: https://issues.apache.org/jira/browse/HIVE-22583
> Project: Hive
> Issue Type: Bug
> Components: llap
> Reporter: Ádám Szita
> Assignee: Ádám Szita
> Priority: Major
> Attachments: HIVE-22583.0.patch
>
>
> Although after the first read LLAP cache stores data of tables that are not
> using the LazySimple serde, the stored data is then never used in the future
> subsequent queries, causing a full cache miss and re-read each time.
> Problem is rooted in SerdeEncodedDataReader#cacheFileData is not taking care
> of creating an entry for the root/struct column of the table. The only cases
> this is taken care of are when a vectorized reader is used _(e.g.
> LazySimpleSerde's LazySimpleDeserializeRead)_, where
> SerdeEncodedDataReader#processAsyncCacheData takes care of this.
> This can be reproduced by either using a custom serde, like OpenCSV or using
> LazySimpleSerde, but turning off _hive.llap.io.encode.vector.serde.enabled_.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989478#comment-16989478 ]

Hive QA commented on HIVE-22583:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12987578/HIVE-22583.0.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 17759 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[unicode_data] (batchId=87)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_io_etl] (batchId=168)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19769/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19769/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19769/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12987578 - PreCommit-HIVE-Build

> LLAP cache always misses with non-vectorized serde readers such as OpenCSV
> --
>
> Key: HIVE-22583
> URL: https://issues.apache.org/jira/browse/HIVE-22583
> Project: Hive
> Issue Type: Bug
> Components: llap
> Reporter: Ádám Szita
> Assignee: Ádám Szita
> Priority: Major
> Attachments: HIVE-22583.0.patch
>
>
> Although after the first read LLAP cache stores data of tables that are not
> using the LazySimple serde, the stored data is then never used in the future
> subsequent queries, causing a full cache miss and re-read each time.
> Problem is rooted in SerdeEncodedDataReader#cacheFileData is not taking care
> of creating an entry for the root/struct column of the table. The only cases
> this is taken care of are when a vectorized reader is used _(e.g.
> LazySimpleSerde's LazySimpleDeserializeRead)_, where
> SerdeEncodedDataReader#processAsyncCacheData takes care of this.
> This can be reproduced by either using a custom serde, like OpenCSV or using
> LazySimpleSerde, but turning off _hive.llap.io.encode.vector.serde.enabled_.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989447#comment-16989447 ]

Hive QA commented on HIVE-22583:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 23s{color} | {color:blue} ql in master has 1532 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 47s{color} | {color:blue} llap-server in master has 90 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 57s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19769/dev-support/hive-personality.sh |
| git revision | master / 74ae418 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql llap-server itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19769/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> LLAP cache always misses with non-vectorized serde readers such as OpenCSV
> --
>
> Key: HIVE-22583
> URL: https://issues.apache.org/jira/browse/HIVE-22583
> Project: Hive
> Issue Type: Bug
> Components: llap
> Reporter: Ádám Szita
> Assignee: Ádám Szita
> Priority: Major
> Attachments: HIVE-22583.0.patch
>
>
> Although after the first read LLAP cache stores data of tables that are not
> using the LazySimple serde, the stored data is then never used in the future
> subsequent queries, causing a full cache miss and re-read each time.
> Problem is rooted in SerdeEncodedDataReader#cacheFileData is not taking care
> of creating an entry for the root/struct column of the table. The only cases
> this is taken care of are when a vectorized reader is used _(e.g.
> LazySimpleSerde's LazySimpleDeserializeRead)_, where
> SerdeEncodedDataReader#processAsyncCacheData takes care of this.
> This can be reproduced by either using a custom serde, like OpenCSV or using
> LazySimpleSerde, but turning off _hive.llap.io.encode.vector.serde.enabled_.

--
This message
[jira] [Commented] (HIVE-22583) LLAP cache always misses with non-vectorized serde readers such as OpenCSV
[ https://issues.apache.org/jira/browse/HIVE-22583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988668#comment-16988668 ]

Ádám Szita commented on HIVE-22583:
---

[~bslim], [~odraese], [~pvary] can you take a quick look please?

> LLAP cache always misses with non-vectorized serde readers such as OpenCSV
> --
>
> Key: HIVE-22583
> URL: https://issues.apache.org/jira/browse/HIVE-22583
> Project: Hive
> Issue Type: Bug
> Components: llap
> Reporter: Ádám Szita
> Assignee: Ádám Szita
> Priority: Major
> Attachments: HIVE-22583.0.patch
>
>
> Although after the first read LLAP cache stores data of tables that are not
> using the LazySimple serde, the stored data is then never used in the future
> subsequent queries, causing a full cache miss and re-read each time.
> Problem is rooted in SerdeEncodedDataReader#cacheFileData is not taking care
> of creating an entry for the root/struct column of the table. The only cases
> this is taken care of are when a vectorized reader is used _(e.g.
> LazySimpleSerde's LazySimpleDeserializeRead)_, where
> SerdeEncodedDataReader#processAsyncCacheData takes care of this.
> This can be reproduced by either using a custom serde, like OpenCSV or using
> LazySimpleSerde, but turning off _hive.llap.io.encode.vector.serde.enabled_.

--
This message was sent by Atlassian Jira (v8.3.4#803005)