[jira] [Commented] (HIVE-23022) Arrow deserializer should ensure size of hive vector equal to arrow vector
[ https://issues.apache.org/jira/browse/HIVE-23022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059943#comment-17059943 ]

mahesh kumar behera commented on HIVE-23022:

+1

[^HIVE-23022.01.patch] looks fine to me.

> Arrow deserializer should ensure size of hive vector equal to arrow vector
> --
>
>                 Key: HIVE-23022
>                 URL: https://issues.apache.org/jira/browse/HIVE-23022
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Serializers/Deserializers
>            Reporter: Shubham Chaurasia
>            Assignee: Shubham Chaurasia
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23022.01.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Arrow deserializer - {{org.apache.hadoop.hive.ql.io.arrow.Deserializer}} in some cases does not set the size of hive vector correctly. Size of hive vector should be set at least equal to arrow vector to be able to read (accommodate) it fully.
> Following exception can be seen when we try to read (using {{LlapArrowRowInputFormat}} ) some table which contains complex types (struct nested in array to be specific) and number of rows in table is more than default (1024) batch/vector size.
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.readStruct(Deserializer.java:440)
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.read(Deserializer.java:143)
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.readList(Deserializer.java:394)
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.read(Deserializer.java:137)
>   at org.apache.hadoop.hive.ql.io.arrow.Deserializer.deserialize(Deserializer.java:122)
>   at org.apache.hadoop.hive.ql.io.arrow.ArrowColumnarBatchSerDe.deserialize(ArrowColumnarBatchSerDe.java:284)
>   at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:75)
>   ... 23 more
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
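The failure mode in the report above can be reproduced in miniature. The following is only a sketch under stated assumptions: `SimpleVector`, `ensureSize`, `copyUnchecked` and `copyChecked` are hypothetical stand-ins written for illustration, not the actual Hive or Arrow API and not the contents of HIVE-23022.01.patch. It shows a target vector allocated at the default size of 1024 overflowing when the source holds more values, and the same copy succeeding once the target is grown to at least the source size first.

```java
import java.util.Arrays;

/** Illustration only: hypothetical stand-ins, not the real Hive or Arrow classes. */
class VectorSizeSketch {
    // Default batch/vector size mentioned in the issue description.
    static final int DEFAULT_SIZE = 1024;

    /** Minimal fixed-capacity vector backed by a long[] (hypothetical). */
    static final class SimpleVector {
        long[] values = new long[DEFAULT_SIZE];

        /** Grows the backing array so it can hold at least `size` values. */
        void ensureSize(int size, boolean preserveData) {
            if (size <= values.length) {
                return; // already large enough
            }
            long[] old = values;
            values = new long[size];
            if (preserveData) {
                System.arraycopy(old, 0, values, 0, old.length);
            }
        }
    }

    /** Copies without resizing; throws once source.length exceeds the target capacity. */
    static void copyUnchecked(long[] source, SimpleVector target) {
        for (int i = 0; i < source.length; i++) {
            target.values[i] = source[i];
        }
    }

    /** Grows the target to the source size first, then copies. */
    static void copyChecked(long[] source, SimpleVector target) {
        target.ensureSize(source.length, false);
        copyUnchecked(source, target);
    }

    public static void main(String[] args) {
        long[] source = new long[DEFAULT_SIZE + 1]; // one value more than the default size
        Arrays.fill(source, 7L);

        try {
            copyUnchecked(source, new SimpleVector());
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked copy failed: " + e.getMessage());
        }

        SimpleVector grown = new SimpleVector();
        copyChecked(source, grown);
        System.out.println("checked copy capacity: " + grown.values.length);
    }
}
```

In the nested case from the report (struct inside array), the child vector can hold more values than there are rows in the batch, which is presumably why sizing the target from the row count alone is not enough; that reading is an inference from the description, not something the thread states.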
[jira] [Commented] (HIVE-23022) Arrow deserializer should ensure size of hive vector equal to arrow vector
[ https://issues.apache.org/jira/browse/HIVE-23022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059131#comment-17059131 ]

Hive QA commented on HIVE-23022:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12996648/HIVE-23022.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18104 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/21106/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/21106/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-21106/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12996648 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-23022) Arrow deserializer should ensure size of hive vector equal to arrow vector
[ https://issues.apache.org/jira/browse/HIVE-23022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059122#comment-17059122 ]

Hive QA commented on HIVE-23022:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 43s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s{color} | {color:red} itests/hive-unit: The patch generated 6 new + 18 unchanged - 0 fixed = 24 total (was 18) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 41s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-21106/dev-support/hive-personality.sh |
| git revision | master / 64a020b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-21106/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-21106/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.