[ https://issues.apache.org/jira/browse/HIVE-21966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shubham Chaurasia updated HIVE-21966:
-------------------------------------
Attachment: HIVE-21966.1.patch
Status: Patch Available (was: Open)
> Llap external client - Arrow Serializer throws ArrayIndexOutOfBoundsException
> in some cases
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-21966
> URL: https://issues.apache.org/jira/browse/HIVE-21966
> Project: Hive
> Issue Type: Bug
> Components: llap, Serializers/Deserializers
> Affects Versions: 3.1.1
> Reporter: Shubham Chaurasia
> Assignee: Shubham Chaurasia
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21966.1.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a query is submitted through llap-ext-client, the Arrow serializer throws
> ArrayIndexOutOfBoundsException if conditions 1), 2) and 3) below are all satisfied.
> 1) {{hive.vectorized.execution.filesink.arrow.native.enabled=true}} is set, so
> the Arrow serializer code path is taken.
> 2) The query contains a filter or limit clause, which forces
> {{VectorizedRowBatch#selectedInUse=true}}.
> 3) The projection involves a column of type {{MultiValuedColumnVector}}.
> Sample stack trace:
> {code}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 150
>     at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeGeneric(Serializer.java:679)
>     at org.apache.hadoop.hive.ql.io.arrow.Serializer.writePrimitive(Serializer.java:518)
>     at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:276)
>     at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeStruct(Serializer.java:342)
>     at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:282)
>     at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeList(Serializer.java:365)
>     at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:279)
>     at org.apache.hadoop.hive.ql.io.arrow.Serializer.serializeBatch(Serializer.java:199)
>     at org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:135)
> ... 30 more
> {code}
> To reproduce, run the following from beeline:
> {code}
> CREATE TABLE complex_tbl(c1 array<struct<f1:string,f2:string>>) STORED AS ORC;
> INSERT INTO complex_tbl SELECT ARRAY(NAMED_STRUCT('f1','v11', 'f2','v21'),
> NAMED_STRUCT('f1','v21', 'f2','v22'));
> {code}
> Then run the query {{select * from complex_tbl limit 1}} through
> llap-ext-client.
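
To illustrate why conditions 2) and 3) interact badly, here is a minimal, self-contained sketch of how a row-level selection array relates to a list column's child elements. Everything in it is hypothetical: the classes ListColumn and Batch only mimic the shape of Hive's VectorizedRowBatch (selected, selectedInUse, size) and ListColumnVector (offsets, lengths, child), and readChildWithRowSelection is an invented example of one way such an index overrun could arise. It does not claim to reproduce the actual defect in Serializer.java or the fix in the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a vectorized batch with one list column.
// These are NOT Hive classes; they only mirror the relevant fields.
public class SelectedInUseSketch {

    // Row r owns child[offsets[r] .. offsets[r] + lengths[r] - 1].
    static final class ListColumn {
        final long[] offsets;
        final long[] lengths;
        final String[] child;

        ListColumn(long[] offsets, long[] lengths, String[] child) {
            this.offsets = offsets;
            this.lengths = lengths;
            this.child = child;
        }
    }

    static final class Batch {
        final ListColumn col;
        final int[] selected;        // physical row indices that survived a filter/limit
        final boolean selectedInUse; // true when 'selected' must be consulted
        final int size;              // number of logical rows

        Batch(ListColumn col, int[] selected, boolean selectedInUse, int size) {
            this.col = col;
            this.selected = selected;
            this.selectedInUse = selectedInUse;
            this.size = size;
        }
    }

    // Three physical rows: [a, b], [c], [d, e, f]; only row 2 is selected.
    static Batch demoBatch() {
        ListColumn col = new ListColumn(
            new long[] {0, 2, 3},
            new long[] {2, 1, 3},
            new String[] {"a", "b", "c", "d", "e", "f"});
        return new Batch(col, new int[] {2}, true, 1);
    }

    // Row-level indirection: map the logical row to a physical row through
    // 'selected', then address child elements through offsets/lengths only.
    static List<List<String>> readRows(Batch b) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < b.size; i++) {
            int row = b.selectedInUse ? b.selected[i] : i;
            int start = (int) b.col.offsets[row];
            int len = (int) b.col.lengths[row];
            List<String> elems = new ArrayList<>();
            for (int j = 0; j < len; j++) {
                elems.add(b.col.child[start + j]);
            }
            out.add(elems);
        }
        return out;
    }

    // Invented faulty variant: the row-level 'selected' array is reused to
    // index child elements, although it has one entry per row, not per element.
    static List<String> readChildWithRowSelection(Batch b, int row) {
        int start = (int) b.col.offsets[row];
        int len = (int) b.col.lengths[row];
        List<String> elems = new ArrayList<>();
        for (int j = 0; j < len; j++) {
            elems.add(b.col.child[b.selected[start + j]]); // index may exceed selected.length
        }
        return elems;
    }

    public static void main(String[] args) {
        Batch b = demoBatch();
        System.out.println(readRows(b));
        try {
            readChildWithRowSelection(b, 2);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("Out of bounds: " + e.getMessage());
        }
    }
}
```

In this sketch the selection array is sized to the number of surviving rows, so child positions run past its end as soon as a list has more elements than there are selected rows. The point is only that child vectors of a multi-valued column are addressed by offsets and lengths, not by the batch's row selection.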