[ 
https://issues.apache.org/jira/browse/HIVE-25443?focusedWorklogId=683102&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683102
 ]

ASF GitHub Bot logged work on HIVE-25443:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Nov/21 05:20
            Start Date: 18/Nov/21 05:20
    Worklog Time Spent: 10m 
      Work Description: shameersss1 commented on pull request #2581:
URL: https://github.com/apache/hive/pull/2581#issuecomment-972548623


   > @shameersss1 there are some commits with your [email protected] address. 
Could you associate that email address with your GitHub account? Without 
that, the committer email address will be changed to some [email protected]
   
   @kgyrtkirk  - I have linked that email address. Please take it forward and 
thanks for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 683102)
    Time Spent: 0.5h  (was: 20m)

> Arrow SerDe Cannot serialize/deserialize complex data types When there are 
> more than 1024 values
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25443
>                 URL: https://issues.apache.org/jira/browse/HIVE-25443
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2
>            Reporter: Syed Shameerur Rahman
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Complex data types such as MAP and STRUCT cannot be serialized/deserialized 
> using Arrow SerDe when there are more than 1024 values. This happens because 
> ColumnVector is always initialized with a size of 1024.
> Issue #1 : 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/ArrowColumnarBatchSerDe.java#L213
> Issue #2 : 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/ArrowColumnarBatchSerDe.java#L215
> Sample unit test to reproduce the case in TestArrowColumnarBatchSerDe :
> {code:java}
> @Test
> public void testListBooleanWithMoreThan1024Values() throws SerDeException {
>   String[][] schema = {
>       {"boolean_list", "array<boolean>"},
>   };
>
>   Object[][] rows = new Object[1025][1];
>   for (int i = 0; i < 1025; i++) {
>     rows[i][0] = new BooleanWritable(true);
>   }
>
>   initAndSerializeAndDeserialize(schema, toList(rows));
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
