[ 
https://issues.apache.org/jira/browse/HIVE-25553?focusedWorklogId=658338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-658338
 ]

ASF GitHub Bot logged work on HIVE-25553:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Sep/21 13:24
            Start Date: 30/Sep/21 13:24
    Worklog Time Spent: 10m 
      Work Description: warriersruthi opened a new pull request #2689:
URL: https://github.com/apache/hive/pull/2689


   This covers the following sub-tasks as well:
   HIVE-25554: Upgrade arrow version to 0.15
   HIVE-25555: ArrowColumnarBatchSerDe should store map natively instead of 
converting to list
   HIVE-25556: Remove com.vlkan.flatbuffers dependency from serde
   
   **What changes were proposed in this pull request?**
   a. Upgrading arrow version to version 0.15.0 (where map data-type is 
supported)
   b. Modifying ArrowColumnarBatchSerDe and corresponding 
Serializer/Deserializer to not use list as a workaround for map and use the 
arrow map data-type instead
   c. Creating a non-nullable struct and a non-nullable key type for the map 
data-type in ArrowColumnarBatchSerDe
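
To make the difference between the two layouts concrete, here is a minimal, self-contained sketch. It does not use the Arrow Java API or any Hive class: `MapSchemaSketch`, `Field`, `mapAsList`, `nativeMap`, `matchesDeclaredMap` and the child field names are all hypothetical, chosen only to show the shape of the schema before and after the change described in (b) and (c).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for an Arrow schema field; NOT the Arrow Java API. */
public class MapSchemaSketch {

    public static final class Field {
        public final String name;
        public final String type;
        public final boolean nullable;
        public final List<Field> children;

        public Field(String name, String type, boolean nullable, Field... children) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
            this.children = Collections.unmodifiableList(Arrays.asList(children));
        }

        /** Renders the type tree, appending " not null" to non-nullable fields. */
        public String describe() {
            StringBuilder sb = new StringBuilder(type);
            if (!children.isEmpty()) {
                sb.append('<');
                for (int i = 0; i < children.size(); i++) {
                    if (i > 0) {
                        sb.append(", ");
                    }
                    sb.append(children.get(i).describe());
                }
                sb.append('>');
            }
            if (!nullable) {
                sb.append(" not null");
            }
            return sb.toString();
        }
    }

    /** Workaround layout: a map column emitted as list<struct<key, value>>. */
    public static Field mapAsList(String column, String keyType, String valueType) {
        // Child names here are illustrative assumptions.
        Field key = new Field("keys", keyType, true);
        Field value = new Field("values", valueType, true);
        Field entry = new Field("entry", "struct", true, key, value);
        return new Field(column, "list", true, entry);
    }

    /** Proposed layout: native map whose entries struct and key are non-nullable. */
    public static Field nativeMap(String column, String keyType, String valueType) {
        Field key = new Field("key", keyType, false);
        Field value = new Field("value", valueType, true);
        Field entries = new Field("entries", "struct", false, key, value);
        return new Field(column, "map", true, entries);
    }

    /** A reader that declared a map column rejects any other top-level type. */
    public static boolean matchesDeclaredMap(Field actual) {
        return "map".equals(actual.type);
    }

    public static void main(String[] args) {
        Field legacy = mapAsList("props", "utf8", "int32");
        Field fixed = nativeMap("props", "utf8", "int32");
        System.out.println(legacy.describe());
        System.out.println(fixed.describe());
        System.out.println(matchesDeclaredMap(legacy));
        System.out.println(matchesDeclaredMap(fixed));
    }
}
```

In this toy model, a reader that declared the column as a map accepts only the second layout, which is the kind of schema/data mismatch this pull request aims to remove.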
   
   **Why are the changes needed?**
   Currently, ArrowColumnarBatchSerDe serializes the map data-type as a list 
of structs (where each struct holds one key-value pair of the map).
   This causes issues when reading a Map column using llap-ext-client, as it 
reads a list of structs instead.
   HiveWarehouseConnector, which uses the llap-ext-client, throws an exception 
when the schema (containing the Map data type) differs from the actual data 
(a list of structs).
   This change includes the fix for this issue.
   
   **Does this PR introduce any user-facing change?**
   No
   
   **How was this patch tested?**
   Re-enabled the Arrow-specific tests in the Hive code


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

            Worklog Id:     (was: 658338)
    Remaining Estimate: 0h
            Time Spent: 10m

> Support Map data-type natively in Arrow format
> ----------------------------------------------
>
>                 Key: HIVE-25553
>                 URL: https://issues.apache.org/jira/browse/HIVE-25553
>             Project: Hive
>          Issue Type: Improvement
>          Components: llap, Serializers/Deserializers
>            Reporter: Adesh Kumar Rao
>            Assignee: Adesh Kumar Rao
>            Priority: Major
>             Fix For: 4.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, ArrowColumnarBatchSerDe serializes the map data-type as a list of 
> structs (where each struct holds one key-value pair of the map). This 
> causes issues when reading a Map column using llap-ext-client, as it reads a 
> list of structs instead. 
> HiveWarehouseConnector, which uses the llap-ext-client, throws an exception 
> when the schema (containing the Map data type) differs from the actual data 
> (a list of structs).
>  
> Fixing this issue requires upgrading arrow version (where map data-type is 
> supported), modifying ArrowColumnarBatchSerDe and corresponding 
> Serializer/Deserializer to not use list as a workaround for map and use the 
> arrow map data-type instead. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
