[ https://issues.apache.org/jira/browse/HIVE-20044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564539#comment-16564539 ]
Matt McCline commented on HIVE-20044:
-------------------------------------
[~teddy.choi] [~ewohlstadter] Second thoughts...
Do you need to write a variation of StringExpr.padRight that accounts for
Unicode? Do any other parts of your Arrow SerDe need to consider Unicode
character length vs. byte length?
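
To make the character-length vs. byte-length concern concrete, here is a minimal
sketch of padding by Unicode code points rather than by bytes. It is not Hive's
StringExpr.padRight and not the patch under review; the class and method names
(UnicodePadSketch, utf8CodePointCount, padRightByCodePoints) are hypothetical, and
it assumes well-formed UTF-8 input and does no truncation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from Hive.
public class UnicodePadSketch {

    // Counts code points in a UTF-8 byte range by skipping
    // continuation bytes (those of the form 10xxxxxx).
    static int utf8CodePointCount(byte[] bytes, int start, int length) {
        int count = 0;
        for (int i = start; i < start + length; i++) {
            if ((bytes[i] & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    // Returns a copy of the range, right-padded with ASCII spaces until it
    // holds maxCharLength code points. Values already at or over the limit
    // are copied unchanged (this sketch does not truncate).
    static byte[] padRightByCodePoints(byte[] bytes, int start, int length,
                                       int maxCharLength) {
        int chars = utf8CodePointCount(bytes, start, length);
        int pad = Math.max(0, maxCharLength - chars);
        byte[] out = new byte[length + pad];
        System.arraycopy(bytes, start, out, 0, length);
        Arrays.fill(out, length, length + pad, (byte) ' ');
        return out;
    }

    public static void main(String[] args) {
        // "h\u00e9llo" is 5 code points but 6 bytes in UTF-8.
        byte[] raw = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] padded = padRightByCodePoints(raw, 0, raw.length, 8);
        System.out.println("bytes in:  " + raw.length);
        System.out.println("bytes out: " + padded.length);
        System.out.println("chars out: "
            + utf8CodePointCount(padded, 0, padded.length));
    }
}
```

Padding this value to char(8) by byte length would add only two spaces and leave
it one character short; counting code points adds three.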
> Arrow Serde should pad char values and handle empty strings correctly
> ---------------------------------------------------------------------
>
> Key: HIVE-20044
> URL: https://issues.apache.org/jira/browse/HIVE-20044
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Reporter: Teddy Choi
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20044.1.branch-3.patch, HIVE-20044.1.patch,
> HIVE-20044.1.patch, HIVE-20044.patch
>
>
> When Arrow Serde serializes char values, it loses padding. Also, when it
> counts empty strings, it sometimes reports too small a count. It should pad char
> values and handle empty strings correctly.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)