[
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527519#comment-16527519
]
Junjie Chen edited comment on HIVE-17593 at 6/29/18 11:55 AM:
--------------------------------------------------------------
In ConvertAstToSearchArg.java, Hive uses the padded string of a HiveChar as
the search argument, while the Parquet DataWritableWriter strips the
HiveChar's trailing spaces, so the search fails. Hive should not actually
strip trailing spaces for Parquet, since Parquet can use an encoding such as
RLE to deal with them. So the fix is to use the padded value.
[~Ferd], please take a look at this.
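
To make the mismatch concrete, here is a minimal sketch in plain Java. It does
not use Hive classes: pad and stripTrailingSpaces are hypothetical stand-ins
for the padded and stripped forms of a CHAR(5) value, assumed here only for
illustration.

```java
public class CharPaddingDemo {
    // Hypothetical stand-in for the padded form of a CHAR(length) value.
    static String pad(String value, int length) {
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < length) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // Hypothetical stand-in for what the writer does before persisting a CHAR.
    static String stripTrailingSpaces(String value) {
        int end = value.length();
        while (end > 0 && value.charAt(end - 1) == ' ') {
            end--;
        }
        return value.substring(0, end);
    }

    public static void main(String[] args) {
        String literal = pad("abc", 5);                // value carried by the search argument
        String stored = stripTrailingSpaces(literal);  // value written when spaces are stripped

        // Padded literal vs. stripped stored value: no match.
        System.out.println(literal.equals(stored));          // false
        // Padded literal vs. padded stored value: match.
        System.out.println(literal.equals(pad(stored, 5)));  // true
    }
}
```

Under these assumptions, both sides agree only when they use the same form,
which is why writing the padded value lines up with the existing search
argument.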
> DataWritableWriter strip spaces for CHAR type before writing, but predicate
> generator doesn't do same thing.
> ------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.3.0, 3.0.0
> Reporter: Junjie Chen
> Assignee: Junjie Chen
> Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.patch
>
>
> DataWritableWriter strips trailing spaces from CHAR values before writing,
> but predicate generation does NOT do the same stripping, which should cause
> rows to be missed!
> In the current version this does not cause missing data, because the
> predicate is not properly pushed down to Parquet due to HIVE-17261.
> Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING the
> same, which builds a predicate with trailing spaces.
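
The way a padded predicate could drop rows once pushdown works can be sketched
with a simplified model of min/max pruning. This is not Parquet's actual
filter code: canSkip is a hypothetical helper, and String.compareTo is assumed
as a stand-in for the real binary comparison.

```java
public class RowGroupPruningDemo {
    // Simplified model: a row group can be skipped for "col = literal"
    // when the literal falls outside the [min, max] range of its statistics.
    static boolean canSkip(String min, String max, String literal) {
        return literal.compareTo(min) < 0 || literal.compareTo(max) > 0;
    }

    public static void main(String[] args) {
        // The writer stripped the spaces, so the statistics hold "abc".
        String min = "abc";
        String max = "abc";

        // Padded CHAR(5) literal sorts after "abc", so the group is skipped.
        System.out.println(canSkip(min, max, "abc  "));  // true

        // A stripped literal stays inside the range, so the group is read.
        System.out.println(canSkip(min, max, "abc"));    // false
    }
}
```

In this model the matching row group is never read for the padded literal,
which is the kind of missing data the description warns about.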