[
https://issues.apache.org/jira/browse/DRILL-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285954#comment-17285954
]
Charles Givre commented on DRILL-7864:
--------------------------------------
[~matthros]
There is a pending PR (https://issues.apache.org/jira/browse/DRILL-7825) that
upgrades the Parquet libraries to the latest version. It is currently blocked
by one remaining issue on the Parquet side, but it should be merged soon. Do
you think the upgrade could resolve this issue?
> Parquet file could not be read correctly
> ----------------------------------------
>
> Key: DRILL-7864
> URL: https://issues.apache.org/jira/browse/DRILL-7864
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Parquet
> Affects Versions: 1.18.0
> Reporter: Matthias Rosenthaler
> Priority: Major
> Attachments: output.parquet
>
>
> The attached Parquet file, generated by ParquetSharp (which uses the
> underlying Apache Arrow C++ library), cannot be read correctly by Drill: the
> values of the columns are displaced. If I write the affected float32 columns
> "InjectionRate" and "I_injection_IA" as float64, everything is fine.
> Update: It seems that the bug is *caused by dictionary encoding*. If I turn
> this feature off, Drill is able to read the file. So please take a look at how
> Drill reads dictionary-encoded columns to solve the bug.
> I also created a ticket for the Arrow project, but they redirected me to the
> Drill project: https://issues.apache.org/jira/browse/ARROW-11629
>
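For readers unfamiliar with the encoding named in the update above, the
following Python sketch illustrates the general idea of dictionary encoding:
each distinct value is stored once, and the column is replaced by integer
indices into that list. It is a simplified model under stated assumptions, not
Parquet's actual on-disk layout and not Drill's reader code. The function names
and the sample values are hypothetical and are not taken from output.parquet.

```python
from typing import Dict, List, Sequence, Tuple


def dictionary_encode(values: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Return (dictionary, indices); each distinct value is stored only once."""
    dictionary: List[float] = []
    positions: Dict[float, int] = {}  # value -> its index in `dictionary`
    indices: List[int] = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary: Sequence[float], indices: Sequence[int]) -> List[float]:
    """Rebuild the original column by looking up every index."""
    return [dictionary[i] for i in indices]


# Hypothetical sample standing in for a float32 column such as "InjectionRate".
column = [0.5, 0.5, 1.25, 0.5, 1.25, 2.0]
dictionary, indices = dictionary_encode(column)
decoded = dictionary_decode(dictionary, indices)
```

In this model, a reader that pairs the indices with the wrong dictionary, or
misreads the index stream, would return valid-looking values in the wrong
positions. That would be consistent with the "displaced" values reported here,
but the actual cause in Drill is not established by this sketch.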