[ https://issues.apache.org/jira/browse/DRILL-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacques Nadeau resolved DRILL-4070.
-----------------------------------
Resolution: Won't Fix
Workaround provided in comments by Parth.
> Files written with versions of Drill before v1.3 record metadata that is
> indistinguishable from bad metadata from other Parquet creators
> ----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: DRILL-4070
> URL: https://issues.apache.org/jira/browse/DRILL-4070
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.3.0
> Reporter: Rahul Challapalli
> Assignee: Parth Chandra
> Priority: Blocker
> Fix For: 1.3.0
>
> Attachments: cache.txt, fewtypes_varcharpartition.tar.tgz
>
>
> Drill uses the parquet-mr library to write Parquet files. The metadata
> signature produced by Drill 1.2 and earlier is indistinguishable from
> older footers written by other tools (such as Pig and
> Hive). There was a known bug when those tools wrote metadata that caused the
> statistics to be incorrect. To correct this, the parquet-mr library adopted a
> behavior of ignoring statistics from the old form of the Parquet footer.
> With 1.3, Drill upgraded to the latest version of parquet-mr and has now
> started ignoring these statistics as well. This ensures correct results but
> produces performance regressions (compared to Drill 1.1 and 1.2) when querying
> partitioned Parquet files generated in Drill 1.1 and 1.2.
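
The ambiguity described above can be illustrated with a small sketch. This is not Drill's or parquet-mr's actual code. The `created_by` string format, the `FIXED_IN` cutoff version, and the function name `should_ignore_statistics` are all assumptions made for illustration. The point is only that a reader deciding from the footer's writer signature cannot tell an old Drill file from an old Pig or Hive file, so it has to drop statistics for both.

```python
import re

# Assumed footer format: "<application> version <major>.<minor>.<patch> ...".
_CREATED_BY = re.compile(r"^(?P<app>\S+) version (?P<version>\d+\.\d+\.\d+)")

# Hypothetical cutoff: first library version assumed to write trustworthy statistics.
FIXED_IN = (1, 8, 0)


def should_ignore_statistics(created_by):
    """Return True when footer statistics cannot be trusted (illustrative only)."""
    if not created_by:
        # No writer signature at all: be conservative and ignore statistics.
        return True
    match = _CREATED_BY.match(created_by)
    if match is None:
        # Unparseable signature: also treated as untrustworthy.
        return True
    if match.group("app") != "parquet-mr":
        # This sketch only models the known bug in the shared writer library.
        return False
    version = tuple(int(part) for part in match.group("version").split("."))
    return version < FIXED_IN


# Both files were written through the same library, so their signatures match
# and the reader cannot tell the good statistics from the bad ones.
footer_from_old_drill = "parquet-mr version 1.6.0"
footer_from_old_hive = "parquet-mr version 1.6.0"
print(should_ignore_statistics(footer_from_old_drill))
print(should_ignore_statistics(footer_from_old_hive))
```

Under these assumptions both calls give the same answer, which is why statistics-based partition pruning is lost for the older Drill files even though their statistics were correct.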