[ 
https://issues.apache.org/jira/browse/DRILL-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002438#comment-15002438
 ] 

Jason Altekruse commented on DRILL-4070:
----------------------------------------

The fix I am planning to make is on the Drill side. I'm fairly sure the 
statistics are being written, but deserialization of the column chunk metadata 
now requires a recent version number before it will trust the statistics that 
are written. I plan to check what we write in that field and compare it with a 
file produced by one of the default object models that use the standard write 
path.
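
A minimal sketch of the kind of check described above, in Python for brevity. 
It is not Drill's or the Parquet library's actual code. The layout assumed for 
the writer string, the cutoff MIN_TRUSTED, and both function names are 
illustrative assumptions.

```python
import re
from typing import Optional, Tuple

# Assumed layout of the writer string stored in the file footer, for example
# "parquet-mr version 1.8.1 (build abc)". The pattern is illustrative only.
CREATED_BY = re.compile(r"^(?P<app>.+?) version (?P<ver>\d+(?:\.\d+){0,2})")

# Hypothetical cutoff: oldest writer version whose min/max values are trusted.
MIN_TRUSTED = (1, 8, 0)


def parse_created_by(
    created_by: Optional[str],
) -> Optional[Tuple[str, Tuple[int, ...]]]:
    """Split a writer string into (application, version), or return None."""
    if not created_by:
        return None
    match = CREATED_BY.match(created_by.strip())
    if match is None:
        return None
    parts = [int(p) for p in match.group("ver").split(".")]
    parts += [0] * (3 - len(parts))  # pad "1.10" to (1, 10, 0)
    return match.group("app"), tuple(parts)


def trust_binary_stats(created_by: Optional[str]) -> bool:
    """Trust min/max only when a recent enough version number is present."""
    parsed = parse_created_by(created_by)
    if parsed is None:
        # Missing or unparseable version: statistics are ignored, so
        # min/max would surface as null even though they were written.
        return False
    _, version = parsed
    return version >= MIN_TRUSTED
```

Under these assumptions, a writer string with no version number makes the 
reader discard statistics that are physically present in the file, which would 
match the null min/max values reported in this issue.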

I don't know how many people in the community append to Parquet files, but I 
have confirmed with [~julienledem] in an earlier discussion that writing a new 
footer should work fine. This was just a workaround for anyone with a lot of 
files already written with Drill's auto-partitioning.

> Metadata Caching : min/max values are null for varchar columns in auto 
> partitioned data
> ---------------------------------------------------------------------------------------
>
>                 Key: DRILL-4070
>                 URL: https://issues.apache.org/jira/browse/DRILL-4070
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 1.3.0
>            Reporter: Rahul Challapalli
>            Priority: Critical
>         Attachments: cache.txt, fewtypes_varcharpartition.tar.tgz
>
>
> git.commit.id.abbrev=e78e286
> The metadata cache file created contains incorrect values for the min/max 
> fields of varchar columns. The data is also partitioned on the varchar column.
> {code}
> refresh table metadata fewtypes_varcharpartition;
> {code}
> As a result, partition pruning is not happening. This was working after 
> DRILL-3937 was fixed (d331330efd27dbb8922024c4a18c11e76a00016b).
> I attached the data set and the cache file.



