[ https://issues.apache.org/jira/browse/ARROW-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924382#comment-15924382 ]
Wes McKinney commented on ARROW-539:
------------------------------------
I recommend using either Spark or Impala + Ibis to generate one. Here's a
Docker image you can pull to run Impala:
https://github.com/cloudera/ibis/blob/master/circle.yml#L43
Here are some examples of creating partitioned tables in Impala + HDFS with Ibis:
https://github.com/cloudera/ibis/blob/master/ibis/impala/tests/test_partition.py#L58
Let me generate a quick example tarball to attach to this JIRA.
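
For context, the "standard partition directory scheme" is assumed here to mean the Hive-style layout that Spark and Impala write, where each partition column becomes a key=value directory level. The sketch below is only an illustration in plain Python (standard library only, it does not read any Parquet data); the function names and the year/month columns are hypothetical and are not part of pyarrow's API.

```python
import os
import tempfile


def parse_partition_path(path, root):
    """Return [(key, value), ...] from the key=value directories between root and the file."""
    rel_dir = os.path.dirname(os.path.relpath(path, root))
    pairs = []
    if not rel_dir:
        # File sits directly under root: flat (non-partitioned) layout.
        return pairs
    for part in rel_dir.split(os.sep):
        key, sep, value = part.partition("=")
        if not sep or not key:
            raise ValueError("not a key=value partition directory: %r" % part)
        pairs.append((key, value))
    return pairs


def discover_files(root):
    """Walk root and return sorted (relative_path, partition_pairs) for each .parquet file."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".parquet"):
                full = os.path.join(dirpath, name)
                found.append((os.path.relpath(full, root),
                              parse_partition_path(full, root)))
    return sorted(found)


if __name__ == "__main__":
    # Build a throwaway tree with empty placeholder files (not real Parquet).
    root = tempfile.mkdtemp()
    for year, month in [("2016", "12"), ("2017", "1")]:
        leaf = os.path.join(root, "year=" + year, "month=" + month)
        os.makedirs(leaf)
        open(os.path.join(leaf, "part-0.parquet"), "w").close()
    for rel_path, partitions in discover_files(root):
        print(rel_path, partitions)
```

Partition values come back as strings; inferring column types from them is left out of this sketch.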
> [Python] Support reading Parquet datasets with standard partition directory
> schemes
> -----------------------------------------------------------------------------------
>
> Key: ARROW-539
> URL: https://issues.apache.org/jira/browse/ARROW-539
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Wes McKinney
>
> Currently, we only support multi-file directories with a flat structure
> (non-partitioned).
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)