[Parquet] How to write hive partitioning with partitioning keys in the file

Haocheng Liu Fri, 01 Dec 2023 12:04:18 -0800

Hi community,

Hope this email finds you well. Can anyone advise how to write a
hive-partitioned dataset with the partitioning keys *in the file*? Right now
only the non-key columns are written to each file; the partitioning keys
appear only in the directory names.


Both Python pyarrow.dataset.write_dataset(...)
<https://arrow.apache.org/cookbook/py/io.html#writing-partitioned-datasets> and
the C++ FileSystemDatasetWriteOptions(...)
<https://arrow.apache.org/docs/cpp/dataset.html#reading-and-writing-partitioned-data>
have this behavior. I could not find a way to change it in the docs. If it's
not possible, which file should I check to extend it in the C++ code? I
can contribute the change to the GitHub trunk.

Related email thread
<https://lists.apache.org/thread/dkjq103wn9j461zx3lp9dqsoqtthjzon>:
"[parquet][Iceberg] Should hive partition keys appear as corresponding
columns in the file"

Best,
Haocheng
