This is an automated email from the ASF dual-hosted git repository.
emkornfield pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git
The following commit(s) were added to refs/heads/master by this push:
new 9621f8c GH-541: Document status of file_path (#542)
9621f8c is described below
commit 9621f8cd460d5a74a4afd20cd028ad5847b6f235
Author: emkornfield <[email protected]>
AuthorDate: Mon Feb 2 16:14:54 2026 -0800
GH-541: Document status of file_path (#542)
---
src/main/thrift/parquet.thrift | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index 7ff9b9f..a9e62cc 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -963,6 +963,21 @@ union ColumnCryptoMetaData {
struct ColumnChunk {
/** File where column data is stored. If not set, assumed to be same file as
* metadata. This path is relative to the current file.
+ *
+ * As of December 2025, the only known use-case for this field is writing summary
+ * parquet files (i.e. "_metadata" files). These files consolidate footers from
+ * multiple parquet files to allow for efficient reading of footers to avoid file
+ * listing costs and prune out files that do not need to be read based on statistics.
+ *
+ * These files do not appear to have ever been formally specified in the specification.
+ * and are potentially problematic from a correctness perspective [1].
+ *
+ * [1] https://lists.apache.org/thread/ootf2kmyg3p01b1bvplpvp4ftd1bt72d
+ *
+ * There is no other known usage of this field. Specifically, there are no known
+ * reference implementations that will read externally stored column data if this field is populated
+ * within a standard parquet file. Making use of the field for this purpose is
+ * not considered part of the Parquet specification.
**/
1: optional string file_path