This is an automated email from the ASF dual-hosted git repository.
maplefu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-testing.git
The following commit(s) were added to refs/heads/master by this push:
new 9b48ff4 Add a Parquet file with column chunk key-value metadata (#49)
9b48ff4 is described below
commit 9b48ff4f94dc5e89592d46a119884dbb88100884
Author: Chungmin Lee <[email protected]>
AuthorDate: Sun Jul 21 00:43:59 2024 -0700
Add a Parquet file with column chunk key-value metadata (#49)
* Add a Parquet file with column chunk key-value metadata
This file has a single row group with 0 rows and 1 column. The column
chunk has key-value metadata, with a key "foo" mapped to a value "bar".
Created with this code:
```c++
PARQUET_ASSIGN_OR_THROW(
    auto sink, arrow::io::FileOutputStream::Open(
                   "column-chunk-key-value-metadata.parquet"));
parquet::ParquetFileWriter::Open(
    sink, std::static_pointer_cast<parquet::schema::GroupNode>(
              parquet::schema::GroupNode::Make(
                  "schema", parquet::Repetition::REQUIRED,
                  {parquet::schema::PrimitiveNode::Make(
                      "column1", parquet::Repetition::OPTIONAL,
                      parquet::Type::INT32)})))
    ->AppendRowGroup()
    ->NextColumn()
    ->key_value_metadata()
    .Append("foo", "bar");
```
* Rename to match the prevalent style
* Make it 2 columns
* Update data/README.md
* Add a KeyValue entry without Value
* Update data/README.md
Co-authored-by: mwish <[email protected]>
* Update README.md
* Update README.md
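The entry added without a value can be read in two ways, depending on the client. The following is a minimal sketch of that difference, not code from this commit: the `KeyValue` struct, `ExampleMetadata`, and `ValueOrEmpty` are hypothetical names that only model a required key with an optional value, and no Parquet library is used.

```c++
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one key-value metadata entry:
// the key is always present, the value may be absent.
struct KeyValue {
  std::string key;
  std::optional<std::string> value;
};

// The two entries described for column_chunk_key_value_metadata.parquet:
// "foo" maps to "bar", and "thisiskeywithoutvalue" has no value.
std::vector<KeyValue> ExampleMetadata() {
  return {{"foo", "bar"}, {"thisiskeywithoutvalue", std::nullopt}};
}

// A client that cannot represent a missing value may collapse it to "".
// After this mapping, "no value" and "empty value" are indistinguishable.
std::string ValueOrEmpty(const KeyValue& kv) {
  return kv.value.value_or("");
}
```

A reader that keeps the `std::optional` can tell the second entry apart from one whose value is an empty string; a reader that applies something like `ValueOrEmpty` cannot, so tests against this file should allow for either result.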
---------
Co-authored-by: mwish <[email protected]>
---
data/README.md | 3 ++-
data/column_chunk_key_value_metadata.parquet | Bin 0 -> 400 bytes
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/data/README.md b/data/README.md
index 2782a93..70bfb21 100644
--- a/data/README.md
+++ b/data/README.md
@@ -51,6 +51,7 @@
| concatenated_gzip_members.parquet | 513 UINT64 numbers compressed using 2 concatenated gzip members in a single data page |
| byte_stream_split.zstd.parquet | Standard normals with `BYTE_STREAM_SPLIT` encoding. See [note](#byte-stream-split) below |
| incorrect_map_schema.parquet | Contains a Map schema without explicitly required keys, produced by Presto. See [note](#incorrect-map-schema) |
+| column_chunk_key_value_metadata.parquet | two INT32 columns, one with column chunk key-value metadata {"foo": "bar", "thisiskeywithoutvalue": null} note that the second key "thisiskeywithoutvalue", does not have a value, but the value can be mapped to an empty string "" when read depending on the client |
TODO: Document what each file is in the table above.
@@ -425,4 +426,4 @@ message hive_schema {
}
}
}
-```
\ No newline at end of file
+```
diff --git a/data/column_chunk_key_value_metadata.parquet b/data/column_chunk_key_value_metadata.parquet
new file mode 100644
index 0000000..bcaf871
Binary files /dev/null and b/data/column_chunk_key_value_metadata.parquet differ