liquidcarbon commented on issue #447:
URL: https://github.com/apache/parquet-format/issues/447#issuecomment-2311019015

   Hi! I'm confused by different types of metadata.
   
   1) There's key-value metadata of type BLOB that you can write via schema
metadata (in Python, `pyarrow.table(data, metadata=...)`) and read back the
same way, or through
https://duckdb.org/docs/data/parquet/metadata.html#parquet-key-value-metadata
   2) There's a separate place, with the fields listed here, that you can read with
https://duckdb.org/docs/data/parquet/metadata.html#parquet-metadata
   ```
   ['file_name', 'row_group_id', 'row_group_num_rows',
    'row_group_num_columns', 'row_group_bytes', 'column_id', 'file_offset',
    'num_values', 'path_in_schema', 'type', 'stats_min', 'stats_max',
    'stats_null_count', 'stats_distinct_count', 'stats_min_value',
    'stats_max_value', 'compression', 'encodings', 'index_page_offset',
    'dictionary_page_offset', 'data_page_offset', 'total_compressed_size',
    'total_uncompressed_size', 'key_value_metadata']
   ```
   Column-level information and data types are recorded here. But how do you
write something into THAT `key_value_metadata`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

