andygrove commented on code in PR #20370:
URL: https://github.com/apache/datafusion/pull/20370#discussion_r2828908468
##########
datafusion/datasource-parquet/src/metadata.rs:
##########
@@ -68,6 +68,55 @@ pub struct DFParquetMetadata<'a> {
file_metadata_cache: Option<Arc<dyn FileMetadataCache>>,
/// timeunit to coerce INT96 timestamps to
pub coerce_int96: Option<TimeUnit>,
+ /// Whether to extract and use Parquet field IDs for column resolution
+ pub enable_field_ids: bool,
+}
+
+/// Extracts Parquet field IDs and stores them in Arrow field metadata
+/// under the key "PARQUET:field_id"
+///
+/// # Limitations
+///
+/// TODO: Currently only supports flat schemas (top-level primitive fields).
Review Comment:
What happens if I enable the feature and try to read a Parquet file with
complex types?
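
To make the question concrete, here is a minimal sketch of two ways a flat-only extractor could behave when it meets a complex column. It is not the PR's code: `SchemaNode`, `OnComplex`, and `extract_field_ids` are hypothetical stand-ins that use only the standard library, and only the metadata key "PARQUET:field_id" is taken from the doc comment above.

```rust
use std::collections::HashMap;

/// Metadata key named in the doc comment under review.
const FIELD_ID_KEY: &str = "PARQUET:field_id";

/// Hypothetical stand-in for a Parquet schema node (not the parquet crate's type).
enum SchemaNode {
    Primitive { name: String, field_id: Option<i32> },
    Group { name: String, children: Vec<SchemaNode> },
}

/// Hypothetical policy for complex (group) columns.
#[derive(Clone, Copy)]
enum OnComplex {
    /// Keep the column but attach no field-ID metadata.
    Skip,
    /// Refuse to read the schema at all.
    Error,
}

/// Returns (column name, metadata) pairs for the top-level fields.
/// Primitive fields get their ID stored under FIELD_ID_KEY; group
/// fields are handled according to `policy`.
fn extract_field_ids(
    fields: &[SchemaNode],
    policy: OnComplex,
) -> Result<Vec<(String, HashMap<String, String>)>, String> {
    let mut out = Vec::new();
    for node in fields {
        match node {
            SchemaNode::Primitive { name, field_id } => {
                let mut metadata = HashMap::new();
                if let Some(id) = field_id {
                    metadata.insert(FIELD_ID_KEY.to_string(), id.to_string());
                }
                out.push((name.clone(), metadata));
            }
            SchemaNode::Group { name, children } => match policy {
                // Nested IDs are silently dropped under this policy.
                OnComplex::Skip => out.push((name.clone(), HashMap::new())),
                OnComplex::Error => {
                    return Err(format!(
                        "field IDs are not supported for complex column '{}' ({} children)",
                        name,
                        children.len()
                    ));
                }
            },
        }
    }
    Ok(out)
}
```

With `OnComplex::Skip`, a struct column would come back without any field-ID metadata, so ID-based resolution would quietly not apply to it. With `OnComplex::Error`, the read would fail up front. Which of these (if either) the PR does is exactly what I am asking.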
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]