nrc commented on issue #12010:
URL: https://github.com/apache/datafusion/issues/12010#issuecomment-2295097808

   I'm not sure this will be too useful, but our code for creating a 
`ParquetExec` is something like
   
   ```
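   // Describe the scan: object store, file schema, projection, limit,
   // partition columns, and the files to read.
   let file_scan_config = FileScanConfig::new(self.object_store_url.clone(), self.file_schema.clone())
       .with_projection(projection.cloned())
       .with_limit(limit)
       .with_table_partition_cols(partition_fields)
       .with_file_groups(file_groups);
   
   // Reader factory that reads Parquet files from our object store.
   let reader_factory = DefaultParquetFileReaderFactory::new(self.object_store.clone());
   
   // Build the `ParquetExec` with the predicate and the reader factory.
   let exec = ParquetExecBuilder::new(file_scan_config)
       .with_predicate(predicate.clone())
       .with_parquet_file_reader_factory(Arc::new(reader_factory))
       .build_arc();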
   ```
   
   You can call `ParquetExecBuilder::with_schema_adapter_factory` to supply a 
schema adapter to the `ParquetExec` indirectly, via a factory. However, you 
can't pass the 'target' schema to the schema adapter yourself: that happens in 
`<ParquetExec as ExecutionPlan>::execute`, and the schema passed is always the 
file schema from the `FileScanConfig`. We could pass an adjusted schema to 
`FileScanConfig::new`, but I believe that schema is used for other things, so 
doing this would cause errors elsewhere. We could write a schema adapter 
factory that creates a custom schema adapter, and in that adapter ignore the 
passed-in schema and target our own, but that seems like a Bad Idea.
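
   To make that last workaround concrete, here is a rough sketch of its shape. 
It deliberately does not use DataFusion's real traits: `Schema`, 
`SchemaAdapter`, `SchemaAdapterFactory`, `FixedTargetAdapter`, 
`OverridingFactory`, and `schema` are all hypothetical stand-ins defined here 
for illustration only, and the real trait signatures differ. The only point is 
that the factory is handed one schema by the scan and silently targets another.

```rust
// Toy stand-ins, NOT DataFusion's real types: a "schema" is a list of column names.
type Schema = Vec<String>;

/// Stand-in for a schema adapter: maps a file's columns onto a target schema.
trait SchemaAdapter {
    /// For each target column, the index of that column in the file, if present.
    fn map_schema(&self, file_schema: &Schema) -> Vec<Option<usize>>;
    /// The schema this adapter is actually targeting.
    fn target(&self) -> &Schema;
}

/// Stand-in for a schema adapter factory. The caller (the scan) decides which
/// schema is passed in; the factory does not get a say.
trait SchemaAdapterFactory {
    fn create(&self, schema_from_scan: Schema) -> Box<dyn SchemaAdapter>;
}

/// Adapter that targets a fixed schema chosen up front.
struct FixedTargetAdapter {
    target: Schema,
}

impl SchemaAdapter for FixedTargetAdapter {
    fn map_schema(&self, file_schema: &Schema) -> Vec<Option<usize>> {
        self.target
            .iter()
            .map(|col| file_schema.iter().position(|f| f == col))
            .collect()
    }

    fn target(&self) -> &Schema {
        &self.target
    }
}

/// The workaround: a factory that carries its own target schema.
struct OverridingFactory {
    our_target: Schema,
}

impl SchemaAdapterFactory for OverridingFactory {
    fn create(&self, _schema_from_scan: Schema) -> Box<dyn SchemaAdapter> {
        // Deliberately ignores the schema the scan passed in and uses our own.
        Box::new(FixedTargetAdapter {
            target: self.our_target.clone(),
        })
    }
}

/// Small helper to build a toy schema from string literals.
fn schema(cols: &[&str]) -> Schema {
    cols.iter().map(|c| c.to_string()).collect()
}
```

   In this toy version, whatever the scan passes to `create` is dropped, so 
the adapter and the rest of the plan can disagree about the target schema. 
That mismatch is why the approach seems fragile.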

