rok commented on code in PR #16738:
URL: https://github.com/apache/datafusion/pull/16738#discussion_r2313569898
##########
datafusion/datasource-parquet/src/file_format.rs:
##########
@@ -1654,7 +1636,8 @@ async fn output_single_parquet_file_parallelized(
object_store_writer: Box<dyn AsyncWrite + Send + Unpin>,
data: Receiver<RecordBatch>,
output_schema: Arc<Schema>,
- parquet_props: &WriterProperties,
+ writer_properties: &WriterProperties,
+ skip_arrow_metadata: bool,
Review Comment:
I've
[removed](https://github.com/apache/datafusion/pull/16738/commits/96cf914d6ee1a71ab49157ed3dca4e8c08d6a690)
the logic that passes `skip_arrow_metadata`, and I now get the error below ([CI
link](https://github.com/apache/datafusion/actions/runs/17373711059/job/49315190980?pr=16738)):
```bash
---- datasource::file_format::parquet::tests::parquet_sink_write_insert_schema_into_metadata stdout ----
thread 'datasource::file_format::parquet::tests::parquet_sink_write_insert_schema_into_metadata' panicked at datafusion/core/src/datasource/file_format/parquet.rs:1592:9:
assertion `left == right` failed
  left: [KeyValue { key: "ARROW:schema", value: Some("/////5QAAAAQAAAAAAAKAAwACgAJAAQACgAAABAAAAAAAQQACAAIAAAABAAIAAAABAAAAAIAAAA8AAAABAAAANz///8UAAAADAAAAAAAAAUMAAAAAAAAAMz///8BAAAAYgAAABAAFAAQAAAADwAEAAAACAAQAAAAGAAAAAwAAAAAAAAFEAAAAAAAAAAEAAQABAAAAAEAAABhAAAA") }, KeyValue { key: "my-data", value: Some("stuff") }, KeyValue { key: "my-data-bool-key", value: None }]
 right: [KeyValue { key: "my-data", value: Some("stuff") }, KeyValue { key: "my-data-bool-key", value: None }]
```
I think using `ArrowRowGroupWriterFactory` has changed the way the non-encrypted
path passes the schema to the column writers. I'll take a look at this once the
upstream change is merged.
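
For anyone following along, here is a minimal std-only sketch of what the failing assertion is checking. It is not DataFusion or parquet code: `KeyValue` here is a simplified stand-in, and `footer_metadata` is a hypothetical helper I made up for illustration. The assumption is only that a writer adds an `ARROW:schema` entry to the footer key-value metadata unless `skip_arrow_metadata` is set.

```rust
/// Simplified stand-in for a Parquet footer key-value entry
/// (assumption: only the two fields visible in the test output matter).
#[derive(Debug, Clone, PartialEq)]
pub struct KeyValue {
    pub key: String,
    pub value: Option<String>,
}

/// Local constant for the metadata key seen in the failing assertion.
pub const ARROW_SCHEMA_META_KEY: &str = "ARROW:schema";

/// Hypothetical helper: returns the metadata a writer would emit.
/// When `skip_arrow_metadata` is false, the encoded Arrow schema is
/// added ahead of the user-supplied entries; when true, only the
/// user-supplied entries are returned.
pub fn footer_metadata(
    user_metadata: &[KeyValue],
    encoded_schema: &str,
    skip_arrow_metadata: bool,
) -> Vec<KeyValue> {
    let mut out = Vec::with_capacity(user_metadata.len() + 1);
    if !skip_arrow_metadata {
        out.push(KeyValue {
            key: ARROW_SCHEMA_META_KEY.to_string(),
            value: Some(encoded_schema.to_string()),
        });
    }
    out.extend_from_slice(user_metadata);
    out
}
```

In this model, the `left` side of the failure corresponds to calling `footer_metadata` with `skip_arrow_metadata = false`, while the test expects the result for `true`. That would be consistent with the flag no longer reaching the writer, though I have not confirmed that this is the actual cause.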
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]