alamb commented on issue #7482:
URL: https://github.com/apache/arrow-datafusion/issues/7482#issuecomment-1707105827

   BTW I found another workaround in IOx. We had configured a `FileScanConfig` like this (with `output_ordering: vec![vec![]]`):
   
   ```rust
           let base_config = FileScanConfig {
               object_store_url: self.object_store_url.clone(),
               file_schema: schema,
               file_groups: vec![vec![PartitionedFile {
                   object_meta: self.object_meta.clone(),
                   partition_values: vec![],
                   range: None,
                   extensions: None,
               }]],
               statistics: Statistics::default(),
               projection: None,
               limit: None,
               table_partition_cols: vec![],
               // Parquet files ARE actually sorted but we don't care here since we just construct a `collect` plan.
               output_ordering: vec![vec![]],
               infinite_source: false,
           };
   ```
   
   I could stop the crashes like this:
   
   ```diff
   diff --git a/parquet_file/src/storage.rs b/parquet_file/src/storage.rs
   index 285e272f7..c520e3bd0 100644
   --- a/parquet_file/src/storage.rs
   +++ b/parquet_file/src/storage.rs
   @@ -137,7 +137,7 @@ impl ParquetExecInput {
                limit: None,
                table_partition_cols: vec![],
                // Parquet files ARE actually sorted but we don't care here since we just construct a `collect` plan.
   -            output_ordering: vec![vec![]],
   +            output_ordering: vec![],
                infinite_source: false,
            };
            let exec = ParquetExec::new(base_config, None, None);
   ```
   
   I do think this illustrates why having a dedicated structure to encapsulate the output orderings might be nice, though it is definitely not necessary.
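
To make the idea concrete, here is a minimal sketch of what such a structure could look like. It is not a DataFusion type: the name `OutputOrderings` and its methods are hypothetical, and plain `String` column names stand in for the real sort expressions. The only point is that the constructor drops empty orderings, so `vec![vec![]]` and `vec![]` end up meaning the same thing.

```rust
/// Hypothetical wrapper around a list of output orderings.
/// Each ordering is a list of sort keys; `String` column names are
/// an assumption standing in for real sort expressions.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct OutputOrderings {
    // Invariant: no inner ordering is empty.
    orderings: Vec<Vec<String>>,
}

impl OutputOrderings {
    /// Build from raw orderings, discarding any that are empty so that
    /// "one empty ordering" is normalized to "no ordering at all".
    pub fn new(raw: Vec<Vec<String>>) -> Self {
        let orderings = raw.into_iter().filter(|o| !o.is_empty()).collect();
        Self { orderings }
    }

    /// True when no ordering is declared.
    pub fn is_unordered(&self) -> bool {
        self.orderings.is_empty()
    }

    /// Read-only view of the normalized orderings.
    pub fn as_slice(&self) -> &[Vec<String>] {
        &self.orderings
    }
}
```

With something like this, `OutputOrderings::new(vec![vec![]])` compares equal to `OutputOrderings::new(vec![])`, so callers cannot accidentally construct the degenerate case.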

