GitHub user alamb added a comment to the discussion: How does 'sort' interact 
with record batches?

I think you might be able to get what you want by running a query for each file, something like:

```rust
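ctx
    .read_parquet("file1.parquet", ParquetReadOptions::default())
    .await?
    // Number the rows; DATA_FUSION_ROW_NUMBER is assumed to be a string constant you define
    .window(vec![row_number().alias(DATA_FUSION_ROW_NUMBER)])?
    // sort(asc, nulls_first): ascending, nulls first
    .sort(vec![ident("userPrimarkyKey").sort(true, true)])?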
```

You will also likely have to set the [`datafusion.execution.target_partitions`
config setting](https://datafusion.apache.org/user-guide/configs.html) to 1.
GitHub link: 
https://github.com/apache/datafusion/discussions/15711#discussioncomment-12979943

----
This is an automatically sent email for github@datafusion.apache.org.
To unsubscribe, please send an email to: 
github-unsubscr...@datafusion.apache.org

