ethan-tyler commented on issue #6051:
URL: https://github.com/apache/datafusion/issues/6051#issuecomment-3787502580

   +1 - This is the “file identity” half we need for file-aware table format 
semantics.
   
   We hit an issue in delta-rs where DataFusion coalesced batches across file 
boundaries (fixed in next-scan). It showed that relying on implicit stream 
ordering is fragile for deletion vector (DV) semantics and log replay. See 
delta-io/delta-rs#4115.
   
   `input_file_name()` can get us to a stable `(file, …)` key. For fully 
order-insensitive DV reads/writes we would also need `(…, row_position)` 
(0-based ordinal within the file). See #13261.
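   
   A rough sketch of why the full `(file, row_position)` key matters, in plain 
Rust with no DataFusion or delta-rs APIs. `TaggedRow`, `DeletionVectors`, and 
`apply_deletion_vectors` are hypothetical names used only for illustration, and 
the deletion vector is modeled as a simple set of row positions per file, which 
is an assumption rather than the real on-disk format:

```rust
use std::collections::{HashMap, HashSet};

/// A row tagged with its source file and its 0-based position in that file.
/// Hypothetical stand-ins for what `input_file_name()` and a row-position
/// column would provide.
#[derive(Debug, Clone, PartialEq)]
pub struct TaggedRow {
    pub file: String,
    pub row_position: u64,
    pub value: i64,
}

/// Deletion vectors modeled as: file path -> set of deleted row positions.
pub type DeletionVectors = HashMap<String, HashSet<u64>>;

/// Keep only rows whose (file, row_position) key is not marked deleted.
/// Each row carries its own key, so the set of surviving rows does not
/// depend on arrival order or on how batches were coalesced.
pub fn apply_deletion_vectors(rows: Vec<TaggedRow>, dvs: &DeletionVectors) -> Vec<TaggedRow> {
    rows.into_iter()
        .filter(|r| {
            dvs.get(&r.file)
                .map_or(true, |deleted| !deleted.contains(&r.row_position))
        })
        .collect()
}
```

   With this shape, rows from different files can arrive interleaved or 
reordered and the same rows are still dropped, which is the property that 
implicit stream ordering cannot guarantee.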
   
   This would be a big help for us on the delta-rs side. Happy to take this on 
or help however needed. Let me know what would be most useful.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

