ethan-tyler commented on issue #6051: URL: https://github.com/apache/datafusion/issues/6051#issuecomment-3787502580
+1 - This is the “file identity” half we need for file-aware table format semantics. We hit an issue in delta-rs where DataFusion coalesced batches across file boundaries (fixed in next-scan), which showed that relying on implicit stream ordering is fragile for deletion vector (DV) semantics and log replay; see delta-io/delta-rs#4115. `input_file_name()` can get us to a stable `(file, …)` key. For fully order-insensitive DV reads/writes we would also need `(…, row_position)` (0-based file ordinal). See #13261. This would be a big help for us on the delta-rs side. Happy to take this on or help however needed. Let me know what would be most useful.
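
To illustrate why an explicit `(file, row_position)` key removes the dependence on stream ordering, here is a minimal sketch in plain Python. It is not DataFusion or delta-rs code: the batch layout, the `apply_deletion_vectors` helper, and the file names are all hypothetical, and it assumes each batch carries its file name and the 0-based position of its first row.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical batch shape: (file_name, first_row_position, values).
Batch = Tuple[str, int, List[str]]


def apply_deletion_vectors(
    batches: List[Batch], deleted: Dict[str, Set[int]]
) -> List[Tuple[str, int, str]]:
    """Drop rows whose (file, row_position) key is marked deleted.

    Each row is identified by its file and 0-based position within that
    file, so the result does not depend on the order batches arrive in.
    """
    kept = []
    for file_name, start, values in batches:
        dv = deleted.get(file_name, set())
        for offset, value in enumerate(values):
            position = start + offset
            if position not in dv:
                kept.append((file_name, position, value))
    # Sort only so that results from different arrival orders compare equal.
    return sorted(kept)


batches = [
    ("a.parquet", 0, ["a0", "a1"]),
    ("b.parquet", 0, ["b0", "b1", "b2"]),
    ("a.parquet", 2, ["a2"]),
]
deleted = {"a.parquet": {1}, "b.parquet": {0, 2}}

in_order = apply_deletion_vectors(batches, deleted)
shuffled = apply_deletion_vectors(list(reversed(batches)), deleted)
```

Both calls return the same rows even though the batches arrive interleaved across files in one case and reversed in the other, which is the property that implicit stream ordering cannot guarantee.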
