wgtmac commented on issue #40800: URL: https://github.com/apache/arrow/issues/40800#issuecomment-2021987306
> The reason I'm asking is that I keep track of what parts of my whole dataset have been processed based on the existence of parquet files. If a parquet file is missing, that part of the data would be re-processed.

IMHO, this looks hacky because it depends on an implementation detail of the underlying file system, which the IO interface abstraction aims to hide. It would be better to catch any exception on the application side and explicitly rerun the parts that failed.
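
To illustrate the suggestion of catching exceptions on the application side, here is a minimal sketch in Python. It is not taken from the issue: `run_with_retries`, `process_part`, and `max_attempts` are hypothetical names, and `process_part` stands in for whatever code computes and writes one Parquet file. The idea is that failures are recorded from the exceptions themselves, not inferred from which files exist afterwards.

```python
from typing import Callable, Iterable


def run_with_retries(
    parts: Iterable[str],
    process_part: Callable[[str], None],
    max_attempts: int = 3,
) -> dict[str, Exception]:
    """Process every part, retrying the ones that raise.

    Returns a mapping from each part that still fails after
    `max_attempts` rounds to the last exception it raised.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    pending = list(parts)
    failures: dict[str, Exception] = {}
    for _ in range(max_attempts):
        failures = {}
        for part in pending:
            try:
                # Hypothetical callback: e.g. compute one chunk and write its Parquet file.
                process_part(part)
            except Exception as exc:
                # Track the failure explicitly; do not rely on file existence.
                failures[part] = exc
        if not failures:
            break
        # Only the parts that failed in this round are retried in the next one.
        pending = list(failures)
    return failures
```

A caller might pass the list of dataset partitions as `parts` and then log or persist the returned mapping, so that a later run knows exactly which parts to reprocess regardless of how the underlying file system handles partially written files.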
