irenjj commented on issue #13323: URL: https://github.com/apache/datafusion/issues/13323#issuecomment-2483134799
It seems hard to control the behavior of `write_parquet` through `single_file_output` (and I've noticed that it seems to never be used). What really controls whether a single output file is generated is the suffix of the path (checked in `start_demuxer_task()`). There are several ways I can think of to handle this issue:

1. Add a suffix such as `.single` to the paths that require a single output file, and then recognize this suffix in `start_demuxer_task()`.
2. Drop `single_file_output` from `DataFrameWriteOptions` and use `FileSinkConfig` instead to control single-file behavior.

cc @alamb @sergiimk @dhegberg
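To make option 1 concrete, here is a rough sketch of what recognizing a marker suffix could look like. It is standalone Rust and not DataFusion code: `SINGLE_FILE_MARKER`, `OutputMode`, and `resolve_output_mode` are hypothetical names made up for illustration, and the real logic in `start_demuxer_task()` is not reproduced here.

```rust
/// Hypothetical marker suffix from option 1 (an assumption, not an existing
/// DataFusion convention).
const SINGLE_FILE_MARKER: &str = ".single";

/// Hypothetical result type: how the writer should lay out its output.
#[derive(Debug, PartialEq)]
enum OutputMode {
    /// Write exactly one file at the given path (marker already stripped).
    SingleFile(String),
    /// Treat the path as a base location that may receive several files.
    MultipleFiles(String),
}

/// Decide the output mode from the path alone.
/// A path ending in the marker requests a single file; the marker is
/// removed so it never appears in the final file name.
fn resolve_output_mode(path: &str) -> OutputMode {
    match path.strip_suffix(SINGLE_FILE_MARKER) {
        // Require a non-empty remainder so a bare ".single" is not
        // turned into an empty path.
        Some(stripped) if !stripped.is_empty() => {
            OutputMode::SingleFile(stripped.to_string())
        }
        _ => OutputMode::MultipleFiles(path.to_string()),
    }
}
```

A caller would pass something like `out/data.single` to request one file at `out/data`. The downside of this approach is that the intent stays encoded in the path string, which is part of why option 2 (an explicit setting on `FileSinkConfig`) may be the cleaner choice.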
