Dandandan commented on PR #1216: URL: https://github.com/apache/datafusion-ballista/pull/1216#issuecomment-2926981538
As far as I can see, we don't have to validate the IPC files:

* Ballista has control over writing the output
* In a power down scenario where the file is being written but the stage is not yet completed, we will / should not read the resulting files. As the stage is never completed, the files may be ignored (or cleaned up).

> Maybe we can support Job recover and reuse the partition file in the future

I think it makes sense to support job recovery either by stage or by completed task rather than per IPC file. Supporting it per partial IPC file seems like it might be complicated, as it would need to keep track of input and execution state, which is lost when powering down as well.
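To illustrate the point about incomplete stages, here is a rough sketch of how output files could be split by stage completion instead of being validated one by one. `StageState`, `ShuffleFile`, and `partition_by_stage_completion` are hypothetical names made up for this example and are not Ballista's actual types or API; it also assumes the stage state that survives a restart is available as a simple map.

```rust
use std::collections::HashMap;

/// Hypothetical stage states (illustrative only, not Ballista's types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StageState {
    Running,
    Completed,
}

/// Hypothetical record of one output file written by a task.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ShuffleFile {
    stage_id: usize,
    path: String,
}

/// Splits files into (readable, stale).
/// A file is readable only if its stage is recorded as completed;
/// files from running or unknown stages are treated as stale and
/// can be ignored or cleaned up without inspecting their contents.
fn partition_by_stage_completion(
    files: &[ShuffleFile],
    stages: &HashMap<usize, StageState>,
) -> (Vec<ShuffleFile>, Vec<ShuffleFile>) {
    files
        .iter()
        .cloned()
        .partition(|f| stages.get(&f.stage_id) == Some(&StageState::Completed))
}
```

With this approach a half-written file is never opened, because the decision is made from the stage state alone. Recovery by completed task would presumably look similar, keyed by task instead of stage.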