wuchunfu commented on issue #10414: URL: https://github.com/apache/seatunnel/issues/10414#issuecomment-3824104266
> [@wuchunfu](https://github.com/wuchunfu) Thanks for confirming and for the offer to assign this to me. My understanding matches: we need periodic/continuous directory scanning, and already-read/transferred files should not be processed again. I suggest a 3-step plan:
>
> * Dedup/incremental: reuse the existing sync_mode=update (currently binary only) and expose it for FTP/SFTP/Local sources to address “no repeated transfers” first.
> * Continuous monitoring: add a continuous mode (e.g., scan_interval) to keep discovering and processing new files at runtime.
> * Operational features: add delete/backup/retention and define recovery/commit semantics to avoid data loss.
>
> If there are relevant suggestions, please supplement them with specific details.

@yzeng1618 I think your approach is fine, and I believe you can handle this well. I look forward to your contribution.
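To make the first two steps of the plan concrete, here is a minimal Python sketch of periodic directory scanning that skips files it has already processed. It is an illustration only, not SeaTunnel code: `find_new_files`, `watch`, `scan_interval_s`, and the `(size, mtime)` fingerprint are hypothetical names and choices, and the actual comparison performed by sync_mode=update may differ.

```python
import os
import time
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical fingerprint: (size in bytes, modification time in ns).
# An assumption for this sketch, not necessarily what sync_mode=update compares.
Fingerprint = Tuple[int, int]


def find_new_files(
    directory: str, seen: Dict[str, Fingerprint]
) -> List[Tuple[str, Fingerprint]]:
    """Return (path, fingerprint) for files that are new or changed relative to `seen`."""
    fresh = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if not entry.is_file():
            continue
        stat = entry.stat()
        fingerprint = (stat.st_size, stat.st_mtime_ns)
        if seen.get(entry.path) != fingerprint:
            fresh.append((entry.path, fingerprint))
    return fresh


def watch(
    directory: str,
    process: Callable[[str], None],
    scan_interval_s: float,
    max_scans: Optional[int] = None,
) -> Dict[str, Fingerprint]:
    """Scan `directory` every `scan_interval_s` seconds and process unseen files.

    `max_scans` bounds the loop so the sketch can terminate; None means run forever.
    """
    seen: Dict[str, Fingerprint] = {}
    scans = 0
    while max_scans is None or scans < max_scans:
        for path, fingerprint in find_new_files(directory, seen):
            process(path)
            # Mark as done only after processing succeeds, so a failure
            # leaves the file eligible for the next scan.
            seen[path] = fingerprint
        scans += 1
        if max_scans is None or scans < max_scans:
            time.sleep(scan_interval_s)
    return seen
```

The `seen` map lives only in memory here; a real connector would need to persist it (for example in checkpoint state) to survive restarts, which is where the recovery/commit semantics from the third step come in.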
