yzeng1618 commented on issue #10414:
URL: https://github.com/apache/seatunnel/issues/10414#issuecomment-3822082321
@wuchunfu Thanks for confirming and for the offer to assign this to me.

My understanding matches: we need periodic/continuous directory scanning, and already-read/transferred files should not be processed again.

I suggest a 3-step plan:

- Dedup/incremental: reuse existing sync_mode=update (currently binary only) and expose it for FTP/SFTP/Local sources to address “no repeated transfers” first.
- Continuous monitoring: add a continuous mode (e.g., scan_interval) to keep discovering and processing new files at runtime.
- Operational features: add delete/backup/retention and define recovery/commit semantics to avoid data loss.

Can the above three steps be implemented one by one?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
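
For illustration only, the first two steps of the plan (skip files that were already processed, then rescan on an interval) could look roughly like the sketch below. It is not SeaTunnel code: `find_unprocessed`, `run`, `scan_interval_s`, and `max_scans` are hypothetical names, and using file size plus modification time as the "already processed" fingerprint is an assumption made for this example. A file is recorded as processed only after its handler returns, which hints at the commit semantics the third step would need to define.

```python
import time
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# Assumption for this sketch: a file is "unchanged" if size and mtime match.
Fingerprint = Tuple[int, int]  # (size in bytes, mtime in nanoseconds)


def find_unprocessed(
    directory: Path, processed: Dict[str, Fingerprint]
) -> List[Tuple[Path, Fingerprint]]:
    """Return files that are new or changed relative to `processed`.

    Does not modify `processed`; the caller commits after handling a file.
    """
    pending: List[Tuple[Path, Fingerprint]] = []
    for path in sorted(directory.rglob("*")):
        if not path.is_file():
            continue
        stat = path.stat()
        fingerprint = (stat.st_size, stat.st_mtime_ns)
        if processed.get(str(path)) == fingerprint:
            continue  # already handled and unchanged since then
        pending.append((path, fingerprint))
    return pending


def run(
    directory: Path,
    handle: Callable[[Path], None],
    scan_interval_s: float = 5.0,
    max_scans: int = 1,
) -> Dict[str, Fingerprint]:
    """Scan `directory` up to `max_scans` times, handling each file once."""
    processed: Dict[str, Fingerprint] = {}
    for scan in range(max_scans):
        for path, fingerprint in find_unprocessed(directory, processed):
            handle(path)
            # Commit only after the handler succeeded, so a failed file
            # is picked up again on the next scan.
            processed[str(path)] = fingerprint
        if scan + 1 < max_scans:
            time.sleep(scan_interval_s)
    return processed
```

A real connector would also have to persist the processed set across restarts (for example in checkpoint state) and decide how to treat files that are still being written when a scan sees them; both are left out here.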
