wuchunfu commented on issue #10414:
URL: https://github.com/apache/seatunnel/issues/10414#issuecomment-3824104266

   > [@wuchunfu](https://github.com/wuchunfu) Thanks for confirming and for the
   > offer to assign this to me. My understanding matches: we need
   > periodic/continuous directory scanning, and already-read/transferred files
   > should not be processed again. I suggest a 3-step plan:
   > 
   > * Dedup/incremental: reuse existing sync_mode=update (currently binary
   >   only) and expose it for FTP/SFTP/Local sources to address "no repeated
   >   transfers" first.
   > * Continuous monitoring: add a continuous mode (e.g., scan_interval) to
   >   keep discovering and processing new files at runtime.
   > * Operational features: add delete/backup/retention and define
   >   recovery/commit semantics to avoid data loss.
   > 
   > If you have further suggestions, please add them with specific details.
   
   @yzeng1618 I think your approach is fine, and I believe you can handle this
   well. I look forward to your contribution.
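
To make the first two steps of the quoted plan concrete, here is a minimal Python sketch of periodic scanning with deduplication. It is not SeaTunnel code and does not reflect how `sync_mode=update` is implemented. `IncrementalScanner`, `run_rounds`, `scan_interval_s` and `max_rounds` are hypothetical names, and using size plus modification time as the "already processed" fingerprint is an assumption made only for illustration.

```python
import os
import time
from typing import Callable, Dict, List, Optional, Tuple

Fingerprint = Tuple[int, int]  # (size in bytes, mtime in nanoseconds)


class IncrementalScanner:
    """Hypothetical helper (not a SeaTunnel class): tracks committed files."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        # path -> fingerprint recorded after the file was processed
        self.committed: Dict[str, Fingerprint] = {}

    def pending(self) -> List[Tuple[str, Fingerprint]]:
        """Return files that are new or changed since their last commit."""
        found = []
        for root, _dirs, files in os.walk(self.directory):
            for name in files:
                path = os.path.join(root, name)
                st = os.stat(path)
                fp = (st.st_size, st.st_mtime_ns)
                if self.committed.get(path) != fp:
                    found.append((path, fp))
        return sorted(found)

    def commit(self, path: str, fp: Fingerprint) -> None:
        """Mark a file as done; called only after processing succeeded."""
        self.committed[path] = fp


def run_rounds(
    scanner: IncrementalScanner,
    process: Callable[[str], None],
    scan_interval_s: float,
    max_rounds: Optional[int] = None,
) -> None:
    """Scan, process, commit, sleep; loops forever when max_rounds is None."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for path, fp in scanner.pending():
            process(path)             # may raise; the file then stays uncommitted
            scanner.commit(path, fp)  # commit only after success
        rounds += 1
        time.sleep(scan_interval_s)
```

Because a file is committed only after `process` returns, a file whose processing raises is offered again by a later scan. That is the simplest form of the recovery/commit semantics mentioned in step 3. A real connector would presumably also need to persist the committed state across restarts, which this sketch does not attempt.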

