yzeng1618 commented on issue #10414:
URL: https://github.com/apache/seatunnel/issues/10414#issuecomment-3815798354

   Hello, I’d like to confirm the specific requirements with you. Please 
correct or add details if anything below does not match what you have in 
mind.
   
   - You want a real-time/continuous file ingestion capability for file 
sources (FTP/SFTP/HDFS/local, etc.) that (a) periodically scans/monitors a 
directory for new/updated files, (b) avoids re-processing files already 
transferred, and (c) supports post-actions such as delete after transfer, 
backup then delete, and delete by retention/expiration, plus tunables such as 
scan interval, priority queue/queue size/buffer size, and thread/concurrency. 
Is that accurate? Are you targeting a long-running streaming job or a “run 
periodically” batch job?
   
   - Current status: SeaTunnel already has a limited incremental sync mode, 
sync_mode=update, but it is only supported for file_format_type=binary, and 
it is currently effectively only exposed/usable in the HdfsFile source 
(FTP/SFTP/local do not expose the sync_mode=update option yet). It is also 
batch-style (the file list is built at startup), not real-time monitoring.
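
To make points (a) to (c) in the first item concrete, here is a minimal Python sketch of one scan pass. It is only an illustration of the requested behavior, not SeaTunnel code: the function name `scan_once`, the `seen` map, the `handle` callback, and the `post_action` values are all hypothetical, and it assumes a local directory and a "changed" check based on modification time plus size.

```python
import shutil
from pathlib import Path


def scan_once(directory, seen, handle, post_action="none", backup_dir=None):
    """Handle files that are new or changed since the previous scan.

    `seen` maps a file path to the (mtime_ns, size) pair recorded when the
    file was last handled. All names here are hypothetical, for illustration.
    Returns the names of the files handled in this pass.
    """
    handled = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        stat = path.stat()
        fingerprint = (stat.st_mtime_ns, stat.st_size)
        if seen.get(str(path)) == fingerprint:
            continue  # (b) already transferred and unchanged: skip
        handle(path)  # (a) new or updated file: transfer it
        seen[str(path)] = fingerprint
        handled.append(path.name)
        # (c) optional post-action once the transfer has succeeded
        if post_action == "backup_then_delete":
            Path(backup_dir).mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(Path(backup_dir) / path.name))
        elif post_action == "delete":
            path.unlink()
    return handled
```

A long-running streaming job would call something like this in a loop, sleeping for the scan interval between passes, whereas a "run periodically" batch job would call it once per run and would need to persist `seen` between runs.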


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
