yzeng1618 commented on issue #10414: URL: https://github.com/apache/seatunnel/issues/10414#issuecomment-3815798354
Hello, I'd like to confirm the specific requirements with you. Please correct or add details if anything below does not match what you have in mind.

- Requirement: you want a real-time/continuous file ingestion capability for file sources (FTP/SFTP/HDFS/local, etc.) that (a) periodically scans/monitors a directory for new or updated files, (b) avoids re-processing files that have already been transferred, and (c) supports post-actions such as delete after transfer, backup then delete, and delete by retention/expiration, plus tunables such as scan interval, priority queue/queue size/buffer size, and thread/concurrency. Is that accurate? Are you targeting a long-running streaming job or a "run periodically" batch job?
- Current status: SeaTunnel already has a limited incremental sync mode, sync_mode=update, but it is only supported for file_format_type=binary, and it is currently effectively only exposed/usable in the HdfsFile source (the FTP/SFTP/local sources do not expose the sync_mode=update option yet). It is also batch-style (the file list is built at startup), not real-time monitoring.
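To make sure we are describing the same behaviour, here is a rough Python sketch of points (a) to (c) as I understand them. It is not SeaTunnel code and does not reflect any existing connector API: `scan_once`, `seen`, `handle`, and `backup_dir` are hypothetical names, and it assumes a local directory and a (size, mtime) fingerprint to decide whether a file is new or updated.

```python
import shutil
import tempfile
from pathlib import Path


def scan_once(src_dir, seen, handle, backup_dir=None):
    """One polling pass over src_dir; returns the names of files processed."""
    processed = []
    for path in sorted(Path(src_dir).iterdir()):
        if not path.is_file():
            continue
        stat = path.stat()
        fingerprint = (stat.st_size, stat.st_mtime_ns)
        if seen.get(path.name) == fingerprint:
            continue  # (b) already transferred and unchanged, so skip it
        handle(path)  # hypothetical transfer step supplied by the caller
        seen[path.name] = fingerprint
        processed.append(path.name)
        if backup_dir is not None:
            # (c) post-action "backup then delete": move the file out of src_dir
            Path(backup_dir).mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(Path(backup_dir) / path.name))
    return processed


def demo():
    """Three passes: new file, unchanged file, then the same file updated."""
    with tempfile.TemporaryDirectory() as root:
        inbox = Path(root) / "inbox"
        inbox.mkdir()
        target = inbox / "a.txt"
        target.write_text("hello")
        seen = {}
        first = scan_once(inbox, seen, handle=lambda p: None)
        second = scan_once(inbox, seen, handle=lambda p: None)
        target.write_text("hello, updated")  # size changes, so fingerprint changes
        third = scan_once(inbox, seen, handle=lambda p: None)
        return first, second, third
```

In this sketch, a long-running streaming job would call `scan_once` in a loop and sleep for the scan interval between passes, while a "run periodically" batch job would call it once per run and would need to persist `seen` between runs. Which of the two you need is the main thing I would like to confirm.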
