dybyte commented on issue #10329: URL: https://github.com/apache/seatunnel/issues/10329#issuecomment-3799471744
> To design a compaction mechanism effectively, please clarify:
>
> 1. Preferred default trigger strategy (time-based/file-count/file-size thresholds)?
> 2. Should manual triggering be supported (REST API/CLI)?
> 3. Any performance tolerance limits for compaction during job execution?

Trigger strategy: Currently compaction is triggered based on total file size (compactionThreshold). Once the accumulated size of immutable files exceeds this threshold, compaction starts automatically.

Manual triggering: Manual compaction (via REST API or CLI) is not supported in this PR. We plan to provide this capability in a follow-up PR to keep the current change focused on the core storage workflow.

Performance impact during job execution: Compaction is executed in a single thread to avoid excessive resource contention. In addition, throughput can be controlled via configurable batch size and sleep interval between batches, allowing users to fine-tune the balance between compaction speed and runtime impact.
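
For readers following the thread, the behaviour described above can be illustrated with a minimal sketch. This is not the code from the PR: the class `CompactionSketch`, its methods, and the parameter names `batchSize` and `sleepMillis` are hypothetical, and only `compactionThreshold` is taken from the comment. The sketch assumes the threshold is a byte count compared against the summed size of immutable files, and it replaces the real merge work with a counter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the SeaTunnel implementation.
class CompactionSketch {
    private final long compactionThreshold; // bytes; name taken from the comment
    private final int batchSize;            // assumed: files processed per batch
    private final long sleepMillis;         // assumed: pause between batches
    private final List<Long> immutableFileSizes = new ArrayList<>();
    // A single worker thread, so at most one compaction runs at a time.
    private final ExecutorService compactor = Executors.newSingleThreadExecutor();

    CompactionSketch(long compactionThreshold, int batchSize, long sleepMillis) {
        this.compactionThreshold = compactionThreshold;
        this.batchSize = batchSize;
        this.sleepMillis = sleepMillis;
    }

    // Record a newly sealed (immutable) file by its size in bytes.
    synchronized void addImmutableFile(long sizeBytes) {
        immutableFileSizes.add(sizeBytes);
    }

    // Size-based trigger: true once the accumulated size exceeds the threshold.
    synchronized boolean shouldCompact() {
        long total = 0;
        for (long size : immutableFileSizes) {
            total += size;
        }
        return total > compactionThreshold;
    }

    // Submit a compaction if the trigger fires; otherwise return empty.
    synchronized Optional<Future<Integer>> maybeCompact() {
        if (!shouldCompact()) {
            return Optional.empty();
        }
        List<Long> snapshot = new ArrayList<>(immutableFileSizes);
        immutableFileSizes.clear();
        Callable<Integer> task = () -> compact(snapshot);
        return Optional.of(compactor.submit(task));
    }

    // Process files in batches, sleeping between batches to limit runtime impact.
    private int compact(List<Long> files) throws InterruptedException {
        int compacted = 0;
        for (int start = 0; start < files.size(); start += batchSize) {
            int end = Math.min(start + batchSize, files.size());
            compacted += end - start; // placeholder for the real merge work
            if (end < files.size()) {
                Thread.sleep(sleepMillis);
            }
        }
        return compacted;
    }

    void shutdown() {
        compactor.shutdown();
    }
}
```

In this sketch, a smaller `batchSize` or a larger `sleepMillis` slows compaction down and leaves more resources for the running job, which is the trade-off the comment refers to.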
