dybyte commented on issue #10329:
URL: https://github.com/apache/seatunnel/issues/10329#issuecomment-3799471744

   > To design a compaction mechanism effectively, please clarify:
   > 
   > 1. Preferred default trigger strategy (time-based/file-count/file-size thresholds)?
   > 2. Should manual triggering be supported (REST API/CLI)?
   > 3. Any performance tolerance limits for compaction during job execution?
   
   Trigger strategy:
   Currently, compaction is triggered by total file size 
(`compactionThreshold`). Once the accumulated size of immutable files exceeds 
this threshold, compaction starts automatically.
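
   A minimal sketch of this size-based check, for illustration only. The class 
and method names (`SizeThresholdTrigger`, `shouldCompact`) are hypothetical and 
are not taken from the PR; only the `compactionThreshold` concept comes from the 
description above, and "exceeds" is assumed to mean strictly greater than.

```java
import java.util.List;

// Hypothetical sketch: names are illustrative, not SeaTunnel APIs.
final class SizeThresholdTrigger {
    private final long compactionThresholdBytes;

    SizeThresholdTrigger(long compactionThresholdBytes) {
        if (compactionThresholdBytes <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.compactionThresholdBytes = compactionThresholdBytes;
    }

    /** Returns true once the summed size of immutable files exceeds the threshold. */
    boolean shouldCompact(List<Long> immutableFileSizesBytes) {
        long total = 0L;
        for (long size : immutableFileSizesBytes) {
            total += size;
        }
        // Assumption: a total exactly equal to the threshold does not trigger.
        return total > compactionThresholdBytes;
    }
}
```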
   
   Manual triggering:
   Manual compaction (via REST API or CLI) is not supported in this PR. We plan 
to provide this capability in a follow-up PR to keep the current change focused 
on the core storage workflow.
   
   Performance impact during job execution:
   Compaction runs on a single thread to avoid excessive resource 
contention. In addition, throughput can be controlled via a configurable batch 
size and a sleep interval between batches, allowing users to fine-tune the 
balance between compaction speed and runtime impact.
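
   The throttling described above could look roughly like the sketch below. 
This is an assumption-laden illustration, not the PR's implementation: 
`ThrottledCompactor`, `batchSize`, `sleepMillis`, and `batchWriter` are 
hypothetical names, and the real code may schedule and rewrite data differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch: names and structure are illustrative, not SeaTunnel APIs.
final class ThrottledCompactor implements AutoCloseable {
    // A single worker thread keeps compaction from competing for more than one core.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final int batchSize;
    private final long sleepMillis;

    ThrottledCompactor(int batchSize, long sleepMillis) {
        if (batchSize <= 0 || sleepMillis < 0) {
            throw new IllegalArgumentException("batchSize must be > 0 and sleepMillis >= 0");
        }
        this.batchSize = batchSize;
        this.sleepMillis = sleepMillis;
    }

    /** Processes entries in batches on the worker thread; the future yields the batch count. */
    <T> Future<Integer> submit(List<T> entries, Consumer<List<T>> batchWriter) {
        return worker.submit(() -> {
            int batches = 0;
            for (int from = 0; from < entries.size(); from += batchSize) {
                int to = Math.min(from + batchSize, entries.size());
                batchWriter.accept(new ArrayList<>(entries.subList(from, to)));
                batches++;
                // Pause between batches (not after the last one) to limit runtime impact.
                if (to < entries.size() && sleepMillis > 0) {
                    Thread.sleep(sleepMillis);
                }
            }
            return batches;
        });
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

   In this sketch, a smaller batch size or a longer sleep interval lowers the 
load on the running job at the cost of a slower compaction.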


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
