hawk9821 opened a new issue, #10995:
URL: https://github.com/apache/seatunnel/issues/10995

   ### Background
   
   SeaTunnel CI currently suffers from long execution times and a high risk of 
timeouts. A single `all-connectors-it` module can take up to **4.5 hours** to 
complete. If any submodule times out due to network flakiness or resource 
contention, the whole job has to be re-run, which can take another 4.5 hours 
and seriously delays PR merges.
   
   Some connector E2E tests are already very time-consuming:
   
   | Module | Observed duration |
   |--------|-------------------|
   | Elasticsearch | ~62 min |
   | Hbase | ~44 min |
   | Clickhouse | ~42 min |
   | Mongodb | ~35 min |
   | CDC MySQL | ~32 min |
   | Http | ~30 min |
   
   ### Problem identified in a previous attempt (PR #9976)
   
   In PR #9976, which attempted to split long-running modules into standalone 
jobs, we found a key issue:
   
   **No standardized splitting rules** – There is no documented guideline on:
   - **Trigger conditions** – When should a module be considered "too long" and 
be split out?
   - **Splitting method** – How exactly should the split be done (e.g., 
modifying `backend.yml` and `update_modules_check.py`)?
   - **Responsibility** – Who should perform the splitting? Is it suitable for 
new contributors, or should it be done by experienced ones?
   
   Without such standards, different contributors may handle similar situations 
inconsistently, which increases review complexity and the risk of 
configuration errors.
   
   ### Proposal
   
   We propose to establish a **written standard for splitting long-running 
connector E2E tests** in CI. The standard should cover:
   
   1. **Trigger condition** – e.g., any connector E2E module that consistently 
runs longer than 60 minutes, or any aggregated job that exceeds 2.5 hours, 
should be considered for splitting.
   2. **Splitting method** – a step-by-step template or checklist (e.g., copy 
an existing standalone job like `elasticsearch-connector-it`, adjust module 
names, and update both `backend.yml` and `update_modules_check.py` accordingly).
   3. **Responsibility** – clarify whether this should be done by core 
contributors (due to CI complexity) or can be done by anyone with proper 
guidance, and who should review such changes.
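
To make the trigger condition concrete, here is a minimal sketch of how such a check could look. It is not part of `update_modules_check.py`; the function names and constants are hypothetical, the thresholds are only the example values floated above, and summing durations assumes the modules in a job run sequentially. A real check would also need several runs per module to judge "consistently", which this sketch does not model.

```python
# Hypothetical thresholds: the example values from this proposal, not agreed limits.
MODULE_LIMIT_MIN = 60   # per connector E2E module
JOB_LIMIT_MIN = 150     # per aggregated job (2.5 hours)


def modules_to_split(durations_min, module_limit=MODULE_LIMIT_MIN):
    """Return the names of modules whose observed duration exceeds the per-module limit."""
    return sorted(
        name for name, minutes in durations_min.items() if minutes > module_limit
    )


def job_needs_split(durations_min, job_limit=JOB_LIMIT_MIN):
    """Return True when the summed module durations exceed the per-job limit.

    Assumes the modules run one after another inside a single job.
    """
    return sum(durations_min.values()) > job_limit


# Approximate observed durations in minutes, from the table in the Background section.
observed = {
    "Elasticsearch": 62,
    "Hbase": 44,
    "Clickhouse": 42,
    "Mongodb": 35,
    "CDC MySQL": 32,
    "Http": 30,
}

print(modules_to_split(observed))
print(job_needs_split(observed))
```

With the example thresholds, only the Elasticsearch module crosses the per-module limit, while a hypothetical job running all six listed modules back to back (about 245 minutes) would cross the per-job limit. This suggests the two thresholds may need to be tuned together.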
   
   ### Next steps / discussion points
   
   - What should the exact thresholds be (e.g., 60 min per module, 2.5h per 
parent job)?
   - Should we write a short guide / PR template checklist?
   - Who will lead the effort to draft and document the standard?


