dataroaring opened a new issue, #56974:
URL: https://github.com/apache/doris/issues/56974

   ### Search before asking
   
   - [x] I have searched the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Description
   
   When running batch jobs in Doris, for example with INSERT OVERWRITE, 
Doris internally creates a temporary partition to hold the new data. After the 
write completes successfully, the temporary partition is swapped into 
the target table.
   
   However, if the Frontend (FE) crashes while data is still being written to 
the temporary partition, the job fails but the temporary partition remains in 
the table.
   As a result, when rerunning the same INSERT OVERWRITE job, Doris fails to 
create a new temporary partition with the same name. Users must manually delete 
the leftover temporary partition before rerunning the job.
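
   The manual workaround described above can be scripted. The sketch below only builds the `ALTER TABLE ... DROP TEMPORARY PARTITION` statements for a list of leftover partitions (which can be listed with `SHOW TEMPORARY PARTITIONS FROM <table>`). The helper names, the table name, and the partition name are hypothetical and chosen for illustration. Running the statements against a cluster is left to whatever client is in use.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in backticks; reject names that contain one."""
    if "`" in name:
        raise ValueError(f"unsupported identifier: {name!r}")
    return "`" + name + "`"


def drop_temp_partition_sql(table: str, partitions: list) -> list:
    """Build one DROP TEMPORARY PARTITION statement per leftover partition."""
    return [
        f"ALTER TABLE {quote_ident(table)} "
        f"DROP TEMPORARY PARTITION {quote_ident(p)};"
        for p in partitions
    ]


# Hypothetical table and partition names, for illustration only.
statements = drop_temp_partition_sql("sales", ["tmp_p_20240101"])
for stmt in statements:
    print(stmt)
```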
   
   ### Solution
   
   **Steps to Reproduce:**
   
   1. Run an INSERT OVERWRITE task on a table.
   2. Cause an FE crash during the temporary partition write phase.
   3. Restart FE and rerun the same task.
   
   The rerun fails because the temporary partition from the failed attempt 
still exists.
   
   **Expected Behavior:**
   
   Temporary partitions should be automatically cleaned up if an FE crash or 
job failure occurs before the final swap.
   
   Alternatively, Doris should be able to detect existing temporary partitions 
and safely reuse or overwrite them during a retry.
   
   **Possible Fix / Proposal:**
   
   Add a cleanup mechanism to automatically remove temporary partitions that 
are left behind from failed or interrupted INSERT OVERWRITE operations during 
FE recovery.
   
   When Doris starts up, FE could scan for temporary partitions and remove 
those that are no longer referenced by any active job.
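
   A minimal sketch of that startup scan, written as a toy model in Python rather than against the FE code base. It assumes the FE can enumerate all temporary partitions and the set still referenced by active jobs. `TempPartition`, `find_orphans`, and all table and partition names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TempPartition:
    table: str
    name: str


def find_orphans(temp_partitions, active_job_refs):
    """Return temp partitions that no active job references, in input order."""
    referenced = set(active_job_refs)
    return [p for p in temp_partitions if p not in referenced]


# State as it might look right after an FE restart (all names hypothetical).
existing = [
    TempPartition("sales", "tmp_a"),
    TempPartition("sales", "tmp_b"),
    TempPartition("orders", "tmp_c"),
]
# Only one overwrite job survived the restart and still owns its partition.
active = [TempPartition("sales", "tmp_b")]

orphans = find_orphans(existing, active)
print(orphans)
```

   Only the partitions returned by `find_orphans` would be dropped, so a job that is still running keeps its staging partition.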
   
   Alternatively, introduce idempotent retry handling, allowing Doris to detect 
and reuse existing temporary partitions when rerunning the same job safely.
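
   One way to picture the idempotent retry is the toy model below. It takes the "overwrite" variant: a leftover staging partition with the same name is discarded and recreated instead of causing a name collision. The `Table` class and `insert_overwrite` function are hypothetical stand-ins and do not reflect the Doris catalog API.

```python
class Table:
    """Toy in-memory stand-in for a table; not the Doris catalog API."""

    def __init__(self):
        self.partitions = {}       # name -> rows (visible data)
        self.temp_partitions = {}  # name -> rows (staging data)


def insert_overwrite(table, target, rows, temp_name, crash_before_swap=False):
    # Idempotent step: discard any leftover staging partition with this name
    # instead of failing on the name collision.
    table.temp_partitions.pop(temp_name, None)
    table.temp_partitions[temp_name] = list(rows)
    if crash_before_swap:
        raise RuntimeError("simulated FE crash before swap")
    # Swap: the staged rows replace the target partition's contents.
    table.partitions[target] = table.temp_partitions.pop(temp_name)


t = Table()
t.partitions["p1"] = ["old"]
try:
    insert_overwrite(t, "p1", ["new"], "tmp_p1", crash_before_swap=True)
except RuntimeError:
    pass  # the staging partition "tmp_p1" is now left behind

leftover = dict(t.temp_partitions)
insert_overwrite(t, "p1", ["new"], "tmp_p1")  # the retry succeeds
```

   After the simulated crash the target partition still holds the old rows and the staging partition lingers. The retry replaces the staging partition and completes the swap without manual cleanup.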
   
   Optionally, expose a configuration flag or system variable to control this 
cleanup behavior (e.g., auto_clean_temp_partitions=true).
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

