sercanCyberVision opened a new pull request, #6099:
URL: https://github.com/apache/hive/pull/6099

   **ROOT-CAUSE:**
   Please see https://issues.apache.org/jira/browse/HIVE-29225 for the detailed 
problem description. In short, scratch directories may be deleted prematurely 
while Hive is still streaming output to the client.
   
   **SOLUTION:**
   A new configuration property has been introduced:
   ```
   hive.scratchdir.cleanup.grace.period.hours
   ```
   This property defines a grace period to prevent cleanup of scratch 
directories that have been modified within the specified time window.
   
   - The default value is `0h`, which disables the grace period.
   - When Hive finishes output streaming, it updates the scratch directory’s 
last modified time.
   - The cleanup logic now checks this timestamp and skips deletion if the 
directory was updated within the grace period.
   
   This ensures directories are not removed prematurely while still in use.
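
As a rough illustration of the check described above, here is a minimal sketch. It is not the code from this pull request: the class and method names are hypothetical, `java.nio.file` stands in for whatever filesystem API the actual cleanup uses, and treating a value of `0` as "disabled" with a strict less-than comparison at the boundary is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative and not taken from the Hive code base.
class ScratchDirGraceSketch {

    // Returns true if the directory was modified within the grace period,
    // meaning cleanup should skip it. A non-positive grace period disables the check.
    static boolean isWithinGracePeriod(long lastModifiedMillis, long nowMillis, long gracePeriodHours) {
        if (gracePeriodHours <= 0) {
            return false; // assumption: 0 means "no grace period"
        }
        long graceMillis = TimeUnit.HOURS.toMillis(gracePeriodHours);
        return nowMillis - lastModifiedMillis < graceMillis;
    }

    // Refreshes the directory's last-modified time, as would happen
    // once output streaming has finished.
    static void touch(Path dir) throws IOException {
        Files.setLastModifiedTime(dir, FileTime.fromMillis(System.currentTimeMillis()));
    }

    // Reads the directory's last-modified time and applies the grace-period check.
    static boolean shouldSkipCleanup(Path dir, long gracePeriodHours) throws IOException {
        long lastModified = Files.getLastModifiedTime(dir).toMillis();
        return isWithinGracePeriod(lastModified, System.currentTimeMillis(), gracePeriodHours);
    }
}
```

With a grace period of one hour, a directory that was just touched would be skipped by `shouldSkipCleanup`, while the same directory would remain eligible for cleanup when the grace period is `0`.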
   
   **TESTING:**
   Added a dedicated unit test to validate the new functionality:
   <img width="1442" height="851" alt="the-test" src="https://github.com/user-attachments/assets/a73b7bc9-aa74-49be-ba4a-6d37b7d27e04" />
   
   Verified that all existing tests in `TestClearDanglingScratchDir` continue 
to pass:
   <img width="1309" height="421" alt="scratch-dir-tests" src="https://github.com/user-attachments/assets/a114c263-85c0-414a-bbb9-a94f9132d0cf" />
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

