abstractdog commented on code in PR #6376:
URL: https://github.com/apache/hive/pull/6376#discussion_r2975444614


##########
ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java:
##########
@@ -549,6 +578,39 @@ public boolean setEntryValid(CacheEntry cacheEntry, 
FetchWork fetchWork) {
         return false;
       }
 
+      if (isSafeCacheWriteEnabled) {

Review Comment:
   @ramitg254 : thanks for working on this so far
   I'm not sure the approach fully addresses what was reported: as far as I can understand, the files are now placed in a safe buffer directory, but that buffer is on the same storage as the cache, so this isn't the only issue — the change still doesn't prevent big files from actually landing on the filesystem that holds the cache
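   To illustrate the point above: a buffer directory only helps if the size limit is enforced *while* the result is being written, not after the files have already landed. A minimal, hypothetical sketch (the class and method names below are not from the Hive codebase) of bounding the bytes streamed into a cache directory could look like this:
   
   ```java
   import java.io.IOException;
   import java.io.InputStream;
   import java.io.OutputStream;
   
   // Hypothetical illustration: abort the cache write as soon as the result
   // exceeds the configured limit, instead of letting an oversized result
   // fill the cache filesystem first and cleaning up afterwards.
   public class BoundedCacheCopy {
     /**
      * Copies {@code in} to {@code out}, throwing once more than
      * {@code maxBytes} have been written. Returns the bytes copied.
      */
     public static long copyBounded(InputStream in, OutputStream out, long maxBytes)
         throws IOException {
       byte[] buf = new byte[8192];
       long total = 0;
       int n;
       while ((n = in.read(buf)) != -1) {
         total += n;
         if (total > maxBytes) {
           // fail fast: the partial file can be deleted without the
           // cache filesystem ever holding the full oversized result
           throw new IOException("result exceeds cache limit of " + maxBytes + " bytes");
         }
         out.write(buf, 0, n);
       }
       return total;
     }
   }
   ```
   
   whereas a post-hoc check (copy everything, then compare sizes) would still allow a 1.1T result to land on the cache volume before being rejected.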
   
   the original report showed something like this:
   ```
   du -h -d 1 /efs/tmp/hive/_resultscache_/results-9d89cc59-c99d-46a5-9d93-2b5505765320
   12.0K ./66356edb-57a6-4f0a-90cd-7d14d9e2b739
   ...
   1.1T ./0fe343fb-6a89-4d28-b2fd-caed2f2e42f6
   ...
   1.1T .
   ```
   
   One thing I missed double-checking before creating the jira: does the "0fe343fb-6a89-4d28-b2fd-caed2f2e42f6" folder belong to a finished query's result? If so - given that it clearly exceeded the configured 2G max cache size - the query results cache should have taken care of it, so I think the original problem/usecase should be investigated thoroughly first



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

