Akshat-Jain commented on code in PR #16481:
URL: https://github.com/apache/druid/pull/16481#discussion_r1609489542


##########
extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/output/RetryableS3OutputStream.java:
##########
@@ -199,15 +211,47 @@ private void pushCurrentChunk() throws IOException
   {
     currentChunk.close();
     final Chunk chunk = currentChunk;
-    try {
-      if (chunk.length() > 0) {
-        resultsSize += chunk.length();
+    if (chunk.length() > 0) {
+      try {
+        SEMAPHORE.acquire(); // Acquire a permit from the semaphore

Review Comment:
   > If we are trying to avoid running out of disk space, then instead of 
having semaphores, we just need to check some condition in write() that there 
is disk space available (this can be computed based on the formula that 
@cryptoe has suggested). If there isn't, we should wait for the condition to be 
satisfied before proceeding with the write.
   
   Are you suggesting a retry wrapper around a new synchronized method, 
assertEnoughDiskSpaceExists()? If so, how long should it keep retrying? I think 
that would be harder to follow and more error-prone, and the retries across 
multiple threads could significantly increase overall query duration, which 
defeats the purpose of the optimization. I'd appreciate your thoughts and 
clarification on this, thanks!
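
   To make the comparison concrete, here is a minimal sketch of the 
semaphore-based approach, assuming a fixed cap on the number of chunks allowed 
on local disk at once. The class and method names (ChunkDiskLimiter, reserve, 
tryReserve, release) are hypothetical and are not the code in this PR; only 
java.util.concurrent.Semaphore is a real API here.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch: bounds how many chunks may sit on local disk at once.
 * One permit represents one chunk-sized slot of disk space.
 */
class ChunkDiskLimiter
{
  private final Semaphore permits;
  private final AtomicInteger chunksOnDisk = new AtomicInteger();
  private final AtomicInteger peak = new AtomicInteger();

  ChunkDiskLimiter(int maxChunksOnDisk)
  {
    this.permits = new Semaphore(maxChunksOnDisk);
  }

  /** Blocks until a slot is free. No polling and no retry interval to tune. */
  void reserve() throws InterruptedException
  {
    permits.acquire();
    recordReserved();
  }

  /** Non-blocking variant: returns false immediately if no slot is free. */
  boolean tryReserve()
  {
    if (!permits.tryAcquire()) {
      return false;
    }
    recordReserved();
    return true;
  }

  /** Call once a chunk has been uploaded and its local file deleted. */
  void release()
  {
    chunksOnDisk.decrementAndGet();
    permits.release();
  }

  /** Highest number of chunks that were reserved at the same time. */
  int peakChunksOnDisk()
  {
    return peak.get();
  }

  private void recordReserved()
  {
    final int now = chunksOnDisk.incrementAndGet();
    peak.accumulateAndGet(now, Math::max);
  }
}
```

   With this shape, a writer blocked in reserve() is woken as soon as an 
uploader calls release(), whereas a polling check on free disk space has to 
pick a retry interval and a timeout.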



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

