LakshSingla commented on code in PR #16481:
URL: https://github.com/apache/druid/pull/16481#discussion_r1617112881
##########
extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/output/RetryableS3OutputStream.java:
##########
@@ -271,50 +328,72 @@ public void close() throws IOException
// This should be emitted as a metric
LOG.info(
"Pushed total [%d] parts containing [%d] bytes in [%d]ms.",
- numChunksPushed,
- resultsSize,
+ numChunksPushed.get(),
+ resultsSize.get(),
pushStopwatch.elapsed(TimeUnit.MILLISECONDS)
);
});
-    closer.register(() -> org.apache.commons.io.FileUtils.forceDelete(chunkStorePath));
+ try (Closer ignored = closer) {
+ if (!error) {
+ pushCurrentChunk();
+ completeMultipartUpload();
+ }
+ }
+ }
- closer.register(() -> {
- try {
- if (resultsSize > 0 && isAllPushSucceeded()) {
- RetryUtils.retry(
- () -> s3.completeMultipartUpload(
-              new CompleteMultipartUploadRequest(config.getBucket(), s3Key, uploadId, pushResults)
- ),
- S3Utils.S3RETRY,
- config.getMaxRetry()
- );
- } else {
- RetryUtils.retry(
- () -> {
-              s3.cancelMultiPartUpload(new AbortMultipartUploadRequest(config.getBucket(), s3Key, uploadId));
- return null;
- },
- S3Utils.S3RETRY,
- config.getMaxRetry()
- );
+ private void completeMultipartUpload()
+ {
+ synchronized (fileLock) {
+ while (pendingFiles.get() > 0) {
+ try {
+          LOG.info("Waiting for lock for completing multipart task for uploadId [%s].", uploadId);
Review Comment:
I think this log level is too high. How is the number of parts per stage
determined? How many parts per stage would there be for 1TB of ingested data,
and how many for 10TB? Also consider situations with 4-5 layers of nested
joins and 10 stages. (Note: ignore the cumulative number of log lines across
workers; consider only the number of log lines per worker.) The number of log
lines per worker that mention "Waiting for the lock for completing multipart
task..." would run pretty large. Also, in your test case I don't think durable
storage was being used for the super sorter, which means the test drastically
understates the number of log lines.
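For a rough sense of scale, assuming a hypothetical 100 MiB chunk size (the
real value comes from the output config and is not stated here), the
per-worker part counts would be roughly:

```java
// Hypothetical back-of-envelope sketch; the actual chunk size is configurable.
public class PartCountSketch
{
  public static void main(String[] args)
  {
    long chunkBytes = 100L * 1024 * 1024; // assumed 100 MiB per part
    long oneTb = 1L << 40;
    long tenTb = 10L * (1L << 40);
    // Each part that waits on the lock could emit one log line per wakeup.
    System.out.println("1TB  -> ~" + (oneTb / chunkBytes) + " parts");
    System.out.println("10TB -> ~" + (tenTb / chunkBytes) + " parts");
  }
}
```

Even under this assumed chunk size, the per-worker line count lands in the
tens of thousands for a 1TB ingest, which is the scale the comment is
concerned about.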
Also, what conditions would these log lines actually help debug? Say a
medium-sized job produces 200 such lines (which doesn't include all the cases
mentioned above). How would those 200 lines aid in debugging the job (as
opposed to the connector)? Note: if the concern is a deadlock, jstack output
is a much better debugging tool.
If you think that a) the number of log lines will stay below (say) 100 for the
largest of jobs, and b) all of those lines will aid someone debugging the job,
we can keep it at INFO. Otherwise, I feel that logging the
acquisition/waiting/release of locks isn't going to help someone looking
through the logs; they will most likely skim past it. DEBUG is a much better
level for such logs.
If we do keep it at INFO, we should reword the message so that a reader of the
logs knows how to act on it.
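A minimal, self-contained sketch of the suggested direction (this is not the
actual Druid code; the `fileLock`/`pendingFiles` names are taken from the
diff, and the plain `System.out` "DEBUG" line stands in for a real
`LOG.debug` call):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the guarded-wait loop from the diff, with the
// lock-wait message demoted to debug level so large jobs don't flood INFO.
public class MultipartWaitSketch
{
  private final Object fileLock = new Object();
  private final AtomicInteger pendingFiles = new AtomicInteger(3);

  void completeMultipartUpload() throws InterruptedException
  {
    synchronized (fileLock) {
      while (pendingFiles.get() > 0) {
        // Emitted once per wakeup, which can be O(parts) per worker on
        // large ingests; hence DEBUG rather than INFO.
        System.out.println("DEBUG waiting, pending=" + pendingFiles.get());
        fileLock.wait(100); // wake periodically; real code would be notified
        pendingFiles.decrementAndGet(); // simulate one upload finishing
      }
    }
    System.out.println("completed");
  }

  public static void main(String[] args) throws InterruptedException
  {
    new MultipartWaitSketch().completeMultipartUpload();
  }
}
```

The wait/notify pattern itself is unchanged from the diff; only the level and
wording of the message differ.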
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]