Akshat-Jain commented on code in PR #16481:
URL: https://github.com/apache/druid/pull/16481#discussion_r1609474708


##########
extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/output/RetryableS3OutputStream.java:
##########
@@ -103,22 +107,32 @@ public class RetryableS3OutputStream extends OutputStream
   private boolean error;
   private boolean closed;
 
-  public RetryableS3OutputStream(
-      S3OutputConfig config,
-      ServerSideEncryptingAmazonS3 s3,
-      String s3Key
-  ) throws IOException
-  {
+  /**
+   * An atomic counter to store number of files pending to be uploaded for the 
particular uploadId.
+   */
+  private final AtomicInteger pendingFiles = new AtomicInteger(0);
 
-    this(config, s3, s3Key, true);
-  }
+  /**
+   * A lock used for notifying the main thread about the completion of 
s3.uploadPart() for all chunks
+   * and hence starting the s3.completeMultipartUpload() for the uploadId.
+   */
+  private final Object fileLock = new Object();
+
+  /**
+   * Semaphore to restrict the maximum number of simultaneous chunks on disk.
+   */
+  private static final int MAX_CONCURRENT_CHUNKS = 10;
+  private static final Semaphore SEMAPHORE = new 
Semaphore(MAX_CONCURRENT_CHUNKS);
 
-  @VisibleForTesting
-  protected RetryableS3OutputStream(
+  /**
+   * Threadpool used for uploading the chunks asynchronously.
+   */
+  private static final ExecutorService UPLOAD_EXECUTOR = 
Execs.multiThreaded(10, "UploadThreadPool-%d");

Review Comment:
   > This executor should not be static. If the intention is to share the 
executor, it should be passed in as an argument to the constructor.
   
   But then we'd have to inject it and pass it through a bunch of files: 
S3StorageConnectorModule, S3ExportStorageProvider, S3StorageConnectorProvider, 
StorageConnector, RetryableS3OutputStream.
   
   Seems unnecessary to me? For my understanding, what benefit does it provide 
over the current approach?
   
   > Also, there should be a well defined lifecycle of the executor, i.e. when 
it is started and stopped.
   
   Where would you suggest to shutdown() the executor?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to