Akshat-Jain commented on PR #16481: URL: https://github.com/apache/druid/pull/16481#issuecomment-2134545306
@rohangarg

> Thanks, I checked earlier but I couldn't understand their timings much. For instance, were you able to reason about why the serial upload is taking ~8 minutes for 450 MBs?

The serial upload uploads the 5 MB chunks one by one, hence it takes an extremely long time.

> Also, the chunk size in those queries seemed a bit de-generate to me.

I had to reduce the chunk size to the lowest possible value to be able to simulate the upload of a large number of chunks, since I can't do so otherwise with my limited system resources. It shouldn't affect the comparison/improvement trend, though, so we can still rely on this data to get an idea of the degree of savings.

> I was actually meaning that the threads which are running and are writing to the output stream currently are able to write it to S3 too on the same thread, which might not be the case with this change and the writes might have to wait. Are we solving that scenario in this change?

Actually, it's the opposite. Previously, writes to the output stream (local file) had to wait for the previous chunk's write + upload to finish. Now, a write doesn't have to wait for the previous uploads to finish (there is a backpressure semantic to restrict too many concurrent writes, but that's separate from the current point). Please let me know if this answers the question, happy to clarify further!

> have we considered supporting parallel uploads via the execution framework (maybe an operator to fan out writes) instead of doing it on a per storage basis?

No, we hadn't discussed any approach like that. cc: @cryptoe

> But I would do that in the executor related code itself instead of leaking it into the output stream.

Can you please clarify what you mean by "leaking it into the output stream"? Sorry, I didn't get this part.

> You can determine that the file is small when not even one local chunk is fully filled, and the output stream is closed. And once you've determined that a file is small, direct PUT can be used over multi-part upload.
But this would work only when there's a single chunk. What about when there are 5 chunks, for example? We'd have to wait for the last chunk to be processed before we could decide whether `total number of chunks * chunkSize` is small enough to go with the PUT approach. Also, supporting the PUT flow alongside the multipart approach doesn't seem ideal to me; it would be one extra thing to maintain for little benefit. That's my opinion on this, at least. Happy to discuss it further.
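The backpressure semantic described above can be sketched roughly as follows. This is a hypothetical illustration, not Druid's actual implementation: class and method names (`ChunkedUploadSketch`, `writeChunk`, `uploadPart`) and the pool/permit sizes are made up for the example. The key point it demonstrates is that `writeChunk` hands the chunk to an executor and returns immediately, blocking only when the number of in-flight chunks hits the semaphore cap, rather than waiting for the previous chunk's upload to finish.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of parallel chunk uploads with backpressure.
// Names and sizes are illustrative, not taken from the Druid PR.
class ChunkedUploadSketch {
    private final ExecutorService uploadPool = Executors.newFixedThreadPool(4);
    private final Semaphore inFlight = new Semaphore(8); // cap on pending chunks
    private final AtomicInteger uploadedChunks = new AtomicInteger();

    void writeChunk(byte[] chunk) throws InterruptedException {
        // Blocks ONLY when too many chunks are already in flight (backpressure);
        // it never waits for the previous chunk's upload to complete.
        inFlight.acquire();
        uploadPool.submit(() -> {
            try {
                uploadPart(chunk); // stand-in for the real S3 part upload
                uploadedChunks.incrementAndGet();
            } finally {
                inFlight.release();
            }
        });
    }

    void uploadPart(byte[] chunk) {
        // Placeholder: the real code would issue an S3 UploadPart request here.
    }

    int close() throws InterruptedException {
        uploadPool.shutdown();
        uploadPool.awaitTermination(1, TimeUnit.MINUTES);
        return uploadedChunks.get();
    }

    public static void main(String[] args) throws Exception {
        ChunkedUploadSketch stream = new ChunkedUploadSketch();
        for (int i = 0; i < 20; i++) {
            // Real chunks would be ~5 MB; tiny arrays keep the sketch light.
            stream.writeChunk(new byte[5]);
        }
        System.out.println(stream.close());
    }
}
```

Contrast with the serial flow benchmarked earlier: there, each `writeChunk` would effectively be followed by a blocking upload, so 5 MB chunks drain one at a time, which is why the 450 MB serial upload took ~8 minutes.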
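To make the PUT-vs-multipart point concrete, here is a hypothetical decision sketch (the method and its parameters are invented for illustration). It shows why the direct-PUT optimization only applies in the single-chunk case: once a second chunk has started, parts are already being uploaded, so the decision cannot wait for the last chunk.

```java
// Hypothetical sketch: a direct PUT is only possible when the stream closes
// before even one chunk has filled; with multiple chunks, a multipart upload
// is already underway by the time the total size is known.
class UploadModeSketch {
    static String chooseMode(int chunksStarted, boolean firstChunkFull) {
        if (chunksStarted <= 1 && !firstChunkFull) {
            return "PUT";       // whole object fit inside one partial chunk
        }
        return "MULTIPART";     // more data arrived; parts are already in flight
    }

    public static void main(String[] args) {
        System.out.println(chooseMode(1, false)); // small file -> PUT
        System.out.println(chooseMode(5, true));  // 5 chunks  -> MULTIPART
    }
}
```

Maintaining both branches is the extra complexity argued against above: the PUT path only ever fires for sub-chunk-sized objects, while every larger object takes the multipart path anyway.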
