eolivelli commented on PR #3598:
URL: https://github.com/apache/celeborn/pull/3598#issuecomment-3876260623

   @Dzeri96 
   
   > Parallelizing so high in the call chain makes it hard to know if the 
called code will produce any race conditions. Since we are in the controller, I 
guess the method can be called asynchronously anyway, so you won't break 
anything that isn't already broken, but I think it would be better to 
encapsulate only the time-sensitive part of the code and return a Future from 
that function, rather than parallelize everything.
   
   In this case it is not a problem, we know the code that is running in the 
blocks and in this specific case it is only creating files with a predictable 
name (this is  part of the PartitionLocation, see 
[here](https://github.com/apache/celeborn/blob/634343e2001fd3f004e245c56cbe9d45f2db1daf/common/src/main/java/org/apache/celeborn/common/protocol/PartitionLocation.java#L248))
   
   
   > I think it would be better to encapsulate only the time-sensitive part of 
the code and return a Future from that function, rather than parallelize 
everything.
   
   This is basically what we are doing now::
   - Creating files is very fast for MEMORY and DISK, and the file name is 
predicatable. Parallelizing is not a problem.
   - For S3 (and I expect this for all the other DFSs) creating the file is 
very slow, and it is only about waiting for response from S3. Sending N 
requests in parallel does not add more work on the local CPU
   
   
   >  but it seems to me that this method should maybe queue up requests 
instead of executing them greedily? If the worker gets many handleReserveSlots 
requests
   
   This is actually a possible problem and this patch is mitigating that, 
because it will take less time to complete the handleReserveSlot request.
   
   Ideally I would like to make handleReserveSlot fully async and offload it 
from the Netty eventloop, but this can be done in a separate work.
   In any case we should not loop over N PartitionLocations and wait for each 
creation


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to