Re: [PR] HDDS-10338. Implement a Client Datanode API to stream a block [ozone]

via GitHub Fri, 31 Oct 2025 04:21:26 -0700


sodonnel commented on PR #6613:
URL: https://github.com/apache/ozone/pull/6613#issuecomment-3472602945


   > When a datanode receives onNext(L)
   > 
   >     * Open the file if it is the first call.
   > 
   >     * Read the file at least L bytes of data.
   > 
   >     * It could read up to checksum/chunk boundary, or even a few chunks more for pre-read.
   > 
   >     * Return the data by one or more onNext() responses.
   
   How is this going to look on the server? If the server gives up its handler 
thread after sending each piece of the block to the client, then for each 
onNext() from the client, it's going to have to:
   
   1. Acquire a thread
   2. Look up the checksum and block metadata in RocksDB
   3. Calculate various offsets, etc., to align the checksums
   4. Open the block file, seek
   5. Read the desired amount and close the file
   6. Release the thread back to the pool
   
   My observation is that for most calls, these client reads are not "50MB" or 
anything like it. They are generally 4KB and get rounded up to the next 
checksum boundary, so 16KB by default. To read a 256MB block we are going to 
have to do the above 16k times (256MB / 16KB = 16,384). This cannot be 
efficient for something that is trying to read a block from start to end.
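
   A rough sketch of that arithmetic, assuming the sizes quoted above (256MB 
block, 4KB client reads, 16KB checksum boundary) and treating the six steps in 
the list as one unit of per-request work. The function names are made up for 
illustration and are not Ozone APIs.

```python
# Illustrative only: names and constants are assumptions, not Ozone code.
BLOCK_SIZE = 256 * 1024 * 1024      # 256MB block
CLIENT_READ = 4 * 1024              # typical 4KB client read
CHECKSUM_BOUNDARY = 16 * 1024       # default 16KB checksum boundary
STEPS_PER_REQUEST = 6               # acquire thread ... release thread


def round_up_to_boundary(size, boundary):
    """Round a read size up to the next multiple of the checksum boundary."""
    return -(-size // boundary) * boundary


def requests_for_block(block_size, client_read, boundary):
    """Number of request/response round trips needed to read the whole block."""
    per_request = round_up_to_boundary(client_read, boundary)
    return -(-block_size // per_request)


n = requests_for_block(BLOCK_SIZE, CLIENT_READ, CHECKSUM_BOUNDARY)
print(n)                      # 16384
print(n * STEPS_PER_REQUEST)  # 98304 server-side steps in total
```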
   
   While this is an async API, from the client's point of view it's blocking: 
the client needs more data and has to wait for it to arrive. With the streaming 
approach I have implemented in this PR, the data queues up on the socket up to 
a reasonable limit, so it will usually be immediately available as the client 
needs it, and the server avoids the rework of initializing the read each time.
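
   The streaming idea can be sketched as a producer that sets up its read state 
once and pushes chunks into a bounded buffer. This is a toy model only: a 
`queue.Queue` stands in for the socket buffer, `io.BytesIO` for the block file, 
and `read_streaming`, `CHUNK` and `max_queued` are hypothetical names, not the 
code in this PR.

```python
import io
import queue
import threading

CHUNK = 16 * 1024   # assumed chunk size: one checksum boundary
_END = None         # sentinel marking the end of the stream


def read_streaming(data, max_queued=4):
    """Read `data` through a bounded queue; return (bytes_received, setup_count)."""
    q = queue.Queue(maxsize=max_queued)
    setups = []

    def producer():
        # One-time setup: stands in for metadata lookup, open and seek.
        setups.append("open")
        with io.BytesIO(data) as f:
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                q.put(chunk)  # blocks while the queue is full (back-pressure)
        q.put(_END)

    t = threading.Thread(target=producer, daemon=True)
    t.start()

    received = bytearray()
    while True:
        chunk = q.get()       # usually already queued when the client asks
        if chunk is _END:
            break
        received.extend(chunk)
    t.join()
    return bytes(received), len(setups)


payload = bytes(range(256)) * 1024          # 256KB of sample data, 16 chunks
out, setup_count = read_streaming(payload)
print(len(out), setup_count)
```

   In this model the setup cost is paid once per block rather than once per 
chunk, and the bounded queue keeps the producer from running arbitrarily far 
ahead of the consumer.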
   
   Compared to the approach used in Ozone today, the approach you are 
describing just keeps the gRPC stream alive. It's still a very chatty 
back-and-forth protocol passing small pieces of data, and I don't see how it is 
going to drastically improve things, given the amount of rework required for 
each tiny read.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

