bharatviswa504 edited a comment on pull request #2338:
URL: https://github.com/apache/ozone/pull/2338#issuecomment-866524893


   > I've got a couple of questions on this topic:
   > 
   > 1. Does it make sense to thread pool the Incremental Reports too? I know 
they are much smaller, but they are also much more frequent. I have not seen 
any evidence either way on whether they are backing up, but it's worth 
considering / checking if we can.
   
   Previously, we used to send a full container report along with ICRs; that 
is fixed by HDDS-5111. We will test this PR together with HDDS-5111, using huge 
container reports from each DN. If we observe issues with ICRs, we can add a 
thread pool for ICR processing as well.
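
   To make the thread pool idea concrete, here is a minimal sketch of one way 
report handling could be spread across threads while keeping reports from the 
same DN in order. It is only an illustration under stated assumptions, not the 
code in this PR: `ReportDispatcher`, `workerFor` and `submit` are hypothetical 
names, and routing by a hash of the datanode ID is an assumed design choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from the Ozone code base.
final class ReportDispatcher implements AutoCloseable {
  // One single-threaded executor per worker, so each worker runs its
  // tasks strictly in submission order.
  private final List<ExecutorService> workers = new ArrayList<>();

  ReportDispatcher(int poolSize) {
    if (poolSize <= 0) {
      throw new IllegalArgumentException("poolSize must be positive");
    }
    for (int i = 0; i < poolSize; i++) {
      workers.add(Executors.newSingleThreadExecutor());
    }
  }

  // The same datanode ID always maps to the same worker index.
  int workerFor(String datanodeId) {
    return Math.floorMod(datanodeId.hashCode(), workers.size());
  }

  // Queue a report handler on the worker that owns this datanode.
  Future<?> submit(String datanodeId, Runnable handler) {
    return workers.get(workerFor(datanodeId)).submit(handler);
  }

  // Stop accepting new work and wait briefly for queued handlers to finish.
  @Override
  public void close() {
    for (ExecutorService worker : workers) {
      worker.shutdown();
    }
    try {
      for (ExecutorService worker : workers) {
        worker.awaitTermination(10, TimeUnit.SECONDS);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

   With this shape, reports from different DNs can be processed in parallel, 
while two reports from one DN never run concurrently or out of order, because 
they always land on the same single-threaded worker.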
   
   > 2. I believe the DNs send a FCR every 60 - 90 seconds. Is that frequency 
really needed? Unknown bugs aside, is there any reason to send a FCR after 
startup and first registration? On HDFS, datanodes only send full block reports 
every 6 hours by default. If ICRs carry all the required information for SCM, 
perhaps we should increase the FCR interval to an hour or more?
   
   During startup/registration we need to send a full container report, as the 
ContainerSafeMode rule depends on it for validation. We also fire a container 
report event, during which we process the container reports and build the 
container replica set.
   
   But I completely agree that we can increase the full container report 
interval. I don't think we need a value as large as the one HDFS uses, though, 
since our container reports should be much smaller than HDFS block reports.
   
   From our scale testing: with 9 DNs, each filled with 500 TB of data, we 
have seen around 350K containers in the cluster. So a total of about 1 million 
replicas will be reported across all DNs. (Compared with HDFS, our container 
reports are far smaller.)
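
   As a quick sanity check of these numbers, the arithmetic can be sketched as 
below. The replication factor of 3 is an assumption (it is not stated above, 
but it is consistent with 350K containers giving roughly 1 million replicas), 
and `ReplicaEstimate` is a hypothetical helper written only for this example.

```java
// Hypothetical helper for back-of-the-envelope report sizing.
final class ReplicaEstimate {
  // Total replicas reported cluster-wide.
  static long totalReplicas(long containers, int replicationFactor) {
    return containers * replicationFactor;
  }

  // Average replicas per datanode, rounded to the nearest whole replica.
  static long replicasPerDatanode(long totalReplicas, int datanodes) {
    return Math.round((double) totalReplicas / datanodes);
  }

  public static void main(String[] args) {
    long containers = 350_000L;   // from the scale test above
    int replicationFactor = 3;    // assumption, see lead-in
    int datanodes = 9;            // from the scale test above

    long total = totalReplicas(containers, replicationFactor);
    System.out.println(total);                                  // 1050000
    System.out.println(replicasPerDatanode(total, datanodes));  // 116667
  }
}
```

   Under that assumption, each full container report would carry on the order 
of 117K replica entries per DN.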
   
   > 3. Have we been able to capture any profiles (eg flame charts) of 
processing a large FCR to see if there is anything to be optimized in that flow?
   
   We have not debugged at this level yet; we will look into this in future 
testing.
   
   

