ChenSammi commented on pull request #1147:
URL: https://github.com/apache/hadoop-ozone/pull/1147#issuecomment-659112467


   > > Thanks @bharatviswa504 and @adoroszlai for the review. A lock per 
containerDBPath in ContainerCache is a good idea. I tried it, but it does not 
solve our problem: given that a volume has nearly 20K containers, it still 
takes nearly an hour for the datanode to finish initialization.
   > 
   > With a lock per containerPath, where does the bottleneck come from? Is it 
around the lock, or does a single thread load all the containers for a volume, 
so that the time taken to init all the DB instances is the 
bottleneck?
   
   Opening a DB takes time, and the lock forces the DB opens to happen one at 
a time. Once a volume holds more than 10K RocksDB instances, initialization 
takes a considerable time, for example more than 30 minutes.
   Slow datanode initialization has several severe consequences:
   1. The service is unavailable.
   2. SCM treats the datanode as DEAD (we have raised the DEAD threshold to 30m 
in our cluster) and closes all the pipelines and containers related to 
this datanode, which is part of why there are so many small containers in 
our cluster. It's a vicious cycle.
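
   To illustrate the "one by one" point about the lock, here is a minimal 
sketch, not the actual ContainerCache code. It only shows that a single shared 
lock caps the number of DB opens in flight at one, no matter how many threads 
are used, while a lock per DB path does not impose that cap. The class name, 
method name, paths, and the 5 ms sleep standing in for a RocksDB open are all 
hypothetical, and the sketch does not claim that per-path locks alone fix the 
startup time reported above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names come from Ozone.
class DbOpenLockSketch {
  // One lock shared by every DB open (the serialized case).
  private static final ReentrantLock GLOBAL_LOCK = new ReentrantLock();
  // One lock per DB path (the per-containerDBPath idea from the review).
  private static final ConcurrentHashMap<String, ReentrantLock> PATH_LOCKS =
      new ConcurrentHashMap<>();

  /**
   * Runs `count` simulated DB opens on `threads` worker threads and returns
   * the peak number of opens that were in flight at the same time.
   */
  static int peakConcurrentOpens(int count, int threads, boolean globalLock) {
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (int i = 0; i < count; i++) {
        // Assumed path layout, for illustration only.
        String dbPath = "/hypothetical/volume/container-" + i + "/container.db";
        futures.add(pool.submit(() -> {
          ReentrantLock lock = globalLock
              ? GLOBAL_LOCK
              : PATH_LOCKS.computeIfAbsent(dbPath, p -> new ReentrantLock());
          lock.lock();
          try {
            // Record how many opens are running right now.
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            try {
              Thread.sleep(5); // stand-in for the cost of opening a DB
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
            } finally {
              inFlight.decrementAndGet();
            }
          } finally {
            lock.unlock();
          }
        }));
      }
      // Wait for every simulated open to finish.
      for (Future<?> f : futures) {
        f.get();
      }
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }
}
```

   Calling `DbOpenLockSketch.peakConcurrentOpens(16, 4, true)` returns 1, 
because the shared lock admits one open at a time. With `false` the result can 
rise up to the thread count, depending on scheduling.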
   
   




