ChenSammi commented on pull request #1147: URL: https://github.com/apache/hadoop-ozone/pull/1147#issuecomment-659112467
> > Thanks @bharatviswa504 and @adoroszlai for the review. A lock per containerDBPath in ContainerCache is a good idea. I tried it, but it does not solve our problem: given that a volume has nearly 20K containers, it still takes nearly an hour for the datanode to finish initialization.
>
> With a lock per containerPath, where does the bottleneck come from? Is it around the lock, or does a single thread load all the containers for a volume, so that the time taken to init all the DB instances is the bottleneck?

The DB open takes time. The lock forces the DB opens to happen one by one. Once a volume has more than 10K RocksDB instances, initialization takes a considerable time, say more than 30 minutes.

Slow datanode initialization has several severe consequences:

1. The service is unavailable.
2. SCM treats the datanode as DEAD (we have raised the DEAD threshold to 30m in our cluster) and closes all the pipelines and containers related to this datanode, which is one reason there are so many small containers in our cluster.

It's a negative cycle.
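
To illustrate how lock granularity affects concurrent DB opens, here is a minimal Java sketch. It is not Ozone's ContainerCache code: the class `DbOpenLockSketch`, the method `maxConcurrentOpens`, the 4-thread pool, and the 50 ms sleep standing in for a RocksDB open are all hypothetical. It only shows that one shared lock keeps at most one open in flight, while a lock keyed by DB path lets opens of different paths overlap. It does not model the cost of the open itself, which the comment above reports as still dominant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class DbOpenLockSketch {
  // Hypothetical stand-ins for a cache-wide lock and per-path locks.
  private static final Lock GLOBAL_LOCK = new ReentrantLock();
  private static final ConcurrentHashMap<String, Lock> PATH_LOCKS =
      new ConcurrentHashMap<>();

  /** Returns the peak number of simulated DB opens in flight at once. */
  static int maxConcurrentOpens(List<String> dbPaths, boolean lockPerPath)
      throws Exception {
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<?>> tasks = new ArrayList<>();
      for (String path : dbPaths) {
        tasks.add(pool.submit(() -> {
          // Pick either the single shared lock or the lock for this path.
          Lock lock = lockPerPath
              ? PATH_LOCKS.computeIfAbsent(path, p -> new ReentrantLock())
              : GLOBAL_LOCK;
          lock.lock();
          try {
            int now = inFlight.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            Thread.sleep(50); // simulated cost of opening one DB
            inFlight.decrementAndGet();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          } finally {
            lock.unlock();
          }
        }));
      }
      for (Future<?> task : tasks) {
        task.get(); // wait for completion and surface any failure
      }
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) throws Exception {
    List<String> paths = Arrays.asList("db-a", "db-b", "db-c", "db-d");
    System.out.println("global lock peak:   " + maxConcurrentOpens(paths, false));
    System.out.println("per-path lock peak: " + maxConcurrentOpens(paths, true));
  }
}
```

With the shared lock the peak is always 1, because every open waits for the previous one. With per-path locks the peak is bounded by the thread pool size instead, so any remaining startup time comes from the opens themselves and from how many threads are loading containers.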
