Hi, in my test on a 1000+ node cluster, I found a problem during namenode
restart. If a datanode cannot connect to the namenode because the namenode is
busy, it logs an IOException in offerService. However, forceFullBr has already
been reset to false by the next loop iteration, so the datanode does not retry
the full block report (FBR) while the namenode is restarting. As a result, the
namenode restarts slowly: the block reporting phase takes about 20 hours. I
propose setting forceFullBr back to true when the exception is caught, so that
the datanode sends the FBR in a timely manner. What do you think?
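
To make the idea concrete, here is a minimal, self-contained sketch of the
behaviour I am describing. It is not the actual datanode code: FakeNameNode,
simulate and restoreFlagOnError are hypothetical names used only for
illustration, and the loop leaves out heartbeats and the periodic block report
schedule. Only forceFullBr and the IOException correspond to what I see in
offerService.

```java
import java.io.IOException;

public class FbrRetrySketch {
    /** Hypothetical stand-in for the namenode RPC: fails the first `failures` calls. */
    static final class FakeNameNode {
        private int remainingFailures;
        int reportsReceived = 0;

        FakeNameNode(int failures) {
            this.remainingFailures = failures;
        }

        void blockReport() throws IOException {
            if (remainingFailures > 0) {
                remainingFailures--;
                throw new IOException("namenode busy");
            }
            reportsReceived++;
        }
    }

    /** Runs a simplified service loop and returns how many FBRs reached the namenode. */
    static int simulate(boolean restoreFlagOnError, int failures, int iterations) {
        FakeNameNode nn = new FakeNameNode(failures);
        boolean forceFullBr = true; // assumed to be set when the datanode (re)registers
        for (int i = 0; i < iterations; i++) {
            boolean sendFullBr = forceFullBr;
            forceFullBr = false; // cleared before we know whether the send succeeded
            if (!sendFullBr) {
                continue; // nothing forces an FBR in this iteration
            }
            try {
                nn.blockReport();
            } catch (IOException e) {
                if (restoreFlagOnError) {
                    forceFullBr = true; // proposed change: retry on the next iteration
                }
            }
        }
        return nn.reportsReceived;
    }

    public static void main(String[] args) {
        System.out.println("without fix: " + simulate(false, 2, 5));
        System.out.println("with fix:    " + simulate(true, 2, 5));
    }
}
```

In this sketch, with two failed attempts and five iterations, the run without
the change never delivers the FBR, while the run with the change delivers it on
the third iteration.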
