Hi, in my test on a 1000+ node cluster, I found a problem during NameNode restart. If a DataNode cannot connect to the NameNode because the NameNode is busy, offerService logs an IOException. By the next loop iteration, forceFullBr is already false, so the DataNode does not retry the full block report (FBR) while the NameNode is restarting. As a result, the NameNode restart is very slow: the block reporting phase takes about 20 hours. I propose setting forceFullBr back to true when the exception is caught, so that the DataNode sends the FBR in a timely manner. What do you think?
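
To make the proposal concrete, here is a minimal, self-contained sketch of the behavior. It is not the actual DataNode code: the class `ForceFbrRetrySketch`, the `NameNode` interface, the `rearmOnFailure` switch and the `simulate` helper are hypothetical names made up for illustration, and it assumes the force flag is a one-shot flag that is consumed at the start of each loop iteration.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

class ForceFbrRetrySketch {
    /** Hypothetical stand-in for the NameNode RPC endpoint. */
    interface NameNode {
        void blockReport() throws IOException;
    }

    // Assumption: a one-shot "send a full block report now" request flag.
    private final AtomicBoolean forceFullBlockReport = new AtomicBoolean(true);
    // true = proposed behavior, false = behavior described in the report.
    private final boolean rearmOnFailure;
    int reportsSent = 0;

    ForceFbrRetrySketch(boolean rearmOnFailure) {
        this.rearmOnFailure = rearmOnFailure;
    }

    /** One iteration of a simplified offerService-style loop. */
    void offerServiceOnce(NameNode nn) {
        // Reading the flag also clears it, so it is false on the next iteration.
        boolean forceFullBr = forceFullBlockReport.getAndSet(false);
        try {
            if (forceFullBr) {
                nn.blockReport();
                reportsSent++;
            }
        } catch (IOException e) {
            // Proposed change: request the FBR again if this attempt failed.
            if (forceFullBr && rearmOnFailure) {
                forceFullBlockReport.set(true);
            }
        }
    }

    /** Runs the loop against a NameNode that rejects the first `failures` calls. */
    static int simulate(boolean rearmOnFailure, int failures, int iterations) {
        ForceFbrRetrySketch dn = new ForceFbrRetrySketch(rearmOnFailure);
        int[] remainingFailures = {failures};
        NameNode busyThenOk = () -> {
            if (remainingFailures[0] > 0) {
                remainingFailures[0]--;
                throw new IOException("NameNode busy");
            }
        };
        for (int i = 0; i < iterations; i++) {
            dn.offerServiceOnce(busyThenOk);
        }
        return dn.reportsSent;
    }

    public static void main(String[] args) {
        System.out.println("FBRs sent without re-arm: " + simulate(false, 2, 5));
        System.out.println("FBRs sent with re-arm:    " + simulate(true, 2, 5));
    }
}
```

In this toy model, without re-arming the flag the single failed attempt is never retried inside the loop, while with re-arming the FBR goes out as soon as the NameNode accepts the call. A real patch would also need to consider whether immediate retries from many DataNodes add load to an already busy NameNode.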