Hi 심탁길
If you restart only the name-node, the data-nodes will continue sending
block reports to it once an hour, which is the default block-report
interval. That means all blocks will be reported to the name-node sooner
or later, but of course you want that to happen sooner.
IMO we should request block reports when the name-node restarts.
For now I recommend waiting 10 minutes between shutdown and restart.
Then all data-nodes will expire, and the name-node will ask them to send
block reports asap. This works with the trunk version. HADOOP-641 is
scheduled for 0.8.0.
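
To illustrate why the locations disappear and then come back, here is a
toy model in Java. It is only a sketch under the assumption that the
name-node keeps block locations in memory and rebuilds them from block
reports; the class and method names (BlockMapSketch, blockReport, locate)
are made up for this example and are not the real Hadoop API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: names here are hypothetical, not Hadoop classes.
public class BlockMapSketch {
    // Block id -> data-nodes that reported it. Held in memory only,
    // so it is empty every time this toy name-node is (re)started.
    private final Map<String, Set<String>> locations = new HashMap<>();

    // A data-node reports every block it currently stores.
    public void blockReport(String dataNode, Collection<String> blocks) {
        for (String block : blocks) {
            locations.computeIfAbsent(block, k -> new TreeSet<>()).add(dataNode);
        }
    }

    // Returns the known locations of a block, or fails if none are known yet.
    public Set<String> locate(String block) {
        Set<String> nodes = locations.get(block);
        if (nodes == null || nodes.isEmpty()) {
            throw new IllegalStateException("No node available for block: " + block);
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Before shutdown: the data-node has reported, so the block is readable.
        BlockMapSketch before = new BlockMapSketch();
        before.blockReport("dn1", Arrays.asList("blk_1"));
        System.out.println(before.locate("blk_1"));

        // After restart: a fresh map, no report received yet, so the read fails.
        BlockMapSketch restarted = new BlockMapSketch();
        try {
            restarted.locate("blk_1");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // Once the next block report arrives, the block is readable again.
        restarted.blockReport("dn1", Arrays.asList("blk_1"));
        System.out.println(restarted.locate("blk_1"));
    }
}
```

In this model the only way to shorten the unreadable window is to make the
report arrive earlier, which is what the 10-minute wait (or requesting
reports on restart) is meant to achieve.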
Константин
심탁길 wrote:
For example, a new file "a.txt" is created on DFS.
Before the NameNode shuts down, a pair of Block and DataNodeInfo[] for
"a.txt" is displayed (using the LocatedBlock class).
After the NameNode restarts, only the Block info remains and the
DataNodeInfo disappears.
So when I try to read the file, a message like "No node available for block:
blk_XXXXXXX ...... No live nodes contain current block"
is displayed. Then, when the DataNodes restart, the problem is solved and
the client can access the file.
For a cluster with few nodes it may be OK to restart the NameNode and all
DataNodes together, but a big cluster with hundreds of nodes is a different
story.
Any comments would be appreciated.