Hi 심탁길,

If you restart only the name-node, the data-nodes will keep sending block reports to it once an hour, which is the default block-report interval. That means all blocks will be reported to the name-node sooner or later, but of course you want that to happen sooner.
IMO we should request block reports when the name-node restarts.
For now I recommend waiting 10 minutes between shutdown and restart. By then all data-nodes will have expired, and the name-node will ask them to send block reports asap. This works with the trunk version. HADOOP-641 is scheduled for 0.8.0.
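
To make the timing concrete, here is a rough sketch of the arithmetic. It is a toy model, not Hadoop code: the function and constant names are made up for illustration, and it only assumes the two numbers mentioned above (a one-hour block-report interval and a 10-minute expiry).

```python
# Toy model of the timing described above; NOT Hadoop code.
# All names here are hypothetical and exist only for illustration.
BLOCK_REPORT_INTERVAL_MIN = 60  # default block-report interval (one hour)
DATANODE_EXPIRY_MIN = 10        # downtime after which data-nodes are treated as expired


def minutes_until_block_report(downtime_min, minutes_since_last_report):
    """Estimate how long after a name-node restart one data-node's
    blocks become known to the name-node again.

    downtime_min: minutes between name-node shutdown and restart.
    minutes_since_last_report: minutes between the data-node's last
        block report and the name-node shutdown.
    """
    if downtime_min >= DATANODE_EXPIRY_MIN:
        # The data-node has expired, so the name-node asks for a
        # block report as soon as possible.
        return 0
    # Otherwise the data-node simply stays on its hourly schedule.
    elapsed = minutes_since_last_report + downtime_min
    return max(0, BLOCK_REPORT_INTERVAL_MIN - elapsed)


if __name__ == "__main__":
    # Quick restart right after a block report: close to a full hour of waiting.
    print(minutes_until_block_report(downtime_min=1, minutes_since_last_report=5))
    # Waiting out the expiry period: the report is requested right away.
    print(minutes_until_block_report(downtime_min=10, minutes_since_last_report=5))
```

In this model a quick restart can leave a file unreadable for most of an hour, while a 10-minute pause makes the blocks available almost immediately after the restart.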

Константин


심탁길 wrote:


For example, a new file "a.txt" is created on DFS.

Before the NameNode shuts down, a pair of Block and DataNodeInfo[] for "a.txt" is 
displayed (using the LocatedBlock class).

After the NameNode restarts, only the Block info remains; the DataNodeInfo disappears.

So when I try to read the file, a message like "No node available for block: 
blk_XXXXXXX ...... No live nodes contain current block" is displayed. Then, when 
the DataNodes restart, the problem is solved and the client can access the file.

For a cluster with few nodes it may be fine to restart the NameNode and all 
DataNodes together, but a big cluster with hundreds of nodes is a different 
story.

Any comments would be appreciated.




