Phantom wrote:
I am sure re-replication is not done on every heartbeat miss since that
would be very expensive and inefficient. At the same time you cannot really
tell if a node is partitioned away, crashed, or just slow. Is it threshold
based, i.e. I missed N heartbeats so re-replicate?

Yes, detection of datanode failure is threshold-based. It is currently ten minutes plus ten missed heartbeats.
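
To make the rule concrete, here is a minimal sketch of a threshold check of that shape. It is not the actual namenode code: the class name HeartbeatExpiry, its methods, and the heartbeat interval passed in are all hypothetical, chosen only to illustrate "ten minutes plus ten missed heartbeats".

```java
import java.time.Duration;

/** Hypothetical illustration only; not the code in the Hadoop source tree. */
final class HeartbeatExpiry {
    // Fixed grace period taken from the rule described above.
    private static final Duration GRACE_PERIOD = Duration.ofMinutes(10);
    // Number of missed heartbeats added on top of the grace period.
    private static final int MISSED_HEARTBEATS = 10;

    private final Duration expireInterval;

    /** heartbeatInterval is an assumed, configurable value (e.g. a few seconds). */
    HeartbeatExpiry(Duration heartbeatInterval) {
        this.expireInterval =
            GRACE_PERIOD.plus(heartbeatInterval.multipliedBy(MISSED_HEARTBEATS));
    }

    /** Total silence tolerated before a datanode is declared dead. */
    Duration expireInterval() {
        return expireInterval;
    }

    /** True once the node has been silent for longer than the threshold. */
    boolean isDead(long lastHeartbeatMillis, long nowMillis) {
        return nowMillis - lastHeartbeatMillis > expireInterval.toMillis();
    }
}
```

With an assumed heartbeat interval of three seconds, the threshold works out to ten minutes and thirty seconds. A single missed heartbeat never triggers re-replication; only a node that stays silent past the full interval is treated as failed.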

Which package in the
source code could I look at to glean this information?

This is in dfs/FSNamesystem.java.

Doug
