Hi Thanh,

No, I doubt that anybody is running BackupNode in production, since it's only part of 0.21 and is, in my opinion, an incomplete implementation. A few of the deficiencies I'm aware of:
- Like you said, edits are transferred by synchronous RPC from the NN. As far as I know, there are no timeouts enabled on these RPCs, so if the BackupNode hangs, so will the primary. In the case of a BN crash, the primary will hang for many minutes before noticing.
- The BN doesn't provide hot standby since it doesn't yet receive block reports.

The fact that the RPCs are synchronous seems unavoidable if you want to be able to do a failover without any lost edits. But without timeouts, it's a bit scary.

Some work will be going on in trunk to address high availability over the next several months - we'd definitely appreciate your expertise in failure injection, etc., being applied to the new code as it goes in!

-Todd

On Thu, May 5, 2011 at 9:28 AM, Thanh Do <than...@cs.wisc.edu> wrote:
> hi all,
>
> any body deploy the Backup Node in your system.
> I am curious about the impact of the Backup Node
> to the NameNode throughput.
>
> To my understanding, NameNode streams edits
> log operation to the BackupNode (by an RPC call),
> and only return once that operation has been applied
> to the in memory state of the Backup Node.
>
> Will this RPC call slow down the NameNode a little bit.
>
> Thanks
> Thanh
>

--
Todd Lipcon
Software Engineer, Cloudera