I see... Thanks for the useful feedback, Todd!
On Fri, May 6, 2011 at 7:34 AM, Todd Lipcon <t...@cloudera.com> wrote:

> Hi Thanh,
>
> No, I doubt that anybody is running BackupNode in production, since it's
> only part of 0.21, and in my opinion an incomplete implementation. A few of
> the deficiencies I'm aware of:
>
> - Like you said, edits are transferred by synchronous RPC from the NN. As
> far as I know, there are no timeouts enabled on these RPCs, so if the
> backupnode hangs, so will the primary. In the case of a BN crash, the
> primary will hang for many minutes before noticing.
> - The BN doesn't provide hot standby since it doesn't yet receive block
> reports.
>
> The fact that the RPCs are synchronous seems unavoidable if you want to be
> able to do a failover without any lost edits. But without timeouts, it's a
> bit scary.
>
> Some work will be going on in trunk to address high availability over the
> next several months - we'd definitely appreciate your expertise in failure
> injection, etc, being applied to the new code as it goes in!
>
> -Todd
>
> On Thu, May 5, 2011 at 9:28 AM, Thanh Do <than...@cs.wisc.edu> wrote:
>
>> hi all,
>>
>> any body deploy the Backup Node in your system.
>> I am curious about the impact of the Backup Node
>> to the NameNode throughput.
>>
>> To my understanding, NameNode streams edits
>> log operation to the BackupNode (by an RPC call),
>> and only return once that operation has been applied
>> to the in memory state of the Backup Node.
>>
>> Will this RPC call slow down the NameNode a little bit.
>>
>> Thanks
>> Thanh
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
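
To make the timeout concern in the quoted message concrete, here is a minimal Python sketch of a synchronous "send edit, wait for ack" call that gives up after a bounded wait. It is not Hadoop code and does not use any Hadoop API: SlowBackup, apply_edit, log_edit_sync, and the delay and timeout values are all hypothetical names and numbers chosen for illustration.

```python
import concurrent.futures
import time


class SlowBackup:
    """Hypothetical stand-in for a backup node (not a Hadoop class)."""

    def __init__(self, delay_s):
        self.delay_s = delay_s
        self.applied = []

    def apply_edit(self, edit):
        # Simulate the time the backup takes to apply one edit in memory.
        time.sleep(self.delay_s)
        self.applied.append(edit)
        return True


def log_edit_sync(backup, edit, pool, timeout_s):
    """Send one edit and block until the backup acks or timeout_s elapses.

    Returns True on ack and False on timeout. The timed-out call is not
    cancelled; its worker thread keeps running in the background.
    """
    future = pool.submit(backup.apply_edit, edit)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return False


if __name__ == "__main__":
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
    healthy = SlowBackup(delay_s=0.01)
    stalled = SlowBackup(delay_s=0.3)
    print("healthy backup acked:", log_edit_sync(healthy, "mkdir /a", pool, 2.0))
    print("stalled backup acked:", log_edit_sync(stalled, "mkdir /b", pool, 0.05))
    pool.shutdown(wait=True)
```

In this sketch the caller still waits for the ack on every edit, so the throughput cost of the synchronous call remains; the timeout only bounds how long a hung backup can stall the caller. What the primary should do after a timeout (for example, stop treating the backup as up to date) is left open here and would be a design decision in any real implementation.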