Aled Jones wrote:
Anyhoo, I'm fairly new to Hadoop and was wondering about the redundancy aspects of it. If I have a few servers running for Nutch, one being both a namenode and datanode and the others just datanodes, what happens when the namenode falls over? To get proper redundancy in a Hadoop system do I need at least 2 namenodes?
You just need a single namenode. In a large cluster, some machine will fail frequently, but any particular machine is unlikely to fail often. Thus it would be a bad idea to make every node critical to system operation, but making a single node critical is acceptable.
Currently the (undocumented) best practice is to have a crontab entry that uses rsync or similar to regularly copy the namenode's data to another machine. Long-term we hope to have namenode backup built into Hadoop. For example, we might have a hot-spare namenode, with all filesystem changes trickled to the hot spare as they are made. Switchover on namenode failure would still be manual (for the reasons described in Google's GFS paper).
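For the curious, such a crontab entry might look something like the sketch below. The metadata directory and backup host are assumptions for illustration; substitute whatever `dfs.name.dir` points at in your configuration.

```shell
# Hypothetical crontab entry: every 15 minutes, mirror the namenode's
# metadata directory (path assumed here) to a standby machine over ssh.
# -a preserves permissions/timestamps; --delete keeps the copy exact.
*/15 * * * * rsync -a --delete /opt/hadoop/dfs/name/ backup-host:/opt/hadoop/dfs/name/
```

If the namenode dies, you would start a namenode on the backup host against the copied directory. Note the copy can lag by up to one cron interval, so recent filesystem changes may be lost.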
Doug
