I wouldn't do this if the fsimage is larger than, say, 5-6 GB. On each checkpoint the SNN pulls the fsimage and edits from the NameNode and then pushes the merged image back, so the transfer can get heavy. If the version of Hadoop you use includes https://issues.apache.org/jira/browse/HDFS-1457 (image transfer throttler), you can set the throttle to a suitable value and keep the SNN on a slave node without worrying about it hogging all the available bandwidth.
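To pick a throttle value, a rough back-of-the-envelope estimate helps. The sketch below is only an illustration: the function names are made up, it assumes the image crosses the wire once in each direction per checkpoint, and it ignores the edits file. As far as I recall, the throttle added by HDFS-1457 is set through dfs.image.transfer.bandwidthPerSec (bytes per second), but please verify that against the hdfs-default.xml of your release.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def transfer_seconds(image_bytes: int, bandwidth_bytes_per_sec: int) -> float:
    """Time to move one copy of the fsimage at a fixed throttled rate."""
    if bandwidth_bytes_per_sec <= 0:
        raise ValueError("bandwidth must be a positive number of bytes/sec")
    return image_bytes / bandwidth_bytes_per_sec


def checkpoint_transfer_seconds(image_bytes: int, bandwidth_bytes_per_sec: int) -> float:
    """Assumed model: one download from the NameNode plus one upload back."""
    return 2 * transfer_seconds(image_bytes, bandwidth_bytes_per_sec)


if __name__ == "__main__":
    image = 6 * GIB       # fsimage size from the example above
    throttle = 10 * MIB   # hypothetical throttle: 10 MiB/s
    total = checkpoint_transfer_seconds(image, throttle)
    print(f"approx. {total / 60:.1f} minutes of transfer per checkpoint")
```

If the estimated transfer time gets close to your checkpoint interval, the throttle is too low for that image size.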

On Thu, Aug 16, 2012 at 3:41 AM, David Rosenstrauch <dar...@darose.net> wrote:
> I have a Hadoop cluster that's a little tight on resources. I was thinking
> one way I could solve this could be by running an additional data node on
> the same machine as the secondary name node.
>
> I wouldn't dare do that on the primary name node, since that machine needs
> to be extremely performant. But since all the secondary name node does is
> doing a merge of the name node's checkpoint and logs, which is not an
> activity that require top-notch real-time performance, I thought it might
> not be a problem if I were to set up a data node running there as well.
>
> Any reasons why that might be a bad idea?
>
> Thanks,
>
> DR

--
Harsh J