I have a Hadoop cluster that's a little tight on resources. One way I'm thinking of easing this is to run an additional data node on the same machine as the secondary name node.

I wouldn't dare do that on the primary name node, since that machine needs to be extremely performant. But since all the secondary name node does is merge the name node's checkpoint and logs, which is not an activity that requires top-notch real-time performance, I thought it might not be a problem to run a data node there as well.

Are there any reasons why that might be a bad idea?

Thanks,

DR
