Hi Tom,

Sounds like the trick. This node is a slave so its datanode and tasktracker are started from the master.
   - How do I start the cluster without starting the datanode and the tasktracker on the mini-node slave? Remove it from slaves?
   - What do I minimally need to start on the mini-node?

Also I have replication set to 2 so the data will just get re-replicated once the mini-node is reconfigured, right? There should be another copy somewhere on the cluster.

Thanks
Pat

On 6/4/12 2:38 PM, Tom Melendez wrote:
Hi Pat,

Sounds like you would just turn off the datanode and the tasktracker.
Your config will still point to the Namenode and JT, so you can still
launch jobs and read/write from HDFS.
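
A minimal sketch of that step, assuming a Hadoop 1.x layout where hadoop-daemon.sh lives under $HADOOP_HOME/bin. The /usr/local/hadoop default and the stop_worker_daemons and DRY_RUN names are only illustrative. It is meant to be run on the mini-node itself:

```shell
# Assumption: Hadoop 1.x install; adjust HADOOP_HOME to your setup.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# Stop only the worker daemons on this host. The client config is untouched,
# so "hadoop fs" and "hadoop jar" still talk to the Namenode and JT.
# Set DRY_RUN=1 to print the commands instead of running them.
stop_worker_daemons() {
  for daemon in tasktracker datanode; do
    ${DRY_RUN:+echo} "$HADOOP_HOME/bin/hadoop-daemon.sh" stop "$daemon"
  done
}
```

Calling stop_worker_daemons with DRY_RUN=1 first shows exactly what would be run.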

You'll probably want to replicate the data off first of course.
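
One way to do that is to decommission the datanode from the master. This is a rough sketch, assuming hdfs-site.xml on the Namenode already points dfs.hosts.exclude at an exclude file. The file path and the decommission_datanode and DRY_RUN names are made up for illustration:

```shell
# Assumptions: hdfs-site.xml on the Namenode already sets dfs.hosts.exclude to
# EXCLUDE_FILE, and the hostname matches the name the Namenode knows the node by.
EXCLUDE_FILE="${EXCLUDE_FILE:-/usr/local/hadoop/conf/excludes}"

# Add a host to the exclude file (only once) and ask the Namenode to re-read it.
# The Namenode then copies that host's blocks to other datanodes.
# Set DRY_RUN=1 to print the refresh command instead of running it.
decommission_datanode() {
  host="$1"
  grep -qxF "$host" "$EXCLUDE_FILE" 2>/dev/null || echo "$host" >> "$EXCLUDE_FILE"
  ${DRY_RUN:+echo} hadoop dfsadmin -refreshNodes
}
```

"hadoop dfsadmin -report" should show the node's decommission status, so the datanode can be stopped once the blocks have been copied off.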

Thanks,

Tom

On Mon, Jun 4, 2012 at 2:06 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
I have a machine that is part of the cluster but I'd like to dedicate it to
being the web server and run the db but still have access to starting jobs
and getting data out of hdfs. In other words I'd like to have the cores,
memory, and disk only minimally affected by running jobs on the cluster yet
still have easy access when I need to get data out.

I assume I can do something like set the max number of jobs for the node to
0 and something similar for hdfs? Is there a recommended way to go about
this?
