Hi Pat,

It sounds like you can simply stop the DataNode and the TaskTracker on that machine. Your config will still point to the NameNode and JobTracker, so you can still launch jobs and read from and write to HDFS.
You'll probably want to replicate the data off first, of course.

Thanks,
Tom

On Mon, Jun 4, 2012 at 2:06 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
> I have a machine that is part of the cluster but I'd like to dedicate it to
> being the web server and run the db but still have access to starting jobs
> and getting data out of hdfs. In other words I'd like to have the cores,
> memory, and disk only minimally affected by running jobs on the cluster yet
> still have easy access when I need to get data out.
>
> I assume I can do something like set the max number of jobs for the node to
> 0 and something similar for hdfs? Is there a recommended way to go about
> this?
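
A rough sketch of the steps described in the reply above, in case it helps. It assumes a Hadoop 1.x layout where `dfs.hosts.exclude` is already set in the NameNode's config. The install path, hostname, and exclude-file location are placeholders rather than anything from this thread, and by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: move HDFS blocks off a node, then stop its worker daemons.
# Assumptions (not from the thread): Hadoop 1.x, and dfs.hosts.exclude in
# the NameNode's hdfs-site.xml already points at $DFS_EXCLUDE.

DRY_RUN="${DRY_RUN:-1}"                        # 1 = print commands, do not run them
HADOOP_HOME="${HADOOP_HOME:-/usr/lib/hadoop}"  # placeholder install path
NODE="${NODE:-webnode.example.com}"            # placeholder hostname
DFS_EXCLUDE="${DFS_EXCLUDE:-$HADOOP_HOME/conf/dfs.exclude}"  # placeholder file

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the NameNode host: ask HDFS to re-replicate this node's blocks.
decommission_datanode() {
    run sh -c "echo $NODE >> $DFS_EXCLUDE"
    run "$HADOOP_HOME/bin/hadoop" dfsadmin -refreshNodes
    # Re-check the report until the node is listed as Decommissioned.
    run "$HADOOP_HOME/bin/hadoop" dfsadmin -report
}

# Run on the node itself, once decommissioning has finished.
stop_worker_daemons() {
    run "$HADOOP_HOME/bin/hadoop-daemon.sh" stop tasktracker
    run "$HADOOP_HOME/bin/hadoop-daemon.sh" stop datanode
}
```

To execute for real, set `DRY_RUN=0`, call `decommission_datanode` on the NameNode host, wait for the report to show the node as decommissioned, and then call `stop_worker_daemons` on the node. Removing the host from `conf/slaves` on the master should keep the cluster start scripts from bringing the daemons back up. The client config on the node stays untouched, so `hadoop fs` and `hadoop jar` keep working from there.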