On 9/11/07 2:23 PM, "Earney, Billy C." <[EMAIL PROTECTED]> wrote:
> So I guess you can run the command below ($HADOOP_HOME/bin/hadoop ..) on
> a separate machine, as long as you have the config file which defines the
> IP address of the namenode and/or the datanodes?
Essentially, yes.
We routinely have nodes that aren't part of the DFS load data onto the
DFS. [In fact, in one of our larger configurations, we have ~100 nodes that
aren't datanodes but are used for mapred operations against the DFS,
primarily because we've hit some namenode limits with 0.13. :) ] The key
thing is to have the config bits available to pass to the hadoop command.
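To make that concrete, here is a sketch of what "having the config bits available" might look like on a machine that isn't part of the cluster. The hostname (namenode.example.com), port, and paths are made-up placeholders; the assumption is only that the conf dir contains a hadoop-site.xml pointing fs.default.name at the real namenode, and that it is handed to bin/hadoop via --config (or HADOOP_CONF_DIR).

```shell
# Hypothetical conf dir on a non-cluster machine; host/port/paths are examples.
mkdir -p /home/me/hadoop-conf
cat > /home/me/hadoop-conf/hadoop-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- points the client at the cluster's namenode -->
    <value>namenode.example.com:9000</value>
  </property>
</configuration>
EOF

# Now any hadoop command run here talks to the remote DFS:
$HADOOP_HOME/bin/hadoop --config /home/me/hadoop-conf dfs -put localfile /user/me/localfile

# Equivalently, set the environment variable instead of passing --config:
# export HADOOP_CONF_DIR=/home/me/hadoop-conf
```

This needs a live cluster to actually run against, of course; the point is just that the client machine needs nothing beyond the hadoop distribution and a config directory naming the namenode.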
As I mentioned in an earlier email, we routinely do NOT run puts from
datanodes, because the local datanode gets 'priority' for block placement:
the first replica of each block is written locally. If you are putting a
large file, that means you fill up the local node's disk. This may not be
ideal, depending on your requirements.