Either I am totally confused or this configuration stuff is confusing the hell out of me. I am pretty sure it is the former. Please, I am looking for advice here as to how I should do this. I have my fs.default.name set to hdfs://<host>:<port>, and in my JobConf setup I set the same value for fs.default.name. Now I have two options, and I would appreciate it if some expert could tell me which option I should take and why.
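To make that concrete, the relevant bit of my job setup looks roughly like this (old org.apache.hadoop.mapred API; the class name is just illustrative and <host>:<port> stands in for my namenode address):

import org.apache.hadoop.mapred.JobConf;

public class WordCountSetup {
  public static JobConf newConf() {
    JobConf conf = new JobConf(WordCountSetup.class);
    conf.setJobName("wordcount");
    // Mirror the fs.default.name value from hadoop-site.xml in the job's
    // own configuration; <host>:<port> is a placeholder for the namenode.
    conf.set("fs.default.name", "hdfs://<host>:<port>");
    return conf;
  }
}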
(1) Set fs.default.name to hdfs://<host>:<port> and also specify it in the JobConf configuration. Copy my sample input file into HDFS from my local file system using "bin/hadoop fs -put", then specify that file to my WordCount sample as input. Should I specify this file with the hdfs:// scheme?

(2) Set fs.default.name to file://<host>:<port> and also specify it in the JobConf configuration. Just specify the input path to the WordCount sample, and everything should work as long as the path is available to all machines in the cluster?

Which way should I go?
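In code, the two options I am weighing look roughly like this (still the old mapred API; the host/port, paths and class name are made up, and I have left out the WordCount mapper/reducer setup):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCountDriver.class);
    conf.setJobName("wordcount");

    // Option (1): run against HDFS. The input was copied in beforehand with
    //   bin/hadoop fs -put input.txt /user/avinash/input.txt
    conf.set("fs.default.name", "hdfs://<host>:<port>");
    FileInputFormat.setInputPaths(conf,
        new Path("hdfs://<host>:<port>/user/avinash/input.txt"));
    // Since fs.default.name already points at HDFS, I assume a bare
    // "/user/avinash/input.txt" would resolve to the same place, so the
    // hdfs:// prefix may be redundant -- that is part of my question.

    // Option (2): run against the local filesystem instead, e.g.
    //   conf.set("fs.default.name", "file:///");
    //   FileInputFormat.setInputPaths(conf, new Path("/shared/input.txt"));
    // (I have usually seen the local filesystem written as file:/// rather
    // than file://<host>:<port>, so that part is a guess too.) I assume this
    // only works if /shared/input.txt is visible at the same path on every
    // node in the cluster.

    FileOutputFormat.setOutputPath(conf, new Path("/user/avinash/wordcount-out"));
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    // The WordCount mapper and reducer classes would be set here as well.

    JobClient.runJob(conf);
  }
}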
Thanks
Avinash

On 5/29/07, Phantom <[EMAIL PROTECTED]> wrote:
Yes it is. Thanks
A

On 5/29/07, Doug Cutting <[EMAIL PROTECTED]> wrote:
>
> Phantom wrote:
> > Is there a workaround? I want to run the WordCount sample against a
> > file on my local filesystem. If this is not possible do I need to put
> > my file into HDFS and then point that location to my program?
>
> Is your local filesystem accessible to all nodes in your system?
>
> Doug
>
