I did run it the way you suggested, but I am running into a slew of
ClassNotFoundExceptions for the MapClass. Exporting the CLASSPATH doesn't
seem to fix it. How do I get around it?

Thanks
Avinash


On 5/29/07 1:30 PM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:

> Phantom wrote:
>> > (1) Set my fs.default.name to hdfs://<host>:<port> and also specify it
>> > in the JobConf configuration. Copy my sample input file into HDFS using
>> > "bin/hadoop fs -put" from my local file system. I then need to specify this
>> > file to my WordCount sample as input. Should I specify this file with the
>> > hdfs:// directive?
>> >
>> > (2) Set my fs.default.name to file://<host>:<port> and also specify it
>> > in the JobConf configuration. Just specify the input path to the WordCount
>> > sample and everything should work if the path is available to all machines
>> > in the cluster?
>> >
>> > Which way should I go?
> 
> Either should work.  So should a third option, which is to have your job
> input in the non-default filesystem, but there's currently a bug that
> prevents that from working.  But the above two should work.  The second
> assumes that the input is available on the same path in the native
> filesystem on all nodes.
> 
> When naming files in the default filesystem you do not need to specify
> their filesystem, since it is the default, but it is not an error to
> specify it.
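
For example, with fs.default.name set to hdfs://namenode:9000 (host and
port here are just placeholders for whatever your config uses), these two
commands refer to the same directory:

    bin/hadoop fs -ls /user/avinash/wordcount/input
    bin/hadoop fs -ls hdfs://namenode:9000/user/avinash/wordcount/input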
> 
> The most common mode of distributed operation is (1): use an HDFS
> filesystem as your fs.default.name, copy your initial input into that
> filesystem with 'bin/hadoop fs -put localPath hdfsPath', then specify
> 'hdfsPath' as your job's input.  The "hdfs://host:port" is not required
> at this point, since it is the default.
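
A rough end-to-end sketch of option (1); the host, port, paths, jar and
class names below are placeholders, assuming the job is packaged into its
own jar:

    <!-- hadoop-site.xml on the client and cluster nodes -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode:9000</value>
    </property>

    # copy the local input into HDFS, then hand the HDFS path to the job
    bin/hadoop fs -put /tmp/books.txt /user/avinash/wordcount/input
    bin/hadoop jar wordcount.jar WordCount \
        /user/avinash/wordcount/input /user/avinash/wordcount/output

and the driver sets the same paths on the JobConf (org.apache.hadoop.mapred
API of that era), along these lines:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class WordCount {
      public static void main(String[] args) throws Exception {
        // the jar containing this class is submitted as the job jar
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(MapClass.class);   // the sample's mapper
        conf.setReducerClass(Reduce.class);    // the sample's reducer
        conf.setInputPath(new Path(args[0]));  // /user/avinash/wordcount/input
        conf.setOutputPath(new Path(args[1])); // /user/avinash/wordcount/output
        JobClient.runJob(conf);
      }
    }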
> 
> Doug

