I am trying to run the WordCount sample against a file on my local file
system, so I launch the program as:

   java -D/home/alakshman/hadoop-0.12.3/conf org.apache.hadoop.examples.WordCount -m 10 -r 4 ~/test2.dat /tmp/out-dir

When I run this, I get the following in the jobtracker log file. What
should I be doing to fix this?

2007-05-25 14:41:32,733 INFO org.apache.hadoop.mapred.TaskInProgress: Error
from task_0001_m_000000_3: java.lang.IllegalArgumentException: Wrong FS:
file:/home/alakshman/test2.dat, expected:
hdfs://dev030.sctm.facebook.com:9000
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:216)
        at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.getPath(DistributedFileSystem.java:110)
        at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.exists(DistributedFileSystem.java:170)
        at org.apache.hadoop.fs.FilterFileSystem.exists(FilterFileSystem.java:168)
        at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:331)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:245)
        at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:54)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:139)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
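
Based on the explanation quoted below, the "Wrong FS" error means the
unqualified input path is being resolved against the default filesystem
(hdfs://dev030.sctm.facebook.com:9000), so the map tasks look for the file
in HDFS rather than on the local disk. A minimal sketch of two possible
fixes, assuming the bin/hadoop launcher and the examples jar that ships
with the 0.12.3 build (the exact jar name may differ). Either copy the
input into HDFS first and refer to it by its HDFS name:

   bin/hadoop fs -put ~/test2.dat test2.dat
   bin/hadoop jar hadoop-0.12.3-examples.jar wordcount -m 10 -r 4 test2.dat /tmp/out-dir

or, if every node can see the same local path, qualify both paths
explicitly so they are not resolved against HDFS:

   bin/hadoop jar hadoop-0.12.3-examples.jar wordcount -m 10 -r 4 file:///home/alakshman/test2.dat file:///tmp/out-dir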



On 5/25/07 2:31 PM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:

> Phantom wrote:
>> If my Map job is going to process a file does it have to be in HDFS
> 
> No, but they usually are.  Job inputs are resolved relative to the
> default filesystem.  So, if you've configured the default filesystem to
> be HDFS, and you pass a filename that's not qualified by a filesystem as
> the input to your job, then your input should be in HDFS.
> 
> But inputs don't have to be in the default filesystem, nor must they be
> in HDFS.  They need to be in a filesystem that's available to all nodes.
> They could be in NFS, S3, or Ceph instead of HDFS.  They could even be
> in a non-default HDFS system.
> 
>> and if so how do I get it there ?
> 
> If HDFS is configured as your default filesystem:
> 
>    bin/hadoop fs -put localFileName nameInHdfs
> 
> Doug
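
To make the resolution rule concrete, a rough sketch of how different path
arguments would be interpreted with HDFS as the default filesystem (the
/user/<username> working directory is the usual HDFS convention, and
mybucket/othernn are made-up names):

   test2.dat                        -> hdfs://dev030.sctm.facebook.com:9000/user/alakshman/test2.dat
   /tmp/out-dir                     -> hdfs://dev030.sctm.facebook.com:9000/tmp/out-dir
   file:///home/alakshman/test2.dat -> the local filesystem on each node
   s3://mybucket/test2.dat          -> a file stored in S3
   hdfs://othernn:9000/test2.dat    -> a second, non-default HDFS instance

After copying a file in with -put, it can be listed to confirm it arrived:

   bin/hadoop fs -ls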
