RE: hadoop does not see my input file

Devaraj Das Sat, 02 Jun 2007 23:07:39 -0700

I would like to start from scratch on this one. Here are the steps I would
like you to follow (you might already be doing all of them, but let's be
sure we are on the same page):
1) Do "bin/hadoop namenode -format"
2) Run bin/start-dfs.sh
3) Check that dfs started up fine. Open http://rosetta8:50070/ in a browser
and confirm that the datanodes rosetta9 and rosetta10 are listed.
4) Now run "bin/hadoop dfs -put <path-to-some-local-dir> <dfs-dir>". For
example, you could run "bin/hadoop dfs -put $HADOOP_HOME/conf /tmp/in-dir".
This should complete without any errors.
5) Assuming that your local-dir is non-empty, you should see some entries if
you do "bin/hadoop dfs -ls /tmp/in-dir" and that will mean that your dfs is
working fine.
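
Steps 4 and 5 can be scripted roughly as follows. This is only a sketch: the
function name check_dfs_put and the HADOOP variable are made up for
illustration, and it assumes you run it from $HADOOP_HOME after start-dfs.sh.
The only Hadoop commands it uses are the "dfs -put" and "dfs -ls" calls from
the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): copy a local directory into DFS
# and confirm that the DFS listing is non-empty.
# HADOOP defaults to bin/hadoop; override it if your install lives elsewhere.
HADOOP="${HADOOP:-bin/hadoop}"

check_dfs_put() {
    local_dir="$1"
    dfs_dir="$2"

    # Step 4: the put must finish without errors.
    if ! "$HADOOP" dfs -put "$local_dir" "$dfs_dir"; then
        echo "put failed: check the namenode log" >&2
        return 1
    fi

    # Step 5: the listing should show entries, not "Found 0 items".
    listing=$("$HADOOP" dfs -ls "$dfs_dir") || return 1
    echo "$listing"
    case "$listing" in
        *"Found 0 items"*)
            echo "DFS directory is empty" >&2
            return 1
            ;;
    esac
    return 0
}
```

It could be called as: check_dfs_put "$HADOOP_HOME/conf" /tmp/in-dir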


If any of these steps fails, do "tail <path-to-log-file-of-namenode>" and
see what exceptions show up there. In the default settings, the log
directory is $HADOOP_HOME/logs and the namenode log file is easily
identifiable from the file names there. Let us know the exceptions.
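
To pull the exceptions out of the log quickly, something like the sketch
below may help. The function name show_namenode_exceptions is made up, and
the "*namenode*.log" pattern is an assumption about how the log files are
named in your install; adjust it to whatever you actually see in the log
directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): print the most recent lines that
# mention an exception in each namenode log.
# Assumption: namenode logs match *namenode*.log inside the log directory.
show_namenode_exceptions() {
    log_dir="${1:-$HADOOP_HOME/logs}"
    for f in "$log_dir"/*namenode*.log; do
        # Skip the unexpanded pattern when no log file matches.
        [ -f "$f" ] || continue
        echo "== $f"
        # -n prefixes each match with its line number in the log.
        grep -n "Exception" "$f" | tail -n 20 || true
    done
}
```

With no argument it looks in $HADOOP_HOME/logs; pass another directory as
the first argument if your logs live elsewhere.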

-----Original Message-----
From: Erdong (Roger) CHEN [mailto:[EMAIL PROTECTED] 
Sent: Sunday, June 03, 2007 2:46 AM
To: [email protected]
Subject: hadoop does not see my input file

Hi all,

Could anyone help me to figure out why hadoop does not see my input file?

I have three computers: rosetta8, rosetta9, and rosetta10. rosetta8 is
listed in masters; rosetta9 and rosetta10 are listed in slaves. I run
bin/start-dfs.sh and bin/start-mapred.sh on rosetta8. My hadoop-site.xml
is shown below. I am pretty sure that I followed the online installation
and configuration instructions, and the folder /tmp/in-dir/ is not empty.

I tried the following two commands:
./bin/hadoop dfs -ls /tmp/in-dir/
Found 0 items
./bin/hadoop dfs -ls /tmp/
Found 0 items

I tried both settings for mapred.job.tracker, local and rosetta8:50034.
Neither works.

<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
  <value>rosetta8:50034</value>
</property>

<property>
  <name>fs.default.name</name>
  <value>rosetta8:50033</value>
</property>

Command that I run:
./bin/hadoop jar hadoop-0.12.3-examples.jar wordcount -m 3 -r 2 /tmp/in-dir/ /tmp/out-dir/

Error message that I get:
org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : /tmp/in-dir
        at org.apache.hadoop.mapred.InputFormatBase.validateInput(InputFormatBase.java:138)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:326)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:148)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
[EMAIL PROTECTED]:~/hadoop-install/hadoop$ ./bin/hadoop jar hadoop-0.12.3-examples.jar wordcount -m 3 -r 2/afs/csail.mit.edu/u/e/edc/hadoop-install/hadoop/in-dir/ /tmp/out-dir/
ERROR: Integer expected instead of 2/afs/csail.mit.edu/u/e/edc/hadoop-install/hadoop/in-dir/
wordcount [-m <maps>] [-r <reduces>] <input> <output>
[EMAIL PROTECTED]:~/hadoop-install/hadoop$ ./bin/hadoop jar hadoop-0.12.3-examples.jar wordcount -m 3 -r 2 /afs/csail.mit.edu/u/e/edc/hadoop-install/hadoop/in-dir/ /tmp/out-dir/
org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : /afs/csail.mit.edu/u/e/edc/hadoop-install/hadoop/in-dir
        at org.apache.hadoop.mapred.InputFormatBase.validateInput(InputFormatBase.java:138)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:326)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:148)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)


Erdong Chen
