Andrew, there was a bug in my answer below. If your job is failing to open the file for writing rather than reading (which is not clear from the error below), then you will not see any files under /benchmarks/TestDFSIO/io_data. In that case step #2 is to check read/write permissions on the directory where you store the data.
- note that it is a good idea to install Hadoop as the same user that will run the MR jobs. I remember having similar permission issues when a different user was used to set it up (even though it was in the same group), so it is better to install Hadoop as the hadoop user and run jobs as that same user.

Alex

On Thu, Apr 15, 2010 at 2:28 AM, alex kamil <[email protected]> wrote:

> Andrew,
>
> 1. make sure you run "TestDFSIO write" test before "TestDFSIO read"
> 2. run hadoop fs -ls /benchmarks/ and see if the files are actually there
> 3. run hadoop dfsadmin -report see if the cluster is alive/no dead nodes
> 4. try a simple copyFromLocal and see if it works
>
> if the answers to all above are "yes"
> - chk the file system, if you used the defaults it will probably write to
> /tmp (i'm not familiar with specific Hadoop/EC2 package you use)
> otherwise see if it writes into directory where your user/group has enough
> permissions
>
> if you get stuck i would even try a different hadoop image, i think there
> are a bunch of them on AWS and you can switch in a couple of min
> you can also try cloudera package with all the bells and whistles
> if it causes problems i would try a clean install from apache website
>
> this is more of a survival guide, may be there is a simpler fix that i'm
> not aware of, so pls share your findings
>
> Cheers
> Alex
>
>
> On Thu, Apr 15, 2010 at 8:01 AM, Andrew Nguyen <
> [email protected]> wrote:
>
>> And, I'm getting the following errors:
>>
>> 10/04/15 06:00:50 INFO mapred.JobClient: Task Id :
>> attempt_201004150557_0001_m_000000_1, Status : FAILED
>> java.io.IOException: Cannot open filename
>> /benchmarks/TestDFSIO/io_data/test_io_0
>>
>> A bunch show up and then the job fails. Running the job directly on the
>> cluster as the hadoop user.
>>
>> Any ideas?
>>
>> Thanks,
>> Andrew
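
For reference, the checks discussed in this thread could be strung together roughly as below. This is only a sketch, not a tested recipe for this cluster: the HADOOP and DRY_RUN variables, the run and check_dfsio helpers, and the /benchmarks/write_probe path are all made up for illustration. It assumes the hadoop command is on the PATH and that the standard fs -ls, fs -touchz, fs -rm and dfsadmin -report commands are available in your version.

```shell
#!/bin/sh
# Sketch of the checks for the TestDFSIO "Cannot open filename" error.
# HADOOP and DRY_RUN are illustrative variables, not part of Hadoop itself.
HADOOP="${HADOOP:-hadoop}"
DATA_DIR=/benchmarks/TestDFSIO/io_data

# Print the command; also execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

check_dfsio() {
    # Did "TestDFSIO write" leave its files behind? The listing also
    # shows owner, group and mode of whatever is there.
    run "$HADOOP" fs -ls "$DATA_DIR"
    # Is the cluster alive, with no dead nodes?
    run "$HADOOP" dfsadmin -report
    # Can the current user write to HDFS at all?
    # /benchmarks/write_probe is a hypothetical scratch path.
    run "$HADOOP" fs -touchz /benchmarks/write_probe
    run "$HADOOP" fs -rm /benchmarks/write_probe
}
```

Source the file and call check_dfsio as the same user that submits the job. Setting DRY_RUN=1 first only prints the commands, which is handy for reviewing them before touching the cluster.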
