Hi Chris,

I would guess the IOException occurs because getReaders() is trying to open _logs as a file, when it is actually a directory. There is also a race condition in getReaders(): it lists the files and then iterates over them, and a file can disappear in between. You probably need to delete the _logs directory before passing the output directory to the second map; a rough sketch is below.
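Something along these lines should do it. This is only a minimal sketch against the old mapred API (0.17/0.18); the ReadPreviousOutput class and the openReaders() helper are just illustrative names, and note that deleting _logs throws away the job-history files Hadoop wrote there.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;

public class ReadPreviousOutput {
    // Remove the _logs directory from a completed job's output before
    // handing the directory to getReaders().
    public static SequenceFile.Reader[] openReaders(Configuration conf, Path outputDir)
            throws IOException {
        FileSystem fs = outputDir.getFileSystem(conf);
        Path logs = new Path(outputDir, "_logs");
        if (fs.exists(logs)) {
            fs.delete(logs, true); // recursive delete of the _logs directory
        }
        // getReaders() lists the directory and opens every entry as a
        // SequenceFile, so only the part-* files should remain at this point.
        return SequenceFileOutputFormat.getReaders(conf, outputDir);
    }
}

If you would rather not delete anything, you could instead list the directory yourself, keep only the part-* files, and open each one directly with new SequenceFile.Reader(fs, path, conf).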
The _logs directory is also created by Hadoop 0.18.0.

cheers
Barry

On Thursday 18 September 2008 05:49:30 Chris Dyer wrote:
> Hi all-
> I am having trouble with SequenceFileOutputFormat.getReaders on a Hadoop
> 0.17.2 cluster. I am trying to open, from within a second map process, a
> set of SequenceFiles that was created by a map process that has already
> completed. I pass in the job configuration of the running map process (not
> of the map process that created the sequence files) and the path to the
> output. When I run locally, this works fine, but when I run remotely on
> the cluster (using HDFS on the cluster), I get the following IOException:
>
> java.io.IOException: Cannot open filename /user/redpony/Model1.data.0/_logs
>
> However, the following works:
>
> hadoop dfs -ls /user/redpony/Model1.data.0/_logs
> Found 1 items
> /user/redpony/Model1.data.0/_logs/history  <dir>  2008-09-18 00:43  rwxrwxrwx  redpony  supergroup
>
> This is probably something dumb, and quite likely related to my settings
> not being configured properly, but I'm completely at a loss for how to
> proceed. Any ideas?
>
> Thanks!
> Chris
