It's a tiny bug in SequenceFileRecordReader. A cast to float is needed here:
return (in.getPosition() - start) / (end - start);
which should become:
return (in.getPosition() - start) / (float) (end - start);
The start field also needs to be assigned in the constructor:
this.start = split.getStart();
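
To show why the cast matters, here is a minimal standalone sketch. It is not the actual Hadoop code: the class and method names are made up for illustration, and it assumes start, end and the reader position are all long byte offsets. With two long operands Java does integer division, so the quotient is truncated to 0 until the position reaches the end of the split, which would match the 0% then 100% behaviour described below.

```java
// Hypothetical illustration only; ProgressSketch and its methods are not part of Hadoop.
public class ProgressSketch {

    // Mirrors the original expression: long / long is integer division,
    // so the result is truncated before it is widened to float.
    static float progressTruncated(long start, long end, long pos) {
        return (pos - start) / (end - start);
    }

    // Mirrors the proposed fix: casting the divisor to float forces
    // floating-point division, so intermediate positions give fractions.
    static float progressWithCast(long start, long end, long pos) {
        return (pos - start) / (float) (end - start);
    }

    public static void main(String[] args) {
        long start = 1000L; // assumed split start offset
        long end = 5000L;   // assumed split end offset
        for (long pos = start; pos <= end; pos += 1000L) {
            System.out.println("pos=" + pos
                + " truncated=" + progressTruncated(start, end, pos)
                + " withCast=" + progressWithCast(start, end, pos));
        }
    }
}
```

If start is never assigned it keeps its default value of 0, so for any split that does not begin at offset 0 the position would be measured from the wrong origin, which is presumably why the constructor assignment is needed as well.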
- Espen
(Sorry about this not being a patch ... windoze ... arg)
Andrzej Bialecki wrote:
Andrzej Bialecki wrote:
Hi all,
Is it just me, or is there something strange with Hadoop since ~0.10
or thereabouts? With older versions of Hadoop I would get a nice,
frequently updated progress status for each map task. What I'm seeing now
is that map tasks stay at 0.0% and then finally jump to 100.0% and
finish. Consequently, for jobs with a small number of long-running map
tasks, the progress update is very coarse.
As I understand it, this progress meter (in the absence of map tasks
explicitly setting the progress) was based on the RecordReader
reporting how much of the current split has been read. Is this
something that got broken along the way? If not, what's the reason for
this, and how can it be fixed?
Does anyone have a suggestion about this problem? It's rather
irritating - long-running tasks seem to be stuck at 0%, and only jump
to 100% at the end of the task. This happens with 0.11.2 as well.