Hi all,

I am trying to better understand the counters and logging of the fetch MapReduce job executed when crawling.

When looking at the job counters in the MapReduce web UI, I note the following counters and values:

  *Map input records 162,080*
  moved 345
  robots_denied 4,441
  robots_denied_maxcrawldelay 259
  *hitByTimeLimit 7,493*
  exception 3,801
  notmodified 2
  gone 48
  access_denied 1
  *success 93,583*
  temp_moved 3,068
  notfound 1,490

Summing all of these counters does not equal the total map input.

But when I go to the map task logs, at the end of each log there is a line stating:

  QueueFeeder finished: total *36651* records + hit by time limit :*20975*
  QueueFeeder finished: total *30248* records + hit by time limit :*25492*
  QueueFeeder finished: total *44257* records + hit by time limit :*4460*

Summing all of these numbers does equal the total map input (to within a few records). I also note that the total hit by time limit here is 50927, but the job counters show 7493.

Can anyone elaborate?

Thanks,
Amit.
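
For anyone who wants to check the arithmetic behind the two comparisons above, here is a minimal Python sketch. All numbers are copied from the post; the variable names are only illustrative and are not Nutch or Hadoop identifiers.

```python
# Job-level status counters as shown in the MapReduce web UI (copied from the post).
job_counters = {
    "moved": 345,
    "robots_denied": 4_441,
    "robots_denied_maxcrawldelay": 259,
    "hitByTimeLimit": 7_493,
    "exception": 3_801,
    "notmodified": 2,
    "gone": 48,
    "access_denied": 1,
    "success": 93_583,
    "temp_moved": 3_068,
    "notfound": 1_490,
}
map_input_records = 162_080

# (total records, hit by time limit) from the "QueueFeeder finished" line
# at the end of each of the three map task logs.
queue_feeder = [(36_651, 20_975), (30_248, 25_492), (44_257, 4_460)]

counter_sum = sum(job_counters.values())
feeder_total = sum(total for total, _ in queue_feeder)
feeder_time_limit = sum(hit for _, hit in queue_feeder)

print("sum of job counters:        ", counter_sum)
print("map input - counter sum:    ", map_input_records - counter_sum)
print("QueueFeeder total records:  ", feeder_total)
print("QueueFeeder hit by limit:   ", feeder_time_limit)
print("QueueFeeder total + limit:  ", feeder_total + feeder_time_limit)
print("log limit - counter limit:  ", feeder_time_limit - job_counters["hitByTimeLimit"])
```

With these inputs the job counters sum to 114,531, which is 47,549 short of the map input. The QueueFeeder figures sum to 111,156 + 50,927 = 162,083, which is within 3 records of the map input. The time-limit figure from the logs exceeds the hitByTimeLimit counter by 43,434.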

