Hi Eddy,

I faced a similar issue when I used a Pig script to fetch webpages for certain URLs. I could see the map phase showing 100% while it was still running. Since I was logging the page currently being fetched, I could see the process hadn't actually finished. It might be the same issue. So, you can add logging to check whether the task is actually stuck or the work is still going on.

Thanks
Pallavi
________________________________
From: Zhang Bingjun (Eddy) [mailto:[email protected]]
Sent: Monday, November 02, 2009 2:03 PM
To: [email protected]; [email protected]; [email protected]; [email protected]
Subject: too many 100% mapper does not complete / finish / commit

Dear hadoop fellows,

We have been using Hadoop-0.20.1 MapReduce to crawl some web data. In this case, we only have mappers to crawl data and save it into HDFS in a distributed way. No reducers are specified in the job conf.

The problem is that for every job, about one third of the mappers get stuck at 100% progress but never complete. If we look at the tasktracker log of those mappers, the last log entry is the key input INFO line, and no other logs are output after that. From the stdout log of a specific attempt of one of those mappers, we can see that the map function of the mapper has finished completely, so control of the execution should be somewhere in the MapReduce framework itself.

Does anyone have any clue about this problem? Is it because we didn't use any reducers? Since two thirds of the mappers could complete successfully and commit their output data into HDFS, I suspect the stuck mappers have something to do with the MapReduce framework code?

Any input will be appreciated. Thanks a lot!

Best regards,
Zhang Bingjun (Eddy)

E-mail: [email protected], [email protected], [email protected]
Tel No: +65-96188110 (M)
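For reference, a map-only job of the kind Eddy describes might look like the sketch below, using the old Hadoop 0.20 `mapred` API. The class names and the crawl logic are illustrative (not from Eddy's actual job); the two pieces relevant to this thread are `conf.setNumReduceTasks(0)`, which makes mapper output commit straight to HDFS, and the `Reporter` calls, which both surface per-URL status in the web UI (as Pallavi suggests) and keep a long-running fetch from being considered dead by the framework:

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical map-only crawl job (Hadoop 0.20, old "mapred" API).
public class CrawlJob {

    public static class CrawlMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        public void map(LongWritable key, Text url,
                        OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            // Shows which URL this attempt is working on in the task status,
            // so a "100% but running" task can be distinguished from a hung one.
            reporter.setStatus("fetching " + url);

            // ... fetch the page here (placeholder) ...

            // Signal liveness during a slow fetch so the framework does not
            // treat the task as stalled.
            reporter.progress();

            output.collect(url, new Text("fetched"));
        }
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(CrawlJob.class);
        conf.setJobName("crawl");
        conf.setMapperClass(CrawlMapper.class);
        // Map-only job: mapper output is written directly to HDFS.
        conf.setNumReduceTasks(0);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
```

This sketch requires the Hadoop 0.20.x jars on the classpath; it is a configuration outline rather than something runnable standalone.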
