See inline.

2013/1/11 Harsh J <[email protected]>

> If the per-record processing time is very high, you will need to
> periodically report a status. Without a status change report from the task
> to the tracker, it will be killed away as a dead task after a default
> timeout of 10 minutes (600s).
>
=====================> Do you mean that I should increase the timeout by
raising "mapred.task.timeout"?
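For reference, the knob in question is `mapred.task.timeout`, set in milliseconds (the default 600000 = 10 minutes matches the "600 seconds" in the logs below). A config sketch, not from the thread; the 30-minute value is only an example:

```xml
<!-- mapred-site.xml: raise the task liveness timeout (milliseconds).
     Default is 600000 (10 min); 0 disables the timeout entirely. -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
</property>
```

The same value can be passed per job with `-D mapred.task.timeout=1800000`. Raising it hides the symptom, though; reporting progress from the reducer is the more robust fix.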


> Also, beware of holding too much memory in a reduce JVM - you're still
> limited there. Best to let the framework do the sort or secondary sort.
>
=======================> Do you mean I should use the default value? My
current setting is: mapred.job.reduce.memory.mb = -1

>
>
> On Fri, Jan 11, 2013 at 10:58 AM, yaotian <[email protected]> wrote:
>
>> Yes, you are right. The data is GPS traces keyed by the corresponding uid.
>> The reducer sorts each user's records to produce output of the form: uid,
>> gps1, gps2, gps3, ...
>> Yes, the GPS data is big: 30 GB in total.
>>
>> How can I solve this?
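One common pattern for this situation (a sketch, not the poster's actual code, and written for Hadoop Streaming rather than a Java reducer): stream over each uid's points and write output incrementally instead of buffering everything in memory, and keep the task alive by printing a `reporter:status:` line to stderr every few thousand records, which Streaming forwards to the TaskTracker as a status update.

```python
# Sketch of a Hadoop Streaming reducer for "uid \t gps" records sorted by uid.
# It writes each uid's points incrementally (O(1) memory per key) and reports
# status every REPORT_EVERY records so the task is not killed after
# mapred.task.timeout (default 600s) of silence.
import sys

REPORT_EVERY = 10000

def report(msg, err=sys.stderr):
    # Hadoop Streaming interprets this stderr line as a task status update.
    err.write("reporter:status:%s\n" % msg)

def reduce_stream(lines, out=sys.stdout, err=sys.stderr):
    current_uid = None
    seen = 0
    for line in lines:
        uid, _, gps = line.rstrip("\n").partition("\t")
        if uid != current_uid:
            if current_uid is not None:
                out.write("\n")            # finish previous uid's record
            out.write("%s\t%s" % (uid, gps))
            current_uid = uid
        else:
            out.write(",%s" % gps)         # append point without buffering
        seen += 1
        if seen % REPORT_EVERY == 0:
            report("processed %d records" % seen, err)
    if current_uid is not None:
        out.write("\n")

# In a real streaming job the entry point would be: reduce_stream(sys.stdin)
```

Because the framework already delivers values grouped and sorted by key, nothing needs to be held per user beyond the line currently being written.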
>>
>>
>>
>> 2013/1/11 Mahesh Balija <[email protected]>
>>
>>> Hi,
>>>
>>>           2 reducers completed successfully and 1498 were killed. I
>>> assume there is a data issue: either the data is huge, or there is a
>>> problem with the data you are trying to process.
>>>           One possibility is that you have many values associated with a
>>> single key, which can cause this kind of issue depending on the operation
>>> you perform in your reducer.
>>>           Can you put some logging in your reducer and try to trace what
>>> is happening?
>>>
>>> Best,
>>> Mahesh Balija,
>>> Calsoft Labs.
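On the "put some logging in your reducer" suggestion: besides plain stderr logging, a lightweight option in a Streaming job is to bump custom counters, which show up on the job page without digging through task logs. A sketch, assuming a Streaming reducer; the group and counter names here are made up for illustration:

```python
# Sketch: instrumenting a Hadoop Streaming reducer with custom counters.
# Streaming treats "reporter:counter:<group>,<counter>,<amount>" stderr
# lines as counter increments, visible in the JobTracker web UI.
import sys

def incr_counter(group, name, amount=1, err=sys.stderr):
    err.write("reporter:counter:%s,%s,%d\n" % (group, name, amount))

# Example calls inside the reduce loop (hypothetical names):
#   incr_counter("GpsJob", "ReduceRecords")   # one per record processed
#   incr_counter("GpsJob", "OversizedKeys")   # flag a uid with huge value count
```

Comparing a records-processed counter against the expected input size makes it easy to see whether one oversized key is where the reducers stall.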
>>>
>>>
>>> On Fri, Jan 11, 2013 at 8:53 AM, yaotian <[email protected]> wrote:
>>>
>>>> I have 1 Hadoop master, where the namenode runs, and 2 slaves, where
>>>> the datanodes run.
>>>>
>>>> If I run a small dataset, around 200 MB, the job completes.
>>>>
>>>> But with 30 GB of data the map phase finishes, while the reduce phase
>>>> reports errors. Any suggestions?
>>>>
>>>>
>>>> This is the information.
>>>>
>>>> *Black-listed TaskTrackers:* 1
>>>> <http://23.20.27.135:9003/jobblacklistedtrackers.jsp?jobid=job_201301090834_0041>
>>>> ------------------------------
>>>> Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
>>>> map     100.00%     450        0        0        450       0       0 / 1
>>>> reduce  100.00%     1500       0        0        2         1498    12 / 3
>>>> (per-task details: http://23.20.27.135:9003/jobtasks.jsp?jobid=job_201301090834_0041)
>>>>
>>>>
>>>> Task | Complete | Status | Start Time | Finish Time | Errors | Counters
>>>> task_201301090834_0041_r_000001<http://23.20.27.135:9003/taskdetails.jsp?tipid=task_201301090834_0041_r_000001>
>>>> 0.00%
>>>> 10-Jan-2013 04:18:54
>>>> 10-Jan-2013 06:46:38 (2hrs, 27mins, 44sec)
>>>>
>>>> Task attempt_201301090834_0041_r_000001_0 failed to report status for 600 
>>>> seconds. Killing!
>>>> Task attempt_201301090834_0041_r_000001_1 failed to report status for 602 
>>>> seconds. Killing!
>>>> Task attempt_201301090834_0041_r_000001_2 failed to report status for 602 
>>>> seconds. Killing!
>>>> Task attempt_201301090834_0041_r_000001_3 failed to report status for 602 
>>>> seconds. Killing!
>>>>
>>>>
>>>> 0<http://23.20.27.135:9003/taskstats.jsp?tipid=task_201301090834_0041_r_000001>
>>>> task_201301090834_0041_r_000002<http://23.20.27.135:9003/taskdetails.jsp?tipid=task_201301090834_0041_r_000002>
>>>> 0.00%
>>>> 10-Jan-2013 04:18:54
>>>> 10-Jan-2013 06:46:38 (2hrs, 27mins, 43sec)
>>>>
>>>> Task attempt_201301090834_0041_r_000002_0 failed to report status for 601 
>>>> seconds. Killing!
>>>> Task attempt_201301090834_0041_r_000002_1 failed to report status for 600 
>>>> seconds. Killing!
>>>>
>>>>
>>>> 0<http://23.20.27.135:9003/taskstats.jsp?tipid=task_201301090834_0041_r_000002>
>>>> task_201301090834_0041_r_000003<http://23.20.27.135:9003/taskdetails.jsp?tipid=task_201301090834_0041_r_000003>
>>>> 0.00%
>>>> 10-Jan-2013 04:18:57
>>>> 10-Jan-2013 06:46:38 (2hrs, 27mins, 41sec)
>>>>
>>>> Task attempt_201301090834_0041_r_000003_0 failed to report status for 602 
>>>> seconds. Killing!
>>>> Task attempt_201301090834_0041_r_000003_1 failed to report status for 602 
>>>> seconds. Killing!
>>>> Task attempt_201301090834_0041_r_000003_2 failed to report status for 602 
>>>> seconds. Killing!
>>>>
>>>>
>>>> 0<http://23.20.27.135:9003/taskstats.jsp?tipid=task_201301090834_0041_r_000003>
>>>> task_201301090834_0041_r_000005<http://23.20.27.135:9003/taskdetails.jsp?tipid=task_201301090834_0041_r_000005>
>>>> 0.00%
>>>> 10-Jan-2013 06:11:07
>>>> 10-Jan-2013 06:46:38 (35mins, 31sec)
>>>>
>>>> Task attempt_201301090834_0041_r_000005_0 failed to report status for 600 
>>>> seconds. Killing!
>>>>
>>>>
>>>> 0<http://23.20.27.135:9003/taskstats.jsp?tipid=task_201301090834_0041_r_000005>
>>>>
>>>
>>>
>>
>
>
> --
> Harsh J
>
