[ 
https://issues.apache.org/jira/browse/TEZ-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15306219#comment-15306219
 ] 

Feng Yuan commented on TEZ-3273:
--------------------------------

I found that it happens because many containers were allocated to the same task attempt:
2016-05-30 02:10:38,856 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Assigning container to task, container=Container: 
[ContainerId: container_1463493135662_117553_01_000032, NodeId: 
bjlg-44p91-hadoop77.bfdabc.com:35404, NodeHttpAddress: 
bjlg-44p91-hadoop77.bfdabc.com:8042, Resource: <memory:3072, vCores:1>, 
Priority: 2, Token: Token { kind: ContainerToken, service: 192.168.44.91:35404 
}, ], task=attempt_1463493135662_117553_1_00_000023_0, 
containerHost=bjlg-44p91-hadoop77.bfdabc.com, localityMatchType=RackLocal, 
matchedLocation=/rack1, honorLocalityFlags=false, reusedContainer=false, 
delayedContainers=7, containerResourceMemory=3072, containerResourceVCores=1
[hadoop@bjlg-44p40-hadoop27 yuanfeng]$ cat 553 | grep 
"task=attempt_1463493135662_117553_1_00_000024_0, containerHost="
2016-05-30 02:10:38,881 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Assigning container to task, container=Container: 
[ContainerId: container_1463493135662_117553_01_000044, NodeId: 
bjlg-44p62-hadoop48.bfdabc.com:22928, NodeHttpAddress: 
bjlg-44p62-hadoop48.bfdabc.com:8042, Resource: <memory:3072, vCores:1>, 
Priority: 2, Token: Token { kind: ContainerToken, service: 192.168.44.62:22928 
}, ], task=attempt_1463493135662_117553_1_00_000024_0, 
containerHost=bjlg-44p62-hadoop48.bfdabc.com, localityMatchType=RackLocal, 
matchedLocation=/rack0, honorLocalityFlags=false, reusedContainer=false, 
delayedContainers=2, containerResourceMemory=3072, containerResourceVCores=1
2016-05-30 02:10:43,916 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Assigning container to task, container=Container: 
[ContainerId: container_1463493135662_117553_01_000048, NodeId: 
bjlg-44p50-hadoop36.bfdabc.com:40906, NodeHttpAddress: 
bjlg-44p50-hadoop36.bfdabc.com:8042, Resource: <memory:3072, vCores:1>, 
Priority: 2, Token: Token { kind: ContainerToken, service: 192.168.44.50:40906 
}, ], task=attempt_1463493135662_117553_1_00_000024_0, 
containerHost=bjlg-44p50-hadoop36.bfdabc.com, localityMatchType=RackLocal, 
matchedLocation=/rack0, honorLocalityFlags=false, reusedContainer=false, 
delayedContainers=4, containerResourceMemory=3072, containerResourceVCores=1
2016-05-30 02:10:44,415 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Assigning container to task, container=Container: 
[ContainerId: container_1463493135662_117553_01_000007, NodeId: 
bjlg-44p82-hadoop68.bfdabc.com:63544, NodeHttpAddress: 
bjlg-44p82-hadoop68.bfdabc.com:8042, Resource: <memory:3072, vCores:1>, 
Priority: 2, Token: Token { kind: ContainerToken, service: 192.168.44.82:63544 
}, ], task=attempt_1463493135662_117553_1_00_000024_0, 
containerHost=bjlg-44p82-hadoop68.bfdabc.com, localityMatchType=RackLocal, 
matchedLocation=/rack0, honorLocalityFlags=false, reusedContainer=true, 
delayedContainers=5, containerResourceMemory=3072, containerResourceVCores=1
2016-05-30 02:10:44,419 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Assigning container to task, container=Container: 
[ContainerId: container_1463493135662_117553_01_000022, NodeId: 
bjlg-44p39-hadoop26.bfdabc.com:4334, NodeHttpAddress: 
bjlg-44p39-hadoop26.bfdabc.com:8042, Resource: <memory:3072, vCores:1>, 
Priority: 2, Token: Token { kind: ContainerToken, service: 192.168.44.39:4334 
}, ], task=attempt_1463493135662_117553_1_00_000024_0, 
containerHost=bjlg-44p39-hadoop26.bfdabc.com, localityMatchType=RackLocal, 
matchedLocation=/rack0, honorLocalityFlags=false, reusedContainer=true, 
delayedContainers=2, containerResourceMemory=3072, containerResourceVCores=1
2016-05-30 02:10:44,421 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: Assigning container to task, container=Container: 
[ContainerId: container_1463493135662_117553_01_000054, NodeId: 
bjlg-44p43-hadoop29.bfdabc.com:65059, NodeHttpAddress: 
bjlg-44p43-hadoop29.bfdabc.com:8042, Resource: <memory:3072, vCores:1>, 
Priority: 2, Token: Token { kind: ContainerToken, service: 192.168.44.43:65059 
}, ], task=attempt_1463493135662_117553_1_00_000024_0, 
containerHost=bjlg-44p43-hadoop29.bfdabc.com, localityMatchType=RackLocal, 
matchedLocation=/rack0, honorLocalityFlags=false, reusedContainer=false, 
delayedContainers=0, containerResourceMemory=3072, containerResourceVCores=1
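
For anyone who wants to check their own AM log the same way, the grep above can be generalized to every attempt at once. This is only a rough sketch: it assumes the "Assigning container to task" entry format shown in the excerpt above (possibly wrapped across lines, as in this mail), and the function names and regex are my own, not anything from Tez.

```python
import re
from collections import defaultdict

# One "Assigning container to task" entry: capture the container id and
# the task attempt id. Format assumed from the log excerpt in this comment.
ASSIGN_RE = re.compile(
    r"Assigning container to task.*?"
    r"ContainerId:\s*(container_\w+).*?"
    r"task=(attempt_\w+)"
)


def containers_per_attempt(log_text):
    """Map each task attempt id to the containers assigned to it, in log order."""
    # Entries may be wrapped over several lines; collapse all whitespace first.
    flat = " ".join(log_text.split())
    assigned = defaultdict(list)
    for container, attempt in ASSIGN_RE.findall(flat):
        assigned[attempt].append(container)
    return dict(assigned)


def multiply_assigned(log_text):
    """Keep only the attempts that were given more than one container."""
    return {
        attempt: containers
        for attempt, containers in containers_per_attempt(log_text).items()
        if len(containers) > 1
    }
```

Passing the text of the log file (here `553`) to `multiply_assigned` should list only the suspicious attempts; on the excerpt above it should report attempt_1463493135662_117553_1_00_000024_0 with five containers.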

> app.TaskAttemptListenerImpTezDag: Attempt is not recognized for heartbeat in 
> tez 0.5.2,cause job hang
> -----------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-3273
>                 URL: https://issues.apache.org/jira/browse/TEZ-3273
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.2
>         Environment: hive0.14 hadoop2.6
>            Reporter: Feng Yuan
>            Priority: Critical
>         Attachments: app_logs.zip
>
>
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> The job stays stuck at this point forever.



