[
https://issues.apache.org/jira/browse/SPARK-21733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136260#comment-16136260
]
Jepson commented on SPARK-21733:
--------------------------------
*The NodeManager log detail below shows the ResourceManager's heartbeat response listing the containers under {{containers_to_cleanup}}, after which the NodeManager dispatches {{KILL_CONTAINER}} events for them:*
{code:java}
2017-08-22 11:20:07,984 DEBUG
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Constructing ProcessTree for : PID = 16766 ContainerId =
container_e56_1503371613444_0001_01_000003
2017-08-22 11:20:07,992 DEBUG
org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: [ 17066 16766 ]
2017-08-22 11:20:07,992 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Memory usage of ProcessTree 16766 for container-id
container_e56_1503371613444_0001_01_000003: 580.4 MB of 3 GB physical memory
used; 4.6 GB of 6.3 GB virtual memory used
2017-08-22 11:20:08,716 DEBUG
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Node's
health-status : true,
2017-08-22 11:20:08,717 DEBUG
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 3
container statuses: [ContainerStatus: [ContainerId:
container_e56_1503371613444_0001_01_000001, State: RUNNING, Diagnostics: ,
ExitStatus: -1000, ], ContainerStatus: [ContainerId:
container_e56_1503371613444_0001_01_000002, State: RUNNING, Diagnostics: ,
ExitStatus: -1000, ], ContainerStatus: [ContainerId:
container_e56_1503371613444_0001_01_000003, State: RUNNING, Diagnostics: ,
ExitStatus: -1000, ]]
2017-08-22 11:20:08,717 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 102:
Call -> hadoop37.jiuye/192.168.17.37:8031: nodeHeartbeat {node_status { node_id
{ host: "hadoop44.jiuye" port: 8041 } response_id: 389 containersStatuses {
container_id { app_attempt_id { application_id { id: 1 cluster_timestamp:
1503371613444 } attemptId: 1 } id: 61572651155457 } state: C_RUNNING
diagnostics: "" exit_status: -1000 } containersStatuses { container_id {
app_attempt_id { application_id { id: 1 cluster_timestamp: 1503371613444 }
attemptId: 1 } id: 61572651155458 } state: C_RUNNING diagnostics: ""
exit_status: -1000 } containersStatuses { container_id { app_attempt_id {
application_id { id: 1 cluster_timestamp: 1503371613444 } attemptId: 1 } id:
61572651155459 } state: C_RUNNING diagnostics: "" exit_status: -1000 }
nodeHealthStatus { is_node_healthy: true health_report: ""
last_health_report_time: 1503371969299 } }
last_known_container_token_master_key { key_id: -966413074 bytes:
"a\021&\346gs\031n" } last_known_nm_token_master_key { key_id: -1126930838
bytes: "$j@\322\331dr`" }}
2017-08-22 11:20:08,717 DEBUG org.apache.hadoop.ipc.Client: IPC Client
(1778801068) connection to hadoop37.jiuye/192.168.17.37:8031 from yarn sending
#851
2017-08-22 11:20:08,720 DEBUG org.apache.hadoop.ipc.Client: IPC Client
(1778801068) connection to hadoop37.jiuye/192.168.17.37:8031 from yarn got
value #851
2017-08-22 11:20:08,720 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call:
nodeHeartbeat took 3ms
2017-08-22 11:20:08,720 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 102:
Response <- hadoop37.jiuye/192.168.17.37:8031: nodeHeartbeat {response_id: 390
nodeAction: NORMAL containers_to_cleanup { app_attempt_id { application_id {
id: 1 cluster_timestamp: 1503371613444 } attemptId: 1 } id: 61572651155458 }
containers_to_cleanup { app_attempt_id { application_id { id: 1
cluster_timestamp: 1503371613444 } attemptId: 1 } id: 61572651155459 }
nextHeartBeatInterval: 1000}
2017-08-22 11:20:08,721 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher:
Dispatching the event
org.apache.hadoop.yarn.server.nodemanager.CMgrCompletedContainersEvent.EventType:
FINISH_CONTAINERS
2017-08-22 11:20:08,722 DEBUG
org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerKillEvent.EventType:
KILL_CONTAINER
2017-08-22 11:20:08,722 DEBUG
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Processing container_e56_1503371613444_0001_01_000002 of type KILL_CONTAINER
2017-08-22 11:20:08,722 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_e56_1503371613444_0001_01_000002 transitioned from RUNNING
to KILLING
2017-08-22 11:20:08,722 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher:
Dispatching the event
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerKillEvent.EventType:
KILL_CONTAINER
2017-08-22 11:20:08,722 DEBUG
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Processing container_e56_1503371613444_0001_01_000003 of type KILL_CONTAINER
2017-08-22 11:20:08,722 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_e56_1503371613444_0001_01_000003 transitioned from RUNNING
to KILLING
2017-08-22 11:20:08,722 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher:
Dispatching the event
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType:
CLEANUP_CONTAINER
2017-08-22 11:20:08,722 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
Cleaning up container container_e56_1503371613444_0001_01_000002
2017-08-22 11:20:08,722 DEBUG
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
Marking container container_e56_1503371613444_0001_01_000002 as inactive
{code}
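For reference, the KILLING transitions in the excerpt above can be pulled out mechanically. This is a minimal sketch, not a YARN API: the log lines are copied verbatim from the excerpt (re-joined onto single lines), and the regular expression is my own assumption about the `Container ... transitioned from X to Y` message format.

```python
import re

# The two INFO transition lines from the NodeManager log excerpt above,
# each re-joined onto a single line.
LOG = (
    "2017-08-22 11:20:08,722 INFO "
    "org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: "
    "Container container_e56_1503371613444_0001_01_000002 transitioned from RUNNING to KILLING\n"
    "2017-08-22 11:20:08,722 INFO "
    "org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: "
    "Container container_e56_1503371613444_0001_01_000003 transitioned from RUNNING to KILLING\n"
)

# Extract (container-id, from-state, to-state) from each transition line.
TRANSITION = re.compile(r"Container (container_\S+) transitioned from (\w+) to (\w+)")

transitions = [m.groups() for m in TRANSITION.finditer(LOG)]
for cid, src, dst in transitions:
    print(f"{cid}: {src} -> {dst}")
```

Both containers named in the ResourceManager's {{containers_to_cleanup}} list (…000002 and …000003) go RUNNING to KILLING in the same heartbeat cycle, which matches the executor-side SIGTERM below.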
> ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
> -----------------------------------------------------------------
>
> Key: SPARK-21733
> URL: https://issues.apache.org/jira/browse/SPARK-21733
> Project: Spark
> Issue Type: Bug
> Components: DStreams
> Affects Versions: 2.1.1
> Environment: Apache Spark 2.1.1
> CDH5.12.0 Yarn
> Reporter: Jepson
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> Kafka + Spark Streaming throws this error:
> {code:java}
> 17/08/15 09:34:14 INFO memory.MemoryStore: Block broadcast_8003_piece0 stored
> as bytes in memory (estimated size 1895.0 B, free 1643.2 MB)
> 17/08/15 09:34:14 INFO broadcast.TorrentBroadcast: Reading broadcast variable
> 8003 took 11 ms
> 17/08/15 09:34:14 INFO memory.MemoryStore: Block broadcast_8003 stored as
> values in memory (estimated size 2.9 KB, free 1643.2 MB)
> 17/08/15 09:34:14 INFO kafka010.KafkaRDD: Beginning offset 10130733 is the
> same as ending offset skipping kssh 5
> 17/08/15 09:34:14 INFO executor.Executor: Finished task 7.0 in stage 8003.0
> (TID 64178). 1740 bytes result sent to driver
> 17/08/15 09:34:21 INFO storage.BlockManager: Removing RDD 8002
> 17/08/15 09:34:21 INFO executor.CoarseGrainedExecutorBackend: Got assigned
> task 64186
> 17/08/15 09:34:21 INFO executor.Executor: Running task 7.0 in stage 8004.0
> (TID 64186)
> 17/08/15 09:34:21 INFO broadcast.TorrentBroadcast: Started reading broadcast
> variable 8004
> 17/08/15 09:34:21 INFO memory.MemoryStore: Block broadcast_8004_piece0 stored
> as bytes in memory (estimated size 1895.0 B, free 1643.2 MB)
> 17/08/15 09:34:21 INFO broadcast.TorrentBroadcast: Reading broadcast variable
> 8004 took 8 ms
> 17/08/15 09:34:21 INFO memory.MemoryStore: Block broadcast_8004 stored as
> values in memory (estimated size 2.9 KB, free 1643.2 MB)
> 17/08/15 09:34:21 INFO kafka010.KafkaRDD: Beginning offset 10130733 is the
> same as ending offset skipping kssh 5
> 17/08/15 09:34:21 INFO executor.Executor: Finished task 7.0 in stage 8004.0
> (TID 64186). 1740 bytes result sent to driver
> 17/08/15 09:34:29 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED
> SIGNAL TERM
> 17/08/15 09:34:29 INFO storage.DiskBlockManager: Shutdown hook called
> 17/08/15 09:34:29 INFO util.ShutdownHookManager: Shutdown hook called
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)