Can you share your entire log file? Why did your operator get killed
the first time? The error mentioned above seems to occur at recovery time.
-Priyanka
On Tue, Jun 6, 2017 at 12:44 AM, Guilherme Hott
<guilhermeh...@gmail.com> wrote:
Hi, I have this error and I don't know why it's happening. The
failing operator processes a tuple, does a dedup check, saves the
tuple into HBase if it's new or an update, and emits it to the
stream. Because of this failure, only a few tuples are
processed.
2017-06-04 06:43:45,265
[org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl
#6] INFO impl.ContainerManagementProtocolProxy newProxy -
Opening proxy : localhost:8052
2017-06-04 06:43:47,573 [IPC Server handler 0 on 38848]
INFO stram.StreamingContainerParent log - child msg:
[container_1496564390452_0002_01_000010] Entering
heartbeat loop.. context:
PTContainer[id=1(container_1496564390452_0002_01_000010),state=ALLOCATED,operators=[PTOperator[id=6,name=ConsoleNew],
PTOperator[id=7,name=ConsoleBad],
PTOperator[id=1,name=cloutApiBanksInput],
PTOperator[id=5,name=banksDeduplicator],
PTOperator[id=10,name=ConsoleNewJDBC],
PTOperator[id=11,name=ConsoleErrorJDBC],
PTOperator[id=9,name=cloutApiBanksOutput],
PTOperator[id=8,name=banksOpLoadObject],
PTOperator[id=3,name=Deduper],
PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],
PTOperator[id=2,name=cloutApiBanksInput]]]
2017-06-04 06:43:48,587 [IPC Server handler 1 on 38848]
INFO stram.StreamingContainerManager processHeartbeat -
Container container_1496564390452_0002_01_000010 buffer
server: datatorrent-sandbox:35304
2017-06-04 06:43:56,262 [IPC Server handler 16 on 38848]
INFO stram.StreamingContainerParent log - child msg:
Stopped running due to an exception.
java.lang.NullPointerException
at
com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
at
org.apache.apex.malhar.lib.wal.FSWindowDataManager.retrieve(FSWindowDataManager.java:487)
at
org.apache.apex.malhar.lib.wal.FSWindowDataManager.retrieve(FSWindowDataManager.java:448)
at
com.datatorrent.lib.db.jdbc.AbstractJdbcPollInputOperator.replay(AbstractJdbcPollInputOperator.java:316)
at
com.datatorrent.lib.db.jdbc.AbstractJdbcPollInputOperator.beginWindow(AbstractJdbcPollInputOperator.java:203)
at
com.datatorrent.stram.engine.InputNode.run(InputNode.java:122)
at
com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1441)
context:
PTContainer[id=1(container_1496564390452_0002_01_000010),state=ACTIVE,operators=[PTOperator[id=6,name=ConsoleNew],
PTOperator[id=7,name=ConsoleBad],
PTOperator[id=1,name=cloutApiBanksInput],
PTOperator[id=5,name=banksDeduplicator],
PTOperator[id=10,name=ConsoleNewJDBC],
PTOperator[id=11,name=ConsoleErrorJDBC],
PTOperator[id=9,name=cloutApiBanksOutput],
PTOperator[id=8,name=banksOpLoadObject],
PTOperator[id=3,name=Deduper],
PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],
PTOperator[id=2,name=cloutApiBanksInput]]]
2017-06-04 06:43:56,683 [IPC Server handler 5 on 38848]
WARN stram.StreamingContainerManager
processOperatorFailure - Operator failure:
PTOperator[id=2,name=cloutApiBanksInput] count: 6
2017-06-04 06:43:56,683 [IPC Server handler 5 on 38848]
ERROR stram.StreamingContainerManager
processOperatorFailure - Initiating container restart
after operator failure
PTOperator[id=2,name=cloutApiBanksInput]
2017-06-04 06:43:57,292 [main] INFO
stram.StreamingAppMasterService sendContainerAskToRM -
Requested stop container
container_1496564390452_0002_01_000010
2017-06-04 06:43:57,292
[org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl
#7] INFO impl.NMClientAsyncImpl run - Processing Event
EventType: STOP_CONTAINER for Container
container_1496564390452_0002_01_000010
2017-06-04 06:43:57,294
[org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl
#7] INFO impl.ContainerManagementProtocolProxy newProxy -
Opening proxy : localhost:8052
2017-06-04 06:43:59,301 [main] INFO
stram.StreamingAppMasterService execute - Completed
containerId=container_1496564390452_0002_01_000010,
state=COMPLETE, exitStatus=-105, diagnostics=Container
killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
2017-06-04 06:43:59,301 [main] INFO
stram.StreamingContainerManager scheduleContainerRestart
- Initiating recovery for
container_1496564390452_0002_01_000010@localhost:8052
2017-06-04 06:43:59,302 [main] INFO
stram.StreamingContainerManager scheduleContainerRestart
- Affected operators [PTOperator[id=6,name=ConsoleNew],
PTOperator[id=7,name=ConsoleBad],
PTOperator[id=1,name=cloutApiBanksInput],
PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],
PTOperator[id=3,name=Deduper],
PTOperator[id=5,name=banksDeduplicator],
PTOperator[id=8,name=banksOpLoadObject],
PTOperator[id=9,name=cloutApiBanksOutput],
PTOperator[id=11,name=ConsoleErrorJDBC],
PTOperator[id=10,name=ConsoleNewJDBC],
PTOperator[id=2,name=cloutApiBanksInput]]
2017-06-04 06:44:00,334 [main] INFO
stram.ResourceRequestHandler getHost - Strict
anti-affinity = [] for container with operators
PTOperator[id=6,name=ConsoleNew],PTOperator[id=7,name=ConsoleBad],PTOperator[id=1,name=cloutApiBanksInput],PTOperator[id=5,name=banksDeduplicator],PTOperator[id=10,name=ConsoleNewJDBC],PTOperator[id=11,name=ConsoleErrorJDBC],PTOperator[id=9,name=cloutApiBanksOutput],PTOperator[id=8,name=banksOpLoadObject],PTOperator[id=3,name=Deduper],PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],PTOperator[id=2,name=cloutApiBanksInput]
2017-06-04 06:44:00,334 [main] INFO
stram.ResourceRequestHandler getHost - Found host null
2017-06-04 06:44:01,341 [main] INFO
stram.StreamingAppMasterService execute - Got new
container.,
containerId=container_1496564390452_0002_01_000011,
containerNode=localhost:8052,
containerNodeURI=localhost:8042,
containerResourceMemory6144, priority9
2017-06-04 06:44:01,341 [main] INFO
stram.StreamingContainerManager assignContainer -
Removing container agent
container_1496564390452_0002_01_000010
2017-06-04 06:44:01,342 [main] INFO
stram.LaunchContainerRunnable run - Setting up container
launch context for
containerid=container_1496564390452_0002_01_000011
--
*Guilherme Hott*
/Software Engineer/
Skype: guilhermehott
@guilhermehott
https://www.linkedin.com/in/guilhermehott