This is only a portion of the application master log. In the same log, check which container fails first, then check that container's log for errors and exceptions.
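
As a rough sketch of that first step, assuming the full application master log is available as text: the snippet scans for the first "Stopped running due to an exception" message and returns the next container ID after it. The marker text and the container ID format are taken from the excerpt quoted below; the function name is hypothetical, and the approach assumes the "context:" section with the container ID follows the stack trace, as it does in that excerpt.

```python
import re
from typing import Optional

# Marker text and container ID format are copied from the log excerpt in this thread.
FAILURE_MARKER = "Stopped running due to an exception"
CONTAINER_ID = re.compile(r"container_\d+_\d+_\d+_\d+")


def first_failed_container(log_text: str) -> Optional[str]:
    """Return the ID of the first container that reported an exception, or None."""
    marker_pos = log_text.find(FAILURE_MARKER)
    if marker_pos == -1:
        return None
    # Assumption: the "context:" section after the stack trace names the container.
    match = CONTAINER_ID.search(log_text, marker_pos)
    return match.group(0) if match else None
```

Reading the log file into a string and passing it to `first_failed_container` gives the container whose own log is worth opening first.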

Apex/Malhar developers, would anyone volunteer to look into why the operator fails during recovery?

Thank you,

Vlad

On 6/5/17 23:17, Priyanka Gugale wrote:
Can you share your entire log file? Why did your operator get killed the first time? The above-mentioned error seems to occur at recovery time.

-Priyanka

On Tue, Jun 6, 2017 at 12:44 AM, Guilherme Hott <guilhermeh...@gmail.com> wrote:

    Hi, I have this error and I don't know why it's happening. The
    operator that is failing processes a tuple, does a dedup check,
    saves it into HBase if it's new or an update, and emits it to the
    stream. Because of this failure, only a few tuples are processed.

            2017-06-04 06:43:45,265
            [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl
            #6] INFO  impl.ContainerManagementProtocolProxy newProxy -
            Opening proxy : localhost:8052

            2017-06-04 06:43:47,573 [IPC Server handler 0 on 38848]
            INFO  stram.StreamingContainerParent log - child msg:
            [container_1496564390452_0002_01_000010] Entering
            heartbeat loop.. context:
            PTContainer[id=1(container_1496564390452_0002_01_000010),state=ALLOCATED,operators=[PTOperator[id=6,name=ConsoleNew],
            PTOperator[id=7,name=ConsoleBad],
            PTOperator[id=1,name=cloutApiBanksInput],
            PTOperator[id=5,name=banksDeduplicator],
            PTOperator[id=10,name=ConsoleNewJDBC],
            PTOperator[id=11,name=ConsoleErrorJDBC],
            PTOperator[id=9,name=cloutApiBanksOutput],
            PTOperator[id=8,name=banksOpLoadObject],
            PTOperator[id=3,name=Deduper],
            PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],
            PTOperator[id=2,name=cloutApiBanksInput]]]

            2017-06-04 06:43:48,587 [IPC Server handler 1 on 38848]
            INFO  stram.StreamingContainerManager processHeartbeat -
            Container container_1496564390452_0002_01_000010 buffer
            server: datatorrent-sandbox:35304

            2017-06-04 06:43:56,262 [IPC Server handler 16 on 38848]
            INFO  stram.StreamingContainerParent log - child msg:
            Stopped running due to an exception.
            java.lang.NullPointerException
                at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
                at org.apache.apex.malhar.lib.wal.FSWindowDataManager.retrieve(FSWindowDataManager.java:487)
                at org.apache.apex.malhar.lib.wal.FSWindowDataManager.retrieve(FSWindowDataManager.java:448)
                at com.datatorrent.lib.db.jdbc.AbstractJdbcPollInputOperator.replay(AbstractJdbcPollInputOperator.java:316)
                at com.datatorrent.lib.db.jdbc.AbstractJdbcPollInputOperator.beginWindow(AbstractJdbcPollInputOperator.java:203)
                at com.datatorrent.stram.engine.InputNode.run(InputNode.java:122)
                at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1441)

            context:
            PTContainer[id=1(container_1496564390452_0002_01_000010),state=ACTIVE,operators=[PTOperator[id=6,name=ConsoleNew],
            PTOperator[id=7,name=ConsoleBad],
            PTOperator[id=1,name=cloutApiBanksInput],
            PTOperator[id=5,name=banksDeduplicator],
            PTOperator[id=10,name=ConsoleNewJDBC],
            PTOperator[id=11,name=ConsoleErrorJDBC],
            PTOperator[id=9,name=cloutApiBanksOutput],
            PTOperator[id=8,name=banksOpLoadObject],
            PTOperator[id=3,name=Deduper],
            PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],
            PTOperator[id=2,name=cloutApiBanksInput]]]

            2017-06-04 06:43:56,683 [IPC Server handler 5 on 38848]
            WARN  stram.StreamingContainerManager
            processOperatorFailure - Operator failure:
            PTOperator[id=2,name=cloutApiBanksInput] count: 6

            2017-06-04 06:43:56,683 [IPC Server handler 5 on 38848]
            ERROR stram.StreamingContainerManager
            processOperatorFailure - Initiating container restart
            after operator failure
            PTOperator[id=2,name=cloutApiBanksInput]

            2017-06-04 06:43:57,292 [main] INFO
             stram.StreamingAppMasterService sendContainerAskToRM -
            Requested stop container
            container_1496564390452_0002_01_000010

            2017-06-04 06:43:57,292
            [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl
            #7] INFO  impl.NMClientAsyncImpl run - Processing Event
            EventType: STOP_CONTAINER for Container
            container_1496564390452_0002_01_000010

            2017-06-04 06:43:57,294
            [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl
            #7] INFO  impl.ContainerManagementProtocolProxy newProxy -
            Opening proxy : localhost:8052

            2017-06-04 06:43:59,301 [main] INFO
             stram.StreamingAppMasterService execute - Completed
            containerId=container_1496564390452_0002_01_000010,
            state=COMPLETE, exitStatus=-105, diagnostics=Container
            killed by the ApplicationMaster.

            Container killed on request. Exit code is 143

            Container exited with a non-zero exit code 143

            2017-06-04 06:43:59,301 [main] INFO
             stram.StreamingContainerManager scheduleContainerRestart
            - Initiating recovery for
            container_1496564390452_0002_01_000010@localhost:8052

            2017-06-04 06:43:59,302 [main] INFO
             stram.StreamingContainerManager scheduleContainerRestart
            - Affected operators [PTOperator[id=6,name=ConsoleNew],
            PTOperator[id=7,name=ConsoleBad],
            PTOperator[id=1,name=cloutApiBanksInput],
            PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],
            PTOperator[id=3,name=Deduper],
            PTOperator[id=5,name=banksDeduplicator],
            PTOperator[id=8,name=banksOpLoadObject],
            PTOperator[id=9,name=cloutApiBanksOutput],
            PTOperator[id=11,name=ConsoleErrorJDBC],
            PTOperator[id=10,name=ConsoleNewJDBC],
            PTOperator[id=2,name=cloutApiBanksInput]]

            2017-06-04 06:44:00,334 [main] INFO
             stram.ResourceRequestHandler getHost - Strict
            anti-affinity = [] for container with operators
            PTOperator[id=6,name=ConsoleNew],PTOperator[id=7,name=ConsoleBad],PTOperator[id=1,name=cloutApiBanksInput],PTOperator[id=5,name=banksDeduplicator],PTOperator[id=10,name=ConsoleNewJDBC],PTOperator[id=11,name=ConsoleErrorJDBC],PTOperator[id=9,name=cloutApiBanksOutput],PTOperator[id=8,name=banksOpLoadObject],PTOperator[id=3,name=Deduper],PTOperator[id=4,name=cloutApiBanksInput.outputPort#unifier],PTOperator[id=2,name=cloutApiBanksInput]

            2017-06-04 06:44:00,334 [main] INFO
             stram.ResourceRequestHandler getHost - Found host null

            2017-06-04 06:44:01,341 [main] INFO
             stram.StreamingAppMasterService execute - Got new
            container.,
            containerId=container_1496564390452_0002_01_000011,
            containerNode=localhost:8052,
            containerNodeURI=localhost:8042,
            containerResourceMemory6144, priority9

            2017-06-04 06:44:01,341 [main] INFO
             stram.StreamingContainerManager assignContainer -
            Removing container agent
            container_1496564390452_0002_01_000010

            2017-06-04 06:44:01,342 [main] INFO
             stram.LaunchContainerRunnable run - Setting up container
            launch context for
            containerid=container_1496564390452_0002_01_000011


-- *Guilherme Hott*
    /Software Engineer/
    Skype: guilhermehott
    @guilhermehott
    https://www.linkedin.com/in/guilhermehott


