Looks like you hit a case where there was an offer with no cpu. The cpu check is historic: cpu used to be set to -1 and then we added to it. It would make more sense now to have `checkResource(cpu == null, "cpu")`, and the same for mem and ports. I'm in the process of testing some other stuff now, so I can check this and report back.
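
To make the suggestion concrete, here is a rough sketch of the null-based check. It is not the actual handler code: the `OfferCheck` class, the plain `Map` standing in for the values collected from an offer, and the `matches` signature are all assumptions made for illustration. Only the `checkResource(cpu == null, "cpu")` idea comes from the note above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the handler; names and signatures are assumptions.
class OfferCheck {

    // Logs a message when the named resource is absent and reports whether it was missing.
    static boolean checkResource(boolean missing, String name) {
        if (missing) {
            System.err.println("Offer has no " + name + " resource, skipping it");
        }
        return missing;
    }

    // Returns true only when cpus, mem and ports are all present and large enough.
    // The map is an assumed simplification of the per-offer resource totals.
    static boolean matches(Map<String, Double> offer, double cpusNeeded, double memNeeded) {
        Double cpus = offer.get("cpus");
        Double mem = offer.get("mem");
        Double ports = offer.get("ports");

        // Non-short-circuit '|' so that every missing resource gets logged.
        if (checkResource(cpus == null, "cpus")
                | checkResource(mem == null, "mem")
                | checkResource(ports == null, "ports")) {
            return false;
        }
        // Unboxing is safe here because all three values were checked for null above.
        return cpus >= cpusNeeded && mem >= memNeeded && ports >= 1;
    }

    public static void main(String[] args) {
        Map<String, Double> offer = new HashMap<>();
        offer.put("mem", 2048.0);
        offer.put("ports", 10.0);
        // No "cpus" entry: the offer is rejected instead of throwing a NullPointerException.
        System.out.println(matches(offer, 1.0, 1024.0));
    }
}
```

The point of the sketch is that a missing resource makes the offer a non-match, so the thread handling offers keeps running rather than dying on the exception.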

On Fri, Sep 11, 2015 at 5:55 PM, Sarjeet Singh (JIRA) <[email protected]> wrote:

> Sarjeet Singh created MYRIAD-135:
> ------------------------------------
>
> Summary: NullPointerException in ResourceOffersEventHandler from the offer received from Mesos.
> Key: MYRIAD-135
> URL: https://issues.apache.org/jira/browse/MYRIAD-135
> Project: Myriad
> Issue Type: Bug
> Components: Scheduler
> Affects Versions: Myriad 0.1.0
> Reporter: Sarjeet Singh
>
> I hit a NullPointerException when myriad-scheduler was receiving offers
> from mesos & offer was missing some resource entity info e.g.
> (cpu/memory/ports).
>
> The exception is caused from the following code:
>
> https://github.com/mesos/myriad/blob/phase1/myriad-scheduler/src/main/java/com/ebay/myriad/scheduler/event/handlers/ResourceOffersEventHandler.java#L150-L156
>
> Observed the issue when submit a yarn job and job was ran on CGS NMs, not
> FGS NMs. On further debugging the issue, found the following exception from
> RM log:
>
> 15/09/11 13:14:22 WARN handlers.StatusUpdateEventHandler: Task: value: "yarn_container_e09_1442001795955_0002_01_000001" not found, status: TASK_FINISHED
> 15/09/11 13:14:23 INFO handlers.ResourceOffersEventHandler: Received offers 1
> Sep 11, 2015 1:14:23 PM com.lmax.disruptor.FatalExceptionHandler handleEventException
> SEVERE: Exception processing: 16 com.ebay.myriad.scheduler.event.ResourceOffersEvent@1256f6b6
> java.lang.NullPointerException
>     at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.matches(ResourceOffersEventHandler.java:154)
>     at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:92)
>     at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:55)
>     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> 15/09/11 13:14:23 ERROR yarn.YarnUncaughtExceptionHandler: Thread Thread[pool-2-thread-3,5,main] threw an Exception.
> java.lang.RuntimeException: java.lang.NullPointerException
>     at com.lmax.disruptor.FatalExceptionHandler.handleEventException(FatalExceptionHandler.java:45)
>     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:147)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>     at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.matches(ResourceOffersEventHandler.java:154)
>     at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:92)
>     at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:55)
>     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>     ... 3 more
>
> Also, Observed from RM logs that after the above exception, no more offer
> logs in RM as thread receiving offers is existed upon exception.
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
