James Birdsall created PROTON-1276:
--------------------------------------

             Summary: Reactor: Race between CollectorImpl peek and pop can 
result in dispatching freed events
                 Key: PROTON-1276
                 URL: https://issues.apache.org/jira/browse/PROTON-1276
             Project: Qpid Proton
          Issue Type: Bug
          Components: proton-j
    Affects Versions: 0.13.0
         Environment: Windows 10
            Reporter: James Birdsall
            Priority: Minor


Last week I started trying to track down some unhandled NullPointerExceptions 
coming out of the Proton-J reactor:
{quote}
org.apache.qpid.proton.reactor.impl.IOHandler.onUnhandled(IOHandler.java:354)
org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:236)
org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108)
org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:307)
org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276)
com.microsoft.azure.servicebus.MessagingFactory$RunReactor.run(MessagingFactory.java:355)
{quote}

In IOHandler.onUnhandled(), the call to reactor.getSelector() is throwing the 
NPE because reactor is null. It's null because that's what Event.getReactor() 
returns when it doesn't recognize the type of EventImpl.context. It doesn't 
recognize the type because EventImpl.context is null, as is EventImpl.type.

The event that's being dispatched was retrieved by the collector.peek() call in 
ReactorImpl.process(). Looking at CollectorImpl, the code specifically guards 
against putting an event with null type into the collector. It looks like the 
only time that an event can be completely null inside is in 
CollectorImpl.pop(), when the event that was just removed from the collector is 
clear()ed and then put on the free list. For grins, I added code that sets the 
context to the string "POISON" after the clear(), and modified 
CollectorImpl.peek() to detect when it was about to return a poisoned event, 
and peek() detected a poisoned event before every NPE.

Looking at the code, there's a race between pop() and peek() because pop() 
cleans up the event while head is still pointing at it. I rewrote the code to 
remove the event from the chain before doing anything else, and so far it looks 
like that solves the problem:

{code}
@Override
    public void pop()
    {
        if (head != null) {
                EventImpl freeing = head;
                head = head.next;
                freeing.clear();
                freeing.next = free;
                free = freeing;
        }
    }
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to