James Birdsall created PROTON-1276:
--------------------------------------
Summary: Reactor: Race between CollectorImpl peek and pop can
result in dispatching freed events
Key: PROTON-1276
URL: https://issues.apache.org/jira/browse/PROTON-1276
Project: Qpid Proton
Issue Type: Bug
Components: proton-j
Affects Versions: 0.13.0
Environment: Windows 10
Reporter: James Birdsall
Priority: Minor
Last week I started trying to track down some unhandled NullPointerExceptions
coming out of the Proton-J reactor:
{quote}
org.apache.qpid.proton.reactor.impl.IOHandler.onUnhandled(IOHandler.java:354)
org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:236)
org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108)
org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:307)
org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276)
com.microsoft.azure.servicebus.MessagingFactory$RunReactor.run(MessagingFactory.java:355)
{quote}
In IOHandler.onUnhandled(), the call to reactor.getSelector() is throwing the
NPE because reactor is null. It's null because that's what Event.getReactor()
returns when it doesn't recognize the type of EventImpl.context. It doesn't
recognize the type because EventImpl.context is null, as is EventImpl.type.
The event that's being dispatched was retrieved by the collector.peek() call in
ReactorImpl.process(). Looking at CollectorImpl, the code specifically guards
against putting an event with null type into the collector. It looks like the
only time that an event can be completely null inside is in
CollectorImpl.pop(), when the event that was just removed from the collector is
clear()ed and then put on the free list. For grins, I added code that sets the
context to the string "POISON" after the clear(), and modified
CollectorImpl.peek() to detect when it was about to return a poisoned event,
and peek() detected a poisoned event before every NPE.
Looking at the code, there's a race between pop() and peek() because pop()
cleans up the event while head is still pointing at it. I rewrote the code to
remove the event from the chain before doing anything else, and so far it looks
like that solves the problem:
{code}
@Override
public void pop()
{
if (head != null) {
EventImpl freeing = head;
head = head.next;
freeing.clear();
freeing.next = free;
free = freeing;
}
}
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]