Re: I think that's a blocker...

Ted Ross Wed, 25 Feb 2015 07:54:29 -0800

Would it be safe to assume that any operations on driver->io are not thread safe?

Dispatch is a multi-threaded application. It looks to me as though io->error is a resource shared across the threads in an unsafe way.
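To make the concern concrete, here is a minimal sketch of the kind of hazard I mean. It does not use proton's actual pn_error_t or pn_io_t; demo_error_t, demo_error_set, demo_worker and demo_run are hypothetical names invented for illustration. The assumption is an error record that holds a heap-allocated string which gets replaced on every set. If two threads replace it at the same time without a lock, both can free the same old string, which would look like a double free. Guarding the replace with a mutex is one way to avoid that:

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error record shared by several threads.
   This is NOT proton's pn_error_t; all names here are illustrative. */
typedef struct {
    pthread_mutex_t lock;
    int code;
    char *text;               /* heap-allocated, replaced on every set */
} demo_error_t;

static void demo_error_init(demo_error_t *e) {
    pthread_mutex_init(&e->lock, NULL);
    e->code = 0;
    e->text = NULL;
}

/* Replace the stored message. Without the lock, two threads could both
   free the same old e->text, which is a double free. */
static void demo_error_set(demo_error_t *e, int code, const char *text) {
    assert(e != NULL && text != NULL);
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (copy == NULL) return;
    memcpy(copy, text, n);

    pthread_mutex_lock(&e->lock);
    free(e->text);            /* old string released exactly once */
    e->text = copy;
    e->code = code;
    pthread_mutex_unlock(&e->lock);
}

static void demo_error_free(demo_error_t *e) {
    free(e->text);
    e->text = NULL;
    pthread_mutex_destroy(&e->lock);
}

typedef struct {
    demo_error_t *err;
    int id;
    int iters;
} demo_worker_arg_t;

/* Each worker repeatedly overwrites the shared error record. */
static void *demo_worker(void *p) {
    demo_worker_arg_t *a = p;
    char buf[64];
    for (int i = 0; i < a->iters; i++) {
        snprintf(buf, sizeof buf, "worker %d failure %d", a->id, i);
        demo_error_set(a->err, a->id, buf);
    }
    return NULL;
}

/* Run nthreads workers against one shared error record.
   Returns 0 if all threads ran and the record is consistent, else -1. */
static int demo_run(int nthreads, int iters) {
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    demo_worker_arg_t args[MAX_THREADS];
    demo_error_t err;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || iters < 1) return -1;
    demo_error_init(&err);

    for (int i = 0; i < nthreads; i++) {
        args[i].err = &err;
        args[i].id = i + 1;
        args[i].iters = iters;
        if (pthread_create(&tids[i], NULL, demo_worker, &args[i]) != 0) break;
        started++;
    }
    for (int i = 0; i < started; i++) pthread_join(tids[i], NULL);

    int ok = started == nthreads && err.text != NULL
             && err.code >= 1 && err.code <= nthreads;
    demo_error_free(&err);
    return ok ? 0 : -1;
}
```

Calling demo_run(4, 1000) from a small main should finish cleanly. Taking the lock/unlock pair out of demo_error_set and running the same thing under valgrind is a quick way to see what an unsynchronized version of this pattern can do. Whether io->error is really used this way in Dispatch is exactly what I am asking.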


-Ted

On 02/25/2015 08:55 AM, Rafael Schloming wrote:

This isn't necessarily a proton bug. Nothing in the referenced checkin
actually touches the logic around allocating/freeing error strings, it
merely causes pn_send/pn_recv to make use of pn_io_t's pn_error_t where
previously it threw away the error information. This would suggest that
there is perhaps a pre-existing bug in dispatch where it is calling
pn_send/pn_recv with a pn_io_t that has been freed, and it is only now
triggering due to the additional asserts that are encountered due to not
ignoring the error information.

I could be mistaken, but I would try reproducing this under valgrind. That
will tell you where the first free occurred and that should hopefully make
it obvious whether this is indeed a proton bug or whether dispatch is
somehow freeing the pn_io_t sooner than it should.

(FWIW, if it is indeed a proton bug, then I would agree it is a blocker.)

--Rafael

On Wed, Feb 25, 2015 at 7:54 AM, Michael Goulish <[email protected]>
wrote:

...but if not, somebody please feel free to correct me.

The Jira that I just created -- PROTON-826 -- is for a
bug I found with my topology testing of the Dispatch Router,
in which I repeatedly kill and restart a router and make
sure that the router network comes back to the same topology
that it had before.

As of checkin 01cb00c -- which had no Jira -- it is pretty
easy for my test to blow core.  It looks like an error
string is being double-freed (maybe) in the proton library.

( full info in the Jira: https://issues.apache.org/jira/browse/PROTON-826 )
