Before I get into the details, let me just say that I'm unable to
reproduce the test failures seen on OSX and Solaris x64 on local
hardware, so I don't have access to a debugger or thread dumps.
The tests that fail on OSX and Solaris x64 (the tests pass on sparc)
are practically identical. The basic problem is that discovery events
are either not received at all, or only some of them are received. The
tests allow very long time frames for these events to be received; on
other OSes these tests pass rapidly.
Increasing debugging output has the effect of increasing the number of
events received.
The tests and their details can be viewed on Jenkins.
Over the last few months I've been inspecting code manually and fixing
synchronization issues.
River has a large legacy codebase, and there are many examples of
inadequate synchronization.
Ironically, some of the changes I've made, although reducing test
failures on Linux and Windows, have exacerbated test failures on OSX
and Solaris x64.
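To illustrate the class of problem I mean, here is a minimal sketch.
It is not River code; DiscoveryCountSketch and CountingListener are
hypothetical names made up for this example. If a listener counted
events in a plain int field, there would be no happens-before edge
between the notifier threads and the test thread, so updates could be
lost or the test could read a stale count. Logging calls usually pass
through synchronized stream code, which might be why extra debug output
changes the outcome, though that is only a guess on my part. The sketch
shows the safe variant, using an AtomicInteger and a CountDownLatch:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in River.
final class DiscoveryCountSketch {

    // Listener whose state is safely shared between the notifier
    // threads and the thread that waits for the events.
    static final class CountingListener {
        private final AtomicInteger received = new AtomicInteger();
        private final CountDownLatch done;

        CountingListener(int expected) {
            done = new CountDownLatch(expected);
        }

        // Called from notifier threads, once per event.
        void discovered() {
            received.incrementAndGet(); // atomic, no lost updates
            done.countDown();           // publishes the update to the waiter
        }

        // Blocks until all expected events arrived or the timeout expired.
        boolean awaitAll(long timeout, TimeUnit unit)
                throws InterruptedException {
            return done.await(timeout, unit);
        }

        int received() {
            return received.get();
        }
    }

    // Fires eventsPerThread events from each of several threads and
    // returns how many the listener saw.
    public static int run(int threads, int eventsPerThread) {
        CountingListener listener =
                new CountingListener(threads * eventsPerThread);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                for (int j = 0; j < eventsPerThread; j++) {
                    listener.discovered();
                }
            }).start();
        }
        try {
            listener.awaitAll(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        return listener.received();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));
    }
}
```

Replacing the AtomicInteger with a plain int and an unsynchronized
increment is the sort of pattern I have been finding and fixing.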
Is there anyone on this list with access to this hardware who can
reproduce these bugs?
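If you can reproduce the failures but attaching a debugger is awkward,
a thread dump taken while a test is stuck waiting for events would
already help a lot. Below is a minimal sketch that uses the standard
java.lang.management API; the class name ThreadDumpSketch is only for
illustration and is not part of River or its test harness. Running
jstack against the pid of the test JVM should give similar information.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of River: formats the state and stack
// of every live thread in the current JVM.
final class ThreadDumpSketch {

    public static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Only request lock details when the JVM supports them;
        // otherwise dumpAllThreads throws UnsupportedOperationException.
        boolean monitors = bean.isObjectMonitorUsageSupported();
        boolean synchronizers = bean.isSynchronizerUsageSupported();

        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : bean.dumpAllThreads(monitors, synchronizers)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The object this thread is blocked or waiting on.
                sb.append(" on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                sb.append(" owned by \"")
                  .append(info.getLockOwnerName()).append('"');
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Calling ThreadDumpSketch.dump() from the test just before it times out
and posting the output to the list would show which threads are blocked
and on what.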
Regards,
Peter.