Race condition between event notification and event registration
----------------------------------------------------------------

                 Key: CORE-5521
                 URL: http://tracker.firebirdsql.org/browse/CORE-5521
             Project: Firebird Core
          Issue Type: Bug
          Components: Engine
    Affects Versions: 3.0.2, 2.5.7, 4.0 Initial
            Reporter: Mark Rotteveel


There is a race condition between event notification and event registration 
(queue) when the event buffer is reused. As a result, the notification is 
written with the event id of the just-queued event instead of the id of the 
event being notified. See also the fb-devel thread "Concurrency bugs in 
posting events?"

Relevant excerpts:

2017-04-09 12:00:
I have built a sample application that reproduces it a little bit more 
consistently (although it still occasionally succeeds without a 
mismatch). Note that this doesn't include the logging I showed in this 
conversation, let me know if you need that.

You can download it from 
https://www.dropbox.com/s/6jxfcadxtojodf8/event-race-condition-1.0-SNAPSHOT.zip?dl=0

Start with ./bin/event-race-condition --help for instructions. It 
requires Java 8.

Adjusting --threadCount and --insertsPerThread can make the problem easier 
to reproduce. I used the same number of threads as I have (hyper-threaded) 
cores in my machine. Using more inserts per thread also increases the 
chance of it eventually occurring.

The defaults are:
     private static String hostName = "localhost";
     private static int portNumber = 3050;
     private static String databasePath = "D:/data/db/fb3/eventrace.fdb";
     private static String user = "sysdba";
     private static String password = "masterkey";
     private static int threadCount = 8;
     private static int insertsPerThread = 200;

Full project: https://github.com/mrotteveel/event-race-condition

2017-04-02 13:59:
there seems to be a concurrency bug in 
events posted by Firebird to the client. It looks like it overwrites 
local event ids (shared buffer, race condition?).

This is triggered by running the entire Jaybird test suite. Running the 
specific test, TestFBEventManager.testLargeMultiLoad, in isolation 
significantly reduces the chance of it occurring.

For example a test run shows:

[V10AsynchronousChannel]Queue event: WireEventHandle:{ 
name:TEST_EVENT_A, localId:694, eventId:0, internalCount:897, 
previousInternalCount:897 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=88 cap=2048]: 
000000340000000000000012010C544553545F4556454E545F42C201000000000000000000000000000002B5000000340000000000000012010C544553545F4556454E545F418503000000000000000000000000000002B6
[V10AsynchronousChannel]Received event id 693, eventCount 450
[V10AsynchronousChannel]Queue event: WireEventHandle:{ 
name:TEST_EVENT_B, localId:695, eventId:0, internalCount:450, 
previousInternalCount:450 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 694, eventCount 901
[V10AsynchronousChannel]Queue event: WireEventHandle:{ 
name:TEST_EVENT_A, localId:696, eventId:0, internalCount:901, 
previousInternalCount:901 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=44 cap=2048]: 
000000340000000000000012010C544553545F4556454E545F42C301000000000000000000000000000002B8
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 696, eventCount 451
[V10AsynchronousChannel]Queue event: WireEventHandle:{ 
name:TEST_EVENT_A, localId:697, eventId:0, internalCount:451, 
previousInternalCount:451 }

In other words, Firebird posts event data for TEST_EVENT_B (count 450 -> 
451) with the local event id of TEST_EVENT_A. On occasion I also see 
that it resends an earlier - already acknowledged - local event id.
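
To make the mismatch easier to check against the raw buffers, here is a 
minimal sketch that decodes the hex dumps shown above. The packet layout is 
inferred from these dumps only (operation code 0x34, a 4-byte field that is 
always 0, a length-prefixed block holding a version byte, the name length, 
the name and a little-endian count, padding to 4 bytes, 8 zero bytes, and 
finally the local id). It assumes one event per block, as in the log. The 
class and method names are hypothetical and not part of Jaybird.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class EventPacketDecoder {

    /** One decoded notification: event name, count and the local id sent by the server. */
    static final class Notification {
        final String name;
        final int count;
        final int localId;

        Notification(String name, int count, int localId) {
            this.name = name;
            this.count = count;
            this.localId = localId;
        }

        @Override
        public String toString() {
            return name + " count=" + count + " localId=" + localId;
        }
    }

    /** Converts a hex string as printed in the log into bytes. */
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Decodes all notifications in the buffer (layout assumed from the dumps). */
    static List<Notification> decode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default
        List<Notification> result = new ArrayList<>();
        while (buf.remaining() >= 12) {
            int operation = buf.getInt();
            if (operation != 0x34) {
                throw new IllegalArgumentException("Unexpected operation: " + operation);
            }
            buf.getInt();                         // always 0 in the dumps
            int blockLength = buf.getInt();       // 0x12 in the dumps
            int blockStart = buf.position();
            buf.get();                            // version byte (0x01)
            int nameLength = buf.get() & 0xFF;
            byte[] nameBytes = new byte[nameLength];
            buf.get(nameBytes);
            // The count inside the block is little-endian, unlike the rest
            int count = buf.order(ByteOrder.LITTLE_ENDIAN).getInt();
            buf.order(ByteOrder.BIG_ENDIAN);
            // Skip to the end of the block, rounded up to a multiple of 4 bytes
            int padding = (4 - blockLength % 4) % 4;
            buf.position(blockStart + blockLength + padding);
            buf.getLong();                        // 8 zero bytes in the dumps
            int localId = buf.getInt();
            String name = new String(nameBytes, StandardCharsets.US_ASCII);
            result.add(new Notification(name, count, localId));
        }
        return result;
    }

    public static void main(String[] args) {
        // Third buffer from the log above (44 bytes, one notification)
        String hex = "000000340000000000000012010C544553545F4556454E545F42"
                + "C301000000000000000000000000000002B8";
        for (Notification n : decode(fromHex(hex))) {
            System.out.println(n);
        }
        // prints: TEST_EVENT_B count=451 localId=696
    }
}
```

The decoded name and count belong to TEST_EVENT_B, while local id 696 is the 
id that was just queued for TEST_EVENT_A.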

The event name appears to be correct (although I am not 100% sure it 
always is), so I might be able to work around this in the pure Java 
implementation by matching on the event name instead. That is hardly a 
good workaround: the same event name can be registered multiple times, 
and it won't solve the occurrence of the same bug with the native client.
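
A small sketch of why matching on the name is ambiguous. This is an 
illustration only: the class and its methods are hypothetical and do not 
reflect how Jaybird tracks event handles.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NameMatchingSketch {

    // Local id -> event name for the handles that are currently queued
    private final Map<Integer, String> queued = new LinkedHashMap<>();

    void queue(int localId, String name) {
        queued.put(localId, name);
    }

    /** Returns the local ids of all queued handles registered under the given name. */
    List<Integer> candidatesByName(String name) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : queued.entrySet()) {
            if (entry.getValue().equals(name)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        NameMatchingSketch sketch = new NameMatchingSketch();
        sketch.queue(1, "TEST_EVENT_A");
        sketch.queue(2, "TEST_EVENT_A"); // same name registered a second time
        sketch.queue(3, "TEST_EVENT_B");

        // More than one candidate: the name alone cannot identify the handle
        System.out.println(sketch.candidatesByName("TEST_EVENT_A"));
        // Exactly one candidate: the name happens to be sufficient here
        System.out.println(sketch.candidatesByName("TEST_EVENT_B"));
    }
}
```

With a single registration per name the lookup is unambiguous, but as soon 
as a name is registered twice the notification can no longer be attributed 
to one handle without a reliable local id.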

This seems to suggest a race condition of some kind when the events are 
posted/written to the aux connection.

I can reproduce this with Jaybird master and Firebird 3.0.2.32703 on 
Windows 10 64-bit, but I have also seen it with other Firebird versions 
and with Jaybird 2.2 (which has a significantly different implementation 
of event handling), using both the pure Java and the native client.

Any thoughts or ideas on this, or is it better if I just create a bug 
report?

Another example: both A and B are acknowledged with the id of event B:

[V10AsynchronousChannel]Received event id 640, eventCount 843
[V10AsynchronousChannel]Queue event: WireEventHandle:{ 
name:TEST_EVENT_A, localId:642, eventId:0, internalCount:843, 
previousInternalCount:843 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[FBManagedConnection]End called: Xid[773794790]
[V10AsynchronousChannel]Received event id 641, eventCount 422
[V10AsynchronousChannel]Queue event: WireEventHandle:{ 
name:TEST_EVENT_B, localId:643, eventId:0, internalCount:422, 
previousInternalCount:422 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=88 cap=2048]: 
000000340000000000000012010C544553545F4556454E545F414D0300000000000000000000000000000283000000340000000000000012010C544553545F4556454E545F42A70100000000000000000000000000000283
[V10AsynchronousChannel]Received event id 643, eventCount 845
[V10AsynchronousChannel]Queue event: WireEventHandle:{ 
name:TEST_EVENT_B, localId:644, eventId:0, internalCount:845, 
previousInternalCount:845 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 643, eventCount 423
