Race condition between event notification and event registration ----------------------------------------------------------------
Key: CORE-5521 URL: http://tracker.firebirdsql.org/browse/CORE-5521 Project: Firebird Core Issue Type: Bug Components: Engine Affects Versions: 3.0.2, 2.5.7, 4.0 Initial Reporter: Mark Rotteveel There is a race condition between event notification and event registration (queue) reusing the event buffer, which causes the notification to write the event-id of the just queued event instead of the event being notified. See also fb-devel thread "Concurrency bugs in posting events?" Relevant excerpts: 2017-04-09 12:00: I have built a sample application that reproduces it a little bit more consistently (although it still occasionally succeeds without a mismatch). Note that this doesn't include the logging I showed in this conversation, let me know if you need that. You can download it from https://www.dropbox.com/s/6jxfcadxtojodf8/event-race-condition-1.0-SNAPSHOT.zip?dl=0 Start with ./bin/event-race-condition --help for instructions. It requires Java 8. Playing around with the --threadCount and --insertsPerThread can help to improve predictability to reproduce it. I used the same number of threads as I have (HT) cores in my machine. Using more inserts per thread can also increase the chance of it eventually occurring. defaults are: private static String hostName = "localhost"; private static int portNumber = 3050; private static String databasePath = "D:/data/db/fb3/eventrace.fdb"; private static String user = "sysdba"; private static String password = "masterkey"; private static int threadCount = 8; private static int insertsPerThread = 200; Full project: https://github.com/mrotteveel/event-race-condition 2017-04-02 13:59: there seems to be a concurrency bug in events posted by Firebird to the client. It looks like it overwrites local event ids (shared buffer, race condition?). This is triggered by running the entire Jaybird test suite. Running the specific test, TestFBEventManager.testLargeMultiLoad, in isolation significantly reduces the chance of it occurring. For example a test run shows: [V10AsynchronousChannel]Queue event: WireEventHandle:{ name:TEST_EVENT_A, localId:694, eventId:0, internalCount:897, previousInternalCount:897 } [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=88 cap=2048]: 000000340000000000000012010C544553545F4556454E545F42C201000000000000000000000000000002B5000000340000000000000012010C544553545F4556454E545F418503000000000000000000000000000002B6 [V10AsynchronousChannel]Received event id 693, eventCount 450 [V10AsynchronousChannel]Queue event: WireEventHandle:{ name:TEST_EVENT_B, localId:695, eventId:0, internalCount:450, previousInternalCount:450 } [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [V10AsynchronousChannel]Received event id 694, eventCount 901 [V10AsynchronousChannel]Queue event: WireEventHandle:{ name:TEST_EVENT_A, localId:696, eventId:0, internalCount:901, previousInternalCount:901 } [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=44 cap=2048]: 000000340000000000000012010C544553545F4556454E545F42C301000000000000000000000000000002B8 [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [V10AsynchronousChannel]Received event id 696, eventCount 451 [V10AsynchronousChannel]Queue event: WireEventHandle:{ name:TEST_EVENT_A, localId:697, eventId:0, internalCount:451, previousInternalCount:451 } In other words, Firebird posts event data for TEST_EVENT_B (count 450 -> 451) with the local event id of TEST_EVENT_A. On occasion I also see that it resends an earlier - already acknowledged - local event id. As the event name is proper (although not 100% sure it always is), I might be able to workaround this in the pure java implementation by matching based on the event name instead, but that is hardly a good workaround, because it is possible the same event name is registered multiple times, and it won't solve the occurrence of the same bug with the native client. This seems to suggest a race condition of some kind when the events are posted/written to the aux connection. I can reproduce this with Jaybird master, Firebird 3.0.2.32703 on Windows 10 64 bit, but I have also seen it with other Firebird versions, and with Jaybird 2.2 (which has a significantly different implementation of event handling), both with pure java and the native client use. Any thoughts or ideas on this, or is it better if I just create a bug report? Other example: both A and B are acknowledged with id of event B: [V10AsynchronousChannel]Received event id 640, eventCount 843 [V10AsynchronousChannel]Queue event: WireEventHandle:{ name:TEST_EVENT_A, localId:642, eventId:0, internalCount:843, previousInternalCount:843 } [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [FBManagedConnection]End called: Xid[773794790] [V10AsynchronousChannel]Received event id 641, eventCount 422 [V10AsynchronousChannel]Queue event: WireEventHandle:{ name:TEST_EVENT_B, localId:643, eventId:0, internalCount:422, previousInternalCount:422 } [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=88 cap=2048]: 000000340000000000000012010C544553545F4556454E545F414D0300000000000000000000000000000283000000340000000000000012010C544553545F4556454E545F42A70100000000000000000000000000000283 [V10AsynchronousChannel]Received event id 643, eventCount 845 [V10AsynchronousChannel]Queue event: WireEventHandle:{ name:TEST_EVENT_B, localId:644, eventId:0, internalCount:845, previousInternalCount:845 } [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0 [V10AsynchronousChannel]Received event id 643, eventCount 423 -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://tracker.firebirdsql.org/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel