2017-04-04 9:47 GMT+02:00 Enrico Olivelli <eolive...@gmail.com>: > I have completed by debug session and I have an answer. > > The problematic flow is the following: > - LedgerHandler#asyncAddEntry > - in case of BKNotEnoughBookiesException I'm triggering an async callback > (with a CompletableFuture#handleAsync) which in turn calls > LedgerHandler#close > - the LedgerHandler#close method truncates the ledger and some entry is > lost (at the time of the close call I have local LastAddConfirmed == > LastAddPushed == 0 on the writer) > > Is this a possible explanation ? > I wonder which is the best way to react to an error (in this case > BKNotEnoughBookiesException) in asyncAddEntry, usually I'm calling > LedgerHandler#close in order to the necessary clean up, but in this case it > seems very dangerous > (I am running BK 4.4.0) > > This is the issue on my project, where I have put logs > https://github.com/diennea/herddb/issues/100 > > Thanks > -- Enrico > > >
I'm sorry, during other test cases skipping the close() did not fix the issue consistently, I will debug more > > > > > > > 2017-03-17 16:14 GMT+01:00 Enrico Olivelli <eolive...@gmail.com>: > >> >> >> 2017-03-17 7:48 GMT+01:00 Sijie Guo <guosi...@gmail.com>: >> >>> >>> >>> On Wed, Mar 15, 2017 at 9:17 AM, Enrico Olivelli <eolive...@gmail.com> >>> wrote: >>> >>>> Hi BookKeepers, >>>> I'm trying to understand some basic aspect of BookKeeper.... >>>> >>>> I have a strange case, and I would like to have some opinion from any >>>> "expert". >>>> >>>> I am using BookKeeper as a transaction log for a database, this is an >>>> example scenario: >>>> >>>> 1) the user inserts a record in a table >>>> the "fact" is logged to the Write-Ahead-log (Bookkeeper ledger), and >>>> the systems blocks until the asyncAddEntry completes (the callback is >>>> called with rc=0) >>>> >>>> now I (the 'writer') assume that the entry is written to durable >>>> storage and has been synched to disk >>>> then the DB applies the change to the local memory as every application >>>> with a WAL does >>>> >>>> 2) The writer 2 other entries for which the asyncAddEntry completes >>>> (the callback is called with rc=0). I block until the callback as been >>>> called >>>> >>>> 2) for tests I'm using an ensemble of 1 bookie, I "stop" the only bookie >>>> the bookie stops "gracefully" (no strange errors) >>>> >>>> 3) The writer issues other asyncAddEntry, which fail due to >>>> "notEnoughBookiesException" and then "closes" the LedgerHandle, inside the >>>> "callback" cade >>>> >>>> 4) I restart the bookie (no strange errors on logs) >>>> >>>> 5) I restart my "writer", opening the ledger with "openLedger" (with >>>> recovery and fencing...) in order to perform recovery from the WAL >>>> >>>> 6) now the LastAddConfirmed is 0 ! so my first entry is lost ! >>>> >>>> Am I missing something ? >>>> The LAC has not been advanced even with 3 successful writes to the >>>> ledger >>>> >>> >>> Are you using readLastAddConfirmed at step 5)? 0 is expected when you >>> using #readLastAddConfirmed, since the call is retrieving LAC from bookies. >>> >>> >> I have added a call to readLastAddConfirmed but it does not help, the LAC >> is still 0 >> >> >> I have double checked my code and the "close" happens in another thread, >> maybe I am hitting a race condition in my code, in fact the problem is not >> reproducible at 100% >> >> >> I will continue debugging >> >> Thanks >> >> >> >> >>> At step 3), it already closed the ledger and update the metadata's last >>> entry id with 2. so for tailing read, you might need to call >>> #readLastAddConfirmed and then get the last add confirmed using >>> #getLastAddConfirmed. >>> >>> >>> >>>> >>>> From the "writer" point-of-view the fact that the callback as been >>>> called is not a guarantee that the entry as been written to durable storage >>>> ? >>>> >>>> I understand that of a "tailing" reader (a ledger opened with >>>> openLedgerNoRecovery) it is very important not to be able to "read" data >>>> before the LAC, but for the "writer" it is important that it can count on >>>> the fact the entry will be recovered with success >>>> >>>> >>>> Thanks >>>> -- Enrico >>>> >>>> >>> >> >