>>We have a background thread that performs 'force', in order to group
>>commits.
>>Please check.
Thanks. I see it in the code now. The response callback is invoked only by
the ForceWriteThread.
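For readers following the thread, here is a minimal sketch of the group-commit pattern described above: writers enqueue entries, a single background thread writes whatever has accumulated, calls 'force' once for the whole batch, and only then runs the callbacks. This is not BookKeeper's code. The names GroupCommitJournal, Pending, POISON and append are hypothetical and exist only for illustration; the only real APIs used are java.nio's FileChannel.write and FileChannel.force.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical group-commit journal for illustration; not BookKeeper's Journal. */
final class GroupCommitJournal implements AutoCloseable {
    /** One pending append: the bytes plus the callback to run once they are durable. */
    private static final class Pending {
        final ByteBuffer data;
        final Runnable onDurable;
        Pending(ByteBuffer data, Runnable onDurable) {
            this.data = data;
            this.onDurable = onDurable;
        }
    }

    /** Marker entry that tells the background thread to stop. */
    private static final Pending POISON = new Pending(null, null);

    private final FileChannel channel;
    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final Thread forceThread;

    GroupCommitJournal(Path file) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        forceThread = new Thread(this::run, "force-write");
        forceThread.start();
    }

    /** Enqueue an entry; onDurable runs only after a force() call has covered it. */
    void append(byte[] entry, Runnable onDurable) {
        queue.add(new Pending(ByteBuffer.wrap(entry), onDurable));
    }

    private void run() {
        List<Pending> batch = new ArrayList<>();
        try {
            boolean running = true;
            while (running) {
                batch.clear();
                batch.add(queue.take());   // block until at least one entry arrives
                queue.drainTo(batch);      // then take everything else that piled up
                List<Runnable> callbacks = new ArrayList<>();
                for (Pending p : batch) {
                    if (p == POISON) {
                        running = false;
                        continue;
                    }
                    while (p.data.hasRemaining()) {
                        channel.write(p.data);
                    }
                    callbacks.add(p.onDurable);
                }
                if (!callbacks.isEmpty()) {
                    channel.force(false);  // one sync call for the whole batch
                    callbacks.forEach(Runnable::run);
                }
            }
        } catch (IOException | InterruptedException e) {
            // A real implementation would fail the pending callbacks instead.
            throw new RuntimeException(e);
        }
    }

    @Override
    public void close() throws IOException, InterruptedException {
        queue.add(POISON);
        forceThread.join();
        channel.close();
    }
}
```

With this shape, a caller would pass something like a latch countdown or a response-sending lambda as onDurable. The cost of each sync is shared by every entry that arrived while the previous sync was in progress, which is why acknowledging only after 'force' does not have to mean one sync per write.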
>>I suggest you to make benchmarks on your use case.
I will try to write a benchmark. I was looking for ways to optimise
throughput and latency, even when fsync needs to be done for every write
operation.

Thanks,
Unmesh

On Tue, Jun 2, 2020 at 9:15 PM Enrico Olivelli <eolive...@gmail.com> wrote:

> On Tue, Jun 2, 2020, 17:09 Unmesh Joshi <unmeshjo...@gmail.com> wrote:
>
> > >>>This sounds strange and you are not the first one that is asking
> > >>>this question
> > If the order is changed, writing to journal ahead of writing to ledger,
> > will it make any difference?
> >
>
> AFAIK it should not make any difference.
>
> > >>>The acknowledgement is sent to the client only after a successful
> > >>>fdatasync on the journal (if you do not ask for DEFERRED_SYNC or
> > >>>disable fsyncs explicitly)
> > Ah, I missed the callback passed in the QueueEntry. The flush
> > implementation, though, seems to be writing to file
> > (BufferedChannel.flush); it doesn't seem to be doing an actual
> > fileChannel.force?
> >
>
> The callback is only called after 'force'.
> I don't have my laptop here now but I am sure.
> We have a background thread that performs 'force', in order to group
> commits.
> Please check.
>
> > >>it is super fast and it
> > >>guarantees the data have been persisted durable.
> > Just curious, are there any throughput/latency tests to look at?
> >
>
> We only have a benchmark tool, but not public results.
> I suggest you to make benchmarks on your use case.
>
> We will be happy to help you
>
> Enrico
>
> > Thanks,
> > Unmesh
> >
> > On Tue, Jun 2, 2020 at 7:23 PM Enrico Olivelli <eolive...@gmail.com>
> > wrote:
> >
> > > On Tue, Jun 2, 2020, 15:20 Unmesh Joshi <unmeshjo...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I was going through bookkeeper code, particularly to see when and how
> > > > transaction logs are written and flushed to disk.
> > > > Just curious to understand why, in the Bookie.addEntryInternal
> > > > method, writes to the journal happen after the writes to the ledger.
> > > > (
> > > > https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java
> > > > )
> > > > Also, journal writes are not flushed to disk synchronously, as they
> > > > happen in their own dedicated thread (and can also be done in
> > > > batches).
> > > > So I had two questions.
> > > > 1. Why are journal writes not done before the writes to ledgers?
> > > >
> > >
> > > This sounds strange and you are not the first one that is asking this
> > > question.
> > > Basically entries in BK are immutable and when the bookie restarts it
> > > replays the journal.
> > > The LAC protocol shields reader clients from reading entries that have
> > > not been acknowledged to the writer.
> > >
> > > > 2. Why not wait until journal writes are successful (even if maybe
> > > > not synced to disk) before returning the response?
> > > >
> > >
> > > The acknowledgement is sent to the client only after a successful
> > > fdatasync on the journal (if you do not ask for DEFERRED_SYNC or
> > > disable fsyncs explicitly).
> > > This is basically one of the core features of BK: it is super fast and
> > > it guarantees the data have been durably persisted.
> > >
> > > Enrico
> > >
> > > > If these things are not done, is there always a risk of losing data
> > > > in case of a server or disk crash?
> > > >
> > > > Thanks,
> > > > Unmesh