👍, looking forward to seeing the BP. On Thu, Aug 17, 2017 at 7:42 PM, Enrico Olivelli <eolive...@gmail.com> wrote:
> Hi,
> I am working with my colleagues on an implementation to relax the
> constraint that every acknowledged entry must have been successfully
> written and fsynced to disk at the journal level.
>
> The idea is to have a flag in addEntry that asks for the acknowledgement
> not after the fsync in the journal, but as soon as the data has been
> successfully written and flushed to the OS.
>
> I have the requirement that if an entry requires a sync, all the entries
> successfully sent 'before' that entry (causality) are synced too, even if
> they were added with the new relaxed-durability flag.
>
> Imagine a database transaction log: during a transaction I will write
> every data change to the WAL with the new flag, and only the commit
> command will be added with the sync requirement. The idea is that the
> changes inside the scope of the transaction have a meaning only if the
> transaction is committed, so it is important that the commit entry is not
> lost, and if that entry is not lost then none of the other entries of the
> same transaction are lost either.
>
> I have another use case. In another project I am storing binary objects
> in BK and I have to get good performance even on single-disk bookie
> layouts (journal + data + index on the same partition). In that project
> it is acceptable to offset the risk of not doing the fsync by requesting
> enough replication.
> IMHO it would be somewhat like the Kafka idea of durability: as far as I
> know, Kafka by default does not force an fsync but leaves it to the OS,
> relying on a minimal configurable number of replicas being in sync.
>
> There are many open points, already raised by Matteo, JV and Sijie:
> - LAC protocol?
> - replication in case of lost entries?
> - under production load, mixing non-synced entries with synced entries
>   will not give much benefit
>
> For the LAC protocol I think there is no impact: the point is that the
> LastAddConfirmed is the max entry id which is known to have been
> acknowledged to the writer, so durability is not its concern. You can
> lose entries even with fsync, just by losing all the disks which contain
> the data. Without fsync it is simply more probable.
>
> Replication: maybe we should record in the ledger metadata that the
> ledger allows this feature and deal with it accordingly. But I am not
> sure; I have to understand better how LedgerHandleAdv deals with sparse
> entry ids inside the re-replication process.
>
> Mixed workload: honestly I would like to add this feature to limit the
> number of fsyncs, and I expect lots of bursts of unsynced entries to be
> interleaved with a few synced entries. I know that this feature is not to
> be encouraged in general but reserved for specific cases, as was the case
> for LedgerHandleAdv or readUnconfirmedEntries.
>
> If this makes sense to you I will create a BP and attach a first patch
>
> Enrico
>
> --
> -- Enrico Olivelli
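To make the causality rule concrete, here is a toy model of the proposed semantics. All the names here (RelaxedJournal, the boolean flag on addEntry, isDurable) are hypothetical illustrations, not the actual bookie code: the point it sketches is that one fsync, triggered by a sync-required entry, covers that entry plus every entry written before it, including entries added with the relaxed flag.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the proposed journal semantics (hypothetical names, not
// BookKeeper code). Relaxed entries are acknowledged after the write to
// the OS; a sync entry forces an fsync that also covers every earlier entry.
class RelaxedJournal {
    private final List<Long> written = new ArrayList<>(); // in OS cache, not yet fsynced
    private final List<Long> synced = new ArrayList<>();  // covered by an fsync

    // addEntry with the proposed durability flag
    void addEntry(long entryId, boolean requireSync) {
        written.add(entryId);          // write + flush to the OS
        if (requireSync) {
            // causality: one fsync makes this entry and all earlier ones durable
            synced.addAll(written);
            written.clear();
        }
        // the ack to the client would be sent here in both cases
    }

    boolean isDurable(long entryId) {
        return synced.contains(entryId);
    }
}
```

In the transaction-log use case above, entries 0 and 1 would be WAL changes added with the relaxed flag, and entry 2 the commit added with requireSync=true; after the commit, all three are durable.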
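On the LAC point, a minimal sketch of the argument that acknowledgement, not durability, drives the protocol (LacTracker and onAck are invented names for illustration, not the BookKeeper implementation): LAC only advances over a contiguous prefix of acknowledged entry ids, and it advances the same way whether each ack followed an fsync or just an OS flush.

```java
import java.util.HashSet;
import java.util.Set;

// Toy illustration (hypothetical names): LastAddConfirmed tracks the highest
// entry id such that it and every lower id have been acknowledged. Nothing in
// this rule depends on whether the ack implied an fsync.
class LacTracker {
    private long lac = -1;                       // LastAddConfirmed, -1 = none yet
    private final Set<Long> acked = new HashSet<>();

    void onAck(long entryId) {
        acked.add(entryId);
        // advance over the contiguous acknowledged prefix
        while (acked.contains(lac + 1)) {
            lac++;
        }
    }

    long getLac() { return lac; }
}
```

For example, if entries 0 and 2 are acked but 1 is not, LAC stays at 0; it jumps to 2 only once 1 is acked. Losing unsynced data after a crash can only lose entries, it cannot make LAC claim an entry that was never acknowledged.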