On Thu, Aug 17, 2017 at 4:42 AM, Enrico Olivelli <eolive...@gmail.com>
wrote:

> Hi,
> I am working with my colleagues on an implementation that relaxes the
> constraint that every acknowledged entry must have been successfully
> written and fsynced to disk at the journal level.
>
> The idea is to add a flag to addEntry that requests acknowledgement not
> after the fsync in the journal but as soon as the data has been
> successfully written and flushed to the OS.
>
> I have the requirement that if an entry requires a sync, all the entries
> successfully sent 'before' that entry (causality) must be synced too, even
> if they were added with the new relaxed-durability flag.
>
> Imagine a database transaction log: during a transaction I would write
> every data change to the WAL with the new flag, and only the commit
> command would be added with the sync requirement. The idea is that all
> the changes inside the scope of the transaction are meaningful only if
> the transaction is committed, so it is important that the commit entry
> isn't lost; and if the commit entry isn't lost, none of the other entries
> of the same transaction are lost either.
>
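The causality rule described above can be sketched as a toy model: relaxed entries accumulate in an un-synced buffer, and a synced entry forces an fsync that also covers every entry acknowledged before it. This is illustrative only; the `ToyJournal` class and the boolean flag are hypothetical stand-ins, not the actual BookKeeper journal or client API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the proposed journal behavior (hypothetical API, not BookKeeper's).
class ToyJournal {
    private final List<Long> unsynced = new ArrayList<>(); // acknowledged, not yet fsynced
    private final List<Long> durable = new ArrayList<>();  // fsynced to disk

    // addEntry with the proposed relaxed-durability flag.
    void addEntry(long entryId, boolean requireSync) {
        unsynced.add(entryId);
        if (requireSync) {
            // Causality: a synced entry flushes every entry sent before it.
            durable.addAll(unsynced);
            unsynced.clear();
        }
    }

    List<Long> durableEntries() { return durable; }
}

class CausalityDemo {
    public static void main(String[] args) {
        ToyJournal j = new ToyJournal();
        j.addEntry(0, false); // WAL record, relaxed
        j.addEntry(1, false); // WAL record, relaxed
        j.addEntry(2, true);  // commit record, requires sync
        // After the sync entry, entries 0 and 1 are durable as well.
        System.out.println(j.durableEntries()); // prints [0, 1, 2]
    }
}
```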

This is a great use case and we would love to work with you on it.
I believe there are a lot of corner cases. What happens if there is an
ensemble change after a few 'relaxed' writes but before the 'sync' write?
The sync flag goes only to the new ensemble, and the data written with the
'relaxed' flag on the old ensemble has no guarantees.

I have a use case where we would like to do this at ledger granularity:
basically close-to-open consistency.


> I have another use case. In another project I am storing binary objects
> in BK and I have to get good performance even on single-disk bookie
> layouts (journal + data + index on the same partition). In this project it
> is acceptable to compensate for the risk of skipping fsync by requesting
> enough replication.
> IMHO it is somewhat like the Kafka idea of durability: as far as I know,
> Kafka by default does not impose an fsync; it leaves flushing to the OS
> and relies on a minimal, configurable number of replicas being in sync.
>
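For comparison, Kafka's default stance is indeed to leave flushing to the OS page cache and get durability from replication. A typical broker/producer combination illustrating that trade-off (these are real Kafka settings, shown here only to make the analogy concrete) looks like:

```properties
# Broker: leave fsync scheduling to the OS (Long.MAX_VALUE is the default,
# i.e. never force a flush on a message-count boundary).
log.flush.interval.messages=9223372036854775807
# Refuse writes unless at least 2 replicas are in sync.
min.insync.replicas=2

# Producer: wait for all in-sync replicas to acknowledge.
acks=all
```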
> There are many open points, already raised by Matteo, JV and Sijie:
> - LAC protocol?
> - replication in case of lost entries?
> - under production load, mixing non-synced entries with synced entries
> will not give much benefit
>
>
> For the LAC protocol I think there is no impact: the point is that the
> LastAddConfirmed is the maximum entry id known to have been acknowledged
> to the writer, so durability is not a concern there. You can lose entries
> even with fsync, just by losing all the disks which contain the data.
> Without fsync it is simply more probable.
>
> Replication: maybe we should record in the ledger metadata that the
> ledger allows this feature and deal with it accordingly. But I am not
> sure; I need to understand better how LedgerHandleAdv deals with sparse
> entry ids during the re-replication process.
>
> Mixed workload: honestly I would like to add this feature to limit the
> number of fsyncs, and I expect lots of bursts of unsynced entries
> interleaved with a few synced entries. I know that this feature is not to
> be encouraged in general but only in specific cases, much like
> LedgerHandleAdv or readUnconfirmedEntries.
>
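The expected benefit of the burst pattern above is essentially group commit: one fsync covers the whole run of relaxed entries that precede the synced one. A toy count of fsyncs under the two policies (the flag and helper are hypothetical, for illustration only):

```java
// Toy fsync counter for the burst pattern described above
// (hypothetical relaxed-durability flag, not the real BookKeeper API).
class GroupCommitDemo {
    // Count how many fsyncs the journal would issue for a sequence of
    // entries, where syncFlags[i] says whether entry i requires a sync.
    static int fsyncsFor(boolean[] syncFlags) {
        int fsyncs = 0;
        for (boolean requireSync : syncFlags) {
            if (requireSync) {
                fsyncs++; // one fsync covers this entry and every relaxed entry before it
            }
        }
        return fsyncs;
    }

    public static void main(String[] args) {
        // 9 relaxed WAL entries followed by 1 synced commit entry.
        boolean[] burst = new boolean[10];
        burst[9] = true;
        System.out.println(fsyncsFor(burst)); // prints 1: one fsync instead of 10
    }
}
```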
> If this makes sense to you, I will create a BP and attach a first patch.
>
> Enrico
>
> --
> Enrico Olivelli



-- 
Jvrao
