I think that's a fair consideration. However, I am thinking that if we allow non-durable ledgers, it means 1) the application needs to handle the missing entries; 2) re-replication should handle non-durable ledgers by ignoring entries that are missing.
But let's see what Jia is proposing.

- Sijie

On Fri, Jun 3, 2016 at 8:57 AM, Venkateswara Rao Jujjuri <jujj...@gmail.com> wrote:

> @sijie let me expand on what I mean by "this changes something fundamental".
>
> Everything starts from the fact that we are not persisting. Also, I share
> a lot of the points raised by @Matteo.
>
> - In theory, we could lose all copies of EntryId X but persist EntryId
>   X+Y. How do reads, replication, and consistency cope with it?
> - We could advance LAC, but lose the last set of entries. What do we do?
>   Do we adjust LAC? At what boundaries?
> - One of the core principles of a LOG is that if entry X is there, all the
>   entries up until X are available too. With this we may need to deal with
>   sparse / missing entries.
>
> I believe this is more of a direction towards making BookKeeper an in-memory
> log, but I am afraid it is more of a core change.
>
> Thanks,
> JV
>
> On Fri, Jun 3, 2016 at 12:05 AM, Matteo Merli <mme...@apache.org> wrote:
>
>> I was interested in trying something in this area, but never actually got
>> to do it.
>>
>> A few random notes:
>>
>> 1. My suspicion, with no backing data at this point, is that simply
>>    skipping the fsync for "non-durable" ledgers might not give a big
>>    improvement, just a bit less latency for non-fsynced writes but
>>    roughly the same throughput. Imagine a bookie receiving writes for
>>    2 ledgers, 1 durable and the other non-durable. Since the entries
>>    are appended to the journal as they come in, the fsync() for the
>>    durable ledger write will also carry along the data for the previous
>>    non-durable ledger write, causing more IOPS if that was spanning a
>>    different disk block. Given that the bookie throughput is typically
>>    limited by the IOPS capacity of the journal device, having
>>    non-durable writes might not help that much.
>>
>> 2. The other options I was thinking of were:
>>    - Do not append the non-durable entries to the journal (redundancy is
>>      given anyway by writing to multiple bookies).
>>      In this case though, a single bookie could lose more entries,
>>      depending on flushTime, and could also lose entries in case of a
>>      process crash, not just a kernel panic or power outage.
>>
>>    - Use a separate journal for non-durable writes, which will not be
>>      fsynced.
>>
>>    - Configure the durability at the bookie level and then use a
>>      placement/isolation policy to choose the appropriate set of
>>      bookies for a non-durable ledger.
>>
>> 3. How will bookie replication operate when getting read errors?
>>
>> Matteo
>>
>> On Thu, Jun 2, 2016 at 11:09 PM Sijie Guo <si...@apache.org> wrote:
>>
>> > I think if a ledger is configured to be non-durable, it is kind of the
>> > application's responsibility to tolerate the data loss.
>> > So I don't think it will actually have to change anything on the
>> > bookkeeper client side.
>> >
>> > - Sijie
>> >
>> > On Thu, Jun 2, 2016 at 7:29 AM, Venkateswara Rao Jujjuri <jujj...@gmail.com> wrote:
>> >
>> > > I agree that we must make this a ledger property, not a per-entry
>> > > write property.
>> > >
>> > > But the biggest doubt in my mind is that this changes something
>> > > fundamental: LAC. Are we allowing a sparse ledger in a failure
>> > > scenario? Handling the read side may become more complex.
>> > >
>> > > On Thu, Jun 2, 2016 at 12:19 AM, Sijie Guo <guosi...@gmail.com> wrote:
>> > >
>> > >> This seems interesting to me. However, it might be safer to start
>> > >> with a flag configured per ledger, rather than per entry. Also, it
>> > >> would be good to hear opinions from other people. JV, Matteo? (If I
>> > >> remember correctly, Matteo mentioned that Yahoo might be working on
>> > >> a similar thing.)
>> > >>
>> > >> +1 for creating a BOOKKEEPER jira to track this.
>> > >>
>> > >> - Sijie
>> > >>
>> > >> On Wed, Jun 1, 2016 at 6:37 PM, Jia Zhai <zhaiji...@gmail.com> wrote:
>> > >>
>> > >> > + distributedlog-user
>> > >> > For more input and comments. :)
>> > >> >
>> > >> > Thanks.
>> > >> >
>> > >> > On Thu, Jun 2, 2016 at 9:34 AM, Jia Zhai <zhaiji...@gmail.com> wrote:
>> > >> >
>> > >> >> Hello all,
>> > >> >>
>> > >> >> I am wondering whether you guys have any plans to support relaxed
>> > >> >> durability. Is it a good feature to have in bookkeeper (also for
>> > >> >> DistributedLog)?
>> > >> >>
>> > >> >> I am thinking of adding a new flag to bookkeeper#addEntry(..., Boolean
>> > >> >> sync), so the application can control whether to sync or not for
>> > >> >> individual entries.
>> > >> >>
>> > >> >> - On the write protocol, add a flag to indicate whether this write
>> > >> >>   should sync to disk or not.
>> > >> >> - On the bookie side, if the addEntry request is sync, go through the
>> > >> >>   original pipeline. If the addEntry disables sync, complete the add
>> > >> >>   callbacks after writing to the journal file and before flushing the
>> > >> >>   journal.
>> > >> >> - Those add entries (with sync disabled) will be flushed to disk with
>> > >> >>   subsequent sync add entries.
>> > >> >>
>> > >> >> For my use cases on DistributedLog, this feature can be used to
>> > >> >> support streams that don't have strong durability requirements.
>> > >> >>
>> > >> >> What do you guys think? Shall I create a jira to implement this?
>> > >> >>
>> > >> >> Thanks a lot
>> > >> >> -Jia
>> > >> >>
>> > >> >
>> > >> > --
>> > >> > You received this message because you are subscribed to the Google Groups
>> > >> > "distributedlog-user" group.
>> > >> > To unsubscribe from this group and stop receiving emails from it, send an
>> > >> > email to distributedlog-user+unsubscr...@googlegroups.com.
>> > >> > To post to this group, send email to distributedlog-u...@googlegroups.com.
>> > >> > To view this discussion on the web visit
>> > >> > https://groups.google.com/d/msgid/distributedlog-user/CALsc%2BXpJj3YT47bognhmEhHmahJkCgJUUY6Un4HVczfK_1MxPQ%40mail.gmail.com
>> > >> > <https://groups.google.com/d/msgid/distributedlog-user/CALsc%2BXpJj3YT47bognhmEhHmahJkCgJUUY6Un4HVczfK_1MxPQ%40mail.gmail.com?utm_medium=email&utm_source=footer>.
>> > >> > For more options, visit https://groups.google.com/d/optout.
>> > >
>> > > --
>> > > Jvrao
>> > > ---
>> > > First they ignore you, then they laugh at you, then they fight you, then
>> > > you win. - Mahatma Gandhi
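
For readers following the thread, here is a minimal sketch of two points raised in it: the proposal to acknowledge a non-sync add after the journal write but before the flush, and Matteo's observation that an fsync for a durable write also carries along any non-durable entries appended before it. It is a toy in-memory model written in Python, not BookKeeper code. The names `ToyJournal`, `add_entry` and `crash` are hypothetical and do not correspond to any BookKeeper API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Entry = Tuple[int, int]  # (ledger_id, entry_id)


@dataclass
class ToyJournal:
    """Toy model of a single bookie journal (hypothetical, for illustration only)."""

    buffered: List[Entry] = field(default_factory=list)  # appended, not yet fsynced
    durable: List[Entry] = field(default_factory=list)   # covered by an fsync
    fsync_count: int = 0

    def add_entry(self, ledger_id: int, entry_id: int, sync: bool) -> str:
        # Every entry is appended to the same journal in arrival order.
        self.buffered.append((ledger_id, entry_id))
        if sync:
            # Durable write: acknowledge only after the journal is flushed.
            self._fsync()
            return "acked-after-fsync"
        # Non-durable write: acknowledge right after the append.
        return "acked-before-fsync"

    def _fsync(self) -> None:
        # One fsync persists everything buffered so far, including
        # non-durable entries appended before the durable one.
        self.durable.extend(self.buffered)
        self.buffered.clear()
        self.fsync_count += 1

    def crash(self) -> List[Entry]:
        # Entries that were acknowledged but never fsynced are lost.
        lost = list(self.buffered)
        self.buffered.clear()
        return lost


journal = ToyJournal()
journal.add_entry(ledger_id=1, entry_id=0, sync=False)  # non-durable ledger
journal.add_entry(ledger_id=2, entry_id=0, sync=True)   # durable ledger
journal.add_entry(ledger_id=1, entry_id=1, sync=False)  # acked, still buffered
lost = journal.crash()
print("durable:", journal.durable, "lost:", lost)
```

In this model the non-durable entry (1, 0) becomes durable only because a later sync add happened to flush it, while (1, 1) was acknowledged and then lost. The model has a single journal, so it only loses a tail. The sparse-ledger case JV describes (losing entry X but keeping X+Y) would need several bookies with different flush points, which this sketch does not attempt to model.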