Handling two (or more) clients: writes append to a log but are marked as speculative?
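
To make the question concrete, here is a minimal sketch of what I mean, assuming a single in-memory log shared by all clients. The names (sharedLog, Append, Flush, Durable) are made up for illustration; this is not Tango's protocol and not anything in Camlistore today. Every upload is appended as speculative, and a flush commits everything appended before it, whichever client wrote it.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is one record in a hypothetical shared upload log.
type entry struct {
	client    string // who uploaded the blob
	blobRef   string // content address of the blob
	committed bool   // false while the write is still speculative
}

// sharedLog is visible to all clients, so they agree on commit state.
type sharedLog struct {
	mu      sync.Mutex
	entries []entry
}

// Append records an upload as speculative and returns its log position.
func (l *sharedLog) Append(client, blobRef string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.entries = append(l.entries, entry{client: client, blobRef: blobRef})
	return len(l.entries) - 1
}

// Flush marks every entry appended so far as committed. It stands in for
// a storage-level sync that covers all earlier writes, whoever made them.
func (l *sharedLog) Flush() {
	l.mu.Lock()
	defer l.mu.Unlock()
	for i := range l.entries {
		l.entries[i].committed = true
	}
}

// Durable reports whether some committed entry holds blobRef.
func (l *sharedLog) Durable(blobRef string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	for _, e := range l.entries {
		if e.blobRef == blobRef && e.committed {
			return true
		}
	}
	return false
}

func main() {
	l := &sharedLog{}
	l.Append("A", "sha1-aaa")          // client A uploads a blob
	l.Append("B", "sha1-aaa")          // client B uploads the same blob
	fmt.Println(l.Durable("sha1-aaa")) // false: still speculative
	l.Flush()
	fmt.Println(l.Durable("sha1-aaa")) // true: either client may rely on it
}
```

Since the commit state lives in the log rather than in each client, a client that skips a blob because someone else already sent it can still tell whether that blob is durable yet.
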
c/o Tango: Distributed Data Structures over a Shared Log - Mahesh Balakrishnan et al.
http://www.cs.cornell.edu/~taozou/sosp13/tangososp.pdf

On Mon, Oct 3, 2016 at 10:51 AM, zimbatm <[email protected]> wrote:

> Hi,
>
> Wouldn't the blob commit state also have to be shared in case two clients
> upload the same blob at the same time? Otherwise one client might upload a
> tree of blobs and lose the subset that's been uploaded by another client.
>
> On Mon, 3 Oct 2016, 17:26 Brad Fitzpatrick, <[email protected]> wrote:
>
>> I've actually been thinking that sync should be an explicit part of the
>> protocol so higher levels can decide the atomicity that they require.
>>
>> Then we make everything async by default, but all blob storage
>> implementations must support a sync (or "Flush"?) operation. And then
>> camput and other tools would be sure to do a sync at the end before they
>> return success. Or maybe they even have a flag (defaulting to --sync=true?)
>> to let the caller control.
>>
>> Thoughts? And on naming?
>>
>> On Sun, Oct 2, 2016 at 12:21 PM, Theodore Ts'o <[email protected]>
>> wrote:
>>
>> On Sunday, October 2, 2016 at 2:48:31 PM UTC-4, Theodore Ts'o wrote:
>>
>> (Especially since if you crash before the permanode is written, the
>> client is going to have to restart the whole backup from scratch anyway.)
>>
>> One thought --- as an automated heuristic, if the blobserver receives a
>> stream of unsigned blobs, it doesn't need to fsync() them. After all,
>> any objects which aren't referenced by a permanode are subject to GC
>> treatment. So if you crash and then run a GC, any immutable, non-signed
>> objects that were uploaded just before the crash would be GC'ed anyway.
>> Hence, there's no point to treat them as precious objects that have to be
>> fsync'ed before the client upload is acknowledged.
>> So what could be done is when the first signed object is received, the
>> blob server could send down a sync(2) command, and then write all of the
>> signed objects using fsync(2).
>>
>> If we did this, the next obvious optimization would be to tune the
>> writeback interval for the disk in question to be 2-3 minutes, instead of
>> the usual 30 seconds. I noticed that objects were getting written as
>> loose files, and then repacked into pack file approximately every 2 minutes
>> or so. All modern file systems do delayed allocation, which means that
>> if we're not fsync'ing the loose files, they won't get flushed to disk, and
>> so if they are written into the packed file and then get deleted within the
>> writeback interval, the loose files will never get written to disk. This
>> will double camlistore's effective write throughput to the disk, since we
>> won't be writing each byte being backed up twice --- once to the loose
>> file, and a second time to the pack file.
>>
>> Cheers,
>>
>> - Ted
>>
>> P.S. I assume there are good reasons why we can't just stream the
>> objects straight to the pack file, which is what git does? I noticed
>> there were some comments about wanting to rearrange the objects so they
>> would be in an optimal order for later access. Is that right?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Camlistore" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
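
For reference, the explicit sync/Flush operation and the signed-blob heuristic discussed in the quoted messages could be sketched roughly as below. This is only a sketch under assumptions: the Storage interface, diskStorage, newTempStorage, cleanup, and the signed parameter are hypothetical names, not Camlistore's actual blobserver API. It also fsyncs each pending file individually rather than issuing a global sync(2), which is a substitution made to keep the example portable.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Storage is a hypothetical interface, not Camlistore's real blobserver API.
type Storage interface {
	Put(ref string, data []byte, signed bool) error
	Flush() error
}

// diskStorage writes each blob as a loose file under dir.
type diskStorage struct {
	dir     string
	pending []string // files written but not yet fsynced
}

var _ Storage = (*diskStorage)(nil)

// newTempStorage creates a diskStorage backed by a fresh temporary directory.
func newTempStorage() (*diskStorage, error) {
	dir, err := os.MkdirTemp("", "blobs")
	if err != nil {
		return nil, err
	}
	return &diskStorage{dir: dir}, nil
}

// cleanup removes the temporary directory and everything in it.
func (s *diskStorage) cleanup() error { return os.RemoveAll(s.dir) }

// Put writes the blob without syncing. A signed blob triggers a Flush,
// so everything it may reference is on disk before Put returns.
func (s *diskStorage) Put(ref string, data []byte, signed bool) error {
	path := filepath.Join(s.dir, ref)
	if err := os.WriteFile(path, data, 0o600); err != nil {
		return err
	}
	s.pending = append(s.pending, path)
	if signed {
		return s.Flush()
	}
	return nil
}

// Flush fsyncs every pending file. A real implementation would also need
// to sync the containing directory; that is omitted here.
func (s *diskStorage) Flush() error {
	for _, path := range s.pending {
		f, err := os.OpenFile(path, os.O_RDWR, 0)
		if err != nil {
			return err
		}
		err = f.Sync()
		if cerr := f.Close(); err == nil {
			err = cerr
		}
		if err != nil {
			return err
		}
	}
	s.pending = nil
	return nil
}

func main() {
	s, err := newTempStorage()
	if err != nil {
		panic(err)
	}
	defer s.cleanup()

	if err := s.Put("sha1-aaa", []byte("file chunk"), false); err != nil {
		panic(err)
	}
	fmt.Println(len(s.pending)) // 1: unsigned blob is not yet synced

	if err := s.Put("sha1-bbb", []byte("signed claim"), true); err != nil {
		panic(err)
	}
	fmt.Println(len(s.pending)) // 0: the signed blob forced a flush
}
```

A tool like camput would then call Flush once at the end before reporting success, which is where a --sync style flag could plug in.
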
