Are you saying that instead of adding a Sync/Flush method to the
blobserver.Storage interface (which is what I believe Brad is
proposing), you'd add an argument to the blobserver.ReceiveBlob
method?

Or were you talking more specifically about how it would translate for
higher level tools like camput?
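
To make the two options concrete, here is a rough Go sketch of how I read them. None of this is the real blobserver API: Blob, FlushStorage, SyncArgStorage and the toy memStore are names I made up for illustration, and the real ReceiveBlob signature is different.

```go
package main

import (
	"errors"
	"fmt"
)

// Blob is a hypothetical stand-in for a blob ref plus its contents.
type Blob struct {
	Ref  string
	Data []byte
}

// FlushStorage is option A (assumed shape): writes are async by default
// and a separate Flush call makes everything received so far durable.
type FlushStorage interface {
	ReceiveBlob(b Blob) error
	Flush() error
}

// SyncArgStorage is option B (assumed shape): each put states whether
// it must be durable before the call returns.
type SyncArgStorage interface {
	ReceiveBlob(b Blob, sync bool) error
}

// memStore is a toy store that only tracks which refs count as durable.
type memStore struct {
	pending []string // received, not yet flushed
	durable []string // flushed
}

func (s *memStore) put(b Blob) error {
	if b.Ref == "" {
		return errors.New("empty blob ref")
	}
	s.pending = append(s.pending, b.Ref)
	return nil
}

func (s *memStore) flush() error {
	s.durable = append(s.durable, s.pending...)
	s.pending = nil
	return nil
}

// flushStore exposes memStore through option A.
type flushStore struct{ *memStore }

func (s flushStore) ReceiveBlob(b Blob) error { return s.put(b) }
func (s flushStore) Flush() error             { return s.flush() }

// argStore exposes memStore through option B.
type argStore struct{ *memStore }

func (s argStore) ReceiveBlob(b Blob, sync bool) error {
	if err := s.put(b); err != nil {
		return err
	}
	if sync {
		return s.flush()
	}
	return nil
}

// Compile-time checks that the adapters satisfy the interfaces.
var (
	_ FlushStorage   = flushStore{}
	_ SyncArgStorage = argStore{}
)

func main() {
	a := flushStore{&memStore{}}
	a.ReceiveBlob(Blob{Ref: "sha1-aaa"})
	a.ReceiveBlob(Blob{Ref: "sha1-bbb"})
	a.Flush() // one flush at the end, as a tool like camput might do
	fmt.Println("option A durable:", a.durable)

	b := argStore{&memStore{}}
	b.ReceiveBlob(Blob{Ref: "sha1-ccc"}, false)
	b.ReceiveBlob(Blob{Ref: "sha1-ddd"}, true) // this put also flushes what came before
	fmt.Println("option B durable:", b.durable)
}
```

With option A the caller decides when durability matters; with option B the decision travels with each put, and a server-side default would apply when the caller does not say.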

On 3 October 2016 at 20:40, Theodore Tso <[email protected]> wrote:
> Or maybe as an optional argument to the put operation that says whether
> the blob write should be flushed, with a config option to
> define what happens in the default case when the caller doesn't specify one
> way or another?
>
> - Ted
>
> On Mon, Oct 3, 2016 at 12:26 PM, Brad Fitzpatrick <[email protected]> wrote:
>>
>> I've actually been thinking that sync should be an explicit part of the
>> protocol so higher levels can decide the atomicity that they require.
>>
>> Then we make everything async by default, but all blob storage
>> implementations must support a sync (or "Flush"?) operation. And then camput
>> and other tools would be sure to do a sync at the end before they return success.
>> Or maybe they even have a flag (defaulting to --sync=true?) to let the
>> caller control.
>>
>> Thoughts? And on naming?
>>
>>
>> On Sun, Oct 2, 2016 at 12:21 PM, Theodore Ts'o <[email protected]>
>> wrote:
>>>
>>>
>>>
>>> On Sunday, October 2, 2016 at 2:48:31 PM UTC-4, Theodore Ts'o wrote:
>>>>
>>>>  (Especially since if you crash before the permanode is written, the
>>>> client is going to have to restart the whole backup from scratch anyway.)
>>>
>>>
>>> One thought: as an automated heuristic, if the blobserver receives a
>>> stream of unsigned blobs, it doesn't need to fsync() them. After all, any
>>> objects which aren't referenced by a permanode are subject to GC treatment.
>>> So if you crash and then run a GC, any immutable, non-signed objects that
>>> were uploaded just before the crash would be GC'ed anyway. Hence, there's
>>> no point in treating them as precious objects that have to be fsync'ed before
>>> the client upload is acknowledged. So what could be done is when the first
>>> signed object is received, the blobserver could send down a sync(2)
>>> command, and then write all of the signed objects using fsync(2).
>>>
>>> If we did this, the next obvious optimization would be to tune the
>>> writeback interval for the disk in question to be 2-3 minutes, instead of
>>> the usual 30 seconds. I noticed that objects were getting written as loose
>>> files, and then repacked into a pack file approximately every 2 minutes or so.
>>> All modern file systems do delayed allocation, which means that if we're not
>>> fsync'ing the loose files, they won't get flushed to disk right away, and so if
>>> they are written into the pack file and then get deleted within the writeback
>>> interval, the loose files will never get written to disk. This will
>>> double camlistore's effective write throughput to the disk, since we won't
>>> be writing each byte being backed up twice: once to the loose file, and a
>>> second time to the pack file.
>>>
>>> Cheers,
>>>
>>> - Ted
>>>
>>> P.S. I assume there are good reasons why we can't just stream the
>>> objects straight to the pack file, which is what git does? I noticed
>>> there were some comments about wanting to rearrange the objects so they
>>> would be in an optimal order for later access. Is that right?
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "Camlistore" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
