On Sunday, October 2, 2016 at 2:48:31 PM UTC-4, Theodore Ts'o wrote:
>
>  (Especially since if you crash before the permanode is written, the 
> client is going to have to restart the whole backup from scratch anyway.)
>

One thought: as an automated heuristic, if the blobserver receives a 
stream of unsigned blobs, it doesn't need to fsync() them. After all, 
any objects that aren't referenced by a permanode are subject to GC. 
So if you crash and then run a GC, any immutable, unsigned objects that 
were uploaded just before the crash would be GC'ed anyway. Hence, there's 
no point in treating them as precious objects that have to be fsync'ed 
before the client upload is acknowledged. What could be done instead is 
that when the first signed object is received, the blob server issues a 
sync(2) call to flush everything written so far, and then writes all of 
the signed objects using fsync(2).
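
To make the heuristic concrete, here is a minimal Python sketch. It is not 
Camlistore code: the class name DeferredSyncStore, the receive() method and 
the "signed" flag are all made up for illustration. It also assumes one 
small generalization: sync(2) is issued before any signed blob that follows 
unsynced unsigned blobs, not only before the very first signed one. It does 
not fsync the containing directory, which a real implementation would 
probably want to do.

```python
import os
import tempfile


class DeferredSyncStore:
    """Hypothetical blob store: unsigned blobs are written without fsync."""

    def __init__(self, root, sync_all=os.sync, fsync=os.fsync):
        self.root = root
        self._sync_all = sync_all  # wraps sync(2); Unix only
        self._fsync = fsync        # wraps fsync(2)
        self.unsynced = 0          # unsigned blobs written since the last sync
        self.log = []              # which sync calls were issued, in order

    def receive(self, name, data, signed):
        if signed and self.unsynced:
            # Flush all earlier unsigned blobs before the signed blob
            # that may reference them is made durable.
            self._sync_all()
            self.log.append("sync")
            self.unsynced = 0
        path = os.path.join(self.root, name)
        with open(path, "wb") as f:
            f.write(data)
            if signed:
                f.flush()  # push Python's buffer to the kernel first
                self._fsync(f.fileno())
                self.log.append("fsync")
            else:
                self.unsynced += 1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        store = DeferredSyncStore(root)
        store.receive("chunk-1", b"a" * 1024, signed=False)
        store.receive("chunk-2", b"b" * 1024, signed=False)
        store.receive("signed-claim", b"{}", signed=True)
        print(store.log)  # ['sync', 'fsync']
```

The sync functions are passed in as parameters only so the ordering can be 
checked without touching a real disk.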

If we did this, the next obvious optimization would be to tune the 
writeback interval for the disk in question to be 2-3 minutes, instead of 
the usual 30 seconds. I noticed that objects were getting written as 
loose files, and then repacked into a pack file approximately every 2 
minutes. All modern file systems do delayed allocation, which means that 
if we're not fsync'ing the loose files, they won't get flushed to disk 
until writeback kicks in. So if they are copied into the pack file and 
then deleted within the writeback interval, the loose files will never 
get written to disk at all. This could roughly double camlistore's 
effective write throughput to the disk, since we would no longer be 
writing each byte being backed up twice: once to the loose file, and a 
second time to the pack file.
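
As a rough sanity check of that arithmetic, here is a small Python sketch. 
The function names are hypothetical. It assumes Linux, where the "30 
seconds" corresponds to the vm.dirty_expire_centisecs sysctl (default 3000). 
Note that this sysctl is system-wide rather than per-disk, and that memory 
pressure can force dirty data out earlier, so treat the result as an 
estimate only.

```python
from pathlib import Path

# Age at which dirty data becomes eligible for writeback, in centiseconds.
EXPIRE_SYSCTL = Path("/proc/sys/vm/dirty_expire_centisecs")


def expire_seconds(default_centisecs=3000):
    """Return the current dirty-expire interval in seconds.

    Falls back to the Linux default if the sysctl cannot be read.
    """
    try:
        return int(EXPIRE_SYSCTL.read_text()) / 100
    except (OSError, ValueError):
        return default_centisecs / 100


def loose_files_can_skip_disk(repack_interval_s, expire_s):
    """True if a loose file should be packed and deleted before it expires."""
    return repack_interval_s < expire_s


def bytes_written(payload_bytes, loose_files_hit_disk):
    """Bytes sent to disk: loose copy plus pack copy, or pack copy only."""
    return payload_bytes * (2 if loose_files_hit_disk else 1)


if __name__ == "__main__":
    repack_s = 120  # "approximately every 2 minutes" from the observation above
    for expire_s in (30, 180):
        skip = loose_files_can_skip_disk(repack_s, expire_s)
        total = bytes_written(10**9, loose_files_hit_disk=not skip)
        print(f"expire={expire_s}s skip_loose={skip} disk_bytes={total}")
    print("this machine:", expire_seconds(), "s")
```

With a 30 second interval, each gigabyte backed up costs two gigabytes of 
disk writes; with a 180 second interval and a 120 second repack cycle it 
costs one.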

Cheers,

- Ted

P.S. I assume there are good reasons why we can't just stream the objects 
straight to the pack file, which is what git does? I noticed there were 
some comments about wanting to rearrange the objects so they would be in an 
optimal order for later access. Is that right?
