Leaving the fsync question aside for now (I need to think about it some more, and I'm hoping Brad will reply in the meantime anyway), here is an answer to your question about why we don't pack all incoming blobs directly. Packed blobs are meant to help with sequential access to files, so packing is basically a reassembly of all the small blobs of a file into one zip (or more, if needed). For efficiency reasons, I think, blobs/files under 512 bytes in size are not stored in a pack, so they stay in the loose blobserver forever. That is at least one reason why you can't just stream all blobs directly to the packed blobserver.
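To make the size cutoff concrete, here is a rough Go sketch of the routing decision as I understand it. It is not the actual blobpacked code: `packThreshold`, `destination` and `routeFile` are names made up for illustration, and the 512-byte value is only the figure given in this message, not one checked against the source.

```go
package main

import "fmt"

// packThreshold is the size cutoff mentioned in this message.
// Hypothetical constant: the value is an assumption, not taken from the source.
const packThreshold = 512

// destination names the storage a file's blobs end up in.
type destination string

const (
	loose  destination = "loose"
	packed destination = "packed"
)

// routeFile reports where the blobs of a file of the given total size
// would end up: files under the threshold stay loose, the rest get packed.
func routeFile(fileSize int64) destination {
	if fileSize < packThreshold {
		return loose
	}
	return packed
}

func main() {
	for _, size := range []int64{100, 511, 512, 4 << 20} {
		fmt.Printf("%d bytes -> %s\n", size, routeFile(size))
	}
}
```

Because the decision depends on the size of the whole file, it cannot be made for a single incoming blob in isolation, which is why incoming blobs land in the loose blobserver first.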

On 2 October 2016 at 21:21, Theodore Ts'o <[email protected]> wrote:
>
>
> On Sunday, October 2, 2016 at 2:48:31 PM UTC-4, Theodore Ts'o wrote:
>>
>> (Especially since if you crash before the permanode is written, the
>> client is going to have to restart the whole backup from scratch anyway.)
>
>
> One thought --- as an automated heuristic, if the blobserver receives a
> stream of unsigned blobs, it doesn't need to fsync() them. After all, any
> objects which aren't referenced by a permanode are subject to GC treatment.
> So if you crash and then run a GC, any immutable, non-signed objects that
> were uploaded just before the crash would be GC'ed anyway. Hence, there's
> no point to treat them as precious objects that have to be fsync'ed before
> the client upload is acknowledged. So what could be done is when the first
> signed object is received, the blob server could send down a sync(2)
> command, and then write all of the signed objects using fsync(2).
>
> If we did this, the next obvious optimization would be to tune the writeback
> interval for the disk in question to be 2-3 minutes, instead of the usual 30
> seconds. I noticed that objects were getting written as loose files, and
> then repacked into pack file approximately every 2 minutes or so. All
> modern file systems do delayed allocation, which means that if we're not
> fsync'ing the loose files, they won't get flushed to disk, and so if they
> are written into the packed file and then get deleted within the writeback
> interval, the loose files will never get written to disk. This will
> double camlistore's effective write throughput to the disk, since we won't
> be writing each byte being backed up twice --- once to the loose file, and a
> second time to the pack file.
>
> Cheers,
>
> - Ted
>
> P.S. I assume there are good reasons why we can't just stream the objects
> straight to the pack file, which is what git does? I noticed there were
> some comments about wanting to rearrange the objects so they would be in an
> optimal order for later access. Is that right?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Camlistore" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
