Hi,

I noticed that a test backup of 76 GB was taking "a while", so I started
digging into why. It appears that the blobserver issues an fsync() call
after each object is received from the network, as well as after appending
each blob to a pack file. This might make sense if we were using the
blobserver as, say, a back-end store for a mail server, where you want to
make sure that you won't lose an object, even after a power failure, before
you send that SMTP 250 reply. But if you're doing a backup of millions and
millions of objects, those fsyncs are going to be expensive. And if you do
crash, well, we can just restart the backup, so making sure that the bytes
are solidly on iron oxide one object at a time seems a bit wasteful.
(Especially since if you crash before the permanode is written, the client
is going to have to restart the whole backup from scratch anyway.)

It would be fairly easy to add a config parameter that turns off fsyncs
entirely, or issues them only after the server has gone idle or every N
seconds, whichever comes first, but it occurs to me that this might not be
the best way to do things. Would it make more sense if there were some way
for the client to tell the server, "everything coming down this HTTP/2 link
doesn't need to be treated as 'precious'"? That way backups might be treated
one way, but other Camlistore clients that need more careful treatment of
their data could still get it.

I'm sure I could do the first without a huge amount of difficulty[1], but
if the second is more likely to be accepted when I submit a code
contribution, some advice about how to design such a protocol enhancement,
and how to code it up, would be greatly appreciated.

[1]  Even if it would be "Ted's second non-trivial Go code patch".  :-)

Cheers,

-- Ted
