Thanks, Eric. That link and Jay's link beyond it are both fascinating!

I've briefly looked at Tahoe in the past, and your comment makes me wonder 
if building a blob server on top of a Tahoe grid might be another way to 
approach this. It does have fault tolerance and a manual repair feature 
(although no auto-correction either, so you'd have to monitor and maintain 
the Tahoe grid yourself).

My basic concern is that I'm running Perkeep on filesystems/disks that do 
not guarantee to return the same bytes I wrote and, worse, may silently 
return different bytes (HFS+/APFS).

I believe S3 and B2 do guarantee correctness (using some sort of erasure 
coding), but I'd like my primary storage to be local disks for various 
reasons.

Regarding the auto-correcting replica idea: it was my understanding that it 
wouldn't be possible to write this as a composable blob server since there 
is no way to delete or replace existing blobs on underlying replicas. 
However, looking through some of the blob server implementations, it seems 
there may be a way to remove blobs?
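
To make that concrete, here is a minimal sketch of what a self-healing read across replicas might look like. It is plain Python with dicts standing in for replicas, not Perkeep's actual Go blob server API; `blob_ref` and `fetch_verified` are names made up for illustration, and the `sha224-` prefix is only an assumed blobref format.

```python
import hashlib


def blob_ref(data: bytes) -> str:
    """Content address in the style 'sha224-<hex>' (hash choice is an assumption)."""
    return "sha224-" + hashlib.sha224(data).hexdigest()


def fetch_verified(ref: str, replicas: list[dict]) -> bytes:
    """Return the first copy whose hash matches ref, then heal the other replicas."""
    good = None
    for store in replicas:
        data = store.get(ref)
        if data is not None and blob_ref(data) == ref:
            good = data
            break
    if good is None:
        raise LookupError(f"no intact copy of {ref} on any replica")
    for store in replicas:
        # Overwrite missing or corrupt copies. This is the step that needs
        # some remove/replace operation on the underlying replica.
        if store.get(ref) != good:
            store[ref] = good
    return good
```

The overwrite in the second loop is exactly the part that depends on the underlying replicas supporting removal or replacement of an existing blob.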

Another possibility: perhaps the (planned) garbage collector could 
periodically delete corrupt blobs that don't match their hash (is there any 
valid reason to keep them?), which would then permit future syncs to 
replace them with correct copies.
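
A rough sketch of that scrub-then-sync idea, in Python with dicts standing in for blob stores. Nothing here is Perkeep's real garbage collector or sync code; `matches_ref`, `scrub` and `sync` are hypothetical names, and the `<algo>-<hex>` blobref layout is assumed.

```python
import hashlib


def matches_ref(ref: str, data: bytes) -> bool:
    """True if data hashes to the digest embedded in an '<algo>-<hex>' ref."""
    algo, _, hexdigest = ref.partition("-")
    return hashlib.new(algo, data).hexdigest() == hexdigest


def scrub(store: dict) -> list[str]:
    """Delete every blob whose bytes no longer match its ref; return the refs removed."""
    corrupt = [ref for ref, data in store.items() if not matches_ref(ref, data)]
    for ref in corrupt:
        del store[ref]
    return corrupt


def sync(src: dict, dst: dict) -> None:
    """Copy blobs that dst lacks, skipping any source copy that fails verification."""
    for ref, data in src.items():
        if ref not in dst and matches_ref(ref, data):
            dst[ref] = data
```

Once the corrupt blob is gone, the destination simply looks like it is missing that blob, so an ordinary sync from a healthy replica fills the hole.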



On Sunday, May 26, 2019 at 10:51:37 PM UTC+1, eric wrote:
>
> Perkeep doesn't have any erasure-coding (Tahoe-LAFS does, see this 
> rant by the project lead: 
> https://tahoe-lafs.org/pipermail/tahoe-dev/2012-March/007185.html).
>  
>
>
> I think the fault-tolerance story is to keep multiple full replicas 
> and rely on the intrinsic error detection. It would be interesting to 
> make replacing corrupt blobs from another replica automatic, I don't 
> think that's implemented but I may be wrong. 
>
> On Sat, May 25, 2019 at 2:11 AM Dan Cutting <[email protected]> wrote:
> > 
> > Reading some more, I discovered the old Camlistore group and found this
> > thread about integrity checking:
> >
> > https://groups.google.com/forum/#!searchin/camlistore/graph|sort:date/camlistore/_KiSv7x1Eh0/w_lik623AwAJ
> >
> > So it's good to know that reindexing will verify blobs have not been
> > corrupted. But if this is done infrequently it still seems possible to lose
> > data over time due to corruption.
> >
> > Some sort of auto-healing mechanism seems like a good feature for
> > storage that is intended to last 100 years. Is anybody aware of any work
> > that's been done on this for Perkeep?
> > 
> > Thanks! 
> > 
> > 
> > 
> > On Saturday, May 18, 2019 at 2:53:19 PM UTC+1, Dan Cutting wrote: 
> >> 
> >> Hi, does Perkeep offer any sort of error detection or correction? 
> >> 
> >> Data sitting on a disk will "rot" over time (bad disks, bad drivers,
> >> cosmic rays), and it would be good to automatically correct it.
> >>
> >> I'm thinking something along the lines of a Reed-Solomon-style blob
> >> server.
> >>
> >> A simpler possibility might be to rely on the fact that a blob is
> >> content-addressed and check that the data read hashes back to the content
> >> address before returning it to the client. Combining this with multiple
> >> replicas would let the blob server replace bad blobs with good ones when
> >> detected.
> >>
> >> Or does Perkeep already have a story around this kind of thing? I
> >> haven't been able to find it.
> >> 
> >> Thanks! 
> >> Dan 
> > 
> > -- 
> > You received this message because you are subscribed to the Google
> > Groups "Perkeep" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to [email protected].
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/perkeep/0ab8636f-5e20-454f-86de-160184ba7b14%40googlegroups.com.
> > For more options, visit https://groups.google.com/d/optout.
>
>
>
> -- 
> best, Eric 
> eric.pdxhub.org 
>

