Rather than explicitly tracking generations, I'd just do consistent hashing, where each disk has a weight proportional to its size (by default) or set by explicit configuration.
On read, try the blob's ideal location. If you miss, search N shards in parallel looking for it. Optionally enqueue a migration task. On any layout change, have an optional background worker doing migrations to ideal locations.

On Sat, Jan 13, 2018 at 1:18 PM, Stephen Searles <stephen.sear...@gmail.com> wrote:
> I saw some activity on my ancient sharding merge request. I've been
> dusting that off and am actually trying to tackle the problem I've been
> stalled on all this time, which I'll describe here.
>
> The naive sharding approach in the current MR assumes the same shards will
> always exist. This sounds too naive to be practical. When you get that
> brand new hard drive, possibly because another one bit the dust, you want a
> permanent data repository to be able to handle replicating and migrating
> data. So at the very least, I'm working on a generational configuration
> that allows a single blob to map onto multiple shards, and to allow the
> shards available to change over time. The default selector will still
> map blobs to shards deterministically based on their hash and size.
>
> That said, my true goal is to sort-of automatically manage these storage
> generations. Basically, I want to be able to throw disks into the
> configuration and have data automatically shuffled to take advantage of the
> new disk. I also want to configure a redundancy threshold and be alerted if
> my data grows too large or if usable disk space grows too small to provide
> the level of redundancy requested. Additional gravy: to monitor the
> performance and S.M.A.R.T. data from the drives to make inferences about
> disk health and warn ahead of time when redundancy may fall. Even further
> gravy: use that performance data to specifically shuffle data onto disks
> that are fast at reading and have open space on disks that are fast at
> writing.
>
> Anyway, I'm not sure what of this (if any) I'll actually finish by the
> next release cycle, but I realize I'm quiet here and wanted to chime in
> with what I've been tinkering with.
>
> Thanks for the thread! Good idea!
>
> On Friday, January 5, 2018 at 8:25:30 PM UTC-5, Eric Drechsel wrote:
>>
>> Many open source projects have a tradition to kick off a new dev cycle
>> with a thread where contributors list tasks/issues they are planning to
>> work on in the coming months.
>>
>> It seems like PerKeep has been building some momentum over the Winter,
>> and I'm really curious to hear what people have on their personal roadmaps,
>> both core committers and new/sometime contribs.
>>
>> I'll kick it off :D While I haven't been very involved in the past year,
>> I am interested to make more contributions. Here are the things I'm keen to
>> work on:
>>
>> - Update this old change list exposing permanode IDs as a FUSE xattr
>>   [1].
>> - Have build infrastructure produce Synology packages, or figure out
>>   best practices for deploying to Synology using Docker etc [2].
>> - Write a client side example app/guide using the GopherJS data
>>   access layer but without using GopherJS for the app code.
>>   - If this isn't currently possible I would like to work on making
>>     sure the GopherJS blob exports a nice data access API to JS.
>> - Write additional content importers:
>>   - Git repository fetcher
>>   - Website crawler
>> - Work on maintenance/new features in the Web UI
>>
>> [1] https://camlistore-review.googlesource.com/c/camlistore/+/2869
>> [2] https://github.com/camlistore/camlistore/issues/986
>>
>> Of course I'd be happy if anyone else wants to work on any of these
>> features too :D
>>
>> Also, are there any PerKeep meetups/hack sessions planned? LinuxFestNW
>> CFP is open till the end of the month... :)
>>
>> --
>> best, Eric
>> eric.pdxhub.org <http://pdxhub.org/people/eric>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Camlistore" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to camlistore+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.