Rather than explicitly tracking generations, I'd just do consistent hashing,
where each disk has a weight proportional to its size (by default) or set
by explicit configuration.

On read, try the blob's ideal location. If you miss, search N shards in
parallel looking for it, and optionally enqueue a migration task. On any
layout change, an optional background worker can migrate blobs to their
ideal locations.
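
The read path could look roughly like the sketch below. The Shard interface,
MemShard, Migration and Read are all hypothetical names for illustration, not
existing APIs, and for simplicity it searches every other shard on a miss
rather than capping the fan-out at N.

```go
package main

import "sync"

// Shard is a hypothetical minimal storage interface for this sketch.
type Shard interface {
	Fetch(ref string) ([]byte, bool)
}

// MemShard is an in-memory Shard, here only to make the sketch testable.
type MemShard map[string][]byte

func (m MemShard) Fetch(ref string) ([]byte, bool) {
	b, ok := m[ref]
	return b, ok
}

// Migration records a blob found away from its ideal shard.
// From and To are indexes into the shard slice passed to Read.
type Migration struct {
	Ref      string
	From, To int
}

// Read tries shards[ideal] first. On a miss it queries the remaining
// shards in parallel and returns the first hit. If migrate is non-nil,
// a task is enqueued without blocking, so a background worker can later
// move the blob to its ideal shard.
func Read(shards []Shard, ideal int, ref string, migrate chan<- Migration) ([]byte, bool) {
	if b, ok := shards[ideal].Fetch(ref); ok {
		return b, true
	}

	type hit struct {
		data []byte
		from int
	}
	// Buffered so that no goroutine blocks after the first hit is taken.
	hits := make(chan hit, len(shards))
	var wg sync.WaitGroup
	for i, s := range shards {
		if i == ideal {
			continue
		}
		wg.Add(1)
		go func(i int, s Shard) {
			defer wg.Done()
			if b, ok := s.Fetch(ref); ok {
				hits <- hit{b, i}
			}
		}(i, s)
	}
	go func() {
		wg.Wait()
		close(hits)
	}()

	h, ok := <-hits
	if !ok {
		return nil, false // no shard has the blob
	}
	if migrate != nil {
		select {
		case migrate <- Migration{Ref: ref, From: h.from, To: ideal}:
		default: // queue full; a later read or background sweep can retry
		}
	}
	return h.data, true
}
```

The non-blocking enqueue keeps reads from stalling on a full queue; the
background worker doing the sweep after a layout change would be the thing
that guarantees everything eventually lands in its ideal location.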


On Sat, Jan 13, 2018 at 1:18 PM, Stephen Searles <stephen.sear...@gmail.com>
wrote:

> I saw some activity on my ancient sharding merge request. I've been
> dusting that off and am actually trying to tackle the problem I've been
> stalled on all this time, which I'll describe here.
>
> The naive sharding approach in the current MR assumes the same shards will
> always exist. This sounds too naive to be practical. When you get that
> brand new hard drive, possibly because another one bit the dust, you want a
> permanent data repository to be able to handle replicating and migrating
> data. So at the very least, I'm working on a generational configuration
> that allows a single blob to map onto multiple shards, and to allow the
> shards available to change over time. The default selector will still
> map blobs to shards deterministically based on their hash and size.
>
> That said, my true goal is to sort-of automatically manage these storage
> generations. Basically, I want to be able to throw disks into the
> configuration and have data automatically shuffled to take advantage of the
> new disk. I also want to configure a redundancy threshold and be alerted if
> my data grows too large or if usable disk space grows too small to provide
> the level of redundancy requested. Additional gravy: to monitor the
> performance and S.M.A.R.T. data from the drives to make inferences about
> disk health and warn ahead of time when redundancy may fall. Even further
> gravy: use that performance data to specifically shuffle data onto disks
> that are fast at reading and have open space on disks that are fast at
> writing.
>
> Anyway, I'm not sure what of this (if any) I'll actually finish by the
> next release cycle, but I realize I'm quiet here and wanted to chime in
> with what I've been tinkering with.
>
> Thanks for the thread! Good idea!
>
> On Friday, January 5, 2018 at 8:25:30 PM UTC-5, Eric Drechsel wrote:
>>
>> Many open source projects have a tradition to kick off a new dev cycle
>> with a thread where contributors list tasks/issues they are planning to
>> work on in the coming months.
>>
>> It seems like PerKeep has been building some momentum over the Winter,
>> and I'm really curious to hear what people have on their personal roadmaps,
>> both core committers and new/sometime contribs.
>>
>> I'll kick it off :D While I haven't been very involved in the past year,
>> I am interested to make more contributions. Here are the things I'm keen to
>> work on:
>>
>>    - Update this old change list exposing permanode IDs as a FUSE xattr
>>    [1].
>>    - Have build infrastructure produce Synology packages, or figure out
>>    best practices for deploying to Synology using Docker etc [2].
>>    - Write a client side example app/guide using the GopherJS data
>>    access layer but without using GopherJS for the app code.
>>       - If this isn't currently possible I would like to work on making
>>       sure the GopherJS blob exports a nice data access API to JS.
>>    - Write additional content importers:
>>       - Git repository fetcher
>>       - Website crawler
>>    - Work on maintenance/new features in the Web UI
>>
>>  [1] https://camlistore-review.googlesource.com/c/camlistore/+/2869
>>  [2] https://github.com/camlistore/camlistore/issues/986
>>
>> Of course I'd be happy if anyone else wants to work on any of these
>> features too :D
>>
>> Also, are there any PerKeep meetups/hack sessions planned? LinuxFestNW
>> CFP is open till the end of the month... :)
>>
>> --
>> best, Eric
>> eric.pdxhub.org <http://pdxhub.org/people/eric>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Camlistore" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to camlistore+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
