Re: Huge cache safe to delete?

2018-08-21 Thread Stephen Searles
Yup, there is a blobserver like you describe here:
https://github.com/perkeep/perkeep/blob/master/pkg/blobserver/memory/mem.go
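
For anyone skimming: the idea in that file is a blobserver that tracks the
total size of its stored blobs and evicts the least-recently-used ones once
a configured cap is exceeded. A minimal standalone sketch of that eviction
pattern in Go (the names and the cap here are mine, not the actual mem.go
API):

package main

import (
	"container/list"
	"fmt"
	"sync"
)

// entry is one cached blob: its ref and its bytes.
type entry struct {
	ref  string
	data []byte
}

// lruBlobCache is a hypothetical size-capped blob store: once the total
// size of stored blobs exceeds maxBytes, the least-recently-used blobs
// are evicted. It illustrates the eviction idea, not the real mem.go API.
type lruBlobCache struct {
	mu       sync.Mutex
	maxBytes int64
	curBytes int64
	order    *list.List               // front = most recently used
	blobs    map[string]*list.Element // blob ref -> element in order
}

func newLRUBlobCache(maxBytes int64) *lruBlobCache {
	return &lruBlobCache{
		maxBytes: maxBytes,
		order:    list.New(),
		blobs:    make(map[string]*list.Element),
	}
}

// Put stores a blob, then evicts from the cold end of the list until the
// cache is back under its size cap.
func (c *lruBlobCache) Put(ref string, data []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.blobs[ref]; ok {
		c.order.MoveToFront(el) // already present; just mark it hot
		return
	}
	c.blobs[ref] = c.order.PushFront(&entry{ref, data})
	c.curBytes += int64(len(data))
	for c.curBytes > c.maxBytes && c.order.Len() > 0 {
		oldest := c.order.Back()
		e := oldest.Value.(*entry)
		c.order.Remove(oldest)
		delete(c.blobs, e.ref)
		c.curBytes -= int64(len(e.data))
	}
}

// Get fetches a blob and marks it recently used.
func (c *lruBlobCache) Get(ref string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.blobs[ref]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

func main() {
	c := newLRUBlobCache(1 << 20) // 1 MiB cap, arbitrary for the demo
	c.Put("sha224-aaaa", make([]byte, 800<<10))
	c.Put("sha224-bbbb", make([]byte, 800<<10)) // pushes total over the cap
	_, ok := c.Get("sha224-aaaa")
	fmt.Println("first blob still cached:", ok) // false: it was evicted
}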

On Tue, Aug 21, 2018 at 12:21 AM Viktor Ogeman wrote:

> Thanks
>
> Sent from my phone
>
> > On 21 Aug 2018, at 01:29, Mathieu Lonjaret wrote:
> >
> > hmm, I thought we had an implementation of a blobserver that deletes
> > older blobs when it reaches a certain size, but nope.
> > I'm almost sure I've written something somewhere in Perkeep that does
> > something like that though, but I can't find it right now.
> >
> >
> >> On 20 August 2018 at 23:27, Brad Fitzpatrick wrote:
> >> You can nuke the cache directory. It's just a cache.
> >>
> >> We should probably have some automatic size management for it. File a
> >> bug? https://github.com/perkeep/perkeep/issues
> >>
> >> (Btw, we use the perkeep@ mailing list now, not camlistore@)
> >>
> >>> On Mon, Aug 20, 2018 at 1:40 PM Viktor Ogeman wrote:
> >>>
> >>> Hi again,
> >>>
> >>> I have a perkeepd server running, storing about 200 GB of data using
> >>> blobpacked. The data is mostly JPEG images. The main blob storage is on a
> >>> spinning disk ("packed"), but I keep all the indexes, loose blobs, and
> >>> cache on a smaller SSD.
> >>>
> >>> I notice, however, that the cache directory becomes prohibitively large
> >>> (64 GB of cached blobs for 200 GB of "real data"). Is this expected? If
> >>> so, why? Are very high quality thumbnails being stored for all the
> >>> images, or is some other data being cached as well?
> >>>
> >>> Finally (and most importantly): is it safe for me to nuke the cache
> >>> directory while the server is running? Or is there some other
> >>> recommended way to reduce the cache pressure?
> >>>
> >>> Regards
> >>>


Re: Per-contributor focuses for next major release, hack sessions

2018-01-13 Thread Stephen Searles
I saw some activity on my ancient sharding merge request. I've been dusting 
that off and am actually trying to tackle the problem I've been stalled on 
all this time, which I'll describe here.

The naive sharding approach in the current MR assumes the same shards will 
always exist. This sounds too naive to be practical. When you get that 
brand new hard drive, possibly because another one bit the dust, you want a 
permanent data repository to be able to handle replicating and migrating 
data. So at the very least, I'm working on a generational configuration 
that allows a single blob to map onto multiple shards and lets the set of 
available shards change over time. The default selector will still 
map blobs to shards deterministically based on their hash and size.
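
To make that concrete, here is a rough standalone sketch in Go of what a
generational, deterministic selector could look like. All the names are
invented, and for simplicity it hashes only the blob ref rather than ref
plus size:

package main

import (
	"fmt"
	"hash/fnv"
)

// generation is a hypothetical snapshot of the shard layout at some point
// in time. A blob written under a generation lives on the shard its ref
// hashes to.
type generation struct {
	shards []string // e.g. mount points or storage prefixes
}

// shardFor deterministically maps a blob ref to one shard: the same ref
// always picks the same shard as long as the shard list is unchanged.
func (g generation) shardFor(blobRef string) string {
	h := fnv.New64a()
	h.Write([]byte(blobRef))
	return g.shards[h.Sum64()%uint64(len(g.shards))]
}

func main() {
	// Two generations: an old two-disk layout and a newer three-disk one.
	// A reader would try the newest generation first and fall back to the
	// older ones; a migrator copies blobs forward so old generations can
	// eventually be retired.
	gens := []generation{
		{shards: []string{"/disks/a", "/disks/b", "/disks/c"}}, // newest
		{shards: []string{"/disks/a", "/disks/b"}},             // oldest
	}
	ref := "sha224-deadbeef"
	for i, g := range gens {
		fmt.Printf("generation %d: %s -> %s\n", i, ref, g.shardFor(ref))
	}
}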

That said, my true goal is to sort-of automatically manage these storage 
generations. Basically, I want to be able to throw disks into the 
configuration and have data automatically shuffled to take advantage of the 
new disk. I also want to configure a redundancy threshold and be alerted if 
my data grows too large or if usable disk space grows too small to provide 
the level of redundancy requested. Additional gravy: monitor the 
performance and S.M.A.R.T. data from the drives to make inferences about 
disk health and warn ahead of time when redundancy may drop. Even further 
gravy: use that performance data to shuffle data specifically onto disks 
that are fast at reading and to keep open space on disks that are fast at 
writing.
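
The redundancy-threshold alert, at least, is easy to state concretely: with
N copies requested, the data counts N times against the pool of free space,
so you can warn as soon as the replicated data would no longer fit. A
back-of-the-envelope sketch (hypothetical names, aggregate capacity only):

package main

import "fmt"

// redundancyOK reports whether the disks' free space can hold dataBytes
// replicated "copies" times over. This is only the aggregate-capacity
// version of the alert; real placement would also require the copies to
// land on distinct disks.
func redundancyOK(diskFreeBytes []int64, dataBytes int64, copies int) bool {
	var free int64
	for _, b := range diskFreeBytes {
		free += b
	}
	return free >= dataBytes*int64(copies)
}

func main() {
	disks := []int64{500 << 30, 250 << 30} // 500 GB + 250 GB free
	data := int64(200 << 30)               // 200 GB of blobs
	fmt.Println("room for 3 copies:", redundancyOK(disks, data, 3)) // true
	fmt.Println("room for 4 copies:", redundancyOK(disks, data, 4)) // false
}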

Anyway, I'm not sure how much of this (if any) I'll actually finish by the 
next release cycle, but I realize I've been quiet here and wanted to chime 
in with what I've been tinkering with.

Thanks for the thread! Good idea!

On Friday, January 5, 2018 at 8:25:30 PM UTC-5, Eric Drechsel wrote:
>
> Many open source projects have a tradition of kicking off a new dev cycle 
> with a thread where contributors list tasks/issues they are planning to 
> work on in the coming months. 
>
> It seems like Perkeep has been building some momentum over the winter, and 
> I'm really curious to hear what people have on their personal roadmaps, 
> both core committers and new/sometime contributors.
>
> I'll kick it off :D While I haven't been very involved in the past year, I 
> am interested in making more contributions. Here are the things I'm keen to 
> work on:
>
>- Update this old change list exposing permanode IDs as a FUSE xattr 
>[1].
>- Have build infrastructure produce Synology packages, or figure out 
>best practices for deploying to Synology using Docker, etc. [2].
>- Write a client-side example app/guide using the GopherJS data access 
>layer but without using GopherJS for the app code.
>   - If this isn't currently possible, I would like to work on making 
>   sure the GopherJS blob exports a nice data access API to JS.
>- Write additional content importers:
>   - Git repository fetcher
>   - Website crawler
>- Work on maintenance/new features in the Web UI
>
>  [1] https://camlistore-review.googlesource.com/c/camlistore/+/2869
>  [2] https://github.com/camlistore/camlistore/issues/986
>
> Of course I'd be happy if anyone else wants to work on any of these 
> features too :D
>
> Also, are there any Perkeep meetups/hack sessions planned? LinuxFestNW CFP 
> is open till the end of the month... :)
>
> -- 
> best, Eric
> eric.pdxhub.org 
>
>
