Hi Aidan,
This is an interesting question. I'm in computing for high energy physics,
where there is custom software used for tracking the data files, their
locations, etc. I hadn't considered the application of Perkeep in this area
(I've only toyed with it as a personal project). For tracking where the
data is, the HEP community has been moving to Rucio (https://rucio.cern.ch/).
The metadata, however, is still a bit of a wildcard: Rucio doesn't currently
have a native metadata store. I'm interested in seeing if Perkeep can be
offered as a solution here, but if not I may be able to provide an
alternative.
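
To make the question a bit more concrete, here is a rough sketch (Python,
standard library only) of the kind of per-file record you describe: an
identifying content hash, a list of locations, the metadata as JSON, and a
pointer back to the previous version when the hash changes. The field names,
the paths, the choice of SHA-256 and the build_record helper are all my own
assumptions for illustration. Nothing here uses the Perkeep API.

```python
import hashlib
import json


def content_hash(data: bytes) -> str:
    """Return a prefixed hex SHA-256 digest used as the identifying hash."""
    return "sha256-" + hashlib.sha256(data).hexdigest()


def build_record(data, locations, metadata, previous=None):
    """Build a JSON-serialisable record for one unique file instance.

    All field names are hypothetical, chosen only for illustration.
    """
    return {
        "hash": content_hash(data),
        "locations": list(locations),
        "metadata": metadata,
        "previousVersion": previous,  # hash of the file this one replaced, if any
    }


# Stand-in bytes and made-up paths; a real file would be hashed in chunks.
v1 = build_record(b"model output v1", ["/data/run1/out.nc"], {"title": "run1"})
v2 = build_record(
    b"model output v2",
    ["tape://silo/run1/out.nc"],
    {"title": "run1"},
    previous=v1["hash"],
)
print(json.dumps(v2, indent=2, sort_keys=True))
```

The point of keeping the previous hash in the new record is that the
relationship between the two files survives even after the original data
has been deleted.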
Regards,
Kevin

On Tue, Dec 4, 2018 at 5:01 PM Aidan Heerdegen <[email protected]>
wrote:

> Hi,
>
> I have a use case where I specifically do not want to store my data files.
>
> They are scientific datasets, usually model output, which might total
> hundreds of TB or more.
>
> What I would like to do is store the meta-data from the datasets, their
> location(s) and some transactional information, if they are moved, deleted
> etc.
>
> It is essential that the original data can be deleted: in some cases it is
> no longer required, and in others it might be backed up to a slow-to-access,
> tape-based data silo.
>
> Each user would have their own perkeep store, but the ability to
> coalesce/share the information in those stores would be almost essential.
>
> Does this sound like a use case for perkeep? I like the idea of keeping
> ALL my metadata (it is pretty small), using it to find files with
> particular characteristics, retrieve them from a backup storage location,
> that sort of thing.
>
> If perkeep is a good match, can anyone suggest what "mappings" I might
> need to think about in perkeep terms? e.g.
>
> A permanode would be required for each unique file instance? I would be
> storing an identifying hash of some sort to ensure it was unique.
>
> If a file is modified such that the hash changes, I would need a new
> permanode, but would like to keep a relationship between the two files.
>
> Most data files are netCDF, so I would like to dump the metadata as a JSON
> blob and associate that data with a permanode. I guess perkeep has existing
> methods for dealing with JSON? And indexing/searching it?
>
> Thanks very much for any help,
>
> Cheers
>
> Aidan
>
> --
> You received this message because you are subscribed to the Google Groups
> "Perkeep" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
