Hi Aidan,

This is an interesting question. I work in computing for high energy physics (HEP), where custom software is used for tracking the data files, their locations, etc. I hadn't considered applying Perkeep in this area (I've only toyed with it as a personal project).

For tracking where the data is, the HEP community has been moving to Rucio (https://rucio.cern.ch/). The metadata, however, is still a bit of a wildcard: Rucio doesn't currently have a native metadata store.

I'm interested in seeing whether Perkeep can be offered as a solution here, but if not I may be able to provide an alternative.

Regards,
Kevin

On Tue, Dec 4, 2018 at 5:01 PM Aidan Heerdegen <[email protected]> wrote:

> Hi,
>
> I have a use case where I specifically do not want to store my data files.
>
> They are scientific datasets, usually model output, which might total
> hundreds of TB or more.
>
> What I would like to do is store the meta-data from the datasets, their
> location(s) and some transactional information, if they are moved, deleted
> etc.
>
> It is essential that the original data can be deleted, in some cases it is
> no longer required, and in others it might be backed up to a slow to access
> tape based data silo.
>
> Each user would have their own perkeep store, but the ability to
> coalesce/share the information in those stores would be almost essential.
>
> Does this sound like a use case for perkeep? I like the idea of keeping
> ALL my metadata (it is pretty small), using it to find files with
> particular characteristics, retrieve them from a backup storage location,
> that sort of thing.
>
> If perkeep is a good match, can anyone suggest what "mappings" I might
> need to think about in perkeep terms? e.g.
>
> A permanode would be required for each unique file instance? I would be
> storing an identifying hash of some sort to ensure it was unique.
>
> If a file is modified such that the hash changes, I would need a new
> permanode, but would like to keep a relationship between the two files.
>
> Most data files are netCDF, so I would like to dump the metadata as a JSON
> blob and associate that data with a permanode. I guess perkeep has existing
> methods for dealing with JSON? And indexing/searching it?
>
> Thanks very much for any help,
>
> Cheers
>
> Aidan
>
> --
> You received this message because you are subscribed to the Google Groups
> "Perkeep" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
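
For illustration only, the record described in the quoted message (an identifying hash, the file's location(s), the dumped header metadata, and a link back to an earlier version when the hash changes) could be sketched in plain Python as below. This is not Perkeep's schema or API; the field names (`hash`, `locations`, `header`, `previous`, `events`) and the helper functions are hypothetical, and the netCDF header is assumed to have already been dumped to a dict by some other tool.

```python
import hashlib
import json
from typing import Optional


def content_hash(data: bytes) -> str:
    """Identify a file instance by the SHA-256 digest of its bytes."""
    return "sha256-" + hashlib.sha256(data).hexdigest()


def make_record(data: bytes, location: str, header: dict,
                previous: Optional[str] = None) -> dict:
    """Build a metadata-only record; the data itself is never stored.

    `previous` is the hash of the version this file was derived from,
    so a modified file keeps a relationship to its predecessor.
    """
    return {
        "hash": content_hash(data),
        "locations": [location],
        "header": header,       # assumed: netCDF header already dumped to a dict
        "previous": previous,
        "events": [],           # transactional history: moved, deleted, ...
    }


def record_event(record: dict, kind: str, detail: str = "") -> None:
    """Append a transaction and update the known locations accordingly."""
    record["events"].append({"kind": kind, "detail": detail})
    if kind == "moved":
        record["locations"] = [detail]
    elif kind == "deleted":
        record["locations"] = []


# Hypothetical usage: two versions of a model output file.
v1 = make_record(b"original bytes", "/scratch/run1/out.nc", {"title": "run1"})
v2 = make_record(b"modified bytes", "/scratch/run1/out.nc", {"title": "run1"},
                 previous=v1["hash"])
record_event(v1, "moved", "tape://silo/run1/out.nc")
serialized = json.dumps(v2, sort_keys=True)  # small JSON blob, safe to keep forever
```

Because each record is a small JSON document keyed by a content hash, records from several users could in principle be merged by hash, and a record survives deletion of the underlying data.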
