Hi,

I have a use case where I specifically do not want to store my data files. They are scientific datasets, usually model output, which might total hundreds of TB or more. What I would like to do is store the metadata from the datasets, their location(s), and some transactional information (whether they have been moved, deleted, etc.). It is essential that the original data can be deleted: in some cases it is no longer required, and in others it might be backed up to a slow-to-access, tape-based data silo. Each user would have their own Perkeep store, but the ability to coalesce/share the information in those stores would be almost essential.

Does this sound like a use case for Perkeep? I like the idea of keeping ALL my metadata (it is pretty small) and using it to find files with particular characteristics, retrieve them from a backup storage location, that sort of thing.

If Perkeep is a good match, can anyone suggest what "mappings" I might need to think about in Perkeep terms? For example:

- Would a permanode be required for each unique file instance? I would be storing an identifying hash of some sort to ensure it was unique.
- If a file is modified such that the hash changes, I would need a new permanode, but would like to keep a relationship between the two files.
- Most data files are netCDF, so I would like to dump the metadata as a JSON blob and associate that data with a permanode. I guess Perkeep has existing methods for dealing with JSON? And for indexing/searching it?

Thanks very much for any help,

Cheers,
Aidan

--
You received this message because you are subscribed to the Google Groups "Perkeep" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
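To make the mapping I have in mind more concrete, here is a rough sketch in plain Python (not the Perkeep API): one record per unique file instance, keyed by a content hash, with the netCDF attributes dumped as JSON. The field names (`contentHash`, `locations`, `supersedes`) are my own invention, just to show the shape of the data, not anything Perkeep actually defines.

```python
import hashlib
import json
import tempfile

def file_sha256(path):
    """Identifying hash for a file instance (one permanode per unique hash)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def metadata_record(path, locations, netcdf_attrs, supersedes=None):
    """Build the JSON blob I would associate with the file's permanode.

    'supersedes' would link a modified file back to the record of the
    version it replaced (a made-up field name, not a Perkeep attribute).
    """
    record = {
        "contentHash": "sha256-" + file_sha256(path),
        "locations": locations,      # e.g. a disk path or a tape-silo identifier
        "netcdf": netcdf_attrs,      # global attributes dumped from the netCDF file
    }
    if supersedes:
        record["supersedes"] = supersedes
    return json.dumps(record, indent=2)

# Demo with a throwaway file standing in for a netCDF dataset.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"model output bytes")
    demo = f.name

print(metadata_record(demo, ["tape://silo1/archive42"], {"title": "ocean model run 7"}))
```

In practice the `netcdf_attrs` dict would come from reading the file's global attributes (e.g. with the netCDF4 or xarray libraries) rather than being written by hand.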
