Having thought about this more, I wouldn't do it based on last accessed time, at least not directly. Taking the example of moving infrequently used blobs to cold storage, I would use a property on the node, e.g. "archiveState=toArchive", so the property is clearly tied to that purpose. This can be entirely under the control of a user, who can designate "all blobs under this folder can be archived" simply by setting the property on all of those nodes. Alternatively, if automatic archival is enabled and configured, a background process that understands the archival logic can walk the tree, e.g. once a week, and mark any nodes that should be archived simply by changing their archiveState.
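
To make the marking idea concrete, here is a rough sketch in plain Java. It deliberately does not use the Oak or JCR API: SketchNode, ArchiveMarker, the lastAccessedMillis field and the "archived" state name are all made up for illustration, and only "archiveState" and "toArchive" come from the proposal above. The policy passed to markWhere stands in for whatever the automatic archival logic would be.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for a repository node; NOT the Oak or JCR API.
final class SketchNode {
    final String path;
    final Map<String, String> properties = new HashMap<>();
    final List<SketchNode> children = new ArrayList<>();
    // Assumed bookkeeping value, used only by an automatic policy.
    final long lastAccessedMillis;

    SketchNode(String path, long lastAccessedMillis) {
        this.path = path;
        this.lastAccessedMillis = lastAccessedMillis;
    }

    SketchNode addChild(String name, long lastAccessedMillis) {
        String childPath = path.equals("/") ? "/" + name : path + "/" + name;
        SketchNode child = new SketchNode(childPath, lastAccessedMillis);
        children.add(child);
        return child;
    }
}

final class ArchiveMarker {
    static final String ARCHIVE_STATE = "archiveState";
    static final String TO_ARCHIVE = "toArchive"; // value named in the proposal
    static final String ARCHIVED = "archived";    // assumed name for "already moved"

    private ArchiveMarker() {}

    // Manual route: a user designates everything under a folder for archival.
    static int markSubtree(SketchNode root) {
        return markWhere(root, node -> true);
    }

    // Background route: mark only the nodes the configured policy selects.
    // Nodes that already carry an archiveState are left untouched, so an
    // "archived" node is never reset to "toArchive".
    static int markWhere(SketchNode root, Predicate<SketchNode> policy) {
        int marked = 0;
        if (!root.properties.containsKey(ARCHIVE_STATE) && policy.test(root)) {
            root.properties.put(ARCHIVE_STATE, TO_ARCHIVE);
            marked++;
        }
        for (SketchNode child : root.children) {
            marked += markWhere(child, policy);
        }
        return marked;
    }

    // Query: collect the paths of all nodes currently in the given state.
    static List<String> pathsInState(SketchNode root, String state) {
        List<String> result = new ArrayList<>();
        collect(root, state, result);
        return result;
    }

    private static void collect(SketchNode node, String state, List<String> out) {
        if (state.equals(node.properties.get(ARCHIVE_STATE))) {
            out.add(node.path);
        }
        for (SketchNode child : node.children) {
            collect(child, state, out);
        }
    }
}
```

A weekly job would call something like ArchiveMarker.markWhere(root, policy), while a user action on a folder would call ArchiveMarker.markSubtree(folder). Either way the decision ends up recorded in the same property, so whatever actually moves the blobs only has to look for "toArchive".
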

Supporting more than two archiveStates lets a query differentiate between nodes that are designated for archival but not archived yet and nodes that have actually been moved to cold storage. This can be useful, for example, if a GUI browsing the repo wants to mark archived nodes with some sort of decorator, so users know not to try to open them unless they intend to unarchive them. I think using a property dedicated to this purpose gives us more direct control over how it is used.

On Fri, Jun 30, 2017 at 6:46 AM, Thomas Mueller <muel...@adobe.com.invalid> wrote:

> > From my perspective as an Oak user I would like to have control on that.
> > It would be nice for Oak to make *suggestions* about moving things to
> > cold storage, but there might be application constraints that need to
> > be accounted for.
>
> That sounds reasonable. What would be the "API" for this? Let's say the
> API is: configure a path that _allows_ binaries to be migrated to cold
> storage. It's not allowed for all other paths. The default configuration
> could be: allow for /jcr:system/jcr:versionStorage, don't allow anywhere
> else. This could be implemented using automatic moving (as I have
> described), _plus_ a background job that, twice a month, traverses all
> nodes and reads the first few bytes of all nodes that are _not_ in
> /jcr:system/jcr:versionStorage. The traversal could additionally do some
> reporting, for example how many binaries there were, how many times were
> they read, how much money could you save if configured like this.
>
> For automatic moving, behaviour could be:
>
> - To move to cold storage: configuration would be needed: size, access
> frequency, recency (e.g. only move binaries larger than 1 MB that were not
> accessed for one month, and that were accessed only once in the month before
> that).
>
> - When trying to access a binary that is in cold storage: you get an
> exception saying the binary is in cold storage. Plus, if configured, the
> binary would automatically be read from cold storage, so it's available
> within x minutes (configurable) when re-read.
>
> - Bulk copy from cold storage to regular storage: This might be needed to
> create a full backup. We might need an API for this.
>
> Regards,
> Thomas