Hi,

Thanks all for chiming in. It seems like this feature could be of interest to the user community, so I've opened a ticket to continue maturing the idea there: https://issues.apache.org/jira/browse/IGNITE-1789

We may need to create a Wiki page later to collaborate around specifics and design.

Regards,

*Raúl Kripalani*
PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and Messaging Engineer
http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
http://blog.raulkr.net | twitter: @raulvk

On Wed, Oct 21, 2015 at 10:06 AM, Raul Kripalani <[email protected]> wrote:

> Hey guys,
>
> LevelDB has a feature called Snapshots, which provides a consistent
> read-only view of the DB at a given point in time, against which queries
> can be executed.
>
> To my knowledge, this functionality doesn't exist in the world of open
> source In-Memory Computing. Ignite could be an innovator here.
>
> Ignite Snapshots would allow queries, distributed closures, map-reduce
> jobs, etc. It could be useful for Spark RDDs to avoid data shift while the
> computation is taking place (not sure if there's already some form of
> snapshotting, though). Same for IGFS.
>
> Example usage:
>
> IgniteCacheSnapshot snapshot =
>     ignite.cache("mycache").snapshots().create();
>
> // all three queries are executed against a view of the cache at the
> // point in time where it was snapshotted
> snapshot.query("select ...");
> snapshot.query("select ...");
> snapshot.query("select ...");
>
> In fact, it would be awesome to be able to logically save this snapshot
> with a name so that later jobs, queries, etc. can run on top of it, e.g.:
>
> IgniteCacheSnapshot snapshot =
>     ignite.cache("mycache").snapshots().create("abc");
>
> // ...
> // in another module of a distributed system, or in another thread in
> // parallel, use the saved snapshot
> IgniteCacheSnapshot snapshot =
>     ignite.cache("mycache").snapshots().get("abc");
> ....
>
> Named snapshotting can be dangerous due to data retention, e.g. imagine
> keeping a snapshot for 2 weeks! So we should force the user to specify a
> TTL:
>
> IgniteCacheSnapshot snapshot =
>     ignite.cache("mycache").snapshots().create("abc", 2, TimeUnit.HOURS);
>
> Such functionality would allow for "reporting checkpoints" and "time
> travel", for example, where you want users to be able to query the data as
> it stood 1 hour ago, 2 hours ago, etc.
>
> What do you think?
>
> P.S.: We do have some form of snapshotting in the Compute checkpointing
> functionality – but my proposal is to generalise the notion.
>
> Regards,
>
> *Raúl Kripalani*
> PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
> Messaging Engineer
> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> http://blog.raulkr.net | twitter: @raulvk
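To make the proposed semantics easier to discuss, here is a rough single-JVM sketch in plain Java of named snapshots with a mandatory TTL. It is not Ignite code and says nothing about how a distributed implementation would work: `SnapshotStore`, `createSnapshot` and `getSnapshot` are hypothetical names made up for illustration, and the snapshot is a naive full copy of the data rather than anything memory-efficient (a real design would presumably need some form of versioning).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a cache with named, TTL-bound snapshots (not an Ignite API). */
class SnapshotStore<K, V> {

    /** A read-only copy of the data plus the information needed to expire it. */
    private static final class Held<K, V> {
        final Map<K, V> view;
        final long createdAtNanos;
        final long ttlNanos;

        Held(Map<K, V> view, long createdAtNanos, long ttlNanos) {
            this.view = view;
            this.createdAtNanos = createdAtNanos;
            this.ttlNanos = ttlNanos;
        }

        boolean expired() {
            // Subtraction form stays correct even if nanoTime() wraps around.
            return System.nanoTime() - createdAtNanos >= ttlNanos;
        }
    }

    private final Map<K, V> live = new HashMap<>();
    private final Map<String, Held<K, V>> snapshots = new HashMap<>();

    // All methods are synchronized so a snapshot never observes a half-applied write.
    synchronized void put(K key, V value) { live.put(key, value); }

    synchronized V get(K key) { return live.get(key); }

    /** Copies the live data under a name. The TTL is mandatory and must be positive. */
    synchronized Map<K, V> createSnapshot(String name, long ttl, TimeUnit unit) {
        if (ttl <= 0) {
            throw new IllegalArgumentException("ttl must be positive");
        }
        Map<K, V> view = Collections.unmodifiableMap(new HashMap<>(live));
        snapshots.put(name, new Held<>(view, System.nanoTime(), unit.toNanos(ttl)));
        return view;
    }

    /** Returns the named snapshot, or null if it is unknown or its TTL has elapsed. */
    synchronized Map<K, V> getSnapshot(String name) {
        Held<K, V> held = snapshots.get(name);
        if (held == null) {
            return null;
        }
        if (held.expired()) {
            snapshots.remove(name); // lazy eviction: release the retained data
            return null;
        }
        return held.view;
    }

    public static void main(String[] args) {
        SnapshotStore<String, Integer> store = new SnapshotStore<>();
        store.put("a", 1);
        Map<String, Integer> snap = store.createSnapshot("abc", 2, TimeUnit.HOURS);
        store.put("a", 2); // later write, not visible through the snapshot
        System.out.println(snap.get("a"));  // prints 1
        System.out.println(store.get("a")); // prints 2
    }
}
```

Expired snapshots are only evicted lazily on lookup here. Whether eviction should be eager (e.g. a background task) is one of the design points that could go on the Wiki page.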
