On Mon, Dec 12, 2016 at 3:13 PM, Sanne Grinovero <sa...@infinispan.org> wrote:
> In short, what's the ultimate goal? I see two main but different
> options intertwined:
> - allow to synchronize the *final state* of a replica

I'm assuming this case is already in place when using remote listeners and includeCurrentState=true and we are discussing how to improve it, as described in the proposal in the wiki and in the 5th email of this thread.

> - inspect specific changes
>
> For the first case, it would be enough for us to be able to provide a
> "squashed history" (as in Git squash), but we'd need to keep versioned
> snapshots around and someone needs to tell you which ones can be
> garbage collected.
> For example when a key is: written, updated, updated, deleted since
> the snapshot, we'll send only "deleted" as the intermediary states are
> irrelevant.
> For the second case, say the goal is to inspect fluctuations of price
> variations of some item, then the intermediary states are not
> irrelevant.
>
> Which one will we want to solve? Both?

Looking at http://debezium.io/, it implies the second case.

"[...] Start it up, point it at your databases, and your apps can start responding to all of the inserts, updates, and deletes that other apps commit to your databases. [...] your apps can respond quickly and never miss an event, even when things go wrong."

IMO the choice between squashed and full history, and even the retention time, is highly application specific. Deletes might not even be involved; one may be interested in answering "what is the peak value of a certain key during the day?"

> Personally the attempt of solving the second one seems like a huge
> pivot of the project, the current data-structures and storage are not
> designed for this.

+1, as I wrote earlier about ditching the idea of event cache storage in favor of Lucene.

> I see the value of such benefits, but maybe
> Infinispan is not the right tool for such a problem.
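To make the squashed versus full history distinction concrete, here is a minimal sketch in plain Java. Everything in it (the `Change` type, `squash`, `peak`, the "ACME" key and its values) is hypothetical and made up for illustration; none of it is an Infinispan API or part of the proposal. It only shows that squashing keeps the last change per key, while a "peak value" question needs every intermediate value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are Infinispan APIs.
final class SquashedHistorySketch {

    enum Op { WRITTEN, UPDATED, DELETED }

    // One change to one key, in the order it happened since the snapshot.
    static final class Change {
        final String key;
        final Op op;
        final Integer value; // null for DELETED

        Change(String key, Op op, Integer value) {
            this.key = key;
            this.op = op;
            this.value = value;
        }
    }

    // Squashed history: keep only the last change seen for each key.
    static Map<String, Change> squash(List<Change> log) {
        Map<String, Change> latest = new LinkedHashMap<>();
        for (Change c : log) {
            latest.put(c.key, c); // later changes overwrite earlier ones
        }
        return latest;
    }

    // Full history: the peak can only be answered from every intermediate value.
    static Integer peak(List<Change> log, String key) {
        Integer max = null;
        for (Change c : log) {
            if (c.key.equals(key) && c.value != null && (max == null || c.value > max)) {
                max = c.value;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        List<Change> log = new ArrayList<>();
        log.add(new Change("ACME", Op.WRITTEN, 10));
        log.add(new Change("ACME", Op.UPDATED, 42));
        log.add(new Change("ACME", Op.UPDATED, 17));
        log.add(new Change("ACME", Op.DELETED, null));

        System.out.println(squash(log).get("ACME").op); // prints DELETED
        System.out.println(peak(log, "ACME"));          // prints 42
    }
}
```

A client that only receives the squashed view learns that "ACME" was deleted and has no way to recover the 42, which is why the two goals put different requirements on storage and retention.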
>
> I'd prefer to focus on the benefits of the squashed history, and have
> versioned entries soon, but even in that case we need to define which
> versions need to be kept around, and how garbage collection /
> vacuuming is handled.

Is that proposal written/recorded somewhere? It'd be interesting to know how a client interested in data changes would consume those multi-versioned entries (push/pull with offset?, sorted/unsorted?, client tracking/per key/per version?), as it seems there is some storage impedance mismatch as well.

> In short, I'd like to see an agreement that analyzing e.g.
> fluctuations in stock prices would be a non-goal, if these are stored
> as {"stock name", value} key/value pairs. One could still implement
> such a thing by using a more sophisticated model, just don't expect to
> be able to see all intermediary values each entry has ever had since
> the key was first used.

Continuous Queries listen to key/value data using a query. Should they not be expected to see all the intermediary values when changes in the server cause an entry to start/stop matching the query?
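As a rough illustration of the continuous query concern, the sketch below replays the values written to one key and records when the entry starts or stops matching a query. It is a plain-Java stand-in under my own assumptions, not the Infinispan continuous query API: `transitions`, the "ACME" key, the values and the `> 100` predicate are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical sketch: a plain-Java stand-in, not the Infinispan continuous query API.
final class MatchTransitionSketch {

    // Replays the values written to one key and records when it starts or stops matching.
    static List<String> transitions(String key, int[] values, IntPredicate query) {
        List<String> events = new ArrayList<>();
        boolean matching = false;
        for (int v : values) {
            boolean nowMatching = query.test(v);
            if (nowMatching && !matching) {
                events.add("JOIN " + key + "=" + v);
            } else if (!nowMatching && matching) {
                events.add("LEAVE " + key);
            }
            matching = nowMatching;
        }
        return events;
    }

    public static void main(String[] args) {
        IntPredicate above100 = v -> v > 100;
        int[] fullHistory = {90, 120, 95}; // every intermediate value
        int[] squashed = {95};             // only the final state

        System.out.println(transitions("ACME", fullHistory, above100)); // prints [JOIN ACME=120, LEAVE ACME]
        System.out.println(transitions("ACME", squashed, above100));    // prints []
    }
}
```

With only the squashed state, the listener never learns that the entry matched the query for a while, which is the behaviour the question above is asking about.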