Well, Elasticsearch is built around the exact opposite requirement: having the latest data available as soon as possible. Exposing the Lucene commit points seems impractical to me, especially given the merge policies ES manages.
What I would do is introduce a new document that aggregates those metrics, and have a job that updates this document every now and then. You would use Elasticsearch both as a document store (for the metrics documents) and as the data-crunching piece of software. That metrics doc is your snapshot of the data: you just pull it and display it, and you get caching all the way. Unless we are talking about huge volumes of metrics, this would be my route. This is common practice in event-sourcing scenarios, BTW.

--
Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Author of RavenDB in Action <http://manning.com/synhershko/>

On Tue, Apr 8, 2014 at 3:25 PM, David Causse <[email protected]> wrote:

> On Tuesday, April 8, 2014 at 12:20:31 UTC+2, Itamar Syn-Hershko wrote:
>
>> What do you mean by "stable"? And why would you want to refresh your
>> reader only once a day?
>
> By "stable" I mean that the same query must always return the same results.
> I want to refresh the reader only once a day/hour because (for example)
> some metrics are computed every day/hour, and the user can click on a metric
> to see which docs are behind it. As data can be updated afterwards, the
> metrics will become inconsistent with the NRT reader but will remain
> consistent with an unrefreshed reader.
>
>> It sounds like what you are looking for is some sort of a snapshotting
>> mechanism? If so, maybe try to model your data so that you have a document /
>> type that holds the data in its stable form, and update it periodically
>> based on your business logic?
>
> Snapshotting is exactly what I'm looking for. Modeling my query and/or
> data to simulate a snapshot mechanism can be quite complex compared to the
> Lucene IndexCommit point-in-time feature.
>
>> Elasticsearch doesn't support what you describe going all the way to a
>> specific commit, but the scan/scroll search type is pretty much what you
>> describe:
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html#scan
>
> Yes, scroll is the closest ES feature I found.
>
>> I think having this implemented on the Lucene commit level is going to be
>> tricky if not impossible due to the distributed nature of ES (every shard
>> on every node is practically a different Lucene index)
>
> I was afraid of that...
>
> So, is a simple naive process like this impractical?
>
> 1/ API to create a commit point: send a broadcast commit message to all
> nodes for one ES index.
> 2/ Use IndexWriter.commit(Map<String, String> commitInfo) to store
> ES-specific data (like a cluster-wide commit point ID generated by ES).
> 3/ Add a param to the query API to specify which commit point to use.
> 4/ Add some API to list/delete unused commit points.
>
> Points 2, 3 and 4 look OK to me; the tricky part seems to be point 1.
>
> Thank you.
>
> On Tue, Apr 8, 2014 at 12:45 PM, David Causse <[email protected]> wrote:
>
>> Hi,
>>
>> I'm evaluating ES features by reading the docs. Here is the one use case
>> I was not able to find in the documentation.
>>
>> I want to query an index from 2 different applications.
>>
>> One application needs an NRT view of the index. The other needs a more
>> stable view of the data (refreshed every day or hour, depending on
>> application needs).
>>
>> With raw Lucene it's quite easy to implement such a feature:
>>
>> - Keep one IndexReader open for the stable view + NRT: the drawback is
>>   that I lose my IndexReader if the application restarts.
>> - Use IndexCommit and IndexDeletionPolicy for the stable IndexReader;
>>   this survives an app restart.
>>
>> Does ES support these Lucene features: keep a commit point, open a
>> reader on that particular commit (and delete the index commit when it's
>> no longer needed)?
>>
>> As the base feature is part of the Lucene API, would it be hard to
>> implement such a feature in ES? (I suspect the scroll API already keeps an
>> open IndexReader under the hood; isn't it possible to generalize it to the
>> query API?)
>>
>> Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZuJAJRn-uSUbha6MEi1XMBRM5TP931pxdTgXeZQJxt%2BWg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
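
For readers who want something concrete, here is a minimal sketch of the snapshot-document idea suggested in the reply at the top of this message. It is only an illustration under stated assumptions: plain Python lists and dicts stand in for the Elasticsearch index, and the `build_snapshot` function, the `status` field and the document shape are all hypothetical names invented for this example, not anything from the thread or from the Elasticsearch API. In a real setup a periodic job would run the aggregation against the live index and index the result as a single metrics document.

```python
from datetime import datetime, timezone


def build_snapshot(docs, taken_at):
    """Aggregate live documents into one snapshot document.

    Hypothetical shape: for each value of the (assumed) "status" field,
    store the count and the IDs of the documents behind that count, so a
    user can drill down from a metric to the docs that produced it.
    """
    metrics = {}
    for doc in docs:
        entry = metrics.setdefault(doc["status"], {"count": 0, "doc_ids": []})
        entry["count"] += 1
        entry["doc_ids"].append(doc["id"])
    return {"taken_at": taken_at.isoformat(), "metrics": metrics}


# Stand-in for the live (NRT) index.
live_docs = [
    {"id": "a", "status": "open"},
    {"id": "b", "status": "open"},
    {"id": "c", "status": "closed"},
]

# The periodic job (daily/hourly) would do this and store the result
# as one document; here we just keep it in a variable.
snapshot = build_snapshot(live_docs, datetime(2014, 4, 8, tzinfo=timezone.utc))

# The live data changes afterwards...
live_docs[0]["status"] = "closed"

# ...but the stored snapshot still reports what was true when it was taken.
print(snapshot["metrics"]["open"])
```

Note that this sketch only freezes the counts and the document IDs. If the drill-down view must also show each document as it looked at snapshot time, the job would need to copy the relevant fields into the snapshot as well.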
