Hello Jason et al,

Indeed, there are plenty of use cases that need updates to be searchable instantly. One example is the JSR-170 (JCR) compliant Jackrabbit implementation: it relies heavily on Lucene for searching and hierarchy resolving, and according to the JSR-170 spec, changes need to be visible instantly after a save().
Also, I think a solution very similar to yours is implemented there; see [1] if you like.

Regards,
Ard

[1] http://jackrabbit.apache.org/index-readers.html

> I started a wiki page at
> http://wiki.apache.org/lucene-java/OceanRealtimeSearch linked
> from http://wiki.apache.org/lucene-java/LuceneResources.
>
> Perhaps I should add some background on the wiki. I can add
> a little bit here. I was an early Solr developer/user at a
> social networking company when Google's GData came out. It
> looked similar to Solr so I took a look at it. The one thing
> it had over Solr was realtime updates: the ability to add,
> delete, or update a document and see the update in search
> results immediately. With Solr, the company had decided on a
> 10 minute interval for updating the index with delta updates
> from an Oracle database. I wanted to see if it was possible
> with Lucene to create an approximation of what GData does.
> The result is Ocean.
>
> The use case it was designed for is websites with dynamic
> data, such as social networking sites, photo sites,
> discussion boards, blogs, and wikis. More broadly, it is
> possible to use Ocean with any application that requires the
> database-like feature of immediate updates. Probably the best
> example of this is that all of Google's web applications,
> outside of web search, use a GData interface, meaning the
> primary datastore is not MySQL or some equivalent; it is a
> proprietary search-based database. The best example of this
> is Gmail. If I receive an email through Gmail I can search on
> it immediately; there is no 10 minute delay. Also, in Gmail I
> can change labels, a common example being changing unread
> emails to read in bulk. Presumably Gmail is not reindexing
> the entire email for each label change.
>
> Most highly trafficked web applications do not use relational
> facilities like joins because they are too expensive. Lucene
> does not offer joins so this is fine.
> The only area Lucene is currently weak in is range queries.
> MySQL uses a B-tree index, whereas Lucene uses the
> time-consuming TermEnum and TermDocs combination. This is an
> area Tag Index addresses.
>
> The way Ocean is designed, there should be no limitations to
> using it compared to using Lucene IndexWriter. It offers the
> same functionality. If one does not want to use the
> transaction log Ocean offers because one simply wants to
> index 1 million documents at once, Ocean offers what is
> called a LargeBatch. It is a way to perform a large number of
> updates taking advantage of the new IndexWriter speedup,
> combined with transactional semantics.
>
> Karl, does this answer your question or are there areas that
> could use more explanation?
>
>
> On Fri, Jul 11, 2008 at 6:20 AM, Karl Wettin
> <[EMAIL PROTECTED]> wrote:
> >
> > On 10 Jul 2008, at 22:08, Jason Rutherglen wrote:
> >
> > Is there a good place to put Ocean
> > https://issues.apache.org/jira/browse/LUCENE-1313
> > documentation? Is there a place on the wiki that is good?
> >
> > Hi Jason,
> >
> > the wiki is just fine.
> >
> > I've been reading the docs and looked at your patch. There
> > is a lot of text about how it does what it does, but it says
> > nothing about the intended use. I honestly don't even know
> > what you mean by "real time search". You will probably get
> > more attention if the documentation starts out with some use
> > cases or thoughts on when and why it might make sense to use
> > your code.
> >
> >
> > karl
> >
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]