>>using Lucene that don't fit under the core premise of full text search
I've had several use cases over the years that use features peculiar to Lucene, but here's a very simple one I came across today that illustrates its raw index lookup capability.

I needed a fast, scalable and persistent "Set" implementation to maintain a large cold-list (millions of string-based keys). I benchmarked various implementations using a set of ~6 million keys with 10,000 random key lookups. When it comes to RAM use, retrieval times and start-up costs, Lucene stands up very well against equivalent embedded databases for this task:

* Benchmarks for times to initially open the set when stored on disk: http://goo.gl/dJL3g
* Benchmarks for average key lookup time once opened: http://goo.gl/SG79N
* Stats for RAM use after 10,000 lookups: http://goo.gl/MyJDn

I don't doubt all of these implementations could be tweaked (e.g. optimizing the Lucene index, various DB-specific settings), but I tried to use sensible defaults to make the tests fair, e.g. use of prepared statements, indexes, minimal data retrieved. Speeds varied with each run of the random lookup test due to OS-level caching effects, so the best times were recorded in each case.

The HashSet tests are loaded entirely from file (hence the long start-up time) and are not a scalable solution because of RAM costs. MySQL requires an inter-process call as it was not embedded, but even using a remoted Lucene call I get significantly better performance (avg 0.5ms lookup vs MySQL 10ms).

Cheers
Mark

----- Original Message -----
From: Grant Ingersoll <gsing...@apache.org>
To: java-user@lucene.apache.org
Cc:
Sent: Saturday, 22 October 2011, 10:11
Subject: Bet you didn't know Lucene can...

Hi All,

I'm giving a talk at ApacheCon titled "Bet you didn't know Lucene can..." (http://na11.apachecon.com/talks/18396). It's based on my observation that, over the years, a number of us in the community have done some pretty cool things using Lucene that don't fit under the core premise of full text search.

I've got a fair number of ideas for the talk (easily enough for 1 hour), but I wanted to reach out to hear your stories of ways you've (ab)used Lucene and Solr, to see if we couldn't extend the conversation to a bit more than the conference and also see if I can't inject more ideas beyond the ones I have. I don't need deep technical details, just the high-level use case and the basic insight that led you to believe Lucene could solve the problem.

Thanks in advance,
Grant

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
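
The measurement procedure Mark describes in his reply (a fixed number of random key lookups, repeated over several runs, keeping the best run to reduce OS-level caching noise) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code behind the linked benchmarks: it is written in Python for brevity, the function name `best_avg_lookup_ms` and its parameters are hypothetical, and the in-memory `set` merely stands in for the HashSet baseline. No Lucene or database calls are shown; any store that can answer "is this key present?" could be passed in as the `contains` callable.

```python
import random
import time
from typing import Callable, Sequence


def best_avg_lookup_ms(contains: Callable[[str], bool],
                       keys: Sequence[str],
                       lookups: int = 10_000,
                       runs: int = 5,
                       seed: int = 42) -> float:
    """Return the lowest average lookup time (ms) seen across `runs` runs.

    Hypothetical helper: `contains` is whatever membership test the store
    under test exposes; `keys` are keys known to be in that store.
    """
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(runs):
        # Pick the random keys before starting the clock so that only
        # the lookups themselves are timed.
        sample = [rng.choice(keys) for _ in range(lookups)]
        start = time.perf_counter()
        hits = sum(1 for key in sample if contains(key))
        elapsed = time.perf_counter() - start
        if hits != lookups:
            raise ValueError("store is missing keys that should be present")
        # Keep the best run, as in the email, to damp caching effects.
        best = min(best, elapsed * 1000.0 / lookups)
    return best


# In-memory baseline (assumed stand-in for the HashSet case), using a much
# smaller key count than the ~6 million keys in the original test.
keys = [f"key-{i}" for i in range(100_000)]
in_memory = set(keys)
avg_ms = best_avg_lookup_ms(in_memory.__contains__, keys)
print(f"best average lookup: {avg_ms:.6f} ms")
```

Taking the minimum across runs rather than the mean matches the "best times were recorded" choice in the email; a mean or median would fold the cold-cache runs into the result.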