Re: Contributors - Solr in Action Case Studies
I'd be happy to contribute how we use Solr to power http://search.lucidimagination.com. We ingest many different data sources (email, web, wiki, JIRA, source code, etc.) and use dismax, multi select faceting and a variety of other techniques. I think it would make for a great case study. -Grant
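A rough sketch of how dismax and multi-select faceting can be combined in a single request. The message does not describe the actual setup, so the field names (title, body, source), the boosts, and the tag name src are all hypothetical; only the defType, qf, facet.field, and fq parameters and the {!tag=...}/{!ex=...} local-params syntax are Solr's own.

```python
from urllib.parse import urlencode


def build_multiselect_query(user_query, selected_sources):
    """Build a Solr query string using dismax plus multi-select faceting.

    Field names, boosts and the tag name are hypothetical placeholders.
    Source values are assumed to be simple tokens needing no quoting.
    """
    params = [
        ("q", user_query),
        ("defType", "dismax"),                # use the dismax query parser
        ("qf", "title^2 body"),               # hypothetical fields and boosts
        ("facet", "true"),
        # Exclude the filter tagged "src" when counting this facet, so the
        # counts for unselected sources stay visible after a selection.
        ("facet.field", "{!ex=src}source"),
    ]
    if selected_sources:
        clause = " OR ".join(selected_sources)
        # Tag the filter so the facet above can exclude it.
        params.append(("fq", "{!tag=src}source:(" + clause + ")"))
    return urlencode(params)


print(build_multiselect_query("faceting", ["wiki", "jira"]))
```

The tag on the filter query and the matching exclusion on the facet field are what make the facet "multi-select": selecting wiki and jira narrows the results without zeroing the counts shown for email or web.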
Re: Contributors - Solr in Action Case Studies
Hello,
We've been working extensively with Solr as a 'standard' search service. Recently, however, we ran into a volume problem when displaying time series (for instance, the sentiment of a brand by date) pulled from a highly denormalized database. Indexing a view of this database and faceting on the date field (god bless faceting :D) gave us every data series we needed to display directly, about an order of magnitude faster than the database approach (the 'regular' way, I guess): roughly 1-2 seconds 'playing' with 300,000 points with Solr versus nearly 45 seconds with the DB.
The amazing thing is that we are not using Solr as a 'text' indexer. The approach is more like a 'faceter' of data, and it is surprisingly fast. It could be a good case study of a non-standard way of using Solr, or of the Solr vs. database 'war' :D
From now on, we're adopting this approach every time we have to display Flash graphs of time series in front-end projects. Our field of work is mainly text crawling, analysis and indexing, including sentiment analysis, focused crawling, natural language understanding, concept extraction...
Thanks,
Oscar
On Mon, Feb 1, 2010 at 6:08 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: [...]
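The faceting-by-date approach in Oscar's message can be sketched roughly as follows. This is an illustration under assumptions, not the setup he describes: the URL, the field name created_at, and the dates are made up, and the request uses the facet.range parameters of later Solr releases (Solr 1.4, current when this thread was written, offered the analogous facet.date parameters). The key point is rows=0: Solr returns only the per-interval counts, which already form the time series.

```python
from urllib.parse import urlencode


def build_range_facet_query(base_url, field, start, end, gap, query="*:*"):
    """Build a Solr select URL that returns only per-interval counts for `field`."""
    params = [
        ("q", query),
        ("rows", "0"),                 # no documents needed, only facet counts
        ("wt", "json"),
        ("facet", "true"),
        ("facet.range", field),
        ("facet.range.start", start),
        ("facet.range.end", end),
        ("facet.range.gap", gap),      # bucket size, e.g. one day
    ]
    return base_url + "?" + urlencode(params)


def counts_to_series(flat_counts):
    """Turn a flat [value, count, value, count, ...] list into (value, count) pairs.

    Solr's default JSON output writes the facet counts in this flat form.
    """
    return list(zip(flat_counts[0::2], flat_counts[1::2]))


# Hypothetical core URL, field name and date range.
url = build_range_facet_query(
    "http://localhost:8983/solr/select",
    "created_at",
    "2010-01-01T00:00:00Z",
    "2010-02-01T00:00:00Z",
    "+1DAY",
)
print(url)
```

Each (value, count) pair from counts_to_series is one point on the graph, so the front end never has to touch the underlying rows.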
Re: Contributors - Solr in Action Case Studies
This would be very welcome! I am interested in this particular use case. In other words: if the book contains this use case, you can count on me buying it! :-)
Regards,
Lukas
On Tue, Feb 2, 2010 at 2:49 PM, Grant Ingersoll gsing...@apache.org wrote: [...]
Re: Contributors - Solr in Action Case Studies
Hello everyone,
Thanks to all who emailed me so far. This is just another reminder for those who missed the first email below. Please let us know if you'd like to contribute a piece to Solr in Action about your interesting use of Solr.
Thanks,
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/
----- Original Message -----
From: Otis Gospodnetic otis_gospodne...@yahoo.com
To: solr-user@lucene.apache.org
Sent: Thu, January 14, 2010 2:09:41 PM
Subject: Contributors - Solr in Action Case Studies
[...]
Re: Contributors - Solr in Action Case Studies
Hi Tom, hi Tom :)
Yummy goodness. Lots of data. Big books. Thank you, I will be in touch.
Otis
-- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
----- Original Message -----
From: Tom Burton-West tburtonw...@gmail.com
To: solr-user@lucene.apache.org
Sent: Wed, January 20, 2010 5:17:39 PM
Subject: Re: Contributors - Solr in Action Case Studies
[...]
Re: Contributors - Solr in Action Case Studies
Hi Otis,
We are using Solr to provide indexing for the full text of 5 million books (about 4-6 terabytes of text). Our index is currently around 3 terabytes, distributed over 10 shards with about 310 GB of index per shard. We are using very large Solr documents (about 750 KB of text, or about 100,000 words, per document) and CommonGrams to deal with stopwords/common words in multiple languages.
I would be interested in contributing a chapter if this sounds interesting. More details about the project are available at http://www.hathitrust.org/large_scale_search and on our blog at http://www.hathitrust.org/blogs/large-scale-search (I'll be updating the blog with details of current hardware and performance tests in the next week or so).
Tom
Tom Burton-West
Digital Library Production Service
University of Michigan Library
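For readers unfamiliar with CommonGrams: instead of dropping common words, the filter also indexes each common word joined to its neighbour as a single token, so phrase queries containing stopwords can match on these rarer two-word tokens rather than on the very long postings of the common words alone. The sketch below is a simplified illustration of that idea, not Lucene's implementation: it ignores token positions and the separate query-time variant, and the common-word list is hypothetical.

```python
def common_grams(tokens, common_words):
    """Emit every token, plus a joined bigram wherever either side is a common word.

    Simplified illustration only: real analysis filters also track token
    positions, which are left out here.
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common_words or nxt in common_words:
                out.append(tok + "_" + nxt)
    return out


# Hypothetical common-word list; a multilingual index would need one per language.
print(common_grams(["the", "rain", "in", "spain"], {"the", "in"}))
```

The trade-off is a larger index in exchange for faster phrase queries, which matters most at the document and index sizes mentioned above.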
Re: Contributors - Solr in Action Case Studies
Hi Gora,
Thanks, this sounds interesting, as I don't think we explicitly cover phonetic searches, and talking explicitly about languages other than English will be useful to some readers. Let's take further discussion off-line.
Thanks,
Otis
-- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
----- Original Message -----
From: Gora Mohanty g...@srijan.in
To: solr-user@lucene.apache.org
Sent: Sun, January 17, 2010 4:16:25 PM
Subject: Re: Contributors - Solr in Action Case Studies
[...]
Re: Contributors - Solr in Action Case Studies
On Thu, 14 Jan 2010 11:09:41 -0800 (PST) Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
[...] If you are using Solr in some clever, interesting, or unusual way and are willing to share this information, please get in touch. 5 to max 10 pages (soft limits) per study is what we are hoping for. Feel free to respond on the list or reply to me directly. [...]
We have been getting to grips with Solr over the last couple of months, and while I am not sure how interesting this is to people outside of India, one of the things we have just finished a beta version of is a set of phonetic filters and spell-checking components for Solr that deal with Indian languages. The aim is to have these work both for Unicode content and search terms, and for Indian languages transliterated into English. The latter is useful because many people, especially current computer users in India, find it more comfortable to type in transliterated English.
These components use the standard Solr facilities, as well as established open-source spell-checking libraries like aspell. The design goals include fuzzy matches, such as between Amitav and Amitabh, as there is often a fair amount of variance in English transliteration.
We see great potential for this, as there is already a large amount of content in Indian languages, and the government of India is putting huge amounts of effort into generating more content. Please do let me know if this sounds interesting as a case study.
Regards,
Gora
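To make the Amitav/Amitabh example concrete, here is a minimal sketch of fuzzy matching by edit distance. It is a generic illustration, not the phonetic filters or the aspell integration described in the message, and the threshold of 2 is an arbitrary assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))           # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                           # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def is_fuzzy_match(a, b, max_distance=2):
    """Treat two spellings as variants if they are within max_distance edits.

    The default threshold is a hypothetical choice for illustration.
    """
    return edit_distance(a.lower(), b.lower()) <= max_distance


print(edit_distance("amitav", "amitabh"))
print(is_fuzzy_match("Amitav", "Amitabh"))
```

Plain edit distance treats every character change alike; a phonetic filter goes further by knowing that v/bh is a likely transliteration variant while other two-edit changes are not.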
Contributors - Solr in Action Case Studies
Hello,
We are working on Solr in Action [1]. One of the well-received chapters from LIA #1 [2] was the Case Studies chapter, where external contributors described how they used Lucene. We are getting good feedback about this chapter from LIA #2 reviewers, too. Solr in Action also has a Case Studies chapter, and we are starting to look for contributors.
If you are using Solr in some clever, interesting, or unusual way and are willing to share this information, please get in touch. 5 to max 10 pages (soft limits) per study is what we are hoping for. Feel free to respond on the list or reply to me directly.
[1] http://www.manning.com/catalog/undercontract.html
[2] http://www.manning.com/hatcher2/ and http://www.manning.com/hatcher3/
Thanks,
Otis
-- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch