Re: Contributors - Solr in Action Case Studies

2010-02-02 Thread Grant Ingersoll
I'd be happy to contribute how we use Solr to power 
http://search.lucidimagination.com.  We ingest many different data sources 
(email, web, wiki, JIRA, source code, etc.) and use dismax, multi-select 
faceting and a variety of other techniques.  I think it would make for a great 
case study.
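
For readers unfamiliar with multi-select faceting, here is a minimal sketch
of the request parameters involved. It is not the actual
search.lucidimagination.com configuration: the field names (source, title,
body), the query term, and the boost are assumptions made up for
illustration. The {!tag=...} and {!ex=...} local params let a facet ignore
the filter on its own field, so counts for the unselected values stay visible.

```python
from urllib.parse import urlencode


def multi_select_params(user_query, selected_source):
    """Build dismax request parameters with one multi-select facet.

    Field names and boosts are hypothetical, chosen only for illustration.
    """
    return [
        ("q", user_query),
        ("defType", "dismax"),            # use the dismax query parser
        ("qf", "title^2 body"),           # hypothetical fields and boost
        ("facet", "true"),
        # Tag the filter so that a facet can exclude it by name.
        ("fq", "{!tag=src}source:" + selected_source),
        # Count the source facet as if the tagged filter were not applied.
        ("facet.field", "{!ex=src}source"),
    ]


query_string = urlencode(multi_select_params("faceting", "email"))
print("/select?" + query_string)
```

With these parameters the result list is restricted to the selected source,
while the source facet still reports counts for every source.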

-Grant

Re: Contributors - Solr in Action Case Studies

2010-02-02 Thread Óscar Marín Miró
Hello,
We've been working extensively with Solr as a 'standard' search service.
However, recently, we had a volume problem displaying time series (for
instance, sentiment of a brand by date), pulling data from a highly
denormalized database. Indexing a view of this database, coupled with
faceting (god bless faceting :D) by the date field, gave us directly every
data series we needed to display, an order of magnitude faster than the
database approach (I guess the 'regular' way). Roughly 1-2 seconds
'playing' with 300,000 points (Solr approach) vs nearly 45 seconds in the DB
approach.
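
A minimal sketch of a date-faceted request of this kind, assuming Solr
1.4-style facet.date parameters. The host, the field name (published), and
the date range are hypothetical and not taken from the setup described
above; later Solr releases favor facet.range for the same job.

```python
from urllib.parse import urlencode


def date_facet_params(field, start, end, gap):
    """Return parameters asking Solr for per-interval counts on a date field."""
    return [
        ("q", "*:*"),                 # match every document
        ("rows", "0"),                # only the facet counts are needed
        ("facet", "true"),
        ("facet.date", field),
        ("facet.date.start", start),
        ("facet.date.end", end),
        ("facet.date.gap", gap),      # bucket width, e.g. one day
    ]


# Hypothetical field name and date range, for illustration only.
params = date_facet_params(
    "published", "2009-01-01T00:00:00Z", "2010-01-01T00:00:00Z", "+1DAY"
)
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

Each bucket in the response is one point of the series, so a single request
can return a whole chart without fetching any documents.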

The amazing thing is that we are not using Solr as a 'text' indexer. The
approach is more like a 'faceter' of data and it is surprisingly fast. It
could be a good case study of a non-standard way of using Solr, or of the
Solr vs Database 'war' :D

From now on, we're adopting this approach every time we have to display
Flash graphs of time series in front-end projects.

Our field of work is mainly text crawling, analysis and indexing, including
sentiment analysis, focused crawling, natural language understanding,
concept extraction...

Thanks,

Oscar

On Mon, Feb 1, 2010 at 6:08 PM, Otis Gospodnetic otis_gospodne...@yahoo.com
 wrote:

 Hello everyone,

 Thanks to all who emailed me so far.  This is just another reminder for
 those who missed the first email below.  Please let us know if you'd like to
 contribute a piece to Solr in Action about your interesting use of Solr.

 Thanks,
 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Hadoop ecosystem search :: http://search-hadoop.com/



 ----- Original Message ----
  From: Otis Gospodnetic otis_gospodne...@yahoo.com
  To: solr-user@lucene.apache.org
  Sent: Thu, January 14, 2010 2:09:41 PM
  Subject: Contributors - Solr in Action Case Studies
 
  Hello,
 
  We are working on Solr in Action [1].  One of the well received chapters
  from LIA #1[2] was the Case Studies chapter, where external contributors
  described how they used Lucene.  We are getting good feedback about this
  chapter from LIA #2 reviewers, too.
 
  Solr in Action also has a Case Studies chapter, and we are starting to
  look for contributors.
 
  If you are using Solr in some clever, interesting, or unusual way and are
  willing to share this information, please get in touch.  5 to max 10
  pages (soft limits) per study is what we are hoping for.  Feel free to
  respond on the list or reply to me directly.
 
  [1] http://www.manning.com/catalog/undercontract.html
  [2] http://www.manning.com/hatcher2/  and  http://www.manning.com/hatcher3/
 
  Thanks,
  Otis
  --
  Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch




-- 
Whether it's science, technology, personal experience, true love, astrology,
or gut feelings, each of us has confidence in something that we will never
fully comprehend.
--Roy H. William


Re: Contributors - Solr in Action Case Studies

2010-02-02 Thread Lukáš Vlček
This would be very welcome! I am interested in this particular use case. In
other words: if the book contains this use case, then you can count on me
buying it! :-)

Regards,
Lukas


On Tue, Feb 2, 2010 at 2:49 PM, Grant Ingersoll gsing...@apache.org wrote:

 I'd be happy to contribute how we use Solr to power
 http://search.lucidimagination.com.  We ingest many different data sources
 (email, web, wiki, JIRA, source code, etc.) and use dismax, multi-select
 faceting and a variety of other techniques.  I think it would make for a
 great case study.

 -Grant


Re: Contributors - Solr in Action Case Studies

2010-02-01 Thread Otis Gospodnetic
Hello everyone,

Thanks to all who emailed me so far.  This is just another reminder for those 
who missed the first email below.  Please let us know if you'd like to 
contribute a piece to Solr in Action about your interesting use of Solr.

Thanks,
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



----- Original Message ----
 From: Otis Gospodnetic otis_gospodne...@yahoo.com
 To: solr-user@lucene.apache.org
 Sent: Thu, January 14, 2010 2:09:41 PM
 Subject: Contributors - Solr in Action Case Studies
 
 Hello,
 
 We are working on Solr in Action [1].  One of the well received chapters from
 LIA #1[2] was the Case Studies chapter, where external contributors described
 how they used Lucene.  We are getting good feedback about this chapter from
 LIA #2 reviewers, too.
 
 Solr in Action also has a Case Studies chapter, and we are starting to look
 for contributors.
 
 If you are using Solr in some clever, interesting, or unusual way and are
 willing to share this information, please get in touch.  5 to max 10 pages
 (soft limits) per study is what we are hoping for.  Feel free to respond on
 the list or reply to me directly.
 
 [1] http://www.manning.com/catalog/undercontract.html
 [2] http://www.manning.com/hatcher2/  and  http://www.manning.com/hatcher3/
 
 Thanks,
 Otis
 --
 Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



Re: Contributors - Solr in Action Case Studies

2010-01-21 Thread Otis Gospodnetic
Hi Tom, hi Tom :)

Yummy goodness.  Lots of data.  Big books.  Thank you, I will be in touch.

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
 From: Tom Burton-West tburtonw...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Wed, January 20, 2010 5:17:39 PM
 Subject: Re: Contributors - Solr in Action Case Studies
 
 
 Hello Otis,
 
 Hi Otis,
 
 We are using Solr to provide indexing for the full text of 5 million books
 (about 4-6 terabytes of text).  Our index is currently around 3 terabytes
 distributed over 10 shards with about 310 GB of index per shard.  We are
 using very large Solr documents (about 750 KB of text or about 100,000
 words/doc), and using CommonGrams to deal with stopwords/common words in
 multiple languages.
 
 I would be interested in contributing a chapter if this sounds interesting. 
 More details about the project are available at:
 http://www.hathitrust.org/large_scale_search  and our blog:
 http://www.hathitrust.org/blogs/large-scale-search  (I'll be updating the
 blog with details of current hardware and performance tests in the next week
 or so)
 
 Tom
 
 Tom Burton-West
 Digital Library Production Service
 University of Michigan Library
 -- 
 View this message in context: 
 http://old.nabble.com/Contributors---Solr-in-Action-Case-Studies-tp27166564p27249616.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Contributors - Solr in Action Case Studies

2010-01-20 Thread Tom Burton-West

Hello Otis,

Hi Otis,

We are using Solr to provide indexing for the full text of 5 million books
(about 4-6 terabytes of text).  Our index is currently around 3 terabytes
distributed over 10 shards with about 310 GB of index per shard.  We are
using very large Solr documents (about 750 KB of text or about 100,000
words/doc), and using CommonGrams to deal with stopwords/common words in
multiple languages.
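
To illustrate the idea behind CommonGrams, here is a simplified sketch. It
is not Lucene's CommonGramsFilter itself: the function name and the word
list are hypothetical, and it assumes tokens are already lowercased. The
idea is to emit an extra joined token whenever a word or its neighbor is a
common word, so phrase queries can match the rarer joined token instead of
walking the long postings lists of the common word alone.

```python
def common_grams(tokens, common_words):
    """Return the tokens plus word pairs that involve a common word.

    Simplified illustration only; positions and offsets are ignored.
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            # Join the pair if either side is a common word.
            if tok in common_words or nxt in common_words:
                out.append(tok + "_" + nxt)
    return out


# Hypothetical common-word list, for illustration only.
print(common_grams(["the", "rain", "in", "spain"], {"the", "in"}))
# prints ['the', 'the_rain', 'rain', 'rain_in', 'in', 'in_spain', 'spain']
```

The trade-off is a larger index in exchange for cheaper phrase queries on
very common words.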

I would be interested in contributing a chapter if this sounds interesting. 
More details about the project are available at:
http://www.hathitrust.org/large_scale_search  and our blog:
http://www.hathitrust.org/blogs/large-scale-search  (I'll be updating the
blog with details of current hardware and performance tests in the next week
or so)

Tom

Tom Burton-West
Digital Library Production Service
University of Michigan Library
-- 
View this message in context: 
http://old.nabble.com/Contributors---Solr-in-Action-Case-Studies-tp27166564p27249616.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Contributors - Solr in Action Case Studies

2010-01-19 Thread Otis Gospodnetic
Hi Gora,

Thanks, this sounds interesting, as I don't think we explicitly cover phonetic 
searches, and talking explicitly about languages other than English will be 
useful to some readers.

Let's take further discussion off-line.


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
 From: Gora Mohanty g...@srijan.in
 To: solr-user@lucene.apache.org
 Sent: Sun, January 17, 2010 4:16:25 PM
 Subject: Re: Contributors - Solr in Action Case Studies
 
 On Thu, 14 Jan 2010 11:09:41 -0800 (PST)
 Otis Gospodnetic wrote:
 [...]
  If you are using Solr in some clever, interesting, or unusual way
  and are willing to share this information, please get in touch.
  5 to max 10 pages (soft limits) per study is what we are hoping
  for.  Feel free to respond on the list or reply to me directly.
 [...]
 
 We have been getting to grips with Solr over the last couple of
 months, and while I am not sure how interesting this is to people
 outside of India, one of the things that we have just finished a
 beta version of is phonetic filters and spell-checking components
 for Solr, dealing with Indian languages. The aim is to have these
 work both for Unicode content/search terms, and for Indian
 languages transliterated into English. The latter is useful as
 many people, especially current computer users in India, find it
 more comfortable to type in transliterated English. These components
 use the standard Solr facilities, as well as established open-source
 spell-checking libraries like aspell, and the design goal includes
 fuzzy matches, such as between Amitav and Amitabh, as there is
 often a fair amount of variance in English transliteration.
 
 We see great potential for this as there is already a large amount
 of content in Indian languages, and the government of India is
 putting in huge amounts of effort into generating more content.
 Please do let me know if this sounds interesting as a case study.
 
 Regards,
 Gora



Re: Contributors - Solr in Action Case Studies

2010-01-17 Thread Gora Mohanty
On Thu, 14 Jan 2010 11:09:41 -0800 (PST)
Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
[...]
 If you are using Solr in some clever, interesting, or unusual way
 and are willing to share this information, please get in touch.
 5 to max 10 pages (soft limits) per study is what we are hoping
 for.  Feel free to respond on the list or reply to me directly.
[...]

We have been getting to grips with Solr over the last couple of
months, and while I am not sure how interesting this is to people
outside of India, one of the things that we have just finished a
beta version of is phonetic filters and spell-checking components
for Solr, dealing with Indian languages. The aim is to have these
work both for Unicode content/search terms, and for Indian
languages transliterated into English. The latter is useful as
many people, especially current computer users in India, find it
more comfortable to type in transliterated English. These components
use the standard Solr facilities, as well as established open-source
spell-checking libraries like aspell, and the design goal includes
fuzzy matches, such as between Amitav and Amitabh, as there is
often a fair amount of variance in English transliteration.
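
As a rough illustration of how variant transliterations can be matched, here
is a toy sketch. It is not the component described above: the rule table is
a hypothetical, deliberately tiny set of substitutions (dropping aspiration,
treating v/w like b), chosen only so that the Amitav/Amitabh example maps to
a single key.

```python
# Hypothetical normalization rules, for illustration only.
RULES = [
    ("bh", "b"), ("dh", "d"), ("gh", "g"), ("kh", "k"),
    ("ph", "p"), ("th", "t"),   # drop aspiration
    ("v", "b"), ("w", "b"),     # treat v/w and b as interchangeable
]


def phonetic_key(name):
    """Reduce a transliterated name to a crude phonetic key."""
    key = name.lower()
    for old, new in RULES:
        key = key.replace(old, new)
    return key


print(phonetic_key("Amitav"), phonetic_key("Amitabh"))
# prints amitab amitab
```

Indexing such a key alongside the original term lets a query for one
spelling find documents that use the other.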

We see great potential for this as there is already a large amount
of content in Indian languages, and the government of India is
putting in huge amounts of effort into generating more content.
Please do let me know if this sounds interesting as a case study.

Regards,
Gora


Contributors - Solr in Action Case Studies

2010-01-14 Thread Otis Gospodnetic
Hello,

We are working on Solr in Action [1].  One of the well received chapters from 
LIA #1[2] was the Case Studies chapter, where external contributors described 
how they used Lucene.  We are getting good feedback about this chapter from LIA 
#2 reviewers, too.

Solr in Action also has a Case Studies chapter, and we are starting to look for 
contributors.

If you are using Solr in some clever, interesting, or unusual way and are 
willing to share this information, please get in touch.  5 to max 10 pages 
(soft limits) per study is what we are hoping for.  Feel free to respond on the 
list or reply to me directly.

[1] http://www.manning.com/catalog/undercontract.html
[2] http://www.manning.com/hatcher2/  and  http://www.manning.com/hatcher3/

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch