Re: phrase query in solr 4

2014-10-30 Thread Dmitry Kan
On top of what Shawn rightly said, two things:
1. Benchmark the solution yourself (your best bet), with and without the shingles. Then you will know for certain and have numbers to back up your story.
2. If you go with the shingles approach, consider removing duplicates with https://wiki.apache.org/solr/Analyzers
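To make the two points above concrete, here is a minimal sketch in plain Python of what word shingles of size 3 look like and what dropping duplicate shingles does to their count. It is only an illustration, not Solr's analysis chain: the function names `shingles` and `unique_shingles` and the sample sentence are hypothetical, and a real benchmark would have to run against the actual index.

```python
def shingles(tokens, size=3):
    """Return word shingles (contiguous token n-grams) of the given size."""
    # A token list shorter than `size` yields no shingles.
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def unique_shingles(tokens, size=3):
    """Return shingles with duplicates dropped, keeping first-occurrence order."""
    # dict.fromkeys preserves insertion order, so the first copy of each shingle wins.
    return list(dict.fromkeys(shingles(tokens, size)))


# Hypothetical sample text, chosen so that some shingles repeat.
tokens = "to be or not to be or not".split()
all_shingles = shingles(tokens)
deduped = unique_shingles(tokens)
print(len(all_shingles), len(deduped))
```

Comparing the two counts on a sample of real documents gives a rough first idea of how much repetition the shingled field carries, before running a full benchmark with and without shingles.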

Re: phrase query in solr 4

2014-10-27 Thread Shawn Heisey
On 10/27/2014 6:20 AM, Robust Links wrote:
> 1) we want to index and search all tokens in a document (i.e. we do not
> rely on external stores)
>
> 2) we need search time to be fast and willing to pay larger indexing time
> and index size,
>
> 3) be able to search as fast as possible ngrams of 3

phrase query in solr 4

2014-10-27 Thread Robust Links
Hi,
We are trying to upgrade our index from 3.6.1 to 4.9.1, and I wanted to check whether our existing indexing strategy is still valid. The statistics of the raw corpus are:
- 4.8 billion tokens in the entire corpus
- 13 million documents
We have 3 requirements:
1) we want to inde

phrase query in solr 4

2014-10-24 Thread Robust Links
Hi,
We are trying to upgrade our index from 3.6.1 to 4.9.1, and I wanted to check whether our existing indexing strategy is still valid. The statistics of the raw corpus are:
- 4.8 billion tokens in the entire corpus
- 13 million documents
We have 3 requirements:
1) we want to inde