Re: Performance with search terms starting and ending with wildcards

2011-04-27 Thread Ueland
Hi!

Thanks for the reply.

We decided to give ngrams another try. After much tweaking and tuning, both
the index size and the speed were more than good enough for our needs.
So it looks like ngrams were the solution for us after all :)
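
For readers wondering why ngrams help here, the following is a minimal sketch of the idea, not the actual Solr analysis chain or the configuration used above. The function names, the sample documents, and the gram sizes (3 to 5) are assumptions made for illustration. Each token is indexed under all of its substrings of the chosen lengths, so a search for *foo* becomes a plain lookup of the term "foo".

```python
from collections import defaultdict


def ngrams(token, min_n, max_n):
    """Return every substring of token whose length is in [min_n, max_n]."""
    grams = set()
    for n in range(min_n, max_n + 1):
        for i in range(len(token) - n + 1):
            grams.add(token[i:i + n])
    return grams


def build_index(docs, min_n=3, max_n=5):
    """Map each n-gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            for gram in ngrams(token, min_n, max_n):
                index[gram].add(doc_id)
    return index


# Hypothetical sample documents.
docs = {1: "seafood market", 2: "football game", 3: "bar stool"}
index = build_index(docs)

# The infix search *foo* is now a single exact-term lookup.
hits = sorted(index.get("foo", set()))
print(hits)
```

The trade-off is index size: every token contributes many terms, and a query shorter than the smallest gram size cannot be answered this way.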

Best regards
Tor Henning Ueland

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Performance-with-search-terms-starting-and-ending-with-wildcards-tp2802561p2871451.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Performance with search terms starting and ending with wildcards

2011-04-11 Thread Otis Gospodnetic
Hi,

Perhaps you should give Lucene/Solr trunk a try and compare! The Wildcard
query in trunk should be much faster.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/





Performance with search terms starting and ending with wildcards

2011-04-10 Thread Ueland
Hi!

I have been doing some testing with Solr and wildcards. Queries like:

- *foo
- foo*

complete quickly (1-2 s) on a test index of about 40-50 GB.

But when I search for *foo*, the search time can easily climb to
30 seconds or more.

Any ideas on how to work around that issue?
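
A rough way to picture the difference, as a simplified model rather than Lucene's actual term dictionary code: with terms kept in sorted order, a prefix pattern can jump straight to the matching run, while an infix pattern has to inspect every term. The function names and the sample terms below are made up for illustration.

```python
import bisect


def prefix_terms(sorted_terms, prefix):
    """Binary-search the sorted list, then read only the matching run."""
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order guarantees no later term can match
        matches.append(term)
    return matches


def infix_terms(sorted_terms, infix):
    """Sorted order does not help here: every term is inspected."""
    return [term for term in sorted_terms if infix in term]


# Hypothetical term dictionary.
terms = sorted(["bar", "food", "fool", "seafood", "tofoo", "zebra"])

print(prefix_terms(terms, "foo"))  # foo*
print(infix_terms(terms, "foo"))   # *foo*
```

On an index with many millions of unique terms, the full scan in the second function is the part that dominates the query time in this model.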

One fix would be to change *foo* to (*foo OR foo* OR oof* OR *oof) (is the
reversed form even needed?). But that will not give the same results as
*foo*, logically enough.
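
A small sketch of why the rewrite loses results, using Python's fnmatch module as a stand-in for wildcard matching on terms (the sample terms are hypothetical). A term that has foo strictly in the middle matches *foo* but neither *foo nor foo*. The sketch also checks that oof* on a reversed term selects the same terms as *foo on the original term.

```python
from fnmatch import fnmatchcase

# Hypothetical indexed terms.
terms = ["foo", "foobar", "barfoo", "seafood"]

# Terms matched by the real infix query.
infix = {t for t in terms if fnmatchcase(t, "*foo*")}

# Terms matched by the proposed rewrite (*foo OR foo*).
either = {t for t in terms if fnmatchcase(t, "*foo") or fnmatchcase(t, "foo*")}

# Terms the rewrite fails to find.
missed = infix - either
print(missed)

# oof* on the reversed term is equivalent to *foo on the original term.
same = all(
    fnmatchcase(t[::-1], "oof*") == fnmatchcase(t, "*foo") for t in terms
)
print(same)
```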

I have also tried to set maxTimeAllowed, but that is simply ignored. I guess
that is related either to sorting or to the wildcard search itself.



Re: Performance with search terms starting and ending with wildcards

2011-04-10 Thread lboutros
Which version of Solr are you using?

NGrams could be an option, but could you give us the field definition from
your schema? And what is the word count of this field in the index?

Ludovic.



-
Jouve
France.

Re: Performance with search terms starting and ending with wildcards

2011-04-10 Thread Ueland
> Which version of Solr are you using?

Currently testing with 3.1.

> NGrams could be an option, but could you give us the field definition in
> your schema? The word count in this field's index?

I won't share the complete schema, but I can summarize it:

For testing, we have around 30 fields that give us what we need from
documents ranging from 1 line to several MB of plain text. Because of this
size, we have limited the copyFields to a maximum of 10,000 characters to
limit the index size a bit.

We did a quick test of n-grams. The issue then was that the index grew from
around 90 GB until the disk filled up at 300 GB. (We tested with more
data/fields, hence the larger index.)

The fact that an n-gram index becomes so large is a bit problematic.
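
To give a feel for where the growth comes from, here is a back-of-the-envelope sketch. The function name and the gram-size bounds are illustrative assumptions, not the settings used in the test above. It counts how many n-gram positions a single token produces for a given range of gram sizes.

```python
def ngram_count(length, min_n, max_n):
    """Count the n-gram positions in a token of the given length."""
    total = 0
    for n in range(min_n, max_n + 1):
        if n <= length:
            # A token of this length has (length - n + 1) grams of size n.
            total += length - n + 1
    return total


# One 10-character token, under three hypothetical gram-size settings.
for min_n, max_n in [(3, 3), (3, 5), (1, 10)]:
    print(min_n, max_n, ngram_count(10, min_n, max_n))
```

A narrow range such as a single gram size keeps the multiplier per token small, while a wide range multiplies the number of indexed terms many times over, which is one reason tuning the bounds matters so much for index size.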

Another interesting note: even when I use the queryFilter to limit the
documents to search in, the query is still extremely slow (30 s or more).
