Re: Phrase Query Performance Question and score threshold

2007-11-05 Thread Yonik Seeley
On 11/5/07, Haishan Chen [EMAIL PROTECTED] wrote: As for the first issue: the number of different phrase queries with performance issues that I have found so far is about 10. If these are normal phrase queries (no slop), a good solution might be to simply index and query these phrases as a single
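One way to read the suggestion above, assuming the truncated sentence refers to treating each problem phrase as a single indexed term: rewrite the handful of known slow phrases into one token at both index time and query time, so that the phrase lookup becomes a plain term lookup. The sketch below is plain Python, not a Solr analyzer; the names `HOT_PHRASES`, `join_hot_phrases`, and `tokenize`, and the underscore-joined token form, are hypothetical.

```python
import re

# Hypothetical list of the ~10 phrases known to be slow; these two are
# taken from the timings quoted elsewhere in the thread.
HOT_PHRASES = ["auto repair", "car repair"]

def join_hot_phrases(text, phrases=HOT_PHRASES):
    """Lowercase text and replace each known phrase with one underscore-joined token."""
    out = text.lower()
    # Longest phrases first so a longer phrase is not split by a shorter one.
    for phrase in sorted(phrases, key=len, reverse=True):
        pattern = r"\b" + r"\s+".join(map(re.escape, phrase.split())) + r"\b"
        out = re.sub(pattern, phrase.replace(" ", "_"), out)
    return out

def tokenize(text):
    """Whitespace tokenizer applied after phrase joining (same on both sides)."""
    return join_hot_phrases(text).split()

# The same function must run on documents and on queries so the tokens agree.
doc_tokens = tokenize("Best Auto  Repair shop in town")
query_tokens = tokenize("auto repair")
```

This only covers exact phrases (no slop), which is the case the message singles out; a sloppy query such as `car repair~100` would still need the individual terms and their positions in the index.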

Re: Phrase Query Performance Question

2007-11-02 Thread Walter Underwood
He means extremely frequent and I agree. --wunder On 11/2/07 1:51 AM, Haishan Chen [EMAIL PROTECTED] wrote: Thanks for the advice. You certainly have a point. I believe you mean a query term that appears in 5-10% of an index in a natural language corpus is extremely INFREQUENT?

RE: Phrase Query Performance Question

2007-11-02 Thread Haishan Chen
From: [EMAIL PROTECTED] Subject: Re: Phrase Query Performance Question Date: Thu, 1 Nov 2007 11:25:26 -0700 To: solr-user@lucene.apache.org On 31-Oct-07, at 11:54 PM, Haishan Chen wrote: Date: Wed, 31 Oct 2007 17:54:53 -0700 Subject: Re: Phrase Query Performance Question From

Re: Phrase Query Performance Question

2007-11-02 Thread Mike Klaas
On 2-Nov-07, at 10:03 AM, Haishan Chen wrote: Date: Fri, 2 Nov 2007 07:32:30 -0700 Subject: Re: Phrase Query Performance Question From: [EMAIL PROTECTED] To: solr- [EMAIL PROTECTED] He means extremely frequent and I agree. --wunder Then it means a PHRASE (combination of terms

Re: Phrase Query Performance Question

2007-11-02 Thread Chris Hostetter
: It still feels to me that you are trying to do something unique with your : phrase queries. Unfortunately, you still haven't said what you are trying to : do in general terms, which makes it very difficult for people to help you. Agreed. This seems very special case, but we don't know what

RE: Phrase Query Performance Question

2007-11-02 Thread Haishan Chen
Date: Fri, 2 Nov 2007 12:31:29 -0700 From: [EMAIL PROTECTED] To: solr-user@lucene.apache.org Subject: Re: Phrase Query Performance Question : It still feels to me that you are trying to do something unique with your : phrase queries. Unfortunately, you still haven't said what you

Re: Phrase Query Performance Question

2007-11-01 Thread Mike Klaas
On 31-Oct-07, at 11:54 PM, Haishan Chen wrote: Date: Wed, 31 Oct 2007 17:54:53 -0700 Subject: Re: Phrase Query Performance Question From: [EMAIL PROTECTED] To: solr- [EMAIL PROTECTED] hurricane katrina is a very expensive query against a collection focused on Hurricane Katrina

RE: Phrase Query Performance Question

2007-10-31 Thread Haishan Chen
From: [EMAIL PROTECTED] Subject: Re: Phrase Query Performance Question Date: Tue, 30 Oct 2007 11:22:17 -0700 To: solr-user@lucene.apache.org On 30-Oct-07, at 6:09 AM, Yonik Seeley wrote: On 10/30/07, Haishan Chen [EMAIL PROTECTED] wrote: Thanks a lot for replying Yonik! I am

Re: Phrase Query Performance Question

2007-10-31 Thread Mike Klaas
On 31-Oct-07, at 2:40 PM, Haishan Chen wrote: http://mail-archives.apache.org/mod_mbox/lucene-java-user/200512.mbox/[EMAIL PROTECTED] It mentioned that http://websearch.archive.org/katrina/ (in nutch) had 10M documents and a search of hurricane katrina was able to return in 1.35 seconds

RE: Phrase Query Performance Question

2007-10-31 Thread Haishan Chen
From: [EMAIL PROTECTED] Subject: Re: Phrase Query Performance Question Date: Wed, 31 Oct 2007 15:25:42 -0700 To: solr-user@lucene.apache.org On 31-Oct-07, at 2:40 PM, Haishan Chen wrote: http://mail-archives.apache.org/mod_mbox/lucene-java-user/200512.mbox/[EMAIL PROTECTED

Re: Phrase Query Performance Question

2007-10-31 Thread Walter Underwood
hurricane katrina is a very expensive query against a collection focused on Hurricane Katrina. There will be many matches in many documents. If you want to measure worst-case, this is fine. I'd try other things, like:
* ninth ward
* Ray Nagin
* Audubon Park
* Canal Street
* French Quarter
* FEMA
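A rough way to run that comparison, sketched in Python: time each suggested phrase and print its latency next to the worst-case query. The `run_query` callable is a hypothetical stand-in for whatever actually issues the search (for example an HTTP request to the search server); the stub used here only returns a fake hit count so the block runs without a server.

```python
import time

# Phrases suggested in the message above, plus the worst-case query.
PHRASES = ["hurricane katrina", "ninth ward", "Ray Nagin", "Audubon Park",
           "Canal Street", "French Quarter", "FEMA"]

def as_phrase_query(text):
    """Wrap multi-word text in double quotes so it is parsed as a phrase query."""
    return '"%s"' % text if " " in text else text

def time_queries(phrases, run_query, repeats=3):
    """Return {query: (hits, best_seconds)} using the fastest of `repeats` runs."""
    results = {}
    for text in phrases:
        query = as_phrase_query(text)
        best = float("inf")
        hits = None
        for _ in range(repeats):
            start = time.perf_counter()
            hits = run_query(query)
            best = min(best, time.perf_counter() - start)
        results[query] = (hits, best)
    return results

def fake_run_query(query):
    """Hypothetical stub: replace with a real search call returning a hit count."""
    return len(query)

timings = time_queries(PHRASES, fake_run_query)
for query, (hits, seconds) in timings.items():
    print("%-22s %8s hits %8.3f ms" % (query, hits, seconds * 1000))
```

Keeping the fastest of several runs mostly measures warm-cache behaviour; recording the first run separately would show the cold-cache cost instead.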

RE: Phrase Query Performance Question

2007-10-31 Thread Chris Hostetter
: (auto repair) 100384 hits 946 ms
: (auto repair) 100384 hits 31 ms
: (car repair~100) 112183 hits 766 ms
: (car repair) 112183 hits 63 ms
: (business service~100) 1209751 hits 1500 ms
: (business service) 1209751 hits 234 ms
: (shopping center~100) 119481 hits 359 ms
: (shopping

RE: Phrase Query Performance Question

2007-10-31 Thread Haishan Chen
Date: Wed, 31 Oct 2007 19:19:07 -0700 From: [EMAIL PROTECTED] To: solr-user@lucene.apache.org Subject: RE: Phrase Query Performance Question
: (auto repair) 100384 hits 946 ms
: (auto repair) 100384 hits 31 ms
: (car repair~100) 112183 hits 766 ms
: (car repair) 112183 hits 63 ms

RE: Phrase Query Performance Question

2007-10-31 Thread Haishan Chen
Date: Wed, 31 Oct 2007 17:54:53 -0700 Subject: Re: Phrase Query Performance Question From: [EMAIL PROTECTED] To: solr-user@lucene.apache.org hurricane katrina is a very expensive query against a collection focused on Hurricane Katrina. There will be many matches in many documents

RE: Phrase Query Performance Question

2007-10-30 Thread Haishan Chen
Thanks a lot for replying Yonik! I am running Solr on a Windows 2003 server (standard version). Intel Xeon CPU 3.00GHz, with 4.00 GB RAM. The index is located on RAID 5 with 2 million documents. Is there any way to improve query performance without moving to a more powerful computer? I

Re: Phrase Query Performance Question

2007-10-30 Thread Yonik Seeley
On 10/30/07, Haishan Chen [EMAIL PROTECTED] wrote: Thanks a lot for replying Yonik! I am running Solr on a Windows 2003 server (standard version). Intel Xeon CPU 3.00GHz, with 4.00 GB RAM. The index is located on RAID 5 with 2 million documents. Is there any way to improve query performance

Re: Phrase Query Performance Question

2007-10-30 Thread Mike Klaas
On 30-Oct-07, at 6:09 AM, Yonik Seeley wrote: On 10/30/07, Haishan Chen [EMAIL PROTECTED] wrote: Thanks a lot for replying Yonik! I am running Solr on a Windows 2003 server (standard version). Intel Xeon CPU 3.00GHz, with 4.00 GB RAM. The index is located on RAID 5 with 2 million documents.