Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Walter Underwood
Dang, had another server do this.

Syncing and committing a new index does not fix it. The two servers
show the same bad results.

wunder




Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Yonik Seeley
It just occurred to me that a query cache issue could potentially
cause this... if it's caching it would most likely be a query.equals()
implementation incorrectly returning true.
Perhaps check the JaroWinkler.equals() first?

Also, when one server starts to return bad results, have you tried
using explainOther=id:id_of_other_doc_that_should_score_higher?

-Yonik
http://www.lucidimagination.com
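
For anyone following along, here is a rough sketch of how such a debug request could be put together. The host, port, and document id are placeholders, not values from this thread; `qt`, `debugQuery`, and `explainOther` are standard Solr request parameters, and `groups_people` is the handler name from Walter's config.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ExplainOtherUrl {
    // Builds a debug request URL that asks Solr to also explain one specific
    // document. baseUrl and docId are hypothetical placeholders.
    static String build(String baseUrl, String handler, String query, String docId) {
        try {
            return baseUrl + "/select"
                + "?qt=" + URLEncoder.encode(handler, "UTF-8")
                + "&q=" + URLEncoder.encode(query, "UTF-8")
                + "&debugQuery=true"
                + "&explainOther=" + URLEncoder.encode("id:" + docId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported by the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder host and id; substitute the id of the doc that should rank higher.
        System.out.println(build("http://localhost:8983/solr", "groups_people",
                                 "the changeling", "12345"));
    }
}
```

Running the same URL against a good server and a bad server should make the two explain sections directly comparable.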




Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Walter Underwood
The JaroWinkler equals was broken, but I fixed that a month ago.

Query cache sounds possible, but those are cleared on a commit,
right?

I could run with a cache size of 0, since our middle tier HTTP
cache is leaving almost nothing for the caches to do.

I'll try that explain. The stored fields for the correct doc
are fine, because I can see them when I use a single-term query.
The indexed fields seem OK, because that query works.

wunder




Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Yonik Seeley
On Tue, Apr 14, 2009 at 12:19 PM, Walter Underwood
wunderw...@netflix.com wrote:
> The JaroWinkler equals was broken, but I fixed that a month ago.
>
> Query cache sounds possible, but those are cleared on a commit,
> right?

Yes, but if you use autowarming, those items are regenerated and if
there is a problem with equals() then it could re-appear (the cache
items are correct, it's just the lookup that returns the wrong one).

-Yonik
http://www.lucidimagination.com
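
To illustrate the failure mode described above, here is a minimal sketch of a cache keyed on an object whose equals() ignores part of its state. `FuzzyKey` is a hypothetical stand-in, not the actual JaroWinkler query class or any Solr/Lucene code, and the plain HashMap only approximates how a query cache does its lookups.

```java
import java.util.HashMap;
import java.util.Map;

public class BrokenEqualsCacheDemo {
    // Hypothetical stand-in for a custom query class used as a cache key.
    static final class FuzzyKey {
        final String field;
        final String term;
        final boolean buggy;

        FuzzyKey(String field, String term, boolean buggy) {
            this.field = field;
            this.term = term;
            this.buggy = buggy;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof FuzzyKey)) return false;
            FuzzyKey other = (FuzzyKey) o;
            // Bug being illustrated: the term is left out of the comparison.
            if (buggy) return field.equals(other.field);
            return field.equals(other.field) && term.equals(other.term);
        }

        @Override
        public int hashCode() {
            // Kept consistent with equals() in both modes.
            return buggy ? field.hashCode() : 31 * field.hashCode() + term.hashCode();
        }
    }

    // Caches a result for "the", then looks up "changeling" on the same field.
    static String lookup(boolean buggy) {
        Map<FuzzyKey, String> cache = new HashMap<FuzzyKey, String>();
        cache.put(new FuzzyKey("exact_base", "the", buggy), "results for: the");
        return cache.get(new FuzzyKey("exact_base", "changeling", buggy));
    }

    public static void main(String[] args) {
        System.out.println("buggy equals:   " + lookup(true));
        System.out.println("correct equals: " + lookup(false));
    }
}
```

With the buggy comparison the cached entry itself is fine, but the lookup for a different term hands back the entry stored for "the". With the correct comparison the lookup misses, as it should.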


Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Walter Underwood
But why would it work for a few days, then go bad and stay bad?

It fails for every multi-term query, even those not in cache.
I ran a test with more queries than the cache size.

We do use autowarming.

wunder




Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Grant Ingersoll

Are there changes occurring when it goes bad that maybe aren't committed?


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Walter Underwood
Nope. This is a slave, so no indexing happens, just a sync. The
sync happens once per day. It went bad at a different time.

wunder




Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Grant Ingersoll
Is bad memory a possibility?  i.e. is it the same machine all the  
time?  Is there any recognizable pattern for when it happens?


-Grant (grasping at straws)





Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Walter Underwood
I already ruled out cosmic rays. It has happened on different
hardware and at different times of day, including low load.

The only thing associated with it is load from a new faceted
browse thing we turned on.

wunder




Re: Help with relevance failure in Solr 1.3

2009-04-14 Thread Grant Ingersoll
OK, I guess details on the new faceting stuff would be in order.
Which faceting are you using?  Are you sure that it never occurred before
(i.e. it slipped under the radar)?

Obviously, the key is reproducibility here, but this has all the
earmarks of some weird threading issue, it seems, at least IMO.






Re: Help with relevance failure in Solr 1.3

2009-04-11 Thread Grant Ingersoll


On Apr 10, 2009, at 5:50 PM, Walter Underwood wrote:

> Normally, both "changeling" and "the changeling" work fine. This one
> server is misbehaving like this for all multi-term queries.
>
> Yes, it is VERY weird that the term "changeling" does not show up in
> the explain.
>
> A server will occasionally go bad and stay in that state. In one case,
> two servers went bad and both gave the same wrong results.



What's the solution for when they go bad?  Do you have to restart Solr  
or reboot or what?







Re: Help with relevance failure in Solr 1.3

2009-04-11 Thread Walter Underwood
Restarting Solr fixes it. If I remember correctly, a sync and commit
does not fix it. I have disabled snappuller this time, so I can study
the broken instance.

wunder




Re: Help with relevance failure in Solr 1.3

2009-04-10 Thread Walter Underwood
If you don't see the attachments, you can get them here:

http://wunderwood.org/solr/

wunder

On 4/10/09 10:56 AM, Walter Underwood wunderw...@netflix.com wrote:

> We have a rare, hard-to-reproduce problem with our Solr 1.3 servers, and
> I would appreciate any ideas.
>
> Occasionally, a server will start returning results with really poor
> relevance. Single-term queries work fine, but multi-term queries are
> scored based on the most common term (lowest IDF).
>
> I don't see anything in the logs when this happens. We have a monitor
> doing a search for the 100 most popular movies once per minute to
> catch this, so we know when it was first detected.
>
> I'm attaching two explain outputs, one for the query "changeling" and
> one for "the changeling".
>
> We are running Solr 1.3 with Lucene 2.4.0, and have added a fuzzy query
> using JaroWinkler matching.
>
> I'd appreciate ideas about where to look, what debug output to try, etc.
>
> wunder
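
The core check behind a relevance monitor like the one mentioned in the original post could look something like the sketch below. This is not the actual monitor; the class name, the idea of an expected id per query, and the cutoff k are all assumptions made for illustration. It only shows the pass/fail test on a ranked list of document ids.

```java
import java.util.Arrays;
import java.util.List;

public class RelevanceMonitorCheck {
    // Returns true if expectedId appears among the first k ranked ids.
    // rankedIds is assumed to be the id field of the results, in rank order.
    static boolean inTopK(List<String> rankedIds, String expectedId, int k) {
        int limit = Math.min(k, rankedIds.size());
        return rankedIds.subList(0, limit).contains(expectedId);
    }

    public static void main(String[] args) {
        // Hypothetical result ids for one monitored query.
        List<String> ranked = Arrays.asList("m101", "m202", "m303");
        System.out.println(inTopK(ranked, "m202", 2));
        System.out.println(inTopK(ranked, "m303", 2));
    }
}
```

A monitor built this way would flag a server as soon as a known-popular title drops out of the top few results for its own multi-term name.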




Re: Help with relevance failure in Solr 1.3

2009-04-10 Thread Grant Ingersoll


On Apr 10, 2009, at 1:56 PM, Walter Underwood wrote:

> We have a rare, hard-to-reproduce problem with our Solr 1.3 servers,
> and I would appreciate any ideas.
>
> Occasionally, a server will start returning results with really poor
> relevance. Single-term queries work fine, but multi-term queries are
> scored based on the most common term (lowest IDF).
>
> I don't see anything in the logs when this happens. We have a monitor
> doing a search for the 100 most popular movies once per minute to
> catch this, so we know when it was first detected.
>
> I'm attaching two explain outputs, one for the query "changeling" and
> one for "the changeling".

I'm not sure exactly what you are asking, so bear with me...

Are you saying that "the changeling" normally returns results just
fine and then periodically it will go bad, or are you saying you
don't understand why "the changeling" scores differently from
"changeling"?  In looking at the explains, it is weird that in the
"the changeling" case, the term "changeling" doesn't even show up as a
term.

Can you share your dismax configuration?  That will be easier to parse
than trying to make sense of the debug query parsing.


-Grant
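
As background on the "lowest IDF" remark in the quoted post: the default Lucene similarity of that era computes idf as 1 + ln(numDocs / (docFreq + 1)), so a very common term such as "the" gets a much smaller weight than a rare one such as "changeling". The sketch below just evaluates that formula; the document counts are invented for illustration and are not taken from the index discussed in this thread.

```java
public class IdfExample {
    // idf as computed by Lucene's classic default similarity:
    // 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        int numDocs = 100000;          // hypothetical index size
        int dfThe = 40000;             // hypothetical doc frequency of "the"
        int dfChangeling = 3;          // hypothetical doc frequency of "changeling"
        System.out.println("idf(the)        = " + idf(dfThe, numDocs));
        System.out.println("idf(changeling) = " + idf(dfChangeling, numDocs));
    }
}
```

In a healthy ranking the rare term should dominate, which is why results that follow the common term look so wrong.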


Re: Help with relevance failure in Solr 1.3

2009-04-10 Thread Walter Underwood
Normally, both "changeling" and "the changeling" work fine. This one
server is misbehaving like this for all multi-term queries.

Yes, it is VERY weird that the term "changeling" does not show up in
the explain.

A server will occasionally go bad and stay in that state. In one case,
two servers went bad and both gave the same wrong results.

Here is the dismax config. "groups" means movies. The title* fields
are stemmed and stopped, the exact* fields are not.

  <!-- groups and people  -->

  <requestHandler name="groups_people" class="solr.SearchHandler">
    <lst name="defaults">
     <str name="defType">dismax</str>
     <str name="echoParams">none</str>
     <float name="tie">0.01</float>
     <str name="qf">
        exact^6.0 exact_alt^6.0 exact_base~jw_0.7_1^8.0 exact_alias^8.0
        title^3.0 title_alt^3.0 title_base^4.0
     </str>

     <str name="pf">
        exact^9.0 exact_alt^9.0 exact_base^12.0 exact_alias^12.0 title^3.0
        title_alt^4.0 title_base^6.0
     </str>
     <str name="bf">
        search_popularity^100.0
     </str>
     <str name="mm">1</str>
     <int name="ps">100</int>
     <str name="fl">id,type,movieid,personid,genreid</str>

    </lst>
    <lst name="appends">
      <str name="fq">type:group OR type:person</str>
    </lst>
  </requestHandler>


wunder
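
For readers unfamiliar with the tie parameter in the config above: a dismax disjunction scores a term as the best per-field score plus tie times the sum of the remaining per-field scores. The sketch below applies that formula to made-up per-field scores; it covers only one term's disjunction across the qf fields and leaves out the pf and bf contributions that the handler adds on top.

```java
public class DisMaxTieScore {
    // Disjunction-max combination: max + tie * (sum of the other field scores).
    static double score(double[] fieldScores, double tie) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : fieldScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tie * (sum - max);
    }

    public static void main(String[] args) {
        // Hypothetical per-field scores for one term, e.g. exact, title, title_alt.
        double[] perField = {2.0, 1.0, 0.5};
        System.out.println(score(perField, 0.01)); // tie value from the config above
        System.out.println(score(perField, 0.0));  // pure max
        System.out.println(score(perField, 1.0));  // plain sum
    }
}
```

With tie at 0.01 the best-matching field almost entirely decides a term's score, and the other fields act only as tie-breakers.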
