Result order when score is the same
I'm using version 1.4.1. It appears that when several documents in a result set have the same score, the secondary sort is by 'indexed_at' ascending. Can this be altered in the config XML files? For example, I might want the secondary sort to be indexed_at descending, or on a different field, say document title.

Thanks,
Ken

-- View this message in context: http://lucene.472066.n3.nabble.com/Result-order-when-score-is-the-same-tp2816127p2816127.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Result order when score is the same
Is sort order when 'score' is the same a Lucene thing? Should I ask on the Lucene forum?
Re: Result order when score is the same
You could just explicitly send multiple sorts. From the tutorial:

sort=inStock asc, price desc

Cheers.

On Wed, Apr 13, 2011 at 2:59 PM, kenf_nc ken.fos...@realestate.com wrote:
> Is sort order when 'score' is the same a Lucene thing? Should I ask on the Lucene forum?
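As a rough illustration of the suggestion above, here is a minimal Python sketch that builds a select URL with a multi-clause sort parameter. The host, port, path, and the query term "video" are assumptions for illustration only, and build_select_url is a hypothetical helper, not part of any Solr client library.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, sort_clauses, rows=10):
    """Return a Solr select URL whose sort parameter lists several clauses."""
    params = {
        "q": query,
        "rows": rows,
        # Clauses are applied left to right; later ones only break ties.
        "sort": ", ".join(f"{field} {direction}" for field, direction in sort_clauses),
    }
    return f"{base_url}?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr/select",  # assumed local Solr instance
    "video",                              # assumed query term
    [("inStock", "asc"), ("price", "desc")],
)
print(url)
```

The same helper could be called with [("score", "desc"), ("indexed_at", "desc")] to express the tie-breaker discussed in this thread.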
Re: Result order when score is the same
In real life though, it seems unlikely that the relevancy score will ever be identical, so the second sort field will never be used. Is relevancy score ever identical? Rarely at any rate.

On 4/13/2011 3:22 PM, Rob Casson wrote:
> you could just explicitly send multiple sorts...from the tutorial:
> sort=inStock asc, price desc
> cheers.
Re: Result order when score is the same
Au contraire. I have almost 4 million documents representing businesses in the US, and having the score be the same is a very common occurrence. It is quite clear from testing that if score is the same, then it sorts on indexed_at ascending. It seems silly to make me add a sort on every query; there should be some configuration to modify this. However, if I make all my queries include sort=score+desc,indexed_at+desc, will that have a detrimental performance effect?
Re: Result order when score is the same
If you set omitNorms and omitTermFreqAndPositions on the query field(s) and use no funky boost functions, all results will have an identical score in AND queries (or queries with one search term). IDF has no meaning because of AND, queryNorm is the same across the result set, fieldNorm is 1 and TF is 1. It's not really an uncommon use case. Some business owners just do not care about normalizing or term frequencies.

> In real life though, it seems unlikely that the relevancy score will ever be identical, so the second sort field will never be used. Is relevancy score ever identical? Rarely at any rate.
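To make the argument above concrete, here is a simplified Python sketch of a classic TF-IDF style score (coord times queryNorm times the sum over terms of tf times idf squared times norm). It is not Lucene's actual code: the function name, the sample terms, frequencies, IDF values, and norms are all made up for illustration, and boosts are left out.

```python
import math


def classic_score(term_freqs, idf, field_norm, query_norm=1.0, coord=1.0,
                  omit_tf=False, omit_norms=False):
    """Simplified TF-IDF score for one document matching every query term.

    term_freqs maps each query term to its raw frequency in the document.
    idf maps each query term to its inverse document frequency.
    """
    total = 0.0
    for term, freq in term_freqs.items():
        # With term frequencies omitted, every match counts as frequency 1.
        tf = 1.0 if omit_tf else math.sqrt(freq)
        # With norms omitted, the length normalization factor is 1.
        norm = 1.0 if omit_norms else field_norm
        total += tf * idf[term] ** 2 * norm
    return coord * query_norm * total


idf = {"pizza": 2.0, "chicago": 1.5}                # hypothetical IDF values
short_doc = ({"pizza": 1, "chicago": 1}, 0.5)       # (frequencies, field norm)
long_doc = ({"pizza": 4, "chicago": 2}, 0.25)

with_stats = [classic_score(f, idf, n) for f, n in (short_doc, long_doc)]
without_stats = [classic_score(f, idf, n, omit_tf=True, omit_norms=True)
                 for f, n in (short_doc, long_doc)]
print(with_stats, without_stats)
```

In an AND query every hit contains every term, so the IDF part is the same for all hits; once TF and the norm are both 1, nothing is left to tell the two documents apart.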
Re: Result order when score is the same
Sorting a large set is costly: the more fields you sort on, the more memory is consumed (and likely cached). If I remember correctly, the result set will be ordered by Lucene DocID if there's nothing else to sort on.

If I read correctly, you don't want to specify those fixed sort parameters for every query, right? You can simply add the parameter as a default (or as a constant, i.e. an invariant) in your request handler configuration in solrconfig.xml.
Re: Result order when score is the same
Is a new DocID generated every time a doc with the same UniqueID is added to the index? If so, then docID must be incremental and would look like indexed_at ascending. What I see (and why it's a problem for me) is the following.

A search brings back the first 5 documents in a result set of, say, 60. The scores and titles are as follows (simulated):

1) 6.5, Doc 1
2) 6.3, Doc 2
3) 4.7, Doc 3
4) 4.7, Doc 4
5) 4.7, Doc 5
---
6) 4.7, Doc 6
7) 4.7, Doc 7
8) 4.4, Doc 8

If I query 6 times, the results come back like that every time. However, if I change a field in Doc 4, a field that is not part of the search, it gets the same score, but the results are now this:

1) 6.5, Doc 1
2) 6.3, Doc 2
3) 4.7, Doc 3
4) 4.7, Doc 5
5) 4.7, Doc 6
---
6) 4.7, Doc 7
7) 4.7, Doc 4
8) 4.4, Doc 8

So, in a specific situation I'm looking at, a user sees 5 items on a UI page and clicks a button to 'favorite' document #4. I update Doc 4 and (because it was architecturally better) I re-issue the search. So from the user's viewpoint they 'favorited' number 4 and it disappeared from their screen. Not a good user experience.

If I could modify the secondary sort when score is the same, then worst case Doc 4 would pop to the top of the user's screen but not disappear. Better would be to secondary sort on Title or some other fixed field that exists on all documents. But I would want the sort to be at the system level; I don't want the overhead of sorting every query I ever make.
Re: Result order when score is the same
> Is a new DocID generated every time a doc with the same UniqueID is added to the index? If so, then docID must be incremental and would look like indexed_at ascending. What I see (and why it's a problem for me) is the following.

Yes, Solr removes the old document and inserts a new one when updating an existing document.

> A search brings back the first 5 documents in a result set of, say, 60. The scores and titles are as follows (simulated):
>
> 1) 6.5, Doc 1
> 2) 6.3, Doc 2
> 3) 4.7, Doc 3
> 4) 4.7, Doc 4
> 5) 4.7, Doc 5
> ---
> 6) 4.7, Doc 6
> 7) 4.7, Doc 7
> 8) 4.4, Doc 8
>
> If I query 6 times, the results come back like that every time. However, if I change a field in Doc 4, a field that is not part of the search, it gets the same score, but the results are now this:
>
> 1) 6.5, Doc 1
> 2) 6.3, Doc 2
> 3) 4.7, Doc 3
> 4) 4.7, Doc 5
> 5) 4.7, Doc 6
> ---
> 6) 4.7, Doc 7
> 7) 4.7, Doc 4
> 8) 4.4, Doc 8

The above scenario makes sense indeed.

> So, in a specific situation I'm looking at, a user sees 5 items on a UI page and clicks a button to 'favorite' document #4. I update Doc 4 and (because it was architecturally better) I re-issue the search. So from the user's viewpoint they 'favorited' number 4 and it disappeared from their screen. Not a good user experience.

I agree. If you don't want this to happen, you must ensure that the index order is never used in a search.

> If I could modify the secondary sort when score is the same, then worst case Doc 4 would pop to the top of the user's screen but not disappear. Better would be to secondary sort on Title or some other fixed field that exists on all documents. But I would want the sort to be at the system level; I don't want the overhead of sorting every query I ever make.

Well, sub-sorts must be used to avoid the index order being used for output. Maybe sorting on creation time (obviously not update time) as a final sort is allowed in your use case. It'll take some resources, but if the business requirements are as such, then the resource penalty must be met or accepted.

What do you mean by sorting on the system level? You need the overhead if you don't want the index order to be reflected in your result set; the same applies if the final sub-sort also results in duplicates.
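The effect discussed above can be reproduced with a small, self-contained Python sketch. ToyIndex is a hypothetical stand-in written for this illustration, not Solr or Lucene code: it assigns increasing internal IDs, treats an update as delete plus re-add, and breaks score ties by internal ID unless a tie-breaker field is named. The scores are the simulated ones from the thread.

```python
class ToyIndex:
    """Toy stand-in for an index: documents get increasing internal IDs."""

    def __init__(self):
        self._next_doc_id = 0
        self._docs = {}  # unique key -> (internal doc ID, fields)

    def add(self, key, fields):
        # Re-adding an existing key drops the old entry and appends a new
        # one, so the document receives a fresh, larger internal ID.
        self._docs.pop(key, None)
        self._docs[key] = (self._next_doc_id, dict(fields))
        self._next_doc_id += 1

    def search(self, tie_breaker=None):
        def sort_key(item):
            _, (doc_id, fields) = item
            secondary = fields[tie_breaker] if tie_breaker else doc_id
            # Score descending, then the tie-breaker, then index order.
            return (-fields["score"], secondary, doc_id)

        return [key for key, _ in sorted(self._docs.items(), key=sort_key)]


index = ToyIndex()
for number, score in enumerate([6.5, 6.3, 4.7, 4.7, 4.7, 4.7, 4.7, 4.4], start=1):
    title = f"Doc {number}"
    index.add(title, {"title": title, "score": score})

before = index.search()
# "Favoriting" Doc 4 rewrites it; its score is unchanged.
index.add("Doc 4", {"title": "Doc 4", "score": 4.7, "favorite": True})
after = index.search()
by_title = index.search(tie_breaker="title")

print(before[:5])
print(after[:5])
print(by_title[:5])
```

With a tie-breaker on a field that does not change on update (title here, or a creation time), the first page stays the same after Doc 4 is rewritten.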
Re: Result order when score is the same
> Better would be to secondary sort on Title or some other fixed field that exists on all documents. But, I would want the sort to be at the system level, I don't want the overhead of sorting every query I ever make.

How would 'doing it at the system level' avoid the 'overhead of sorting every query'? Every query has to be sorted, if you want it sorted.

Beyond setting a default sort parameter in general in your request parameters, I don't think there's any way to set a default that says "when I ask for sort by score, I REALLY mean sort by score, then by X", which is what I think you asked earlier.

Just send the sort that you want: sort=score desc, some_other_field desc, if that's what you want. The second field will really only be used for identical scores, plus Solr sorts pretty darn efficiently. I'd be pretty surprised if you were able to see any measurable performance difference at all from adding a second field to your sort parameter.

Beware that the design you describe, updating the Solr index on user action, can often run into trouble in Solr as you scale. Solr can only handle so many commits in a given short period of time before it starts having trouble. At least in 1.4.1. I am not sure of the status in 3.1 of some of the near-real-time features meant to ameliorate this problem, at least in some cases. But this is potentially a far bigger performance headache, eventually, than worrying about a second sort field affecting performance.
Re: Result order when score is the same
Hi Ken,

It sounds like you want to just sort by time changed/added (reverse chronological order). I would not worry about performance issues just yet unless you have some reason to think this is going to cause problems (e.g. giant index, low RAM).

Jonathan is right about commits and about the NRT-ness (near-real-time behavior) of search in a typical Solr master-slave setup. In other words, even if you update the doc, the change will be on the master, and your user will still see the same results in the same order until the next time the index is replicated from the master.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/