RE: Custom Sorting
Ok, thank you for the discussion. As I thought, this isn't possible within performance limits. I think the way to go is to record some more stats at index time and use them in boost queries. :)

Thanks
Mike

Date: Tue, 19 Apr 2011 15:12:00 -0400
Subject: Re: Custom Sorting
From: erickerick...@gmail.com
To: solr-user@lucene.apache.org

As I understand it, sorting by field is what caches are all about. You have a big list in memory of all of the terms for a field, indexed by Lucene doc ID, so fetching the term to compare by doc ID is fast. That is also why the caches need to be warmed, and why sort fields should be single-valued.

If you try to do this yourself and fetch data from each document, you can incur a huge performance hit, since you'll be seeking all over your disk...

Score is special though, since it's transient. Internally, all Lucene has to do is keep track of the top N scores encountered, where N is something like start + queryResultWindowSize (the latter from solrconfig.xml), with no seeks to disk at all...

Best
Erick

On Tue, Apr 19, 2011 at 2:50 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

On 4/19/2011 1:43 PM, Jan Høydahl wrote:

Hi, Not possible :) Lucene compares each matching document against the query and produces a score for each. Documents are not compared to each other like in a normal sort; that would be way too costly.

That might be true for sort by 'score' (although even if you have all the scores, it still seems like some kind of sort must be necessary to see which comes first), but when you sort by a field value, which is also possible, Lucene must be doing some kind of 'normal sort' algorithm, no? Ah, I guess it could just be using each term's position in the index, which is available in constant time, always kept track of in an index? Maybe, I don't know?
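Erick's point about keeping only the top N scores can be illustrated with a small bounded min-heap. This is a sketch of the general technique in plain Java, not Lucene's actual collector code; the class name TopNScores and the method topN are made up for the example.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Illustrative only: a generic "keep the best n" loop, not Lucene internals.
class TopNScores {
    /** Returns the n highest scores in descending order, holding at most n values at a time. */
    static float[] topN(float[] scores, int n) {
        if (n <= 0) {
            return new float[0];
        }
        // Min-heap: the smallest score we are still keeping sits on top.
        PriorityQueue<Float> heap = new PriorityQueue<>(n);
        for (float s : scores) {
            if (heap.size() < n) {
                heap.add(s);
            } else if (s > heap.peek()) {
                heap.poll();   // drop the current weakest entry
                heap.add(s);
            }
        }
        float[] out = new float[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();  // polled smallest-first, so fill from the back
        }
        return out;
    }

    public static void main(String[] args) {
        float[] scores = {0.2f, 1.5f, 0.9f, 3.1f, 0.4f};
        System.out.println(Arrays.toString(topN(scores, 3)));
    }
}
```

Each score is looked at once and compared only against the weakest kept entry, so nothing has to be fetched per document and no full sort of all matches is needed.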
Custom Sorting
Hi,

I want to be able to have a custom sorting algorithm such that for each comparison of document results (A v B) I can rank them, i.e. writing a comparator like I would normally do in Java (compares its two arguments for order; returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second).

In the comparator I want to be able to take into account the score of the results, as well as other fields in the documents. I've looked at using things such as the score/boost/bf parameters etc.; however, I want the flexibility of being able to code the comparator, so I can use if conditions and such.

Is this possible? And if so, what's the best way of doing this? I've upgraded to the latest version, Solr 3.1, and of course for this use case I would expect to have to build from source in order to add custom code.

And/or, when using the score/boost/bf parameters etc., is it possible to use the score parameter in functions, say to scale it between 0 and 1?

Thanks
Mike
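For reference, the kind of comparator described here might look like the following in plain Java. This is only a sketch applied to a list of results that has already been fetched; it is not a Solr plug-in, and the Hit class, its fields, and the "typeId 1 goes first" rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: re-orders an in-memory result list, outside of Solr.
class RerankSketch {
    /** Hypothetical holder for one returned document. */
    static final class Hit {
        final String id;
        final float score;
        final int typeId;

        Hit(String id, float score, int typeId) {
            this.id = id;
            this.score = score;
            this.typeId = typeId;
        }
    }

    /** Example rule set: typeId 1 always first, then higher score, then id as tie-breaker. */
    static final Comparator<Hit> CUSTOM_ORDER = (a, b) -> {
        boolean aPinned = a.typeId == 1;
        boolean bPinned = b.typeId == 1;
        if (aPinned != bPinned) {
            return aPinned ? -1 : 1;
        }
        int byScore = Float.compare(b.score, a.score);  // descending score
        if (byScore != 0) {
            return byScore;
        }
        return a.id.compareTo(b.id);
    };

    /** Returns the ids of the hits in custom order without modifying the input list. */
    static List<String> rerank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(CUSTOM_ORDER);
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("a", 2.0f, 3), new Hit("b", 1.0f, 1), new Hit("c", 2.5f, 2));
        System.out.println(rerank(hits));
    }
}
```

The rules are kept strictly ordered (condition first, then score, then id) so that the comparator stays transitive; tolerance-style rules such as "treat scores within 0.1 as equal" would break the Comparator contract.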
Sort by function - 400 error
Using Solr 3.1. When I do:

sort=score desc
it works.

sort=product(typeId,2) desc (typeId is a valid attribute in the document)
it works.

sort=product(score,typeId) desc
fails with a 400 error.

sort=product(score,2) desc
fails too.

Must be something basic I'm missing? I tried adding fl=*,score too.

Thanks
Mike
Lower level filtering
Hi all,

I'm currently using Solr and I've got a question about filtering on a lower level than filter queries.

We want to be able to restrict the documents that can possibly be returned to a user's query. From another system we'll get a list of document unique ids for the user, which is all the documents that they can possibly see (i.e. a base index list as such). The criteria for what document ids get returned is going to be quite flexible. As the number of ids can be up to index size - 1 (i.e. thousands), a filter query doesn't seem right for a filter that large.

Can something be done at a lower level - perhaps at a Lucene level - as I understand Lucene starts from a bitset of possible documents it can return - could we AND this with a filter bitset returned from the other system? Would this be a good way forward?

And then how would you do this in Solr while still keeping the extra functionality Solr brings over Lucene? A new SearchHandler?

Thanks
Mike
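The AND idea in this question can be sketched with java.util.BitSet. This is a standalone illustration, not Lucene or Solr code: it assumes the external system's unique ids have already been mapped to internal document numbers, and the class and method names are made up.

```java
import java.util.BitSet;

// Illustrative only: plain java.util.BitSet, not a Lucene/Solr filter class.
class BitSetFilterSketch {
    /** Builds a bitset with one bit set per document number. */
    static BitSet fromDocIds(int[] docIds) {
        BitSet bits = new BitSet();
        for (int docId : docIds) {
            bits.set(docId);
        }
        return bits;
    }

    /** Returns the documents that both match the query and are allowed for the user. */
    static BitSet restrict(BitSet matches, BitSet allowed) {
        BitSet result = (BitSet) matches.clone();  // leave the inputs untouched
        result.and(allowed);
        return result;
    }

    public static void main(String[] args) {
        BitSet matches = fromDocIds(new int[] {1, 3, 5, 7});  // documents matching the query
        BitSet allowed = fromDocIds(new int[] {3, 4, 5});     // documents this user may see
        System.out.println(restrict(matches, allowed));
    }
}
```

The cost of the AND depends on the size of the index rather than on the length of the id list, which is why a bitset is attractive when the list runs to thousands of ids.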
RE: Lower level filtering
That was a quick response Steve! It all sounds great! Much appreciated. I definitely think specifying a bit filter is something that many people may find useful. I'll have a look at SOLR-2052 too.

Thanks again,
Mike

Date: Wed, 15 Dec 2010 09:57:54 -0500
Subject: Re: Lower level filtering
From: eelstretch...@gmail.com
To: solr-user@lucene.apache.org

On Wed, Dec 15, 2010 at 9:49 AM, Michael Owen michaelowe...@hotmail.com wrote:

I'm currently using Solr and I've got a question about filtering on a lower level than filter queries.

We want to be able to restrict the documents that can possibly be returned to a user's query. From another system we'll get a list of document unique ids for the user, which is all the documents that they can possibly see (i.e. a base index list as such). The criteria for what document ids get returned is going to be quite flexible. As the number of ids can be up to index size - 1 (i.e. thousands), a filter query doesn't seem right for a filter that large.

Can something be done at a lower level - perhaps at a Lucene level - as I understand Lucene starts from a bitset of possible documents it can return - could we AND this with a filter bitset returned from the other system? Would this be a good way forward?

And then how would you do this in Solr while still keeping the extra functionality Solr brings over Lucene? A new SearchHandler?

I actually submitted a patch a while ago in SOLR-2052 that allows you to specify a bit filter and a filter query (you could specify either, but not both). Otis pointed out that the patch can't be applied against the current source, so I need to go back and make it work with the current source (new job = no time). I'll see if I can find the time this weekend to do this.

Steve
--
Stephen Green
http://thesearchguy.wordpress.com
RE: Lower level filtering
Good point - though the inverse could be true, where only a few documents are allowed and then a big list still exists. Even in the middle ground, it's still going to be a long list of thousands.

Thanks
Mike

Date: Wed, 15 Dec 2010 14:58:33 +
Subject: Re: Lower level filtering
From: savvas.andreas.moysi...@googlemail.com
To: solr-user@lucene.apache.org

It might not be practical in your case, but is it possible to get from that other system a list of ids the user is *not* allowed to see and somehow invert the logic in the filter?

Regards,
-- Savvas.

On 15 December 2010 14:49, Michael Owen michaelowe...@hotmail.com wrote:

Hi all,

I'm currently using Solr and I've got a question about filtering on a lower level than filter queries.

We want to be able to restrict the documents that can possibly be returned to a user's query. From another system we'll get a list of document unique ids for the user, which is all the documents that they can possibly see (i.e. a base index list as such). The criteria for what document ids get returned is going to be quite flexible. As the number of ids can be up to index size - 1 (i.e. thousands), a filter query doesn't seem right for a filter that large.

Can something be done at a lower level - perhaps at a Lucene level - as I understand Lucene starts from a bitset of possible documents it can return - could we AND this with a filter bitset returned from the other system? Would this be a good way forward?

And then how would you do this in Solr while still keeping the extra functionality Solr brings over Lucene? A new SearchHandler?

Thanks
Mike
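Savvas's suggestion of inverting the logic could look roughly like this with java.util.BitSet. As before, this is a sketch under assumptions and not Solr code: the denied ids are taken to be internal document numbers already, maxDoc stands for the number of documents in the index, and the names are hypothetical.

```java
import java.util.BitSet;

// Illustrative only: derives an "allowed" bitset from a deny list.
class DenyListSketch {
    /** Builds the allowed set as every document number in [0, maxDoc) except the denied ones. */
    static BitSet allowedFromDenied(int[] deniedDocIds, int maxDoc) {
        BitSet allowed = new BitSet(maxDoc);
        allowed.set(0, maxDoc);        // start with every document visible
        for (int docId : deniedDocIds) {
            allowed.clear(docId);      // then remove what the user may not see
        }
        return allowed;
    }

    public static void main(String[] args) {
        System.out.println(allowedFromDenied(new int[] {1, 2}, 5));
    }
}
```

As Mike notes, this only shortens the list that has to be transferred when most documents are visible; the resulting bitset is the same size either way.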