RE: Custom Sorting

2011-04-20 Thread Michael Owen

Ok, thank you for the discussion. As I thought, it's not possible within 
performance limits.
I think the way to go is to document some more stats at index time, and use 
them in boost queries. :)
Thanks
Mike

 Date: Tue, 19 Apr 2011 15:12:00 -0400
 Subject: Re: Custom Sorting
 From: erickerick...@gmail.com
 To: solr-user@lucene.apache.org
 
 As I understand it, sorting by field is what caches are all
 about. You have a big list in memory of all of the terms for
 a field, indexed by Lucene doc ID so fetching the term to
 compare by doc ID is fast, and also why the caches need
 to be warmed, and why sort fields should be single-valued.
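
The cache Erick describes can be pictured as an array indexed by doc ID. The following is a minimal sketch of that idea in plain Java, not Lucene's actual cache classes; the class name `FieldCacheSortSketch` and the sample values are made up for illustration. It assumes one value per document, which is why sort fields should be single-valued.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch of a per-field cache; this is not Lucene's API.
final class FieldCacheSortSketch {
    // One slot per Lucene doc ID holding that document's single term for the sort field.
    private final String[] valueByDocId;

    FieldCacheSortSketch(String[] valueByDocId) {
        this.valueByDocId = valueByDocId;
    }

    // Sorts the matching doc IDs by cached field value.
    // Each comparison is two in-memory array reads, with no disk access.
    Integer[] sortByField(int[] matchingDocIds) {
        Integer[] docs = new Integer[matchingDocIds.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = matchingDocIds[i];
        }
        Arrays.sort(docs, Comparator.comparing((Integer doc) -> valueByDocId[doc]));
        return docs;
    }

    public static void main(String[] args) {
        // Building this array up front is the "warming" cost mentioned above.
        FieldCacheSortSketch cache =
            new FieldCacheSortSketch(new String[] {"pear", "apple", "fig", "kiwi"});
        System.out.println(Arrays.toString(cache.sortByField(new int[] {0, 1, 3})));
    }
}
```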
 
 If you try to do this yourself and fetch data from each document,
 you can incur a huge performance hit, since you'll be seeking
 all over your disk...
 
 Score is special though since it's transient. Internally, all Lucene
 has to do is keep track of the top N scores encountered where
 N is something like start + queryResultWindowSize, this
 latter from solrconfig.xml, with no seeks to disk at all...
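
Keeping only the top N scores can be sketched with a small min-heap. This is an illustration of the general technique, not Lucene's own collector; the class and method names are hypothetical, and a real collector would keep the doc ID alongside each score.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: track the N best scores seen so far in one pass.
final class TopNScoresSketch {
    static List<Float> topN(float[] scores, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the weakest score currently kept sits on top.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        for (float score : scores) {
            if (heap.size() < n) {
                heap.add(score);
            } else if (score > heap.peek()) {
                heap.poll();      // drop the weakest kept score
                heap.add(score);  // keep the better one
            }
        }
        List<Float> best = new ArrayList<>(heap);
        best.sort(Collections.reverseOrder()); // highest score first
        return best;
    }

    public static void main(String[] args) {
        // n plays the role of start + queryResultWindowSize.
        System.out.println(topN(new float[] {0.2f, 0.9f, 0.5f, 0.7f, 0.1f}, 3));
    }
}
```

Memory stays proportional to N rather than to the number of matching documents, which is why no per-document lookup is needed for a score sort.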
 
 Best
 Erick
 
 On Tue, Apr 19, 2011 at 2:50 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
  On 4/19/2011 1:43 PM, Jan Høydahl wrote:
 
  Hi,
 
  Not possible :)
  Lucene compares each matching document against the query and produces a
  score for each.
  Documents are not compared to each other like normal sort, that would be
  way too costly.
 
  That might be true for sort by 'score' (although even if you have all the
  scores, it still seems like some kind of sort must be necessary to see which
  comes first), but when you sort by a field value, which is also possible,
  Lucene must be doing some kind of 'normal sort' algorithm, no?  Ah, I guess
  it could just be using each term's position in the index, which is available
  in constant time, always kept track of in an index? Maybe, I don't know?
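
One way to read Jonathan's guess is that each document stores the position (ordinal) of its term in a sorted term list, so comparing two documents is an integer comparison. The sketch below shows only that idea in plain Java; it is not taken from Lucene's source, and all names in it are hypothetical.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical sketch: compare documents by the ordinal of their term.
final class TermOrdinalSketch {
    final String[] sortedTerms; // unique terms in sorted order
    final int[] ordByDocId;     // for each doc ID, the index of its term in sortedTerms

    TermOrdinalSketch(String[] valueByDocId) {
        sortedTerms = new TreeSet<>(Arrays.asList(valueByDocId)).toArray(new String[0]);
        ordByDocId = new int[valueByDocId.length];
        for (int doc = 0; doc < valueByDocId.length; doc++) {
            // Done once up front; later comparisons never touch the strings.
            ordByDocId[doc] = Arrays.binarySearch(sortedTerms, valueByDocId[doc]);
        }
    }

    // Constant-time comparison: two array reads and an int compare.
    int compareDocs(int docA, int docB) {
        return Integer.compare(ordByDocId[docA], ordByDocId[docB]);
    }
}
```

A sort over the matching documents is still needed; the ordinals only make each comparison cheap.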
 
 
 
  

Custom Sorting

2011-04-19 Thread Michael Owen

Hi,
I want to be able to have a custom sorting algorithm such that for each comparison 
of document results (A v B) I can rank them. i.e. writing a comparator like I 
would normally do in Java (Compares its two arguments for order. Returns a 
negative integer, zero, or a positive integer as the first argument is less 
than, equal to, or greater than the second).
In the comparator I want to be able to take into account the score of the 
results, as well as other fields in the documents.
I've looked at using things such as the score/boost/bf parameters etc.; however, I 
want the flexibility of being able to code the comparator, so I can use if 
conditions and such.
Is this possible? And if so what's the best way of doing this? I've upgraded to 
use the latest version of Solr 3.1, and of course for this use case would 
expect to have to build from source, in order to add custom source.
And/or, when using the score/boost/bf parameters etc - is it possible to use 
the score parameter in functions, to say scale it between 0 and 1?
Thanks
Mike
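
For reference, the kind of comparator described above might look like the following in plain Java. This is only a sketch of the comparison logic: the `Result` class, its fields, and the rule that `typeId` 1 comes first are hypothetical, and nothing here is wired into Solr. It sorts an in-memory list, for example a page of results that has already been returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of a comparator that mixes score with a document field.
final class CustomResultComparatorSketch {
    // Assumed holder for one search result; not a Solr class.
    static final class Result {
        final String id;
        final float score;
        final int typeId;

        Result(String id, float score, int typeId) {
            this.id = id;
            this.score = score;
            this.typeId = typeId;
        }
    }

    // Negative if a sorts before b, zero if equal, positive if a sorts after b.
    static final Comparator<Result> CUSTOM_ORDER = (a, b) -> {
        // Example "if condition": results with typeId 1 always come first.
        if (a.typeId == 1 && b.typeId != 1) return -1;
        if (b.typeId == 1 && a.typeId != 1) return 1;
        // Otherwise the higher score wins.
        return Float.compare(b.score, a.score);
    };

    public static void main(String[] args) {
        List<Result> page = new ArrayList<>(Arrays.asList(
            new Result("a", 0.9f, 2),
            new Result("b", 0.4f, 1),
            new Result("c", 0.7f, 2)));
        page.sort(CUSTOM_ORDER);
        for (Result r : page) {
            System.out.println(r.id + " score=" + r.score + " typeId=" + r.typeId);
        }
    }
}
```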



  

Sort by function - 400 error

2011-04-15 Thread Michael Owen

Using solr 3.1.
When I do:
sort=score desc
it works.
sort=product(typeId,2) desc (typeId is a valid attribute in document)
it works.
sort=product(score,typeId) desc
it fails with a 400 error. sort=product(score,2) desc fails too.
Am I missing something basic? I tried adding fl=*,score too.
Thanks
Mike




  

Lower level filtering

2010-12-15 Thread Michael Owen

Hi all,
I'm currently using Solr and I've got a question about filtering on a lower 
level than filter queries.
We want to be able to restrict the documents that can possibly be returned to a 
user's query. From another system we'll get a list of document unique ids for 
the user, which are all the documents that they can possibly see (i.e. a base 
index list as such). The criteria for what document ids get returned are going 
to be quite flexible. As the number of ids can be up to index size - 1 (i.e. 
thousands), a filter query doesn't seem right for a list that large.
Can something be done at a lower level - perhaps at a Lucene level - as I 
understand Lucene starts from a bitset of possible documents it can return - 
could we AND this with a filter bitset returned from the other system? Would 
this be a good way forward? 
And then how would you do this in Solr while still keeping the extra 
functionality Solr brings over Lucene? A new SearchHandler?
Thanks
Mike
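
The AND step asked about above can be sketched with `java.util.BitSet`. This shows only the set operation; the class and method names are hypothetical, and it assumes both bitsets are indexed by the same document numbers. The other system returns unique ids rather than Lucene doc IDs, so a mapping step would be needed first and is not shown.

```java
import java.util.BitSet;

// Hypothetical sketch: restrict query matches to the documents a user may see.
final class BitSetFilterSketch {
    // Returns a new bitset with only the bits set in both inputs.
    static BitSet restrict(BitSet matches, BitSet allowed) {
        BitSet result = (BitSet) matches.clone(); // leave the inputs untouched
        result.and(allowed);
        return result;
    }

    public static void main(String[] args) {
        BitSet matches = new BitSet();
        for (int doc : new int[] {1, 3, 5, 8}) matches.set(doc);

        BitSet allowed = new BitSet();
        for (int doc : new int[] {3, 4, 8}) allowed.set(doc);

        System.out.println(restrict(matches, allowed)); // documents 3 and 8 remain
    }
}
```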





  

RE: Lower level filtering

2010-12-15 Thread Michael Owen

That was a quick response Steve!
That all sounds great! Much appreciated. I definitely think specifying a bit filter is 
something that many people may find useful.

I'll have a look at Solr-2052 too.
Thanks again,
Mike

 Date: Wed, 15 Dec 2010 09:57:54 -0500
 Subject: Re: Lower level filtering
 From: eelstretch...@gmail.com
 To: solr-user@lucene.apache.org
 
 On Wed, Dec 15, 2010 at 9:49 AM, Michael Owen michaelowe...@hotmail.com 
 wrote:
  I'm currently using Solr and I've got a question about filtering on a lower 
  level than filter queries.
  We want to be able to restrict the documents that can possibly be returned 
  to a user's query. From another system we'll get a list of document unique 
  ids for the user which is all the documents that they can possibly see 
  (i.e. a base index list as such). The criteria for what document ids get 
  returned is going to be quite flexible. As the number of ids can be up to 
  index size - 1 (i.e. thousands) using a filter query doesn't seem right for 
  entering a filter query which is so large.
  Can something be done at a lower level - perhaps at a Lucene level - as I 
  understand Lucene starts from a bitset of possible documents it can return 
  - could we AND this with a filter bitset returned from the other system? 
  Would this be a good way forward?
  And then how would you do this in Solr with still keeping Solr's extra 
  functionality it brings over Lucene. A new SearchHandler?
 
 I actually submitted a patch a while ago in Solr-2052 that allows you
 to specify a bit filter and a filter query (you could specify either,
 but not both.)
 
 Otis pointed out that the patch can't be applied against the current
 source, so I need to go back and make it work with the current source
 (new job = no time).  I'll see if I can find the time this weekend to
 do this.
 
 Steve
 -- 
 Stephen Green
 http://thesearchguy.wordpress.com
  

RE: Lower level filtering

2010-12-15 Thread Michael Owen

Good point - though the inverse could be true, where only a few documents are 
allowed and then a big list still exists. Even in the middle ground, it's still 
going to be a long list of thousands.

Thanks
Mike


 Date: Wed, 15 Dec 2010 14:58:33 +
 Subject: Re: Lower level filtering
 From: savvas.andreas.moysi...@googlemail.com
 To: solr-user@lucene.apache.org
 
 It might not be practical in your case, but is it possible to get from that
 other system, a list of ids the user is *not* allowed to see and somehow
 invert the logic in the filter?
 
 Regards,
 -- Savvas.
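
A sketch of what the inverted filter could look like as a negative filter query string. The field name `id`, the sample ids, and the helper itself are assumptions for illustration; the ids are assumed to need no query escaping, and a very long list may run into the `maxBooleanClauses` limit in solrconfig.xml.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: build a filter query that excludes the hidden ids.
final class DenyListFilterSketch {
    static String denyFilter(String idField, List<String> hiddenIds) {
        if (hiddenIds.isEmpty()) {
            return "*:*"; // nothing is hidden, so match all documents
        }
        // Leading '-' negates the clause: everything except the listed ids.
        return "-" + idField + ":(" + String.join(" OR ", hiddenIds) + ")";
    }

    public static void main(String[] args) {
        // The resulting string would be passed as the fq parameter.
        System.out.println(denyFilter("id", Arrays.asList("doc7", "doc9")));
        System.out.println(denyFilter("id", Collections.<String>emptyList()));
    }
}
```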
 
 On 15 December 2010 14:49, Michael Owen michaelowe...@hotmail.com wrote:
 
 
  Hi all,
  I'm currently using Solr and I've got a question about filtering on a lower
  level than filter queries.
  We want to be able to restrict the documents that can possibly be returned
  to a user's query. From another system we'll get a list of document unique
  ids for the user which is all the documents that they can possibly see (i.e.
  a base index list as such). The criteria for what document ids get returned
  is going to be quite flexible. As the number of ids can be up to index size
  - 1 (i.e. thousands) using a filter query doesn't seem right for entering a
  filter query which is so large.
  Can something be done at a lower level - perhaps at a Lucene level - as I
  understand Lucene starts from a bitset of possible documents it can return -
  could we AND this with a filter bitset returned from the other system? Would
  this be a good way forward?
  And then how would you do this in Solr with still keeping Solr's extra
  functionality it brings over Lucene. A new SearchHandler?
  Thanks
  Mike