Re: sorting question

2007-03-24 Thread shai deljo

True, but let me ask the question in a different way.
The problem is that when I run the query and order by date, the
most recent results are not relevant enough (in general I find I need
to do work on top of what Solr provides in order to get good
relevancy), so I guess I'm looking more for a threshold that retrieves
only results above a certain score, and I need this threshold to be
adaptive. I.e., it's not about the number of results to retrieve, since I
want as many as possible so I have a better chance of getting the most
recent ones, but more about getting all the results that are relevant
enough.
When I display results sorted by score this is not a problem, because
all the less relevant results hide on page number X (X is big).

I can think of several hacks (e.g., calculating the distribution of
results myself) to do this, but was wondering if there is a proper
solution.
Thx
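
One of the client-side hacks described in this message could look like the minimal sketch below: keep only the hits whose score is at least some fraction of the best score, then sort the survivors by date. It assumes the results have already been fetched as dicts with `score` and `date` keys. Those key names, the function name and the 0.5 ratio are illustrative assumptions, not anything Solr provides.

```python
from datetime import date

def recent_relevant(docs, ratio=0.5):
    """Keep docs scoring at least `ratio` * best score, newest first."""
    if not docs:
        return []
    # The cutoff adapts to each result set because it is relative to the top score.
    cutoff = ratio * max(d["score"] for d in docs)
    kept = [d for d in docs if d["score"] >= cutoff]
    return sorted(kept, key=lambda d: d["date"], reverse=True)

docs = [
    {"id": "a", "score": 4.0, "date": date(2007, 1, 5)},
    {"id": "b", "score": 3.1, "date": date(2007, 3, 20)},
    {"id": "c", "score": 0.4, "date": date(2007, 3, 23)},
]
print([d["id"] for d in recent_relevant(docs)])  # ['b', 'a']
```

A relative cutoff is only one possible choice; a cutoff based on the score distribution (for example, a gap between consecutive scores) would slot into the same place.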

On 3/23/07, Chris Hostetter [EMAIL PROTECTED] wrote:


: Is there a way (in 1 query) to retrieve the best scoring X results and
: then sort them by another field (date for example)?

not at the moment.

keep in mind, this is the type of thing that can be done easily on the
client side -- pull back the top X results sorted by score, then sort by
date.



-Hoss
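
The client-side approach suggested here can be sketched as follows. The sketch assumes the hits are already in memory as dicts with `score` and `date` keys (hypothetical names), and that dates are ISO-formatted strings so that plain string comparison orders them chronologically.

```python
import heapq

def top_x_by_date(docs, x):
    """Take the x best-scoring docs, then order them newest first."""
    best = heapq.nlargest(x, docs, key=lambda d: d["score"])
    # ISO dates (YYYY-MM-DD) sort correctly as strings.
    return sorted(best, key=lambda d: d["date"], reverse=True)

docs = [
    {"id": 1, "score": 2.5, "date": "2007-01-10"},
    {"id": 2, "score": 1.9, "date": "2007-03-01"},
    {"id": 3, "score": 0.2, "date": "2007-03-22"},
]
print([d["id"] for d in top_x_by_date(docs, 2)])  # [2, 1]
```

If the query is already sorted by score and limited with rows=X, the `heapq.nlargest` step is redundant and only the final sort is needed.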




Multiple Fields syntax

2007-03-15 Thread shai deljo

A question about the syntax:
Does it support the exclude (-) syntax?

e.g.

q=title:photoshop+OR+description:photoshop;score+desc&version=2.2&start=0&rows=170&indent=on&fl=*,score

Will return documents with photoshop in the title and/or description.

will this query:
q=title:photoshop-adobe+OR+description:photoshop;score+desc&version=2.2&start=0&rows=170&indent=on&fl=*,score

return documents that have photoshop but NOT adobe in the title and/or
photoshop in the description?

Thanks,
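
For illustration only: in Lucene query syntax a leading - prohibits a term only when it starts a clause, so the intended query would probably need a space before -adobe and field grouping, rather than photoshop-adobe written as one token. The sketch below builds such a URL with Python's standard library. It assumes the standard handler and the ";score desc" sort syntax used in this message; whether the result matches what the poster wants also depends on the field analyzers, which are not shown in the thread.

```python
from urllib.parse import urlencode, parse_qs

params = {
    # title must not contain adobe; the grouping keeps -adobe scoped to the title field
    "q": "title:(photoshop -adobe) OR description:photoshop;score desc",
    "version": "2.2",
    "start": 0,
    "rows": 170,
    "indent": "on",
    "fl": "*,score",
}
query_string = urlencode(params)  # spaces become '+', reserved characters are percent-encoded
print("/solr/select/?" + query_string)
```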


DisMax Question.

2007-03-14 Thread shai deljo

Hi,
I am trying to use the DisMax handler in order to search multiple fields,
but I don't get results back.

Assume the fields I want to search on are abc, def, ghi and
jkl; I changed solrconfig.xml to this:
<requestHandler name="dismax" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">
      abc^2 def ghi^0.1 jkl
    </str>
  </lst>
</requestHandler>

When I run my query with qt=dismax I don't get results:

/solr/select/?qt=dismax&q=blablabla%3BRelevanc+desc&version=2.2&start=0&rows=1&indent=on&fl=*,score

When I remove qt=dismax I do get results back:

/solr/select/?q=blablabla%3BRelevanc+desc&version=2.2&start=0&rows=1&indent=on&fl=*,score

What am I doing wrong?
Thanks,
S


Re: DisMax Question.

2007-03-14 Thread shai deljo

Thanks, that did the trick, but now I have another problem (there is
very little documentation).
It fails when I try to boost terms in the query, i.e.
I get results for:

qt=dismax&q=blabla&version=2.2&start=0&rows=1&indent=on&fl=*,score&debugQuery=true&sort=length_seconds+desc

but no results for:

qt=dismax&q=blabla^1.5&version=2.2&start=0&rows=1&indent=on&fl=*,score&debugQuery=true&sort=length_seconds+desc

Doesn't DisMax support term boosting?

What is the alternative to DisMax then? Something like this:

q=field1:bla^0.1 fo OR field2:blabla^3;score+desc&version=2.2&start=0&rows=170&indent=on&fl=*,score&debugQuery=on

?

On 3/14/07, Chris Hostetter [EMAIL PROTECTED] wrote:


: When I run my query with qt=dismax I don't get results:
:
: /solr/select/?qt=dismax&q=blablabla%3BRelevanc+desc&version=2.2&start=0&rows=1&indent=on&fl=*,score
:
: When I remove qt=dismax I do get results back:
:
: /solr/select/?q=blablabla%3BRelevanc+desc&version=2.2&start=0&rows=1&indent=on&fl=*,score

dismax does not use the ; syntax for sorting ... this is the one useful
piece of documentation i ever managed to put in the wiki about dismax...

http://wiki.apache.org/solr/DisMaxRequestHandler

...if you add debugQuery=true to your URL it will give you a bunch of
great debugging info that would have pointed out you were actually getting
a query for the terms "blablabla;Relevanc" and "desc" across all of those
fields.

...also, is "Relevanc" the name of a field you have? because if you are
trying to sort by score that's not right for either handler ... you need
"score desc", either after the ; for standard, or in the sort param
for dismax.


-Hoss
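
The difference described in this reply can be sketched with two small URL builders. The /solr/select/ path and the qt=dismax handler name are taken from the thread; the function names are made up for illustration.

```python
from urllib.parse import urlencode

def standard_url(query, sort="score desc", **extra):
    # Standard handler (as used in this thread): the sort follows a ';' inside q.
    return "/solr/select/?" + urlencode({"q": f"{query};{sort}", **extra})

def dismax_url(query, sort="score desc", **extra):
    # DisMax: q holds only the user's terms; sorting goes in its own sort param.
    return "/solr/select/?" + urlencode(
        {"qt": "dismax", "q": query, "sort": sort, **extra}
    )

print(standard_url("blablabla", rows=1, fl="*,score"))
print(dismax_url("blablabla", rows=1, fl="*,score"))
```

Building the query string with `urlencode` also avoids hand-encoding the ; as %3B.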




Re: Question About Boosting.

2007-03-12 Thread shai deljo

Buckets it is :)
Thx

On 3/12/07, Chris Hostetter [EMAIL PROTECTED] wrote:


: I thought about this option but it doesn't sound scalable. What
: happens if i have 100 words with 100 different boost factors?

then you've got a problem :)

typically it's not this severe ... i'll frequently have half a dozen
fields that i divide text up into to boost on different amounts, but i'm
having a hard time understanding why you would need 100 unique boost
factors for 100 unique words ... putting things in buckets tends to be
effective.



-Hoss
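
A minimal sketch of the bucket idea, under stated assumptions: each word carries a numeric weight computed elsewhere, and the index has three text fields named text_high, text_mid and text_low. Those field names, the thresholds and the boost values are all hypothetical.

```python
def bucket_fields(weighted_words, edges=((2.0, "text_high"), (1.0, "text_mid"))):
    """Group words into a few fields by weight; anything lower goes to text_low."""
    doc = {"text_high": [], "text_mid": [], "text_low": []}
    for word, weight in weighted_words.items():
        for threshold, field in edges:
            if weight >= threshold:
                doc[field].append(word)
                break
        else:
            # No threshold matched, so the word lands in the lowest bucket.
            doc["text_low"].append(word)
    return {field: " ".join(words) for field, words in doc.items()}

doc = bucket_fields({"photoshop": 3.0, "tutorial": 1.2, "video": 0.3})
# One boost per bucket, written in the same style as the qf setting quoted in this archive.
qf = "text_high^3 text_mid^1.5 text_low^0.5"
print(doc)
print(qf)
```

With a handful of buckets, 100 distinct weights collapse into a few field-level boosts, which is the trade-off the reply recommends.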




Re: Question About Boosting.

2007-03-11 Thread shai deljo

Thanks,
The only way I found to do this
(http://www.mail-archive.com/solr-user@lucene.apache.org/msg02456.html)
is to hack and repeat the word several times in the field, but
doesn't this screw up the norms?
Also, how do I boost words in a query? E.g. q=key1 key2 and I know
key2 is twice as important as key1 (searching 1 field).
Thanks,
S.
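
For the query-time part of the question, Lucene query syntax lets a term carry a boost with ^, as in key2^2. The helper below is a small sketch that renders such a query from a dict of weights; the function name is made up, and it assumes the query goes to the standard handler.

```python
def boosted_query(weights):
    """Render {'term': boost} as a Lucene-style query, omitting boosts of 1."""
    parts = []
    for term, boost in weights.items():
        parts.append(term if boost == 1 else f"{term}^{boost:g}")
    return " ".join(parts)

print(boosted_query({"key1": 1, "key2": 2}))  # key1 key2^2
```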

On 3/11/07, Walter Underwood [EMAIL PROTECTED] wrote:

Back up another step. What are the documents and what do you
want to show to the users? Have you tried the default configuration
with real user queries?

After you've tested it with user queries, then look at the
results where the ranking isn't performing well.

Lucene and Solr already automatically boost rare terms over
common terms, using tf.idf weighting.

I posted more detail on this in my blog last summer:

http://wunderwood.org/most_casual_observer/2006/06/good_to_great_search.html

wunder
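
To illustrate the point about rare terms, here is a textbook tf-idf calculation. It is a simplified stand-in, not Lucene's actual scoring formula, and the tiny corpus is invented for the example.

```python
import math

def tf_idf(term, doc, corpus):
    """Textbook tf-idf: term frequency times log inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    return tf * math.log(len(corpus) / df) if df else 0.0

corpus = [
    ["adobe", "photoshop", "tutorial"],
    ["photoshop", "tips"],
    ["photoshop", "lens", "flare"],
]
doc = corpus[2]
print(tf_idf("photoshop", doc, corpus))  # appears in every document, so the weight is 0.0
print(tf_idf("flare", doc, corpus))      # appears in one document, so the weight is positive
```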

On 3/10/07 8:04 PM, shai deljo [EMAIL PROTECTED] wrote:

 I have elements within a field that have different importance.
 I thought boosting would be an elegant way to take this into account.
 Please advise,


 On 3/10/07, Walter Underwood [EMAIL PROTECTED] wrote:
 What are you trying to achieve? Let's start with the problem
 instead of picking one solution which Solr doesn't support. --wunder

 On 3/10/07 5:08 PM, shai deljo [EMAIL PROTECTED] wrote:

 How can i boost some tokens over others in the same field (at Index
 time) ? If this is not supported directly, what's the best way around
 this problem (what's the hack to solve this :) ).
 Thanks,
 Shai






Question About Boosting.

2007-03-10 Thread shai deljo

How can I boost some tokens over others in the same field (at index
time)? If this is not supported directly, what's the best way around
this problem (what's the hack to solve this :) )?
Thanks,
Shai


Ranking Question.

2007-03-08 Thread shai deljo

Hi,
Maybe a trivial/stupid question, but:
I have a fairly simple schema with a title, tags and description.
I have my own ranking/scoring system that takes into account the
similarity of each tag to a term in the query, but now that I also want to
include the title and description (the description is somewhere
between short and moderate in length) I am not sure how to handle this.
For example, would parsing the description and title before indexing
in Solr and adding them as tags make sense? It sounds like that
would replicate the stop words, stemming, etc. mechanisms built into
Lucene.
My goal in the end is to change as little as possible in the retrieval
process but then be able to rank based on the keywords extracted from the
entire document.
Any ideas / directions?
Thanks
Shai


Overriding Ranking in solr

2007-02-25 Thread shai deljo

Is there a way to add a plug-in to override the ranking in Solr? E.g.,
my ranking is based on distance from a geo location provided in the
query.
Thanks,
Shai
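
As a point of comparison, distance-based ranking can also be applied on the client after the results come back. The sketch below is not a Solr plug-in; it assumes each returned document carries `lat` and `lon` values (hypothetical field names) and re-ranks by great-circle distance using the haversine formula.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def rank_by_distance(docs, lat, lon):
    """Order docs from nearest to farthest relative to the query point."""
    return sorted(docs, key=lambda d: haversine_km(lat, lon, d["lat"], d["lon"]))

docs = [
    {"id": "nyc", "lat": 40.71, "lon": -74.01},
    {"id": "sf", "lat": 37.77, "lon": -122.42},
]
# Query point: roughly San Jose, California.
print([d["id"] for d in rank_by_distance(docs, 37.34, -121.89)])  # ['sf', 'nyc']
```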