-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of David Balmain
Sent: Wednesday, 20 September 2006 9:50 PM
To: [email protected]
Subject: Re: [Ferret-talk] Understanding boost ?

On 9/20/06, Jens Kraemer <[EMAIL PROTECTED]> wrote:
> Hi!
>
> On Wed, Sep 20, 2006 at 03:40:03PM +1000, Neville Burnell wrote:
> > Hi,
> >
> > I'm confused about managing field boosting ...
> >
> > I have set the :boost for the :name field in my docs to 10, via 
> > :boost => 10
> >
> > Then I performed a search for 'keith' over all fields with 
> > *:(keith*), expecting a doc with Keith in the :name field to come 
> > out on top. But another doc with Keith mentioned in other fields 
> > (:comments, :address) scored higher.
> >
> > I viewed the explanation from the searcher, but it wasn't clear to 
> > me why the boost wasn't pushing the :name = Keith document to the
> > top.
>
> As you can see from the explanation, the scores for both fields that 
> matched the query were summed (8... = sum of:). If 'keith' had shown 
> up in only one field, the other document would have had the higher 
> score.
>
> I don't know of any methodology to determine the proper boost setting 
> for a field, imho it's just a question of experimenting with queries 
> and the results you expect.
>
> If you always want to have matches in the name ranked on the top, 
> regardless of how many times a term is mentioned in other parts of 
> your document, set the boost to 100 ;-)
>
> I don't know what the coord value is, though, maybe someone else can 
> step in here ?
>
> Jens

The coord factor is the number of clauses in a BooleanQuery that matched
divided by the total number of clauses in that query. It would seem that
in the example there were 48 clauses. When you submit a query over all
fields (i.e. "*:term"), the query is rewritten as a boolean query with a
clause for every field in your index. So it would seem that Neville has
48 fields in his index.
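
As a rough illustration of that ratio (a sketch only: `coord` here is a
hypothetical helper, not a Ferret method, and the matched-clause counts
are assumed from the thread rather than read off the actual explanation
output):

```ruby
# Hypothetical helper, not part of Ferret's API: the coord factor is the
# fraction of a BooleanQuery's clauses that a document matched.
def coord(matched_clauses, total_clauses)
  matched_clauses.to_f / total_clauses
end

TOTAL_CLAUSES = 48 # one clause per field, as inferred above

# Assumed match counts: one field (:name) versus two fields
# (:comments and :address).
name_only  = coord(1, TOTAL_CLAUSES)
two_fields = coord(2, TOTAL_CLAUSES)

puts format("name only:  %.4f", name_only)
puts format("two fields: %.4f", two_fields)
```

Other things being equal, a document matching in two fields gets twice
the coord factor of one matching in a single field, which works against
a boost applied to that single field.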

Hope that makes sense,

Dave

PS: This might be a good time to mention that if you have an index with
a lot of fields like this, it is probably worth thinking about what to
set the :default_field and :all_fields parameters to.
:all_fields is what "*:#{query}" expands to. It doesn't necessarily have
to be all fields in the index. Usually you only want "*" to expand to
all text fields, not actually all fields. For example, you'd probably
want date fields to be excluded. And I've only just fixed this so it
will work when you use a Ferret::Index::Index object.
Previously the QueryParser had all fields in the index added to the
:all_fields parameter. Now that only happens if :all_fields isn't set
explicitly.
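
To make the expansion concrete, here is a minimal sketch of what "*"
amounts to for a given field list. The `expand_all_fields` helper and
the field list are assumptions for illustration; the real QueryParser
builds query objects rather than strings.

```ruby
# Hypothetical illustration only: Ferret's QueryParser builds query
# objects internally; this just spells out the equivalent clauses as text.
def expand_all_fields(term, all_fields)
  all_fields.map { |field| "#{field}:#{term}" }.join(" OR ")
end

# Assumed field list: text fields only, with date fields left out.
text_fields = [:name, :comments, :address]

puts expand_all_fields("keith*", text_fields)
```

With three fields listed, the boolean query has three clauses instead of
one per field in the index, so the coord denominator shrinks to match.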
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk