On 9/20/06, Jens Kraemer <[EMAIL PROTECTED]> wrote:
> Hi!
>
> On Wed, Sep 20, 2006 at 03:40:03PM +1000, Neville Burnell wrote:
> > Hi,
> >
> > I'm confused about managing field boosting ...
> >
> > I have set the :boost for the :name field in my docs to 10, via :boost
> > => 10
> >
> > Then I performed a search for 'keith' over all fields via
> > *:(keith*), expecting a doc with Keith in the :name field to come out on
> > top. But another doc with Keith mentioned in other fields (:comments,
> > :address) scored higher.
> >
> > I viewed the explanation from the searcher, but it wasn't clear to me
> > why the boost wasn't pushing the :name = Keith document to the top.
>
> As you can see from the explanation, the scores for both fields that
> matched the query got summed up (8... = sum of:). If 'keith' had only
> shown up in one field, the other document would have had the higher
> score.
>
> I don't know of any methodology to determine the proper boost setting
> for a field, imho it's just a question of experimenting with queries and
> the results you expect.
>
> If you always want to have matches in the name ranked on the top,
> regardless of how many times a term is mentioned in other parts of your
> document, set the boost to 100 ;-)
>
> I don't know what the coord value is, though. Maybe someone else can
> step in here?
>
> Jens

The coord factor is the number of clauses in a BooleanQuery that
matched, divided by the total number of clauses. It would seem that in
the example there were 48 clauses. When you submit a query over all
fields (i.e. "*:term"), the query is rewritten as a boolean query with a
clause for every field in your index. So it would seem that Neville has
48 fields in his index.
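
To make the arithmetic concrete, here is a minimal sketch in plain Ruby.
It does not call Ferret at all: the `coord` helper and the
`TOTAL_CLAUSES` constant are made up for illustration, and the 48 is
just the clause count inferred from the explanation above.

```ruby
# Hypothetical helper, not part of Ferret's API:
# coord = matched clauses / total clauses in the boolean query.
def coord(matched_clauses, total_clauses)
  return 0.0 if total_clauses.zero? # avoid dividing by zero on an empty query
  matched_clauses.to_f / total_clauses
end

# Assumption: one clause per field, 48 fields, as inferred in this thread.
TOTAL_CLAUSES = 48

one_field  = coord(1, TOTAL_CLAUSES) # e.g. 'keith' matched in :name only
two_fields = coord(2, TOTAL_CLAUSES) # e.g. 'keith' matched in :comments and :address

puts format("1 of %d clauses matched: coord = %.4f", TOTAL_CLAUSES, one_field)
puts format("2 of %d clauses matched: coord = %.4f", TOTAL_CLAUSES, two_fields)
```

Under this sketch, a document that matches in two fields gets twice the
coord of one that matches in a single field, on top of having two
per-field scores summed. That would work against a modest boost on
:name.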

Hope that makes sense,
Dave

PS: This might be a good time to mention that if you have an index
with a lot of fields like this, it is probably worth thinking about
what to set the :default_field and :all_fields parameters to.
:all_fields is what "*:#{query}" expands to. It doesn't necessarily
have to be all fields in the index. Usually you only want "*" to
expand to all text fields, not actually all fields. For example, you'd
probably want date fields to be excluded. And I've only just fixed
this so it will work when you use a Ferret::Index::Index object.
Previously the QueryParser had all fields in the index added to the
:all_fields parameter. Now that only happens if :all_fields isn't set
explicitly.
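
As a rough illustration of that fallback, here is a sketch in plain
Ruby. It is not Ferret's implementation and does not use the
QueryParser; `expand_star` is a hypothetical helper, and :created_on is
a made-up date field added next to the field names from this thread.

```ruby
# Hypothetical helper, not Ferret code: expand "*:term" into one clause
# per field. If all_fields is given explicitly it wins; otherwise every
# field in the index is used.
def expand_star(term, index_fields, all_fields = nil)
  fields = all_fields || index_fields
  fields.map { |field| "#{field}:#{term}" }
end

# Assumed field list for illustration only.
index_fields = [:name, :comments, :address, :created_on]

# No explicit :all_fields, so "*" covers every field, date field included.
default_clauses = expand_star("keith*", index_fields)

# Explicit :all_fields restricted to the text fields.
text_clauses = expand_star("keith*", index_fields, [:name, :comments, :address])

puts default_clauses.join(" OR ")
puts text_clauses.join(" OR ")
```

Fewer clauses also means a larger coord for each match, since the
divisor is the clause count.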