On 7/8/06, Benjamin Krause <[EMAIL PROTECTED]> wrote:
> Hey David,
>
> thanks for the answer ..
>
> > How about setting the boost for the whole document rather than just
> > the :relevance field? Or do you sometimes want to sort by relevance
> > without taking the :relevance field into account?
>
> ah.. you mean i should boost each field of the document? or is there a
> way to set a boost level for the document as a whole? if so, i've missed
> it ..

doc = Ferret::Document::Document.new()
doc.boost = 100.0

> > PS: While we are on the topic, how would you like the sort API to
> > look? Many have complained that the sort API is too java-like but
> > no-one has suggested any improvements yet. I'd love to see some ideas.
>
> i like the idea of giving a short block with a sort algorithm.. i would
> like to see something like that:
>
> index.search ( :query => my_query,
>                # some calculation returning new_score
>                :sort  => lambda { |doc| new_score },
>                :reverse => false,
>                :filter => false,
>                :start => 0,
>                :limit => 10 )

The way sort works at the moment is that it caches all fields that are
sorted on. If you sort like this, you have to load every document in
the result set, which would be a huge performance hit. I guess I could
make this feature available though.

In the pure Ruby version of Ferret you can do this;

    st_length = SortField::SortType.new("length", lambda{|str| str.length})
    sf = SortField.new("content", {:sort_type => st_length,
                               :reverse => true,
                               :comparator => lambda{|i,j| j <=> i}})

The sort type lambda is used to build the sort cache: it is called
once per value of the field being sorted on. The comparator then
compares two cached values. This is flexible while remaining
performant, although I still think I can make it more intuitive.
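
To make the two-step idea concrete, here is a rough sketch in plain Ruby with no Ferret involved. The names (DOCS, build_sort_cache, sorted_doc_ids) are made up for illustration and are not part of Ferret's API. The point is only that the sort type lambda runs once per document to fill a cache, and the comparator then works on the cached values alone:

```ruby
# Hypothetical stand-in for an index: doc id => stored "content" field.
DOCS = {
  0 => "a short one",
  1 => "a considerably longer piece of content",
  2 => "mid-sized content",
}

# Step 1: run the sort type lambda once per document and cache the result.
def build_sort_cache(docs, sort_type)
  docs.each_with_object({}) { |(id, str), cache| cache[id] = sort_type.call(str) }
end

# Step 2: order doc ids using only the cached values and the comparator.
def sorted_doc_ids(cache, comparator)
  cache.keys.sort { |a, b| comparator.call(cache[a], cache[b]) }
end

by_length  = lambda { |str| str.length }
descending = lambda { |i, j| j <=> i }

LENGTH_CACHE  = build_sort_cache(DOCS, by_length)
LONGEST_FIRST = sorted_doc_ids(LENGTH_CACHE, descending)
```

Nothing in step 2 touches the documents themselves, which is why this stays cheap compared to a block that receives each doc.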

> alternatively you should be able to give the sort param the name of a
> field, like ':sort => :score' or an array of fields like ':sort => [
> :score, :title ]' and sort by the first element and then by the 2nd if
> the two or more docs share the same value for the 1st element.
> I guess something like ":sort => :score" is enough for most people ..

Actually, you can already do this. Have you tried it? The only catch
is that :score is treated as a field name, so to sort by relevance
you'd have to do this;

    index.search_each(query, :sort => [SortField::RELEVANCE, :title, :price])
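
In case it helps to see what the array form means: results are ordered by the first entry, and later entries only break ties. A plain-Ruby sketch, assuming relevance sorts highest score first. The Hit struct and the sample data are made up, and nothing here is Ferret API:

```ruby
# Hypothetical search hits; not a Ferret class.
Hit = Struct.new(:score, :title, :price)

HITS = [
  Hit.new(0.5, "Zebra", 10),
  Hit.new(0.9, "Mango", 30),
  Hit.new(0.9, "Apple", 20),
  Hit.new(0.9, "Apple", 5),
]

# Highest score first, then title, then price; each later key is
# consulted only when all earlier keys are equal.
SORTED_HITS = HITS.sort_by { |h| [-h.score, h.title, h.price] }
```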


> i think the other options are almost like it is implemented right now ..
> i don't think you need the SortField class.
>
> btw.. i do find the filter API not really intuitive, actually i didn't
> understand it at all ;)
>
> i know what you want to do with filters and how you want to get there,
> but i haven't found any understandable documentation, on how to build
> one ..
>
> maybe you should write a short tutorial on how to write a filter.. i
> would find it very intuitive, to have something like a base_query.. like
> having one query to filter/limit results, and have another query to do
> the real search..

I will. The TermEnum and TermDocEnum are essential for using filters
and they've undergone major changes so I'll hold off on this until I
get the next release out.

> and btw.. one feature i would definitely like to see is to limit
> the search on a number of fields..
>
> i know i can write something like
>
> field_one:"search string" || field_two:"search string" ||
> field_three:"search string" || field_four:"search string"
>
> but i would like to be able to write something like
>
> (field_one|field_two|field_three|field_four):"search string"

You can do this already, just get rid of the brackets;

    field_one|field_two|field_three|field_four:"search string"
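
To spell out the equivalence: the shorthand should match the same documents as the long form you wrote above. Here is a toy helper that does the expansion on the query string. expand_fields is a made-up name, not a Ferret method, and it only handles a single field list followed by one term:

```ruby
# Hypothetical helper, not part of Ferret. Expands
#   a|b:"phrase"   into   a:"phrase" || b:"phrase"
def expand_fields(query)
  # Split at the first colon only, so colons inside the term survive.
  fields, term = query.split(":", 2)
  fields.split("|").map { |f| "#{f}:#{term}" }.join(" || ")
end

EXPANDED = expand_fields('field_one|field_two:"search string"')
```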

> furthermore, you should be able to say something like .. search in all
> fields, except field_one .. like
>
> (*|!field_one):"search string"

You can't do this, but it is a nice idea. I'll think about it. I might
also add the brackets into the syntax.


Anyway, thanks for your feedback Ben. I will definitely use it.

Cheers,
Dave
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
