On 9/20/06, David Sheldon <[EMAIL PROTECTED]> wrote:
> David Balmain wrote:
>
> > I'm assuming the matriculation field is always going to be a number.
> > It won't change at a later date. So you can just set up the field
> > whenever you use it for the first time.
> >
> >     require 'rubygems'
> >     require 'ferret'
> >     i = Ferret::I.new
> >     puts i.field_infos
> >     if not i.field_infos[:matriculation]
> >       i.field_infos.add_field(:matriculation,
> >                               :index => :untokenized)
> >     end
> >     puts i.field_infos
> >     i << {:matriculation => 1978}
>
> Oh, I didn't really read this last time.
>
> It looks like this might be handy, though
> http://ferret.davebalmain.com/api/classes/Ferret/Index/Index.html only
> lists the IndexReader as having field_infos.
>
> How much overhead would it be to write an "add_value" method that is
> called, say, 10 times per doc, which will look up the field we're going to
> add in the index, and add it if it isn't already there?

Not a lot. It's a hash lookup, so it's fast, and it should be rare
(after a while at least) that new fields are added, i.e. it's probably
not going to happen for every document.
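
Something along these lines is roughly what I have in mind. It's only a
sketch: DocBuilder and add_value are made-up names, not part of Ferret,
and the only things it assumes about field_infos are the two calls used
in the snippet quoted above ([] and add_field). The :untokenized default
is also just carried over from that snippet.

```ruby
# Hypothetical helper (not part of Ferret): collects the values for one
# document and registers any field the index has not seen yet.
class DocBuilder
  attr_reader :doc

  # field_infos: any object responding to [] and add_field(name, options),
  # e.g. what i.field_infos returns in the snippet above.
  def initialize(field_infos, field_options = {:index => :untokenized})
    @field_infos = field_infos
    @field_options = field_options
    @doc = {}
  end

  def add_value(field, value)
    # One lookup per call; add_field only runs the first time a field
    # name shows up, which should become rare as the index grows.
    unless @field_infos[field]
      @field_infos.add_field(field, @field_options)
    end
    # Simplification: a second value for the same field overwrites the first.
    @doc[field] = value
    self
  end
end
```

You'd build one of these per document with DocBuilder.new(i.field_infos),
call add_value for each field, and then do i << builder.doc.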

> Is this what the old code did anyway?
>
> David

The old code created a completely new FieldInfos object for every
document you added to the index. It then merged the FieldInfos objects
when the documents were merged. In other words, it was a lot more
complex. This is one of the reasons for the API change. Even after
adding the add_value method, I'd guess that the newer version of
Ferret will still index a lot faster.

Cheers,
Dave
