Jens,

You are right, my code was not efficient; I agree with you.

The indices from which I create the suggestions index are not very big:
80 KB, 300 KB and 2 MB.

After 20 minutes, I get a suggestions index of approximately 1400 KB.

Thank you for your help,

Johan



----- Original Message ----- 
From: "Jens Kraemer" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, October 05, 2006 10:30 AM
Subject: Re: [Ferret-talk] search results autocompletion


> On Thu, Oct 05, 2006 at 07:58:40AM +0200, johan duflost wrote:
>>
>> Dear list,
>>
>> I'm using a text input field with autocompletion. The suggestions come
>> from a ferret index which is created by getting all the terms belonging
>> to other indices. Here is the code:
>>
>> class Suggestion
>>
>>   attr_accessor :term
>>
>>   def self.index(create)
>>     [Person, Project, Orgunit].each{|kl|
>>       terms = self.all_terms(kl)
>>       terms.each{|term|
>>         suggestion = Suggestion.new
>>         suggestion.term = term
>>         SUGGESTION_INDEX << suggestion.to_doc
>>       }
>>     }
>>     SUGGESTION_INDEX.optimize
>>   end
>>
>>   def self.all_terms(klass)
>>     reader = Index::IndexReader.new(Object.const_get(klass.name.upcase +
>> "_INDEX_DIR"))
>>     terms = []
>>     begin
>>     reader.field_names.each {|field_name|
>>     term_enum = reader.terms(field_name)
>>       begin
>>         term = term_enum.term()
>>         if !term.nil?
>>             if klass::SUGGESTIONABLE_FIELDS.include?(field_name)
>>               terms << term
>>             end
>>         end
>>       end while term_enum.next?
>>     }
>>     ensure
>>       reader.close
>>     end
>>     return terms
>>   end
>>
>>   def to_doc
>>     doc = {}
>>     doc[:term] = self.term
>>     return doc
>>   end
>>
>> end
>>
>>
>> It works very well, except that the indexing process takes a long time.
>> Does anybody know if there's a better way to do this?
>> Is there another way to get all the terms of an index?
>
> Nothing ferret-related, but at first look your code seems a
> bit inefficient: you check the SUGGESTIONABLE_FIELDS array for each
> term, instead of checking once per field and then going ahead. You could
> even just iterate over the SUGGESTIONABLE_FIELDS array and use the field
> names from there:
>
>   def self.all_terms(klass)
>     reader = Index::IndexReader.new(Object.const_get(klass.name.upcase +
> "_INDEX_DIR"))
>     terms = []
>     begin
>        klass::SUGGESTIONABLE_FIELDS.map { |field|
>          reader.terms(field)
>        }.each do |term_enum|
>          # term_enum.term should not be nil, so no need to check this.
>          terms << term_enum.term while term_enum.next?
>        end
>     ensure
>       reader.close
>     end
>     return terms
>   end
>
> If your SUGGESTIONABLE_FIELDS contains fields that are not in the index
> (yet), the reader.terms call might fail. In that case, using
> reader.terms(field) rescue nil
> and compacting the result of map before calling each should work.
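
The `rescue nil` plus `compact` idea above can be tried without a real index. This is only a minimal sketch: `FakeReader` and `term_enums` are hypothetical names invented for illustration, and the stub simply raises for unknown fields to imitate the failure described; it is not Ferret's actual behaviour.

```ruby
# Hypothetical stand-in for an index reader (not part of Ferret): #terms
# raises for a field it does not know, imitating the failure described above.
class FakeReader
  def initialize(terms_by_field)
    @terms_by_field = terms_by_field
  end

  def terms(field)
    # Hash#fetch raises KeyError for an unknown field.
    @terms_by_field.fetch(field).each
  end
end

# Collect one term enumerator per field, skipping fields whose lookup fails.
def term_enums(reader, fields)
  fields.map { |field| reader.terms(field) rescue nil }.compact
end

reader = FakeReader.new(:name => ["anna", "bert"], :title => ["ferret"])
enums = term_enums(reader, [:name, :missing, :title])
puts enums.length  # prints 2, the :missing field was skipped
```

Note that the `rescue` modifier swallows every StandardError, not only "unknown field" errors, so an unrelated failure in the lookup would be silently skipped as well.
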
>
> You could further save one iteration across all terms by yielding each
> term to a block that adds it to the index, like this:
>
> all_terms(klass) do |term|
>  INDEX << { :term => term }
> end
>
> all_terms should then do
> yield term_enum.term while term_enum.next?
> in the inner loop. For extra style points, rename all_terms to
> each_term :-)
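
Put together, the yielding variant suggested above might look roughly like the sketch below. `FakeTermEnum` is a hypothetical stand-in so the example runs without an index; it assumes the enumerator starts before its first term and that `next?` advances it, which may not match the real term enumerator. The plain array `index` stands in for the suggestions index.

```ruby
# Hypothetical stand-in for a term enumerator (not part of Ferret).
# Assumption: it starts before the first term, and next? advances it
# and reports whether a term is available.
class FakeTermEnum
  def initialize(terms)
    @terms = terms
    @pos = -1
  end

  def next?
    @pos += 1
    @pos < @terms.length
  end

  def term
    @terms[@pos]
  end
end

# Yield every term to the caller's block instead of collecting an array
# first, so the terms are only walked once.
def each_term(term_enums)
  term_enums.each do |term_enum|
    yield term_enum.term while term_enum.next?
  end
end

index = []  # plain array standing in for the suggestions index
each_term([FakeTermEnum.new(%w[anna bert]), FakeTermEnum.new(%w[ferret])]) do |term|
  index << { :term => term }
end
```

The caller decides what happens with each term, so no intermediate array of all terms has to be built and walked a second time.
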
>
>
>
> cheers,
> Jens
>
> -- 
> webit! Gesellschaft für neue Medien mbH          www.webit.de
> Dipl.-Wirtschaftsingenieur Jens Krämer       [EMAIL PROTECTED]
> Schnorrstraße 76                         Tel +49 351 46766  0
> D-01069 Dresden                          Fax +49 351 46766 66
> _______________________________________________
> Ferret-talk mailing list
> [email protected]
> http://rubyforge.org/mailman/listinfo/ferret-talk
> 

