On 9/16/06, Olivier Siohan <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I'm trying to define my own analyzer by doing something like:
>
> #-----------------------------------------------------
> require 'ferret'
> include Ferret
>
> class MyAnalyzer < Analysis::Analyzer
>   def token_stream(field, str)
>
>     # Display results of analysis
>     puts 'Analyzing: field:%s str:%s' % [field, str]
>     t = Analysis::LowerCaseFilter.new(Analysis::StandardTokenizer.new(str))
>     while true
>       n = t.next()
>       break if n == nil
>       puts n.to_s
>     end
>
>     return Analysis::LowerCaseFilter.new(Analysis::StandardTokenizer.new(str))
>   end
> end
>
>
> puts '== Adding document to index...'
> index = Index::Index.new(:analyzer => MyAnalyzer.new())
> index << { :content => "The quick brown fox" }
> index << { :content => "The cow jumps over the moon" }
>
> puts '== Searching Brown...'
> index.search_each('content:Brown') do |doc, score|
>   puts "Document #{doc} found with a score of #{score}"
> end
>
> puts '== Searching Foo...'
> index.search_each('content:Foo') do |doc, score|
>   puts "Document #{doc} found with a score of #{score}"
> end
>
> puts '== Searching Brown...'
> index.search_each('content:Brown') do |doc, score|
>   puts "Document #{doc} found with a score of #{score}"
> end
>
> puts '== Searching Cow...'
> index.search_each('content:Cow') do |doc, score|
>   puts "Document #{doc} found with a score of #{score}"
> end
> #-----------------------------------------------------
>
> The output is:
> == Adding document to index...
> Analyzing: field:content str:
> Analyzing: field:content str:
> == Searching Brown...
> Analyzing: field:content str:Brown
> token["brown":0:5:1]
> Document 0 found with a score of 0.5
> == Searching Foo...
> == Searching Brown...
> Document 0 found with a score of 0.5
> == Searching Cow...
> Document 1 found with a score of 0.375
>
> The result is correct, i.e. documents are retrieved as expected.
> However, I don't understand why I don't see my 'Analyzing...' comment
> with the corresponding string being analyzed, except when searching
> for 'Brown', and why I'm getting an empty string in 'Analyzing:
> field:content str:' when the 2 documents are pushed into the index.
>
> Any explanations? I apologize if this is a trivial issue; I'm quite
> new to Ferret/Lucene. I use ferret-0.10.4 under Linux.
>
> Many thanks.
>
> -- Olivier

Hi Olivier,

This is a bug I came across recently. It's fixed in the working
version. However, if you need it to work right away, take out the
inheritance from Analysis::Analyzer. It makes Ferret think you are
passing a C-implemented Analyzer.
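
A minimal sketch of that workaround, reusing the class and filter names
from the original post: the analyzer becomes a plain Ruby class with no
Ferret superclass and simply responds to token_stream(field, str). The
begin/rescue LoadError guard and the defined?(Ferret) check are additions
for illustration only, so the file still loads where the gem is not
installed. That the debug line then shows the real string at indexing time
on 0.10.4 is what the advice above implies, not something verified here.

```ruby
# Load Ferret if it is available; the guard is an illustrative addition.
begin
  require 'ferret'
rescue LoadError
  # Gem not installed: the class below still loads, the demo is skipped.
end

# Plain Ruby class: note there is no "< Analysis::Analyzer" here.
class MyAnalyzer
  def token_stream(field, str)
    # Debug output, as in the original post
    puts 'Analyzing: field:%s str:%s' % [field, str]
    Ferret::Analysis::LowerCaseFilter.new(
      Ferret::Analysis::StandardTokenizer.new(str))
  end
end

if defined?(Ferret)
  index = Ferret::Index::Index.new(:analyzer => MyAnalyzer.new)
  index << { :content => "The quick brown fox" }
  index << { :content => "The cow jumps over the moon" }

  index.search_each('content:Brown') do |doc, score|
    puts "Document #{doc} found with a score of #{score}"
  end
end
```

Fully qualified names (Ferret::Analysis::...) are used instead of
"include Ferret" so the sketch does not depend on the module being mixed
into the top level.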

The next gem will be out soon.

Cheers,
Dave
_______________________________________________
Ferret-talk mailing list
http://rubyforge.org/mailman/listinfo/ferret-talk