On Tue, Apr 03, 2007 at 10:29:49AM -0700, Ryan King wrote:
> On 4/3/07, Jens Kraemer <[EMAIL PROTECTED]> wrote:
[..]
> >
> > The funny thing is that this does not necessarily mean that it doesn't
> > work as intended. Just for fun I wrote an analyzer that completely
> > ignores the input it should analyze, and always uses a fixed text
> > instead:
> >
> > class TestAnalyzer
> >   def token_stream field, input
> >     ts = LetterTokenizer.new("senseless standard text")
> >     puts "token_stream for :#{field} and input <#{input}>: #{ts.inspect}\n#{ts.text}"
> >     ts
> >   end
> > end
> >
> > a = TestAnalyzer.new
> > ts = a.token_stream :test, 'foo bar'
> > puts ts.text # 'senseless standard text', as expected
> >
> > pfa = PerFieldAnalyzer.new(StandardAnalyzer.new())
> > pfa[:test] = TestAnalyzer.new
> > ts = pfa.token_stream :test, 'foo bar'
> > puts ts.text # surprise: 'foo bar'
> >
> > I guess the pfa does not give the text to analyze via the token_stream
> > method, but sets it later by using the Tokenizer's text=() method.
>
> I don't think so. I've tried overriding #text=, but it never gets called.

OK, then it's happening somewhere else. In Ferret's analysis.c there's
a function, a_standard_get_ts, that clones an existing token stream
instance and calls a function named reset on it, passing the text to be
tokenized. I guess we'll need Dave's help to sort this out...
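
To make the guess concrete, here is a rough pure-Ruby model of what I
suspect is going on. It does not use Ferret at all: ModelTokenStream,
FixedTextAnalyzer and ModelPerFieldAnalyzer are made-up stand-ins, and the
clone-and-reset step is only my reading of analysis.c, not confirmed
behaviour.

```ruby
# Pure-Ruby model of the suspected behaviour. None of these classes are
# Ferret's; the names are hypothetical and exist only for illustration.
class ModelTokenStream
  attr_reader :text, :text_assignments

  def initialize(text)
    @text = text
    @text_assignments = 0
  end

  # Public setter, counted so we can see whether it is ever used.
  def text=(new_text)
    @text_assignments += 1
    @text = new_text
  end

  # Stand-in for the C-level reset: replaces the text directly,
  # bypassing the text= setter.
  def reset(new_text)
    @text = new_text
    self
  end
end

# Same idea as TestAnalyzer: ignores its input and uses a fixed text.
class FixedTextAnalyzer
  def token_stream(_field, _input)
    ModelTokenStream.new("senseless standard text")
  end
end

# Assumption: the wrapper clones whatever stream it gets back and
# resets the clone with the real input.
class ModelPerFieldAnalyzer
  def initialize(default)
    @default = default
    @analyzers = {}
  end

  def []=(field, analyzer)
    @analyzers[field] = analyzer
  end

  def token_stream(field, input)
    analyzer = @analyzers.fetch(field, @default)
    analyzer.token_stream(field, input).clone.reset(input)
  end
end

direct = FixedTextAnalyzer.new.token_stream(:test, 'foo bar')
puts direct.text

pfa = ModelPerFieldAnalyzer.new(FixedTextAnalyzer.new)
pfa[:test] = FixedTextAnalyzer.new
wrapped = pfa.token_stream(:test, 'foo bar')
puts wrapped.text
puts wrapped.text_assignments
```

If the real code works anything like this, it would explain both
observations: the wrapped stream ends up with the caller's input, and an
overridden #text= never fires because the text is swapped in by reset.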
Jens
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[EMAIL PROTECTED] | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk