Re: How do TeeTokenizer and SinkTokenizer work?

Koji Sekiguchi Sat, 23 Aug 2008 01:48:03 -0700

Hi Kurosaka-san,

I'd written an article on my blog several month ago about SinkTokenizer
and TeeTokenFilter.


See:
http://lucene.jugem.jp/?eid=172

Sorry, but all written in Japanese...

Koji


Teruhiko Kurosaka wrote:
> Hello,
> I'm interested in knowing how these tokenizers work together.
> The API doc for TeeTokenizer
> http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/analysis/TeeTokenFilter.html
>
> has this sample code:
> SinkTokenizer sink1 = new SinkTokenizer(null);
> SinkTokenizer sink2 = new SinkTokenizer(null);
>
> TokenStream source1 = new TeeTokenFilter(new TeeTokenFilter(new 
> WhitespaceTokenizer(reader1), sink1), sink2);
> TokenStream source2 = new TeeTokenFilter(new TeeTokenFilter(new 
> WhitespaceTokenizer(reader2), sink1), sink2);
>
> TokenStream final3 = new EntityDetect(sink1);
> TokenStream final4 = new URLDetect(sink2);
>
> with an explanation that reads "sink1 and sink2 will both get tokens from 
> both reader1 and reader2 after whitespace tokenizer",
> but I don't understand how the input from reader1 and reader2 are mixed 
> together.
> Will sink1 first reaturn the reader1 text, and reader2?
> Or are they mixed randomly?
>
> -Kuro
>  
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: How do TeeTokenizer and SinkTokenizer work?

Reply via email to