[ 
https://issues.apache.org/jira/browse/LUCENE-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated LUCENE-1077:
------------------------------------

    Attachment: LUCENE-1077.patch

This is a fairly trivial start to this, but it creates the sinks package in 
the contrib/Analysis section and adds a simple TokenRangeSinkTokenizer and 
test.  This can be used to siphon off tokens whose position in the stream 
falls in a given range.  All it does is count the tokens that go by and add 
those that fall in the range.  It might be useful for documents that you know 
have a certain structure, for instance, if you know the first 5 tokens of 
your docs are X.
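
As a rough illustration of the counting idea described above (not the code in 
the attached patch), here is a minimal, self-contained sketch.  The class name 
TokenRangeSinkSketch, the use of plain strings instead of Lucene tokens, and 
the choice of an inclusive lower bound and exclusive upper bound are all 
assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the sink described above; it does not use Lucene's classes.
class TokenRangeSinkSketch {
    private final int lower;   // first position kept (inclusive; an assumption)
    private final int upper;   // first position no longer kept (exclusive; an assumption)
    private int count = 0;     // number of tokens seen so far
    private final List<String> kept = new ArrayList<>();

    TokenRangeSinkSketch(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    // Called once per token going by; keeps the token only if its position is in range.
    void add(String token) {
        if (count >= lower && count < upper) {
            kept.add(token);
        }
        count++;
    }

    // Tokens siphoned off so far, in stream order.
    List<String> tokens() {
        return kept;
    }

    public static void main(String[] args) {
        // Keep only the first 5 tokens of the stream.
        TokenRangeSinkSketch sink = new TokenRangeSinkSketch(0, 5);
        for (String token : "the quick brown fox jumps over the lazy dog".split(" ")) {
            sink.add(token);
        }
        System.out.println(sink.tokens()); // prints [the, quick, brown, fox, jumps]
    }
}
```

Every token is counted whether or not it is kept, so the range refers to 
positions in the original stream rather than positions in the sink.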

More to follow.

> Analysis Sinks package
> ----------------------
>
>                 Key: LUCENE-1077
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1077
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Analysis, contrib/*
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1077.patch
>
>
> With the advent of the new TeeTokenFilter and SinkTokenizer, there are now 
> some interesting new things that can be done in the analysis phase of 
> indexing.  See LUCENE-1058.
> This patch provides some new implementations of SinkTokenizer that may be 
> useful.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.