[ https://issues.apache.org/jira/browse/LUCENE-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll updated LUCENE-1077:
------------------------------------
Attachment: LUCENE-1077.patch
This is a fairly trivial start on this, but it creates the sinks package in
the contrib/Analysis section and adds a simple TokenRangeSinkTokenizer and
test. This can be used to siphon off tokens that fall in a range. All it does
is count the tokens that go by and add those whose position falls in the
range. It might be useful for documents that you know have a certain
structure, for instance, if you know the first 5 tokens of your docs are X.
More to follow.
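
To illustrate the counting idea described above, here is a minimal standalone
sketch. It is not the code from the attached patch and it deliberately uses no
Lucene classes: tokens are modeled as plain strings, and the class name
TokenRangeSink, its methods, and the choice of a 0-based range with an
inclusive lower bound and exclusive upper bound are all assumptions made for
this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: a hypothetical stand-in for the idea behind
// TokenRangeSinkTokenizer, with tokens modeled as plain strings.
class TokenRangeSink {
    private final int lower;  // first position kept, 0-based, inclusive (assumed convention)
    private final int upper;  // first position no longer kept, exclusive (assumed convention)
    private final List<String> kept = new ArrayList<>();
    private int count = 0;    // number of tokens that have gone by so far

    TokenRangeSink(int lower, int upper) {
        if (lower < 0 || upper < lower) {
            throw new IllegalArgumentException("need 0 <= lower <= upper");
        }
        this.lower = lower;
        this.upper = upper;
    }

    // Called once per token in stream order; keeps the token only if its
    // position falls inside [lower, upper), then advances the counter.
    void add(String token) {
        if (count >= lower && count < upper) {
            kept.add(token);
        }
        count++;
    }

    // Read-only view of the tokens that were siphoned off.
    List<String> tokens() {
        return Collections.unmodifiableList(kept);
    }

    // Total number of tokens seen, whether kept or not.
    int seen() {
        return count;
    }

    public static void main(String[] args) {
        // Keep only the first 5 tokens of a document.
        TokenRangeSink sink = new TokenRangeSink(0, 5);
        for (String t : "report 2007 12 04 draft the quick brown fox".split(" ")) {
            sink.add(t);
        }
        System.out.println(sink.tokens());
    }
}
```

In the real setting the sink would be fed by a TeeTokenFilter during analysis
rather than by a loop over strings; the sketch only shows the range check.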
> Analysis Sinks package
> ----------------------
>
> Key: LUCENE-1077
> URL: https://issues.apache.org/jira/browse/LUCENE-1077
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Analysis, contrib/*
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 2.3
>
> Attachments: LUCENE-1077.patch
>
>
> With the advent of the new TeeTokenFilter and SinkTokenizer, there are now
> some interesting new things that can be done in the analysis phase of
> indexing. See LUCENE-1058.
> This patch provides some new implementations of SinkTokenizer that may be
> useful.