If this is primarily an issue with the document input, as opposed to queries, you might be better off simply preprocessing the text before it is given to Lucene to be indexed.
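
A minimal sketch of that preprocessing step, assuming the markup always has the form `link:` followed by a run of non-whitespace characters, as in the `link:km` example quoted below. The class name, method name, and pattern are hypothetical and not part of Lucene; the cleaned string is what would then be handed to the analyzer.

```java
import java.util.regex.Pattern;

class WikiLinkPreprocessor {
    // Assumed markup shape: the literal marker "link:" plus the non-whitespace run after it.
    // The leading \b keeps words that merely end in "link" (e.g. "hyperlink:km") intact.
    private static final Pattern LINK_MARKUP = Pattern.compile("\\blink:\\S+");

    // Removes every "link:xxx" occurrence, then collapses the leftover whitespace
    // (note that this also turns line breaks into single spaces).
    static String stripLinkMarkup(String text) {
        String withoutLinks = LINK_MARKUP.matcher(text).replaceAll(" ");
        return withoutLinks.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // Prints: distance from the city
        System.out.println(stripLinkMarkup("distance link:km from the city"));
    }
}
```

Because the text is cleaned before analysis, neither "link" nor "km" ever reaches the token stream, so no custom filter is needed for the indexing side.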

-- Jack Krupansky

-----Original Message----- From: Furkan KAMACI
Sent: Wednesday, February 26, 2014 1:37 PM
To: java-user@lucene.apache.org
Subject: Re: How to delete a token that comes exactly after a token

Hi;

I'm parsing a wiki dump file. It contains some special markup. For
example:

link:km

When I parse the text I get the tokens "link" and "km". I want to
remove "link", since it is a stopword in my case. However, I also want to
remove "km" whenever it comes directly after the token "link". If there is
no such implementation, could I contribute a patch for it?
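
The behaviour described above can be sketched on a plain list of tokens. This is only an illustration under stated assumptions: the class and method names are hypothetical, and in Lucene the same state machine would have to live in a custom TokenFilter's incrementToken() rather than operate on a List. The sketch also assumes that a marker directly following another marker is itself treated as a marker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FollowingTokenFilterSketch {
    // Drops every occurrence of `marker` and the single token right after it.
    static List<String> dropMarkerAndNext(List<String> tokens, String marker) {
        List<String> kept = new ArrayList<>();
        boolean skipNext = false; // true while the previous token was the marker
        for (String token : tokens) {
            if (token.equals(marker)) {
                skipNext = true;  // drop the marker and remember to drop its follower
                continue;
            }
            if (skipNext) {
                skipNext = false; // drop exactly one token after the marker
                continue;
            }
            kept.add(token);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("distance", "link", "km", "from", "city");
        // Prints: [distance, from, city]
        System.out.println(dropMarkerAndNext(tokens, "link"));
    }
}
```

A "km" that does not follow "link" is kept, which is the difference from simply adding "km" to the stopword list.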

Thanks;
Furkan KAMACI


2014-02-26 17:36 GMT+02:00 Jack Krupansky <j...@basetechnology.com>:

Sounds like a custom filter.

Or maybe an option for stop filter or a specialization of stop filter.

Or maybe it could be even more generalized.

What are some practical example token sequences?

-- Jack Krupansky

-----Original Message----- From: Furkan KAMACI
Sent: Wednesday, February 26, 2014 9:48 AM
To: java-user@lucene.apache.org
Subject: How to delete a token that comes exactly after a token

Hi;

How can I delete the token that comes immediately after a given token,
along the lines of StopwordFilter?

Thanks;
Furkan KAMACI

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



