[ 
https://issues.apache.org/jira/browse/LUCENE-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bernhard Kraft updated LUCENE-3236:
-----------------------------------

    Attachment: scan.pdf

Attached is a proposed solution for adding keyword awareness to EVERY 
applicable class. Either only the classes that really need to operate on 
keywords would have to be changed, or, alternatively, only the classes that 
should NOT operate on keywords would get wrapped.

The mechanism for routing keywords around the realTokenStream (by using a 
token buffer) still has to be worked out in detail.
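
To make the wrapping idea concrete, here is a minimal sketch of one possible reading of it. It is not the design in scan.pdf and it does not use Lucene's TokenStream API: Token, bypassKeywords and lowerCaseAll are hypothetical names, and the buffer here holds the run of non-keyword tokens rather than the keywords themselves, which is a simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical model of the wrapping idea; no name here comes from Lucene or scan.pdf.
class KeywordBypassSketch {

    /** A term plus the flag that KeywordAttribute.isKeyword() would report. */
    static final class Token {
        final String term;
        final boolean keyword;

        Token(String term, boolean keyword) {
            this.term = term;
            this.keyword = keyword;
        }
    }

    /** Wraps a filter so keyword tokens go around it; everything else goes through it. */
    static UnaryOperator<List<Token>> bypassKeywords(UnaryOperator<List<Token>> wrapped) {
        return tokens -> {
            List<Token> out = new ArrayList<>();
            List<Token> buffer = new ArrayList<>();      // pending run of non-keyword tokens
            for (Token t : tokens) {
                if (t.keyword) {
                    out.addAll(wrapped.apply(buffer));   // flush the run through the filter
                    buffer = new ArrayList<>();
                    out.add(t);                          // keyword is emitted untouched
                } else {
                    buffer.add(t);
                }
            }
            out.addAll(wrapped.apply(buffer));           // flush whatever is left
            return out;
        };
    }

    /** An unmodified, keyword-unaware filter: lowercases every token it sees. */
    static List<Token> lowerCaseAll(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.term.toLowerCase(Locale.ROOT), t.keyword));
        }
        return out;
    }

    /** Runs a fixed sample through the wrapped filter and returns the resulting terms. */
    static List<String> demo() {
        List<Token> in = Arrays.asList(
            new Token("Apple", true), new Token("Pie", false),
            new Token("AND", false), new Token("IBM", true));
        UnaryOperator<List<Token>> wrapped = bypassKeywords(KeywordBypassSketch::lowerCaseAll);
        List<String> terms = new ArrayList<>();
        for (Token t : wrapped.apply(in)) {
            terms.add(t.term);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [Apple, pie, and, IBM]
    }
}
```

The wrapped filter needs no changes, which is the appeal of the approach. The cost in this simplified form is that the filter only ever sees one run of tokens at a time, so a filter that depends on context across a keyword would behave differently.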

> Make LowerCaseFilter and StopFilter keyword aware, similar to PorterStemFilter
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-3236
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3236
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 4.0-ALPHA
>         Environment: N/A
>            Reporter: Sujit Pal
>            Priority: Minor
>              Labels: analysis
>             Fix For: 4.7
>
>         Attachments: lucene-3236-patch.diff, scan.pdf
>
>
> PorterStemFilter has functionality to detect if a term has been marked as a 
> "keyword" by the KeywordMarkerFilter (KeywordAttribute.isKeyword() == true), 
> and if so, skip stemming.
> The suggestion is to have the same functionality in other filters where it is 
> applicable. I think it may be particularly applicable to the LowerCaseFilter 
> (i.e., if it is a keyword, leave its case alone) and StopFilter (if it is a 
> keyword, don't filter it out even if it looks like a stop word).
> Backward compatibility is maintained (in both cases) by adding a new 
> constructor which takes an additional boolean parameter ignoreKeyword. The 
> current constructor will call this new constructor with ignoreKeyword = false.
> Patches are attached (for LowerCaseFilter and StopFilter).
> I have verified that the analysis JUnit tests run against the updated code, 
> i.e., backward compatibility is maintained.
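
The constructor scheme described in the issue can be illustrated with a small, self-contained sketch. It deliberately does not reproduce the attached patch or Lucene's TokenFilter API: Token, LowerCase and Stop are hypothetical stand-ins, and the only detail taken from the issue is that the existing constructor delegates to a new one with ignoreKeyword = false.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in model; this is NOT Lucene's TokenStream/TokenFilter API.
class KeywordAwareFilterSketch {

    /** A term plus the flag that KeywordAttribute.isKeyword() would report. */
    static final class Token {
        final String term;
        final boolean keyword;

        Token(String term, boolean keyword) {
            this.term = term;
            this.keyword = keyword;
        }
    }

    /** Lowercases terms; with ignoreKeyword = true, keyword tokens keep their case. */
    static final class LowerCase {
        private final boolean ignoreKeyword;

        LowerCase() { this(false); }                 // existing constructor: old behavior
        LowerCase(boolean ignoreKeyword) { this.ignoreKeyword = ignoreKeyword; }

        List<Token> apply(List<Token> in) {
            List<Token> out = new ArrayList<>();
            for (Token t : in) {
                boolean skip = ignoreKeyword && t.keyword;
                out.add(skip ? t : new Token(t.term.toLowerCase(Locale.ROOT), t.keyword));
            }
            return out;
        }
    }

    /** Drops stop words; with ignoreKeyword = true, keyword tokens are never dropped. */
    static final class Stop {
        private final Set<String> stopWords;
        private final boolean ignoreKeyword;

        Stop(Set<String> stopWords) { this(stopWords, false); }   // old behavior
        Stop(Set<String> stopWords, boolean ignoreKeyword) {
            this.stopWords = stopWords;
            this.ignoreKeyword = ignoreKeyword;
        }

        List<Token> apply(List<Token> in) {
            List<Token> out = new ArrayList<>();
            for (Token t : in) {
                boolean keep = (ignoreKeyword && t.keyword) || !stopWords.contains(t.term);
                if (keep) {
                    out.add(t);
                }
            }
            return out;
        }
    }

    /** Runs a fixed sample through both filters and returns the surviving terms. */
    static List<String> demo(boolean ignoreKeyword) {
        List<Token> in = Arrays.asList(
            new Token("the", true), new Token("Who", true),   // marked as keywords
            new Token("AND", false), new Token("the", false), new Token("Band", false));
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "and"));
        List<Token> filtered = new Stop(stopWords, ignoreKeyword)
            .apply(new LowerCase(ignoreKeyword).apply(in));
        List<String> terms = new ArrayList<>();
        for (Token t : filtered) {
            terms.add(t.term);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // [who, band]
        System.out.println(demo(true));  // [the, Who, band]
    }
}
```

With ignoreKeyword = false both filters treat every token alike, which matches the pre-existing behavior. With ignoreKeyword = true the two keyword tokens survive untouched, including "the", which would otherwise be removed as a stop word.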



