Hi Dawid, if you don't want to create your own "marker filter", you can use KeywordMarkerFilter (http://goo.gl/OOgf4) instead of StopFilter. This works well and doesn't affect other filters, as long as you don't have stemming in your analysis chain. The trick is to pass the stop set to KeywordMarkerFilter instead of to StopFilter. KeywordMarkerFilter will then mark those tokens as keywords instead of removing them.
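
To make the difference between removing and marking concrete, here is a minimal, library-free sketch in plain Java. It does not use the Lucene API: StopMarkerSketch, Token, removeStops and markStops are hypothetical names invented for this illustration, and the boolean flag only stands in for the keyword marking described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustration only: a plain-Java model of "remove" vs. "mark", not Lucene code.
final class StopMarkerSketch {

    /** A term plus a flag, standing in for a token that carries a keyword-style attribute. */
    static final class Token {
        final String text;
        final boolean marked; // hypothetical flag: true if the term is in the stop set

        Token(String text, boolean marked) {
            this.text = text;
            this.marked = marked;
        }

        @Override
        public String toString() {
            return marked ? text + "[stop]" : text;
        }
    }

    /** StopFilter-like behavior: terms in the stop set are dropped from the stream. */
    static List<String> removeStops(List<String> terms, Set<String> stopSet) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            if (!stopSet.contains(term)) {
                out.add(term);
            }
        }
        return out;
    }

    /** Marker-like behavior: every term is kept, terms in the stop set are only flagged. */
    static List<Token> markStops(List<String> terms, Set<String> stopSet) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            out.add(new Token(term, stopSet.contains(term)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("the", "quick", "fox");
        Set<String> stops = Set.of("the", "a", "of");
        System.out.println(removeStops(terms, stops)); // [quick, fox]
        System.out.println(markStops(terms, stops));   // [the[stop], quick, fox]
    }
}
```

In a real analysis chain the flag would presumably be read back from the token stream through KeywordAttribute.isKeyword() rather than from a custom Token class like the one above.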
If you also have stemming, the easiest is to clone the source code of KeywordMarkerFilter and populate another attribute (a custom one like StopAttribute) with the same information.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Dawid Weiss [mailto:[email protected]]
> Sent: Tuesday, August 21, 2012 10:34 PM
> To: [email protected]
> Subject: Looking for a code pattern to pass stop words as an attribute
>
> Seeking advice.
>
> I have an application where I need to know which tokens are stop words. Most
> analyzers construct the token stream in a way that those tokens are filtered out
> -- this isn't what I need, I want them in, but marked somehow. The question is
> how to do it nicely and in a simple way, possibly reusing existing token filters? I
> had a few ideas but they all seem awkward -- let me know if I'm missing
> something obvious.
>
> Dawid
