Hi Paul,

On 10/19/2011 at 5:26 AM, Paul Taylor wrote:
> On 18/10/2011 15:25, Steven A Rowe wrote:
> > On 10/18/2011 at 4:57 AM, Paul Taylor wrote:
> > > On 18/10/2011 06:19, Steven A Rowe wrote:
> > > > Another option is to create a char filter that substitutes
> > > > PUNCT-EXCLAMATION for exclamation points, PUNCT-PERIOD for periods,
> > > > etc.,
> > >
> > > Yes that is how I first did it
> >
> > No, I don't think you did. When I say "char filter" I'm referring to
> > CharFilter [snip]
>
> If you look at the code you can see I do use a CharFilter: [snip]

I apologize, you're obviously right, I hadn't looked at your code.

> > If you go with a CharFilter, you can give it access to the entire input
> > at once, and use a regular expression (or something like it) to assess
> > the input and then behave accordingly.
>
> Well this is the problem, you cant use a regular expression or even if
> you did would that really slow things down wouldn't it, seeing as 99%
> dont need the transformation.

PatternReplaceCharFilter might do the trick - maybe worth a test to see if it's performant enough?

Steve
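
P.S. A rough sketch of the "assess the input, then behave accordingly" idea, using only java.util.regex rather than Lucene itself. The class name, method name, and the choice of "!" and "." as the only punctuation handled are illustrative assumptions, not taken from your code. Input with no match is returned untouched, so the common case costs a single regex scan. A real CharFilter would also have to correct token offsets, which this sketch ignores.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: shows the substitution logic only.
class PunctSubstitutionSketch {
    // Assumption: only '!' and '.' are substituted in this sketch.
    private static final Pattern PUNCT = Pattern.compile("[!.]");

    static String substitute(String input) {
        Matcher m = PUNCT.matcher(input);
        if (!m.find()) {
            // Fast path: nothing to transform, return the input as-is.
            return input;
        }
        StringBuffer out = new StringBuffer(input.length() + 32);
        do {
            // Pick the placeholder name for the matched character.
            String name = m.group().equals("!") ? "PUNCT-EXCLAMATION" : "PUNCT-PERIOD";
            m.appendReplacement(out, name);
        } while (m.find());
        m.appendTail(out);
        return out.toString();
    }
}
```

Timing something like this over a representative sample of your input should give a first idea of whether the regex cost matters before wiring up PatternReplaceCharFilter.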