Chris Hostetter wrote:
: My real use case is adding the trim filter to the pattern tokenizer.
: the 'correct' answer in my case is to update the offsets.
hmmm... wouldn't the "correct" thing to do in that case be to change your
pattern so it strips the whitespace when tokenizing? that way the offsets
of your tokens will be accurate from the beginning.
probably.... I'm just not very good at regex ;)
pattern="--|,|\s-\s|\(|\)"
this will split on "--", ",", " - ", "(", and ")". I can't figure out how to
build the pattern so it will trim each thing on the way out.
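
One possible way to get the trimming into the pattern itself is to wrap the existing alternatives in a non-capturing group and let optional whitespace match on either side, so the delimiter swallows the surrounding spaces. The sketch below is only an illustration, not a tested Solr configuration: it uses Python's re module to show the splitting behaviour, and the names TRIMMING_PATTERN and split_trimmed are made up for the example. The regex constructs it relies on (\s, (?:...), *) are also part of java.util.regex syntax.

```python
import re

# Assumed variant of the pattern above: the original alternatives are wrapped
# in a non-capturing group, with optional whitespace absorbed on both sides.
TRIMMING_PATTERN = re.compile(r"\s*(?:--|,|\s-\s|\(|\))\s*")


def split_trimmed(text):
    """Split on the delimiters and drop empty pieces (hypothetical helper)."""
    return [piece for piece in TRIMMING_PATTERN.split(text) if piece]


print(split_trimmed("apples -- pears , plums - figs (dried)"))
# prints ['apples', 'pears', 'plums', 'figs', 'dried']
```

Two caveats with this approach: whitespace at the very start or end of the input is not next to a delimiter, so it stays attached to the first or last token, and a bare "-" with no spaces around it (as in "a-b") is still not treated as a delimiter, same as with the original pattern.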