I have a use case that generates some tokens containing punctuation (fractions and other numerical constructs). I handle most punctuation with WordDelimiterGraphFilter (WDGF), which decomposes those tokens into parts and then recombines them, so e.g. 1/2 becomes {1, 2, 12}. At first I thought I could mark those tokens as keywords to protect them from further analysis, but I discovered that WDGF ignores KeywordAttribute.
I have a workaround that uses Arabic-Indic digits as separators instead of punctuation (1/2 -> 1١2). These characters are classified as digits, so WDGF does not split on them, but someday I would like to support Arabic (or Hindi) numerals as well, and then this hack will bite me. Does it seem reasonable to update WDGF (and its cousin WDF, WordDelimiterFilter) to respect KeywordAttribute? I think it can be done with a very small change.
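
To illustrate why the separator workaround holds up, here is a small plain-Java check that needs no Lucene on the classpath. It assumes WDGF decides where to split based on Java's Unicode general categories, which I have not verified against every version. The class name and the helpers isDecimalDigit and encodeFraction are made up for this sketch and are not part of Lucene.

```java
public class SeparatorDigitCheck {

    // True if the code point is in the Unicode "decimal digit number" (Nd) category.
    // Assumption: this is the category the filter treats as a digit.
    static boolean isDecimalDigit(int codePoint) {
        return Character.getType(codePoint) == Character.DECIMAL_DIGIT_NUMBER;
    }

    // Hypothetical pre-processing step: replace the fraction slash with
    // U+0661 (ARABIC-INDIC DIGIT ONE) so the token is made up of digits only.
    static String encodeFraction(String token) {
        return token.replace('/', '\u0661');
    }

    public static void main(String[] args) {
        String encoded = encodeFraction("1/2");

        // '/' is punctuation, so a delimiter-based filter would split on it.
        System.out.println("'/' is a digit: " + isDecimalDigit('/'));

        // U+0661 is a decimal digit, so the encoded token has no delimiter left.
        System.out.println("U+0661 is a digit: " + isDecimalDigit(0x0661));

        // Every code point of the encoded token is now in category Nd.
        System.out.println("all digits: "
                + encoded.codePoints().allMatch(SeparatorDigitCheck::isDecimalDigit));
    }
}
```

The same check also shows the weakness: a genuine Arabic-Indic number would be indistinguishable from the separator, which is why honoring KeywordAttribute looks like the cleaner fix.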