theshoeshiner commented on PR #450:
URL: https://github.com/apache/commons-text/pull/450#issuecomment-1830915425
@elharo @garydgregory
I converted the cases API to use StringTokenizer as suggested by @elharo. In
general this works out fine and greatly simplifies those classes, but it
requires two other changes that I'd like your comments on:
1. `org.apache.commons.text.TokenStringifier` - This is essentially the
inverse of the `StringTokenizer`: it lets the Case implementations control
output formatting. It's a fairly simple class, but I don't see anything
existing that really fits the bill. It includes the `TokenFormatter` interface,
which cases can implement to customize the formatting:
```java
public interface TokenFormatter {
    String format(char[] prior, int tokenIndex, char[] token);
}
```
If there's an alternative to adding these new classes, I'm all for it; I just
tried to keep it simple but clear.
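
To make the idea concrete, here is a rough sketch of how a case could plug
into such a formatter. This is not the code in the PR: the `stringify` helper,
the `CAMEL` and `KEBAB` formatters, and the class name are made up for
illustration, and I'm assuming `prior` is the previous token (`null` for the
first one) and `tokenIndex` is zero-based.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TokenFormatterSketch {

    /** Interface as quoted above; the parameter meanings used below are assumptions. */
    public interface TokenFormatter {
        String format(char[] prior, int tokenIndex, char[] token);
    }

    /** Hypothetical camelCase formatter: lowercase every token, capitalize all but the first. */
    public static final TokenFormatter CAMEL = (prior, tokenIndex, token) -> {
        String lower = new String(token).toLowerCase(Locale.ROOT);
        if (tokenIndex == 0 || lower.isEmpty()) {
            return lower;
        }
        // Work on code points so a supplementary first letter is handled whole.
        int first = lower.codePointAt(0);
        return new StringBuilder()
                .appendCodePoint(Character.toUpperCase(first))
                .append(lower, Character.charCount(first), lower.length())
                .toString();
    };

    /** Hypothetical kebab-case formatter: lowercase tokens joined with '-'. */
    public static final TokenFormatter KEBAB = (prior, tokenIndex, token) -> {
        String lower = new String(token).toLowerCase(Locale.ROOT);
        return tokenIndex == 0 ? lower : "-" + lower;
    };

    /** Hypothetical stand-in for the stringifier: formats each token and concatenates. */
    public static String stringify(List<String> tokens, TokenFormatter formatter) {
        StringBuilder out = new StringBuilder();
        char[] prior = null; // assumed: no prior token before the first one
        for (int i = 0; i < tokens.size(); i++) {
            char[] token = tokens.get(i).toCharArray();
            out.append(formatter.format(prior, i, token));
            prior = token;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("my", "VAR", "name");
        System.out.println(stringify(tokens, CAMEL)); // myVarName
        System.out.println(stringify(tokens, KEBAB)); // my-var-name
    }
}
```

The point of the sketch is only that the same token list can be rendered by
different cases without the cases knowing how the tokens were produced.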
2. `org.apache.commons.text.matcher.AbstractStringMatcher.UppercaseMatcher`
- This new matcher powers the Pascal/Camel case implementations. The wrinkle is
that the Matcher API doesn't really support matching on dynamic-length
patterns, i.e. the `size()` method is expected to return the same value
regardless of the match. This doesn't work with Unicode characters because
not every code point that fits the matcher has the same width (a code point
can take one or two `char`s). For now I'm simply throwing
`UnsupportedOperationException` from that method, since it's not needed for
the Cases API. This means the matcher wouldn't work if used with the
`StringSubstitutorReader` (which I think uses the `size()` method to control
buffering?).
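
A small standalone illustration of the width problem, using only
`java.lang.Character`. The class and the `matchLength` method are hypothetical
and are not the matcher in the PR; they just show that an "uppercase" match can
consume either one or two `char`s, so no single constant `size()` describes it.

```java
public class UppercaseWidthSketch {

    /**
     * Returns how many chars an uppercase match at {@code pos} would consume,
     * or 0 if the code point at that position is not uppercase.
     */
    public static int matchLength(char[] buffer, int pos) {
        int codePoint = Character.codePointAt(buffer, pos);
        return Character.isUpperCase(codePoint) ? Character.charCount(codePoint) : 0;
    }

    public static void main(String[] args) {
        char[] ascii = "Ab".toCharArray();
        // U+10400 DESERET CAPITAL LETTER LONG I is uppercase and needs a surrogate pair.
        char[] deseret = new StringBuilder()
                .appendCodePoint(0x10400)
                .append('b')
                .toString()
                .toCharArray();

        System.out.println(matchLength(ascii, 0));   // 1
        System.out.println(matchLength(deseret, 0)); // 2
        System.out.println(matchLength(ascii, 1));   // 0
    }
}
```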