theshoeshiner commented on PR #450:
URL: https://github.com/apache/commons-text/pull/450#issuecomment-1830915425

   @elharo @garydgregory 
   
   I converted the cases API to use StringTokenizer as suggested by @elharo. 
This works well and greatly simplifies those classes, but it requires two 
other changes that I'd like your comments on:
   
   1. `org.apache.commons.text.TokenStringifier` - This is essentially the 
inverse of the `StringTokenizer`: it allows the Case implementations to control 
output formatting. It's a fairly simple class, but I don't see anything else 
that really fits the bill. It includes the `TokenFormatter` interface, which 
cases can implement to customize the formatting:
   ```
   public interface TokenFormatter {
       String format(char[] prior, int tokenIndex, char[] token);
   }
   ```
   If there's an alternative to using these new classes I'm all for it; I just 
tried to keep it simple and clear.
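
   To make the intent of `TokenFormatter` concrete, here is a minimal sketch of a kebab-case formatter. It is illustrative only and is not the code in the PR: `KEBAB` and `join` are hypothetical names, and it assumes that `tokenIndex` is zero-based, that `prior` is the previous token (null before the first one), and that the returned string is appended to the output as-is.

   ```java
   import java.util.Locale;

   public class TokenFormatterSketch {

       // Interface as quoted in this comment, redeclared here so the sketch compiles alone.
       public interface TokenFormatter {
           String format(char[] prior, int tokenIndex, char[] token);
       }

       // Hypothetical kebab-case formatter: lowercases every token and puts a
       // '-' in front of every token except the first (assumes a zero-based index).
       public static final TokenFormatter KEBAB = (prior, tokenIndex, token) -> {
           String lower = new String(token).toLowerCase(Locale.ROOT);
           return tokenIndex == 0 ? lower : "-" + lower;
       };

       // Hypothetical stand-in for the stringifier loop: formats tokens in order
       // and concatenates the results.
       public static String join(TokenFormatter formatter, String... tokens) {
           StringBuilder out = new StringBuilder();
           char[] prior = null; // assumption: there is no prior token before the first
           for (int i = 0; i < tokens.length; i++) {
               char[] token = tokens[i].toCharArray();
               out.append(formatter.format(prior, i, token));
               prior = token;
           }
           return out.toString();
       }

       public static void main(String[] args) {
           System.out.println(join(KEBAB, "My", "Token", "List")); // my-token-list
       }
   }
   ```

   Under those assumptions, a case implementation only has to supply the per-token rule, and the delimiter handling stays out of the tokenizing code.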
   
   2. `org.apache.commons.text.matcher.AbstractStringMatcher.UppercaseMatcher` 
- This new matcher powers the Pascal/Camel case implementations. The wrinkle is 
that the Matcher API doesn't really support matching on variable-length 
patterns, i.e. the size() method is expected to return the same value 
regardless of the match. This doesn't work with Unicode characters because 
not every code point that fits the matcher has the same width in chars. For 
now I'm simply throwing UnsupportedOperationException from that method, since 
it's not needed for the Cases API. This means the matcher won't work if used 
with the StringSubstitutorReader (which I think uses the size() method to 
control buffering?).
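
   To illustrate the width problem, here is a small sketch using only the JDK. `uppercaseMatchLength` is a hypothetical helper, not the matcher in the PR; it returns the number of chars occupied by an uppercase code point, which is 1 for `A` but 2 for a supplementary character such as U+10400 (DESERET CAPITAL LETTER LONG I), so no single fixed size() value describes every match.

   ```java
   public class UppercaseWidthSketch {

       // Hypothetical helper: returns how many chars the uppercase code point
       // starting at pos occupies, or 0 if the code point there is not uppercase.
       public static int uppercaseMatchLength(char[] buffer, int pos, int end) {
           if (pos >= end) {
               return 0;
           }
           int codePoint = Character.codePointAt(buffer, pos, end);
           return Character.isUpperCase(codePoint) ? Character.charCount(codePoint) : 0;
       }

       public static void main(String[] args) {
           char[] basic = "Ab".toCharArray();
           // U+10400 encoded as a surrogate pair, followed by 'b'
           char[] supplementary = "\uD801\uDC00b".toCharArray();

           System.out.println(uppercaseMatchLength(basic, 0, basic.length));                 // 1
           System.out.println(uppercaseMatchLength(supplementary, 0, supplementary.length)); // 2
           System.out.println(uppercaseMatchLength(basic, 1, basic.length));                 // 0
       }
   }
   ```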

