[GitHub] commons-lang pull request: LANG-1184: StringUtils#normalizeSpace n...

PascalSchumacher Tue, 26 Jan 2016 13:11:30 -0800

Github user PascalSchumacher commented on the pull request:

    https://github.com/apache/commons-lang/pull/113#issuecomment-175231559
  
    Hi Gary,
    
    3.0 had the same behavior as 3.3.2, but I guess this is not a productive 
discussion.
    
    I know that `Character.isWhitespace` does not treat `\u00A0` as whitespace,
    but Guava, for example, classifies it as whitespace:
    https://github.com/google/guava/blob/8fbeb9038cbe8b382b1ee188ae8459203cd04fb5/guava/src/com/google/common/base/CharMatcher.java#L1217
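    
    To make the difference concrete, here is a minimal sketch using only JDK
    calls. The class and method names (`NbspCheck`, `jdkWhitespace`,
    `unicodeSpace`) are made up for illustration. `Character.isWhitespace`
    explicitly excludes the three non-breaking spaces listed below, while
    `Character.isSpaceChar` follows the Unicode space-separator category and
    includes them.

```java
public class NbspCheck {
    // NO-BREAK SPACE, FIGURE SPACE, NARROW NO-BREAK SPACE
    static final char[] NON_BREAKING = {'\u00A0', '\u2007', '\u202F'};

    // The JDK's notion of whitespace; excludes the non-breaking spaces.
    static boolean jdkWhitespace(char c) {
        return Character.isWhitespace(c);
    }

    // Unicode space characters (categories Zs, Zl, Zp); includes them.
    static boolean unicodeSpace(char c) {
        return Character.isSpaceChar(c);
    }

    public static void main(String[] args) {
        for (char c : NON_BREAKING) {
            System.out.printf("U+%04X isWhitespace=%b isSpaceChar=%b%n",
                    (int) c, jdkWhitespace(c), unicodeSpace(c));
        }
    }
}
```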
    
    If you want to keep the changed behavior, I suggest at least re-adding the
    `trim()` call or removing the `Additionally <code>{@link #trim(String)}</code>
    removes control characters (char &lt;= 32) from both ends of this String.`
    sentence from the Javadoc.
    
    I'm no Unicode expert, but
    https://en.wikipedia.org/wiki/Non-breaking_space lists some more
    non-breaking space characters.
    
    As for the method name, I guess the easy way out would be to add a flag to
    `normalizeSpace`. I cannot come up with a good method name at the moment;
    `normalizeAllSpace` is my best try.
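    
    A rough sketch of what the flag variant could look like. This is a
    hypothetical overload, not existing commons-lang code: the class name, the
    `includeNonBreaking` parameter and the `isNonBreakingSpace` helper are all
    assumptions. It is also simplified, since it only looks at
    `Character.isWhitespace` and does not strip other control characters the
    way `trim()` does.

```java
public class NormalizeSpaceSketch {

    // Hypothetical overload; not part of commons-lang.
    static String normalizeSpace(String str, boolean includeNonBreaking) {
        if (str == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(str.length());
        boolean pendingSpace = false;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            boolean ws = Character.isWhitespace(c)
                    || (includeNonBreaking && isNonBreakingSpace(c));
            if (ws) {
                // Remember a separator only once real content has been emitted,
                // which drops leading whitespace.
                pendingSpace = out.length() > 0;
            } else {
                // Emit a single space for the whole preceding whitespace run;
                // a trailing run is never emitted.
                if (pendingSpace) {
                    out.append(' ');
                }
                out.append(c);
                pendingSpace = false;
            }
        }
        return out.toString();
    }

    // The three characters that Character.isWhitespace deliberately excludes.
    static boolean isNonBreakingSpace(char c) {
        return c == '\u00A0' || c == '\u2007' || c == '\u202F';
    }
}
```

    With the flag set to `false`, non-breaking spaces are left untouched, so
    callers relying on the `Character.isWhitespace` behavior would not be
    affected.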


