[ https://issues.apache.org/jira/browse/LANG-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksandr Bogush updated LANG-1148:
-----------------------------------
Description:
isBlank relies on the java.lang.Character.isWhitespace(char ch) method, which has
not been changed for a long time for backward compatibility. Over the years
non-breaking space characters were introduced and are now used in some cases.
So if we execute the code
{noformat}org.apache.commons.lang.StringUtils.isBlank("\u00A0"); //returns false
org.apache.commons.lang.StringUtils.isBlank("\u202F"); //returns false
org.apache.commons.lang.StringUtils.isBlank("\u2007"); //returns false{noformat}
all three calls return false, which contradicts the StringUtils.isBlank
documentation: {noformat}Checks if a String is whitespace, empty ("") or
null.{noformat}
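The behaviour can be reproduced with the JDK alone, without Commons Lang on the classpath. The sketch below is only an illustration: the class and method names are hypothetical. It relies on the documented fact that Character.isWhitespace explicitly excludes the non-breaking spaces \u00A0, \u2007 and \u202F, while Character.isSpaceChar accepts every Unicode space separator.

```java
public class WhitespaceCheckDemo {

    // The three non-breaking space characters from the report.
    static final String NON_BREAKING = "\u00A0\u202F\u2007";

    // True if Character.isWhitespace accepts every char of s.
    static boolean allWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // True if Character.isSpaceChar accepts every char of s.
    static boolean allSpaceChars(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isSpaceChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Print both classifications for each character in the report.
        for (char c : NON_BREAKING.toCharArray()) {
            System.out.printf("U+%04X isWhitespace=%b isSpaceChar=%b%n",
                    (int) c, Character.isWhitespace(c), Character.isSpaceChar(c));
        }
    }
}
```

Note that the two methods disagree in both directions: Character.isSpaceChar rejects tab and newline, which Character.isWhitespace accepts.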
I suggest fixing it by matching against the regex pattern {noformat}"^[\\p{Z}]*$"{noformat}
instead of looping over the string's characters. I know this is somewhat
slower than the current implementation, but it gives more correct results. I
would be glad to implement it myself and write unit tests for it, so if you
want, please contact me via email [email protected]
Additionally, I would update the documentation itself, because it does not
state that the method returns true for a string made up of several whitespace
characters.
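A minimal sketch of the proposed check follows, using only java.util.regex. It is not the Commons Lang implementation; BlankCheckSketch, isBlankRegex and isBlankCombined are hypothetical names. One caveat worth testing: \p{Z} covers only the Unicode separator categories, so tab and newline (control characters) do not match it. The second method is an assumed alternative that keeps the existing character loop and additionally accepts Character.isSpaceChar.

```java
import java.util.regex.Pattern;

public class BlankCheckSketch {

    // Pattern from the proposal: zero or more Unicode separator characters.
    private static final Pattern SEPARATORS_ONLY = Pattern.compile("^[\\p{Z}]*$");

    // Regex variant as proposed; null and "" are treated as blank.
    static boolean isBlankRegex(String s) {
        return s == null || SEPARATORS_ONLY.matcher(s).matches();
    }

    // Assumed alternative: a char loop accepting either classification,
    // so tab/newline and the non-breaking spaces are all treated as blank.
    static boolean isBlankCombined(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isWhitespace(c) && !Character.isSpaceChar(c)) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, isBlankRegex accepts the three strings from the report but rejects "\t", whereas isBlankCombined accepts both, so unit tests for any fix would need to cover tab and line terminators as well.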
> StringUtils.isBlank does not work correctly with strings containing
> non-breakable whitespace characters
> -------------------------------------------------------------------------------------------------------
>
> Key: LANG-1148
> URL: https://issues.apache.org/jira/browse/LANG-1148
> Project: Commons Lang
> Issue Type: Bug
> Components: lang.*
> Affects Versions: 2.6
> Environment: Windows 8.1 x64, Java 1.8, but can be reproduced in any
> environment with an official Oracle JDK or JRE
> Reporter: Aleksandr Bogush
> Priority: Minor
> Labels: test
> Original Estimate: 3h
> Remaining Estimate: 3h