[ 
https://issues.apache.org/jira/browse/LANG-165?focusedWorklogId=595289&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-595289
 ]

ASF GitHub Bot logged work on LANG-165:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/May/21 13:01
            Start Date: 12/May/21 13:01
    Worklog Time Spent: 10m 
      Work Description: garydgregory commented on a change in pull request #751:
URL: https://github.com/apache/commons-lang/pull/751#discussion_r631012728



##########
File path: src/main/java/org/apache/commons/lang3/StringUtils.java
##########
@@ -9638,6 +9638,79 @@ public static String wrapIfMissing(final String str, final String wrapWith) {
         return builder.toString();
     }
 
+    /**
+     * Extracts all numbers from the given string and returns them as a list.
+     * A "number" here is any sequence of digits that may optionally contain
+     * a single dot, that is, a plain decimal such as 51.82.
+     *
+     * For example, for the string "21.2 days 3 minutes 22 seconds" the
+     * resulting list of doubles is [21.2, 3.0, 22.0].
+     *
+     * If the string contains an invalid number, a {@link NumberFormatException}
+     * is thrown. For example, "My height is 1234.23.13" is invalid because it
+     * is not clear how to parse 1234.23.13. However, a second dot that is not
+     * followed by a digit is accepted. For example, this string contains
+     * only valid numbers: "My pulse is 90.123. and weight is 78.2"
+     * In this case the sequence "90.123." is read as 90.123, and likewise a
+     * sequence "90." (with no digit right after the dot) is read as 90.0.
+     *
+     * @param stringThatContainsNumbers a string that contains one or more numbers,
+     *                                  not necessarily integers; decimals are allowed
+     * @return the list of numbers that the string contains
+     *
+     * @throws NumberFormatException if the string contains an invalid number, as described above
+     */
+    public static List<Double> extractNumbersFromString(String stringThatContainsNumbers) {
+        boolean hasDigitAlreadyStarted = false;
+        boolean alreadyMetDotInThisNumber = false;
+
+        List<Double> resultList = new ArrayList<>();
+
+        StringBuilder currentNumberAsStringBuilder = new StringBuilder("");
+
+        for (int i = 0; i < stringThatContainsNumbers.length(); i++) {
+            char currentSymbol = stringThatContainsNumbers.charAt(i);
+            if (Character.isDigit(currentSymbol)) {
+                if (!hasDigitAlreadyStarted) {
+                    hasDigitAlreadyStarted = true;
+                }
+                currentNumberAsStringBuilder.append(currentSymbol);
+                continue;
+            } else if (currentSymbol == '.') {

Review comment:
       You can implement an API to do that if you want, but my main point is 
that the code should run with inputs from any locale, which means handling 
periods and commas as both decimal and thousands separators. My other point 
is that I do not feel this code belongs in Commons Lang; it feels too much 
like NLP code to me. It might be something for Commons Text, but it seems like 
quite a specific use case, not generic enough for a Commons library. I 
encourage others in the community to opine. The NLP nature makes me wonder how 
you would handle input like "I 'd like nuts and bolts: 10,000,9,000, but only 
up to $10.5." and "Send $4.50.5 apples too please." and "Les pommes sont €10.5 
a Paris, $2.30 a Los Angeles." 
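
To illustrate the locale point, here is a minimal sketch, assuming the JDK's `java.text.NumberFormat` as a stand-in for whatever parsing the PR would adopt. The names `LocaleNumberSketch` and `parseLeadingNumber` are hypothetical and are not part of the PR or of Commons Lang. It only shows that the same characters yield different values depending on which locale's separators are assumed.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class LocaleNumberSketch {

    /**
     * Hypothetical helper: parses the number at the start of {@code text}
     * using the decimal and grouping separators of {@code locale}.
     * Returns null when the text does not start with a number.
     */
    static Double parseLeadingNumber(String text, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        ParsePosition position = new ParsePosition(0);
        Number parsed = format.parse(text, position);
        return parsed == null ? null : parsed.doubleValue();
    }

    public static void main(String[] args) {
        // Comma groups thousands and period marks decimals in Locale.US.
        System.out.println(parseLeadingNumber("10,000.5", Locale.US));
        // The roles are swapped in Locale.GERMANY.
        System.out.println(parseLeadingNumber("10.000,5", Locale.GERMANY));
        // The same text is read differently under the two locales.
        System.out.println(parseLeadingNumber("10,5", Locale.US));
        System.out.println(parseLeadingNumber("10,5", Locale.GERMANY));
    }
}
```

A scanner that hard-codes '.' as the only decimal separator cannot distinguish these cases, which is why mixed-locale input such as the examples above is hard to handle generically.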
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 595289)
    Time Spent: 1.5h  (was: 1h 20m)

> [lang] parseDate with TimeZone
> ------------------------------
>
>                 Key: LANG-165
>                 URL: https://issues.apache.org/jira/browse/LANG-165
>             Project: Commons Lang
>          Issue Type: Improvement
>    Affects Versions: 2.1
>         Environment: Operating System: All
> Platform: All
>            Reporter: Bill Boland
>            Priority: Minor
>             Fix For: 2.2
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I needed the ability to use a function like the 
> org.apache.commons.lang.time.DateUtils.parseDate function, but I needed to 
> consider a different time zone when parsing the dates (assuming the format 
> did not have the time zone as part of the input). This is needed so that 
> different clients can enter local date/time values in their browser and have 
> their defined time zone used to convert these to the server/system time zone. 
> I just thought an additional parameter to this function would make it more 
> generic for this purpose while still keeping the current method signature, 
> which would call the time-zone-sensitive one with a null or default time zone 
> value.
> Anyway, just thought I'd suggest it.
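
The request above can be sketched with plain JDK classes. This is a minimal illustration, assuming `java.text.SimpleDateFormat` and a single pattern. The class `ZonedParseSketch` and the method `parseInZone` are hypothetical names, not the Commons Lang API or the change that was eventually made for this issue.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class ZonedParseSketch {

    /**
     * Hypothetical helper: parses {@code text} with {@code pattern},
     * interpreting the fields in {@code zone}. A null zone falls back to
     * the JVM default, mirroring the "null or default" idea in the request.
     */
    static Date parseInZone(String text, String pattern, TimeZone zone) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false);
        format.setTimeZone(zone == null ? TimeZone.getDefault() : zone);
        try {
            return format.parse(text);
        } catch (ParseException e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd HH:mm";
        Date utc = parseInZone("2006-01-15 12:00", pattern, TimeZone.getTimeZone("UTC"));
        Date plusTwo = parseInZone("2006-01-15 12:00", pattern, TimeZone.getTimeZone("GMT+02:00"));
        // Noon at GMT+02:00 is two hours earlier than noon UTC.
        System.out.println((utc.getTime() - plusTwo.getTime()) / 3_600_000L); // prints 2
    }
}
```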



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
