[ 
https://issues.apache.org/jira/browse/LANG-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546732#comment-14546732
 ] 

ASF GitHub Bot commented on LANG-1124:
--------------------------------------

Github user rikles commented on a diff in the pull request:

    https://github.com/apache/commons-lang/pull/75#discussion_r30460736
  
    --- Diff: src/main/java/org/apache/commons/lang3/StringUtils.java ---
    @@ -3277,6 +3277,164 @@ public static String substringBetween(final String str, final String open, final
             return list.toArray(new String[list.size()]);
         }
     
    +    /**
    +     * <p>Split a String into an array, using an array of fixed string lengths.</p>
    +     *
    +     * <p>If the input String is not null, the returned array has the same size as the input lengths array.</p>
    +     *
    +     * <p>A null input String returns {@code null}.
    +     * A {@code null} or empty input lengths array returns an empty array.
    +     * A {@code 0} in the input lengths array results in an empty string.</p>
    +     *
    +     * <p>Extra characters are ignored (i.e. String length greater than the sum of split lengths).
    +     * All empty substrings, other than those requested with zero length, are returned as {@code null}.</p>
    +     *
    +     * <pre>
    +     * StringUtils.splitByLength(null, *)      = null
    +     * StringUtils.splitByLength("abc")        = []
    +     * StringUtils.splitByLength("abc", null)  = []
    +     * StringUtils.splitByLength("abc", [])    = []
    +     * StringUtils.splitByLength("", 2, 4, 1)  = [null, null, null]
    +     *
    +     * StringUtils.splitByLength("abcdefg", 2, 4, 1)     = ["ab", "cdef", "g"]
    --- End diff --
    
    As the next line of the Javadoc shows, `StringUtils.splitByLength("abcdefg", 2, 2)` will return `["ab", "cd"]`:
    `StringUtils.splitByLength("abcdefghij", 2, 4, 1)  = ["ab", "cdef", "g"]`
    
    I asked myself this question during development: do we discard the extra characters?
    I think it would be nice to let users decide. Moreover, depending on the use case, it could be useful to keep or discard the leading extra characters (for example, when parsing a string that is a single commented-out line).
    
    I propose to:
      * add a private `splitByLengthWorker(String string, boolean splitFromEnd, boolean discardExtraChar, int ... lengths)`
      * keep the logic of this `splitByLength(String, int ...)` method as the default: `return splitByLengthWorker(string, false, true, lengths)`. So, by default, the returned array has the same size as the `int ... lengths` array parameter, which is useful for parsing strings with fixed column lengths.
      * add a `splitByLengthKeepExtraChar(String, int ...)`: `return splitByLengthWorker(string, false, false, lengths)`
      * add a `splitByLengthFromEnd(String, int ...)`: `return splitByLengthWorker(string, true, true, lengths)`
      * add a `splitByLengthFromEndKeepExtraChar(String, int ...)`: `return splitByLengthWorker(string, true, false, lengths)`
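
    To make the proposal above concrete, here is a minimal sketch of what such a worker could look like. It is not the code from the pull request: the class name `SplitByLengthSketch` is hypothetical, and the from-end mode assumes that lengths are right aligned and that the result is not reversed, which is still an open question.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    // Hypothetical sketch, not the PR implementation.
    class SplitByLengthSketch {

        static String[] splitByLengthWorker(String str, boolean splitFromEnd,
                boolean discardExtraChar, int... lengths) {
            if (str == null) {
                return null;
            }
            if (lengths == null || lengths.length == 0) {
                return new String[0];
            }
            for (int len : lengths) {
                if (len < 0) {
                    throw new IllegalArgumentException("lengths must not be negative");
                }
            }
            // Assumption: splitting from the end is done by reversing the input
            // and the lengths, splitting forward, then reversing everything back.
            String input = splitFromEnd ? reverse(str) : str;
            List<String> parts = new ArrayList<>();
            int pos = 0;
            for (int i = 0; i < lengths.length; i++) {
                int len = splitFromEnd ? lengths[lengths.length - 1 - i] : lengths[i];
                if (len == 0) {
                    parts.add("");                      // zero length requested
                } else if (pos >= input.length()) {
                    parts.add(null);                    // input exhausted
                } else {
                    int end = Math.min(pos + len, input.length());
                    parts.add(input.substring(pos, end));
                    pos = end;
                }
            }
            if (!discardExtraChar && pos < input.length()) {
                parts.add(input.substring(pos));        // keep the remainder
            }
            if (splitFromEnd) {
                for (int i = 0; i < parts.size(); i++) {
                    parts.set(i, reverse(parts.get(i)));
                }
                Collections.reverse(parts);
            }
            return parts.toArray(new String[0]);
        }

        private static String reverse(String s) {
            return s == null ? null : new StringBuilder(s).reverse().toString();
        }

        public static void main(String[] args) {
            // prints [ab, cdef, g]
            System.out.println(Arrays.toString(
                    splitByLengthWorker("abcdefghij", false, true, 2, 4, 1)));
            // prints [ab, cdef, g, hij]
            System.out.println(Arrays.toString(
                    splitByLengthWorker("abcdefghij", false, false, 2, 4, 1)));
        }
    }
    ```

    With this sketch, the public methods would be one-line delegations to the worker, each passing a different pair of flags.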
    
    A question: for the _split from end_ methods, which convention do you think is more logical: _right aligned_ or _end to start_ lengths, and a _reversed_ or _not reversed_ result?
      * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3)  = ["__", "a", "bc", "def"]` - right aligned, not reversed (RA, NR)
      * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3)  = ["def", "bc", "a", "__"]` - right aligned, reversed (RA, R)
      * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3)  = ["f", "de", "abc", "__"]` - end to start, reversed (E2S, R)
      * `StringUtils.splitByLengthFromEndKeepExtraChar("__abcdef", 1, 2, 3)  = ["__", "abc", "de", "f"]` - end to start, not reversed (E2S, NR)
    
    I think the first one is the most readable, since the splitting can be followed visually, but it may be less intuitive:
    ```
    StringUtils.splitByLengthFromEnd("ABCDEFGHIJKLM", 3, 4, 5)  = ["BCD", "EFGH", "IJKLM"]
     [3][4_][_5_]
    ABCDEFGHIJKLM
    ```
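
    To compare the four candidate conventions side by side, here is a small illustrative sketch. The class `SplitFromEndOrderings` and its `splitFromEnd` helper are hypothetical names used only for this comparison. It always keeps the extra leading characters, and it assumes the input is at least as long as the sum of the lengths.

    ```java
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    // Hypothetical helper for comparing the four orderings discussed above.
    class SplitFromEndOrderings {

        static List<String> splitFromEnd(String str, boolean rightAligned,
                boolean reversed, int... lengths) {
            List<String> parts = new ArrayList<>();
            int end = str.length();
            // Walk from the end of the string toward its start.
            for (int i = 0; i < lengths.length; i++) {
                // Right aligned: the last length covers the end of the string.
                // End to start: the first length covers the end of the string.
                int len = rightAligned ? lengths[lengths.length - 1 - i] : lengths[i];
                int start = Math.max(0, end - len);
                parts.add(str.substring(start, end));
                end = start;
            }
            if (end > 0) {
                parts.add(str.substring(0, end)); // extra leading characters
            }
            // At this point the parts are in end-to-start (reversed) order.
            if (!reversed) {
                Collections.reverse(parts);
            }
            return parts;
        }

        public static void main(String[] args) {
            String s = "__abcdef";
            System.out.println("RA, NR:  " + splitFromEnd(s, true, false, 1, 2, 3));
            System.out.println("RA, R:   " + splitFromEnd(s, true, true, 1, 2, 3));
            System.out.println("E2S, R:  " + splitFromEnd(s, false, true, 1, 2, 3));
            System.out.println("E2S, NR: " + splitFromEnd(s, false, false, 1, 2, 3));
        }
    }
    ```

    Only the (RA, NR) variant leaves the pieces in the order they appear in the input string, which is why the splitting can be read off visually.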


> Add split by length methods in StringUtils
> ------------------------------------------
>
>                 Key: LANG-1124
>                 URL: https://issues.apache.org/jira/browse/LANG-1124
>             Project: Commons Lang
>          Issue Type: New Feature
>          Components: lang.*
>            Reporter: Loic Guibert
>
> Add methods to split a String by fixed lengths :
> {code:java}
> public static String[] splitByLength(String str, int ... lengths);
> public static String[] splitByLengthRepeatedly(String str, int ... lengths);
> {code}
> Detail :
> {code:java}
> /**
>  * <p>Split a String into an array, using an array of fixed string lengths.</p>
>  *
>  * <p>If the input String is not null, the returned array has the same size as the input lengths array.</p>
>  *
>  * <p>A null input String returns {@code null}.
>  * A {@code null} or empty input lengths array returns an empty array.
>  * A {@code 0} in the input lengths array results in an empty string.</p>
>  *
>  * <p>Extra characters are ignored (i.e. String length greater than the sum of split lengths).
>  * All empty substrings, other than those requested with zero length, are returned as {@code null}.</p>
>  *
>  * <pre>
>  * StringUtils.splitByLength(null, *)      = null
>  * StringUtils.splitByLength("abc")        = []
>  * StringUtils.splitByLength("abc", null)  = []
>  * StringUtils.splitByLength("abc", [])    = []
>  * StringUtils.splitByLength("", 2, 4, 1)  = [null, null, null]
>  *
>  * StringUtils.splitByLength("abcdefg", 2, 4, 1)     = ["ab", "cdef", "g"]
>  * StringUtils.splitByLength("abcdefghij", 2, 4, 1)  = ["ab", "cdef", "g"]
>  * StringUtils.splitByLength("abcdefg", 2, 4, 5)     = ["ab", "cdef", "g"]
>  * StringUtils.splitByLength("abcdef", 2, 4, 1)      = ["ab", "cdef", null]
>  *
>  * StringUtils.splitByLength(" abcdef", 2, 4, 1)     = [" a", "bcde", "f"]
>  * StringUtils.splitByLength("abcdef ", 2, 4, 1)     = ["ab", "cdef", " "]
>  * StringUtils.splitByLength("abcdefg", 2, 4, 0, 1)  = ["ab", "cdef", "", "g"]
>  * StringUtils.splitByLength("abcdefg", -1)          = {@link IllegalArgumentException}
>  * </pre>
>  *
>  * @param str  the String to parse, may be null
>  * @param lengths  the string lengths where to cut, may be null, must not be negative
>  * @return an array of split Strings, {@code null} if null String input
>  * @throws IllegalArgumentException
>  *             if one of the lengths is negative
>  */
> public static String[] splitByLength(String str, int ... lengths);
> /**
>  * <p>Split a String into an array, using an array of fixed string lengths repeated as
>  * many times as necessary to reach the String end.</p>
>  *
>  * <p>If the input String is not null, the returned array size is a multiple of the input lengths array size.</p>
>  *
>  * <p>A null input String returns {@code null}.
>  * A {@code null} or empty input lengths array returns an empty array.
>  * A {@code 0} in the input lengths array results in an empty string.</p>
>  *
>  * <p>Every empty substring that was not requested with zero length, and every substring after it,
>  * is returned as {@code null}.</p>
>  *
>  * <pre>
>  * StringUtils.splitByLengthRepeatedly(null, *)      = null
>  * StringUtils.splitByLengthRepeatedly("abc")        = []
>  * StringUtils.splitByLengthRepeatedly("abc", null)  = []
>  * StringUtils.splitByLengthRepeatedly("abc", [])    = []
>  * StringUtils.splitByLengthRepeatedly("", 2, 4, 1)  = [null, null, null]
>  *
>  * StringUtils.splitByLengthRepeatedly("abcdefghij", 2, 3)     = ["ab", "cde", "fg", "hij"]
>  * StringUtils.splitByLengthRepeatedly("abcdefgh", 2, 3)       = ["ab", "cde", "fg", "h"]
>  * StringUtils.splitByLengthRepeatedly("abcdefg", 2, 3)        = ["ab", "cde", "fg", null]
>  *
>  * StringUtils.splitByLengthRepeatedly(" abcdef", 2, 3)        = [" a", "bcd", "ef", null]
>  * StringUtils.splitByLengthRepeatedly("abcdef ", 2, 3)        = ["ab", "cde", "f ", null]
>  * StringUtils.splitByLengthRepeatedly("abcdef", 2, 3, 0, 1)   = ["ab", "cde", "", "f"]
>  * StringUtils.splitByLengthRepeatedly("abcdefg", 2, 3, 0, 1)  = ["ab", "cde", "", "f",
>  *                                                                "g", null, null, null]
>  * StringUtils.splitByLengthRepeatedly("abcdefgh", 2, 0, 1, 0) = ["ab", "", "c", "",
>  *                                                                "de", "", "f", "",
>  *                                                                "gh", "", null, null]
>  * StringUtils.splitByLengthRepeatedly("abcdefg", 2, 0, 1, 0)  = ["ab", "", "c", "",
>  *                                                                "de", "", "f", "",
>  *                                                                "g", null, null, null]
>  * StringUtils.splitByLengthRepeatedly("abcdefg", -1)          = {@link IllegalArgumentException}
>  * StringUtils.splitByLengthRepeatedly("abcdefg", 0, 0)        = {@link IllegalArgumentException}
>  * </pre>
>  *
>  * @param str  the String to parse, may be null
>  * @param lengths  the string lengths where to cut, may be null, must not be negative
>  * @return an array of split Strings, {@code null} if null String input
>  * @throws IllegalArgumentException
>  *             if one of the lengths is negative or if the sum of the lengths is less than 1
>  */
> public static String[] splitByLengthRepeatedly(String str, int... lengths);
> {code}
> See PR #75 : https://github.com/apache/commons-lang/pull/75



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
