[ 
https://issues.apache.org/jira/browse/LANG-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503436#comment-13503436
 ] 

Michael Knapp commented on LANG-860:
------------------------------------

I beg to differ.  commons-csv assumes there can be an escape character; my code 
assumes there can be an escape pattern.  My code handles a much broader range 
of problems than CSV.  For example, what if you want to get all the 
parenthesized text out of a document?  commons-csv cannot do that because '(' 
and ')' are different characters.  commons-csv also offers no method to retain 
the delimiters that you split on; my code does.  Let's say you split on a 
pattern of open and closed parentheses: no existing split function in 
commons-lang, and no function in commons-csv, is able to retain the text that 
matched your delimiter, but my code can.  The code I wrote does not replace 
commons-csv, nor does it try to.  commons-csv handles comments, empty lines, 
trimming text, and a whole lot more that is out of the scope of my code.  Also, 
if you expect anybody to use commons-csv, you should really publish it to the 
central Maven repository and document it a little more.
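
To make the delimiter-retaining case concrete, here is a minimal sketch using 
only java.util.regex.  It is not the code from the attached patch; the class 
name, the method name splitKeepingDelimiters, and the parenthesis pattern are 
all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the implementation in the LANG-860 patch.
class KeepDelimitersSketch {

    // Splits text on every match of the delimiter pattern and keeps the
    // matched delimiter text as its own element in the result.
    static List<String> splitKeepingDelimiters(String text, Pattern delimiter) {
        List<String> parts = new ArrayList<>();
        Matcher m = delimiter.matcher(text);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                // Text between the previous delimiter and this one.
                parts.add(text.substring(last, m.start()));
            }
            if (m.end() > m.start()) {
                // The delimiter itself, retained rather than discarded.
                parts.add(m.group());
            }
            last = m.end();
        }
        if (last < text.length()) {
            // Trailing text after the final delimiter.
            parts.add(text.substring(last));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Assumed pattern: one non-nested parenthesized group.
        Pattern parens = Pattern.compile("\\([^()]*\\)");
        System.out.println(splitKeepingDelimiters("see (a) and (b) here", parens));
    }
}
```

With this input the parenthesized groups "(a)" and "(b)" come back as separate 
elements alongside the surrounding text, which is the behaviour a plain 
String.split call would throw away.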
                
> String split with an escape pattern
> -----------------------------------
>
>                 Key: LANG-860
>                 URL: https://issues.apache.org/jira/browse/LANG-860
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.*
>            Reporter: Michael Knapp
>            Priority: Minor
>              Labels: patch, split
>         Attachments: StringUtilsSplitEscapingly.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Strings are often delimited, but certain patterns can escape the 
> delimiter.  For example, quotes are used in CSV to escape a comma 
> delimiter.  I have written a couple of methods for StringUtils that split 
> strings while considering the possibility of an escape pattern.  For example, 
> when given "a,\"b,c\",c", it will produce {"a","\"b,c\"","c"}.  In my code, 
> the delimiter can be a string, and it can be escaped by any regular 
> expression pattern.  Unit tests are already written and passing.
> I plan to attach the patch for this once the ticket is created.  I just need 
> a committer to review the patch, approve, and commit it for me.
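
The behaviour described in the issue can be sketched roughly as follows.  This 
is an assumption-laden illustration, not the contents of 
StringUtilsSplitEscapingly.patch: the class name, the method name 
splitEscaped, its signature, and the quote pattern are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the implementation in the LANG-860 patch.
class EscapedSplitSketch {

    // Splits text on a literal delimiter, except where the delimiter falls
    // inside a region matched by the escape pattern.
    static List<String> splitEscaped(String text, String delimiter, Pattern escape) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> parts = new ArrayList<>();
        Matcher m = escape.matcher(text);
        int tokenStart = 0;
        int pos = 0;
        while (pos < text.length()) {
            // Does an escaped region begin exactly at the current position?
            m.region(pos, text.length());
            if (m.lookingAt() && m.end() > pos) {
                pos = m.end();                    // skip the whole escaped region
            } else if (text.startsWith(delimiter, pos)) {
                parts.add(text.substring(tokenStart, pos));
                pos += delimiter.length();
                tokenStart = pos;                 // next token starts after the delimiter
            } else {
                pos++;
            }
        }
        parts.add(text.substring(tokenStart));    // final token (may be empty)
        return parts;
    }

    public static void main(String[] args) {
        // Assumed escape pattern: a double-quoted run with no embedded quotes.
        Pattern quoted = Pattern.compile("\"[^\"]*\"");
        System.out.println(splitEscaped("a,\"b,c\",c", ",", quoted));
    }
}
```

For the input from the description, "a,\"b,c\",c", this sketch yields the three 
elements a, "b,c" (quotes kept), and c.  It scans one position at a time for 
clarity; a real implementation would likely be more careful about performance 
and edge cases such as unterminated quotes.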

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
