[jira] Commented: (LANG-288) StrTokenizer needs to support access to the token separators

Henri Yandell (JIRA) Sun, 07 Feb 2010 22:49:54 -0800

    [ 
https://issues.apache.org/jira/browse/LANG-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830849#action_12830849
 ]


Henri Yandell commented on LANG-288:
------------------------------------

Both :)  A tokenizer with multiple delimiters is already supported by using a 
CharSetMatcher, if I understand correctly.

I think the initial issue is that the Matcher API will need to return the item 
that was matched instead of the number of characters matched. 
It is essentially the same API until you put a RegexpMatcher or some other 
ruleset in there.

Once that is done, then StrTokenizer would have access to the delimiter that 
actually matched.
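
To make the point above concrete, here is a minimal sketch of a matcher that
reports only a match length, and of a caller that still recovers the matched
delimiter by slicing the buffer at the position it already knows. This is not
the Commons Lang implementation: LengthMatcher, charSet and
splitKeepingDelimiters are hypothetical names, and "a.b@c.d" is just an
illustrative input.

```java
import java.util.ArrayList;
import java.util.List;

public class DelimiterSketch {

    /** Hypothetical stand-in for a matcher that reports only a match length. */
    interface LengthMatcher {
        /** Returns the number of characters matched at pos, or 0 for no match. */
        int isMatch(char[] buffer, int pos);
    }

    /** Matches any single character from the given set, e.g. ".@". */
    static LengthMatcher charSet(final String chars) {
        return (buffer, pos) -> chars.indexOf(buffer[pos]) >= 0 ? 1 : 0;
    }

    /**
     * Splits input and records, for each delimiter hit, the text it matched.
     * Returns tokens and delimiters interleaved: token, delimiter, token, ...
     */
    static List<String> splitKeepingDelimiters(String input, LengthMatcher delim) {
        char[] buf = input.toCharArray();
        List<String> parts = new ArrayList<>();
        int tokenStart = 0;
        int pos = 0;
        while (pos < buf.length) {
            int len = delim.isMatch(buf, pos);
            if (len > 0) {
                parts.add(new String(buf, tokenStart, pos - tokenStart)); // token
                parts.add(new String(buf, pos, len));                     // delimiter
                pos += len;
                tokenStart = pos;
            } else {
                pos++;
            }
        }
        parts.add(new String(buf, tokenStart, buf.length - tokenStart));  // last token
        return parts;
    }

    public static void main(String[] args) {
        // Prints [a, ., b, @, c, ., d]
        System.out.println(splitKeepingDelimiters("a.b@c.d", charSet(".@")));
    }
}
```

With a character-set matcher the length is always 1, so the length and the
matched text carry the same information. The distinction would only start to
matter for matchers with variable-length matches, and even then the sketch
suggests that a position plus a length may be enough for the tokenizer to
expose the delimiter.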

The big question is whether that API change to StrMatcher is 'good'.

> StrTokenizer needs to support access to the token separators
> ------------------------------------------------------------
>
>                 Key: LANG-288
>                 URL: https://issues.apache.org/jira/browse/LANG-288
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.text.*
>            Reporter: Stephen Colebourne
>            Priority: Minor
>             Fix For: 3.1
>
>
> With StrTokenizer at present you cannot extract the separators between the 
> tokens, a feature which is possible with StringTokenizer.
> Thus tokenizing "a.b@c.d" using ".@" would return a,b,c,d but you wouldn't 
> know where the @ was.
> This could probably best be part of the API as a lastSeparator() method that 
> can only be called after next(), returning the separator(s) between that 
> token and the previous token.
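
For reference, the StringTokenizer behaviour mentioned in the issue can be
sketched with the three-argument java.util.StringTokenizer constructor, whose
returnDelims flag makes each delimiter character come back as its own token.
The class and method names below are hypothetical and the input is only
illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ReturnDelimsSketch {

    /** Returns tokens and delimiters in order of appearance. */
    static List<String> tokensWithDelims(String input, String delims) {
        // The third argument (returnDelims = true) makes each delimiter
        // character come back as its own one-character token.
        StringTokenizer st = new StringTokenizer(input, delims, true);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [a, ., b, @, c, ., d]
        System.out.println(tokensWithDelims("a.b@c.d", ".@"));
    }
}
```

A lastSeparator() method as proposed would presumably expose the same
information without mixing delimiters into the token stream.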

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
