[ 
https://issues.apache.org/jira/browse/GROOVY-11636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17947435#comment-17947435
 ] 

Paul King edited comment on GROOVY-11636 at 4/25/25 11:30 PM:
--------------------------------------------------------------

Some more examples (existing behavior unless marked new):
{code:groovy}
assert '𝔸'.next() == '𝔹'  // existing
assert '報'.next() == '堲'  // existing
assert '💙'.next() == '💚'  // existing
assert (0..3).collect('💙'::next).join() == '💙💚💛💜'  // new
assert '❤️'.next() == '❤︐'  // existing: for historical reasons ❤️ is ❤ plus a variation selector (U+FE0F)
{code}
We could try to change the existing behavior to use code points rather than 
chars, but StringBuilder has no setCodePointAt or deleteCodePointAt 
equivalents of setCharAt and deleteCharAt. It would yield the same result for 
cases like the first three examples, including those containing surrogate 
pairs, but code points don't really handle all cases anyway, such as the heart 
emoji.
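
A code point based variant could look roughly like the sketch below. The helper name {{nextCodePoint}} is hypothetical and not part of Groovy; it only uses existing JDK calls ({{codePointBefore}}, {{Character.charCount}}, {{appendCodePoint}}) and deliberately skips the wrapping behavior of the real next().
{code:groovy}
// Hypothetical helper, not part of Groovy: increments the last code point
// of a String instead of its last char.
String nextCodePoint(String s) {
    if (!s) return s  // empty input is returned unchanged in this sketch
    int last = s.codePointBefore(s.length())            // last code point (1 or 2 chars)
    int start = s.length() - Character.charCount(last)  // index where it begins
    // There is no setCodePointAt/deleteCodePointAt, so rebuild the tail by hand.
    // No wrapping here: appendCodePoint throws past Character.MAX_CODE_POINT.
    new StringBuilder(s.substring(0, start)).appendCodePoint(last + 1).toString()
}

assert nextCodePoint('💙') == '💚'  // same result as the char-based next()

// A case where the two approaches would differ: U+1F3FF ends in the low
// surrogate 0xDFFF, so bumping the last char breaks the pair, while the
// code point simply moves on to U+1F400.
String before = new String(Character.toChars(0x1F3FF))
assert nextCodePoint(before) == new String(Character.toChars(0x1F400))
{code}
The heart emoji still comes out as '❤︐' with this sketch, since the variation selector is a code point of its own.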

It is useful to look at recent advice about working with 
chars/codepoints/single grapheme characters:

[https://www.javaadvent.com/2020/12/confusing-java-strings.html]

[https://horstmann.com/unblog/2023-10-03/index.html]

TL;DR version: we have to wait until Java 20 before we can iterate efficiently 
over grapheme clusters, and even that doesn't cover the red heart emoji case. 
For Groovy 5, we could use the regex {{s.split(/\b\{g}/)}} (Java 9+), but we'd 
have to write our own StringBuilder-equivalent code and handle the delete/set 
operations ourselves, and that still wouldn't deal with the multi-character 
emojis. So my suggestion is to just provide the simple char case.
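
To illustrate the regex option, here is a minimal sketch that assumes Java 9+ at runtime. The helper name {{graphemes}} is hypothetical; the last lines show the kind of manual rebuilding that would replace the StringBuilder set/delete calls.
{code:groovy}
// Hypothetical helper: split a String into grapheme clusters.
// Relies on the \b{g} boundary matcher, available in Java 9+ regexes.
List<String> graphemes(String s) {
    s ? s.split(/\b{g}/).toList() : []
}

// 'e' followed by a combining acute accent is two chars but one cluster
assert graphemes("e\u0301x") == ["e\u0301", 'x']
// a surrogate pair is two chars but a single cluster
assert graphemes('𝔸b') == ['𝔸', 'b']

// Incrementing "the last character" then means reassembling the String by hand
def parts = graphemes('car')
assert parts.init().join() + parts.last().next() == 'cas'
{code}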



> Create SGM#next/previous methods which act like next/previous but also take 
> an integer repeat count
> ----------------------------------------------------------------------------------------------------
>
>                 Key: GROOVY-11636
>                 URL: https://issues.apache.org/jira/browse/GROOVY-11636
>             Project: Groovy
>          Issue Type: New Feature
>            Reporter: Paul King
>            Assignee: Paul King
>            Priority: Major
>
> Trying to get from String 'a' to 'e' by "adding 4" can be cumbersome, either 
> calling next() multiple times, or converting to a char, doing the arithmetic, 
> then converting back.
> The idea would be to support:
> {code:groovy}
> assert 'a'.next(0) == 'a'
> assert 'a'.next(4) == 'e'
> assert 'a'.next(25) == 'z'
> assert 'A'.next(32) == 'a'
> assert (0..4).collect('a'::next) == 'a'..'e'
> assert 'car'.next(2) == 'cat'
> {code}
> Although hopefully never used, this piggybacks on the normal next() wrapping 
> behavior if Character.MAX_VALUE is reached. Also, like next(), it applies to 
> the last character in a longer String, as per the last test above.
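
For comparison, the two workarounds that the issue description calls cumbersome can be sketched as follows on Groovy versions without next(int). The helper name {{nextTimes}} is hypothetical and only emulates the proposed behavior by calling the existing next() repeatedly.
{code:groovy}
// Workaround 1: call next() once per step
assert 'a'.next().next().next().next() == 'e'

// Workaround 2: convert to char, do the arithmetic, convert back
char c = 'a' as char
assert ((c + 4) as char) as String == 'e'

// Hypothetical emulation of the proposed next(int): apply next() n times
String nextTimes(String s, int n) {
    String result = s
    n.times { result = result.next() }
    result
}

assert nextTimes('a', 0) == 'a'
assert nextTimes('a', 4) == 'e'
assert nextTimes('car', 2) == 'cat'  // only the last character changes
{code}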



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
