Re: RFR: 8197594 - String and character repeat

2018-02-26 Thread Stuart Marks

On 2/18/18 1:37 AM, James Laskey wrote:

Didn’t I hear someone mentioning “\U1D11A” at some point?


On 2/19/18 7:55 AM, Martin Buchholz wrote:
Oops, I already got it wrong - it's already at 6 hex digits because there are 17 
planes, not 16.  MAX_CODE_POINT is U+10.

Yes, we need a variable width syntax like regex \x{h...h}


Yeah, there are a bunch of syntactic alternatives to consider. An "obvious" 
alternative to "\u" is "\Uxx" which works if you're always willing to 
specify six digits (or to have some weird non-delimited but variable-length 
sequence, such as the existing octal escapes for chars (does anybody use those 
(see JLS 3.10.6)?)) The difference between \u and \U is rather subtle, though. 
Or a delimited sequence such as used by regex might be an alternative.



And java regex also supports
   \N{name}The character with Unicode character name 'name'
so we could do the same for the java language.
Although it would be a little weird to have every Unicode update make some 
previously invalid source files valid.


We could also say "It's 2018 and UTF-8 has won" and simply use UTF-8 in source 
files directly.   No Unicode escapes needed.


Even if one is willing to have a source file in UTF-8 (as opposed to say, ASCII) 
there are things in Unicode that are really hard to edit. For example, there are 
zero-width spaces, joiners, non-joiners, directionality markers, etc. I think 
escapes are the bare minimum. Some kind of name-based interpolation would be 
useful, but the actual Unicode names are rather unwieldy. Maybe something like 
HTML entities would be worthwhile to investigate, though probably with a 
different syntax.


s'marks


Re: RFR: 8197594 - String and character repeat

2018-02-26 Thread Stuart Marks
To close the loop on this, I've reopened JDK-4993841, which requests adding an 
API Character.toString(int) which converts an int codepoint value to a String.


(This seems like the obvious API, in parallel with Character.toString(char), but 
of course alternatives could be considered.)


s'marks

On 2/20/18 11:46 AM, Louis Wasserman wrote:

I'm with Brian: adding a separate API to make it easier to get from a
codepoint to a String seems independently merited, and makes the single
repeat API work well for that case.  A very quick regex-powered search
comes up with 183 hits in Google for (new
String|String.copyValueOf|String.valueOf)(Character.toChars(..)).

I do, however, recommend a separate thread for discussing that API :)

On Tue, Feb 20, 2018 at 11:33 AM Kevin Bourrillion 
wrote:


Just to add another dimension to this data:  most of the usages of our
repeat method (~75%) are in test code. These tests usually just want any
old test string of a certain length. Repeating a single character is the
obvious way to get that.

Among production code usages (~25%), there are a few roughly equal use
cases: ascii indentation/alignment, redaction, and Martin's expected case
of "drawing" with ASCII symbols, and "other".



On Thu, Feb 15, 2018 at 12:52 PM, Louis Wasserman 
wrote:


I don't think there's a case for demand to merit having a
repeat(CharSequence, int) at all.

I did an analysis of usages of Guava's Strings.repeat on Google's
codebase.  Users might be rolling their own implementations, too, but this
should be a very good proxy for demand.

StringRepeat_SingleConstantChar = 4.475 K // strings with .length() ==  1
StringRepeat_SingleConstantCodePoint = 28 // strings with
.codePointCount(...) == 1
StringRepeat_MultiCodePointConstant = 1.156 K // constant strings neither
of the above
StringRepeat_CharSequenceToString = 2 //
Strings.repeat(CharSequence.toString(), n)
StringRepeat_NoneOfTheAbove = 248

Notably, it seems like basically nobody needs to repeat a CharSequence --
definitely not enough demand to merit the awkwardness of e.g.
Rope.repeat(n) inheriting a repeat returning a String.

Based on this data, I'd recommend providing one and only one method of
this
type: String.repeat(int).  There's no real advantage to a static
repeat(char, int) method when the overwhelming majority of these are
constants: e.g. compare SomeUtilClass.repeat('*', n) versus "*".repeat(n).
Character.toString(c).repeat(n) isn't a bad workaround if you don't have a
constant char.  There also isn't much demand for dealing with the code
point case specially; the String.repeat(int) method seems like it'd handle
that just fine.

On Thu, Feb 15, 2018 at 11:44 AM Jim Laskey 
wrote:





On Feb 15, 2018, at 3:36 PM, Ivan Gerasimov <

ivan.gerasi...@oracle.com>

wrote:


Hello!

The link with the webrev returned 404, but I could find it at this

location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/


A few minor comments:

1)

This check:

2992 final long limit = (long)count * 2L;
2993 if ((long)Integer.MAX_VALUE < limit) {

can be possibly simplified as
if (count > Integer.MAX_VALUE - count) {


Good.



2)
Should String repeat(final int codepoint, final int count) be

optimized

for codepoints that can be represented with a single char?


E.g. like this:

public static String repeat(final int codepoint, final int count) {
return Character.isBmpCodePoint(codepoint))
? repeat((char) codepoint, count)
: (new String(Character.toChars(codepoint))).repeat(count);
}


Yes, avoid array allocation.



3)
Using long arithmetic can possibly be avoided in the common path of

repeat(final int count):


E.g. like this:

 if (count < 0) {
 throw new IllegalArgumentException("count is negative, " +

count);

 } else if (count == 1) {
 return this;
 } else if (count == 0) {
 return "";
}
 final int len = value.length;
 if (Integer.MAX_VALUE / count < len) {
 throw new IllegalArgumentException(
 "Resulting string exceeds maximum string length:

" +

((long)len * (long)count));

 }
 final int limit = count * len;


Good.

Thank you.



With kind regards,
Ivan

On 2/15/18 9:20 AM, Jim Laskey wrote:

This is a pre-CSR code review [1] for String repeat methods

(Enhancement).


The proposal is to introduce four new methods;

1. public String repeat(final int count)
2. public static String repeat(final char ch, final int count)
3. public static String repeat(final int codepoint, final int count)
4. public static String repeat(final CharSequence seq, final int

count)


For the sake of transparency, only 1 is necessary, 2-4 are

convenience

methods.

In the case of 2, “*”.repeat(10) performs as well as

String.repeat(‘*’,

10).

3 and 4 convert to String before calling 1.

Performance runs with jmh (results as comment 

Re: RFR: 8197594 - String and character repeat

2018-02-19 Thread Martin Buchholz
.



On Sun, Feb 18, 2018 at 11:19 AM, Martin Buchholz 
wrote:

>
>  - how many digits to consume after the escape?  How much do we trust
> Unicode to never ever grow beyond 5 hex digits?
>

Oops, I already got it wrong - it's already at 6 hex digits because there
are 17 planes, not 16.  MAX_CODE_POINT is U+10.
Yes, we need a variable width syntax like regex \x{h...h}

And java regex also supports
  \N{name} The character with Unicode character name 'name'
so we could do the same for the java language.
Although it would be a little weird to have every Unicode update make some
previously invalid source files valid.

We could also say "It's 2018 and UTF-8 has won" and simply use UTF-8 in
source files directly.   No Unicode escapes needed.


Re: RFR: 8197594 - String and character repeat

2018-02-18 Thread Xueming Shen

On 2/18/18, 11:19 AM, Martin Buchholz wrote:

On Sun, Feb 18, 2018 at 1:37 AM, James Laskey
wrote:


Didn’t I hear someone mentioning “\U1D11A” at some point?


Unicode codepoint escapes are still Not a Thing, apparently.
Yes, it's a language feature, but an easy one to implement.
Except for the usual bikeshedding about syntax
  - include the '+'?
  - how many digits to consume after the escape?  How much do we trust
Unicode to never ever grow beyond 5 hex digits?

|now we have some "similar" already in regex \x|{h...h} :-)

It's hard to say "never" given we are adding so many  emoji in Unicode
but it will take a while to fill those unsigned space.
/
/


Re: RFR: 8197594 - String and character repeat

2018-02-18 Thread Martin Buchholz
On Sun, Feb 18, 2018 at 1:37 AM, James Laskey 
wrote:

> Didn’t I hear someone mentioning “\U1D11A” at some point?
>

Unicode codepoint escapes are still Not a Thing, apparently.
Yes, it's a language feature, but an easy one to implement.
Except for the usual bikeshedding about syntax
 - include the '+'?
 - how many digits to consume after the escape?  How much do we trust
Unicode to never ever grow beyond 5 hex digits?


Re: RFR: 8197594 - String and character repeat

2018-02-18 Thread Ulf Zibis

Am 18.02.2018 um 06:10 schrieb Stuart Marks:

Fair enough. I'll be less unhappy if there is a way to convert from a code 
point to a String, as requested by JDK-4993841. This will reduce

 new String(Character.toChars(codepoint)).repeat(count)

to

 Character.toString(codepoint).repeat(count)
Shorter and maybe more logical to get a String by a String constructor, 
instead overloading toString() of Character:

    String(codepoint).repeat(count)

-Ulf



Re: RFR: 8197594 - String and character repeat

2018-02-18 Thread James Laskey
Didn’t I hear someone mentioning “\U1D11A” at some point?

Sent from my iPhone

> On Feb 18, 2018, at 1:10 AM, Stuart Marks  wrote:
> 
> Fair enough. I'll be less unhappy if there is a way to convert from a code 
> point to a String, as requested by JDK-4993841. This will reduce
> 
>new String(Character.toChars(codepoint)).repeat(count)
> 
> to
> 
>Character.toString(codepoint).repeat(count)
> 
> But this is still fairly roundabout. Since most cases are constants, the 
> advice is to use a string literal instead of a char literal. This works for 
> BMP characters, e.g. "-".repeat(10) or "\u2501".repeat(15). But if I want a 
> non-BMP character as a string literal, I have encode it into a surrogate pair 
> myself. For example, a string literal containing the character U+1D11A 
> MUSICAL SYMBOL FIVE-LINE STAFF would be "\uD834\uDD1A". Ugh! Or, I could just 
> call a function and live with it not being a constant. It would be nice if 
> there were an escape sequence that allowed any Unicode code point, including 
> supplementary characters, to be put to n a string literal.
> 
> s'marks
> 
>> On Feb 16, 2018, at 18:02, Brian Goetz  wrote:
>> 
>> Disagree.  
>> 
>> On #3, most of the time the char being repeated is already a literal.  So 
>> just make it a string.  
>> 
>> On #2, better to aim for string.ofCodePoint(int) and compose w repeat.  
>> 
>> Down to one method again :)
>> 
>> Sent from my MacBook Wheel
>> 
>>> On Feb 16, 2018, at 5:13 PM, Stuart Marks  wrote:
>>> 
>>> Let me put in an argument for handling code points:
>>> 
 3. public static String repeat(final int codepoint, final int count)
>>> 
>>> Most of the String and Character API handles code points on an equal 
>>> footing with chars. I think this is important, as over time Unicode is 
>>> continuing to add supplementary characters -- those that can't be 
>>> represented in a Java char value. Examples abound of how such characters 
>>> are mishandled. Therefore, I believe Java APIs should have full support for 
>>> code points.
>>> 
>>> This is a small thing, and some might consider it a rare case -- how often 
>>> does one need to repeat something like an emoji? The issue however isn't 
>>> that particular use case. Instead what's required is the ability to handle 
>>> *any Unicode character* uniformly, regardless of whether or not it's a 
>>> supplementary character. The way to do that is to deal with code points, so 
>>> any Java API that deals with character data must also handle code points.
>>> 
>>> If we were to add just one method:
>>> 
 1. public String repeat(final int count)
>>> 
>>> the workaround is to take the character, turn it into a string, and call 
>>> the repeat() method on it. For a 'char' value, this isn't too bad, but I'd 
>>> argue it isn't pretty either:
>>> 
>>>  Character.toString(charVal).repeat(n)
>>> 
>>> But this only handles BMP characters, not supplementary characters. 
>>> Unfortunately, there's no direct way to turn a code point into a string -- 
>>> you have to turn it into a byte array first! Thus, to get a string from a 
>>> code point and repeat it, you have to do this:
>>> 
>>>  new String(Character.toChars(codepoint)).repeat(count)
>>> 
>>> This is enough indirection that it's hard to discover, and I suspect that 
>>> most people won't put in the effort to do this correctly, resulting in more 
>>> code that mishandles supplementary characters.
>>> 
>>> Thus, I think we need to add API #3 that performs the repeat function on 
>>> code points.
>>> 
>>> (Hm, the lack of Character.toString(codepoint) is covered by JDK-4993841, 
>>> which is closed. I think I'll reopen it.)
>>> 
 2. public static String repeat(final char ch, final int count)
>>> 
>>> I can see that this API is not as important as one that handles code 
>>> points, and it seems to be less frequently used according to Louis W's 
>>> analysis. But if you have char data you want to repeat, not having this 
>>> seems like an omission; it seems backwards to have to create a string from 
>>> the char, only for repeat() to extract that char from that String in order 
>>> to repeat it. Thus I've vote for inclusion of this method as well.
>>> 
>>> s'marks
>>> 
>>> 
 On 2/16/18 5:10 AM, Jim Laskey wrote:
 We’re going with the one instance method (Louis clinched it.) with 
 recommended enhancements and not touching CharSequence.
 Working it up now.
 — Jim
> On Feb 16, 2018, at 7:46 AM, Alan Bateman  wrote:
> 
> On 15/02/2018 17:20, Jim Laskey wrote:
>> This is a pre-CSR code review [1] for String repeat methods 
>> (Enhancement).
>> 
>> The proposal is to introduce four new methods;
>> 
>> 1. public String repeat(final int count)
>> 2. public static String repeat(final char ch, final int count)
>> 3. public static String repeat(final int codepoint, final int count)
>> 

Re: RFR: 8197594 - String and character repeat

2018-02-17 Thread Stuart Marks
Fair enough. I'll be less unhappy if there is a way to convert from a code 
point to a String, as requested by JDK-4993841. This will reduce

new String(Character.toChars(codepoint)).repeat(count)

to

Character.toString(codepoint).repeat(count)

But this is still fairly roundabout. Since most cases are constants, the advice 
is to use a string literal instead of a char literal. This works for BMP 
characters, e.g. "-".repeat(10) or "\u2501".repeat(15). But if I want a non-BMP 
character as a string literal, I have encode it into a surrogate pair myself. 
For example, a string literal containing the character U+1D11A MUSICAL SYMBOL 
FIVE-LINE STAFF would be "\uD834\uDD1A". Ugh! Or, I could just call a function 
and live with it not being a constant. It would be nice if there were an escape 
sequence that allowed any Unicode code point, including supplementary 
characters, to be put to n a string literal.

s'marks

> On Feb 16, 2018, at 18:02, Brian Goetz  wrote:
> 
> Disagree.  
> 
> On #3, most of the time the char being repeated is already a literal.  So 
> just make it a string.  
> 
> On #2, better to aim for string.ofCodePoint(int) and compose w repeat.  
> 
> Down to one method again :)
> 
> Sent from my MacBook Wheel
> 
>> On Feb 16, 2018, at 5:13 PM, Stuart Marks  wrote:
>> 
>> Let me put in an argument for handling code points:
>> 
>>> 3. public static String repeat(final int codepoint, final int count)
>> 
>> Most of the String and Character API handles code points on an equal footing 
>> with chars. I think this is important, as over time Unicode is continuing to 
>> add supplementary characters -- those that can't be represented in a Java 
>> char value. Examples abound of how such characters are mishandled. 
>> Therefore, I believe Java APIs should have full support for code points.
>> 
>> This is a small thing, and some might consider it a rare case -- how often 
>> does one need to repeat something like an emoji? The issue however isn't 
>> that particular use case. Instead what's required is the ability to handle 
>> *any Unicode character* uniformly, regardless of whether or not it's a 
>> supplementary character. The way to do that is to deal with code points, so 
>> any Java API that deals with character data must also handle code points.
>> 
>> If we were to add just one method:
>> 
>>> 1. public String repeat(final int count)
>> 
>> the workaround is to take the character, turn it into a string, and call the 
>> repeat() method on it. For a 'char' value, this isn't too bad, but I'd argue 
>> it isn't pretty either:
>> 
>>   Character.toString(charVal).repeat(n)
>> 
>> But this only handles BMP characters, not supplementary characters. 
>> Unfortunately, there's no direct way to turn a code point into a string -- 
>> you have to turn it into a byte array first! Thus, to get a string from a 
>> code point and repeat it, you have to do this:
>> 
>>   new String(Character.toChars(codepoint)).repeat(count)
>> 
>> This is enough indirection that it's hard to discover, and I suspect that 
>> most people won't put in the effort to do this correctly, resulting in more 
>> code that mishandles supplementary characters.
>> 
>> Thus, I think we need to add API #3 that performs the repeat function on 
>> code points.
>> 
>> (Hm, the lack of Character.toString(codepoint) is covered by JDK-4993841, 
>> which is closed. I think I'll reopen it.)
>> 
>>> 2. public static String repeat(final char ch, final int count)
>> 
>> I can see that this API is not as important as one that handles code points, 
>> and it seems to be less frequently used according to Louis W's analysis. But 
>> if you have char data you want to repeat, not having this seems like an 
>> omission; it seems backwards to have to create a string from the char, only 
>> for repeat() to extract that char from that String in order to repeat it. 
>> Thus I've vote for inclusion of this method as well.
>> 
>> s'marks
>> 
>> 
>>> On 2/16/18 5:10 AM, Jim Laskey wrote:
>>> We’re going with the one instance method (Louis clinched it.) with 
>>> recommended enhancements and not touching CharSequence.
>>> Working it up now.
>>> — Jim
 On Feb 16, 2018, at 7:46 AM, Alan Bateman  wrote:
 
 On 15/02/2018 17:20, Jim Laskey wrote:
> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
> 
> The proposal is to introduce four new methods;
> 
> 1. public String repeat(final int count)
> 2. public static String repeat(final char ch, final int count)
> 3. public static String repeat(final int codepoint, final int count)
> 4. public static String repeat(final CharSequence seq, final int count)
> 
 Just catching up on this thread and it's hard to see where the bidding is 
 currently at. Are you planning to send an updated proposal, a list of 
 methods is fine, even if it's just one, is okay 

Re: RFR: 8197594 - String and character repeat

2018-02-17 Thread Brian Goetz

> On Feb 17, 2018, at 9:40 AM, Martin Buchholz  wrote:
> 
> Let me join the chorus of agreement with Brian here.
> 
> The most popular use case will forever be ASCII line of non-letter symbols.

We may also wish to have a repeating version on StringBuilder, though:

sb.append(INDENT_CHARS, indentLevel)

(or a default method on Appendable with the same effect.)  

By way of background … my primary motivation for these sorts of methods is to 
take things that require execution as _statements_ (loops, if-then, etc) and 
turn them into _expressions_, not primarily because they are more compact, but 
because they are then _composible_.  Repeating a string today requires a loop 
(yes, I know you can do it with a stream expression), which means it can’t be 
done inline as a method parameter, requiring you to potentially unroll a deeply 
nested expression just to create a statement context to do this bit of 
paperwork.  




Re: RFR: 8197594 - String and character repeat

2018-02-17 Thread Martin Buchholz
Let me join the chorus of agreement with Brian here.

The most popular use case will forever be ASCII line of non-letter symbols.

Another way to think about codepoints is as conversion between UTF-32 and
UTF-16, or as general support for text in UTF-32 format, which is not
supported well within the JDK (and we probably should not do too much work
in this direction).

On Sat, Feb 17, 2018 at 1:50 AM, Remi Forax  wrote:

> - Mail original -
> > De: "Brian Goetz" 
>
> > I really can’t see the value of more than one method. If we need other
> forms
> > they should be for constructing strings not repeating strings.
> >
> > Sent from my MacBook Wheel
>
> I fully agree.
>
> Rémi


Re: RFR: 8197594 - String and character repeat

2018-02-17 Thread Remi Forax
- Mail original -
> De: "Brian Goetz" <brian.go...@oracle.com>
> À: "Xueming Shen" <xueming.s...@oracle.com>
> Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
> Envoyé: Samedi 17 Février 2018 03:30:10
> Objet: Re: RFR: 8197594 - String and character repeat

> I really can’t see the value of more than one method. If we need other forms
> they should be for constructing strings not repeating strings.
> 
> Sent from my MacBook Wheel

I fully agree.

Rémi

> 
>> On Feb 16, 2018, at 6:12 PM, Xueming Shen <xueming.s...@oracle.com> wrote:
>> 
>>> On 2/16/18, 5:13 PM, Stuart Marks wrote:
>>> Let me put in an argument for handling code points:
>>> 
>>>> 3. public static String repeat(final int codepoint, final int count)
>>> 
>>> Most of the String and Character API handles code points on an equal footing
>>> with chars. I think this is important, as over time Unicode is continuing to
>>> add supplementary characters -- those that can't be represented in a Java 
>>> char
>>> value. Examples abound of how such characters are mishandled. Therefore, I
>>> believe Java APIs should have full support for code points.
>>> 
>>> This is a small thing, and some might consider it a rare case -- how often 
>>> does
>>> one need to repeat something like an emoji? The issue however isn't that
>>> particular use case. Instead what's required is the ability to handle *any
>>> Unicode character* uniformly, regardless of whether or not it's a 
>>> supplementary
>>> character. The way to do that is to deal with code points, so any Java API 
>>> that
>>> deals with character data must also handle code points.
>>> 
>>> If we were to add just one method:
>>> 
>>>> 1. public String repeat(final int count)
>>> 
>>> the workaround is to take the character, turn it into a string, and call the
>>> repeat() method on it. For a 'char' value, this isn't too bad, but I'd 
>>> argue it
>>> isn't pretty either:
>>> 
>>>Character.toString(charVal).repeat(n)
>> 
>> How about
>> 
>> public static repeat(int count, char... chars)?
>> 
>> String.repeat(100, '*');
>> String.repeat(100, 'x', 'y');
>> 
>> and it should not be too bad and kinda consistent to have
>> 
>> String.repeat(n,  Character.toChars(0x12345));
>> 
> > -sherman


Re: RFR: 8197594 - String and character repeat

2018-02-16 Thread Brian Goetz
I really can’t see the value of more than one method. If we need other forms 
they should be for constructing strings not repeating strings.  

Sent from my MacBook Wheel

> On Feb 16, 2018, at 6:12 PM, Xueming Shen  wrote:
> 
>> On 2/16/18, 5:13 PM, Stuart Marks wrote:
>> Let me put in an argument for handling code points:
>> 
>>> 3. public static String repeat(final int codepoint, final int count)
>> 
>> Most of the String and Character API handles code points on an equal footing 
>> with chars. I think this is important, as over time Unicode is continuing to 
>> add supplementary characters -- those that can't be represented in a Java 
>> char value. Examples abound of how such characters are mishandled. 
>> Therefore, I believe Java APIs should have full support for code points.
>> 
>> This is a small thing, and some might consider it a rare case -- how often 
>> does one need to repeat something like an emoji? The issue however isn't 
>> that particular use case. Instead what's required is the ability to handle 
>> *any Unicode character* uniformly, regardless of whether or not it's a 
>> supplementary character. The way to do that is to deal with code points, so 
>> any Java API that deals with character data must also handle code points.
>> 
>> If we were to add just one method:
>> 
>>> 1. public String repeat(final int count)
>> 
>> the workaround is to take the character, turn it into a string, and call the 
>> repeat() method on it. For a 'char' value, this isn't too bad, but I'd argue 
>> it isn't pretty either:
>> 
>>Character.toString(charVal).repeat(n)
> 
> How about
> 
> public static repeat(int count, char... chars)?
> 
> String.repeat(100, '*');
> String.repeat(100, 'x', 'y');
> 
> and it should not be too bad and kinda consistent to have
> 
> String.repeat(n,  Character.toChars(0x12345));
> 
> -sherman



Re: RFR: 8197594 - String and character repeat

2018-02-16 Thread Xueming Shen

On 2/16/18, 5:13 PM, Stuart Marks wrote:

Let me put in an argument for handling code points:


3. public static String repeat(final int codepoint, final int count)


Most of the String and Character API handles code points on an equal 
footing with chars. I think this is important, as over time Unicode is 
continuing to add supplementary characters -- those that can't be 
represented in a Java char value. Examples abound of how such 
characters are mishandled. Therefore, I believe Java APIs should have 
full support for code points.


This is a small thing, and some might consider it a rare case -- how 
often does one need to repeat something like an emoji? The issue 
however isn't that particular use case. Instead what's required is the 
ability to handle *any Unicode character* uniformly, regardless of 
whether or not it's a supplementary character. The way to do that is 
to deal with code points, so any Java API that deals with character 
data must also handle code points.


If we were to add just one method:


1. public String repeat(final int count)


the workaround is to take the character, turn it into a string, and 
call the repeat() method on it. For a 'char' value, this isn't too 
bad, but I'd argue it isn't pretty either:


Character.toString(charVal).repeat(n)


How about

public static repeat(int count, char... chars)?

String.repeat(100, '*');
String.repeat(100, 'x', 'y');

and it should not be too bad and kinda consistent to have

String.repeat(n,  Character.toChars(0x12345));

-sherman


Re: RFR: 8197594 - String and character repeat

2018-02-16 Thread Brian Goetz
Disagree.  

On #3, most of the time the char being repeated is already a literal.  So just 
make it a string.  

On #2, better to aim for string.ofCodePoint(int) and compose w repeat.  

Down to one method again :)

Sent from my MacBook Wheel

> On Feb 16, 2018, at 5:13 PM, Stuart Marks  wrote:
> 
> Let me put in an argument for handling code points:
> 
>> 3. public static String repeat(final int codepoint, final int count)
> 
> Most of the String and Character API handles code points on an equal footing 
> with chars. I think this is important, as over time Unicode is continuing to 
> add supplementary characters -- those that can't be represented in a Java 
> char value. Examples abound of how such characters are mishandled. Therefore, 
> I believe Java APIs should have full support for code points.
> 
> This is a small thing, and some might consider it a rare case -- how often 
> does one need to repeat something like an emoji? The issue however isn't that 
> particular use case. Instead what's required is the ability to handle *any 
> Unicode character* uniformly, regardless of whether or not it's a 
> supplementary character. The way to do that is to deal with code points, so 
> any Java API that deals with character data must also handle code points.
> 
> If we were to add just one method:
> 
>> 1. public String repeat(final int count)
> 
> the workaround is to take the character, turn it into a string, and call the 
> repeat() method on it. For a 'char' value, this isn't too bad, but I'd argue 
> it isn't pretty either:
> 
>Character.toString(charVal).repeat(n)
> 
> But this only handles BMP characters, not supplementary characters. 
> Unfortunately, there's no direct way to turn a code point into a string -- 
> you have to turn it into a byte array first! Thus, to get a string from a 
> code point and repeat it, you have to do this:
> 
>new String(Character.toChars(codepoint)).repeat(count)
> 
> This is enough indirection that it's hard to discover, and I suspect that 
> most people won't put in the effort to do this correctly, resulting in more 
> code that mishandles supplementary characters.
> 
> Thus, I think we need to add API #3 that performs the repeat function on code 
> points.
> 
> (Hm, the lack of Character.toString(codepoint) is covered by JDK-4993841, 
> which is closed. I think I'll reopen it.)
> 
>> 2. public static String repeat(final char ch, final int count)
> 
> I can see that this API is not as important as one that handles code points, 
> and it seems to be less frequently used according to Louis W's analysis. But 
> if you have char data you want to repeat, not having this seems like an 
> omission; it seems backwards to have to create a string from the char, only 
> for repeat() to extract that char from that String in order to repeat it. 
> Thus I've vote for inclusion of this method as well.
> 
> s'marks
> 
> 
>> On 2/16/18 5:10 AM, Jim Laskey wrote:
>> We’re going with the one instance method (Louis clinched it.) with 
>> recommended enhancements and not touching CharSequence.
>> Working it up now.
>> — Jim
>>> On Feb 16, 2018, at 7:46 AM, Alan Bateman  wrote:
>>> 
>>> On 15/02/2018 17:20, Jim Laskey wrote:
 This is a pre-CSR code review [1] for String repeat methods (Enhancement).
 
 The proposal is to introduce four new methods;
 
 1. public String repeat(final int count)
 2. public static String repeat(final char ch, final int count)
 3. public static String repeat(final int codepoint, final int count)
 4. public static String repeat(final CharSequence seq, final int count)
 
>>> Just catching up on this thread and it's hard to see where the bidding is 
>>> currently at. Are you planning to send an updated proposal, a list of 
>>> methods is fine, even if it's just one, is okay (implementation can follow 
>>> later).
>>> 
>>> -Alan
> 



Re: RFR: 8197594 - String and character repeat

2018-02-16 Thread Stuart Marks

Let me put in an argument for handling code points:


3. public static String repeat(final int codepoint, final int count)


Most of the String and Character API handles code points on an equal footing 
with chars. I think this is important, as over time Unicode is continuing to add 
supplementary characters -- those that can't be represented in a Java char 
value. Examples abound of how such characters are mishandled. Therefore, I 
believe Java APIs should have full support for code points.


This is a small thing, and some might consider it a rare case -- how often does 
one need to repeat something like an emoji? The issue however isn't that 
particular use case. Instead what's required is the ability to handle *any 
Unicode character* uniformly, regardless of whether or not it's a supplementary 
character. The way to do that is to deal with code points, so any Java API that 
deals with character data must also handle code points.


If we were to add just one method:


1. public String repeat(final int count)


the workaround is to take the character, turn it into a string, and call the 
repeat() method on it. For a 'char' value, this isn't too bad, but I'd argue it 
isn't pretty either:


Character.toString(charVal).repeat(n)

But this only handles BMP characters, not supplementary characters. 
Unfortunately, there's no direct way to turn a code point into a string -- you 
have to turn it into a byte array first! Thus, to get a string from a code point 
and repeat it, you have to do this:


new String(Character.toChars(codepoint)).repeat(count)

This is enough indirection that it's hard to discover, and I suspect that most 
people won't put in the effort to do this correctly, resulting in more code that 
mishandles supplementary characters.


Thus, I think we need to add API #3 that performs the repeat function on code 
points.


(Hm, the lack of Character.toString(codepoint) is covered by JDK-4993841, which 
is closed. I think I'll reopen it.)



2. public static String repeat(final char ch, final int count)


I can see that this API is not as important as one that handles code points, and 
it seems to be less frequently used according to Louis W's analysis. But if you 
have char data you want to repeat, not having this seems like an omission; it 
seems backwards to have to create a string from the char, only for repeat() to 
extract that char from that String in order to repeat it. Thus I've vote for 
inclusion of this method as well.


s'marks


On 2/16/18 5:10 AM, Jim Laskey wrote:

We’re going with the one instance method (Louis clinched it.) with recommended 
enhancements and not touching CharSequence.

Working it up now.

— Jim


On Feb 16, 2018, at 7:46 AM, Alan Bateman  wrote:

On 15/02/2018 17:20, Jim Laskey wrote:

This is a pre-CSR code review [1] for String repeat methods (Enhancement).

The proposal is to introduce four new methods;

1. public String repeat(final int count)
2. public static String repeat(final char ch, final int count)
3. public static String repeat(final int codepoint, final int count)
4. public static String repeat(final CharSequence seq, final int count)


Just catching up on this thread and it's hard to see where the bidding is 
currently at. Are you planning to send an updated proposal, a list of methods 
is fine, even if it's just one, is okay (implementation can follow later).

-Alan




Re: RFR: 8197594 - String and character repeat

2018-02-16 Thread Jim Laskey
We’re going with the one instance method (Louis clinched it.) with recommended 
enhancements and not touching CharSequence.

Working it up now.

— Jim

> On Feb 16, 2018, at 7:46 AM, Alan Bateman  wrote:
> 
> On 15/02/2018 17:20, Jim Laskey wrote:
>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>> 
>> The proposal is to introduce four new methods;
>> 
>> 1. public String repeat(final int count)
>> 2. public static String repeat(final char ch, final int count)
>> 3. public static String repeat(final int codepoint, final int count)
>> 4. public static String repeat(final CharSequence seq, final int count)
>> 
> Just catching up on this thread and it's hard to see where the bidding is 
> currently at. Are you planning to send an updated proposal, a list of methods 
> is fine, even if it's just one, is okay (implementation can follow later).
> 
> -Alan
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 



Re: RFR: 8197594 - String and character repeat

2018-02-16 Thread Alan Bateman

On 15/02/2018 21:59, Joseph D. Darcy wrote:

:

My general recommendation if the code review and CSR review are to be 
serialized is to estimate which one is more likely to generate 
feedback that might modify the proposal and run though that process 
first. Since there has already been feedback leading to updates of 
this proposal, code review first in this case seems like a reasonable 
choice :-)
I agree, esp. for API additions where it is highly likely there will be 
several iterations as the API is discussed. It's pointless finalizing a 
CSR without getting wider feedback first.


-Alan.


Re: RFR: 8197594 - String and character repeat

2018-02-16 Thread Alan Bateman

On 15/02/2018 17:20, Jim Laskey wrote:

This is a pre-CSR code review [1] for String repeat methods (Enhancement).

The proposal is to introduce four new methods;

1. public String repeat(final int count)
2. public static String repeat(final char ch, final int count)
3. public static String repeat(final int codepoint, final int count)
4. public static String repeat(final CharSequence seq, final int count)

Just catching up on this thread and it's hard to see where the bidding 
is currently at. Are you planning to send an updated proposal, a list of 
methods is fine, even if it's just one, is okay (implementation can 
follow later).


-Alan












Re: RFR: 8197594 - String and character repeat

2018-02-16 Thread Stephen Colebourne
On 15 February 2018 at 20:52, Louis Wasserman  wrote:
> Based on this data, I'd recommend providing one and only one method of this
> type: String.repeat(int).

Only adding the one instance method seems like the best plan in this case.
Stephen


Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Joseph D. Darcy


On 2/15/2018 12:38 PM, Roger Riggs wrote:

Hi Jim,

Its cleaner to do the API (CSR) review before and without the 
implementation.

It helps keep the issues separate.

Don't make statements about count == Integer.MAX_VALUE / 2.
There is no point, unless it should throw IAE.



My general recommendation if the code review and CSR review are to be 
serialized is to estimate which one is more likely to generate feedback 
that might modify the proposal and run though that process first. Since 
there has already been feedback leading to updates of this proposal, 
code review first in this case seems like a reasonable choice :-)


Developers at their discretion can run CSR and code review in parallel, 
but feedback may come from either process. Proposals still under 
development can also go through the two-phase CSR process to get some 
initial feedback before an intended-to-be-final version is produced.


HTH,

-Joe


Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Brian Goetz
This matches my intuition too.  

Sent from my MacBook Wheel

> On Feb 15, 2018, at 12:52 PM, Louis Wasserman  wrote:
> 
> I don't think there's a case for demand to merit having a
> repeat(CharSequence, int) at all.
> 
> I did an analysis of usages of Guava's Strings.repeat on Google's
> codebase.  Users might be rolling their own implementations, too, but this
> should be a very good proxy for demand.
> 
> StringRepeat_SingleConstantChar = 4.475 K // strings with .length() ==  1
> StringRepeat_SingleConstantCodePoint = 28 // strings with
> .codePointCount(...) == 1
> StringRepeat_MultiCodePointConstant = 1.156 K // constant strings neither
> of the above
> StringRepeat_CharSequenceToString = 2 //
> Strings.repeat(CharSequence.toString(), n)
> StringRepeat_NoneOfTheAbove = 248
> 
> Notably, it seems like basically nobody needs to repeat a CharSequence --
> definitely not enough demand to merit the awkwardness of e.g.
> Rope.repeat(n) inheriting a repeat returning a String.
> 
> Based on this data, I'd recommend providing one and only one method of this
> type: String.repeat(int).  There's no real advantage to a static
> repeat(char, int) method when the overwhelming majority of these are
> constants: e.g. compare SomeUtilClass.repeat('*', n) versus "*".repeat(n).
> Character.toString(c).repeat(n) isn't a bad workaround if you don't have a
> constant char.  There also isn't much demand for dealing with the code
> point case specially; the String.repeat(int) method seems like it'd handle
> that just fine.
> 
>> On Thu, Feb 15, 2018 at 11:44 AM Jim Laskey  wrote:
>> 
>> 
>> 
>>> On Feb 15, 2018, at 3:36 PM, Ivan Gerasimov 
>> wrote:
>>> 
>>> Hello!
>>> 
>>> The link with the webrev returned 404, but I could find it at this
>> location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/
>>> 
>>> A few minor comments:
>>> 
>>> 1)
>>> 
>>> This check:
>>> 
>>> 2992 final long limit = (long)count * 2L;
>>> 2993 if ((long)Integer.MAX_VALUE < limit) {
>>> 
>>> can be possibly simplified as
>>> if (count > Integer.MAX_VALUE - count) {
>> 
>> Good.
>> 
>>> 
>>> 2)
>>> Should String repeat(final int codepoint, final int count) be optimized
>> for codepoints that can be represented with a single char?
>>> 
>>> E.g. like this:
>>> 
>>> public static String repeat(final int codepoint, final int count) {
>>>   return Character.isBmpCodePoint(codepoint))
>>>   ? repeat((char) codepoint, count)
>>>   : (new String(Character.toChars(codepoint))).repeat(count);
>>> }
>> 
>> Yes, avoid array allocation.
>> 
>>> 
>>> 3)
>>> Using long arithmetic can possibly be avoided in the common path of
>> repeat(final int count):
>>> 
>>> E.g. like this:
>>> 
>>>if (count < 0) {
>>>throw new IllegalArgumentException("count is negative, " +
>> count);
>>>} else if (count == 1) {
>>>return this;
>>>} else if (count == 0) {
>>>return "";
>>> }
>>>final int len = value.length;
>>>if (Integer.MAX_VALUE / count < len) {
>>>throw new IllegalArgumentException(
>>>"Resulting string exceeds maximum string length: " +
>> ((long)len * (long)count));
>>>}
>>>final int limit = count * len;
>> 
>> Good.
>> 
>> Thank you.
>> 
>>> 
>>> With kind regards,
>>> Ivan
>>> 
 On 2/15/18 9:20 AM, Jim Laskey wrote:
 This is a pre-CSR code review [1] for String repeat methods
>> (Enhancement).
 
 The proposal is to introduce four new methods;
 
 1. public String repeat(final int count)
 2. public static String repeat(final char ch, final int count)
 3. public static String repeat(final int codepoint, final int count)
 4. public static String repeat(final CharSequence seq, final int count)
 
 For the sake of transparency, only 1 is necessary, 2-4 are convenience
>> methods.
 In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’,
>> 10).
 3 and 4 convert to String before calling 1.
 
 Performance runs with jmh (results as comment in [2]) show that these
 methods are significantly faster that StringBuilder equivalents.
 - fewer memory allocations
 - fewer char to byte array conversions
 - faster pyramid replication vs O(N) copying
 
 I left StringBuilder out of scope. It falls under the category of
 Appendables#append with repeat. A much bigger project.
 
 All comments welcome. Especially around the need for convenience
 methods, the JavaDoc content and expanding the tests.
 
 — Jim
 
 [1] webrev:
>> http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
 [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
 
 
>>> 
>>> --
>>> With kind regards,
>>> Ivan Gerasimov
>>> 
>> 
>> 



Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread James Laskey
Good to have the data. Thank you Louis. 

Sent from my iPhone

> On Feb 15, 2018, at 4:52 PM, Louis Wasserman  wrote:
> 
> I don't think there's a case for demand to merit having a 
> repeat(CharSequence, int) at all.  
> 
> I did an analysis of usages of Guava's Strings.repeat on Google's codebase.  
> Users might be rolling their own implementations, too, but this should be a 
> very good proxy for demand.
> 
> StringRepeat_SingleConstantChar   = 4.475 K // strings with .length() ==  
> 1
> StringRepeat_SingleConstantCodePoint  = 28 // strings with 
> .codePointCount(...) == 1
> StringRepeat_MultiCodePointConstant   = 1.156 K // constant strings neither 
> of the above
> StringRepeat_CharSequenceToString = 2 // 
> Strings.repeat(CharSequence.toString(), n)
> StringRepeat_NoneOfTheAbove   = 248
> 
> Notably, it seems like basically nobody needs to repeat a CharSequence -- 
> definitely not enough demand to merit the awkwardness of e.g. Rope.repeat(n) 
> inheriting a repeat returning a String.
> 
> Based on this data, I'd recommend providing one and only one method of this 
> type: String.repeat(int).  There's no real advantage to a static repeat(char, 
> int) method when the overwhelming majority of these are constants: e.g. 
> compare SomeUtilClass.repeat('*', n) versus "*".repeat(n).  
> Character.toString(c).repeat(n) isn't a bad workaround if you don't have a 
> constant char.  There also isn't much demand for dealing with the code point 
> case specially; the String.repeat(int) method seems like it'd handle that 
> just fine.
> 
>> On Thu, Feb 15, 2018 at 11:44 AM Jim Laskey  wrote:
>> 
>> 
>> > On Feb 15, 2018, at 3:36 PM, Ivan Gerasimov  
>> > wrote:
>> >
>> > Hello!
>> >
>> > The link with the webrev returned 404, but I could find it at this 
>> > location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/
>> >
>> > A few minor comments:
>> >
>> > 1)
>> >
>> > This check:
>> >
>> > 2992 final long limit = (long)count * 2L;
>> > 2993 if ((long)Integer.MAX_VALUE < limit) {
>> >
>> > can be possibly simplified as
>> > if (count > Integer.MAX_VALUE - count) {
>> 
>> Good.
>> 
>> >
>> > 2)
>> > Should String repeat(final int codepoint, final int count) be optimized 
>> > for codepoints that can be represented with a single char?
>> >
>> > E.g. like this:
>> >
>> > public static String repeat(final int codepoint, final int count) {
>> >return Character.isBmpCodePoint(codepoint))
>> >? repeat((char) codepoint, count)
>> >: (new String(Character.toChars(codepoint))).repeat(count);
>> > }
>> 
>> Yes, avoid array allocation.
>> 
>> >
>> > 3)
>> > Using long arithmetic can possibly be avoided in the common path of 
>> > repeat(final int count):
>> >
>> > E.g. like this:
>> >
>> > if (count < 0) {
>> > throw new IllegalArgumentException("count is negative, " + 
>> > count);
>> > } else if (count == 1) {
>> > return this;
>> > } else if (count == 0) {
>> > return "";
>> > }
>> > final int len = value.length;
>> > if (Integer.MAX_VALUE / count < len) {
>> > throw new IllegalArgumentException(
>> > "Resulting string exceeds maximum string length: " + 
>> > ((long)len * (long)count));
>> > }
>> > final int limit = count * len;
>> 
>> Good.
>> 
>> Thank you.
>> 
>> >
>> > With kind regards,
>> > Ivan
>> >
>> > On 2/15/18 9:20 AM, Jim Laskey wrote:
>> >> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>> >>
>> >> The proposal is to introduce four new methods;
>> >>
>> >> 1. public String repeat(final int count)
>> >> 2. public static String repeat(final char ch, final int count)
>> >> 3. public static String repeat(final int codepoint, final int count)
>> >> 4. public static String repeat(final CharSequence seq, final int count)
>> >>
>> >> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>> >> methods.
>> >> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 
>> >> 10).
>> >> 3 and 4 convert to String before calling 1.
>> >>
>> >> Performance runs with jmh (results as comment in [2]) show that these
>> >> methods are significantly faster that StringBuilder equivalents.
>> >>  - fewer memory allocations
>> >>  - fewer char to byte array conversions
>> >>  - faster pyramid replication vs O(N) copying
>> >>
>> >> I left StringBuilder out of scope. It falls under the category of
>> >> Appendables#append with repeat. A much bigger project.
>> >>
>> >> All comments welcome. Especially around the need for convenience
>> >> methods, the JavaDoc content and expanding the tests.
>> >>
>> >> — Jim
>> >>
>> >> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>> >> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>> >>
>> >>
>> >
>> > --
>> > With kind regards,
>> > Ivan 

Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Louis Wasserman
I don't think there's a case for demand to merit having a
repeat(CharSequence, int) at all.

I did an analysis of usages of Guava's Strings.repeat on Google's
codebase.  Users might be rolling their own implementations, too, but this
should be a very good proxy for demand.

StringRepeat_SingleConstantChar = 4.475 K // strings with .length() ==  1
StringRepeat_SingleConstantCodePoint = 28 // strings with
.codePointCount(...) == 1
StringRepeat_MultiCodePointConstant = 1.156 K // constant strings neither
of the above
StringRepeat_CharSequenceToString = 2 //
Strings.repeat(CharSequence.toString(), n)
StringRepeat_NoneOfTheAbove = 248

Notably, it seems like basically nobody needs to repeat a CharSequence --
definitely not enough demand to merit the awkwardness of e.g.
Rope.repeat(n) inheriting a repeat returning a String.

Based on this data, I'd recommend providing one and only one method of this
type: String.repeat(int).  There's no real advantage to a static
repeat(char, int) method when the overwhelming majority of these are
constants: e.g. compare SomeUtilClass.repeat('*', n) versus "*".repeat(n).
Character.toString(c).repeat(n) isn't a bad workaround if you don't have a
constant char.  There also isn't much demand for dealing with the code
point case specially; the String.repeat(int) method seems like it'd handle
that just fine.

On Thu, Feb 15, 2018 at 11:44 AM Jim Laskey  wrote:

>
>
> > On Feb 15, 2018, at 3:36 PM, Ivan Gerasimov 
> wrote:
> >
> > Hello!
> >
> > The link with the webrev returned 404, but I could find it at this
> location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/
> >
> > A few minor comments:
> >
> > 1)
> >
> > This check:
> >
> > 2992 final long limit = (long)count * 2L;
> > 2993 if ((long)Integer.MAX_VALUE < limit) {
> >
> > can be possibly simplified as
> > if (count > Integer.MAX_VALUE - count) {
>
> Good.
>
> >
> > 2)
> > Should String repeat(final int codepoint, final int count) be optimized
> for codepoints that can be represented with a single char?
> >
> > E.g. like this:
> >
> > public static String repeat(final int codepoint, final int count) {
> >return Character.isBmpCodePoint(codepoint))
> >? repeat((char) codepoint, count)
> >: (new String(Character.toChars(codepoint))).repeat(count);
> > }
>
> Yes, avoid array allocation.
>
> >
> > 3)
> > Using long arithmetic can possibly be avoided in the common path of
> repeat(final int count):
> >
> > E.g. like this:
> >
> > if (count < 0) {
> > throw new IllegalArgumentException("count is negative, " +
> count);
> > } else if (count == 1) {
> > return this;
> > } else if (count == 0) {
> > return "";
> > }
> > final int len = value.length;
> > if (Integer.MAX_VALUE / count < len) {
> > throw new IllegalArgumentException(
> > "Resulting string exceeds maximum string length: " +
> ((long)len * (long)count));
> > }
> > final int limit = count * len;
>
> Good.
>
> Thank you.
>
> >
> > With kind regards,
> > Ivan
> >
> > On 2/15/18 9:20 AM, Jim Laskey wrote:
> >> This is a pre-CSR code review [1] for String repeat methods
> (Enhancement).
> >>
> >> The proposal is to introduce four new methods;
> >>
> >> 1. public String repeat(final int count)
> >> 2. public static String repeat(final char ch, final int count)
> >> 3. public static String repeat(final int codepoint, final int count)
> >> 4. public static String repeat(final CharSequence seq, final int count)
> >>
> >> For the sake of transparency, only 1 is necessary, 2-4 are convenience
> methods.
> >> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’,
> 10).
> >> 3 and 4 convert to String before calling 1.
> >>
> >> Performance runs with jmh (results as comment in [2]) show that these
> >> methods are significantly faster that StringBuilder equivalents.
> >>  - fewer memory allocations
> >>  - fewer char to byte array conversions
> >>  - faster pyramid replication vs O(N) copying
> >>
> >> I left StringBuilder out of scope. It falls under the category of
> >> Appendables#append with repeat. A much bigger project.
> >>
> >> All comments welcome. Especially around the need for convenience
> >> methods, the JavaDoc content and expanding the tests.
> >>
> >> — Jim
> >>
> >> [1] webrev:
> http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
> >> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
> >>
> >>
> >
> > --
> > With kind regards,
> > Ivan Gerasimov
> >
>
>


Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Jim Laskey
Thank you Roger.

> On Feb 15, 2018, at 4:38 PM, Roger Riggs <roger.ri...@oracle.com> wrote:
> 
> Hi Jim,
> 
> Its cleaner to do the API (CSR) review before and without the implementation.
> It helps keep the issues separate.

This was on advice from a member of the core libs team. He can speak up if he 
wants.

> 
> Don't make statements about count == Integer.MAX_VALUE / 2.
> There is no point, unless it should throw IAE.
> 
> On 2/15/2018 3:16 PM, Jim Laskey wrote:
>>> On Feb 15, 2018, at 4:04 PM, Remi Forax <fo...@univ-mlv.fr> wrote:
>>> 
>>> I'm not sure we need 4, it's just a convenient method that may be slower 
>>> than if the user code calls toString() (because of profile pollution),
>>> so i'm not sure i pull it's own weight.
>>> 
>>> And about adding a default method into CharSequence, the default method 
>>> should return a CharSequence or String ?
>> If you mean each class returns an instance of its class, I think that 
>> overlaps Appendable..
> Beware thinking about default methods on interfaces; the dragons will get you.
> Recently, this was discussed in the context of Reader.transferTo.
>>> and what about the other implementations, AbstractStringBuilder and 
>>> CharBuffer at least ?
>> This falls into the Appendable.append( T t, int count) realm mentioned 
>> originally.
>> 
>> Long term this could be a goal, and maybe defaulting CharSequence#repeat 
>> returning a string would be shortsighted.
>> 
>> But, I think having instance String#repeat returning a CharSequence would 
>> limit its use (methods expecting strings.)
> There is no point in returning CharSequence, a String is a CharSequence and 
> can be used anywhere
> a CharSequence can.
> 
> $.02, Roger
> 
>> 
>> — Jim
>> 
>>> Rémi
>>> 
>>> - Mail original -
>>>> De: "Jim Laskey" <james.las...@oracle.com>
>>>> À: "Brian Goetz" <brian.go...@oracle.com>
>>>> Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
>>>> Envoyé: Jeudi 15 Février 2018 18:34:19
>>>> Objet: Re: RFR: 8197594 - String and character repeat
>>>> Very reasonable approach.
>>>> 
>>>> 
>>>>> On Feb 15, 2018, at 1:31 PM, Brian Goetz <brian.go...@oracle.com> wrote:
>>>>> 
>>>>> I suggest merging 1 and 4 by making it an instance method on CS with a 
>>>>> default
>>>>> in CS and an override on string and string builder?
>>>>> 
>>>>> Sent from my MacBook Wheel
>>>>> 
>>>>>> On Feb 15, 2018, at 9:20 AM, Jim Laskey <james.las...@oracle.com> wrote:
>>>>>> 
>>>>>> This is a pre-CSR code review [1] for String repeat methods 
>>>>>> (Enhancement).
>>>>>> 
>>>>>> The proposal is to introduce four new methods;
>>>>>> 
>>>>>> 1. public String repeat(final int count)
>>>>>> 2. public static String repeat(final char ch, final int count)
>>>>>> 3. public static String repeat(final int codepoint, final int count)
>>>>>> 4. public static String repeat(final CharSequence seq, final int count)
>>>>>> 
>>>>>> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>>>>>> methods.
>>>>>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 
>>>>>> 10).
>>>>>> 3 and 4 convert to String before calling 1.
>>>>>> 
>>>>>> Performance runs with jmh (results as comment in [2]) show that these
>>>>>> methods are significantly faster that StringBuilder equivalents.
>>>>>> - fewer memory allocations
>>>>>> - fewer char to byte array conversions
>>>>>> - faster pyramid replication vs O(N) copying
>>>>>> 
>>>>>> I left StringBuilder out of scope. It falls under the category of
>>>>>> Appendables#append with repeat. A much bigger project.
>>>>>> 
>>>>>> All comments welcome. Especially around the need for convenience
>>>>>> methods, the JavaDoc content and expanding the tests.
>>>>>> 
>>>>>> — Jim
>>>>>> 
>>>>>> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>>>>>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>>>>>> 
> 



Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Roger Riggs

Hi Jim,

Its cleaner to do the API (CSR) review before and without the 
implementation.

It helps keep the issues separate.

Don't make statements about count == Integer.MAX_VALUE / 2.
There is no point, unless it should throw IAE.

On 2/15/2018 3:16 PM, Jim Laskey wrote:

On Feb 15, 2018, at 4:04 PM, Remi Forax <fo...@univ-mlv.fr> wrote:

I'm not sure we need 4, it's just a convenient method that may be slower than 
if the user code calls toString() (because of profile pollution),
so i'm not sure i pull it's own weight.

And about adding a default method into CharSequence, the default method should 
return a CharSequence or String ?

If you mean each class returns an instance of its class, I think that overlaps 
Appendable..
Beware thinking about default methods on interfaces; the dragons will 
get you.

Recently, this was discussed in the context of Reader.transferTo.

and what about the other implementations, AbstractStringBuilder and CharBuffer 
at least ?

This falls into the Appendable.append( T t, int count) realm mentioned 
originally.

Long term this could be a goal, and maybe defaulting CharSequence#repeat 
returning a string would be shortsighted.

But, I think having instance String#repeat returning a CharSequence would limit 
its use (methods expecting strings.)
There is no point in returning CharSequence, a String is a CharSequence 
and can be used anywhere

a CharSequence can.

$.02, Roger



— Jim


Rémi

- Mail original -

De: "Jim Laskey" <james.las...@oracle.com>
À: "Brian Goetz" <brian.go...@oracle.com>
Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
Envoyé: Jeudi 15 Février 2018 18:34:19
Objet: Re: RFR: 8197594 - String and character repeat
Very reasonable approach.



On Feb 15, 2018, at 1:31 PM, Brian Goetz <brian.go...@oracle.com> wrote:

I suggest merging 1 and 4 by making it an instance method on CS with a default
in CS and an override on string and string builder?

Sent from my MacBook Wheel


On Feb 15, 2018, at 9:20 AM, Jim Laskey <james.las...@oracle.com> wrote:

This is a pre-CSR code review [1] for String repeat methods (Enhancement).

The proposal is to introduce four new methods;

1. public String repeat(final int count)
2. public static String repeat(final char ch, final int count)
3. public static String repeat(final int codepoint, final int count)
4. public static String repeat(final CharSequence seq, final int count)

For the sake of transparency, only 1 is necessary, 2-4 are convenience methods.
In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
3 and 4 convert to String before calling 1.

Performance runs with jmh (results as comment in [2]) show that these
methods are significantly faster that StringBuilder equivalents.
- fewer memory allocations
- fewer char to byte array conversions
- faster pyramid replication vs O(N) copying

I left StringBuilder out of scope. It falls under the category of
Appendables#append with repeat. A much bigger project.

All comments welcome. Especially around the need for convenience
methods, the JavaDoc content and expanding the tests.

— Jim

[1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
[2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594





Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread forax


- Mail original -
> De: "Jim Laskey" <james.las...@oracle.com>
> À: "Remi Forax" <fo...@univ-mlv.fr>
> Cc: "Brian Goetz" <brian.go...@oracle.com>, "core-libs-dev" 
> <core-libs-dev@openjdk.java.net>
> Envoyé: Jeudi 15 Février 2018 21:16:48
> Objet: Re: RFR: 8197594 - String and character repeat

>> On Feb 15, 2018, at 4:04 PM, Remi Forax <fo...@univ-mlv.fr> wrote:
>> 
>> I'm not sure we need 4, it's just a convenient method that may be slower 
>> than if
>> the user code calls toString() (because of profile pollution),
>> so i'm not sure i pull it's own weight.
>> 
>> And about adding a default method into CharSequence, the default method 
>> should
>> return a CharSequence or String ?
> 
> If you mean each class returns an instance of its class, I think that overlaps
> Appendable..

yes, very true

> 
>> and what about the other implementations, AbstractStringBuilder and 
>> CharBuffer
>> at least ?
> 
> This falls into the Appendable.append( T t, int count) realm mentioned
> originally.
> 
> Long term this could be a goal, and maybe defaulting CharSequence#repeat
> returning a string would be shortsighted.
> 
> But, I think having instance String#repeat returning a CharSequence would 
> limit
> its use (methods expecting strings.)

yes, i think we should play safe here and only add repeat to String.

> 
> — Jim

Rémi

> 
>> 
>> Rémi
>> 
>> - Mail original -
>>> De: "Jim Laskey" <james.las...@oracle.com>
>>> À: "Brian Goetz" <brian.go...@oracle.com>
>>> Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
>>> Envoyé: Jeudi 15 Février 2018 18:34:19
>>> Objet: Re: RFR: 8197594 - String and character repeat
>> 
>>> Very reasonable approach.
>>> 
>>> 
>>>> On Feb 15, 2018, at 1:31 PM, Brian Goetz <brian.go...@oracle.com> wrote:
>>>> 
>>>> I suggest merging 1 and 4 by making it an instance method on CS with a 
>>>> default
>>>> in CS and an override on string and string builder?
>>>> 
>>>> Sent from my MacBook Wheel
>>>> 
>>>>> On Feb 15, 2018, at 9:20 AM, Jim Laskey <james.las...@oracle.com> wrote:
>>>>> 
>>>>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>>>>> 
>>>>> The proposal is to introduce four new methods;
>>>>> 
>>>>> 1. public String repeat(final int count)
>>>>> 2. public static String repeat(final char ch, final int count)
>>>>> 3. public static String repeat(final int codepoint, final int count)
>>>>> 4. public static String repeat(final CharSequence seq, final int count)
>>>>> 
>>>>> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>>>>> methods.
>>>>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 
>>>>> 10).
>>>>> 3 and 4 convert to String before calling 1.
>>>>> 
>>>>> Performance runs with jmh (results as comment in [2]) show that these
>>>>> methods are significantly faster that StringBuilder equivalents.
>>>>> - fewer memory allocations
>>>>> - fewer char to byte array conversions
>>>>> - faster pyramid replication vs O(N) copying
>>>>> 
>>>>> I left StringBuilder out of scope. It falls under the category of
>>>>> Appendables#append with repeat. A much bigger project.
>>>>> 
>>>>> All comments welcome. Especially around the need for convenience
>>>>> methods, the JavaDoc content and expanding the tests.
>>>>> 
>>>>> — Jim
>>>>> 
>>>>> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>>>>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594


Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Jim Laskey
Thank you Remi.

Re final: Attila got into my head.

> On Feb 15, 2018, at 4:24 PM, Remi Forax <fo...@univ-mlv.fr> wrote:
> 
> - Mail original -
>> De: "Ivan Gerasimov" <ivan.gerasi...@oracle.com>
>> À: "Jim Laskey" <james.las...@oracle.com>, "core-libs-dev" 
>> <core-libs-dev@openjdk.java.net>
>> Envoyé: Jeudi 15 Février 2018 20:36:44
>> Objet: Re: RFR: 8197594 - String and character repeat
> 
>> Hello!
>> 
>> The link with the webrev returned 404, but I could find it at this
>> location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/
> 
> Jim,
> in static String repeat(char ch, int count) 
>  you can remove the 'else's because they are after a throw or a return as you 
> have done in the other methods.
> 
> in static String repeat(int codepoint, int count),
>  the parenthesis just after the return seems unnecessary,
> 
> in String repeat(int count)
>  at the end instead of testing isLatin1()  
>return new String(bytes, coder);
>  should be equivalent.
> 
> in replicate
>  you can return 'result' after the call to Arrays.fill, to signal to readers 
> that this is an optimization and that the primary code is below, it will also 
> avoid to shift the primary code to the right.
> 
> and nitpicking about the style, can you remove those pesky final, it's the 
> Nashorn style but not the JDK one :)
> 
> cheers,
> Rémi
> 
> 
>> On 2/15/18 9:20 AM, Jim Laskey wrote:
>>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>>> 
>>> The proposal is to introduce four new methods;
>>> 
>>> 1. public String repeat(final int count)
>>> 2. public static String repeat(final char ch, final int count)
>>> 3. public static String repeat(final int codepoint, final int count)
>>> 4. public static String repeat(final CharSequence seq, final int count)
>>> 
>>> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>>> methods.
>>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
>>> 3 and 4 convert to String before calling 1.
>>> 
>>> Performance runs with jmh (results as comment in [2]) show that these
>>> methods are significantly faster that StringBuilder equivalents.
>>>  - fewer memory allocations
>>>  - fewer char to byte array conversions
>>>  - faster pyramid replication vs O(N) copying
>>> 
>>> I left StringBuilder out of scope. It falls under the category of
>>> Appendables#append with repeat. A much bigger project.
>>> 
>>> All comments welcome. Especially around the need for convenience
>>> methods, the JavaDoc content and expanding the tests.
>>> 
>>> — Jim
>>> 
>>> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>>> 
>>> 
>> 
>> --
>> With kind regards,
>> Ivan Gerasimov



Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Remi Forax
- Mail original -
> De: "Ivan Gerasimov" <ivan.gerasi...@oracle.com>
> À: "Jim Laskey" <james.las...@oracle.com>, "core-libs-dev" 
> <core-libs-dev@openjdk.java.net>
> Envoyé: Jeudi 15 Février 2018 20:36:44
> Objet: Re: RFR: 8197594 - String and character repeat

> Hello!
> 
> The link with the webrev returned 404, but I could find it at this
> location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/

Jim,
in static String repeat(char ch, int count) 
  you can remove the 'else's because they are after a throw or a return as you 
have done in the other methods.

in static String repeat(int codepoint, int count),
  the parenthesis just after the return seems unnecessary,

in String repeat(int count)
  at the end instead of testing isLatin1()  
return new String(bytes, coder);
  should be equivalent.

in replicate
  you can return 'result' after the call to Arrays.fill, to signal to readers 
that this is an optimization and that the primary code is below, it will also 
avoid to shift the primary code to the right.

and nitpicking about the style, can you remove those pesky final, it's the 
Nashorn style but not the JDK one :)

cheers,
Rémi


> On 2/15/18 9:20 AM, Jim Laskey wrote:
>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>>
>> The proposal is to introduce four new methods;
>>
>> 1. public String repeat(final int count)
>> 2. public static String repeat(final char ch, final int count)
>> 3. public static String repeat(final int codepoint, final int count)
>> 4. public static String repeat(final CharSequence seq, final int count)
>>
>> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>> methods.
>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
>> 3 and 4 convert to String before calling 1.
>>
>> Performance runs with jmh (results as comment in [2]) show that these
>> methods are significantly faster that StringBuilder equivalents.
>>   - fewer memory allocations
>>   - fewer char to byte array conversions
>>   - faster pyramid replication vs O(N) copying
>>
>> I left StringBuilder out of scope. It falls under the category of
>> Appendables#append with repeat. A much bigger project.
>>
>> All comments welcome. Especially around the need for convenience
>> methods, the JavaDoc content and expanding the tests.
>>
>> — Jim
>>
>> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>>
>>
> 
> --
> With kind regards,
> Ivan Gerasimov


Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Jim Laskey


> On Feb 15, 2018, at 4:04 PM, Remi Forax <fo...@univ-mlv.fr> wrote:
> 
> I'm not sure we need 4, it's just a convenient method that may be slower than 
> if the user code calls toString() (because of profile pollution),
> so i'm not sure i pull it's own weight.
> 
> And about adding a default method into CharSequence, the default method 
> should return a CharSequence or String ? 

If you mean each class returns an instance of its class, I think that overlaps 
Appendable..

> and what about the other implementations, AbstractStringBuilder and 
> CharBuffer at least ? 

This falls into the Appendable.append( T t, int count) realm mentioned 
originally.

Long term this could be a goal, and maybe defaulting CharSequence#repeat 
returning a string would be shortsighted.

But, I think having instance String#repeat returning a CharSequence would limit 
its use (methods expecting strings.)

— Jim

> 
> Rémi
> 
> - Mail original -
>> De: "Jim Laskey" <james.las...@oracle.com>
>> À: "Brian Goetz" <brian.go...@oracle.com>
>> Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
>> Envoyé: Jeudi 15 Février 2018 18:34:19
>> Objet: Re: RFR: 8197594 - String and character repeat
> 
>> Very reasonable approach.
>> 
>> 
>>> On Feb 15, 2018, at 1:31 PM, Brian Goetz <brian.go...@oracle.com> wrote:
>>> 
>>> I suggest merging 1 and 4 by making it an instance method on CS with a 
>>> default
>>> in CS and an override on string and string builder?
>>> 
>>> Sent from my MacBook Wheel
>>> 
>>>> On Feb 15, 2018, at 9:20 AM, Jim Laskey <james.las...@oracle.com> wrote:
>>>> 
>>>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>>>> 
>>>> The proposal is to introduce four new methods;
>>>> 
>>>> 1. public String repeat(final int count)
>>>> 2. public static String repeat(final char ch, final int count)
>>>> 3. public static String repeat(final int codepoint, final int count)
>>>> 4. public static String repeat(final CharSequence seq, final int count)
>>>> 
>>>> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>>>> methods.
>>>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 
>>>> 10).
>>>> 3 and 4 convert to String before calling 1.
>>>> 
>>>> Performance runs with jmh (results as comment in [2]) show that these
>>>> methods are significantly faster that StringBuilder equivalents.
>>>> - fewer memory allocations
>>>> - fewer char to byte array conversions
>>>> - faster pyramid replication vs O(N) copying
>>>> 
>>>> I left StringBuilder out of scope. It falls under the category of
>>>> Appendables#append with repeat. A much bigger project.
>>>> 
>>>> All comments welcome. Especially around the need for convenience
>>>> methods, the JavaDoc content and expanding the tests.
>>>> 
>>>> — Jim
>>>> 
>>>> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>>>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>>>> 



Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Remi Forax
I'm not sure we need 4, it's just a convenient method that may be slower than 
if the user code calls toString() (because of profile pollution),
so i'm not sure i pull it's own weight.

And about adding a default method into CharSequence, the default method should 
return a CharSequence or String ? and what about the other implementations, 
AbstractStringBuilder and CharBuffer at least ? 

Rémi

- Mail original -
> De: "Jim Laskey" <james.las...@oracle.com>
> À: "Brian Goetz" <brian.go...@oracle.com>
> Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
> Envoyé: Jeudi 15 Février 2018 18:34:19
> Objet: Re: RFR: 8197594 - String and character repeat

> Very reasonable approach.
> 
> 
>> On Feb 15, 2018, at 1:31 PM, Brian Goetz <brian.go...@oracle.com> wrote:
>> 
>> I suggest merging 1 and 4 by making it an instance method on CS with a 
>> default
>> in CS and an override on string and string builder?
>> 
>> Sent from my MacBook Wheel
>> 
>>> On Feb 15, 2018, at 9:20 AM, Jim Laskey <james.las...@oracle.com> wrote:
>>> 
>>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>>> 
>>> The proposal is to introduce four new methods;
>>> 
>>> 1. public String repeat(final int count)
>>> 2. public static String repeat(final char ch, final int count)
>>> 3. public static String repeat(final int codepoint, final int count)
>>> 4. public static String repeat(final CharSequence seq, final int count)
>>> 
>>> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>>> methods.
>>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
>>> 3 and 4 convert to String before calling 1.
>>> 
>>> Performance runs with jmh (results as comment in [2]) show that these
>>> methods are significantly faster that StringBuilder equivalents.
>>> - fewer memory allocations
>>> - fewer char to byte array conversions
>>> - faster pyramid replication vs O(N) copying
>>> 
>>> I left StringBuilder out of scope. It falls under the category of
>>> Appendables#append with repeat. A much bigger project.
>>> 
>>> All comments welcome. Especially around the need for convenience
>>> methods, the JavaDoc content and expanding the tests.
>>> 
>>> — Jim
>>> 
>>> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>>> 


Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Jim Laskey


> On Feb 15, 2018, at 3:36 PM, Ivan Gerasimov  wrote:
> 
> Hello!
> 
> The link with the webrev returned 404, but I could find it at this location: 
> http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/
> 
> A few minor comments:
> 
> 1)
> 
> This check:
> 
> 2992 final long limit = (long)count * 2L;
> 2993 if ((long)Integer.MAX_VALUE < limit) {
> 
> can be possibly simplified as
> if (count > Integer.MAX_VALUE - count) {

Good.

> 
> 2)
> Should String repeat(final int codepoint, final int count) be optimized for 
> codepoints that can be represented with a single char?
> 
> E.g. like this:
> 
> public static String repeat(final int codepoint, final int count) {
>return Character.isBmpCodePoint(codepoint))
>? repeat((char) codepoint, count)
>: (new String(Character.toChars(codepoint))).repeat(count);
> }

Yes, avoid array allocation.

> 
> 3)
> Using long arithmetic can possibly be avoided in the common path of 
> repeat(final int count):
> 
> E.g. like this:
> 
> if (count < 0) {
> throw new IllegalArgumentException("count is negative, " + count);
> } else if (count == 1) {
> return this;
> } else if (count == 0) {
> return "";
> }
> final int len = value.length;
> if (Integer.MAX_VALUE / count < len) {
> throw new IllegalArgumentException(
> "Resulting string exceeds maximum string length: " + 
> ((long)len * (long)count));
> }
> final int limit = count * len;

Good.

Thank you.

> 
> With kind regards,
> Ivan
> 
> On 2/15/18 9:20 AM, Jim Laskey wrote:
>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>> 
>> The proposal is to introduce four new methods;
>> 
>> 1. public String repeat(final int count)
>> 2. public static String repeat(final char ch, final int count)
>> 3. public static String repeat(final int codepoint, final int count)
>> 4. public static String repeat(final CharSequence seq, final int count)
>> 
>> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>> methods.
>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
>> 3 and 4 convert to String before calling 1.
>> 
>> Performance runs with jmh (results as comment in [2]) show that these
>> methods are significantly faster that StringBuilder equivalents.
>>  - fewer memory allocations
>>  - fewer char to byte array conversions
>>  - faster pyramid replication vs O(N) copying
>> 
>> I left StringBuilder out of scope. It falls under the category of
>> Appendables#append with repeat. A much bigger project.
>> 
>> All comments welcome. Especially around the need for convenience
>> methods, the JavaDoc content and expanding the tests.
>> 
>> — Jim
>> 
>> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>> 
>> 
> 
> -- 
> With kind regards,
> Ivan Gerasimov
> 



Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Ivan Gerasimov

Hello!

The link with the webrev returned 404, but I could find it at this 
location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/


A few minor comments:

1)

This check:

2992 final long limit = (long)count * 2L;
2993 if ((long)Integer.MAX_VALUE < limit) {

can be possibly simplified as
if (count > Integer.MAX_VALUE - count) {

2)
Should String repeat(final int codepoint, final int count) be optimized 
for codepoints that can be represented with a single char?


E.g. like this:

public static String repeat(final int codepoint, final int count) {
return Character.isBmpCodePoint(codepoint))
? repeat((char) codepoint, count)
: (new String(Character.toChars(codepoint))).repeat(count);
}

3)
Using long arithmetic can possibly be avoided in the common path of 
repeat(final int count):


E.g. like this:

 if (count < 0) {
 throw new IllegalArgumentException("count is negative, " + 
count);

 } else if (count == 1) {
 return this;
 } else if (count == 0) {
 return "";
}
 final int len = value.length;
 if (Integer.MAX_VALUE / count < len) {
 throw new IllegalArgumentException(
 "Resulting string exceeds maximum string length: " 
+ ((long)len * (long)count));

 }
 final int limit = count * len;

With kind regards,
Ivan

On 2/15/18 9:20 AM, Jim Laskey wrote:

This is a pre-CSR code review [1] for String repeat methods (Enhancement).

The proposal is to introduce four new methods;

1. public String repeat(final int count)
2. public static String repeat(final char ch, final int count)
3. public static String repeat(final int codepoint, final int count)
4. public static String repeat(final CharSequence seq, final int count)

For the sake of transparency, only 1 is necessary, 2-4 are convenience methods.
In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
3 and 4 convert to String before calling 1.

Performance runs with jmh (results as comment in [2]) show that these
methods are significantly faster that StringBuilder equivalents.
  - fewer memory allocations
  - fewer char to byte array conversions
  - faster pyramid replication vs O(N) copying

I left StringBuilder out of scope. It falls under the category of
Appendables#append with repeat. A much bigger project.

All comments welcome. Especially around the need for convenience
methods, the JavaDoc content and expanding the tests.

— Jim

[1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
[2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594




--
With kind regards,
Ivan Gerasimov



Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Jim Laskey
Very reasonable approach.


> On Feb 15, 2018, at 1:31 PM, Brian Goetz  wrote:
> 
> I suggest merging 1 and 4 by making it an instance method on CS with a 
> default in CS and an override on string and string builder?
> 
> Sent from my MacBook Wheel
> 
>> On Feb 15, 2018, at 9:20 AM, Jim Laskey  wrote:
>> 
>> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
>> 
>> The proposal is to introduce four new methods;
>> 
>> 1. public String repeat(final int count)
>> 2. public static String repeat(final char ch, final int count)
>> 3. public static String repeat(final int codepoint, final int count)
>> 4. public static String repeat(final CharSequence seq, final int count)
>> 
>> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
>> methods.
>> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
>> 3 and 4 convert to String before calling 1.
>> 
>> Performance runs with jmh (results as comment in [2]) show that these
>> methods are significantly faster that StringBuilder equivalents.
>> - fewer memory allocations
>> - fewer char to byte array conversions
>> - faster pyramid replication vs O(N) copying
>> 
>> I left StringBuilder out of scope. It falls under the category of
>> Appendables#append with repeat. A much bigger project.
>> 
>> All comments welcome. Especially around the need for convenience
>> methods, the JavaDoc content and expanding the tests.
>> 
>> — Jim
>> 
>> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
>> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
>> 
> 



Re: RFR: 8197594 - String and character repeat

2018-02-15 Thread Brian Goetz
I suggest merging 1 and 4 by making it an instance method on CS with a default 
in CS and an override on string and string builder?

Sent from my MacBook Wheel

> On Feb 15, 2018, at 9:20 AM, Jim Laskey  wrote:
> 
> This is a pre-CSR code review [1] for String repeat methods (Enhancement).
> 
> The proposal is to introduce four new methods;
> 
> 1. public String repeat(final int count)
> 2. public static String repeat(final char ch, final int count)
> 3. public static String repeat(final int codepoint, final int count)
> 4. public static String repeat(final CharSequence seq, final int count)
> 
> For the sake of transparency, only 1 is necessary, 2-4 are convenience 
> methods.
> In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
> 3 and 4 convert to String before calling 1.
> 
> Performance runs with jmh (results as comment in [2]) show that these
> methods are significantly faster that StringBuilder equivalents.
> - fewer memory allocations
> - fewer char to byte array conversions
> - faster pyramid replication vs O(N) copying
> 
> I left StringBuilder out of scope. It falls under the category of
> Appendables#append with repeat. A much bigger project.
> 
> All comments welcome. Especially around the need for convenience
> methods, the JavaDoc content and expanding the tests.
> 
> — Jim
> 
> [1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
> [2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
> 



RFR: 8197594 - String and character repeat

2018-02-15 Thread Jim Laskey
This is a pre-CSR code review [1] for String repeat methods (Enhancement).

The proposal is to introduce four new methods;

1. public String repeat(final int count)
2. public static String repeat(final char ch, final int count)
3. public static String repeat(final int codepoint, final int count)
4. public static String repeat(final CharSequence seq, final int count)

For the sake of transparency, only 1 is necessary, 2-4 are convenience methods.
In the case of 2, “*”.repeat(10) performs as well as String.repeat(‘*’, 10).
3 and 4 convert to String before calling 1.

Performance runs with jmh (results as comment in [2]) show that these
methods are significantly faster that StringBuilder equivalents.
 - fewer memory allocations
 - fewer char to byte array conversions
 - faster pyramid replication vs O(N) copying

I left StringBuilder out of scope. It falls under the category of
Appendables#append with repeat. A much bigger project.

All comments welcome. Especially around the need for convenience
methods, the JavaDoc content and expanding the tests.

— Jim

[1] webrev: http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
[2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594