To close the loop on this, I've reopened JDK-4993841, which requests adding an
API, Character.toString(int), that converts an int codepoint value to a String.
(This seems like the obvious API, in parallel with Character.toString(char), but
of course alternatives could be considered.)
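For reference, the idiom such an API would replace can be sketched as below. This is only an illustration: the class and method names are hypothetical, and the only JDK call relied on is Character.toChars(int).

```java
final class CodePointToString {
    // Hypothetical helper showing the conversion a Character.toString(int)
    // style API would perform. Character.toChars throws
    // IllegalArgumentException for an invalid code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // A BMP code point fits in one char; a supplementary one needs a surrogate pair.
        System.out.println(fromCodePoint(0x41).length());
        System.out.println(fromCodePoint(0x1F600).length());
    }
}
```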
s'marks
On 2/20/18 11:46 AM, Louis Wasserman wrote:
I'm with Brian: adding a separate API to make it easier to get from a
codepoint to a String seems independently merited, and makes the single
repeat API work well for that case. A very quick regex-powered search
comes up with 183 hits in Google for (new
String|String.copyValueOf|String.valueOf)(Character.toChars(..)).
I do, however, recommend a separate thread for discussing that API :)
On Tue, Feb 20, 2018 at 11:33 AM Kevin Bourrillion <kev...@google.com>
wrote:
Just to add another dimension to this data: most of the usages of our
repeat method (~75%) are in test code. These tests usually just want any
old test string of a certain length. Repeating a single character is the
obvious way to get that.
Among production code usages (~25%), there are a few roughly equal use
cases: ASCII indentation/alignment, redaction, Martin's expected case
of "drawing" with ASCII symbols, and "other".
On Thu, Feb 15, 2018 at 12:52 PM, Louis Wasserman <lowas...@google.com>
wrote:
I don't think there's enough demand to merit having a
repeat(CharSequence, int) at all.
I did an analysis of usages of Guava's Strings.repeat on Google's
codebase. Users might be rolling their own implementations, too, but this
should be a very good proxy for demand.
StringRepeat_SingleConstantChar = 4.475 K // strings with .length() == 1
StringRepeat_SingleConstantCodePoint = 28 // strings with .codePointCount(...) == 1
StringRepeat_MultiCodePointConstant = 1.156 K // constant strings neither of the above
StringRepeat_CharSequenceToString = 2 // Strings.repeat(CharSequence.toString(), n)
StringRepeat_NoneOfTheAbove = 248
Notably, it seems like basically nobody needs to repeat a CharSequence --
definitely not enough demand to merit the awkwardness of e.g.
Rope.repeat(n) inheriting a repeat returning a String.
Based on this data, I'd recommend providing one and only one method of this
type: String.repeat(int). There's no real advantage to a static
repeat(char, int) method when the overwhelming majority of these are
constants: e.g. compare SomeUtilClass.repeat('*', n) versus "*".repeat(n).
Character.toString(c).repeat(n) isn't a bad workaround if you don't have a
constant char. There also isn't much demand for dealing with the code
point case specially; the String.repeat(int) method seems like it'd handle
that just fine.
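The three call shapes compared above can be sketched side by side. This assumes a JDK in which the String.repeat(int) instance method under review is available; the class and method names are hypothetical.

```java
final class RepeatCallShapes {
    // The common case: a constant single-character string.
    static String constantCase(int n) {
        return "*".repeat(n);
    }

    // The workaround when the char is not a constant.
    static String variableChar(char c, int n) {
        return Character.toString(c).repeat(n);
    }

    // The code point case, going through Character.toChars.
    static String codePointCase(int codePoint, int n) {
        return new String(Character.toChars(codePoint)).repeat(n);
    }

    public static void main(String[] args) {
        System.out.println(constantCase(3));
        System.out.println(variableChar('-', 4));
    }
}
```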
On Thu, Feb 15, 2018 at 11:44 AM Jim Laskey <james.las...@oracle.com>
wrote:
On Feb 15, 2018, at 3:36 PM, Ivan Gerasimov <ivan.gerasi...@oracle.com> wrote:
Hello!
The link with the webrev returned 404, but I could find it at this
location: http://cr.openjdk.java.net/~jlaskey/8197594/webrev-00/
A few minor comments:
1)
This check:
    final long limit = (long)count * 2L;
    if ((long)Integer.MAX_VALUE < limit) {
can possibly be simplified to
    if (count > Integer.MAX_VALUE - count) {
Good.
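A small sketch of why the two checks agree, assuming count has already been validated as non-negative (for a negative count the int subtraction itself overflows, so the forms differ there). The class and method names are hypothetical.

```java
final class DoublingOverflowCheck {
    // Original form: widen to long, then compare against Integer.MAX_VALUE.
    static boolean overflowsWithLong(int count) {
        final long limit = (long) count * 2L;
        return (long) Integer.MAX_VALUE < limit;
    }

    // Suggested form: int arithmetic only. Assumes count >= 0.
    static boolean overflowsWithInt(int count) {
        return count > Integer.MAX_VALUE - count;
    }

    public static void main(String[] args) {
        // Sample values around the boundary where count * 2 exceeds Integer.MAX_VALUE.
        int[] samples = {0, 1, 1073741823, 1073741824, Integer.MAX_VALUE};
        for (int count : samples) {
            if (overflowsWithLong(count) != overflowsWithInt(count)) {
                throw new AssertionError("checks disagree for " + count);
            }
        }
    }
}
```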
2)
Should String repeat(final int codepoint, final int count) be optimized
for codepoints that can be represented with a single char?
E.g. like this:
    public static String repeat(final int codepoint, final int count) {
        return Character.isBmpCodePoint(codepoint)
            ? repeat((char) codepoint, count)
            : (new String(Character.toChars(codepoint))).repeat(count);
    }
Yes, avoid array allocation.
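A standalone sketch of the same split, written so it does not depend on the methods under review. The class name and the way each branch builds its result are assumptions for illustration, not the webrev's code; the BMP branch skips the intermediate Character.toChars array.

```java
import java.util.Arrays;

final class CodePointRepeat {
    // Hypothetical stand-in for the proposed repeat(int codepoint, int count).
    static String repeat(int codepoint, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative, " + count);
        }
        if (Character.isBmpCodePoint(codepoint)) {
            // Single-char path: fill the result directly with the char.
            char[] chars = new char[count];
            Arrays.fill(chars, (char) codepoint);
            return new String(chars);
        }
        // Supplementary path: Character.toChars yields a surrogate pair
        // (or throws IllegalArgumentException for an invalid code point).
        char[] pair = Character.toChars(codepoint);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(pair);
        }
        return sb.toString();
    }
}
```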
3)
Using long arithmetic can possibly be avoided in the common path of
repeat(final int count):
E.g. like this:
    if (count < 0) {
        throw new IllegalArgumentException("count is negative, " + count);
    } else if (count == 1) {
        return this;
    } else if (count == 0) {
        return "";
    }
    final int len = value.length;
    if (Integer.MAX_VALUE / count < len) {
        throw new IllegalArgumentException(
            "Resulting string exceeds maximum string length: " +
            ((long)len * (long)count));
    }
    final int limit = count * len;
Good.
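The division-based check can be isolated into a runnable sketch. The class and method names are hypothetical, and the early return for an empty string is an addition here, not part of the suggestion; long arithmetic appears only on the failure path, to format the message.

```java
final class RepeatLengthCheck {
    // Returns len * count, or throws if the product would exceed Integer.MAX_VALUE.
    static int checkedLength(int len, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative, " + count);
        }
        if (count == 0 || len == 0) {
            return 0; // also avoids dividing by zero below
        }
        // Integer division: MAX_VALUE / count is the largest len that still fits.
        if (Integer.MAX_VALUE / count < len) {
            throw new IllegalArgumentException(
                "Resulting string exceeds maximum string length: " +
                ((long) len * (long) count));
        }
        return count * len;
    }
}
```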
Thank you.
With kind regards,
Ivan
On 2/15/18 9:20 AM, Jim Laskey wrote:
This is a pre-CSR code review [1] for String repeat methods (Enhancement).
The proposal is to introduce four new methods:
1. public String repeat(final int count)
2. public static String repeat(final char ch, final int count)
3. public static String repeat(final int codepoint, final int count)
4. public static String repeat(final CharSequence seq, final int count)
For the sake of transparency, only 1 is necessary; 2-4 are convenience
methods.
In the case of 2, "*".repeat(10) performs as well as String.repeat('*', 10).
3 and 4 convert to String before calling 1.
Performance runs with jmh (results as comment in [2]) show that these
methods are significantly faster than StringBuilder equivalents.
- fewer memory allocations
- fewer char to byte array conversions
- faster pyramid replication vs O(N) copying
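"Pyramid replication" here presumably means doubling the already-copied prefix on each step rather than appending one copy at a time. A minimal sketch of that idea over a char array follows; the class name and the use of char[] are assumptions for illustration, not the webrev's implementation.

```java
final class PyramidRepeat {
    // Hypothetical stand-in for String.repeat(int), built on doubling copies.
    static String repeat(String s, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative, " + count);
        }
        if (count == 0 || s.isEmpty()) {
            return "";
        }
        if (count == 1) {
            return s;
        }
        final int len = s.length();
        if (Integer.MAX_VALUE / count < len) {
            throw new IllegalArgumentException(
                "Resulting string exceeds maximum string length: " +
                ((long) len * (long) count));
        }
        final int limit = len * count;
        final char[] out = new char[limit];
        s.getChars(0, len, out, 0);
        int copied = len;
        // Double the filled prefix while a full doubling still fits.
        while (copied < limit - copied) {
            System.arraycopy(out, 0, out, copied, copied);
            copied <<= 1;
        }
        // Fill whatever remains from the start of the array.
        System.arraycopy(out, 0, out, copied, limit - copied);
        return new String(out);
    }
}
```

The number of System.arraycopy calls grows with the logarithm of count, although the total number of chars copied is still proportional to the result length.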
I left StringBuilder out of scope. It falls under the category of
Appendables#append with repeat. A much bigger project.
All comments welcome. Especially around the need for convenience
methods, the JavaDoc content and expanding the tests.
— Jim
[1] webrev:
http://cr.openjdk.java.net//oj/home/jlaskey/8197594/webrev-00
[2] jbs: https://bugs.openjdk.java.net/browse/JDK-8197594
--
With kind regards,
Ivan Gerasimov
--
Kevin Bourrillion | Java Librarian | Google, Inc. | kev...@google.com