Re: RFC: 8356679: Using CharSequence::getChars internally

Markus KARG Wed, 14 May 2025 03:49:00 -0700

Many of the modified classes derive from a common superclass and share one needed common change (one of those points that is easy to see once all of those classes are in a single PR, but hard to explain in plain-text, pre-PR mailing list threads), so at least those need to be discussed *together*. But to spare JBS issues and PRs, I can open the PR with just the first set of changes, and once we agree that this set is fine, I can push the next commit *in the same PR*. Otherwise we would need endless JBS issues, mailing list threads, and PRs just to fix a dozen internal lines of code.

Having said that, does the current state of this thread count as "reached common agreement to file a PR", or do I still have to wait until more people chime in?


-Markus


On 13.05.2025 at 15:10, Roger Riggs wrote:

Hi Markus,

A main point was to avoid trying to do everything at once.
The PR comments become hard to follow and intermingled, and it takes longer to get agreement because of the thrash in the PR.
Roger

On 5/13/25 5:05 AM, Markus KARG wrote:
Thank you, Roger.
Actually, the method helps in the "toString()" variants, too, as in some places we could *get rid* of "toString()" (which is more work than "just" a buffer due to the added compression complexity).
In fact, I already took the time to rewrite *all* of them while waiting for the approval of this list posting. In *all* cases, *less* buffering / copying is needed, and *less* "toString()" conversion (which is a copy under the hood) is needed. So if I were allowed to show the code as a PR, it would be much easier to explain and discuss.
A PR is the best place to discuss "how the code would change". In the worst case, let's drop it if we see that it is actually a bad thing.
-Markus


On 12.05.2025 at 20:18, Roger Riggs wrote:
Hi Markus,

On the surface, it looks constructive.
I suspect that many of these cases will turn into discussions about the right/best/better way to buffer the characters. The getChars method only helps when extracting to a char array; many of the current implementations create strings as the intermediary. The advantage of the one-character-at-a-time technique is not needing a (separately allocated) buffer.
Consider taking a few at a time before launching into the whole set.

$.02, Roger
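
To make the buffering trade-off in the message above concrete, here is a minimal sketch of the three strategies it mentions: going through an intermediate String, copying chunks through a separately allocated char[] buffer, and writing one character at a time. The class name, method names, and chunk size are hypothetical and not taken from any OpenJDK code. The source type is StringBuilder because its long-standing getChars method compiles on older JDKs; with the CharSequence::getChars method discussed in this thread, the chunked variant could presumably accept any CharSequence.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

final class ChunkedAppendSketch {

    // Assumed chunk size, picked arbitrarily for illustration.
    static final int CHUNK = 8;

    // Variant A: materialize a String first (a full copy), then write it.
    static void appendViaToString(Writer out, StringBuilder src) throws IOException {
        out.write(src.toString());
    }

    // Variant B: bulk-copy fixed-size chunks through one separately allocated buffer.
    static void appendViaChunks(Writer out, StringBuilder src) throws IOException {
        char[] buf = new char[Math.min(CHUNK, src.length())];
        for (int pos = 0; pos < src.length(); ) {
            int n = Math.min(buf.length, src.length() - pos);
            src.getChars(pos, pos + n, buf, 0); // bulk read into the buffer
            out.write(buf, 0, n);
            pos += n;
        }
    }

    // Variant C: one character at a time; no buffer, but one call per char.
    static void appendCharByChar(Writer out, CharSequence src) throws IOException {
        for (int i = 0; i < src.length(); i++) {
            out.write(src.charAt(i));
        }
    }

    public static void main(String[] args) throws IOException {
        StringBuilder src = new StringBuilder("0123456789abcdefXYZ");
        StringWriter a = new StringWriter();
        StringWriter b = new StringWriter();
        StringWriter c = new StringWriter();
        appendViaToString(a, src);
        appendViaChunks(b, src);
        appendCharByChar(c, src);
        // All three strategies must produce identical output.
        System.out.println(a.toString().equals(b.toString())
                && b.toString().equals(c.toString())); // prints "true"
    }
}
```

Which variant is fastest for a given class is exactly the kind of question that would need measuring; the sketch only shows that the three are functionally interchangeable.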

On 5/11/25 2:45 AM, Markus KARG wrote:
Dear Core Libs Team,

I am hereby requesting comments on JDK-8356679.
I would like to invest some time and set up a PR implementing Chen Liang's proposal laid out in https://bugs.openjdk.org/browse/JDK-8356679. For your convenience, the text of that JBS issue is copied below. According to the Developer's Guide, I need to get broad agreement BEFORE filing a PR. Therefore, I kindly ask everybody to briefly show consent, so I may file a PR.
Thanks
-Markus


Copy from https://bugs.openjdk.org/browse/JDK-8356679:
Recently OpenJDK adopted the new method CharSequence::getChars(int, int, char[], int) for inclusion in Java 25. As a bulk reader method, it potentially allows improved efficiency over the previously available char-by-char reader method CharSequence::charAt(int).
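
As an illustration of the difference between the two reading styles, the following sketch copies a CharSequence into a char[] once char by char and once in bulk. The helper bulkGet is hypothetical and not part of the JDK: it dispatches to the getChars methods that String and StringBuilder have long provided, so the example also compiles on releases before Java 25. On Java 25 and later, its body could presumably be replaced by a single csq.getChars(srcBegin, srcEnd, dst, dstBegin) call.

```java
import java.util.Arrays;

final class GetCharsSketch {

    // Hypothetical helper (not a JDK method): copies csq[srcBegin, srcEnd) into dst.
    static void bulkGet(CharSequence csq, int srcBegin, int srcEnd, char[] dst, int dstBegin) {
        if (csq instanceof String) {
            ((String) csq).getChars(srcBegin, srcEnd, dst, dstBegin);
        } else if (csq instanceof StringBuilder) {
            ((StringBuilder) csq).getChars(srcBegin, srcEnd, dst, dstBegin);
        } else {
            // Generic fallback: one charAt call per character.
            for (int i = srcBegin; i < srcEnd; i++) {
                dst[dstBegin++] = csq.charAt(i);
            }
        }
    }

    // Baseline: the char-by-char style based on CharSequence::charAt.
    static char[] copyCharByChar(CharSequence csq) {
        char[] dst = new char[csq.length()];
        for (int i = 0; i < dst.length; i++) {
            dst[i] = csq.charAt(i);
        }
        return dst;
    }

    // Bulk style: a single call fills the whole destination array.
    static char[] copyBulk(CharSequence csq) {
        char[] dst = new char[csq.length()];
        bulkGet(csq, 0, dst.length, dst, 0);
        return dst;
    }

    public static void main(String[] args) {
        CharSequence csq = new StringBuilder("Hello, getChars");
        // Both styles must yield the same characters.
        System.out.println(Arrays.equals(copyCharByChar(csq), copyBulk(csq))); // prints "true"
    }
}
```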
Chen Liang suggested on March 23rd on the core-libs-dev mailing list to use the new method within the internal source code of OpenJDK for the implementation of Appendables (see https://mail.openjdk.org/pipermail/core-libs-dev/2025-March/141521.html). The idea behind this is that the implementations might then be more efficient.
A quick analysis of the OpenJDK source code identified (at least) the following classes which could potentially run more efficiently when using CharSequence::getChars internally, thanks to bulk reading and / or prevention of internal copies / toString() conversions:
* java.io.Writer
* java.io.StringWriter
* java.io.PrintWriter
* java.io.BufferedWriter
* java.io.CharArrayWriter
* java.io.FileWriter
* java.io.OutputStreamWriter
* sun.nio.cs.StreamEncoder
* java.io.PrintStream
* java.nio.CharBuffer
In the sense of "eat your own dog food", it makes sense to implement Chen's idea in (at least) those classes. Possibly more classes could be identified when taking a deeper look. Besides the potential efficiency improvements, it would be a good showcase for the usage of the new API.
The risk of this change should be low, as test coverage exists, and as the intended changes are solely internal to the implementation. No API will be changed. In some cases the JavaDocs will be slightly adapted where they currently expose the actual implementation (so that they do not lie in future).
