Sorry, the posting below was cross-posted by accident.
-Markus
Am 18.05.2025 um 17:09 schrieb Markus KARG:
Roger,
thank you for your comments.
Following your advice I have splitted the larger work of JDK-8356679
into sub tasks.
I would like to start with a first PR implementing the *foundational*
work, i. e. optimizing Writer::append for efficiency (JDK-8357183).
For convenience, attached below is a copy of the description.
Comments Welcome!
If everybody is fine with this, I would be happy to get some +1 to
publish the PR for JDK-8357183!
Am 14.05.2025 um 15:57 schrieb Roger Riggs:
Hi Markus,
Starting out with the common case is a good idea for the first PR.
I much prefer a PR with a single goal and that comes to a conclusion
and does not add new features or changes after the PR is submitted.
I tend to lose interest in PRs with lots of churn, it means I have to
re-review the bulk of it when there is a change and may wait days to
let it settle down before coming back to it.
I tend to think the PR was not really ready to be reviewed if simple
issues and corrections have to be made frequently.
Do your own checking for typos and copyrights and simple refactoring
before opening the PR.
Quality before quantity or speed.
I'm fine with separate Jira issues that clearly delineate a specific
scope and goal.
The title of this issue (8356679) doesn't identify the real goal.
It seems to be to improve performance or memory usage, not just to
use a new API.
These are my personal opinions about contributions and process.
Regards, Roger
On 5/14/25 6:48 AM, Markus KARG wrote:
Many of the modified classes derive from a common super class and
share one needed common change (which is one of the points which are
easy to see once you see all of those classes in a single PR, but
hard to explain in plaint-text pre-PR mailing list threads), so at
least those need to be discussed *together*. But to spare JBS and
PRs, I can open the PR with just the first set of changes, and once
we agree that this set is fine, I can push the next commit *in the
same PR*. Otherwise we would need endless JBS, mailing list threads,
and PRs, just to fixe a dozen internal code lines.
Having said that, does the current state of this thread count as
"reached common agreement to file a PR" or do I still have to wait
until more people chime in?
-Markus
Am 13.05.2025 um 15:10 schrieb Roger Riggs:
Hi Markus,
A main point was to avoid trying to do everything at once.
The PR comments become hard to follow and intermingled and it takes
longer to get agreement because of the thrash in the PR.
Roger
On 5/13/25 5:05 AM, Markus KARG wrote:
Thank you, Roger.
Actually the method helps in the "toString()" variants, too, as in
some places we could *get rid* of "toString()" (which is more work
than "just" a buffer due to the added compression complexity).
In fact, I already took the time to rewrite *all* of them while
waiting for the approval of this list posting. In *all* cases
*less* buffering / copying is needed, and *less* "toString()"
conversion (which is a copy under the hood) is needed. So if I
would be allowed to show the code as a PR, it would be much easier
to explain and discuss.
A PR is the best place to discuss "how to code would change". In
the worst case, let's drop it if we see that it is actually a bad
thing.
-Markus
Am 12.05.2025 um 20:18 schrieb Roger Riggs:
Hi Markus,
On the surface, its looks constructive.
I suspect that many of these cases will turn into discussions
about the right/best/better way to buffer the characters.
The getChars method only helps when extracting to a char array,
many of the current implementations create strings as the
intermediary. The advantage of the 1 character at a time
technique is not needing a (separated allocated) buffer.
Consider taking a few at a time before launching into the whole set.
$.02, Roger
On 5/11/25 2:45 AM, Markus KARG wrote:
Dear Core Libs Team,
I am hereby requesting comments on JDK-8356679.
I would like to invest some time and set up a PR implementing
Chen Liangs's proposal laid out in
https://bugs.openjdk.org/browse/JDK-8356679. For your
convenience, the text of that JBS is copied below. According to
the Developer's Guide I do need to get broad agreement BEFORE
filing a PR. Therefore, I kindly ask everybody to briefly show
consent, so I may file a PR.
Thanks
-Markus
Copy from https://bugs.openjdk.org/browse/JDK-8356679:
Recently OpenJDK adopted the new method
CharSequence::getChars(int, int, char[], int) for inclusion in
Java 25. As a bulk reader method, it allows potentially improved
efficiency over the previously available char-by-char reader
method CharSequence::charAt(int).
Chen Liang suggested on March 23rd on the core-lib-dev mailing
list to use the new method within the internal source code of
OpenJDK for the implementation of Appendables (see
https://mail.openjdk.org/pipermail/core-libs-dev/2025-March/141521.html).
The idea behind this is that the implementations might be more
efficient then.
A quick analysis of the OpenJDK source code identified (at
least) the following classes which could potentially run more
efficient when using CharSequence::getChars internally, thanks
to bulk reading and / or prevention of internal copies /
toString() conversions:
* java.io.Writer
* java.io.StringWriter
* java.io.PrintWriter
* java.io.BufferedWriter
* java.io.CharArrayWriter
* java.io.FileWriter
* java.io.OutputStreamWriter
* sun.nio.cs.StreamEncoder
* java.io.PrintStream
* java.nio.CharBuffer
In the sense of "eat your own dog food", it makes sense to
implement Chen's idea in (at least) those classes. Possibly more
classes could get identified when taking a deeper look. Besides
the potential efficiency improvements, it would be a good show
case for the usage of the new API.
The risk of this change should be low, as test coverage exists,
and as the intended changes are solely internal to the
implementation. No API will get changed. In some cases the
JavaDocs will get slightly adapted where it currently exposes
the actual implementation (to not lie in future).