Hi wenshao,
I've written up a draft JEP for deprecating the disabling of Compact Strings. There
hasn't been a good term for running the system with Compact Strings disabled, so I
made up a term "UTF-16-only" and used it here.
https://openjdk.org/jeps/8371379
I also filed an enhancement to cover the change to add a warning message when
-XX:-CompactStrings is used, and I've assigned it to you (wenshao). Note that this
change can't be integrated until the JEP is moved into the Targeted state.
https://bugs.openjdk.org/browse/JDK-8371431
You might want to associate PR 27995 with this bug report.
s'marks
On 11/3/25 10:04 PM, Stuart Marks wrote:
Hi wenshao,
I think removing the ability to disable Compact Strings is a great idea! As you
noted in your first message, removing it would make String easier to maintain.
Just so that everybody
here understands the issues, every string algorithm has THREE implementations:
1. compact strings enabled, using ISO Latin 1 coder
2. compact strings enabled, using UTF-16 coder
3. compact strings disabled
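To make the three cases concrete, here is a rough sketch of how a single operation
ends up with three paths. This is not the actual java.lang.String source; the class,
the Repr holder, and the encode/indexOf helpers are all made up for illustration, and
the boolean parameter stands in for the JVM-wide -XX:[+-]CompactStrings setting.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; this is NOT the JDK's java.lang.String source.
// All names here (CoderPathsSketch, Repr, encode, indexOf) are hypothetical.
final class CoderPathsSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Simplified stand-in for a String's internal (byte[] value, byte coder) pair.
    static final class Repr {
        final byte[] value;
        final byte coder;
        Repr(byte[] value, byte coder) { this.value = value; this.coder = coder; }
    }

    // With compact strings enabled, use one byte per char when every char fits
    // in Latin-1; otherwise (or when disabled) use two bytes per char.
    static Repr encode(String s, boolean compactStrings) {
        if (compactStrings && s.chars().allMatch(c -> c <= 0xFF)) {
            return new Repr(s.getBytes(StandardCharsets.ISO_8859_1), LATIN1);
        }
        return new Repr(s.getBytes(StandardCharsets.UTF_16BE), UTF16);
    }

    static int indexOf(Repr r, boolean compactStrings, char ch) {
        if (compactStrings && r.coder == LATIN1) {
            // Case 1: compact strings enabled, Latin-1 coder (one byte per char).
            if (ch > 0xFF) return -1;
            for (int i = 0; i < r.value.length; i++) {
                if ((r.value[i] & 0xFF) == ch) return i;
            }
            return -1;
        } else if (compactStrings) {
            // Case 2: compact strings enabled, UTF-16 coder. Code on this path may
            // assume the string holds at least one char outside Latin-1.
            return indexOfUtf16(r.value, ch);
        } else {
            // Case 3: compact strings disabled. Everything is two bytes per char,
            // including pure-ASCII strings, so the case 2 assumption does not hold.
            return indexOfUtf16(r.value, ch);
        }
    }

    private static int indexOfUtf16(byte[] value, char ch) {
        for (int i = 0; i + 1 < value.length; i += 2) {
            char c = (char) (((value[i] & 0xFF) << 8) | (value[i + 1] & 0xFF));
            if (c == ch) return i / 2;
        }
        return -1;
    }

    public static void main(String[] args) {
        for (boolean compact : new boolean[] { true, false }) {
            Repr r = encode("hello", compact);
            System.out.println("compact=" + compact + " bytes=" + r.value.length
                    + " indexOf('l')=" + indexOf(r, compact, 'l'));
        }
    }
}
```

In this sketch cases 2 and 3 happen to share a helper, but they still have to be
reasoned about and tested separately, because an all-ASCII string only reaches the
two-byte path when compact strings are disabled.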
In recent years I suspect our test coverage of the compact-strings disabled case
is lacking, as some bugs have occurred in only that case. For example, see
JDK-8321514 <https://bugs.openjdk.org/browse/JDK-8321514>, JDK-8316879
<https://bugs.openjdk.org/browse/JDK-8316879>, JDK-8360271
<https://bugs.openjdk.org/browse/JDK-8360271>, JDK-8360255
<https://bugs.openjdk.org/browse/JDK-8360255>, JDK-8221430
<https://bugs.openjdk.org/browse/JDK-8221430>, etc. (Some of these have been
fixed, but some are still open.)
As Alan noted, however, we can't simply remove this case. We also can't simply
deprecate the command-line option; we need to deprecate the feature of running
without Compact Strings before we can remove that feature.
Compact Strings were introduced with JEP 254 <https://openjdk.org/jeps/254>. The
JEP doesn't mention that there is an option to disable compact strings, but the
JVM Guide
<https://docs.oracle.com/en/java/javase/25/vm/java-hotspot-virtual-machine-performance-enhancements.html#GUID-D2E3DC58-D18B-4A6C-8167-4A1DFB4888E4>
describes the Compact Strings feature and also the ability to disable it using the
-XX:-CompactStrings command line option. This section doesn't say much about when
you might want to disable the feature, though; it merely says "This feature can be
disabled if you observe performance regression issues in an application." Articles
like this one from Baeldung <https://www.baeldung.com/java-9-compact-string>, and
vendor documentation from IBM
<https://www.ibm.com/docs/en/sdk-java-technology/8?topic=options-xx-compactstrings>
also document this option, but they offer similarly vague advice.
Since the option is fairly well-known, it's not merely a matter of looking at the
status of the various ports (though those are significant, of course). It could be
that some installations out there are running with the option to disable compact
strings, perhaps because they encountered a performance regression, or for other
reasons.
They'll need to be informed that the feature is going away, and the best way to do
that is with a JEP.
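For anyone who wants to check what a given installation is actually running with,
something like the following should work on a HotSpot-based JDK. It's a minimal
sketch: the class and method names are made up, and it assumes the
com.sun.management.HotSpotDiagnosticMXBean interface is available (it falls back to
"unknown" on a VM that doesn't provide it or doesn't have the flag).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the JDK: reports the CompactStrings flag
// of the running JVM via the HotSpot diagnostic MXBean.
final class CompactStringsCheck {

    // Returns "true" or "false" as reported by the VM, or "unknown" if the
    // MXBean or the flag is not available (for example on a non-HotSpot VM).
    static String compactStringsSetting() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unknown";
            }
            VMOption option = bean.getVMOption("CompactStrings");
            return option.getValue();
        } catch (RuntimeException e) {
            // getPlatformMXBean and getVMOption throw IllegalArgumentException
            // when the interface or the option is not supported.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("CompactStrings = " + compactStringsSetting());
    }
}
```

Running `java -XX:+PrintFlagsFinal -version` and looking for CompactStrings in the
output gives the same information without writing any code.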
There are some additional issues to consider as well.
* As Alan noted, the ARM32 port has compact strings disabled by default. It's
not clear whether it even works if compact strings are enabled.
* Compact strings increases storage requirements of CJK character data. Our
/assumption/ has been that even CJK-heavy applications use a lot of ASCII data
for config files, message headers, JSON, etc., and that compact strings are
still a net win for such applications. However, that's an assumption. There's
the possibility that some installations run those applications with compact
strings disabled.
* The JNI GetStringCritical call returns a direct pointer in the
compact-strings-disabled
strings case but makes a copy when compact strings are enabled. Some
applications may suffer regressions because of this; see this Stack Overflow
<https://stackoverflow.com/questions/76913323/string-compact-has-introduced-some-performance-issues-for-the-current-jni-how>
question.
There are probably some other issues we haven't considered yet. The best way to
flush them out is to post a JEP, and then use other channels to publicize the JEP.
The JEP is mostly a formality about changing the official status of running in the
compact-strings-disabled mode to "deprecated". Even though it seems like a lot of
overhead to write a JEP for this, the fact is that many people in the tech press
look only at the list of JEPs for each release and not much else. And many Java
users look only at tech publications to keep up with Java; they don't look at
GitHub or follow the OpenJDK mailing lists. Thus, posting a JEP is the best chance
we have to reach a broad set of Java users, some of whom might be affected by this
change.
Actual changes that go along with the deprecation will probably only involve
adding warning messages and possibly updating documentation. We don't need to
resolve issues like the ARM32 port yet. However, that will need to be resolved
before we actually remove the feature.
Since I'm "Dr Deprecator" I'll volunteer to draft the JEP.
s'marks
On 10/27/25 11:56 PM, Alan Bateman wrote:
On 28/10/2025 06:32, wenshao wrote:
Thanks to Alan for your feedback.
Based on Chen Liang's suggestion, I submitted a new draft PR
https://github.com/openjdk/jdk/pull/27995
to add a warning message to the CompactStrings option.
I think the first step has to be to establish what or who might be using
-XX:-CompactStrings in 2025. This means looking into the status of ports. Andrew
Haley is going to check with folks in IBM as some of the bug reports for the
-XX:-CompactStrings code paths come from ports there.
-Alan