Martin, thanks for your review ...

Am 27.08.2009 03:24, Martin Buchholz schrieb:
On Wed, Aug 26, 2009 at 02:59, Ulf Zibis<[email protected]> wrote:
Hi all,

I have put all important things of sun.nio.cs.Surrogate to java.* packages,
I guess. We likely could get rid of it.

I am in principle of adding the method isBMP to Character.java
(I did write it!) but I was hoping we can find a better name for it...

I had the same thought, but just referred to the elder. ;-)

Hmmmmm..... The Unicode Glossary uses BMP consistently,
so maybe isBMP is the best we can come up with?

How about isBMPCodePoint?

This feels best for me too.

http://unicode.org/glossary/#BMP_code_point

Yeah, I think that's not too bad.

Martin


See: https://bugs.openjdk.java.net/show_bug.cgi?id=100104


================
+    public static char[] toChars(int codePoint) {
+        if (codePoint >= MIN_CODE_POINT) {
+            char[] result;
+            if (codePoint < MIN_SUPPLEMENTARY_CODE_POINT) {
+                result = C1.clone();
----------------
I guess cloning is faster than new instantiation, as fresh one has to be
initialized by 0s, but not sure

Cloning has to initialize as well, with the original contents.
Often there is special support for zeroing newly allocated objects.
So I think this is a bad idea.


Sounds reasonable.
In case of more complicated objects but arrays, can cloning be faster?

================
+    static int toSurrogates(int codePoint, char[] dst, int index) {
+        // We write elements "backwards" to guarantee all-or-nothing //
TODO: isn't forward faster ?
+        dst[index+1] = lowSurrogate(codePoint);
+        dst[index] = highSurrogate(codePoint);
----------------
IMO this guarantee is nowhere needed, so why potentially waste performance ?

All-or-nothing is a fundamental software reliability principle.
Why have your data structure in a corrupted state
if it is not too onerous to avoid it?
The difference in behavior is certainly user-visible.

OK, ...but if there is a speed advantage, we could note this in javadoc (as is just happens for invalid surrogate input in the calling method for the same reason). Anyway, in current state of HotSpot optimization, there seems to be no advantage according Christian Thalinger.

================
-        // Pass 2: Allocate and fill in char[]
-        char[] v = new char[n];
-        for (int i = offset, j = 0; i < offset + count; i++) {
-            int c = codePoints[i];
-            if (c < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
-                v[j++] = (char) c;
-            } else {
-                Character.toSurrogates(c, v, j);
-                j += 2;
-            }
-        }
-
-        this.value  = v;
+        // Pass 2: Allocate and fill in char[]
+        for (int i = offset, j = 0; i < offset + count; i++) {
+            int c = codePoints[i];
+            if (c < Character.MIN_SUPPLEMENTARY_CODE_POINT)
+                value[j++] = (char)c;
+            else
+                j += Character.toSurrogates(c, value, j);
       }
----------------
I guess HotSpot is aware of immutability of String, so no need to locally
buffer v expletive to save indirection and against concurrent modifications.

I like the simplification of this method a lot, but it actually helps
hotspot a little
to hold the new array in a local and assign it to a field at the end
of the constructor.

Okay, so we have to wait for HotSpot being able to "see" that
(1) first holding the variable locally would increase speed
(2) this is thread safe, as we are in constructor, so no concurrent modification can happen. (3) no method in class String allows modification of content of array "value" after first initialization.

* Even javac could "see" that, and do the optimization on byte code level. (maybe anti-confusion technique is needed for the debugger, ...but we also accept, that debugger runs into a StringBuilder#append() loop on base of a _single_ char[] in case of _multiple_ string concatenation.)


Martin, what do you think about adding code point accessors to java.nio.CharBuffer (see my X-Buffer patch) ?


-Ulf


This sort of thing (copying a field into a local) is done in
java.util.concurrent a lot.

---

It would help to separate cosmetic and "real" changes to get them
reviewed and accepted.


Martin



Reply via email to