Looks good to me.

Process points
- you need a 2nd reviewer.
- the fix should go into JDK 8 first, then get backported.

-phil.

On 9/3/2012 8:40 AM, Vladislav Karnaukhov wrote:
Hello,

could anyone review this please?

Thanks in advance,
- Vlad

On 8/27/2012 6:51 PM, Vladislav Karnaukhov wrote:
Hello,

could you please review a new version?
Please find webrev here: http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.04/

Regards,
- Vlad

On 6/29/2012 2:09 AM, Phil Race wrote:
That would work but I think its cleaner to just move it all into mapNumericReference
as below.

-phil.

diff --git a/src/share/classes/javax/swing/text/html/parser/Parser.java b/src/share/classes/javax/swing/text/html/parser/Parser.java
--- a/src/share/classes/javax/swing/text/html/parser/Parser.java
+++ b/src/share/classes/javax/swing/text/html/parser/Parser.java
@@ -952,7 +952,7 @@
                         ch = readCh();
                         break;
                 }
-                char data[] = {mapNumericReference((char) n)};
+                char data[] = mapNumericReference(n);
                 return data;
             }
             addString('#');
@@ -1021,7 +1021,7 @@
     }

     /**
-     * Converts numeric character reference to Unicode character.
+     * Converts numeric character reference to char array.
      *
      * Normally the code in a reference should be always converted
      * to the Unicode character with the same code, but due to
@@ -1030,13 +1030,21 @@
      * to displayable characters with other codes.
      *
      * @param c the code of numeric character reference.
-     * @return the character corresponding to the reference code.
+     * @return a char array corresponding to the reference code.
      */
-    private char mapNumericReference(char c) {
-        if (c < 130 || c > 159) {
-            return c;
+    private char[] mapNumericReference(int c) {
+        char[] data;
+        if (c >= 0xffff) { // outside unicode BMP.
+            try {
+                data = Character.toChars(c);
+            } catch (IllegalArgumentException e) {
+                data = new char[0];
         }
-        return cp1252Map[c - 130];
+        } else {
+            data = new char[1];
+ data[0] = (c < 130 || c > 159) ? (char)c : cp1252Map[c - 130];
+        }
+        return data;
     }


On 06/28/12 07:58 AM, Vladislav Karnaukhov wrote:
Hello Phil, Pavel,

thank you for your comments. I've reworked fix and testcase, please find new webrev here: http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.03/

Regards,
- Vlad

On 6/27/2012 10:13 PM, Phil Race wrote:
Well its not only unnecessary but is likely wrong .. I don't think
you looked at what mapNumericReference() does.

-phil.

On 6/27/2012 11:03 AM, Vladislav Karnaukhov wrote:
Hello Phil,

I used Character.toChars() in both branches because I wanted to delegate code point conversion to char or surrogate pair entirely to Character class...

Regards,
- Vlad

On 6/26/2012 9:49 PM, Phil Race wrote:
I don't understand why you call Character.toChars() if you've just determined
you don't need to ?

ie what was wrong with

data = ( n >>> 16 == 0) ? {mapNumericReference((char) n)} : Character.toChars(n);

?

In the case of an invalid supplementary pair, maybe it would be safer to return { ' ' } ?


One thing I see in the parsing code that is not new or changed here, that may bear examination, is that there's a loop that keeps on reading so long so long there are new digits. I am not sure its wise to keep going once
you overflow.

-phil.

On 6/26/2012 12:37 AM, Vladislav Karnaukhov wrote:
Hello Pavel,

I can provide you with the link to 6u19, but this is direct forward-port and no code changes were made.

I'll make changes as you've pointed out in 1) and 2)

About 3) - is it a requirement to use "? :" operator? I personally prefer single-line if-else, but I don't want to argue over code style, and surely I'll follow code design practices.

Regards,
- Vlad

On 6/25/2012 6:43 PM, Pavel Porvatov wrote:
Hi Vladislav,

Do you have a link to the fix for 6u19?

I didn't investigate the fix deeply, but

1.
 private final int MAX_BMP_BOUND = 65535;
should be static (otherwise variable name should be in lower case)

2. Add a space in single line comments

3.
+                    char data[];
+                    if (n <= MAX_BMP_BOUND) {
+ data = Character.toChars(mapNumericReference((char) n));
+                    } else {
+                        data = Character.toChars(n);
+                    }
+
                 return data;

can be written in one line via "? :" operator and looks more readable for me

Thanks, Pavel
Hello,

please review the fix for 6836089: Swing HTML parser can't properly decode codepoints outside the Unicode Plane 0 into a surrogate pair. This is a forward port from JDK6 (fixed escalated issue, fix integrated) to JDK7.

The issue is a defect in Swing HTML Parser: if the codepoint is outside BMP (Unicode Plain 0), Parser incorrectly decodes codepoint into surrogate pair. The fix is to use Character.toChars() method if codepoint value is greater than upper bound of BMP.

Webrev: http://cr.openjdk.java.net/~vkarnauk/6836089/webrev.00/
Bug description: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6836089

Regards,
- Vlad















Reply via email to