Re: RFR JDK-8244324: RTFEditorKit does not display some of Japanese characters correctly

Sergey Bylokhov Mon, 29 Jun 2020 19:44:58 -0700

+1

On 6/25/20 4:24 am, Prasanta Sadhukhan wrote:

Since we are not able to test other CJK locales locally, I think it's better to 
commit this fix early in the JDK 16 cycle so that it has time to bake and 
customers in other CJK locales can test it out.


Anyways, it seems to work for default cases in Font2DTest.

If there are no objections, I will sponsor this as I am ok with this fix.

Regards
Prasanta
On 15-May-20 12:34 PM, Prasanta Sadhukhan wrote:


Thanks, Vyom. You could have proposed the patch yourself...

Anyway, I have tested with Font2DTest with all Unicode ranges for the default 
Latin locale and it seems OK. Will you be able to test in other CJK locales (as 
I am not sure the Unicode characters are displayed correctly there) just to 
ensure they are not adversely affected?

Regards
Prasanta
On 14-May-20 9:01 PM, Vyom Tiwari wrote:

Hi Prasanta,

The code changes look OK to me. I am not an expert in this area, but the 
same patch resolves the issue at our end.
Thanks,
Vyom

On Thu, May 14, 2020 at 4:20 PM Prasanta Sadhukhan wrote:

    Hi All,

    Please review a fix for an issue where RTFEditorKit, when used to read 
Japanese characters, reads some garbage characters.

    The default character set used for the RTF document is set to "ansi" in our 
RTFReader.java.
    The share/classes/javax/swing/text/rtf/charsets/ansi.txt code table has 
undefined values, i.e., entries 91-98 and A0 are "0". According to 
javax/swing/text/rtf/RTFParser.java, if ch is 0, handleText() is not called, 
so bytes that map to these entries are dropped.
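
To illustrate the effect described above, here is a minimal sketch. It does not 
use the JDK's RTFParser; the class TranslationTableSketch and its methods are 
hypothetical stand-ins, and the gap positions (91-98 and A0) are taken from the 
description of ansi.txt above. It assumes, as one example, that the input is 
the Shift_JIS byte pair 0x93 0xFA, whose first byte falls in the undefined range.

```java
// Hypothetical stand-in for a byte-to-char translation filter; not the JDK's RTFParser.
public class TranslationTableSketch {

    // 256-entry identity table: byte value n maps to char n (Latin-1 style, no gaps).
    public static char[] identityTable() {
        char[] table = new char[256];
        for (int i = 0; i < 256; i++) {
            table[i] = (char) i;
        }
        return table;
    }

    // Same table, but with 0x91-0x98 and 0xA0 set to 0 (undefined),
    // as described for ansi.txt in this thread.
    public static char[] tableWithGaps() {
        char[] table = identityTable();
        for (int i = 0x91; i <= 0x98; i++) {
            table[i] = 0;
        }
        table[0xA0] = 0;
        return table;
    }

    // Translates each byte through the table. Entries equal to 0 are skipped,
    // mirroring the "handleText() is not called" behaviour described above.
    public static String translate(byte[] input, char[] table) {
        StringBuilder out = new StringBuilder();
        for (byte b : input) {
            char ch = table[b & 0xFF];
            if (ch != 0) {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two bytes; the first one (0x93) lies inside the undefined range.
        byte[] input = {(byte) 0x93, (byte) 0xFA};
        System.out.println(translate(input, tableWithGaps()).length());  // 1: 0x93 was dropped
        System.out.println(translate(input, identityTable()).length());  // 2: both bytes survive
    }
}
```

With the gapped table, one byte of a two-byte sequence is lost before any later 
decoding step can see it, which is consistent with the garbage characters 
reported in the bug.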

    As per http://www.biblioscape.com/rtf15_spec.htm#Heading8,

    RTF file includes the following Character set in its header:
    <character set>
    (\ansi | \mac | \pc | \pca)? \ansicpgN?
    Where,
    \ansicpgN This keyword represents the default ANSI code page used to 
perform the *Unicode to ANSI conversion* when writing RTF text. N represents 
the code page in decimal. This is typically set to the default ANSI code page 
of the run-time environment (for example, \ansicpg1252 for U.S. Windows). The 
reader can use the same ANSI code page to convert ANSI text back to Unicode. 
This keyword should be emitted in the RTF header section right after the \ansi, 
\mac, \pc or \pca keyword.

    (The spec then lists the possible values of N in a table.)

    We can make use of \ansicpgN (which lets a reader convert ANSI text back to 
Unicode) and define it to refer to the latin1TranslationTable [RTFParser 
inherits it from AbstractFilter], which does not include undefined areas, 
instead of ansi's translationTable, which has the undefined areas seen above.
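
As a rough sketch of the idea only (the actual change is in the webrev and may 
well differ), the following shows one way to detect the \ansicpgN keyword in an 
RTF header and choose a translation table accordingly. The class AnsiCpgSketch, 
its methods, and the two table parameters are hypothetical; they are not part 
of javax.swing.text.rtf.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the JDK implementation.
public class AnsiCpgSketch {

    // Matches the \ansicpgN control word and captures N (decimal code page, 1-5 digits).
    private static final Pattern ANSICPG = Pattern.compile("\\\\ansicpg(\\d{1,5})");

    // Returns the code page number from the header text, if the keyword is present.
    public static OptionalInt codePage(String rtfHeader) {
        Matcher m = ANSICPG.matcher(rtfHeader);
        if (m.find()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    // Assumption for illustration: when \ansicpgN is present, use a gap-free
    // Latin-1 table; otherwise fall back to the "ansi" table.
    public static char[] selectTable(String rtfHeader, char[] ansiTable, char[] latin1Table) {
        return codePage(rtfHeader).isPresent() ? latin1Table : ansiTable;
    }

    public static void main(String[] args) {
        String header = "{\\rtf1\\ansi\\ansicpg932\\deff0";
        System.out.println(codePage(header).getAsInt()); // 932
    }
}
```

Passing every byte through a gap-free table means a later stage can still 
decode multi-byte text, whereas a table with undefined entries discards bytes 
up front.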

    Bug: https://bugs.openjdk.java.net/browse/JDK-8244324

    webrev: http://cr.openjdk.java.net/~psadhukhan/8244324/webrev.0/

    Note: I am not able to create a test case for this, as it involves reading 
from an RTF file, which is probably copyrighted, and inserting the Japanese 
characters as a string (instead of reading an RTF file) did not work.



--
Thanks,
Vyom



--
Best regards, Sergey.
