Re: RFR JDK-8244324: RTFEditorKit does not display some of Japanese characters correctly

Prasanta Sadhukhan Mon, 01 Jun 2020 04:51:58 -0700

Hi Vyom,

On 15-May-20 12:34 PM, Prasanta Sadhukhan wrote:

Thanks Vyom. You could have proposed the patch yourself only...
Anyway, I have tested with Font2DTest with all Unicode characters for defaultLatin and it seems OK. Would you be able to test in other CJK locales (as I am not sure the Unicode characters are displayed correctly there), just to ensure they are not adversely affected?

Any feedback on CJK locale testing with your fix? If not, I am afraid we need to retarget this fix for jdk16.


Regards
Prasanta

Regards
Prasanta
On 14-May-20 9:01 PM, Vyom Tiwari wrote:

Hi prasanta,

Code changes look OK to me. Although I am not an expert in this area, the same patch resolves the issue at our end.

Thanks,
Vyom

On Thu, May 14, 2020 at 4:20 PM Prasanta Sadhukhan <[email protected]> wrote:


    Hi All,

    Please review a fix for an issue where RTFEditorKit, when used
    to read Japanese characters, reads some garbage characters.

    The default character set used for the RTF document is set to
    "ansi" in our RTFReader.java.
    The share/classes/javax/swing/text/rtf/charsets/ansi.txt code
    table has undefined values, i.e., entries 91-98 and A0 (hex)
    are "0".
    According to javax/swing/text/rtf/RTFParser.java, if the ch is 0,
    handleText() is not called, so those characters are never passed on.
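To illustrate the effect described above, here is a minimal sketch, not the actual RTFReader/RTFParser code. The class name, method names and the simplified table are assumptions made for illustration; only the zeroed slots (91-98 and A0) come from the description of ansi.txt above.

```java
// Hypothetical illustration; not the JDK's RTFReader/RTFParser code.
final class UndefinedSlotDemo {

    // Build a simplified 256-entry table that maps each byte to the same
    // code point, except for the slots reported as "0" in ansi.txt
    // (0x91-0x98 and 0xA0). The real ansi table differs in other entries.
    static char[] tableWithUndefinedSlots() {
        char[] table = new char[256];
        for (int i = 0; i < 256; i++) {
            table[i] = (char) i;
        }
        for (int i = 0x91; i <= 0x98; i++) {
            table[i] = 0;
        }
        table[0xA0] = 0;
        return table;
    }

    // Translate bytes through the table, skipping entries that map to 0.
    // This mimics a parser that never forwards such characters.
    static String translate(byte[] input, char[] table) {
        StringBuilder out = new StringBuilder();
        for (byte b : input) {
            char ch = table[b & 0xFF];
            if (ch != 0) {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 0x93 0xFA is how Shift_JIS encodes the character 日.
        byte[] input = { 'A', (byte) 0x93, (byte) 0xFA, 'B' };
        String result = translate(input, tableWithUndefinedSlots());
        System.out.println(result.length()); // 3: the 0x93 byte was dropped
    }
}
```

The lead byte of the two-byte sequence falls in an undefined slot and silently disappears, which would leave a downstream decoder with an incomplete sequence.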

    As per http://www.biblioscape.com/rtf15_spec.htm#Heading8,

    An RTF file includes the following character set in its header:

        <character set>
            (\ansi | \mac | \pc | \pca)? \ansicpgN?

    where \ansicpgN represents the default ANSI code page
    used to perform the *Unicode to ANSI conversion* when writing RTF
    text. N represents the code page in decimal. This is typically
    set to the default ANSI code page of the run-time environment
    (for example, \ansicpg1252 for U.S. Windows). The reader can use
    the same ANSI code page to convert ANSI text back to Unicode.
    This keyword should be emitted in the RTF header section right
    after the \ansi, \mac, \pc or \pca keyword.

    The spec lists the possible values of N in a table. We can make
    use of \ansicpgN (which lets the reader convert ANSI text back to
    Unicode) by defining it to refer to the latin1TranslationTable
    [RTFParser inherits it from AbstractFilter]. That table does not
    include undefined areas, unlike ansi's translationTable, which
    has the undefined areas seen above.
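A rough sketch of the idea in the paragraph above, under stated assumptions: it is not the code in the webrev. The class and method names are hypothetical, and the identity table is only assumed to resemble latin1TranslationTable. It detects the \ansicpgN keyword in a header string and shows that an identity table leaves the 91-98 and A0 slots defined.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names are not taken from the JDK or the webrev.
final class AnsiCpgSketch {

    // Matches the \ansicpgN control word and captures N (decimal).
    private static final Pattern ANSICPG = Pattern.compile("\\\\ansicpg(\\d+)");

    // Return the code page named by \ansicpgN, if the header has one.
    static OptionalInt findCodePage(String rtfHeader) {
        Matcher m = ANSICPG.matcher(rtfHeader);
        if (m.find()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    // Identity table: each byte value 0x00-0xFF maps to the same code point,
    // so the 0x91-0x98 and 0xA0 slots are not zeroed.
    static char[] identityTable() {
        char[] table = new char[256];
        for (int i = 0; i < 256; i++) {
            table[i] = (char) i;
        }
        return table;
    }

    public static void main(String[] args) {
        String header = "{\\rtf1\\ansi\\ansicpg932\\deff0";
        System.out.println(findCodePage(header).isPresent()); // true
        System.out.println((int) identityTable()[0x93]);       // 147
    }
}
```

With such a table every byte reaches the text handler, so a later step can still interpret multi-byte sequences, whereas a zeroed slot discards the byte before anything else sees it.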

    Bug: https://bugs.openjdk.java.net/browse/JDK-8244324

    webrev: http://cr.openjdk.java.net/~psadhukhan/8244324/webrev.0/

    Note: I am not able to create a test case for this, as it involves
    reading from an RTF file that is probably copyrighted, and inserting
    the Japanese characters as a string (instead of reading an RTF
    file) did not work.



--
Thanks,
Vyom
