Re: RFR JDK-8244324: RTFEditorKit does not display some of Japanese characters correctly

Prasanta Sadhukhan Mon, 01 Jun 2020 05:31:00 -0700


On 01-Jun-20 5:55 PM, Jayashree Sk1 wrote:

It was only tested against Japanese characters in our environment.
That is where we found a character whose converted code point turned out
to be \'96, which had an undefined value (0) in the ansi translation table.

CK locales were not tested.

But the fix is not Japanese-locale specific. It needs to be tested for the other CK locales, for the normal case at least.


Regards
Prasanta

Thanks!

-----Prasanta Sadhukhan <[email protected]> wrote: -----
To: Vyom Tiwari <[email protected]>, [email protected]
From: Prasanta Sadhukhan <[email protected]>
Date: 06/01/2020 05:42PM
Cc: "[email protected]" <[email protected]>
Subject: [EXTERNAL] Re: <Swing Dev> RFR JDK-8244324: RTFEditorKit does not 
display some of Japanese characters correctly

I am not asking you for the test, although an automated jtreg test
would be welcome.
What I am asking is this: the issue was tested against Japanese
characters, but have you or anyone else verified the fix against other
CJK locales?

Regards
Prasanta

On 01-Jun-20 5:38 PM, Vyom Tiwari wrote:

Hi Prasanta,

Sorry for the late reply; my teammate Jayashree will send you the test today.

Thanks,

Vyom

On Mon, Jun 1, 2020 at 5:20 PM Prasanta Sadhukhan <[email protected]> wrote:

Hi Vyom,

On 15-May-20 12:34 PM, Prasanta Sadhukhan wrote:

Thanks Vyom. You could have proposed the patch yourself...
Anyway, I have tested with Font2DTest with all Unicode characters for
the default Latin locale and it seems OK. Will you be able to test in
other CJK locales (as I am not sure of the Unicode characters being
displayed correctly), just to ensure they are not adversely affected?

Regards
Prasanta

Any feedback on CJK locale testing with your fix? If not, I am afraid
we need to retarget this fix for jdk16.

Regards
Prasanta

On 14-May-20 9:01 PM, Vyom Tiwari wrote:

Hi Prasanta,

Code changes look OK to me, although I am not an expert in this area; the same patch resolves the issue at our end.

Thanks,
Vyom

On Thu, May 14, 2020 at 4:20 PM Prasanta Sadhukhan <[email protected]> wrote:

Hi All,

Please review a fix for an issue whereby RTFEditorKit, when used to
read Japanese characters, reads some garbage characters.

The default character set used for the RTF document is set to "ansi"
in our RTFReader.java, and the
share/classes/javax/swing/text/rtf/charsets/ansi.txt code table has
undefined values, i.e., 91-98 and A0 are "0". According to
javax/swing/text/rtf/RTFParser.java, if the ch is 0, handleText() is
not called.
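
The effect of those undefined entries can be illustrated with a minimal sketch. It is not the JDK's RTFParser: the class and method names are hypothetical, and the table is a simplified stand-in that is an identity mapping except for the zeroed entries 91-98 and A0 mentioned above. It only shows how a "skip when the entry is 0" rule silently drops a byte such as 0x96, while a table without gaps delivers it.

```java
import java.util.Arrays;

// Hypothetical stand-in for a table-driven text filter; not the JDK's RTFParser.
class TranslationTableSketch {

    /** Builds a 256-entry identity table: byte value n maps to U+00nn (Latin-1 style). */
    static char[] identityTable() {
        char[] table = new char[256];
        for (int i = 0; i < table.length; i++) {
            table[i] = (char) i;
        }
        return table;
    }

    /** Copies the identity table and zeroes 0x91-0x98 and 0xA0 (assumed gaps, per the description above). */
    static char[] tableWithGaps() {
        char[] table = identityTable();
        Arrays.fill(table, 0x91, 0x99, (char) 0); // the end index is exclusive
        table[0xA0] = 0;
        return table;
    }

    /** Translates bytes through the table, skipping any byte whose entry is 0. */
    static String translate(byte[] input, char[] table) {
        StringBuilder out = new StringBuilder();
        for (byte b : input) {
            char ch = table[b & 0xFF];
            if (ch != 0) {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] input = {'A', (byte) 0x96, 'B'};
        System.out.println(translate(input, tableWithGaps()).length()); // 2: the 0x96 byte is dropped
        System.out.println(translate(input, identityTable()).length()); // 3: every byte is delivered
    }
}
```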
As per http://www.biblioscape.com/rtf15_spec.htm#Heading8, an RTF file
includes the following character set in its header:

    <character set>
    (\ansi | \mac | \pc | \pca)? \ansicpgN?

Where,

    \ansicpgN  This keyword represents the default ANSI code page used
    to perform the Unicode to ANSI conversion when writing RTF text. N
    represents the code page in decimal. This is typically set to the
    default ANSI code page of the run-time environment (for example,
    \ansicpg1252 for U.S. Windows). The reader can use the same ANSI
    code page to convert ANSI text back to Unicode. This keyword should
    be emitted in the RTF header section right after the \ansi, \mac,
    \pc or \pca keyword. Possible values include those in the following
    table.

We can make use of ansicpgN (which can switch ANSI text to Unicode) and
define it to refer to the latin1TranslationTable [RTFParser inherits it
from AbstractFilter], which does not include undefined areas, instead of
ansi's translationTable, which has undefined areas as seen above.

Bug: https://bugs.openjdk.java.net/browse/JDK-8244324
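
As an illustration of the \ansicpgN keyword quoted above, the following sketch extracts N from an RTF header and uses a matching Java charset to convert one ANSI byte back to Unicode. This is not what the webrev does (the proposal maps the keyword to latin1TranslationTable). The class name, the method names, the sample header string, and the "windows-N" charset naming are all assumptions made for the example; that naming only holds for some code pages, such as 1252.

```java
import java.nio.charset.Charset;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of javax.swing.text.rtf.
class AnsiCodePageSketch {

    // Matches a literal backslash followed by "ansicpg" and a decimal code page.
    private static final Pattern ANSICPG = Pattern.compile("\\\\ansicpg(\\d+)");

    /** Returns N from the first \ansicpgN keyword, or empty if the header has none. */
    static OptionalInt codePage(String rtfHeader) {
        Matcher m = ANSICPG.matcher(rtfHeader);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    /** Decodes a single byte with the charset assumed to correspond to the code page. */
    static String decode(int codePage, byte b) {
        // Assumption: "windows-" + N names a charset; true for 1252, not for every code page.
        Charset cs = Charset.forName("windows-" + codePage);
        return new String(new byte[] {b}, cs);
    }

    public static void main(String[] args) {
        int cp = codePage("{\\rtf1\\ansi\\ansicpg1252\\deff0").getAsInt();
        String s = decode(cp, (byte) 0x96);
        // In windows-1252, byte 0x96 is the en dash.
        System.out.printf("cp=%d U+%04X%n", cp, (int) s.charAt(0)); // prints "cp=1252 U+2013"
    }
}
```

A real reader would also need a fallback for headers without \ansicpgN and for code pages the runtime has no charset for.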

webrev: http://cr.openjdk.java.net/~psadhukhan/8244324/webrev.0/

Note: I am not able to create a test case for this, as it involves
reading from an RTF file, which is probably copyrighted, and inserting
Japanese characters as a string (instead of an RTF file) was not
working.

--Thanks,

                    Vyom

--Thanks,

          Vyom
