First, I suggest you avoid using the term "4-byte Japanese characters", since a byte count has no meaning except in the context of a specific encoding, such as UTF-8, UTF-16, or UTF-32. In Java, all String objects are exposed as UTF-16, i.e., as sequences of 16-bit code units. So BMP characters use one 16-bit code unit, and non-BMP characters use two 16-bit code units, i.e., a high and a low surrogate.
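
To make the encoding dependence concrete, here is a minimal sketch that reports how many UTF-16 code units, code points, and UTF-8 bytes a string takes. The class and method names are my own and purely illustrative. It uses U+30AB (KATAKANA LETTER KA) as a BMP example and U+20BB7 (a CJK Extension B ideograph) as a non-BMP example.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper (hypothetical names, not part of FOP or the JDK).
public class CodeUnitDemo {

    /** Number of UTF-16 code units, which is what String.length() returns. */
    static int utf16Units(String s) {
        return s.length();
    }

    /** Number of Unicode code points in the string. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Number of bytes the string occupies when encoded as UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the string contains at least one code point outside the BMP. */
    static boolean hasNonBmp(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String katakanaKa = "\u30AB";                              // BMP
        String extB = new String(Character.toChars(0x20BB7));      // non-BMP

        for (String s : new String[] {katakanaKa, extB}) {
            System.out.printf("U+%04X: utf16Units=%d codePoints=%d utf8Bytes=%d nonBmp=%b%n",
                    s.codePointAt(0), utf16Units(s), codePoints(s), utf8Bytes(s), hasNonBmp(s));
        }
    }
}
```

For the Katakana letter this reports one code unit and three UTF-8 bytes; for U+20BB7 it reports two code units (a surrogate pair) and four UTF-8 bytes. A Katakana letter only occupies four bytes in UTF-32.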

Hiragana characters are encoded in the BMP in the range U+3040 to U+309F [1], Katakana in U+30A0 to U+30FF [2], and CJK Ideographs starting at U+4E00 [3].

[1] http://www.unicode.org/charts/PDF/U3040.pdf
[2] http://www.unicode.org/charts/PDF/U30A0.pdf
[3] http://www.unicode.org/charts/PDF/U4E00.pdf

For external FO files, or XML files you will translate to FO via XSLT, you should use the UTF-8 encoding of Unicode, and ensure that you provide a correct XML declaration at the beginning of your file:

<?xml version="1.0" encoding="utf-8"?>

I also suggest that you ensure the presence of the UTF-8 encoding of the BOM [4] at the beginning of the file: 0xEF 0xBB 0xBF.

[4] http://en.wikipedia.org/wiki/Byte_order_mark

On Wed, May 20, 2015 at 3:20 AM, mrunal28 <loha...@gmail.com> wrote:

> Hi Glenn,
>
> I am trying to understand if Katakana japanese language is a BMP unicode as
> per below link:
> http://www.sttmedia.com/unicode-basiclingualplane
> <http://www.sttmedia.com/unicode-basiclingualplane>
>
> If I assume that Katakana is a 4-byte japanese language. As per your reply
> if Katakana is BMP encoded and FOP supports it, then using FOP 1.1, my code
> should render 4-byte japanese characters correctly in pdf.
>
> I am attaching my code which I am using to convert japanese text into pdf.
> Please find attached files. fop_allfonts.xconf
> <http://apache-fop.1065347.n5.nabble.com/file/n42155/fop_allfonts.xconf>
> ExampleXML2PDF.java
> <http://apache-fop.1065347.n5.nabble.com/file/n42155/ExampleXML2PDF.java>
>
> Please if you have suggestion on shared files.
>
> So Questions are:
> 1. Is Kanataka is BMP encoded? I assume it is.
> 2. Am I missing something in code to convert japanese 4-byte into pdf?
>
> --
> View this message in context:
> http://apache-fop.1065347.n5.nabble.com/FOP-1-1-Japanese-4-byte-characters-are-rendering-as-in-pdf-tp42117p42155.html
> Sent from the FOP - Users mailing list archive at Nabble.com.
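
P.S. Regarding the XML declaration and BOM advice earlier in this reply, here is a rough sketch of one way to produce such a file from Java. The class name, method names, and output file name are hypothetical and only for illustration; the point is simply that the three BOM bytes come first, followed by the declaration and the content, all encoded as UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Illustrative helper (hypothetical names, not part of FOP).
public class Utf8XmlFile {

    /** UTF-8 encoding of the byte order mark U+FEFF. */
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    static final String XML_DECL = "<?xml version=\"1.0\" encoding=\"utf-8\"?>";

    /** Returns BOM + XML declaration + body, with the text encoded as UTF-8. */
    static byte[] encode(String body) {
        byte[] content = (XML_DECL + "\n" + body).getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[UTF8_BOM.length + content.length];
        System.arraycopy(UTF8_BOM, 0, out, 0, UTF8_BOM.length);
        System.arraycopy(content, 0, out, UTF8_BOM.length, content.length);
        return out;
    }

    /** True if the data begins with the bytes 0xEF 0xBB 0xBF. */
    static boolean startsWithBom(byte[] data) {
        return data.length >= 3
                && data[0] == UTF8_BOM[0]
                && data[1] == UTF8_BOM[1]
                && data[2] == UTF8_BOM[2];
    }

    public static void main(String[] args) throws IOException {
        // Sample body containing KATAKANA LETTER KA; the file name is arbitrary.
        byte[] data = encode("<root>\u30AB</root>");
        Files.write(Paths.get("sample-utf8.xml"), data);
        System.out.println("starts with BOM: " + startsWithBom(data));
    }
}
```

The startsWithBom check can also be run against the bytes of an existing input file to confirm what your editor or export tool actually wrote.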