wwkloo wrote
> When create the PDF with another program, the text can be extracted by
> iText and Acrobat Reader XI correctly.
> - 1: 0xD841 0xDD47
> - 2: 0x92DB
>
> However, the character is not displayed correctly. :(
> iTextExtract_O.pdf
> <http://itext-general.2136553.n4.nabble.com/file/n4657858/iTextExtract_O.pdf>

This other program seems to know only Unicode 1.x and, therefore, only code points below 0x10000. Thus, it treats its input 0xD841 0xDD47 as two separate characters rather than as a UTF-16 surrogate pair encoding the single code point 0x20547.

wwkloo wrote
> Please help!

I'm sure the iText developers responsible for porting the Java version to .Net will look into the handling of Unicode characters beyond the Basic Multilingual Plane sometime soon.

Regards,
Michael
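
For reference, the relationship between the two code units 0xD841 0xDD47 and the code point 0x20547 can be checked with a few lines of plain Java. This is only a minimal sketch of the standard UTF-16 surrogate arithmetic; the class name `SurrogatePairDemo` and the helper `combine` are made up for illustration and are not part of iText or of the other program's code.

```java
// Hypothetical demo class, not part of iText.
class SurrogatePairDemo {

    // Combine a UTF-16 high/low surrogate pair into one code point by hand.
    static int combine(char high, char low) {
        if (!Character.isHighSurrogate(high) || !Character.isLowSurrogate(low)) {
            throw new IllegalArgumentException("not a surrogate pair");
        }
        // Each surrogate carries 10 bits; the result is offset by 0x10000.
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        char high = 0xD841;
        char low = 0xDD47;

        int manual = combine(high, low);
        int viaJdk = Character.toCodePoint(high, low); // JDK equivalent
        System.out.printf("manual: U+%X, JDK: U+%X%n", manual, viaJdk);

        // A String holding the pair has two chars but only one code point.
        String s = new String(new char[] { high, low });
        System.out.println("UTF-16 code units: " + s.length());
        System.out.println("code points: " + s.codePointCount(0, s.length()));
    }
}
```

Software that iterates over `char` values (code units) sees two characters here, while software that iterates over code points sees one, which matches the difference described above.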