Hi guys,
I ran into a problem when trying to extract CJK (Chinese, Japanese, Korean) text
from a PDF.
You may already know what I'm going to ask: why can't some CJK
text be extracted?
Could someone tell me:
1) Why can Adobe Reader display it normally?
2) If I want to extract ALL the CJK text, what do I need to get?
A CIDToUnicode mapping file?
3) Why can't I find the BT...ET structure in the PDF file itself, but only
after splitting it page by page?
Any help would be greatly appreciated.
Thanks a lot.
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions
iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference
to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples:
http://itextpdf.com/themes/keywords.php