Re: [iText-questions] extracting text from pdfs with japanese data

Paulo Soares Tue, 16 Dec 2008 01:56:04 -0800

There's code in PdfEncodings to parse the cmaps and convert to/from Unicode. 
The font contains the cmap name.
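
To make the idea concrete, here is a minimal sketch of what "parse a cmap and convert to Unicode" means. This is not the PdfEncodings code: ToyCMap, parse and decode are hypothetical names made up for illustration, and it only handles fixed two-byte codes with range lines in the bfrange style (<lo> <hi> <dst>). The real cmap files are more involved (codespace ranges, variable-length codes). The sample range is the first three Shift-JIS punctuation codes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy decoder for range lines in CMap syntax. Hypothetical, not iText's PdfEncodings. */
class ToyCMap {
    /** Codes lo..hi map to consecutive Unicode values starting at dst. */
    private static final class Range {
        final int lo, hi, dst;
        Range(int lo, int hi, int dst) { this.lo = lo; this.hi = hi; this.dst = dst; }
    }

    private final List<Range> ranges = new ArrayList<>();

    /** True for tokens of the form <hexdigits>. */
    private static boolean isHexToken(String t) {
        return t.length() > 2 && t.startsWith("<") && t.endsWith(">");
    }

    private static int hex(String t) {
        return Integer.parseInt(t.substring(1, t.length() - 1), 16);
    }

    /** Collects every "<lo> <hi> <dst>" line; all other lines are skipped. */
    static ToyCMap parse(String body) {
        ToyCMap cmap = new ToyCMap();
        for (String line : body.split("\\R")) {
            String[] t = line.trim().split("\\s+");
            if (t.length == 3 && isHexToken(t[0]) && isHexToken(t[1]) && isHexToken(t[2])) {
                cmap.ranges.add(new Range(hex(t[0]), hex(t[1]), hex(t[2])));
            }
        }
        return cmap;
    }

    /** Reads the data as big-endian two-byte codes; unmapped codes become U+FFFD. */
    String decode(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i + 1 < data.length; i += 2) {
            int code = ((data[i] & 0xFF) << 8) | (data[i + 1] & 0xFF);
            char ch = '\uFFFD';
            for (Range r : ranges) {
                if (code >= r.lo && code <= r.hi) {
                    ch = (char) (r.dst + (code - r.lo));
                    break;
                }
            }
            out.append(ch);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        ToyCMap cmap = parse("1 beginbfrange\n<8140> <8142> <3000>\nendbfrange\n");
        String text = cmap.decode(new byte[] {(byte) 0x81, 0x41, (byte) 0x81, 0x42});
        // Print code points in hex so the result does not depend on console encoding.
        for (char c : text.toCharArray()) {
            System.out.println(Integer.toHexString(c));
        }
    }
}
```

In a real extractor the string passed to parse would come from the cmap resource whose name is stored in the font, rather than from a literal.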


Paulo

----- Original Message ----- 
From: "1T3XT info" <i...@1t3xt.info>
To: "Post all your questions about iText here" 
<itext-questions@lists.sourceforge.net>
Sent: Tuesday, December 16, 2008 9:19 AM
Subject: Re: [iText-questions] extracting text from pdfs with japanese data


Hoppe, Michael wrote:
> The CMap-files are included in the iTextAsianCmaps.jar. So couldn’t they
> be read from that jar in case there is no font information in the pdf?

I'm just thinking out loud here, I didn't dive into the problem yet,
but: do you think it's possible for iText to find which CMap-file is to
be inspected based on the font information available in the PDF?

As Kevin already said: this part of iText is pretty new. We're all
excited about it, but for the moment it's all highly experimental.
-- 
This answer is provided by 1T3XT BVBA
http://www.1t3xt.com/ - http://www.1t3xt.info

