Hello Mr Paulo,

I know about the direct way of extracting text, but my problem is that I want
to extract Arabic text from a PDF.
When I read the page content stream with iText (as a PRStream), the Arabic
text comes out as strange codes (038f-00ac). I want to convert these codes
back to the original Unicode using the CMap.
My question is: where is the CMap in the font dictionary?
The stream I get with iText is:

/TagSuspect <</TagSuspect /Ordering >>BDC  /P <</MCID 0/Lang (ar-EG)>> BDC BT
/F1 14.04 Tf
1 0 0 1 518.02 707.14 Tm
/GS10 gs
0 g
/GS11 gs
0 G
[<0003>4<03A2>5<039F039B>] TJ
...
<object number="5" category="DICTIONARY" type="/Font" subtype="/Type0">
   <DICTIONARY>
      <INDIRECT key="/DescendantFonts" number="6" generation="0" value="6 0 R" />
      <NAME key="/BaseFont" value="/Arial" />
      <NAME key="/Type" value="/Font" />
      <NAME key="/Encoding" value="/Identity-H" />
      <NAME key="/Subtype" value="/Type0" />
      <INDIRECT key="/ToUnicode" number="30" generation="0" value="30 0 R" />
   </DICTIONARY>
</object>

Where is the CMap itself, so that I can map these character codes to their
Unicode values?
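
In the dump above, /ToUnicode points at object 30 0 R, so that stream is
presumably where the CMap lives. Below is a minimal sketch, assuming the
decompressed text of that stream has already been read (with iText 5, probably
via PdfReader.getStreamBytes on the PRStream), of how its bfchar/bfrange
entries could be applied to the 2-byte codes in the TJ array. The class and
method names are hypothetical, and the sample mappings are invented for
illustration; the real values are whatever object 30 contains.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToUnicodeSketch {
    private static final Pattern SECTION =
            Pattern.compile("begin(bfchar|bfrange)(.*?)end\\1", Pattern.DOTALL);
    private static final Pattern HEX = Pattern.compile("<([0-9A-Fa-f]+)>");
    private static final Pattern CHAR_ENTRY =
            Pattern.compile("<([0-9A-Fa-f]+)>\\s*<([0-9A-Fa-f]+)>");
    private static final Pattern RANGE_ENTRY = Pattern.compile(
            "<([0-9A-Fa-f]+)>\\s*<([0-9A-Fa-f]+)>\\s*(?:<([0-9A-Fa-f]+)>|\\[(.*?)\\])",
            Pattern.DOTALL);

    // Reads a hex string as UTF-16BE code units (4 hex digits per unit).
    public static String hexToText(String hex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 4 <= hex.length(); i += 4) {
            sb.append((char) Integer.parseInt(hex.substring(i, i + 4), 16));
        }
        return sb.toString();
    }

    // Builds a code -> text map from the bfchar and bfrange sections of a CMap.
    public static Map<Integer, String> parseToUnicode(String cmap) {
        Map<Integer, String> map = new HashMap<>();
        Matcher section = SECTION.matcher(cmap);
        while (section.find()) {
            String body = section.group(2);
            if (section.group(1).equals("bfchar")) {
                Matcher m = CHAR_ENTRY.matcher(body);
                while (m.find()) {
                    map.put(Integer.parseInt(m.group(1), 16), hexToText(m.group(2)));
                }
                continue;
            }
            Matcher m = RANGE_ENTRY.matcher(body);
            while (m.find()) {
                int lo = Integer.parseInt(m.group(1), 16);
                int hi = Integer.parseInt(m.group(2), 16);
                if (m.group(3) != null) {
                    // <lo> <hi> <dst>: the last code unit of dst is incremented per code.
                    String base = hexToText(m.group(3));
                    if (base.isEmpty()) continue;
                    String head = base.substring(0, base.length() - 1);
                    char last = base.charAt(base.length() - 1);
                    for (int code = lo; code <= hi; code++) {
                        map.put(code, head + (char) (last + code - lo));
                    }
                } else {
                    // <lo> <hi> [<dst1> <dst2> ...]: one destination per code.
                    Matcher item = HEX.matcher(m.group(4));
                    for (int code = lo; code <= hi && item.find(); code++) {
                        map.put(code, hexToText(item.group(1)));
                    }
                }
            }
        }
        return map;
    }

    // Decodes every <hex> string in a Tj/TJ operand, assuming 2-byte codes
    // (as with /Identity-H); kerning numbers between the strings are skipped.
    public static String decodeShowText(String operand, Map<Integer, String> toUnicode) {
        StringBuilder out = new StringBuilder();
        Matcher hex = HEX.matcher(operand);
        while (hex.find()) {
            String digits = hex.group(1);
            for (int i = 0; i + 4 <= digits.length(); i += 4) {
                int code = Integer.parseInt(digits.substring(i, i + 4), 16);
                out.append(toUnicode.getOrDefault(code, "\uFFFD"));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented sample CMap; the real one is the content of object 30 0 R.
        String cmap = "2 beginbfchar\n<0003> <0020>\n<03A2> <0645>\nendbfchar\n"
                + "1 beginbfrange\n<039B> <039F> <0627>\nendbfrange\n";
        Map<Integer, String> map = parseToUnicode(cmap);
        String text = decodeShowText("[<0003>4<03A2>5<039F039B>]", map);
        // With the invented mappings this prints: U+0020 U+0645 U+062B U+0627
        for (char c : text.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

The decoded characters come out in content-stream order, which for Arabic may
be visual rather than logical order, so some reordering may still be needed
afterwards.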

> Date: Wed, 26 Jun 2013 17:52:05 +0100
> From: pgpsoa...@gmail.com
> To: itext-questions@lists.sourceforge.net
> Subject: Re: [iText-questions] Extract CMap from pdf file!
> 
> This is an easy one, the ToUnicode cmap is in the font dictionary. You
> can get the font dictionary from the page resources. Of course,
> there's a direct way to extract text from a PDF using iText without
> having to reinvent the wheel.
> 
> Paulo
> 
> On Wed, Jun 26, 2013 at 5:30 PM, Mohammed Mostafa
> <mohammed_mostafa1...@hotmail.com> wrote:
> > Hello All,
> >
> > How can I extract the ToUnicode CMap from a PDF file using the iText
> > library?
> >
> > I am using iText's PRStream to retrieve the page stream from the PDF, but
> > the page stream does not include the CMap.
> >
> > I look forward to your reply.
> >
> > Thanks,
> > Mohammed
> >
> > ------------------------------------------------------------------------------
> > This SF.net email is sponsored by Windows:
> >
> > Build for Windows Store.
> >
> > http://p.sf.net/sfu/windows-dev2dev
> > _______________________________________________
> > iText-questions mailing list
> > iText-questions@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/itext-questions
> >
> > iText(R) is a registered trademark of 1T3XT BVBA.
> > Many questions posted to this list can (and will) be answered with a
> > reference to the iText book: http://www.itextpdf.com/book/
> > Please check the keywords list before you ask for examples:
> > http://itextpdf.com/themes/keywords.php
                                          
