Re: [iText-questions] Extract CMap from pdf file!

2013-06-27 Thread iText Info
On 27/06/2013 10:35, Mohammed Mostafa wrote: > Does TextRenderInfo give me the text size, text coordinates and text color? http://api.itextpdf.com/itext/com/itextpdf/text/pdf/parser/TextRenderInfo.html You get the baseline, ascent line and descent line as a LineSegment. Then there's: http://api.itextpd

Re: [iText-questions] Extract CMap from pdf file!

2013-06-27 Thread iText Info
On 27/06/2013 9:48, Mohammed Mostafa wrote: Hint Mr Paulo: I work with the page stream rather than extracting text directly because I need information associated with the text, such as font size, font name and x,y coordinates, that exists in the page stream. Hint Mr Mohammed: TextRenderInfo gives you all that i

Re: [iText-questions] Extract CMap from pdf file!

2013-06-26 Thread Paulo Soares
.14 Tm > /GS10 gs > 0 g > /GS11 gs > 0 G > [<0003>4<03A2>5<039F039B>] TJ > ... > > Where is the CMap itself, so that I can map these character codes to Unicode? > >> Date: Wed, 26 Jun 2013 17:52:05 +0100 >> Fro

Re: [iText-questions] Extract CMap from pdf file!

2013-06-26 Thread Mohammed Mostafa
Where is the CMap itself, so that I can map these character codes to Unicode? > Date: Wed, 26 Jun 2013 17:52:05 +0100 > From: pgpsoa...@gmail.com > To: itext-questions@lists.sourceforge.net > Subject: Re: [iText-questions] Extract CMap from pdf file! > > This i

Re: [iText-questions] Extract CMap from pdf file!

2013-06-26 Thread Paulo Soares
This is an easy one: the ToUnicode CMap is in the font dictionary. You can get the font dictionary from the page resources. Of course, there's a direct way to extract text from a PDF using iText without having to reinvent the wheel. Paulo On Wed, Jun 26, 2013 at 5:30 PM, Mohammed Mostafa wrote:
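
For readers following the thread: once the ToUnicode stream has been read from the font dictionary and decompressed, mapping codes such as <0003> or <039F039B> to Unicode is a table lookup. The sketch below is only an illustration of that lookup in plain Java, not iText's own code. The class name ToUnicodeSketch, its parse and decode methods, and the sample CMap contents are all hypothetical (the thread never shows the real CMap). It assumes 2-byte character codes and ignores the array form of bfrange.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToUnicodeSketch {

    private static final Pattern SECTION =
            Pattern.compile("beginbf(char|range)(.*?)endbf\\1", Pattern.DOTALL);
    private static final Pattern HEX = Pattern.compile("<([0-9A-Fa-f]+)>");

    // Builds a code -> text table from the bfchar and bfrange sections.
    public static Map<Integer, String> parse(String cmap) {
        Map<Integer, String> map = new HashMap<>();
        Matcher section = SECTION.matcher(cmap);
        while (section.find()) {
            boolean isRange = section.group(1).equals("range");
            List<String> tokens = new ArrayList<>();
            Matcher hex = HEX.matcher(section.group(2));
            while (hex.find()) {
                tokens.add(hex.group(1));
            }
            // bfchar entries are <src> <dst>; bfrange entries are <lo> <hi> <dst>.
            int step = isRange ? 3 : 2;
            for (int i = 0; i + step <= tokens.size(); i += step) {
                int lo = Integer.parseInt(tokens.get(i), 16);
                int hi = isRange ? Integer.parseInt(tokens.get(i + 1), 16) : lo;
                String base = utf16(tokens.get(i + step - 1));
                for (int code = lo; code <= hi; code++) {
                    // Within a range, the last UTF-16 unit grows with the code.
                    char last = (char) (base.charAt(base.length() - 1) + (code - lo));
                    map.put(code, base.substring(0, base.length() - 1) + last);
                }
            }
        }
        return map;
    }

    // Reads a hex string as UTF-16BE code units, four hex digits each.
    private static String utf16(String hex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hex.length(); i += 4) {
            int end = Math.min(i + 4, hex.length());
            sb.append((char) Integer.parseInt(hex.substring(i, end), 16));
        }
        return sb.toString();
    }

    // Decodes the hex body of a string operand, assuming 2-byte codes.
    public static String decode(String hexBody, Map<Integer, String> map) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 4 <= hexBody.length(); i += 4) {
            int code = Integer.parseInt(hexBody.substring(i, i + 4), 16);
            sb.append(map.getOrDefault(code, "\uFFFD")); // unmapped code
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical CMap body; the values are invented for illustration.
        String cmap = "2 beginbfchar\n<0003> <0020>\n<03A2> <0041>\nendbfchar\n"
                + "1 beginbfrange\n<039B> <039F> <0061>\nendbfrange\n";
        Map<Integer, String> map = parse(cmap);
        // The three hex strings from the TJ array quoted in the thread.
        for (String s : new String[] {"0003", "03A2", "039F039B"}) {
            System.out.println("<" + s + "> -> \"" + decode(s, map) + "\"");
        }
    }
}
```

With the invented mapping above, <039F039B> decodes to "ea". A real ToUnicode CMap can also contain bfrange entries whose destination is an array and destinations that are longer than one UTF-16 unit, which is part of why the existing iText text extraction is usually the easier route.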