No, I don't think I have any big issues with fonts. The TextPosition
object gives you all the coordinates (X, Y, width, height), and I'm
just realizing that what I told you about the graphic state was wrong:
it is no longer needed once you have this object. The only strange
thing is that X and Y are measured from the top-left corner of the
page, whereas you'll need to position the AnnotationLink from the
bottom-left corner.
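To make the coordinate flip concrete, here is a minimal sketch in plain Java. It does not use PDFBox at all: LinkRectangle and fromTopLeftBox are hypothetical names invented for this illustration, and it assumes that Y is the distance from the top of the page to the text baseline and that the text extends upward from there by its height. If your Y refers to the top edge of the text instead, the vertical arithmetic has to be adjusted.

```java
/**
 * Hypothetical helper (not part of PDFBox): a rectangle expressed in
 * bottom-left-origin page coordinates, as a link annotation would need.
 */
final class LinkRectangle {
    final double lowerLeftX;
    final double lowerLeftY;
    final double upperRightX;
    final double upperRightY;

    private LinkRectangle(double llx, double lly, double urx, double ury) {
        this.lowerLeftX = llx;
        this.lowerLeftY = lly;
        this.upperRightX = urx;
        this.upperRightY = ury;
    }

    /**
     * Converts a text box given in top-left-origin coordinates into a
     * rectangle measured from the bottom-left corner of the page.
     *
     * Assumption: yFromTop is the distance from the top of the page to the
     * text baseline, and the text extends upward from it by `height`.
     */
    static LinkRectangle fromTopLeftBox(double x, double yFromTop,
                                        double width, double height,
                                        double pageHeight) {
        // Flip the vertical axis: distance from top becomes distance from bottom.
        double baselineFromBottom = pageHeight - yFromTop;
        return new LinkRectangle(x, baselineFromBottom,
                                 x + width, baselineFromBottom + height);
    }
}
```

For example, on a page 792 units high, a box at X=100, Y=92 with width 50 and height 10 becomes the rectangle from (100, 700) to (150, 710). The page height would have to come from the page's media box.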
Julien PLÉE
7 Avenue Barthélemy Salettes et Jean-Marie Manset
31320 Castanet-Tolosan
France
+33 6.50.00.60.48
On 17 Sept. 2010, at 21:06, José Rodolfo Carrijo de Freitas wrote:
Yeah, I've already written a StreamEngine to get image scale and position,
and I just took a look at the Annotation example, which cleared my path.
Now, as you suggested, I'm going to extend PDFTextStripper to get the
position of the text.
I'm afraid of having problems with fonts. Are you having any problems of
this kind?
Thanks,
José Rodolfo Carrijo de Freitas
Systems Analyst
Softplan - Research and Development Department
ISO 9001:2008 Certified Quality System
(48) 3027 8000 Ext. 8359
http://www.softplan.com.br
-----Original message-----
From: Julien Plée [mailto:[email protected]]
Sent: Friday, 17 September 2010 15:59
To: [email protected]
Subject: Re: wrap text with links
Yes, it is. This is almost exactly what I am working on at the moment.
To save you from wasting time on research, have a look at
PDFStreamEngine (more precisely, override its processTextPosition
method). If you manage to extend PDFTextStripper instead, that may be
better, since it handles the text flow even when the layout uses
columns. I didn't manage to do this, and PDFStreamEngine suits my needs
at the moment.
In the PDF, text is split into groups of words, and sometimes even words
are cut in half. You'll have to keep a look-behind buffer of the text you
have already seen, so that you can match across those splits while
parsing the flow.
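One way to picture matching across split fragments is the plain-Java sketch below. It is only an illustration under stated assumptions: ChunkMatcher, Chunk and findChunksContaining are hypothetical names, not PDFBox classes, and for simplicity it buffers the whole text flow rather than a rolling window. Each Chunk stands for one fragment of text together with the coordinates reported for it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of PDFBox) for matching across text fragments. */
final class ChunkMatcher {

    /** One text fragment as delivered by the parser, with its reported box. */
    static final class Chunk {
        final String text;
        final double x, y, width, height;

        Chunk(String text, double x, double y, double width, double height) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /**
     * Returns the chunks that overlap the first case-insensitive occurrence
     * of keyword in the concatenated flow, or an empty list if none is found.
     */
    static List<Chunk> findChunksContaining(List<Chunk> chunks, String keyword) {
        // Concatenate all fragments, remembering where each one starts.
        StringBuilder buffer = new StringBuilder();
        List<Integer> starts = new ArrayList<>();
        for (Chunk c : chunks) {
            starts.add(buffer.length());
            buffer.append(c.text);
        }
        String flow = buffer.toString();

        // Find the first occurrence of the keyword, ignoring case.
        int hit = -1;
        for (int i = 0; i + keyword.length() <= flow.length(); i++) {
            if (flow.regionMatches(true, i, keyword, 0, keyword.length())) {
                hit = i;
                break;
            }
        }

        List<Chunk> result = new ArrayList<>();
        if (hit < 0) {
            return result;
        }
        int end = hit + keyword.length();

        // Keep every chunk whose character range overlaps [hit, end).
        for (int i = 0; i < chunks.size(); i++) {
            int start = starts.get(i);
            int stop = start + chunks.get(i).text.length();
            if (start < end && stop > hit) {
                result.add(chunks.get(i));
            }
        }
        return result;
    }
}
```

The boxes of the returned chunks can then be merged into one approximate rectangle for the link. The result is only approximate, because a chunk may contain more text than the keyword itself.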
You'll need to deal with the graphic state (to get the text
coordinates) and will have to hack it a bit to get the approximate
position of words or sentences you are looking for (because of the
text flow structure).
Julien PLÉE
On 17 Sept. 2010, at 20:24, José Rodolfo Carrijo de Freitas wrote:
Hello,
Do you believe it is possible to read text from a PDF and wrap a piece
of that text with a link?
For example: if it finds “pdfbox” in the document, it will link it to
the pdfbox website.
Thanks,
José Rodolfo Carrijo de Freitas