On 4 Sept. 09 at 03:34, Dola Woolfe wrote:
I'm trying to put together several elements to build a PDF translator.
1. Load a PDF in a foreign language (???)
2. Translate the content (Google Translate)
3. Output the translated PDF (FOP)
So I'm guessing step 1 is not part of FOP. Can you perhaps recommend
what I can use for 1.?
Thanks again!
I think you should try iText. You will find an explanation of what you
need near the end of "iText in Action", the authoritative book by
Bruno Lowagie, the guy who designed iText in the first place. And
before proceeding with your project you *should* read the caveats in his
book: extracting text content from an existing PDF may not be as
straightforward as you think, and in certain situations the result may
be close to nonsense. A PDF API will give you the text content in the
order it was technically generated, which may not be the "textual"
order (the order in which you would read the elements in a book).
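
To illustrate the ordering problem, here is a minimal sketch in Python.
It is not iText code: the Chunk type, the reading_order function, the
tolerance value and the sample data are all made up for the example. It
assumes the extraction step gives you each piece of text together with
its position on the page, and it rebuilds a plausible reading order by
grouping chunks into lines and sorting them.

```python
from typing import NamedTuple


class Chunk(NamedTuple):
    """Hypothetical extracted text fragment with its page position."""
    text: str
    x: float  # left edge, in points
    y: float  # baseline, in points (PDF origin is bottom-left, so larger y is higher)


def reading_order(chunks, line_tolerance=2.0):
    """Group chunks into lines by baseline, then order lines top to bottom
    and the chunks inside each line left to right."""
    lines = []
    for chunk in sorted(chunks, key=lambda c: -c.y):
        # Same line if the baseline is close to the first chunk of the current line.
        if lines and abs(lines[-1][0].y - chunk.y) <= line_tolerance:
            lines[-1].append(chunk)
        else:
            lines.append([chunk])
    return [" ".join(c.text for c in sorted(line, key=lambda c: c.x))
            for line in lines]


# Chunks in the order a PDF producer might have written them (invented data):
# the footer first, then the words of the body out of sequence.
generated = [
    Chunk("Page 1", 500, 30),
    Chunk("world", 140, 700),
    Chunk("Second line", 72, 686),
    Chunk("Hello", 72, 700.5),
]

for line in reading_order(generated):
    print(line)
```

This only holds for a simple single-column, left-to-right page. With
several columns, tables or rotated text, sorting by position gives the
wrong result too, which is exactly the kind of situation the caveats
are about.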
My own experience on top of this is that it is very difficult to
extract text content from non-European or large fonts (the CID-keyed
fonts: roughly speaking, those that contain more characters than
WinAnsi or ISO-8859-1).
HTH,
Jean-François