金健康 wrote:
> Can iText implement my idea?
> 
> 1. Suppose I have an ebook (book_en.pdf) in English.

Consider this PDF to be like a vector image.
(This is an ASSUMPTION based on an educated guess.)

> 2. Parse book_en.pdf:

You can parse the file structure of a PDF and discover PDF objects
such as null, boolean, number, string, name, array, dictionary, stream.
And you can parse content streams that consist of operators and
operands (the Adobe Imaging Model).
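
To make "operators and operands" concrete, here is a minimal sketch in
plain Java (it does NOT use the iText API). It assumes the content stream
has already been decompressed into a string, and it only understands
numbers, names, literal strings and array brackets; hex strings with
spaces, dictionaries, inline images and the keywords true/false/null are
not handled. The class and method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ContentStreamSketch {

    /** Splits a content stream fragment into tokens; a literal string "(...)" stays one token. */
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (c == '(') {
                // Literal string: parentheses may nest, a backslash escapes the next character.
                int depth = 0;
                do {
                    char d = s.charAt(i);
                    if (d == '\\') i++;
                    else if (d == '(') depth++;
                    else if (d == ')') depth--;
                    i++;
                } while (depth > 0 && i < s.length());
                i = Math.min(i, s.length());
            } else if (c == '[' || c == ']' || c == ')') {
                i++; // array brackets (and a stray ')') are single-character tokens
            } else {
                while (i < s.length() && !Character.isWhitespace(s.charAt(i))
                        && "()[]".indexOf(s.charAt(i)) < 0) {
                    i++;
                }
            }
            tokens.add(s.substring(start, i));
        }
        return tokens;
    }

    /** In this sketch, anything that is not a number, name, string or bracket counts as an operator. */
    static boolean isOperand(String t) {
        return t.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)") || "/([]<".indexOf(t.charAt(0)) >= 0;
    }

    /** Operands come first, the operator last: collect operands until an operator shows up. */
    static List<String> parse(String s) {
        List<String> instructions = new ArrayList<>();
        List<String> operands = new ArrayList<>();
        for (String t : tokenize(s)) {
            if (isOperand(t)) {
                operands.add(t);
            } else {
                instructions.add(t + " " + operands);
                operands = new ArrayList<>();
            }
        }
        return instructions;
    }

    public static void main(String[] args) {
        for (String line : parse("BT /F1 12 Tf 72 700 Td (Hello) Tj [(W) -80 (orld)] TJ ET")) {
            System.out.println(line);
        }
    }
}
```

Note what comes out: font selections, positions and short strings. Nothing
in that list says "this is a paragraph" or even "this is a word".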

>   2.1 Export a paragraph or a phrase, translate it to Chinese, and
> write it to book_zh.pdf;

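The concepts of paragraph and phrase are UNKNOWN in PDF:
the page content just draws glyphs at given positions.
If your PDF is tagged, you have a chance of retrieving the English
text in its logical order; otherwise you have to reconstruct it from
the content stream, or use OCR software if the pages are scanned images.
Translating English to Chinese is off-topic on this list.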
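
To show why a paragraph has to be guessed rather than read, here is a
rough sketch in plain Java (again NOT the iText API). It assumes you have
already obtained text fragments with their page coordinates by some means;
the Fragment class, the tolerance parameter and the "join with a space"
rule are all assumptions made for this illustration, not a reliable
algorithm.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LineGroupingSketch {

    /** A piece of text plus the position where it was drawn (hypothetical type). */
    static final class Fragment {
        final double x, y;
        final String text;
        Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /**
     * Guesses text lines: fragments whose baselines differ by at most
     * `tolerance` are put on the same line, lines run top to bottom,
     * fragments within a line run left to right.
     */
    static List<String> toLines(List<Fragment> fragments, double tolerance) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        // PDF user space has its origin at the bottom left, so a larger y is higher on the page.
        sorted.sort(Comparator.comparingDouble((Fragment f) -> -f.y));

        List<List<Fragment>> groups = new ArrayList<>();
        for (Fragment f : sorted) {
            List<Fragment> last = groups.isEmpty() ? null : groups.get(groups.size() - 1);
            if (last != null && Math.abs(last.get(0).y - f.y) <= tolerance) {
                last.add(f);
            } else {
                List<Fragment> group = new ArrayList<>();
                group.add(f);
                groups.add(group);
            }
        }

        List<String> lines = new ArrayList<>();
        for (List<Fragment> group : groups) {
            group.sort(Comparator.comparingDouble((Fragment f) -> f.x));
            StringBuilder sb = new StringBuilder();
            for (Fragment f : group) {
                if (sb.length() > 0) sb.append(' '); // heuristic: one space between fragments
                sb.append(f.text);
            }
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

Even this only yields lines, and only for simple single-column pages.
Deciding where a paragraph, a heading or a section starts takes more
guesswork on top of it (gaps, indentation, font size).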

>   2.2 and so on, and so on...

You should learn more about PDF before even thinking
about "and so on" options.

> Generally speaking, book_en.pdf is the same as book_zh.pdf except for
> the language.
> 
> Most important of all, the formatting (chapter, section, subsection
> and so on) must be identical.

The best way to do this is to hire a human being to
translate the PDF and to create a new PDF based on
the translation.

br,
Bruno

_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions
