I can't thank you enough for a response like that. Definitely increases my
knowledge of what's going on!
From: Leonard Rosenthol [mailto:[email protected]]
Sent: Monday, December 19, 2011 3:34 PM
To: Post here
Subject: Re: [iText-questions] Using iText to analyze PDF for distilling issues
Yes, it most certainly CAN include visual data!
Consider the following:
* PostScript doesn't support JBIG2 or JPEG2000 compression, so to convert a
PDF that uses either one to PostScript, the image data must be decompressed and
recompressed. If the new compression choice is lossy (such as JPEG), then
you've thrown away all sorts of image data. AND if you do that again when
converting BACK to PDF - double loss!
* PostScript doesn't support color management (such as ICC profiles), so if
you have a PDF that uses them, the colors must be changed to device-dependent
ones (and then changed back again when going back to PDF). Again, double color
shift!
* PostScript doesn't support layers (aka optional content groups), so if you
convert a PDF that uses them you will suddenly see ALL SORTS of new objects
that you weren't expecting!
Those are just a few examples - but hopefully they demonstrate the point.
Additionally, text searchability is lost during this conversion - there is no
guarantee that after a refrying operation any text on the page(s) can be
searched properly, since the act of conversion to PostScript only bothers to
maintain visual fidelity (PostScript is for printing, after all!).
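As a rough way to see whether a given file is exposed to the three losses listed above, one could scan it for the PDF name tokens involved before refrying it. The sketch below is only a heuristic under stated assumptions: the function name `find_refry_risks` and the `RISKY_NAMES` table are made up for illustration, it uses plain Python rather than iText, and it reads raw bytes instead of parsing the PDF. It will miss names stored inside compressed object streams or written with `#xx` escapes, and it could in principle match stray bytes inside binary stream data, so a real check would go through a proper PDF parser.

```python
import re

# PDF name tokens for the features mentioned in the message above.
# Illustrative only, not an exhaustive list of what PostScript cannot carry.
RISKY_NAMES = {
    b"/JBIG2Decode": "JBIG2-compressed image (must be recompressed)",
    b"/JPXDecode": "JPEG2000-compressed image (must be recompressed)",
    b"/ICCBased": "ICC-based color space (color management is dropped)",
    b"/OCProperties": "optional content groups (layers are flattened)",
}


def find_refry_risks(pdf_bytes):
    """Count raw occurrences of each risky name token in the PDF bytes.

    Returns a dict mapping a human-readable description to a hit count.
    Names hidden in compressed object streams are NOT seen by this scan.
    """
    counts = {}
    for name, description in RISKY_NAMES.items():
        # Require a non-alphanumeric byte (or end of data) after the name,
        # so /ICCBased does not match a longer name such as /ICCBasedFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        hits = len(re.findall(pattern, pdf_bytes))
        if hits:
            counts[description] = hits
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python find_refry_risks.py some.pdf
    with open(sys.argv[1], "rb") as handle:
        for description, hits in find_refry_risks(handle.read()).items():
            print(f"{hits:4d} x {description}")
```

An empty result does not prove a file is safe to refry; it only means none of these particular tokens appeared in uncompressed form.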
Leonard
From: "Rick.Wellman" <[email protected]>
Reply-To: Post here <[email protected]>
Date: Mon, 19 Dec 2011 13:14:28 -0800
To: Post here <[email protected]>
Subject: Re: [iText-questions] Using iText to analyze PDF for distilling issues
Well, yes, you are preaching to the choir. We are definitely pursuing another
vendor and are thankfully 50%+ done with converting to their tool.
Admittedly, PDF format/issues are not my strength... definitely in the learning
stage and I appreciate your candid responses. They have, at a minimum,
verified that I am not entirely crazy (as it concerns this issue).
I do wonder if you might try to answer one more question based on your
response. When you say the process is HUGELY lossy, does that include "visual
data"? Through the always fallible human review experience, we have found that
very little "visual data" is ever lost (though there are times that it is).
Would it surprise you to hear that? I can definitely see where some
of the PDF "features" are lost (signatures, etc.).
Thanks for the bandwidth you have been able to spare up to this point. I do
understand there is a vagueness to my questions but your responses are
definitely helping me focus my own research.
From: Leonard Rosenthol [mailto:[email protected]]
Sent: Monday, December 19, 2011 1:59 PM
To: Post here
Subject: Re: [iText-questions] Using iText to analyze PDF for distilling issues
If I had a "vendor supplied PDF tool" that wasn't handling valid PDF files -
I'd go find myself a new vendor and/or tool. The process of "refrying" a PDF
(PDF->PS->PDF) is HUGELY lossy. You are throwing away SO MUCH information that
can NEVER be recovered. You might as well just poke yourself in the eye with a
sharp stick - less painful for you and your customers.
Leonard