At 06:01 AM 6/7/2004, Whenham Patrick (Gecotec) wrote:
I have tons of image + hidden text pdfs ( paper -> tif -> pdf -> OCR-image+hidden text ).

OK.


To comply with legal constraints we have to deal with (100% quality OCR or none),

Seems reasonable.

        Is there a reason you don't just fix the OCR - or didn't originally?


I must remove the hidden text from the pdf files (the text generated by running OCR 'image + hidden text' on image pdfs, text that does not display, but allows searching and copy & paste ).

Is there any function in iText that allows this ?

No, there isn't.

In fact, I am not aware of any off-the-shelf product that can do this - but it could certainly be built using other libraries or tools.
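
As a rough illustration of what such a tool might do, here is a minimal sketch in Python. It assumes the OCR layer was written the way many OCR tools write it: as text objects (BT ... ET) that set text rendering mode 3 (invisible) with the "3 Tr" operator. It also assumes you already have the page content stream as decompressed bytes from some PDF library. The function name strip_invisible_text is hypothetical and not part of iText or any other library.

```python
import re

# One text object: everything from a BT operator to the next ET operator.
# The lookarounds require whitespace (or start/end of data) around the operator.
TEXT_OBJECT = re.compile(rb"(?<![^\s])BT(?![^\s]).*?(?<![^\s])ET(?![^\s])", re.DOTALL)

# "3 Tr" selects text rendering mode 3: neither fill nor stroke (invisible).
INVISIBLE_MODE = re.compile(rb"(?<![^\s])3\s+Tr(?![^\s])")


def strip_invisible_text(content: bytes) -> bytes:
    """Drop every BT...ET block that sets rendering mode 3 (hypothetical helper).

    Assumption: 'content' is an already-decompressed page content stream.
    """
    def drop_if_invisible(match):
        block = match.group(0)
        # Remove the whole text object if it switches to invisible mode.
        return b"" if INVISIBLE_MODE.search(block) else block

    return TEXT_OBJECT.sub(drop_if_invisible, content)
```

This is only a sketch. The rendering mode is part of the text state and carries over between text objects, so a file that sets "3 Tr" once and then emits several BT ... ET blocks would need a real content-stream parser that tracks state. A regular expression can also be fooled by a literal string that happens to contain " ET ". Check the result on sample files before trusting it for anything with legal weight.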


Leonard

---------------------------------------------------------------------------
Leonard Rosenthol                            <mailto:[EMAIL PROTECTED]>
Chief Technical Officer                      <http://www.pdfsages.com>
PDF Sages, Inc.                              215-938-7080 (voice)
                                             215-938-0880 (fax)


