On Sep 5, 2007, at 10:19 PM, Chad Loder wrote:
> Is there any way using iText to develop a heuristic which detects PDFs
> which have been improperly redacted in the below fashion? Even if it
> is not 100% reliable.
>

        Yes...

        HOWEVER, it will require you to do a LOT of work, since iText today
only gives you the lower level of functionality. iText can parse
the content stream of a page (or XObject) for you, but that's it.
You then need to take the results of that parse, build up a
"display list", compute the bounds of each object in that list, and
then compare those bounds for overlap.
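
        To make the "compare bounds" step concrete, here is a rough sketch
in Java. It assumes the redaction in question is the usual one: a filled
shape painted over text that is still present in the content stream. It
also assumes you have already built the display list yourself; every
class and method name below (RedactionCheck, Bounds, Item,
findCoveredText) is made up for illustration and is NOT part of iText.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are iText classes.
final class RedactionCheck {

    /** Axis-aligned bounding box in page coordinates (x0 <= x1, y0 <= y1). */
    static final class Bounds {
        final double x0, y0, x1, y1;

        Bounds(double x0, double y0, double x1, double y1) {
            this.x0 = x0; this.y0 = y0; this.x1 = x1; this.y1 = y1;
        }

        /** True when the two boxes share some area (touching edges do not count). */
        boolean overlaps(Bounds o) {
            return x0 < o.x1 && o.x0 < x1 && y0 < o.y1 && o.y0 < y1;
        }
    }

    enum Kind { TEXT, FILLED_PATH }

    /** One entry of the display list, stored in painting order. */
    static final class Item {
        final Kind kind;
        final Bounds bounds;
        final String label;

        Item(Kind kind, Bounds bounds, String label) {
            this.kind = kind; this.bounds = bounds; this.label = label;
        }
    }

    /**
     * Returns every text item whose bounds overlap a filled path that is
     * painted LATER in the list, i.e. drawn on top of the text.
     */
    static List<Item> findCoveredText(List<Item> displayList) {
        List<Item> hits = new ArrayList<>();
        for (int i = 0; i < displayList.size(); i++) {
            Item text = displayList.get(i);
            if (text.kind != Kind.TEXT) {
                continue;
            }
            for (int j = i + 1; j < displayList.size(); j++) {
                Item later = displayList.get(j);
                if (later.kind == Kind.FILLED_PATH && later.bounds.overlaps(text.bounds)) {
                    hits.add(text);
                    break; // one covering shape is enough to flag this text
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Item> page = new ArrayList<>();
        page.add(new Item(Kind.TEXT, new Bounds(72, 700, 300, 712), "visible line"));
        page.add(new Item(Kind.TEXT, new Bounds(72, 680, 300, 692), "secret line"));
        page.add(new Item(Kind.FILLED_PATH, new Bounds(70, 678, 302, 694), "black box"));
        for (Item hit : findCoveredText(page)) {
            System.out.println("possibly hidden text: " + hit.label);
        }
    }
}
```

        With the sample data in main, only "secret line" is flagged. This
is the easy part. A real check would also have to look at the fill
color, clipping paths, transparency and the current transformation
matrix, so treat a hit as "worth a look" and nothing more.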

        Give yourself at least a month or two - including a thorough reading  
of the PDF Reference.

        OR start with a library such as PdfBox or Multivalent that already
has some or all of this framework in place.


Leonard

