This is still a very broad request set, but a few comments:

Xpdf can be patched to disregard copy/edit/print restrictions (those
set with the master/owner password rather than the user password),
although the author has a statement on cracking:
http://www.foolabs.com/xpdf/cracking.html

You can see a fast sample patch for 3.0.2 (verified) here:
http://www.cs.cmu.edu/~dst/Adobe/Gallery/XPDF/hovland.txt

and general instructions for older versions here:
http://www.cs.cmu.edu/~dst/Adobe/Gallery/xpdf-generic-patch.html

These types of patches violate the Adobe implementation spec. FWIW.

If you'd rather try to brute force passwords, you can always try
pdfcrack (http://pdfcrack.sourceforge.net/). This may take a very long
time on 128-bit encrypted PDFs depending on the speed of your hardware
(although such tools also support masks, dictionaries, etc.). Older
40-bit RC4 encrypted PDFs can generally be cracked rapidly with this
and other tools (same for older .doc files; Googling will find you
dozens of programs, online and offline, that do this).
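
To put the 40-bit vs. 128-bit difference in rough numbers, here is a
back-of-the-envelope sketch in Python. The rate of one million keys
per second is purely an assumption for illustration (it is not a
measured figure for pdfcrack or any other tool), and the function name
is made up for this example:

```python
SECONDS_PER_DAY = 86_400


def brute_force_days(key_bits: int, keys_per_second: float) -> float:
    """Worst-case days needed to try every key of the given length."""
    keyspace = 2 ** key_bits
    return keyspace / keys_per_second / SECONDS_PER_DAY


# Assumed, illustrative rate only; real throughput depends on hardware and tool.
ASSUMED_RATE = 1_000_000

if __name__ == "__main__":
    for bits in (40, 128):
        days = brute_force_days(bits, ASSUMED_RATE)
        print(f"{bits}-bit: {2 ** bits:.3e} keys, {days:.3e} days worst case")
```

This only counts the raw key space. Guessing passwords with masks or
dictionaries searches a different (and usually far smaller) space,
which is why those options matter for 128-bit files.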

40-bit keys can be efficiently recovered (if you have a lot of
disk space and tight time requirements) with rainbow tables; you can
buy them (and the associated tools) from companies like Elcomsoft or
run something like Cain and Abel if you want a front-end, or get free
tables from http://www.freerainbowtables.com/ and run RainbowCrack
(http://project-rainbowcrack.com/). Note that RC4 table support is not
actually included in the free tools I've listed; I'm just making a
point about hash cracking in general.

Kam Woods
Postdoctoral Research Associate
School of Information and Library Science, University of North
Carolina at Chapel Hill


On Fri, Jan 20, 2012 at 9:01 AM, Farrell, Larry D
<[email protected]> wrote:
> At this point I was primarily targeting PDF and Microsoft Office files that 
> would be passed on to our cataloging folks for manual inspection if they were 
> DRM protected.  As has been pointed out on the list, general DRM detection 
> is far trickier than I'd initially thought.  I've been using Apache Tika for 
> file type detection, metadata and full text extraction.  However, when 
> parsing encrypted or password protected files it throws the less than 
> unhelpful "Unexpected Runtime Exception".
>
> Dean
