I've been trying to extract the stream of an indirect object from a PDF file with several libraries, without success.
I've discovered PDFBox and it seems the best fit for my purpose.
Here is a snippet from my PDF file:

558 0 obj
<</Contents 583 0 R/CropBox[0 0 595.22 842]/MediaBox[0 0 595.22 842]/Parent 29 0 R/Resources
 <</ColorSpace <</CS0 563 0 R>>
   /ExtGState <</GS0 568 0 R>>
   /Font<</TT0 559 0 R/TT1 560 0 R/TT2 561 0 R/TT3 562 0 R>>
   /ProcSet[/PDF/Text/ImageC]
   /Properties<</MC0<</*MYKEY* 584 0 R>>/MC1<</SubKey 582 0 R>> >>
   /XObject<</Im0 578 0 R>>>>
 /Rotate 0/StructParents 0/Type/Page>>
endobj

...
...

584 0 obj
<</Length 8>>stream

1_22_4_1    ->>> I NEED THIS!

endstream

So, I have to extract the string contained in the stream of the indirect object that MYKEY refers to. How can I do that? I can get it working in Python, for example with the pypdf library, but I got stuck when I tried to decrypt PDF files: the only ones I can decrypt in Python are files encrypted by the oldest (very, very old) versions of Acrobat.
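
To make the goal concrete, this is roughly the lookup I am after, sketched in plain Python with only the standard library. It is a simplification: it assumes an unencrypted file whose objects are stored as plain text (no object streams, no filter on the target stream). The names find_reference and read_stream are my own, not part of any library, and the sample bytes are a trimmed-down version of the snippet above. It does not address the decryption problem.

```python
import re


def find_reference(pdf: bytes, key: bytes):
    """Return (object number, generation) of the first '/<key> N G R' entry, or None."""
    m = re.search(rb"/" + re.escape(key) + rb"\s+(\d+)\s+(\d+)\s+R", pdf)
    return (int(m.group(1)), int(m.group(2))) if m else None


def read_stream(pdf: bytes, num: int, gen: int):
    """Return the raw bytes between 'stream' and 'endstream' of object 'num gen obj'.

    Assumption: the object is stored uncompressed and unencrypted, and its
    stream data does not itself contain the word 'endobj'.
    """
    header = re.search(rb"(?<!\d)%d\s+%d\s+obj\b" % (num, gen), pdf)
    if header is None:
        return None
    # Limit the search to this object so a stream-less object returns None.
    end = pdf.find(b"endobj", header.end())
    if end == -1:
        end = len(pdf)
    pattern = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)
    body = pattern.search(pdf, header.end(), end)
    return body.group(1) if body else None


# Hypothetical, trimmed-down sample modelled on the snippet above.
sample = (
    b"558 0 obj\n"
    b"<</Type/Page/Resources<</Properties<</MC0<</MYKEY 584 0 R>>>>>>>>\n"
    b"endobj\n"
    b"584 0 obj\n"
    b"<</Length 8>>stream\n"
    b"1_22_4_1\n"
    b"endstream\n"
    b"endobj\n"
)

ref = find_reference(sample, b"MYKEY")
if ref is not None:
    print(read_stream(sample, *ref))
```

On an encrypted file the bytes returned this way would still be encrypted, which is exactly where I am stuck.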

Can anybody help me?
Thanks in advance
Giancarlo F.

P.S.: Sorry for my bad English.
