Hi folks,
I am trying to extract a certain kind of information from a PDF. I think the
correct PDF term for it is 'layer'; at least that is what Adobe Acrobat Reader
calls it. Is this possible with the latest PDFBox? These layers store additional
information and are marked with colored rectangles inside the document. The
layer itself shows up when the colored part of the text is clicked. I am not
sure about the terminology, but I think the colored rectangles are
different from the layer and are called 'annotations'. I can extract
something with:
page.getAnnotations();
annotation.getDictionary();
The extracted data does not look very meaningful, but it corresponds to the
annotated colored rectangles (and layers?). An example of the extracted data
looks like this:
COSDictionary{(COSName{C}:COSArray{[COSInt{0}, COSInt{0}, COSInt{1}]})
(COSName{Border}:COSArray{[COSInt{0}, COSInt{0}, COSInt{0}]})
(COSName{A}:COSDictionary{(COSName{S}:COSName{JavaScript})
(COSName{JS}:COSDictionary{(COSName{Length}:COSInt{478})
(COSName{Filter}:COSName{FlateDecode}) }) }) (COSName{Subtype}:COSName{Link})
(COSName{Rect}:COSArray{[COSFloat{277.28}, COSFloat{304.13}, COSFloat{326.75},
COSFloat{311.03}]}) }
But none of the information that is stored in the pop-up layer is in it.
An example of such a 'layer' can be seen here:
http://alexanderklenner.de/img/pdf_user_example_layer.jpg
So far I don't see a way to get at this stored information.
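One thing the dump above does show is that the link's action (A) is of type
JavaScript and that its JS entry is a stream with Filter FlateDecode, i.e.
zlib-compressed data. My guess, and it is only a guess, is that the pop-up
text might be produced by that script. Below is a minimal sketch of inflating
such data with plain Java (java.util.zip only). It assumes the raw stream
bytes are already at hand and that no predictor is applied; how to get those
bytes out of PDFBox is not shown. The class name, the helper methods and the
sample script are hypothetical and only there for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of PDFBox.
public class FlateSketch {

    // Inflate zlib-wrapped (FlateDecode) data back into the original bytes.
    static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                break; // input is truncated or needs a preset dictionary
            }
            out.write(buffer, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    // Compress bytes the same way, only to have test data for the demo.
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        // Made-up script standing in for the real (unknown) stream content.
        String script = "app.alert('hypothetical pop-up text');";
        byte[] packed = deflate(script.getBytes(StandardCharsets.UTF_8));
        String unpacked = new String(inflate(packed), StandardCharsets.UTF_8);
        System.out.println(unpacked);
    }
}
```

If the decoded script really contains the pop-up text, it would still have to
be picked out of the JavaScript source by hand or with a pattern.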
Thank you for any help,
Alex
--
Dipl. Bioinformatiker Alexander G. Klenner
Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
Schloss Birlinghoven, D-53754 Sankt Augustin
Tel.: +49 - 2241 - 14 - 2736
E-mail: [email protected]
Internet: http://www.scai.fraunhofer.de