Re: Problem getting values from PDF

Maruan Sahyoun Thu, 04 Apr 2013 03:13:02 -0700

Hi,

Can you try the non-sequential parser, PDDocument.loadNonSeq(..)?
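Independently of which parser is used, it may help to confirm what the raw file itself holds for each field, as in the object dump quoted below. Here is a minimal sketch using only the JDK, not PDFBox. The class name RawFieldScan and its scan method are hypothetical, and the sketch assumes uncompressed objects whose /T and /V entries are simple literal strings without nested or escaped parentheses. If a field name occurs more than once (for example after an incremental update), the last definition in file order wins.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; it is NOT part of PDFBox.
final class RawFieldScan {

    // "N G obj << ... >> endobj" with the dictionary body captured in group 3.
    private static final Pattern OBJ = Pattern.compile(
            "(\\d+)\\s+(\\d+)\\s+obj\\s*<<(.*?)>>\\s*endobj", Pattern.DOTALL);

    // Literal-string /T (field name) and /V (field value) entries.
    // Assumption: no ')' inside the string, escaped or nested.
    private static final Pattern T = Pattern.compile("/T\\s*\\(([^)]*)\\)");
    private static final Pattern V = Pattern.compile("/V\\s*\\(([^)]*)\\)");

    // Maps each field name to its raw /V string, or null when /V is absent.
    static Map<String, String> scan(String raw) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher obj = OBJ.matcher(raw);
        while (obj.find()) {
            String dict = obj.group(3);
            Matcher t = T.matcher(dict);
            if (!t.find()) {
                continue; // not a named field dictionary
            }
            Matcher v = V.matcher(dict);
            // Later objects overwrite earlier ones with the same name.
            values.put(t.group(1), v.find() ? v.group(1) : null);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // ISO-8859-1 maps every byte to one char, so binary data survives.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String raw = new String(bytes, StandardCharsets.ISO_8859_1);
        for (Map.Entry<String, String> e : scan(raw).entrySet()) {
            System.out.println(e.getKey() + " => " + e.getValue());
        }
    }
}
```

Run it as "java RawFieldScan test.pdf". If a field shows a value here but the parsed document reports null for it, that points at the parser rather than at the file.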


BR

Maruan Sahyoun

On 04.04.2013 at 12:00, Tobias Tertel <[email protected]> wrote:

> Hey,
> 
> I have a PDF document and want to extract its values. The PDF document is a 
> form. Therefore I can't use the example with processField and field.getValue().
> I've done the following:
> 
> PDDocument pdf = PDDocument.load(new File("test.pdf"));
> PDDocumentCatalog docCatalog = pdf.getDocumentCatalog();
> PDAcroForm acroForm = docCatalog.getAcroForm();
> 
> for (Object pObject : acroForm.getFields()) {
>     PDField pfield = (PDField) pObject;
>     COSDictionary cos1 = (COSDictionary) pfield.getCOSObject();
>     for (Entry<COSName, COSBase> test : cos1.entrySet()) {
>         System.out.println(test.getKey().toString() + " : " + test.getValue().toString());
>     }
>     System.out.println(cos1.getString("T") + " => " + cos1.getString("V"));
>     System.out.println("done");
> }
> 
> The output is the following:
> 
> COSName{DA} : COSString{/HelveticaNeueLTPro-Bd 18 Tf 0 g}
> COSName{FT} : COSName{Tx}
> COSName{Kids} : COSArray{[COSObject{23, 0}, COSObject{33, 0}]}
> COSName{Q} : COSInt{1}
> COSName{T} : COSString{Marke_01}
> Marke_01 => null
> done
> COSName{DA} : COSString{/HelveticaNeueLTPro-Bd 48 Tf 0 g}
> COSName{FT} : COSName{Tx}
> COSName{Ff} : COSInt{4096}
> COSName{Kids} : COSArray{[COSObject{24, 0}, COSObject{34, 0}]}
> COSName{Q} : COSInt{1}
> COSName{T} : COSString{Produkt_01}
> COSName{V} : COSString{PåskesortimentnullNormalsortiment}
> Produkt_01 => PåskesortimentnullNormalsortiment
> done
> 
> In the output for the first field, the COSName{V} entry is missing.
> 
> The content of the PDF-Document is:
> 
> %EOF
> 11 0 obj
> <</DA(/HelveticaNeueLTPro-Bd 18 Tf 0 g)/FT/Tx/Kids[23 0 R 33 0 R]/Q 1/T(Marke_01)/V(test hallo)>>
> endobj
> 2 0 obj
> <</DA(/HelveticaNeueLTPro-Bd 48 Tf 0 g)/FT/Tx/Ff 4096/Kids[24 0 R 34 0 R]/Q 1/T(Produkt_01)/V(Påskesortiment\rNormalsortiment)>>
> endobj
> 
> Can anybody help me?
> Thank you.
> 
> Best regards
> Tobias
