Hi everybody again, I'm trying to figure out whether your method suits my needs, but every time I try to access the AcroForm (even in a PDF file with scripts and forms) it is null. Am I loading the file the wrong way? Am I missing something?
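Best regards.

________________________________________
From: Tilman Hausherr [thaush...@t-online.de]
Sent: Wednesday, 24 August 2016 18:24
To: users@pdfbox.apache.org
Subject: Re: Check for scripts in a PDF

On 24.08.2016 at 15:41, Davide Zoni wrote:
> Thank you. This might be helpful, but I'm afraid that I would not be able to
> check every possibility. Is there a way to check whether a PDF is static (or
> dynamic)? For our purposes that should be enough.

No, there is no such method.

Tilman

> Best regards.
>
> Davide Zoni
>
> Cedacri S.p.A.
>
> Tel.: 0521807433
>
> e-mail: davide.z...@cedacri.it
>
> www.cedacri.it
>
>
> ________________________________________
> From: Tilman Hausherr [thaush...@t-online.de]
> Sent: Tuesday, 23 August 2016 18:23
> To: users@pdfbox.apache.org
> Subject: Re: Check for scripts in a PDF
>
> On 23.08.2016 at 09:35, Davide Zoni wrote:
>> Yes, I'm seeking to detect files with scripts. Not static. I don't understand
>> what you mean by "Maybe compare
>> with the preflight source code to check that you didn't miss something". Can
>> you elaborate on that?
> I meant to search for "Javascript" in the source code, and then see
> where it is used. This is just so that you can be more sure that you got
> it all when you read the PDF specification.
>
> Btw I once wrote some code to show (some) javascript fields, see below
> or search for "Roberto Nibali Javascript". He also improved that code
> and posted the improved version. It may not find all javascript stuff,
> but it could help show you how to write code.
>
> Tilman
>
>
> // Imports for PDFBox 2.0.x
> import java.io.File;
> import java.io.IOException;
> import java.util.List;
> import org.apache.pdfbox.cos.COSDictionary;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
> import org.apache.pdfbox.pdmodel.interactive.action.PDAction;
> import org.apache.pdfbox.pdmodel.interactive.action.PDActionJavaScript;
> import org.apache.pdfbox.pdmodel.interactive.action.PDAnnotationAdditionalActions;
> import org.apache.pdfbox.pdmodel.interactive.action.PDFormFieldAdditionalActions;
> import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
> import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
> import org.apache.pdfbox.pdmodel.interactive.form.PDField;
> import org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField;
> import org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField;
>
> public class PrintJavaScriptFields
> {
>
>     /**
>      * This will print all the fields from the document.
>      *
>      * @param pdfDocument The PDF to get the fields from.
>      *
>      * @throws IOException If there is an error getting the fields.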
>      */
>     public void printFields(PDDocument pdfDocument) throws IOException
>     {
>         PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
>         PDAcroForm acroForm = docCatalog.getAcroForm();
>         if (acroForm == null)
>         {
>             // getAcroForm() returns null when the catalog has no AcroForm entry,
>             // so there are no form fields to walk.
>             return;
>         }
>         List<PDField> fields = acroForm.getFields();
>
>         //System.out.println(fields.size() + " top-level fields were found on the form");
>
>         for (PDField field : fields)
>         {
>             processField(field, "|--", field.getPartialName());
>         }
>     }
>
>     private void processField(PDField field, String sLevel, String sParent) throws IOException
>     {
>         String partialName = field.getPartialName();
>
>         if (field instanceof PDTerminalField)
>         {
>             PDTerminalField termField = (PDTerminalField) field;
>             for (PDAnnotationWidget widget : termField.getWidgets())
>             {
>                 // JavaScript attached directly to the widget (its A entry)
>                 PDAction action = widget.getAction();
>                 if (action instanceof PDActionJavaScript)
>                 {
>                     System.out.println(field.getFullyQualifiedName() + ": "
>                             + action.getClass().getSimpleName() + " js widget action:\n"
>                             + action.getCOSObject());
>                     printPossibleJS(action);
>                 }
>                 // JavaScript in the widget's additional actions (its AA entry)
>                 PDAnnotationAdditionalActions actions = widget.getActions();
>                 if (actions != null)
>                 {
>                     System.out.println(field.getFullyQualifiedName() + ": "
>                             + actions.getClass().getSimpleName() + " js widget actionS:\n"
>                             + actions.getCOSObject());
>
>                     // Strange: why do I get a PDAnnotationAdditionalActions (which
>                     // contains a K entry but has no getK()) instead of a
>                     // PDFormFieldAdditionalActions?
>                     PDFormFieldAdditionalActions ffActions =
>                             new PDFormFieldAdditionalActions((COSDictionary) actions.getCOSObject());
>                     printPossibleJS(ffActions.getK());
>                     printPossibleJS(ffActions.getC());
>                     printPossibleJS(ffActions.getF());
>                     printPossibleJS(ffActions.getV());
>                 }
>             }
>         }
>
>         if (field instanceof PDNonTerminalField)
>         {
>             if (!sParent.equals(field.getPartialName()))
>             {
>                 if (partialName != null)
>                 {
>                     sParent = sParent + "."
>                             + partialName;
>                 }
>             }
>             //System.out.println(sLevel + sParent);
>
>             for (PDField child : ((PDNonTerminalField) field).getChildren())
>             {
>                 processField(child, "| " + sLevel, sParent);
>             }
>         }
>         else
>         {
>             String fieldValue = field.getValueAsString();
>             StringBuilder outputString = new StringBuilder(sLevel);
>             outputString.append(sParent);
>             if (partialName != null)
>             {
>                 outputString.append(".").append(partialName);
>             }
>             outputString.append(" = ").append(fieldValue);
>             outputString.append(", type=").append(field.getClass().getName());
>             //System.out.println(outputString);
>         }
>     }
>
>     private void printPossibleJS(PDAction kAction)
>     {
>         if (kAction instanceof PDActionJavaScript)
>         {
>             PDActionJavaScript jsAction = (PDActionJavaScript) kAction;
>             String jsString = jsAction.getAction();
>             if (jsString == null)
>             {
>                 // The action has no usable JS entry, so there is nothing to print.
>                 return;
>             }
>             if (!jsString.contains("\n"))
>             {
>                 // Otherwise nothing shows up in NetBeans?!
>                 jsString = jsString.replaceAll("\r", "\n").replaceAll("\n\n", "\n");
>             }
>             System.out.println(jsString);
>             System.out.println();
>         }
>     }
>
>     /**
>      * This will read a PDF file and print out the form elements. <br />
>      * see usage() for commandline
>      *
>      * @param args command line arguments
>      *
>      * @throws IOException If there is an error importing the FDF document.
>      */
>     public static void main(String[] args) throws IOException
>     {
>         PDDocument pdf = null;
>         try
>         {
>             pdf = PDDocument.load(new File(XXXXXX));
>             PrintJavaScriptFields exporter = new PrintJavaScriptFields();
>             exporter.printFields(pdf);
>         }
>         finally
>         {
>             if (pdf != null)
>             {
>                 pdf.close();
>             }
>         }
>     }
>
> }
>
>
>
>> Thank you.
>>
>> Davide
>>
>> ________________________________________
>> From: Tilman Hausherr [thaush...@t-online.de]
>> Sent: Tuesday, 23 August 2016 8:34
>> To: users@pdfbox.apache.org
>> Subject: Re: Check for scripts in a PDF
>>
>> On 22.08.2016 at 15:14, Davide Zoni wrote:
>>> Hello everybody,
>>>
>>> I'm using PDFBox to check if a PDF file contains malicious scripts.
>>> I'm using the PDF/A-1a validation to check the file. Since I'm searching only
>>> for potentially damaging code and not for true PDF/A-1a standard
>>> compliance, is it enough to consider 1.x.x, 6.x.x and 7.x.x errors as
>>> "true" errors? Category descriptions below:
>>>
>>> Category     Description
>>> 1[.y[.z]]    Syntax Error
>>> 2[.y[.z]]    Graphic Error
>>> 3[.y[.z]]    Font Error
>>> 4[.y[.z]]    Transparency Error
>>> 5[.y[.z]]    Annotation Error
>>> 6[.y[.z]]    Action Error
>>> 7[.y[.z]]    Metadata Error
>> Unclear what you're asking. Are you seeking to detect files with
>> javascript? If so, I'd rather build something from scratch,
>> i.e. read the PDF specification and see where JS is used. Maybe compare
>> with the preflight source code to check that you didn't miss something.
>>
>> Tilman
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
>> For additional commands, e-mail: users-h...@pdfbox.apache.org
>>
>> The content, information and any attachments of this e-mail are
>> confidential and are neither binding on nor a commitment by Cedacri
>> S.p.A.; their dissemination or disclosure in any form is therefore
>> prohibited. If you are not the named recipient, please delete this
>> message without reading it, notify the sender immediately, and do not
>> disclose the contents to another person, use them for any purpose, or
>> store or copy the information in any medium.
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org
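
Regarding the category question quoted earlier in the thread (treating only 1.x.x, 6.x.x and 7.x.x errors as "true" errors), the filtering step on its own could look roughly like the sketch below. It is plain Java with no PDFBox dependency and assumes the validation errors are already available as dotted code strings such as "6.2.5". The class name ErrorCategoryFilter, its methods, and the sample codes are made up for illustration and do not come from a real validation run. It only shows the mechanics; it does not settle whether those three categories are enough to catch every script, which is the concern Tilman raises.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of PDFBox or preflight.
public class ErrorCategoryFilter
{
    // Categories the original question proposes to treat as "true" errors:
    // 1 = Syntax, 6 = Action, 7 = Metadata.
    static final Set<Integer> BLOCKING = new HashSet<>(Arrays.asList(1, 6, 7));

    /**
     * Returns the top-level category of a code of the form x[.y[.z]],
     * or -1 if the code is null or does not start with a number.
     */
    static int category(String errorCode)
    {
        if (errorCode == null)
        {
            return -1;
        }
        String head = errorCode.trim();
        int dot = head.indexOf('.');
        if (dot >= 0)
        {
            head = head.substring(0, dot);
        }
        try
        {
            return Integer.parseInt(head);
        }
        catch (NumberFormatException e)
        {
            return -1;
        }
    }

    /** Keeps only the codes whose top-level category is in BLOCKING. */
    static List<String> blockingErrors(List<String> errorCodes)
    {
        List<String> result = new ArrayList<>();
        for (String code : errorCodes)
        {
            if (BLOCKING.contains(category(code)))
            {
                result.add(code);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        // Illustrative sample codes only.
        List<String> codes = Arrays.asList("2.4.3", "6.2.5", "7.1", "3.1.1", "1");
        System.out.println(blockingErrors(codes)); // prints [6.2.5, 7.1, 1]
    }
}
```

Malformed codes map to -1 and are therefore never treated as blocking; whether they should instead be rejected outright is a design choice left open here.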