On 24.08.2016 at 15:41, Davide Zoni wrote:
Thank you. This might be helpful, but I'm afraid I would not be able to
check every possibility. Is there a way to check whether a PDF is static (or
dynamic)? For our purpose that should be enough.

No, there is no such method.

Tilman


Best regards.

         Davide Zoni

         Cedacri S.p.A.

         Tel.: 0521807433

         e-mail: davide.z...@cedacri.it

         www.cedacri.it


________________________________________
From: Tilman Hausherr [thaush...@t-online.de]
Sent: Tuesday, 23 August 2016 18:23
To: users@pdfbox.apache.org
Subject: Re: Check for scripts in a PDF

On 23.08.2016 at 09:35, Davide Zoni wrote:
Yes, I'm seeking to detect files with scripts, i.e. files that are not static.
I don't understand what you mean by "Maybe compare with the preflight source
code to check that you didn't miss something". Can you elaborate on that?
I meant to search for "Javascript" in the preflight source code, and then see
where it is used. This is just so that you can be more confident that you
caught everything when reading the PDF specification.

Btw, I once wrote some code to show (some) JavaScript fields, see below
or search for "Roberto Nibali Javascript". He also improved that code
and posted the improved version. It may not find all JavaScript in a file,
but it could help show you how to write such code.

Tilman


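import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.interactive.action.PDAction;
import org.apache.pdfbox.pdmodel.interactive.action.PDActionJavaScript;
import org.apache.pdfbox.pdmodel.interactive.action.PDAnnotationAdditionalActions;
import org.apache.pdfbox.pdmodel.interactive.action.PDFormFieldAdditionalActions;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField;
import org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField;

public class PrintJavaScriptFields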
{

      /**
       * This will print the JavaScript actions attached to the form fields
       * of the document.
       *
       * @param pdfDocument The PDF to get the fields from.
       *
       * @throws IOException If there is an error getting the fields.
       */
      public void printFields(PDDocument pdfDocument) throws IOException
      {
          PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
          PDAcroForm acroForm = docCatalog.getAcroForm();
          if (acroForm == null)
          {
              // No interactive form, so there are no fields to inspect.
              return;
          }
          List<PDField> fields = acroForm.getFields();

          //System.out.println(fields.size() + " top-level fields were found on the form");

          for (PDField field : fields)
          {
              processField(field, "|--", field.getPartialName());
          }
      }

      private void processField(PDField field, String sLevel, String sParent) throws IOException
      {
          String partialName = field.getPartialName();

          if (field instanceof PDTerminalField)
          {
              PDTerminalField termField = (PDTerminalField) field;
              for (PDAnnotationWidget widget : termField.getWidgets())
              {
                  PDAction action = widget.getAction();
                  if (action instanceof PDActionJavaScript)
                  {
                      System.out.println(field.getFullyQualifiedName() + ": "
                              + action.getClass().getSimpleName() + " js widget action:\n"
                              + action.getCOSObject());
                      printPossibleJS(action);
                  }
                  PDAnnotationAdditionalActions actions = widget.getActions();
                  if (actions != null)
                  {
                      System.out.println(field.getFullyQualifiedName() + ": "
                              + actions.getClass().getSimpleName() + " js widget actionS:\n"
                              + actions.getCOSObject());

                      // Strange: why do I get a PDAnnotationAdditionalActions here instead of a
                      // PDFormFieldAdditionalActions? It contains a K entry but has no getK().
                      PDFormFieldAdditionalActions ffActions =
                              new PDFormFieldAdditionalActions((COSDictionary) actions.getCOSObject());
                      printPossibleJS(ffActions.getK());
                      printPossibleJS(ffActions.getC());
                      printPossibleJS(ffActions.getF());
                      printPossibleJS(ffActions.getV());
                  }
              }
          }

          if (field instanceof PDNonTerminalField)
          {
              if (!sParent.equals(field.getPartialName()))
              {
                  if (partialName != null)
                  {
                      sParent = sParent + "." + partialName;
                  }
              }
              //System.out.println(sLevel + sParent);

              for (PDField child : ((PDNonTerminalField) field).getChildren())
              {
                  processField(child, "|  " + sLevel, sParent);
              }
          }
          else
          {
              String fieldValue = field.getValueAsString();
              StringBuilder outputString = new StringBuilder(sLevel);
              outputString.append(sParent);
              if (partialName != null)
              {
                  outputString.append(".").append(partialName);
              }
              outputString.append(" = ").append(fieldValue);
              outputString.append(", type=").append(field.getClass().getName());
              //System.out.println(outputString);
          }
      }

      private void printPossibleJS(PDAction kAction)
      {
          if (kAction instanceof PDActionJavaScript)
          {
              PDActionJavaScript jsAction = (PDActionJavaScript) kAction;
              String jsString = jsAction.getAction();
              if (jsString != null && !jsString.contains("\n"))
              {
                  // Otherwise nothing shows up in NetBeans?!
                  jsString = jsString.replaceAll("\r", "\n").replaceAll("\n\n", "\n");
              }
              System.out.println(jsString);
              System.out.println();
          }
      }

      /**
       * This will read a PDF file and print the JavaScript attached to its
       * form fields.
       *
       * @param args command line arguments (not used)
       *
       * @throws IOException If there is an error reading the PDF document.
       */
      public static void main(String[] args) throws IOException
      {
          PDDocument pdf = null;
          try
          {
              pdf = PDDocument.load(new File(XXXXXX));
              PrintJavaScriptFields exporter = new PrintJavaScriptFields();
              exporter.printFields(pdf);
          }
          finally
          {
              if (pdf != null)
              {
                  pdf.close();
              }
          }
      }

}
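
The line-ending handling in printPossibleJS above can also be pulled out into
a small helper. The following is only a sketch of a slightly more general
variant, not the code as posted: the class and method names are made up for
illustration, and unlike the original it also rewrites strings that already
contain "\n" and does not collapse doubled line feeds.

```java
// Hypothetical helper (names are illustrative, not part of PDFBox):
// converts CR and CRLF line endings to LF so that multi-line scripts
// print on separate lines.
final class LineEndings
{
    static String normalize(String script)
    {
        if (script == null)
        {
            // Assumption: an action without script text is treated as empty.
            return "";
        }
        // Replace CRLF first so that it does not turn into two line feeds.
        return script.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```

With such a helper, printPossibleJS could presumably print
LineEndings.normalize(jsAction.getAction()) directly.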



Thank you.

          Davide

________________________________________
From: Tilman Hausherr [thaush...@t-online.de]
Sent: Tuesday, 23 August 2016 8:34
To: users@pdfbox.apache.org
Subject: Re: Check for scripts in a PDF

On 22.08.2016 at 15:14, Davide Zoni wrote:
Hello everybody,

I'm using PDFBox to check whether a PDF file contains malicious scripts. I'm using the
PDF/A-1a validation to check the file. Since I'm only looking for potentially damaging
code and not for true PDF/A-1a compliance, is it enough to consider 1.x.x,
6.x.x and 7.x.x errors as "true" errors? The category descriptions are below:

Category        Description
1[.y[.z]]       Syntax Error
2[.y[.z]]       Graphic Error
3[.y[.z]]       Font Error
4[.y[.z]]       Transparency Error
5[.y[.z]]       Annotation Error
6[.y[.z]]       Action Error
7[.y[.z]]       Metadata Error
Unclear what you're asking. Are you seeking to detect files with
JavaScript? If so, I'd rather build something from scratch,
i.e. read the PDF specification and see where JS is used. Maybe compare
with the preflight source code to check that you didn't miss something.

Tilman
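
For illustration only, the category filter described in the question could be
sketched as below. It assumes the preflight error codes are available as
strings such as "6.2.5"; the class and method names are hypothetical and not
part of PDFBox. Whether categories 1, 6 and 7 are the right set is exactly
the open point of the question, so this shows the mechanics of the filter and
says nothing about whether it is enough to rule out scripts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of PDFBox: filters preflight-style error
// codes ("x.y.z") by their top-level category.
final class ErrorCategoryFilter
{
    // Categories treated as "true" errors in the question:
    // 1 = syntax, 6 = action, 7 = metadata.
    private static final Set<String> RELEVANT =
            new HashSet<>(Arrays.asList("1", "6", "7"));

    // Returns the part of the code before the first dot, e.g. "6" for "6.2.5".
    static String category(String errorCode)
    {
        int dot = errorCode.indexOf('.');
        return dot < 0 ? errorCode : errorCode.substring(0, dot);
    }

    static boolean isRelevant(String errorCode)
    {
        return RELEVANT.contains(category(errorCode));
    }

    // Keeps only the codes whose top-level category is in RELEVANT,
    // preserving their original order.
    static List<String> keepRelevant(List<String> errorCodes)
    {
        List<String> kept = new ArrayList<>();
        for (String code : errorCodes)
        {
            if (isRelevant(code))
            {
                kept.add(code);
            }
        }
        return kept;
    }
}
```

The codes themselves would come from the preflight validation result; as far
as I recall each reported error exposes its code via getErrorCode(), but
please check that against the preflight version you use.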


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org

