Thank you. This might be helpful, but I'm afraid I would not be able to
check every possibility. Is there a way to check whether a PDF is static (or
dynamic)? For our purpose that should be enough.

Best regards.

        Davide Zoni

        Cedacri S.p.A.

        Tel.: 0521807433

        e-mail: davide.z...@cedacri.it

        www.cedacri.it


________________________________________
From: Tilman Hausherr [thaush...@t-online.de]
Sent: Tuesday, 23 August 2016 18:23
To: users@pdfbox.apache.org
Subject: Re: Check for scripts in a PDF

On 23.08.2016 at 09:35, Davide Zoni wrote:
> Yes, I'm seeking to detect files with scripts, i.e. files that are not
> static. I don't understand what you mean by "Maybe compare
> with the preflight source code to check that you didn't miss something". Can
> you elaborate on that?

I meant to search for "Javascript" in the preflight source code, and then see
where it is used. This is just so that you can be more confident that you
caught everything when you read the PDF specification.

Btw I once wrote some code to show (some) javascript fields, see below
or search for "Roberto Nibali Javascript". He also improved that code
and posted the improved version. It may not find all javascript stuff,
but it could help show you how to write code.

Tilman


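// Imports for PDFBox 2.0.x
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.interactive.action.PDAction;
import org.apache.pdfbox.pdmodel.interactive.action.PDActionJavaScript;
import org.apache.pdfbox.pdmodel.interactive.action.PDAnnotationAdditionalActions;
import org.apache.pdfbox.pdmodel.interactive.action.PDFormFieldAdditionalActions;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField;
import org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField;

public class PrintJavaScriptFields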
{

     /**
      * This will print the JavaScript actions attached to the form fields of
      * the document.
      *
      * @param pdfDocument The PDF to get the fields from.
      *
      * @throws IOException If there is an error getting the fields.
      */
     public void printFields(PDDocument pdfDocument) throws IOException
     {
         PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
         PDAcroForm acroForm = docCatalog.getAcroForm();
         if (acroForm == null)
         {
             // No AcroForm dictionary, so there are no form fields to inspect.
             return;
         }
         List<PDField> fields = acroForm.getFields();

         //System.out.println(fields.size() + " top-level fields were found on the form");

         for (PDField field : fields)
         {
             processField(field, "|--", field.getPartialName());
         }
     }

     private void processField(PDField field, String sLevel, String sParent)
             throws IOException
     {
         String partialName = field.getPartialName();

         if (field instanceof PDTerminalField)
         {
             PDTerminalField termField = (PDTerminalField) field;
             for (PDAnnotationWidget widget : termField.getWidgets())
             {
                 PDAction action = widget.getAction();
                 if (action instanceof PDActionJavaScript)
                 {
                     System.out.println(field.getFullyQualifiedName() + ": "
                             + action.getClass().getSimpleName() + " js widget action:\n"
                             + action.getCOSObject());
                     printPossibleJS(action);
                 }
                 PDAnnotationAdditionalActions actions = widget.getActions();
                 if (actions != null)
                 {
                     System.out.println(field.getFullyQualifiedName() + ": "
                             + actions.getClass().getSimpleName() + " js widget actionS:\n"
                             + actions.getCOSObject());

                     // Strange: why do I get a PDAnnotationAdditionalActions here rather
                     // than a PDFormFieldAdditionalActions? It contains a K entry but has
                     // no getK(), so wrap the dictionary to reach the K/C/F/V actions.
                     PDFormFieldAdditionalActions ffActions =
                             new PDFormFieldAdditionalActions((COSDictionary) actions.getCOSObject());
                     printPossibleJS(ffActions.getK());
                     printPossibleJS(ffActions.getC());
                     printPossibleJS(ffActions.getF());
                     printPossibleJS(ffActions.getV());
                 }
             }
         }

         if (field instanceof PDNonTerminalField)
         {
             if (!sParent.equals(field.getPartialName()))
             {
                 if (partialName != null)
                 {
                     sParent = sParent + "." + partialName;
                 }
             }
             //System.out.println(sLevel + sParent);

             for (PDField child : ((PDNonTerminalField) field).getChildren())
             {
                 processField(child, "|  " + sLevel, sParent);
             }
         }
         else
         {
             String fieldValue = field.getValueAsString();
             StringBuilder outputString = new StringBuilder(sLevel);
             outputString.append(sParent);
             if (partialName != null)
             {
                 outputString.append(".").append(partialName);
             }
             outputString.append(" = ").append(fieldValue);
             outputString.append(", type=").append(field.getClass().getName());
             //System.out.println(outputString);
         }
     }

     private void printPossibleJS(PDAction kAction)
     {
         if (kAction instanceof PDActionJavaScript)
         {
             PDActionJavaScript jsAction = (PDActionJavaScript) kAction;
             String jsString = jsAction.getAction();
             if (jsString == null)
             {
                 // The action has no JS entry, so there is nothing to print.
                 return;
             }
             if (!jsString.contains("\n"))
             {
                 // Otherwise nothing shows up in NetBeans?!
                 jsString = jsString.replaceAll("\r", "\n").replaceAll("\n\n", "\n");
             }
             System.out.println(jsString);
             System.out.println();
         }
     }

     /**
      * This will read a PDF file and print the JavaScript found on its form
      * fields. Replace XXXXXX below with the path of the PDF to inspect.
      *
      * @param args command line arguments (unused)
      *
      * @throws IOException If there is an error reading the PDF document.
      */
     public static void main(String[] args) throws IOException
     {
         PDDocument pdf = null;
         try
         {
             pdf = PDDocument.load(new File(XXXXXX));
             PrintJavaScriptFields exporter = new PrintJavaScriptFields();
             exporter.printFields(pdf);
         }
         finally
         {
             if (pdf != null)
             {
                 pdf.close();
             }
         }
     }

}
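
Since the class above only walks form fields, a crude pre-filter may help decide which files deserve a closer look. The following is only a sketch and does not use PDFBox: it scans the raw bytes for the name tokens /JS (the script entry of a JavaScript action) and /JavaScript (the action subtype and the key in the name dictionary). The class and method names are hypothetical. It is a heuristic under strong assumptions: it will miss scripts stored in compressed object streams or encrypted files, and names written with #xx escapes, so a negative result does not prove that the PDF is static.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox: a raw byte scan for JavaScript name tokens.
final class RawJavaScriptScan
{
    // /JS holds the script of a JavaScript action; /JavaScript is the action
    // subtype and the key of the JavaScript name tree.
    private static final List<String> MARKERS = Arrays.asList("/JS", "/JavaScript");

    private RawJavaScriptScan()
    {
    }

    /** Returns true if a marker occurs as a complete name token in the raw bytes. */
    static boolean mayContainJavaScript(byte[] pdfBytes)
    {
        // ISO-8859-1 maps every byte to exactly one char, so nothing is lost.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        for (String marker : MARKERS)
        {
            int pos = raw.indexOf(marker);
            while (pos >= 0)
            {
                int end = pos + marker.length();
                // Accept the hit only if the name ends here, so /JSFoo is not a match.
                if (end == raw.length() || isNameTerminator(raw.charAt(end)))
                {
                    return true;
                }
                pos = raw.indexOf(marker, pos + 1);
            }
        }
        return false;
    }

    // A PDF name ends at whitespace (including NUL) or at a delimiter character.
    private static boolean isNameTerminator(char c)
    {
        return c == 0 || Character.isWhitespace(c) || "()<>[]{}/%".indexOf(c) >= 0;
    }
}
```

To try it, read the file with java.nio.file.Files.readAllBytes(...) and pass the bytes to mayContainJavaScript. A hit only means "look closer", for example with the field walker above or with the preflight module.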



>
> Thank you.
>
>          Davide
>
> ________________________________________
> From: Tilman Hausherr [thaush...@t-online.de]
> Sent: Tuesday, 23 August 2016 8:34
> To: users@pdfbox.apache.org
> Subject: Re: Check for scripts in a PDF
>
> On 22.08.2016 at 15:14, Davide Zoni wrote:
>> Hello everybody,
>>
>> I'm using PDFBox to check whether a PDF file contains malicious scripts. I'm
>> using the PDF/A-1a validation to check the file. Since I'm only searching
>> for potentially damaging code and not for true PDF/A-1a standard
>> compliance, is it enough to consider 1.x.x, 6.x.x and 7.x.x errors as
>> "true" errors? The category descriptions are below:
>>
>> Category        Description
>> 1[.y[.z]]       Syntax Error
>> 2[.y[.z]]       Graphic Error
>> 3[.y[.z]]       Font Error
>> 4[.y[.z]]       Transparency Error
>> 5[.y[.z]]       Annotation Error
>> 6[.y[.z]]       Action Error
>> 7[.y[.z]]       Metadata Error
> Unclear what you're asking. Are you seeking to detect files with
> javascript? If so, I'd rather build something from scratch,
> i.e. read the PDF specification and see where JS is used. Maybe compare
> with the preflight source code to check that you didn't miss something.
>
> Tilman
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: users-h...@pdfbox.apache.org
>

