Re: Removing the Javascript from a PDF

2016-10-18 Thread Tilman Hausherr

On 18.10.2016 at 15:59, em...@michaelfesser.ca wrote:

Hi,

I am fairly new to PDFBox and I am wondering if there is a way to find
and remove all the JavaScript in a PDF file.  I have a script to do
this with iText, but the version I am using is fairly old and I would
like to use PDFBox for all the PDF manipulation if possible.


Here's some code that shows where the JavaScript is. It could be
adapted to remove the JavaScript by replacing each script with an
empty one.
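
The idea of blanking the scripts instead of only printing them can be
sketched with a small stand-alone model. This is only an illustration
of the recursive walk, not PDFBox code: the Field and Action classes
and the strip method are hypothetical stand-ins invented for this
sketch, so that it runs without the library. In a real program, the
same walk would operate on the PDFBox field and action objects used in
the listing that follows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT PDFBox classes.
class StripJsSketch
{
    /** Stand-in for an action; the type "JavaScript" marks a script action. */
    static class Action
    {
        final String type;
        String script;

        Action(String type, String script)
        {
            this.type = type;
            this.script = script;
        }
    }

    /** Stand-in for a form field: it may carry actions and child fields. */
    static class Field
    {
        final String name;
        final List<Action> actions = new ArrayList<>();
        final List<Field> children = new ArrayList<>();

        Field(String name)
        {
            this.name = name;
        }
    }

    /**
     * Replaces every non-empty JavaScript action at or below the given
     * field with an empty script and returns how many were blanked.
     */
    static int strip(Field field)
    {
        int count = 0;
        for (Action action : field.actions)
        {
            boolean isJs = "JavaScript".equals(action.type);
            if (isJs && action.script != null && !action.script.isEmpty())
            {
                action.script = "";
                count++;
            }
        }
        // Recurse into child fields, as processField does for non-terminal fields.
        for (Field child : field.children)
        {
            count += strip(child);
        }
        return count;
    }

    public static void main(String[] args)
    {
        Field root = new Field("form");
        Field total = new Field("total");
        total.actions.add(new Action("JavaScript", "app.alert('hi');"));
        total.actions.add(new Action("GoTo", ""));
        root.children.add(total);
        System.out.println("blanked " + strip(root) + " script(s)");
    }
}
```

Blanking the script text rather than deleting the action keeps the rest
of the structure untouched, which is the "replace it with something
empty" approach suggested above. Removing the action entries
altogether would be the stricter alternative.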



/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *  http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */


import java.io.File;
import java.io.IOException;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.interactive.action.PDAction;
import org.apache.pdfbox.pdmodel.interactive.action.PDActionJavaScript;
import org.apache.pdfbox.pdmodel.interactive.action.PDFormFieldAdditionalActions;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField;
import org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField;

/**
 * This example will take a PDF document and print all the JavaScript
 * fields from the file.
 *
 * @author Ben Litchfield
 * @author Tilman Hausherr
 */
public class PrintJavaScriptFields
{

    /**
     * This will print all the fields from the document.
     *
     * @param pdfDocument The PDF to get the fields from.
     *
     * @throws IOException If there is an error getting the fields.
     */
    public void printFields(PDDocument pdfDocument) throws IOException
    {
        PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        if (acroForm == null)
        {
            // No AcroForm means there are no form fields to inspect.
            return;
        }
        List<PDField> fields = acroForm.getFields();

        //System.out.println(fields.size() + " top-level fields were found on the form");

        for (PDField field : fields)
        {
            processField(field, "|--", field.getPartialName());
        }
    }

    private void processField(PDField field, String sLevel, String sParent) throws IOException
    {
        String partialName = field.getPartialName();

        if (field instanceof PDTerminalField)
        {
            PDTerminalField termField = (PDTerminalField) field;

            // Additional actions attached to the field itself.
            PDFormFieldAdditionalActions fieldActions = field.getActions();
            if (fieldActions != null)
            {
                System.out.println(field.getFullyQualifiedName() + ": "
                        + fieldActions.getClass().getSimpleName() + " js field actions:\n"
                        + fieldActions.getCOSObject());

                printPossibleJS(fieldActions.getK());
                printPossibleJS(fieldActions.getC());
                printPossibleJS(fieldActions.getF());
                printPossibleJS(fieldActions.getV());
            }

            // Actions attached to the field's widget annotations.
            for (PDAnnotationWidget widgetAction : termField.getWidgets())
            {
                PDAction action = widgetAction.getAction();
                if (action instanceof PDActionJavaScript)
                {
                    System.out.println(field.getFullyQualifiedName() + ": "
                            + action.getClass().getSimpleName() + " js widget action:\n"
                            + action.getCOSObject());

                    printPossibleJS(action);
                }
            }
        }

        if (field instanceof PDNonTerminalField)
        {
            if (!sParent.equals(field.getPartialName()))
            {
                if (partialName != null)
                {
                    sParent = sParent + "." + partialName;
                }
            }
            //System.out.println(sLevel + sParent);

            // Recurse into the child fields.
            for (PDField child : ((PDNonTerminalField) field).getChildren())
            {
                processField(child, "|  " + sLevel, sParent);
            }
        }
        else
        {
            String fieldValue = field.getValueAsString();
            StringBuilder outputString = new StringBuilder(sLevel);
            outputString.append(sParent);
            if (partialName != null)
            {
                outputString.append(".").append(partialName);
            }
            outputStr

Removing the Javascript from a PDF

2016-10-18 Thread email

Hi,

I am fairly new to PDFBox and I am wondering if there is a way to find
and remove all the JavaScript in a PDF file.  I have a script to do this
with iText, but the version I am using is fairly old and I would like to
use PDFBox for all the PDF manipulation if possible.


Thanks,

Mike
