Re: Overwriting Metadata

Tilman Hausherr Fri, 20 Jul 2018 19:47:47 -0700

On 20.07.2018 at 23:37, Matthew Clemente wrote:

Thanks Tilman.
I set up my code to match yours (it was pretty similar), and I’m getting the same result. I can’t overwrite existing fields via XMPMetadata.
For what it’s worth, I’m using version 2.0.11 of PDFBox and XMPBox; not sure if that would make a difference.
I’m assuming, with the approach you’re using, that you are able to change the Author and Title?

Your question is somewhat unclear, or I misunderstood it. You wrote that you failed with both /Info and XMP /Metadata. With /Info (my small reply) I was able to change just the subject and keep the rest.


Does this work or not?

With the larger code I replaced the whole metadata and didn't try to replace just a single field.

Possible explanation: you looked at the PDFs with Adobe Reader. IIRC that one displays what's in the XMP metadata first, i.e. when both /Info and /Metadata are present, it shows the /Metadata values.

What do you really want: to replace an individual field or to replace the whole metadata?


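To alter individual fields, something like this should work:

    // meta is the document's existing PDMetadata (doc.getDocumentCatalog().getMetadata())
    DomXmpParser xmpParser = new DomXmpParser();
    XMPMetadata xmp = xmpParser.parse(meta.createInputStream());
    DublinCoreSchema dc = xmp.getDublinCoreSchema();
    if (dc != null)
    {
        dc.setDescription("descr");
    }
    else
    {
        // do as before: create the schema, then set the fields
    }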

I took that code from the ExtractMetadata example from the source code download.
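To make the difference between replacing one field and replacing the whole metadata concrete, here is a minimal sketch that uses only JDK XML classes, not PDFBox or XMPBox. It assumes the XMP packet is plain RDF/XML with the usual Dublin Core layout; the class name XmpFieldSketch, its methods, and the sample packet are hypothetical and exist only for illustration. In a real program XMPBox's DublinCoreSchema would do this work.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class XmpFieldSketch
{
    static final String DC = "http://purl.org/dc/elements/1.1/";
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical packet: a title and an author are set, but no description
    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf=\"" + RDF + "\" xmlns:dc=\"" + DC + "\">"
        + "<rdf:Description rdf:about=\"\">"
        + "<dc:title><rdf:Alt><rdf:li xml:lang=\"x-default\">Old title</rdf:li></rdf:Alt></dc:title>"
        + "<dc:creator><rdf:Seq><rdf:li>Old author</rdf:li></rdf:Seq></dc:creator>"
        + "</rdf:Description></rdf:RDF>";

    // Overwrites the text of every rdf:li under the first dc:<field> element.
    // Returns false if the field is absent; nothing is created in that case.
    static boolean overwriteField(Document xmp, String field, String value)
    {
        NodeList fields = xmp.getElementsByTagNameNS(DC, field);
        if (fields.getLength() == 0)
        {
            return false;
        }
        NodeList items = ((Element) fields.item(0)).getElementsByTagNameNS(RDF, "li");
        for (int i = 0; i < items.getLength(); i++)
        {
            items.item(i).setTextContent(value);
        }
        return items.getLength() > 0;
    }

    // Parses the packet with namespace support, which the lookups above need
    static Document parse(String xml)
    {
        try
        {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        }
        catch (Exception e)
        {
            throw new IllegalStateException(e);
        }
    }

    // Writes the modified packet back out as a string
    static String serialize(Document doc)
    {
        try
        {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        }
        catch (Exception e)
        {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        Document xmp = parse(SAMPLE);
        overwriteField(xmp, "title", "New title");   // only the title changes
        System.out.println(serialize(xmp));          // the author is kept as it was
    }
}
```

The existing packet is parsed and only the requested field is touched, so the other fields survive. Building a fresh packet instead discards everything that was there before. The false return for a missing field corresponds to the "do as before" branch in the snippet above.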


(I didn't test. It's in the middle of the night and I couldn't sleep)


Tilman


--
Matthew Clemente

From: Tilman Hausherr <[email protected]>
Reply: [email protected]
Date: July 20, 2018 at 4:14:48 PM
To: [email protected]
Subject: Re: Overwriting Metadata

It works for me... here's my code:


import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.xml.transform.TransformerException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.common.PDMetadata;
import org.apache.xmpbox.XMPMetadata;
import org.apache.xmpbox.schema.DublinCoreSchema;
import org.apache.xmpbox.xml.XmpSerializer;

public class ChangeMeta
{
    public static void main(String[] args) throws IOException, TransformerException
    {
        PDDocument doc = PDDocument.load(new File("testing.pdf"));

        // Build a brand-new XMP packet; this replaces the existing metadata wholesale
        XMPMetadata xmp = XMPMetadata.createXMPMetadata();
        DublinCoreSchema dc = xmp.createAndAddDublinCoreSchema();
        dc.setDescription("descr");

        // Serialize the packet and attach it to the document catalog
        XmpSerializer serializer = new XmpSerializer();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        serializer.serialize(xmp, baos, true);
        PDMetadata metadata = new PDMetadata(doc);
        metadata.importXMPMetadata(baos.toByteArray());
        doc.getDocumentCatalog().setMetadata(metadata);

        doc.save(new File("testing-new.pdf"));
        doc.close();
    }
}

And the proof that it worked:




Tilman

On 20.07.2018 at 21:15, Matthew Clemente wrote:

Forgive me if this question has an obvious answer; perhaps I’m not taking
the right approach to the problem.

My goal is to save a version of a pdf, with modified metadata. In most
cases, I’ll be removing metadata (setting the author, title, description to
blank), though in some cases I’ll be adding information to those fields.

I’ve tried both approaches from these StackOverflow answers:
https://stackoverflow.com/questions/40295264/how-to-add-metadata-to-pdf-document-using-pdfbox


That is, I’ve tried creating the metadata via XMPMetadata and using
importXMPMetadata(). I’ve also tried using the Document Information object
(inputDoc.getDocumentInformation().setCreator("Some meta");).

In both cases, if the field is empty in the original document I’ve loaded,
the new value is set without issue. However, if the metadata field already
contains a value, the new value is not applied.

Is there a way for me to overwrite metadata, or am I approaching this all
wrong?

Here’s a pdf I was using while testing (it has a title and author set, but
no subject): https://www.dropbox.com/s/olk2zhnh47ohtpk/testing.pdf?dl=0

Thanks, in advance, for any insight.
