RE: Problem with the PDFA1B compliance

Manulak Dissanayake Tue, 04 Sep 2012 21:19:13 -0700

Hi Again

I have found that converting an existing PDF into a PDF/A-compliant one is 
nearly impossible (or at least a very long and difficult process). The problem 
still seems to be with the PDF Schema values: the newly added metadata values 
do not match them.
I also found that it is quite possible to read a PDF and edit its PDF Schema 
metadata. So I tried reading the existing PDF, adding metadata to it as shown 
below, and then passing it to the convertToPdfA() method given in my earlier 
mail below.
But it still doesn't work. (Assume that file1 is the existing PDF and file2 is 
the PDF/A-compliant output file.)


      PdfReader reader = new PdfReader(new RandomAccessFileOrArray(file1.getPath()), null);
      PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(newFile));
      HashMap info = reader.getInfo();
      info.put("Subject", "Report Executor");
      info.put("Author", "Report building logic");
      info.put("Keywords", "PDF/A compliant");
      info.put("Title", "Report");
      info.put("Creator", "Application reporting");
      stamper.setMoreInfo(info);
      ByteArrayOutputStream baos = new ByteArrayOutputStream();
      XmpWriter xmp = new XmpWriter(baos, info);
      xmp.close();
      stamper.setXmpMetadata(baos.toByteArray());
      stamper.close();

(I run this code before calling the convertToPdfA method, so that file1 is 
returned with modified PDF document information.)

The existing PDF (the one to be converted to PDF/A) contains only the 
following metadata, as returned by reader.getInfo():
[0]          "Type => Info"
[1]          "Producer => null"

The PDF produced by my convertToPdfA method validates correctly as PDF/A-1b 
when I remove the title, keywords and subject fields from the PDF information.
(Removed code)

//document.addTitle(reader.getInfo().get("Title").toString());

//document.addKeywords(reader.getInfo().get("Keywords").toString());

//document.addSubject(reader.getInfo().get("Subject").toString());

This suggests that these three properties do not exist in the document 
information dictionary of the source PDF, and that if we can somehow add them 
to the already created PDF before passing it to the conversion, it should work.
How can I change my approach? Since this is not working for me, do you have any 
further suggestions?
Is it enough to update only the PDF Schema like this and then use my 
convertToPdfA method given below?

Thank you in advance.
-Manulak Dissanayake

From: Manulak Dissanayake [mailto:manulak.dissanay...@ifsworld.com]
Sent: Thursday, August 23, 2012 4:38 PM
To: fop-users@xmlgraphics.apache.org
Subject: Problem with the PDFA1B compliance


Hi All



I am creating a PDF which is supposed to be compliant with PDF/A-1b. When I 
check it with online validators such as the PDF Tools Online Validator 
<http://www.pdf-tools.com/pdf/validate-pdfa-online.aspx>, I get the following 
errors saying that the PDF is not PDF/A compliant.



dc:description/*[0] :: Missing language qualifier.
dc:title/*[0] :: Missing language qualifier.
The property 'pdf:keywords' is not defined in schema 'Adobe PDF Schema'.
The XMP property 'dc:title' is not synchronized with the document information entry 'Title'.
The XMP property 'dc:description' is not synchronized with the document information entry 'Subject'.
The required XMP property 'pdf:Keywords' for the document information entry 'Keywords' is missing.
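The first three messages point at the shape of the XMP packet itself: dc:title and dc:description are expected as language alternatives with an xml:lang="x-default" qualifier, and the keywords property is expected under the name pdf:Keywords (capital K). The following is only a sketch of a packet with that shape, assembled by hand; the class PdfAXmpSketch and its methods are hypothetical names and are not part of iText or FOP, and a validator may well check more than this.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of iText or FOP: assembles an XMP packet by hand. */
final class PdfAXmpSketch {

    /** Escapes characters that are special in XML text content. */
    static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps a value as a language alternative carrying the x-default qualifier. */
    static String langAlt(String value) {
        return "<rdf:Alt><rdf:li xml:lang=\"x-default\">" + esc(value) + "</rdf:li></rdf:Alt>";
    }

    /** Appends one property element with the given (already escaped) body. */
    static void add(StringBuilder sb, String name, String body) {
        sb.append("   <").append(name).append('>').append(body)
          .append("</").append(name).append(">\n");
    }

    /** Builds the packet from Info-style entries; absent entries are skipped. */
    static String buildXmp(Map<String, String> info) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xpacket begin=\"\uFEFF\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?>\n");
        sb.append("<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">\n");
        sb.append(" <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n");
        sb.append("  <rdf:Description rdf:about=\"\"\n");
        sb.append("    xmlns:dc=\"http://purl.org/dc/elements/1.1/\"\n");
        sb.append("    xmlns:pdf=\"http://ns.adobe.com/pdf/1.3/\"\n");
        sb.append("    xmlns:xmp=\"http://ns.adobe.com/xap/1.0/\"\n");
        sb.append("    xmlns:pdfaid=\"http://www.aiim.org/pdfa/ns/id/\">\n");
        String title = info.get("Title");
        if (title != null) add(sb, "dc:title", langAlt(title));
        String subject = info.get("Subject");
        if (subject != null) add(sb, "dc:description", langAlt(subject));
        String author = info.get("Author");
        if (author != null) {
            add(sb, "dc:creator", "<rdf:Seq><rdf:li>" + esc(author) + "</rdf:li></rdf:Seq>");
        }
        String keywords = info.get("Keywords");
        if (keywords != null) add(sb, "pdf:Keywords", esc(keywords)); // capital K
        String creator = info.get("Creator");
        if (creator != null) add(sb, "xmp:CreatorTool", esc(creator));
        add(sb, "pdfaid:part", "1");
        add(sb, "pdfaid:conformance", "B");
        sb.append("  </rdf:Description>\n </rdf:RDF>\n</x:xmpmeta>\n");
        sb.append("<?xpacket end=\"w\"?>");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Title", "Report");
        info.put("Subject", "Report Executor");
        info.put("Keywords", "PDF/A compliant");
        byte[] packet = buildXmp(info).getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(packet, StandardCharsets.UTF_8));
    }
}
```

The values written here have to be the same strings that end up in the document information dictionary, otherwise the "not synchronized" messages would presumably remain.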



I am using FOP to render my PDF and iText (iText 2.1.6 by 1T3XT) to make the 
PDF PDF/A compliant.



public void convertToPdfA(File file1, File file2)
{
    PdfReader reader = null;
    Document document = null;
    PdfCopy copy;

    InputStream PROFILE = this.getClass().getClassLoader().getResourceAsStream("com/ini/vldt/config/sRGB Color Space Profile.icm");



First I read the PDF file into a PdfReader object, then create a copy as 
follows.



    reader = new PdfReader(new RandomAccessFileOrArray(file1.getPath()), null);
    // Create a new document that has the existing page size
    document = new Document(reader.getPageSizeWithRotation(1));
    // Save the data in the 'document' to file2
    copy = new PdfCopy(document, new FileOutputStream(file2));



Then I try adding the Title, Author, Keywords, Creator, Subject, Creation Date 
and Producer as follows.



    try {
        document.addTitle(reader.getInfo().get("Title").toString());
    }
    catch (Exception e) {
        document.addTitle("TEST Title");
    }

    try {
        document.addAuthor(reader.getInfo().get("Author").toString());
    }
    catch (Exception e) {
        document.addAuthor("TEST Author");
    }

    try {
        document.addKeywords(reader.getInfo().get("Keywords").toString());
    }
    catch (Exception e) {
        document.addKeywords("TEST Keywords");
    }

    document.addCreator("TEST Creator");
    document.addCreationDate();
    document.addProducer();

    try {
        document.addSubject(reader.getInfo().get("Subject").toString());
    }
    catch (Exception e) {
        document.addSubject("TEST Subject");
    }
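The try/catch blocks above get their fallback values only because calling toString() on a missing entry throws a NullPointerException. A null-safe lookup could express the same intent more directly. This is a minimal sketch; InfoLookupSketch and valueOrDefault are hypothetical names, and treating an empty string like a missing entry is an assumption that differs slightly from the code above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: reads a document-information entry with a fallback. */
final class InfoLookupSketch {

    /** Returns the entry as text, or the fallback when it is missing, null or blank. */
    static String valueOrDefault(Map<?, ?> info, String key, String fallback) {
        Object value = info.get(key);
        if (value == null) {
            return fallback;
        }
        String text = value.toString().trim();
        // Assumption: a blank entry is treated like a missing one.
        return text.isEmpty() ? fallback : text;
    }

    public static void main(String[] args) {
        // Mimics an info map with a null Producer and no Title entry.
        Map<String, Object> info = new HashMap<>();
        info.put("Type", "Info");
        info.put("Producer", null);
        System.out.println(valueOrDefault(info, "Title", "TEST Title"));
        System.out.println(valueOrDefault(info, "Type", "unknown"));
    }
}
```

With such a helper, each block would shrink to a single call, for example document.addTitle(valueOrDefault(reader.getInfo(), "Title", "TEST Title")).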





Then I give the PDFA conformance values and then create the PDF as follows.



    copy.setPDFXConformance(PdfCopy.PDFA1B);
    copy.setPdfVersion(PdfCopy.VERSION_1_4);
    document.open();

    ICC_Profile icc = ICC_Profile.getInstance(PROFILE);
    copy.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);

    for (int i = 1; i <= reader.getNumberOfPages(); i++) {
        document.setPageSize(reader.getPageSizeWithRotation(i));
        copy.addPage(copy.getImportedPage(reader, i));
    }

    copy.createXmpMetadata();
    copy.close();
    document.close();
    reader.close();





I got another error report from the SolidFramework validator 
<http://www.validatepdfa.com/online.htm>, as follows.


<metadata>
  <problem severity="error" objectID="27" clause="6.7" standard="pdfa">Language qualifier missing for property 'dc:description'</problem>
  <problem severity="error" objectID="27" clause="6.7" standard="pdfa">Language qualifier missing for property 'dc:title'</problem>
  <problem severity="error" objectID="27" clause="TN0003" standard="pdfa">Property 'pdf:keywords' shall use a custom embedded schema</problem>
  <problem severity="error" objectID="27" clause="6.7.3" standard="pdfa">Document information entry 'Keywords' not synchronized with metadata property 'pdf:Keywords'</problem>
</metadata>
<fonts>
  <problem severity="error" objectID="12" clause="6.3.5" standard="pdfa">Missing or incorrect CIDSet for CIDFont subset</problem>
  <problem severity="error" objectID="18" clause="6.3.5" standard="pdfa">Missing or incorrect CIDSet for CIDFont subset</problem>
</fonts>
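The metadata part of this report, like the earlier one, is about document information entries that have no matching XMP property or a different value. As a rough way to reason about it, the sketch below pairs each text entry with the XMP property it is expected to mirror in PDF/A-1 and lists the mismatches. InfoXmpSyncSketch is a hypothetical class working on plain maps, not an iText API; the date entries are left out because they need a format conversion, and the CIDSet font errors are not addressed at all.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical checker: compares Info entries with flattened XMP property values. */
final class InfoXmpSyncSketch {

    /** Info key mapped to the XMP property expected to carry the same value. */
    static final Map<String, String> INFO_TO_XMP = new LinkedHashMap<>();
    static {
        INFO_TO_XMP.put("Title", "dc:title");
        INFO_TO_XMP.put("Author", "dc:creator");
        INFO_TO_XMP.put("Subject", "dc:description");
        INFO_TO_XMP.put("Keywords", "pdf:Keywords");
        INFO_TO_XMP.put("Creator", "xmp:CreatorTool");
        INFO_TO_XMP.put("Producer", "pdf:Producer");
    }

    /** Lists every Info entry whose XMP counterpart is missing or different. */
    static List<String> findProblems(Map<String, String> info, Map<String, String> xmp) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> e : INFO_TO_XMP.entrySet()) {
            String infoValue = info.get(e.getKey());
            if (infoValue == null) {
                continue; // no Info entry, so nothing to synchronize
            }
            String xmpValue = xmp.get(e.getValue());
            if (xmpValue == null) {
                problems.add("missing " + e.getValue() + " for Info entry " + e.getKey());
            } else if (!xmpValue.equals(infoValue)) {
                problems.add(e.getValue() + " not synchronized with Info entry " + e.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Title", "Report");
        info.put("Keywords", "PDF/A compliant");
        Map<String, String> xmp = new LinkedHashMap<>();
        xmp.put("dc:title", "TEST Title");
        xmp.put("pdf:keywords", "PDF/A compliant"); // lowercase k, as in the report
        for (String problem : findProblems(info, xmp)) {
            System.out.println(problem);
        }
    }
}
```

Property names are case-sensitive here on purpose, so a packet that carries pdf:keywords is reported as missing pdf:Keywords, which matches what both validators complain about.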

What could be the issue? Any reply is highly appreciated.

Thank you.
-Manulak Dissanayake



------------------------------------------------------------------------------



CONFIDENTIALITY AND DISCLAIMER NOTICE



This e-mail, including any attachments, is confidential and for use only by

the intended recipient. If you are not the intended recipient, please notify

us immediately and delete this e-mail from your system. Any use or disclosure

of the information contained herein is strictly prohibited. As internet

communications are not secure, we do not accept legal responsibility for the

contents of this message nor responsibility for any change made to this

message after it was sent by the original sender. We advise you to carry out

your own virus check as we cannot accept liability for damage resulting from

software viruses.


