Hi.

I have a problem converting a PDF file with GhostScript to PS format. My requirement is that the resulting PS has to have a resolution of 2400 DPI.
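
For context, this is a sketch of the kind of Ghostscript call I mean, wrapped in Java. The `ps2write` device, the executable name `gs`, and the file names are assumptions for illustration and not necessarily my exact command; `GsCommand` is just a hypothetical helper name.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: builds a Ghostscript command line for a PDF -> PS conversion.
// The device (ps2write) and the file names are assumptions, not a fixed recipe.
class GsCommand {

    // Returns the argument list for one conversion at the given resolution.
    static List<String> build(String gsExecutable, String inputPdf, String outputPs, int dpi) {
        return Arrays.asList(
            gsExecutable,               // e.g. "gs" on Linux, "gswin32c" on Windows
            "-dBATCH", "-dNOPAUSE",     // run non-interactively and exit when done
            "-sDEVICE=ps2write",        // PostScript output device (assumed)
            "-r" + dpi,                 // requested resolution in DPI
            "-sOutputFile=" + outputPs,
            inputPdf);
    }

    public static void main(String[] args) throws Exception {
        List<String> cmd = build("gs", "input.pdf", "output.ps", 2400);
        // Runs Ghostscript and forwards its console output.
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        System.exit(p.waitFor());
    }
}
```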

I have a PDF file which, when converted that way, becomes enormous (> 80 MB). I figured out that two elements make the PS so huge: 1.) a pattern used on the page to simulate a grey background in a certain area and 2.) a transparent layer above that pattern.

Using Adobe Acrobat I was able to remove those objects. With the resulting PDF the conversion with Ghostscript works fine and the file sizes are as expected (< 500 KB).

Now I am trying to remove these two objects from the PDF file using iText. My problem is that 1.) I am not able to find these objects using iText and 2.) since I do not know which objects to remove, I cannot remove them.

I tried to find the objects with the iText Toolbox and to get information about them by parsing the PDF with the following code, but the information I got did not give me a clue how to proceed.

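=============================SNIP========================
// pdfFileName is a placeholder for the path of the input PDF;
// out, writer, pdfStream and streamFileName are set up elsewhere (omitted)
PdfReader reader = new PdfReader(pdfFileName);
int strIdx = 0; // number of stream objects seen so far
for (int i = 0; i < reader.getXrefSize(); i++) {
    PdfObject pdfobj = reader.getPdfObject(i);
    if (pdfobj != null) {
        // determine whether the object is a stream
        if (pdfobj.isStream()) {
            strIdx++;
        }
        analyzeObject(pdfobj, out, streamFileName + strIdx + ".dat", writer, pdfStream, false);
    }
}
=============================SNIP========================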

The analysis prints various pieces of information about the contained objects, as in this abbreviated version:

=============================SNIP========================
private void analyzeObject(PdfObject pdfobj, BufferedWriter out, String streamDat, PdfWriter writer, FileOutputStream pdfOutStream, boolean write) throws IOException {
    // PDF null object: just report it
    if (pdfobj.isNull()) {
        System.out.println("PDF Object is null");
        PdfNull pdfNull = (PdfNull) pdfobj;
        System.out.println("PDF Null : " + pdfNull.toString());
        if (write && pdfNull.getBytes() != null) {
            pdfNull.toPdf(writer, pdfOutStream);
        }
    }
    // PDF array: report it, then analyze every element recursively
    if (pdfobj.isArray()) {
        System.out.println("PDF Object is array");
        PdfArray pdfArray = (PdfArray) pdfobj;
        System.out.println("PDF Array : " + pdfArray.toString());
        if (write && pdfArray.getBytes() != null) {
            pdfArray.toPdf(writer, pdfOutStream);
        }
        List arrayList = pdfArray.getArrayList();
        for (Object pdfArrayObject : arrayList) {
            analyzeObject((PdfObject) pdfArrayObject, out, streamDat, writer, pdfOutStream, write);
        }
    }
    // ... (rest of the method omitted)
}
=============================SNIP========================

So my question is: how do I find out which objects I have to remove, and how do I remove them, if that is possible with iText at all (I have learned that some content cannot be removed properly by iText)?
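
To make the question more concrete: as far as I understand the PDF format, a pattern object carries a /PatternType entry, and transparency usually shows up as an /SMask entry, a /ca or /CA value in an ExtGState dictionary, or a group with /S /Transparency. The following is only a rough sketch of how candidate objects could be listed without iText. The class name `SuspectObjectScan` and the marker list are my own assumptions. It only sees objects that are stored uncompressed (not inside object streams), binary stream data may produce spurious hits, and harmless entries such as /SMask /None or /ca 1 are reported as well.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: scans the raw text of a PDF for indirect objects that contain
// pattern or transparency markers. It does not decode compressed object streams.
class SuspectObjectScan {

    // Matches "<num> <gen> obj ... endobj" and captures number, generation, body.
    private static final Pattern OBJ =
        Pattern.compile("(\\d+)\\s+(\\d+)\\s+obj\\b(.*?)\\bendobj", Pattern.DOTALL);

    // Keys that hint at a pattern or at transparency (assumed, not exhaustive).
    private static final Pattern MARKER = Pattern.compile(
        "/PatternType\\b|/SMask\\b|/S\\s*/Transparency\\b|/(?:ca|CA)\\s+[0-9.]+");

    // Returns one line per object whose body contains at least one marker.
    static List<String> findSuspects(String pdfText) {
        List<String> hits = new ArrayList<>();
        Matcher obj = OBJ.matcher(pdfText);
        while (obj.find()) {
            Matcher marker = MARKER.matcher(obj.group(3));
            if (marker.find()) {
                hits.add(obj.group(1) + " " + obj.group(2) + " obj: " + marker.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // ISO-8859-1 maps every byte to exactly one char, so offsets are preserved.
        byte[] raw = Files.readAllBytes(Paths.get(args[0]));
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        for (String hit : findSuspects(text)) {
            System.out.println(hit);
        }
    }
}
```

The object numbers printed this way could then be looked up with `reader.getPdfObject(n)` in iText, but which of them can safely be removed is exactly what I do not know.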

I have attached two demo PDF files: one containing only the pattern (Pattern_only.pdf) and one containing the pattern and the transparent layer (Pattern_and_trans_layer.pdf). These files were produced with Adobe Acrobat by copying the elements in question and pasting them onto a blank page.

The original PDF contains more content than the attached ones, including content my customer does not want published, hence the cut-down versions.

Any help will be greatly appreciated.


--
Mit freundlichen Grüßen / Sincerely

Claas Jäger

Attachment: Pattern_and_trans_layer.pdf
Description: Adobe PDF document

Attachment: Pattern_only.pdf
Description: Adobe PDF document
