[ 
https://issues.apache.org/jira/browse/PDFBOX-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-2690:
---------------------------------------
    Description: 
I am using PDFBox 1.8.8 to manipulate existing PDF files. After saving a 
document, the output file becomes several times larger than the original. This 
is undesirable.

*How to reproduce my problem:*

In the following code, PDFBox simply loads an existing PDF and then saves it. 
Nothing else is done. Yet the file size still becomes several times larger.

{code}
import java.io.*;
import org.apache.pdfbox.pdmodel.*;
import org.apache.pdfbox.exceptions.*;

class Test
{
    public static void main(String[] args) throws IOException, COSVisitorException
    {
        // Load the existing PDF and save it again without any modification.
        PDDocument document = PDDocument.load("input2.pdf");
        document.save("input2-after-save.pdf");
        document.close();
    }
}
{code}
Attached are two sample PDF files. input2.pdf is the original, unprocessed PDF;
input2-after-save.pdf is the result of running the code above. After
processing, the file size increases from 416 kB to 1.25 MB.


  was:
I am using PDFBox 1.8.8 to manipulate existing PDF files. After saving a 
document, the output file becomes several times larger than the original. This 
is undesirable.

*How to reproduce my problem:*

In the following code, PDFBox simply loads an existing PDF and then saves it. 
Nothing else is done. Yet the file size still becomes several times larger.

{code}
import java.io.*;
import org.apache.pdfbox.pdmodel.*;
import org.apache.pdfbox.exceptions.*;

class Test 
{
    public static void main(String[] args) throws IOException, 
COSVisitorException {

    PDDocument document = PDDocument.load("input2.pdf");
    document.save("input2-after-save.pdf");
    document.close();       
    }
}   
{code}
Attached are two sample PDF files.  input2.pdf is an original, unprocessed PDF. 
  input2-after-save.pdf is processed by the code above.  After processing, file 
size increases from 416kB to 1.25MB.

*Possible reason:*

Tilman Hausherr suggests that the input file holds a large amount of
"structure" information in compressed object streams, and that this data is
not compressed in the output file.

     Issue Type: Improvement  (was: Bug)
        Summary: Implement write support for compressed object streams  (was: 
Filesize becomes extremely large after saving)

PDFBox can read compressed object streams but does not support writing them.
As a result, saving a PDF like the attached one inflates its size.

Accordingly, I've changed the title and the issue type.
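
A rough way to check this diagnosis on the attached files is to count the
/ObjStm markers in the raw bytes of each PDF. Object streams carry /Type /ObjStm
in their stream dictionary, which is stored as plain text, so a file that uses
them should show a non-zero count. The sketch below uses only the JDK; the class
name ObjStmProbe and the method countMarker are hypothetical and not part of
PDFBox, and the byte search is a heuristic, not a real PDF parse.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper (not a PDFBox API): scans raw PDF bytes.
class ObjStmProbe {

    // Counts non-overlapping occurrences of an ASCII marker in the given bytes.
    static int countMarker(byte[] data, String marker) {
        byte[] needle = marker.getBytes(StandardCharsets.US_ASCII);
        if (needle.length == 0) {
            return 0;
        }
        int count = 0;
        int i = 0;
        while (i <= data.length - needle.length) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (data[i + j] != needle[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                count++;
                i += needle.length; // skip past the marker just found
            } else {
                i++;
            }
        }
        return count;
    }

    // Prints the size and the /ObjStm marker count for each file named in args.
    public static void main(String[] args) throws IOException {
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            System.out.println(name + ": " + data.length + " bytes, "
                    + countMarker(data, "/ObjStm") + " /ObjStm marker(s)");
        }
    }
}
```

After compiling with javac, this could be run as
"java ObjStmProbe input2.pdf input2-after-save.pdf". If the explanation above
holds, the original should report object streams and the re-saved file should
not.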

> Implement write support for compressed object streams
> -----------------------------------------------------
>
>                 Key: PDFBOX-2690
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2690
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Writing
>    Affects Versions: 1.8.8, 1.8.9, 2.0.0
>         Environment: PDFBox 1.8.8, Java8u25, Windows 8.1
>            Reporter: Brian Liu
>         Attachments: input2-after-save.pdf, input2.pdf
>
>
> I am using PDFBox 1.8.8 to manipulate existing PDF files. After saving a 
> document, the output file becomes several times larger than the original. 
> This is undesirable.
> *How to reproduce my problem:*
> In the following code, PDFBox simply loads an existing PDF and then saves it. 
> Nothing else is done. Yet the file size still becomes several times larger.
> {code}
> import java.io.*;
> import org.apache.pdfbox.pdmodel.*;
> import org.apache.pdfbox.exceptions.*;
> class Test 
> {
>     public static void main(String[] args) throws IOException, 
> COSVisitorException {
>     PDDocument document = PDDocument.load("input2.pdf");
>     document.save("input2-after-save.pdf");
>     document.close();       
>     }
> }   
> {code}
> Attached are two sample PDF files.  input2.pdf is an original, unprocessed 
> PDF.   input2-after-save.pdf is processed by the code above.  After 
> processing, file size increases from 416kB to 1.25MB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)