[ 
https://issues.apache.org/jira/browse/PDFBOX-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286483#comment-16286483
 ] 

Tilman Hausherr commented on PDFBOX-4028:
-----------------------------------------

This code produces an empty file:

{code}
        String name1 = "PDF_32000_2008.pdf";
        String name2 = "PDF_32000_2008 - Kopie.pdf";
        Files.copy(new File(name1).toPath(), new File(name2).toPath(),
                StandardCopyOption.REPLACE_EXISTING);
        try (RandomAccessFile raf = new RandomAccessFile(name2, "r");
             FileOutputStream fos = new FileOutputStream(name2))
        {
            byte[] buffer = new byte[4096];
            int n;
            while (-1 != (n = raf.read(buffer)))
            {
                fos.write(buffer, 0, n);
            }
        }
{code}

So there's no way you could get a good result by overwriting your input file: 
opening the FileOutputStream truncates the file before anything has been read. 
What remains is the question of why there isn't an exception.
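
To make the cause of the empty file visible, here is a minimal sketch using only the JDK (class and method names are made up for illustration, this is not PDFBox code). It checks the size of an existing file right after a FileOutputStream has been opened on it, before a single byte is written:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of PDFBox.
class TruncateOnOpenDemo
{
    // Returns the file size observed right after a FileOutputStream
    // has been opened on the file (non-append mode).
    static long sizeAfterOpeningForWrite(Path file) throws IOException
    {
        try (FileOutputStream fos = new FileOutputStream(file.toFile()))
        {
            // Nothing has been written yet, but the open itself
            // has already truncated the file.
            return Files.size(file);
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path tmp = Files.createTempFile("truncate-demo", ".bin");
        try
        {
            Files.write(tmp, "some existing content".getBytes(StandardCharsets.UTF_8));
            System.out.println("size before open: " + Files.size(tmp));
            System.out.println("size after open:  " + sizeAfterOpeningForWrite(tmp));
        }
        finally
        {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Any reader opened on the same path sees this truncated file from then on, which is why the copy loop above has nothing left to read.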

The next snippet also produces a bad result. When debugging, one can see that 
the first write has correct data, while the second one has 4096 zeroes.

RandomAccessBufferedFileInputStream uses RandomAccessFile inside.

{code}
        String name1 = "PDF_32000_2008.pdf";
        String name2 = "PDF_32000_2008 - Kopie.pdf";
        Files.copy(new File(name1).toPath(), new File(name2).toPath(),
                StandardCopyOption.REPLACE_EXISTING);
        try (RandomAccessBufferedFileInputStream raf =
                     new RandomAccessBufferedFileInputStream(name2);
             FileOutputStream fos = new FileOutputStream(name2))
        {
            // at this time, the destination file is empty
            byte[] buffer = new byte[4096];
            int n;
            while (-1 != (n = raf.read(buffer)))
            {
                fos.write(buffer, 0, n);
                fos.flush();
                fos.getFD().sync();
            }
        }
{code}

One might argue that it is a bug in RandomAccessBufferedFileInputStream... It 
does a seek past the new end and then reads nothing. But this isn't reported in 
any way.
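
The silent behaviour can be reproduced with a plain java.io.RandomAccessFile. The following is only a sketch with made-up names, not the code of RandomAccessBufferedFileInputStream: seeking beyond the end of the file is permitted, and the following read just signals end of file with -1 instead of throwing.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of PDFBox.
class SeekPastEndDemo
{
    // Seeks to the given offset and returns what read(byte[]) reports.
    static int readAt(Path file, long offset) throws IOException
    {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r"))
        {
            // seek() accepts offsets beyond the current file length
            raf.seek(offset);
            // returns the number of bytes read, or -1 at end of file
            return raf.read(new byte[16]);
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path tmp = Files.createTempFile("seek-demo", ".bin");
        try
        {
            Files.write(tmp, new byte[] { 1, 2, 3, 4 });
            System.out.println("read at offset 0:    " + readAt(tmp, 0));
            System.out.println("read at offset 4096: " + readAt(tmp, 4096));
        }
        finally
        {
            Files.deleteIfExists(tmp);
        }
    }
}
```

So a caller that wants to notice the problem has to compare the requested position with the current file length itself.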

I tried throwing an exception but existing tests failed.

I then modified RandomAccessBufferedFileInputStream to use the current file 
length instead of the one determined at the beginning... the result is empty.

So for your test, you'd only get the increment part.

Thus what you do will always fail.

What you did is a bad idea anyway, even if it were possible to write while 
reading. Imagine a power failure while writing - you'd have your PDF in an 
unknown state.
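
A safer pattern is to write the complete result to a temporary file in the same directory and replace the original only after the write has finished. Below is a sketch using only the JDK; replaceSafely and ContentWriter are hypothetical names, and the lambda in main merely stands in for whatever produces the output (in the PDFBox case that would be the incremental save, with the document closed before the move).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of PDFBox.
class SafeReplaceDemo
{
    @FunctionalInterface
    interface ContentWriter
    {
        void writeTo(OutputStream out) throws IOException;
    }

    // Writes the new content to a temporary file next to the target and
    // swaps it in only after the write has completed successfully.
    static void replaceSafely(Path target, ContentWriter writer) throws IOException
    {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "save-", ".tmp");
        try
        {
            try (OutputStream out = Files.newOutputStream(tmp))
            {
                // the target is still intact and readable at this point
                writer.writeTo(out);
            }
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
        finally
        {
            // no-op after a successful move, cleanup after a failure
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path file = Files.createTempFile("safe-replace", ".txt");
        try
        {
            Files.write(file, "abc".getBytes(StandardCharsets.UTF_8));
            // copy the original content, then append the "increment"
            replaceSafely(file, out -> {
                Files.copy(file, out);
                out.write("def".getBytes(StandardCharsets.UTF_8));
            });
            System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        }
        finally
        {
            Files.deleteIfExists(file);
        }
    }
}
```

If the write fails halfway, only the temporary file is affected and the original stays as it was.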

> SaveIncremental on same opened file
> -----------------------------------
>
>                 Key: PDFBOX-4028
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4028
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Writing
>    Affects Versions: 2.0.8
>            Reporter: Martin Mancuska
>         Attachments: pdf_reference_1-7.pdf
>
>
> The incremental save does not work correctly if it is done on the same opened 
> document. It produces a corrupted file. The incremental save should append 
> changes at the end of the file (after the last original EOF).
> The newly saved file also contains changes in the middle of the file, not only 
> at the end. The changes in the middle of the file contain zeroed bytes or 
> garbage. Tested with the latest stable version of PDFBox, 2.0.8.
>  
> Sample code:
> {code:java}
> String fileName = "/path/to/document.pdf";
> PDDocument doc = PDDocument.load(new File(fileName));
> ...
> document changes
> ...
> try ( OutputStream outStream = new FileOutputStream(fileName)) {
>       doc.saveIncremental(outStream);
> }
> catch ....
> ...
> {code}


