[ 
https://issues.apache.org/jira/browse/PDFBOX-424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Schlegel updated PDFBOX-424:
------------------------------------

    Description: 
Decoding of streams can sometimes hang.
The cause is in the decode method of org.apache.pdfbox.filter.FlateFilter.

There, the code asks how many bytes are available in the compressedData stream:

    decompressor = new InflaterInputStream(compressedData);
    int mayRead = compressedData.available();
    byte[] buffer = new byte[Math.min(mayRead, BUFFER_SIZE)];

Sometimes compressedData.available() returns 0.
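
A return value of 0 from available() does not mean the stream is empty. The base java.io.InputStream implementation of available() always returns 0 unless a subclass overrides it. The following is a minimal sketch of that behaviour; the class AvailableZeroDemo and the helper withoutAvailable are hypothetical names used only for illustration and are not part of PDFBox.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class AvailableZeroDemo {
    /**
     * Hypothetical helper: wraps the data in a stream that overrides only
     * read(), so available() keeps the InputStream default and returns 0.
     */
    static InputStream withoutAvailable(byte[] data) {
        final InputStream in = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() throws IOException {
                return in.read();
            }
        };
    }

    public static void main(String[] args) throws IOException {
        InputStream s = withoutAvailable(new byte[] {1, 2, 3});
        // available() reports 0 although three bytes can still be read.
        System.out.println("available = " + s.available());
        System.out.println("first byte = " + s.read());
    }
}
```

So available() is not a reliable way to size a read buffer or a read request.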

Later, the code iterates over the stream data:

    while((amountRead = decompressor.read(buffer, 0, Math.min(mayRead, BUFFER_SIZE))) != -1 )
    {
        result.write(buffer, 0, amountRead);
    }

Because mayRead is 0, every iteration asks to read 0 bytes from the stream ==> 
amountRead is 0 on every iteration and never -1 ==> the loop never finishes.
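
One possible way to avoid the hang is to size the buffer and the read request independently of available(). The following is only a sketch under assumptions, not the actual PDFBox fix: the class name FlateDecodeSketch, the helpers deflate and zeroAvailable, and the BUFFER_SIZE value of 2048 are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class FlateDecodeSketch {
    // Assumed value; the report does not state the real BUFFER_SIZE.
    private static final int BUFFER_SIZE = 2048;

    /** Inflates compressedData into result without consulting available(). */
    static void decode(InputStream compressedData, OutputStream result) throws IOException {
        InflaterInputStream decompressor = new InflaterInputStream(compressedData);
        byte[] buffer = new byte[BUFFER_SIZE];
        int amountRead;
        // The requested length is always > 0, so read() returns either
        // a positive count or -1 at the end of the compressed data.
        while ((amountRead = decompressor.read(buffer, 0, buffer.length)) != -1) {
            result.write(buffer, 0, amountRead);
        }
    }

    /** Hypothetical helper: deflates raw bytes to produce test input. */
    static byte[] deflate(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(raw);
        }
        return out.toByteArray();
    }

    /** Hypothetical helper: a stream whose available() always returns 0. */
    static InputStream zeroAvailable(byte[] data) {
        final InputStream in = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() throws IOException {
                return in.read();
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = deflate("hello flate".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        decode(zeroAvailable(compressed), result);
        System.out.println(new String(result.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With a fixed, non-zero request length the loop terminates even when the underlying stream reports 0 available bytes.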


You can reproduce this with the following PDF document: 
http://www.usu.de/d/Case_Studies/BSM/Profiles_in_Excellence_FIDUCIA_AG.pdf



  was:
Sometimes it can happens that decoding of streams can hang up.
The reason can be find in org.apache.pdfbox.filter.FlateFilter decode method.

Here you ask for available datas in the compressedData stream:

    decompressor = new InflaterInputStream(compressedData);
    int mayRead = compressedData.available();
    byte[] buffer = new byte[Math.min(mayRead, BUFFER_SIZE)];

Sometimes compressedData.available() returns 0.

Later you iterate over stream datas.

    while((amountRead = decompressor.read(buffer, 0, Main.min(mayRead, 
BUFFER_SIZE))) != -1 )
    {
        result.write(buffer, 0, amountRead);
    }

Because mayRead is 0 with every loop you try to read 0 bytes from stream ==> 
amountRead will be 0 for every loop ==> Loop nether finishes.


You can test this following PDF-Document: 
http://www.usu.de/d/Case_Studies/BSM/Profiles_in_Excellence_FIDUCIA_AG.pdf




> Stream decoding hangs up
> ------------------------
>
>                 Key: PDFBOX-424
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-424
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>            Reporter: Michael Schlegel
>
> Decoding of streams can sometimes hang.
> The cause is in the decode method of org.apache.pdfbox.filter.FlateFilter.
> There, the code asks how many bytes are available in the compressedData stream:
>     decompressor = new InflaterInputStream(compressedData);
>     int mayRead = compressedData.available();
>     byte[] buffer = new byte[Math.min(mayRead, BUFFER_SIZE)];
> Sometimes compressedData.available() returns 0.
> Later, the code iterates over the stream data:
>     while((amountRead = decompressor.read(buffer, 0, Math.min(mayRead, BUFFER_SIZE))) != -1 )
>     {
>         result.write(buffer, 0, amountRead);
>     }
> Because mayRead is 0, every iteration asks to read 0 bytes from the stream ==> 
> amountRead is 0 on every iteration and never -1 ==> the loop never finishes.
> You can reproduce this with the following PDF document: 
> http://www.usu.de/d/Case_Studies/BSM/Profiles_in_Excellence_FIDUCIA_AG.pdf

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
