[ https://issues.apache.org/jira/browse/PDFBOX-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Lehmkühler closed PDFBOX-443.
-------------------------------------
Resolution: Fixed
Fix Version/s: 1.0.0
Assignee: Andreas Lehmkühler
> Wrong length in stream decoding after exception (results in endless loop)
> -------------------------------------------------------------------------
>
> Key: PDFBOX-443
> URL: https://issues.apache.org/jira/browse/PDFBOX-443
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 0.8.0-incubator
> Environment: all
> Reporter: Timo Boehme
> Assignee: Andreas Lehmkühler
> Fix For: 1.0.0
>
> Original Estimate: 5m
> Remaining Estimate: 5m
>
> When reading streams from PDF files, PDFBox may get stuck and never return.
> This happens, for instance, if a stream with a FlateDecode filter is read but
> the flate decoder is unable to read the encoded data.
> The reason is that in COSStream, the method doDecode(COSName, int) first
> applies the filter using the length specified by the PDF and, in case of an
> error, applies it again with the length parameter set to 'writtenLength'.
> However, 'writtenLength' is read from unFilteredStream, which was newly
> created in the first try, so the written length is 0, which results in an
> endless loop in the flate decoder.
> The solution: read the written length before the first try and (in case we
> have several filters) make sure that writtenLength is not 0.
> The fixed method:
> private void doDecode( COSName filterName, int filterIndex ) throws IOException
> {
>     FilterManager manager = getFilterManager();
>     Filter filter = manager.getFilter( filterName );
>     InputStream input;
>     boolean done = false;
>     IOException exception = null;
>     long position = unFilteredStream.getPosition();
>     long length = unFilteredStream.getLength();
>     long writtenLength = unFilteredStream.getLengthWritten(); // TB: in case we need it later
>     if( length == 0 )
>     {
>         //if the length is zero then don't bother trying to decode
>         //some filters don't work when attempting to decode
>         //with a zero length stream. See zlib_error_01.pdf
>         unFilteredStream = new RandomAccessFileOutputStream( file );
>         done = true;
>     }
>     else
>     {
>         //ok this is a simple hack, sometimes we read a couple extra
>         //bytes that shouldn't be there, so we encounter an error we will just
>         //try again with one less byte.
>         for( int tryCount=0; !done && tryCount<5; tryCount++ )
>         {
>             try
>             {
>                 input = new BufferedInputStream(
>                     new RandomAccessFileInputStream( file, position, length ), BUFFER_SIZE );
>                 unFilteredStream = new RandomAccessFileOutputStream( file );
>                 filter.decode( input, unFilteredStream, this, filterIndex );
>                 done = true;
>             }
>             catch( IOException io )
>             {
>                 length--;
>                 exception = io;
>             }
>         }
>         if( !done )
>         {
>             //if no good stream was found then lets try again but with the
>             //length of data that was actually read and not length
>             //defined in the dictionary
>             length = writtenLength;
>             if( length != 0 )
>             {
>                 for( int tryCount=0; !done && tryCount<5; tryCount++ )
>                 {
>                     try
>                     {
>                         input = new BufferedInputStream(
>                             new RandomAccessFileInputStream( file, position, length ), BUFFER_SIZE );
>                         unFilteredStream = new RandomAccessFileOutputStream( file );
>                         filter.decode( input, unFilteredStream, this, filterIndex );
>                         done = true;
>                     }
>                     catch( IOException io )
>                     {
>                         length--;
>                         exception = io;
>                     }
>                 }
>             }
>         }
>     }
>     if( !done )
>     {
>         throw exception;
>     }
> }
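
The ordering problem described in the report can be reproduced in isolation.
The sketch below is not PDFBox code: CountingSink is a hypothetical stand-in
for RandomAccessFileOutputStream, and the two fallbackLength methods are
illustrative names only. It assumes, as the report states, that the first
decode attempt replaces the output stream before the written length is read.

```java
import java.io.ByteArrayOutputStream;

public class WrittenLengthDemo {

    /** Hypothetical stand-in for an output stream that tracks bytes written. */
    public static final class CountingSink extends ByteArrayOutputStream {
        public long getLengthWritten() {
            return size();
        }
    }

    /** Buggy order: the sink is replaced before its written length is read. */
    public static long fallbackLengthBuggy(CountingSink current) {
        // The first decode attempt creates a fresh, empty sink ...
        current = new CountingSink();
        // ... so the fallback length read afterwards is always 0.
        return current.getLengthWritten();
    }

    /** Fixed order: capture the written length before the sink is replaced. */
    public static long fallbackLengthFixed(CountingSink current) {
        long writtenLength = current.getLengthWritten();
        current = new CountingSink();
        return writtenLength;
    }

    public static void main(String[] args) {
        CountingSink sink = new CountingSink();
        sink.write(new byte[] {1, 2, 3, 4}, 0, 4);

        System.out.println(fallbackLengthBuggy(sink)); // prints 0
        System.out.println(fallbackLengthFixed(sink)); // prints 4
    }
}
```

A fallback length of 0 is what the report identifies as the trigger for the
endless loop, which is why the fixed method also skips the second retry loop
when writtenLength is 0.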