[ 
https://issues.apache.org/jira/browse/PDFBOX-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Klink updated PDFBOX-4999:
----------------------------------
    Description: 
The method {{COSDictionary.addAll(COSDictionary)}} creates the impression, by 
name and by JavaDoc comment,
{code:java}
/**
 * This will add all of the dictionaries keys/values to this dictionary.
...
{code}
that it can be used for exactly that, adding all key/value pairs from the 
argument dictionary to the current one, replacing old entries for the same keys.
 If one looks at the implementation, though, one is in for a surprise:
{code:java}
/**
 * This will add all of the dictionaries keys/values to this dictionary.
 * Only called when adding keys to a trailer that already exists.
 *
 * @param dic The dictionaries to get the keys from.
 */
public void addAll(COSDictionary dic)
{
    dic.forEach((key, value) ->
    {
        /*
         * If we're at a second trailer, we have a linearized pdf file, meaning 
that the first Size entry represents
         * all of the objects so we don't need to grab the second.
         */
        if (!COSName.SIZE.equals(key) || !items.containsKey(COSName.SIZE))
        {
            setItem(key, value);
        }
    });
}
{code}
Here existing *Size* entries explicitly are not replaced!

This appears to be a relic from times when PDFBox parsed PDF documents front to 
back, ignoring cross reference streams, for improved results with linearized 
files when merging trailer dictionaries.

Nowadays this exceptional treatment of *Size* does not make any sense anymore, 
see [this stack overflow answer|https://stackoverflow.com/a/64502740/1729265].

Furthermore, this method is used in other contexts than creating trailer 
unions, even some PDFBox methods use it to create arbitrary dictionary unions:
* 
{{org.apache.pdfbox.pdmodel.PDDocument.assignAcroFormDefaultResource(PDAcroForm,
 COSDictionary)}}
* {{org.apache.pdfbox.filter.JPXFilter.decode(InputStream, OutputStream, 
COSDictionary, int, DecodeOptions)}}
* {{org.apache.pdfbox.examples.interactive.form.FieldTriggers.main(String[])}}
* 
{{org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.PDImageXObject(PDStream,
 PDResources)}}
* 
{{org.apache.pdfbox.pdmodel.graphics.image.PDInlineImage.PDInlineImage(COSDictionary,
 byte[], PDResources)}}
* 
{{org.apache.pdfbox.pdmodel.graphics.image.PDInlineImageTest.testInlineImage()}}
* {{org.apache.pdfbox.pdfparser.XrefTrailerResolver.setStartxref(long)}}

(This list is offered by eclipse as callers of that method. There may be other, 
hidden calls.)

Thus, this exception should be removed after all usages of that method in 
PDFBox have been analyzed.

  was:
The method {{COSDictionary.addAll(COSDictionary)}} creates the impression, by 
name and by JavaDoc comment,
{code:java}
/**
 * This will add all of the dictionaries keys/values to this dictionary.
...
{code}
that it can be used for exactly that, adding all key/value pairs from the 
argument dictionary to the current one, replacing old entries for the same keys.
 If one looks at the implementation, though, one is in for a surprise:
{code:java}
/**
 * This will add all of the dictionaries keys/values to this dictionary.
 * Only called when adding keys to a trailer that already exists.
 *
 * @param dic The dictionaries to get the keys from.
 */
public void addAll(COSDictionary dic)
{
    dic.forEach((key, value) ->
    {
        /*
         * If we're at a second trailer, we have a linearized pdf file, meaning 
that the first Size entry represents
         * all of the objects so we don't need to grab the second.
         */
        if (!COSName.SIZE.equals(key) || !items.containsKey(COSName.SIZE))
        {
            setItem(key, value);
        }
    });
}
{code}
Here existing *Size* entries explicitly are not replaced!

This appears to be a relic from times when PDFBox parsed PDF documents front to 
back, ignoring cross reference streams, for improved results with linearized 
files when merging trailer dictionaries.

Nowadays this exceptional treatment of *Size* does not make any sense anymore, 
see [this stack overflow answer|https://stackoverflow.com/a/64502740/1729265].

Furthermore, this method is used in other contexts than creating trailer 
unions, even some PDFBox methods use it to create arbitrary dictionary unions:
* 
{{org.apache.pdfbox.pdmodel.PDDocument.assignAcroFormDefaultResource(PDAcroForm,
 COSDictionary)}}
* {{org.apache.pdfbox.filter.JPXFilter.decode(InputStream, OutputStream, 
COSDictionary, int, DecodeOptions)}}
* {{org.apache.pdfbox.examples.interactive.form.FieldTriggers.main(String[])}}
* 
{{org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.PDImageXObject(PDStream,
 PDResources)}}
* 
{{org.apache.pdfbox.pdmodel.graphics.image.PDInlineImage.PDInlineImage(COSDictionary,
 byte[], PDResources)}}
* 
{{org.apache.pdfbox.pdmodel.graphics.image.PDInlineImageTest.testInlineImage()}}

(This list is offered by eclipse as callers of that method. There may be other, 
hidden calls.)

Thus, this exception should be removed after all usages of that method in 
PDFBox have been analyzed.


> Dangerous COSDictionary.addAll(COSDictionary) method
> ----------------------------------------------------
>
>                 Key: PDFBOX-4999
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4999
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.21, 3.0.0 PDFBox
>            Reporter: Michael Klink
>            Priority: Critical
>
> The method {{COSDictionary.addAll(COSDictionary)}} creates the impression, by 
> name and by JavaDoc comment,
> {code:java}
> /**
>  * This will add all of the dictionaries keys/values to this dictionary.
> ...
> {code}
> that it can be used for exactly that, adding all key/value pairs from the 
> argument dictionary to the current one, replacing old entries for the same 
> keys.
>  If one looks at the implementation, though, one is in for a surprise:
> {code:java}
> /**
>  * This will add all of the dictionaries keys/values to this dictionary.
>  * Only called when adding keys to a trailer that already exists.
>  *
>  * @param dic The dictionaries to get the keys from.
>  */
> public void addAll(COSDictionary dic)
> {
>     dic.forEach((key, value) ->
>     {
>         /*
>          * If we're at a second trailer, we have a linearized pdf file, 
> meaning that the first Size entry represents
>          * all of the objects so we don't need to grab the second.
>          */
>         if (!COSName.SIZE.equals(key) || !items.containsKey(COSName.SIZE))
>         {
>             setItem(key, value);
>         }
>     });
> }
> {code}
> Here existing *Size* entries explicitly are not replaced!
> This appears to be a relic from times when PDFBox parsed PDF documents front 
> to back, ignoring cross reference streams, for improved results with 
> linearized files when merging trailer dictionaries.
> Nowadays this exceptional treatment of *Size* does not make any sense 
> anymore, see [this stack overflow 
> answer|https://stackoverflow.com/a/64502740/1729265].
> Furthermore, this method is used in other contexts than creating trailer 
> unions, even some PDFBox methods use it to create arbitrary dictionary unions:
> * 
> {{org.apache.pdfbox.pdmodel.PDDocument.assignAcroFormDefaultResource(PDAcroForm,
>  COSDictionary)}}
> * {{org.apache.pdfbox.filter.JPXFilter.decode(InputStream, OutputStream, 
> COSDictionary, int, DecodeOptions)}}
> * {{org.apache.pdfbox.examples.interactive.form.FieldTriggers.main(String[])}}
> * 
> {{org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.PDImageXObject(PDStream,
>  PDResources)}}
> * 
> {{org.apache.pdfbox.pdmodel.graphics.image.PDInlineImage.PDInlineImage(COSDictionary,
>  byte[], PDResources)}}
> * 
> {{org.apache.pdfbox.pdmodel.graphics.image.PDInlineImageTest.testInlineImage()}}
> * {{org.apache.pdfbox.pdfparser.XrefTrailerResolver.setStartxref(long)}}
> (This list is offered by eclipse as callers of that method. There may be 
> other, hidden calls.)
> Thus, this exception should be removed after all usages of that method in 
> PDFBox have been analyzed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to