Hi,
I'm not sure whether we discussed this topic in the past or whether I'm simply
mixing it up with a discussion about "equals" and "==".
However, PDFBOX-5286 shows that we have an issue with objects which aren't the
same but are treated as the same because they have the same hash. This is true
for all simple objects such as COSInteger, COSFloat, COSBoolean and COSName.
Think about the following two indirect /Length objects:
100 0 obj
512
endobj
200 0 obj
512
endobj
* there are two different COSObjects "100 0" and "200 0"
* both COSObjects have different hashes
* both COSObjects are referencing a COSInteger holding the same value "512"
* both COSIntegers are different objects
* both COSIntegers have the SAME hash, as the current implementation of hashCode
is based on the value of the COSInteger
Or some pseudo code
COSObject(100,0) != COSObject(200,0)
COSInteger(100,0) != COSInteger(200,0)
COSObject(100,0).hashCode != COSObject(200,0).hashCode
COSInteger(100,0).hashCode == COSInteger(200,0).hashCode
COSInteger(100,0).equals(COSInteger(200,0)) == true
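To make the effect on hash-based collections concrete, here is a minimal,
self-contained sketch. IntValue is a hypothetical stand-in with value-based
equals/hashCode, not the actual COSInteger class, so it only illustrates the
principle:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

class ValueHashDemo {
    // Hypothetical stand-in for a value-hashed simple object (not PDFBox code).
    static final class IntValue {
        private final long value;
        IntValue(long value) { this.value = value; }

        @Override
        public boolean equals(Object o) {
            return o instanceof IntValue && ((IntValue) o).value == value;
        }

        @Override
        public int hashCode() { return Long.hashCode(value); }
    }

    // Number of entries left when both instances go into a value-based set.
    static int distinctByValue(IntValue a, IntValue b) {
        Set<IntValue> set = new HashSet<>();
        set.add(a);
        set.add(b);
        return set.size();
    }

    // Number of entries left when the set compares by reference instead.
    static int distinctByIdentity(IntValue a, IntValue b) {
        Set<IntValue> set = Collections.newSetFromMap(new IdentityHashMap<>());
        set.add(a);
        set.add(b);
        return set.size();
    }

    public static void main(String[] args) {
        IntValue first = new IntValue(512);  // stands for the value of "100 0"
        IntValue second = new IntValue(512); // stands for the value of "200 0"
        System.out.println(distinctByValue(first, second));    // prints 1
        System.out.println(distinctByIdentity(first, second)); // prints 2
    }
}
```

The second helper also shows that an identity-based collection is an
alternative for individual call sites that doesn't require touching hashCode.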
IMHO we should change the implementation of hashCode so that different objects
will have different hashCodes.
I expect some side effects
* we are using a lot of hash-based collections and I'm afraid there may be some
cases where the fact of having the same hash for different objects is wanted
(knowingly or not)
* we have to remove the static instances for COSInteger values in a range from
-100 to 256 which will result in an increased number of COSInteger instances
* there are just two static instances of COSBoolean ("true" and "false") which
have to be replaced too
* COSName also caches a lot of values as static instances, which should be
removed as well
* looks like COSFloat shouldn't be a problem
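A small sketch of why the cached static instances get in the way of an
identity-based hashCode. Again, IntValue and its cached()/fresh() factories are
hypothetical names for illustration only, not the PDFBox API:

```java
import java.util.HashMap;
import java.util.Map;

class IdentityHashDemo {
    // Hypothetical stand-in that does NOT override equals/hashCode,
    // so Object's identity-based versions apply.
    static final class IntValue {
        private static final Map<Long, IntValue> CACHE = new HashMap<>();
        final long value;

        private IntValue(long value) { this.value = value; }

        // Mimics shared static instances: same value, same instance.
        static IntValue cached(long value) {
            return CACHE.computeIfAbsent(value, v -> new IntValue(v));
        }

        // Always creates a new instance, as proposed for the uncached case.
        static IntValue fresh(long value) {
            return new IntValue(value);
        }
    }

    public static void main(String[] args) {
        // Two "different" objects backed by the cache are the very same
        // instance, so no hashCode implementation can tell them apart.
        System.out.println(IntValue.cached(10) == IntValue.cached(10)); // prints true

        // Without the cache every object gets its own instance.
        System.out.println(IntValue.fresh(10) == IntValue.fresh(10));   // prints false
    }
}
```

Note that Object.hashCode() doesn't guarantee unique values for distinct
instances, so even then "different hashCodes" would hold in practice rather
than by contract.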
WDYT? Should we simply start with COSFloat and COSInteger and see how it ends
up?
Andreas