[
https://issues.apache.org/jira/browse/THRIFT-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091821#comment-15091821
]
Christian Spriegel commented on THRIFT-1630:
--------------------------------------------
May I ask which fixVersion this ticket has?
> Equivalent objects that contain sets and maps can serialize differently
> -----------------------------------------------------------------------
>
> Key: THRIFT-1630
> URL: https://issues.apache.org/jira/browse/THRIFT-1630
> Project: Thrift
> Issue Type: New Feature
> Components: Java - Compiler
> Reporter: Chris Mullins
> Assignee: Roger Meier
> Attachments:
> 0001-THRIFT-1630-Add-sorted_containers-switch-to-java-gen.patch
>
>
> There's a subtle issue with comparing the serialized bytes of Thrift
> objects that contain maps or sets in Java. Even though the objects that go
> into sets (or serve as map keys) have consistent hashcodes, if they are
> inserted in a different order, then the iteration order of the collection
> can also be different. Since serialization occurs in iteration order, this
> can lead to objects that are .equals() in memory but not equal when
> serialized.
> In most cases this isn't an issue. However, in cases where the user is doing
> raw comparison of serialized bytes (e.g., Hadoop), it is a big issue.
> One solution is to just switch the internal Map/Set implementations to their
> sorted versions (TreeMap/TreeSet). However, these implementations are about
> 3x slower than their Hash counterparts, and I can certainly foresee
> situations in which that would upset a lot of users. I propose we add a
> compiler switch that toggles the Map/Set implementation between sorted and
> unsorted so that users can select which they prefer.
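
The effect described in the quoted report can be reproduced with plain JDK collections. The following is only a minimal sketch, not Thrift code: the class name IterationOrderDemo and the helpers fill() and serialize() are made up for illustration, and serialize() merely joins elements in iteration order as a stand-in for a real serializer. It relies on "Aa" and "BB" having the same String hash code, so that both land in the same hash bucket. HashSet iteration order is unspecified by the API, so the difference shown for the hash-based sets is what OpenJDK's implementation does in practice, not a guarantee.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class IterationOrderDemo {

    // Hypothetical stand-in for a serializer: writes elements in iteration order.
    static String serialize(Set<String> set) {
        StringBuilder sb = new StringBuilder();
        for (String element : set) {
            sb.append(element).append(';');
        }
        return sb.toString();
    }

    // Adds the elements to the target set in the given order and returns it.
    static Set<String> fill(Set<String> target, String... elements) {
        Collections.addAll(target, elements);
        return target;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" share a hash code, so they collide in a hash table.
        Set<String> hashA = fill(new HashSet<>(), "Aa", "BB");
        Set<String> hashB = fill(new HashSet<>(), "BB", "Aa");

        // Sorted sets iterate in natural order regardless of insertion order.
        Set<String> treeA = fill(new TreeSet<>(), "Aa", "BB");
        Set<String> treeB = fill(new TreeSet<>(), "BB", "Aa");

        System.out.println("hash sets equal in memory:  " + hashA.equals(hashB));
        System.out.println("hash sets serialize alike:  "
                + serialize(hashA).equals(serialize(hashB)));
        System.out.println("tree sets serialize alike:  "
                + serialize(treeA).equals(serialize(treeB)));
    }
}
```

The sorted variant trades insertion and lookup cost for a deterministic byte layout, which is the trade-off the proposed compiler switch would leave to the user.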