[ https://issues.apache.org/jira/browse/THRIFT-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091831#comment-15091831 ]

Christian Spriegel commented on THRIFT-1630:
--------------------------------------------

Ok, thanks a lot!

> Equivalent objects that contain sets and maps can serialize differently
> -----------------------------------------------------------------------
>
>                 Key: THRIFT-1630
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1630
>             Project: Thrift
>          Issue Type: New Feature
>          Components: Java - Compiler
>            Reporter: Chris Mullins
>            Assignee: Roger Meier
>         Attachments: 0001-THRIFT-1630-Add-sorted_containers-switch-to-java-gen.patch
>
>
> There's a subtle issue with comparing the serialized bytes of Thrift
> objects that contain maps or sets in Java. Even though the objects that go
> into sets (or serve as map keys) have consistent hash codes, inserting them
> in a different order can change the iteration order of the collection.
> Since serialization occurs in iteration order, two objects that are
> .equals() in memory can produce different bytes when serialized.
> In most cases this isn't an issue. However, in cases where the user is doing
> raw byte comparison (e.g., in Hadoop), it is a big issue.
> One solution is to just switch the internal Set/Map implementations to their
> sorted versions (TreeSet/TreeMap). However, these implementations are about
> 3x slower than their Hash counterparts, and I can certainly foresee
> situations in which that would upset a lot of users. I propose we add a
> compiler switch that toggles the Map/Set implementation between sorted and
> unsorted so that users can select which they prefer.
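
The behaviour described in the issue can be reproduced without Thrift. The sketch below is illustrative only: `newSet`, `build` and `serialize` are hypothetical helpers, not Thrift APIs, and `serialize` just joins elements in iteration order as a stand-in for a real protocol writer. It assumes the usual OpenJDK behaviour where keys with the same hash code share a bucket and are kept there in insertion order; "Aa" and "BB" are used because both have the String hash code 2112.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class IterationOrderDemo {

    // Hypothetical stand-in for the proposed switch: picks the set
    // implementation that generated code would use.
    static Set<String> newSet(boolean sorted) {
        return sorted ? new TreeSet<String>() : new HashSet<String>();
    }

    // Builds a set by inserting the items in the given order.
    static Set<String> build(boolean sorted, String... items) {
        Set<String> set = newSet(sorted);
        for (String item : items) {
            set.add(item);
        }
        return set;
    }

    // Stand-in for a serializer: emits elements in iteration order.
    static String serialize(Collection<String> elements) {
        return String.join(",", elements);
    }

    public static void main(String[] args) {
        // "Aa" and "BB" share a hash code, so they collide in one bucket.
        Set<String> h1 = build(false, "Aa", "BB");
        Set<String> h2 = build(false, "BB", "Aa");
        System.out.println("hash equal:      " + h1.equals(h2));
        System.out.println("hash same bytes: " + serialize(h1).equals(serialize(h2)));

        // Sorted sets iterate in key order regardless of insertion order.
        Set<String> t1 = build(true, "Aa", "BB");
        Set<String> t2 = build(true, "BB", "Aa");
        System.out.println("tree equal:      " + t1.equals(t2));
        System.out.println("tree same bytes: " + serialize(t1).equals(serialize(t2)));
    }
}
```

With the hash-based sets the two objects compare equal but their serialized forms differ. With the sorted sets both the objects and their serialized forms match, which is the property a raw byte comparison needs.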



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
