[ 
https://issues.apache.org/jira/browse/AVRO-108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748893#action_12748893
 ] 

Doug Cutting commented on AVRO-108:
-----------------------------------

An API for this might be something like:

  BinaryComparator.compare(byte[] bytes1, int start1,
                           byte[] bytes2, int start2, Schema schema);

The schema provided must be the schema used to write the data.

Records would be ordered by comparing their fields in schema order, arrays 
and maps by their entries, unions by their branches, etc.
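
To make the idea concrete, here is a minimal, self-contained sketch of comparing two serialized records field by field without building objects. It is not Avro's implementation: the class name RawCompareSketch, the Type enum (a stand-in for Schema covering only long and string fields), and the encode helper are all hypothetical. It relies only on the Avro binary encoding rules that longs are zig-zag varints, strings are a long length followed by UTF-8 bytes, and a record is its fields written in order.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class RawCompareSketch {
    /** Hypothetical stand-in for a record schema: an ordered list of field types. */
    enum Type { LONG, STRING }

    /** Read position within one serialized buffer. */
    private static final class Cursor {
        final byte[] buf;
        int pos;
        Cursor(byte[] buf, int pos) { this.buf = buf; this.pos = pos; }

        /** Decodes one zig-zag varint long and advances past it. */
        long readLong() {
            long raw = 0;
            int shift = 0;
            int b;
            do {
                b = buf[pos++] & 0xff;
                raw |= (long) (b & 0x7f) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            return (raw >>> 1) ^ -(raw & 1);
        }
    }

    /** Compares two serialized records directly on their bytes, field by field. */
    static int compare(byte[] b1, int s1, byte[] b2, int s2, Type[] fields) {
        Cursor c1 = new Cursor(b1, s1);
        Cursor c2 = new Cursor(b2, s2);
        for (Type t : fields) {
            int r;
            switch (t) {
                case LONG:   r = Long.compare(c1.readLong(), c2.readLong()); break;
                case STRING: r = compareStrings(c1, c2); break;
                default:     throw new IllegalArgumentException("unsupported: " + t);
            }
            if (r != 0) return r;   // first differing field decides the order
        }
        return 0;
    }

    /** Compares length-prefixed UTF-8 payloads as unsigned bytes, then by length. */
    private static int compareStrings(Cursor c1, Cursor c2) {
        int len1 = (int) c1.readLong();
        int len2 = (int) c2.readLong();
        int p1 = c1.pos, p2 = c2.pos;
        c1.pos += len1;             // skip the payload so the next field lines up
        c2.pos += len2;
        for (int i = 0, n = Math.min(len1, len2); i < n; i++) {
            int d = (c1.buf[p1 + i] & 0xff) - (c2.buf[p2 + i] & 0xff);
            if (d != 0) return d;
        }
        return Integer.compare(len1, len2);
    }

    /** Test helper: writes a (long, string) record in the encoding described above. */
    static byte[] encode(long id, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        writeLong(out, id);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }

    private static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);          // zig-zag: small magnitudes stay short
        while ((z & ~0x7fL) != 0) {
            out.write((int) ((z & 0x7f) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    public static void main(String[] args) {
        Type[] schema = { Type.LONG, Type.STRING };
        byte[] a = encode(1L, "apple");
        byte[] b = encode(1L, "banana");
        // Same id, so the string field decides; prints a negative number.
        System.out.println(compare(a, 0, b, 0, schema));
    }
}
```

The sketch decodes each long before comparing it, since zig-zag varint bytes do not sort in numeric order on their own; string payloads, by contrast, can be compared byte for byte. A full version would also have to handle the remaining schema types (arrays, maps, unions, and so on) as described above.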


> add binary comparator
> ---------------------
>
>                 Key: AVRO-108
>                 URL: https://issues.apache.org/jira/browse/AVRO-108
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
>
> Hadoop MapReduce performance benefits greatly if data may be compared without 
> deserializing to an object, but rather by examining its serialized bytes 
> directly.  Such "raw" comparators are typically written by hand in Hadoop, 
> and are very fragile.
> With Avro it is possible to generically compare two serialized byte sequences 
> if their schema is known.  This should work for any Avro data, regardless of 
> how it was serialized or how it will be deserialized.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
