If you are assuming that the serialization is canonical, can't you just
compare the raw bytes?
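
To make that concrete, here is a minimal sketch of what I mean, assuming both keys really were serialized canonically (same field order, same encoding choices). The class name RawBytesComparator is made up for illustration and is not part of Hadoop or protobuf; only the parameter shape is taken from the compare method quoted below.

```java
// Hypothetical helper for illustration; not an existing Hadoop or protobuf class.
final class RawBytesComparator {
    // Lexicographic comparison of two serialized keys, treating bytes as unsigned.
    // Parameters follow the (buffer, start offset, length) shape of the quoted method.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            // Mask so that 0x80..0xff sort after 0x00..0x7f.
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter key sorts first.
        return l1 - l2;
    }

    public static void main(String[] args) {
        // Field 1 as a varint with value 1, then with value 2.
        byte[] a = {0x08, 0x01};
        byte[] b = {0x08, 0x02};
        System.out.println(compare(a, 0, a.length, b, 0, b.length) < 0);
    }
}
```

I believe Hadoop's WritableComparator.compareBytes already does the same byte-wise comparison. Note that this gives a consistent total order, which is enough for grouping equal keys, but not necessarily a meaningful one: varints are encoded low-order bits first, so numeric fields will not sort numerically.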
On Fri, Apr 23, 2010 at 6:08 AM, Owen O'Malley <omal...@apache.org> wrote:
> On Apr 22, 8:33 pm, stuti awasthi <stutic...@gmail.com> wrote:
> > I wanted to pass the Protocol Buffer generated serialized file
> > directly to map reduce.
> I actually have a patch for Hadoop that does this. When my workload
> on security calms down, I'll clean it up and post it on Hadoop's
> The one spot that Protocol Buffers doesn't give me what I need is
> in defining a RawComparator to support sorting Protocol Buffer keys.
> For those of you not in Hadoop, that means I need to be able to
> int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2)
> for serialized Messages. The best approach that I can currently see
> is to walk through the Message's fields via getDescriptorForType
> and use the field's getType to compare the next field in each of the
> keys. It would have to assume the key's fields were in the sorted
> order, but that seems like a reasonable assumption for a single
> MapReduce job. Am I missing something? Is there already code
> that does this, in an Apache license friendly project?
> -- Owen
> You received this message because you are subscribed to the Google Groups
> "Protocol Buffers" group.
> To post to this group, send email to proto...@googlegroups.com.
> To unsubscribe from this group, send email to
> For more options, visit this group at