[ http://issues.apache.org/jira/browse/HADOOP-525?page=comments#action_12441544 ]

Trevor Strohman commented on HADOOP-525:
----------------------------------------
I just added three files (TypeBuilder, WordCountType and TypeBuilder-support.tar) to this issue, based on the thread referenced below. This code comes from my own MapReduce implementation and is not directly Hadoop-compatible.

The code here handles automatic generation of record comparators, hash functions, and serialization code. The serialization code in particular uses knowledge of object order to compress the output.

Reference:
http://mail-archives.apache.org/mod_mbox/lucene-hadoop-user/200610.mbox/[EMAIL PROTECTED]

> Need raw comparators for hadoop record types
> --------------------------------------------
>
>                 Key: HADOOP-525
>                 URL: http://issues.apache.org/jira/browse/HADOOP-525
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: record
>    Affects Versions: 0.6.0
>            Reporter: Sameer Paranjpye
>         Assigned To: Milind Bhandarkar
>            Priority: Minor
>             Fix For: 0.8.0
>
>         Attachments: TypeBuilder-support.tar, TypeBuilder.java, WordCountType.java
>
>
> Raw comparators are not generated for types that are generated with the
> Hadoop record framework. This could have a substantial performance impact
> when using hadoop record generated types in Map/Reduce. The record i/o
> framework should auto-generate raw comparators for types.
> Comparison for hadoop record i/o types is defined to be member wise
> comparison of objects. A possible implementation could only deserialize one
> member from each object at a time, compare them and either return or move on
> to the next member if the values are equal.
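For illustration, here is a minimal sketch of the member-wise comparison the issue description proposes: read one member from each serialized record, compare, and return as soon as the members differ. It is not the attached TypeBuilder code and does not use the Hadoop record or comparator APIs. The class name, the two-member layout (a String followed by an int) and the java.io.DataOutputStream encoding are all assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical example class; not part of Hadoop or of the attached files.
public class WordCountRawComparator {

    // Assumed record layout: a String member followed by an int member.
    static byte[] serialize(String word, int count) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(word);
            out.writeInt(count);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compares two serialized records without building record objects.
    // Members are read one at a time; later members are only read when
    // all earlier members compared equal.
    static int compareRaw(byte[] a, byte[] b) {
        try {
            DataInputStream inA = new DataInputStream(new ByteArrayInputStream(a));
            DataInputStream inB = new DataInputStream(new ByteArrayInputStream(b));

            // First member: the word.
            int byWord = inA.readUTF().compareTo(inB.readUTF());
            if (byWord != 0) {
                return byWord; // early exit, the count is never deserialized
            }

            // Second member: the count, reached only if the words are equal.
            return Integer.compare(inA.readInt(), inB.readInt());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] apple3 = serialize("apple", 3);
        byte[] apple7 = serialize("apple", 7);
        byte[] banana1 = serialize("banana", 1);

        System.out.println(compareRaw(apple3, banana1)); // negative: words differ
        System.out.println(compareRaw(apple3, apple7));  // negative: same word, smaller count
        System.out.println(compareRaw(apple7, apple7));  // zero: all members equal
    }
}
```

This sketch still decodes each member it touches. A generated comparator could go further and compare the encoded bytes of a member directly where the encoding preserves order, which is where most of the saving would come from.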
