Wang, Xinglong created AVRO-2354:
------------------------------------

             Summary: Add CombineAvroKeyValueFileInputFormat in avro-mapred to 
combine small avro keyvalue files into combineSplit
                 Key: AVRO-2354
                 URL: https://issues.apache.org/jira/browse/AVRO-2354
             Project: Apache Avro
          Issue Type: Improvement
          Components: java
            Reporter: Wang, Xinglong


In our production environment, we generate Avro files to track user behavior 
events, and several new files are created every hour. We run a daily MapReduce 
job to analyze them. With AvroKeyValueInputFormat, this job starts a large 
number of small mappers because the input consists of many small Avro files. 

A combine file input format would be very helpful in this case. 

Hadoop already provides such implementations for sequence files and text files 
(CombineSequenceFileInputFormat and CombineTextInputFormat). This issue 
proposes a CombineAvroKeyValueFileInputFormat class that does the same for 
Avro key/value files.
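
To illustrate why combining helps, here is a minimal, dependency-free sketch of 
the idea: pack many small files into a few combined splits so that one mapper 
handles each group instead of each file. This is not Hadoop's actual algorithm 
(CombineFileInputFormat also considers node and rack locality), and the class 
name, method name, file sizes, and size limit below are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop or Avro.
class CombineSplitSketch {

    // Greedily packs file sizes (in bytes) into groups whose total size does
    // not exceed maxSplitBytes. A file larger than the limit gets its own group.
    static List<List<Long>> combine(List<Long> fileSizes, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            // Close the current group if adding this file would exceed the limit.
            if (!current.isEmpty() && currentBytes + size > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed workload: 24 hourly files of 4 MB each, 64 MB per combined split.
        List<Long> hourlyFiles = Collections.nCopies(24, 4 * mb);
        List<List<Long>> splits = combine(hourlyFiles, 64 * mb);
        // Without combining there would be one mapper per file (24);
        // with combining there is one mapper per group.
        System.out.println(hourlyFiles.size() + " files -> " + splits.size() + " splits");
    }
}
```

Under these assumed numbers, the 24 files collapse into 2 groups (16 files and 
8 files), so the job would start 2 mappers instead of 24.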



