Wang, Xinglong created AVRO-2354:
------------------------------------
Summary: Add CombineAvroKeyValueFileInputFormat in avro-mapred to
combine small avro keyvalue files into combineSplit
Key: AVRO-2354
URL: https://issues.apache.org/jira/browse/AVRO-2354
Project: Apache Avro
Issue Type: Improvement
Components: java
Reporter: Wang, Xinglong
In our production environment, we generate Avro files to track user behavior
events. Several Avro files are created every hour, and once a day we run a
MapReduce job to analyze them. When using AvroKeyValueInputFormat, a large
number of small mappers are started because the input consists of many small
Avro files.
A combine file input format would be very helpful in this case.
Hadoop already provides such implementations for sequence files and text files.
This issue proposes a CombineAvroKeyValueFileInputFormat class that implements
the same for Avro key/value files.
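
To illustrate why combining helps, here is a minimal, self-contained sketch of
the packing idea: small files are grouped into combined splits up to a size
limit, so the number of mappers follows the number of splits rather than the
number of files. This is not Hadoop's actual algorithm (CombineFileInputFormat
also takes node and rack locality into account), and the class name
CombineSplitSketch, the combine method, the file sizes, and the 128 MB limit
are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Avro or Hadoop.
class CombineSplitSketch {

    // Greedily packs file sizes (in bytes) into groups whose total size
    // stays within maxSplitSize. A file larger than the limit gets a
    // group of its own.
    static List<List<Long>> combine(List<Long> fileSizes, long maxSplitSize) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long size : fileSizes) {
            // Close the current group if adding this file would exceed the limit.
            if (!current.isEmpty() && currentSize + size > maxSplitSize) {
                splits.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(size);
            currentSize += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed workload: 24 hourly files of 4 MB each.
        List<Long> sizes = new ArrayList<>();
        for (int i = 0; i < 24; i++) {
            sizes.add(4L * 1024 * 1024);
        }
        long maxSplitSize = 128L * 1024 * 1024; // assumed limit
        List<List<Long>> splits = combine(sizes, maxSplitSize);
        System.out.println("files: " + sizes.size()
                + ", combined splits: " + splits.size());
    }
}
```

With these assumed numbers, one split per file would mean 24 mappers, while
the combined approach fits all 24 files (96 MB in total) into a single split.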