[ 
https://issues.apache.org/jira/browse/AVRO-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated AVRO-1357:
--------------------------------

    Release Note: Allow users to choose de-serialization types among generic 
(always generating generic records), specific (generating specific records if 
input data matches an Avro generated class), and reflect (generating instances 
if input data matches an existing class) for input data and map output data in 
mapred jobs.
          Status: Patch Available  (was: Open)
    
> Allow to force reading generic records for input data and map output data
> -------------------------------------------------------------------------
>
>                 Key: AVRO-1357
>                 URL: https://issues.apache.org/jira/browse/AVRO-1357
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>    Affects Versions: 1.7.4
>            Reporter: Xiangrui Meng
>         Attachments: AVRO-1357.patch, AVRO-1357.patch.1
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> In AvroJob/AvroInputFormat/AvroRecordReader, we can choose either 
> SpecificDatumReader or ReflectDatumReader to read input data and map output 
> data, but not GenericDatumReader. We may want to force reading generic 
> records for some jobs.
> For example, assume that the input records contain a field called "category" 
> and we want to count the records in each category. If we can force reading 
> generic records, we can get the category string by calling 
> get("category"). Otherwise, the input record might be loaded as either a 
> GenericRecord instance or a SpecificRecord instance, and the latter does not 
> implement GenericRecord, so the same get("category") call cannot be relied on.
> To add this feature, we can change the booleans 
> IS_REFLECT/MAP_OUTPUT_IS_REFLECT into enums called 
> INPUT_AVRO_DESERIALIZATION_TYPE/MAP_OUTPUT_AVRO_DESERIALIZATION_TYPE, and 
> return the corresponding DatumReader based on the type.
> We can add 
> setDeserializationType/setInputDeserializationType/setMapOutputDeserializationType
>  to AvroJob while deprecating setReflect/setInputReflect/setMapOutputReflect.
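
The boolean-to-enum change described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the code in the
attached patch: the enum name DeserializationType, the two configuration keys,
and the helper methods are hypothetical, and a plain Map stands in for the job
configuration so the sketch runs without Avro or Hadoop on the classpath. It
also assumes that the old behaviour defaulted to the specific reader whenever
the reflect flag was unset.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class DeserializationTypeSketch {

    /** Hypothetical enum replacing the reflect/non-reflect boolean. */
    enum DeserializationType { GENERIC, SPECIFIC, REFLECT }

    // Placeholder configuration keys, not the constants defined in AvroJob.
    static final String TYPE_KEY = "example.input.deserialization.type";
    static final String LEGACY_REFLECT_KEY = "example.input.is.reflect";

    /** Prefer the new enum setting; otherwise fall back to the legacy boolean. */
    static DeserializationType resolve(Map<String, String> conf) {
        String explicit = conf.get(TYPE_KEY);
        if (explicit != null) {
            return DeserializationType.valueOf(explicit.trim().toUpperCase(Locale.ROOT));
        }
        // Assumption: an unset or false legacy flag meant "specific".
        boolean reflect = Boolean.parseBoolean(conf.get(LEGACY_REFLECT_KEY));
        return reflect ? DeserializationType.REFLECT : DeserializationType.SPECIFIC;
    }

    /** Name of the Avro DatumReader class that corresponds to each type. */
    static String readerClassName(DeserializationType type) {
        switch (type) {
            case GENERIC:  return "org.apache.avro.generic.GenericDatumReader";
            case REFLECT:  return "org.apache.avro.reflect.ReflectDatumReader";
            case SPECIFIC: return "org.apache.avro.specific.SpecificDatumReader";
            default: throw new IllegalArgumentException("Unknown type: " + type);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TYPE_KEY, "generic"); // force generic records for this job
        System.out.println(readerClassName(resolve(conf)));
    }
}
```

Keeping the legacy boolean as a fallback is one way the deprecated
setReflect-style setters could keep working while the new setters write the
enum value; whether the patch does this is not stated in the description.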

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
