[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756702#action_12756702
 ] 

Doug Cutting commented on MAPREDUCE-980:
----------------------------------------

> Any reason why we can't directly use generated classes ?

You've already cited the biggest reason: the generated classes don't provide 
constructors or accessors.  Long-term, we could enhance Avro to generate these, 
but I'm not sure we'd want to directly use the generated classes even then.

The wrappers provide considerable utility, including:
 - Javadoc comments.  We could perhaps generate these from documentation in the 
schema.
 - Visibility: The wrappers only provide public getters, not setters.  We could 
perhaps add visibility controls to the schema and/or generator.
 - Type conversion:  In both the version included in MAPREDUCE-157 and this 
version there's a fair amount of field-specific type conversion.  For example, 
we don't directly serialize JobID instances, but rather use JobID's toString() 
and forName() methods to convert these to and from strings for serialization.  
Similarly for counters, task ids, etc.  Ideally all of these would be naturally 
serializable using Avro, but, until they are, the wrappers make it easy to 
incorporate things like these.
 - Compatibility: If we update the schema then Avro will handle reading old 
data, but, without the wrappers, we'd be unable to provide a back-compatible 
API for accessing the old data.  So if we remove a field from the schema, with 
the wrappers we're able to deprecate the accessor and implement it in terms of 
new/remaining fields so that applications don't have to be upgraded.
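
As a rough illustration of the type-conversion and compatibility points above, 
a wrapper might look something like the sketch below.  All names here 
(GeneratedJobSubmitted, JobId, JobSubmittedEvent, submitTimeMillis, 
getSubmitTimeSeconds) are hypothetical stand-ins invented for the example, not 
the classes or fields from the patch; the real JobID and the Avro-generated 
classes differ in detail.  It assumes a schema whose seconds-based time field 
was replaced by a milliseconds-based one.

```java
// Hypothetical stand-in for an Avro-generated record: public fields only,
// no constructors or accessors.
class GeneratedJobSubmitted {
    public String jobid;           // job id is stored as a string
    public long submitTimeMillis;  // assumed to replace an older seconds field
}

// Minimal stand-in for a job id type that converts to and from strings.
final class JobId {
    private final String jtIdentifier;
    private final int id;

    JobId(String jtIdentifier, int id) {
        this.jtIdentifier = jtIdentifier;
        this.id = id;
    }

    // Parses strings of the form "job_<identifier>_<number>".
    static JobId forName(String s) {
        String[] parts = s.split("_");
        if (parts.length != 3 || !parts[0].equals("job")) {
            throw new IllegalArgumentException("Bad job id: " + s);
        }
        return new JobId(parts[1], Integer.parseInt(parts[2]));
    }

    @Override
    public String toString() {
        return String.format("job_%s_%04d", jtIdentifier, id);
    }
}

// The wrapper: public getters only, with type conversion and a deprecated
// accessor kept for callers written against the older schema.
final class JobSubmittedEvent {
    private final GeneratedJobSubmitted datum = new GeneratedJobSubmitted();

    JobSubmittedEvent(JobId id, long submitTimeMillis) {
        datum.jobid = id.toString();  // convert to a string for serialization
        datum.submitTimeMillis = submitTimeMillis;
    }

    /** Returns the id of the submitted job, parsed from its string form. */
    public JobId getJobId() {
        return JobId.forName(datum.jobid);
    }

    /** Returns the submit time in milliseconds. */
    public long getSubmitTimeMillis() {
        return datum.submitTimeMillis;
    }

    /** @deprecated the seconds field was removed; derived from milliseconds. */
    @Deprecated
    public long getSubmitTimeSeconds() {
        return datum.submitTimeMillis / 1000L;
    }
}
```

In this sketch, callers never see the generated class, so a field can be 
renamed or dropped in the schema while the deprecated getter keeps older 
application code compiling.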

So I'm not entirely convinced that using wrappers for stuff like this is a bad 
pattern long term.


> Modify JobHistory to use Avro for serialization instead of raw JSON
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-980
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Jothi Padmanabhan
>            Assignee: Doug Cutting
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using JSON format.  This can 
> be modified to use Avro instead. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
