[ 
https://issues.apache.org/jira/browse/AVRO-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396264#comment-13396264
 ] 

Jacob Metcalf commented on AVRO-1103:
-------------------------------------

I have not yet delved far enough into the innards of Hadoop to understand why 
Configuration.getClassLoader() does not work for me. I can see that it 
initializes its class loader as follows:

{quote}
 private ClassLoader classLoader;
 {
   classLoader = Thread.currentThread().getContextClassLoader();
   if (classLoader == null) {
     classLoader = Configuration.class.getClassLoader();
   }
 }
{quote}
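
For illustration, the same fallback can be written as a standalone helper with 
no Hadoop dependency. This is only a minimal sketch: the names LoaderFallback 
and resolve are hypothetical and not part of Hadoop. Because the quoted code 
is an instance initializer, the loader it picks is whatever context class 
loader the constructing thread had at that moment.

```java
/** Hypothetical helper mirroring the fallback logic quoted above; not Hadoop code. */
public class LoaderFallback {

    /**
     * Returns the current thread's context class loader, or the loader
     * that defined the anchor class when no context loader is set.
     */
    public static ClassLoader resolve(Class<?> anchor) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = anchor.getClassLoader();
        }
        return loader;
    }

    public static void main(String[] args) {
        // Prints whichever loader the calling thread would hand to a new object.
        System.out.println("Resolved loader: " + resolve(LoaderFallback.class));
    }
}
```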

When I debug 0.23.1 in standalone mode I can see that the Configuration's 
classLoader does not reference the jar containing my Avro specific classes 
from the /lib directory of my job jar. I agree that calling 
Class.forName(...).getClassLoader() is a sledgehammer to crack a nut. I can 
debug a bit more this week to try to work out why.
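
A rough diagnostic along these lines may help narrow it down. It is a sketch 
using only the JDK: LoaderChain, describe and canLoad are hypothetical names, 
and the class name passed to canLoad would be one of your own specific 
classes. It walks from a loader up through its parents, lists the URLs of any 
URLClassLoader it meets, and checks whether a class is visible through that 
loader without initializing it.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical debugging aid; not part of Hadoop or Avro. */
public class LoaderChain {

    /** Describes each loader from the given one up to (not including) the bootstrap loader. */
    public static List<String> describe(ClassLoader loader) {
        List<String> lines = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            StringBuilder sb = new StringBuilder(cl.getClass().getName());
            if (cl instanceof URLClassLoader) {
                // Only URLClassLoader exposes the jars and directories it searches.
                for (URL url : ((URLClassLoader) cl).getURLs()) {
                    sb.append("\n    ").append(url);
                }
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    /** Returns true if the named class is visible through the given loader. */
    public static boolean canLoad(String className, ClassLoader loader) {
        try {
            // 'false' avoids running static initializers while probing.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        for (String line : describe(context)) {
            System.out.println(line);
        }
        // Replace with the fully qualified name of a generated specific class.
        System.out.println(canLoad("java.lang.String", context));
    }
}
```

Running describe on the loader held by the Configuration and on the loader 
returned by Class.forName(...).getClassLoader() should show at which level of 
the hierarchy the job jar's /lib entries appear, if at all.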


For Hadoop 2 it's both pom and code changes. But the code changes were 
relatively easy, as the differences between the versions seemed to be 
confined to TaskAttemptContext & SequenceFileBase.


                
> New AvroDeserializer should Locate Appropriate Classloader
> ----------------------------------------------------------
>
>                 Key: AVRO-1103
>                 URL: https://issues.apache.org/jira/browse/AVRO-1103
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.0
>         Environment: Hadoop 0.23.1 with Avro jars replaced by 1.7 jars
> Specific data classes assembled into JAR with mapper/reducer
>            Reporter: Jacob Metcalf
>            Assignee: Doug Cutting
>             Fix For: 1.7.1
>
>         Attachments: AVRO-1103-for 0.23.1.patch, AVRO-1103.patch, 
> AVRO-1103.patch, AVRO-1103.patch
>
>
> Continuing on from AVRO-873 I believe some more work needs to be done to get 
> the MapReduce 2 APIs in Avro 1.7 working with Hadoop 0.23. Since it revolves 
> around classloaders it is complex to present a unit test which fails so I 
> will explain the problem:
> - By default SpecificDatumReader will use the classloader it was loaded from 
> to find a Specific class to deserialize into.
> - In earlier versions of Hadoop e.g. 0.20.2 Avro was not included so 
> typically you would bundle Avro into your job jar along with the Specific 
> classes so they would be on the same classpath.
>  
> - However, later versions of Hadoop such as 0.23 ship with Avro. Thus you find 
> that SpecificData.class.getClassLoader() is typically a parent loader 
> which just contains Hadoop components.
> - Thus when SpecificData goes to construct a Specific class from the schema 
> it cannot locate it and silently defaults to creating a GenericData.
> In AVRO-873 an additional constructor was added to SpecificData to force it 
> to use a different classloader. Thus to extend this fix to the new MR2 APIs:
> - AvroDeserializer could attempt to instantiate the class using 
> Class.forName() and from this get the appropriate Classloader and pass this 
> into the constructor of SpecificDatumReader.
> - Line 2771 of SpecificData.java is:
> bq. Class c = SpecificData.get().getClass(schema);
> - This would need to be changed to:
> bq. Class c = this.getClass(schema);
> I have raised this in the mail groups here: 
> http://search-hadoop.com/m/wVUf1aLCwd/classloader/v=threaded so apologies if 
> this is already being thought about.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
