[
https://issues.apache.org/jira/browse/HADOOP-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648921#comment-15648921
]
Andrew Wang edited comment on HADOOP-11804 at 11/8/16 10:00 PM:
----------------------------------------------------------------
Thanks for the rev, Sean. I tried it with Avro and got a NoClassDefFoundError for Log4J:
{noformat}
testSort(org.apache.avro.mapred.TestAvroTextSort) Time elapsed: 0.051 sec <<< ERROR!
java.lang.NoClassDefFoundError: org/apache/log4j/Level
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.hadoop.mapred.JobConf.<clinit>(JobConf.java:356)
at org.apache.avro.mapred.TestAvroTextSort.testSort(TestAvroTextSort.java:37)
{noformat}
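A quick way to confirm whether a class such as org.apache.log4j.Level is visible on the test classpath is a small probe like the one below. This is only a minimal sketch, not part of the patch; `ClasspathProbe` and `isLoadable` are hypothetical names introduced for illustration.

```java
// Hypothetical helper: reports whether a class can be resolved by the
// class loader that loaded this probe, without running its static initializers.
final class ClasspathProbe {

    static boolean isLoadable(String className) {
        try {
            // initialize=false: resolve the class only, do not run <clinit>
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or present but not linkable
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the NoClassDefFoundError above
        String name = "org.apache.log4j.Level";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Running it with the same classpath as the failing test should show whether the log4j jar is actually present.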
I think this is expected based on the contents of the hadoop-client-runtime
pom.xml, which marks log4j as optional. I manually added this dependency, and
then hit this:
{noformat}
testReadAvro(org.apache.avro.hadoop.io.TestAvroSequenceFile) Time elapsed: 0.016 sec <<< ERROR!
java.lang.NullPointerException: null
at org.apache.hadoop.io.serializer.SerializationFactory.<init>(SerializationFactory.java:58)
at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1248)
at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1207)
at org.apache.avro.hadoop.io.AvroSequenceFile$Writer.<init>(AvroSequenceFile.java:532)
at org.apache.avro.hadoop.io.TestAvroSequenceFile.writeSequenceFile(TestAvroSequenceFile.java:200)
at org.apache.avro.hadoop.io.TestAvroSequenceFile.testReadAvro(TestAvroSequenceFile.java:53)
{noformat}
I decompiled the shaded SerializationFactory class and noticed that the
relocation rewrote the config key string, which would explain the NPE: the
lookup for the rewritten key returns null. I think we need to add some kind of
exclusion for CommonConfigurationKeysPublic.
{code}
// before
if (conf.get(CommonConfigurationKeys.IO_SERIALIZATIONS_KEY).equals("")) {
// decompiled
if (conf.get("org.apache.hadoop.shaded.io.serializations").equals("")) {
{code}
Here's my Avro diff for master (without the log4j addition) if you want to try
this yourself:
https://gist.github.com/anonymous/c064c283348a2d1bbec00845678339f9
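To illustrate why the rewritten key leads to the NullPointerException above, here is a minimal sketch that uses a plain Map as a stand-in for the Hadoop Configuration. It assumes the original constant value is "io.serializations" (inferred from the decompiled string minus the shaded prefix) and that a lookup of an unset key returns null. `ShadedKeySketch` and its methods are hypothetical names, not Hadoop API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map plays the role of Configuration, and only
// the "get returns null for an unset key" behaviour is modelled.
final class ShadedKeySketch {

    // Assumed original value of IO_SERIALIZATIONS_KEY
    static final String ORIGINAL_KEY = "io.serializations";
    // Value seen in the decompiled, shaded SerializationFactory
    static final String REWRITTEN_KEY = "org.apache.hadoop.shaded.io.serializations";

    // Mirrors the shape of the check in the snippet above:
    // conf.get(key).equals("") throws if the key is not set.
    static boolean isEmptySetting(Map<String, String> conf, String key) {
        return conf.get(key).equals("");
    }

    // Returns true if the check blows up with a NullPointerException.
    static boolean failsWithNpe(Map<String, String> conf, String key) {
        try {
            isEmptySetting(conf, key);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Defaults are registered under the original key only.
        conf.put(ORIGINAL_KEY, "org.apache.hadoop.io.serializer.WritableSerialization");

        System.out.println("original key fails:  " + failsWithNpe(conf, ORIGINAL_KEY));
        System.out.println("rewritten key fails: " + failsWithNpe(conf, REWRITTEN_KEY));
    }
}
```

The defaults are only ever registered under the original key, so any class whose string constant was relocated looks up a key that nobody sets. That is why excluding the configuration key constants from relocation seems like the right direction.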
> POC Hadoop Client w/o transitive dependencies
> ---------------------------------------------
>
> Key: HADOOP-11804
> URL: https://issues.apache.org/jira/browse/HADOOP-11804
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: build
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Attachments: HADOOP-11804.1.patch, HADOOP-11804.2.patch,
> HADOOP-11804.3.patch, HADOOP-11804.4.patch, HADOOP-11804.5.patch,
> HADOOP-11804.6.patch, HADOOP-11804.7.patch
>
>
> make a hadoop-client-api and hadoop-client-runtime that e.g. HBase can use to
> talk with a Hadoop cluster without seeing any of the implementation
> dependencies.
> see proposal on parent for details.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)