Folks,
With the latest patch I submitted, I'm still having this problem. Since
the Sqoop I built was compiled against the Hadoop 0.23 jars, it won't
run against a Hadoop 0.20 installation. Do any of you have a suggestion
for me? Should I hack build.xml to pull in the older version of Hadoop,
or is that a can of worms I don't want to open?
--- wad
On 12/09/2011 03:07 PM, Eric Wadsworth wrote:
Folks,
So I'm working on https://issues.apache.org/jira/browse/SQOOP-384,
trying to make Sqoop backwards compatible with Apache Hadoop 0.20.x
clusters.
When I build Sqoop, Ivy downloads a number of jars, including these
Hadoop 0.23 snapshot jars:
hadoop-annotations-0.23.0-SNAPSHOT.jar
hadoop-auth-0.23.0-SNAPSHOT.jar
hadoop-common-0.23.0-SNAPSHOT.jar
hadoop-common-0.23.0-SNAPSHOT-tests.jar
hadoop-hdfs-0.23.0-SNAPSHOT.jar
hadoop-hdfs-0.23.0-SNAPSHOT-tests.jar
hadoop-mapreduce-client-common-0.23.0-SNAPSHOT.jar
hadoop-mapreduce-client-core-0.23.0-SNAPSHOT.jar
There is a binary incompatibility around JobContext: it was a class in
0.20 but became an interface in 0.23. I get this stack trace when I run
my own build of Sqoop against either CDH3 or Apache Hadoop 0.20:
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.JobContext, but interface was expected
    at org.apache.sqoop.config.ConfigurationHelper.getJobNumMaps(ConfigurationHelper.java:49)
    at com.cloudera.sqoop.config.ConfigurationHelper.getJobNumMaps(ConfigurationHelper.java:37)
    at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:120)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:119)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:179)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:413)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:97)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:380)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:146)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:182)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:221)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:230)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:239)
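To make the mechanism concrete, here's roughly the failing pattern (a
sketch, not the actual Sqoop code; getNumMaps just stands in for
ConfigurationHelper.getJobNumMaps):

import org.apache.hadoop.mapreduce.JobContext;

public class JobContextDemo {
    // Compiled against Hadoop 0.23, where JobContext is an interface,
    // javac emits an invokeinterface instruction for getConfiguration().
    // On a 0.20 cluster JobContext is a concrete class, so the JVM
    // rejects that instruction with IncompatibleClassChangeError the
    // first time this method executes. The same source compiles cleanly
    // against both versions; the breakage is purely in the bytecode.
    public static int getNumMaps(JobContext context) {
        return context.getConfiguration().getInt("mapred.map.tasks", 1);
    }
}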
This is due to Hadoop 0.23 breaking binary compatibility with prior
versions. As one write-up on the web puts it: "If you publish a public
library, you should avoid making incompatible binary changes as much as
possible to preserve what's known as 'binary backward compatibility'.
Updating dependency jars alone ideally shouldn't break the application
or the build."
This is going to be a problem: if we ship a jar built against Hadoop
0.23.x, it won't run against anything that doesn't have JobContext as
an interface. Perhaps the best solution is to rename the JobContext
interface in Hadoop? I'm not sure what would break if that happened,
but at least it wouldn't be a runtime error. More ideas?
--- wad