Hi,
I wrote a small test program that extracts data from a simple table in a
database on a remote cluster. However, it fails when I run it from Eclipse,
with the following exception:
12:36:08,993 WARN main mapred.JobClient:659 - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12:36:09,567 WARN main mapred.JobClient:776 - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:575)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 11 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
    at org.apache.hadoop.mapred.lib.db.DBInputFormat.configure(DBInputFormat.java:271)
    ... 16 more
Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:169)
    at org.apache.hadoop.mapred.lib.db.DBConfiguration.getConnection(DBConfiguration.java:123)
    at org.apache.hadoop.mapred.lib.db.DBInputFormat.configure(DBInputFormat.java:266)
    ... 16 more
I do have the mysql-connector jar under the $HADOOP_HOME/lib folder on every
server in the cluster, and I even tried the
DistributedCache.addArchiveToClassPath method (see the code below), with no
success. Note that the stack trace shows the failure happening inside the map
task's JVM (Child.main), not in my client. Can someone please help me figure
out what is going on here?
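As an aside on the symptom itself: Class.forName only consults the classpath of the JVM it runs in, so a successful driver check in the Eclipse client says nothing about what the task JVMs on the cluster can see. A minimal, Hadoop-free sketch of that distinction (the Demo class name and helper are just for this illustration):

```java
public class Demo {
    // Returns true if the named class is loadable from THIS JVM's classpath.
    static boolean loadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A JDK class is always loadable; whether com.mysql.jdbc.Driver is
        // loadable depends entirely on the classpath of the current JVM, so
        // the client and the task JVMs can legitimately disagree.
        System.out.println(loadable("java.lang.String"));
        System.out.println(loadable("com.mysql.jdbc.Driver"));
    }
}
```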
Here is my simple driver, which performs the remote submission of the job:
public int run(String[] arg0) throws Exception {
    System.out.println("Setting up job configuration....");
    Configuration conf = new Configuration();
    conf.set("mapred.job.tracker", "jobtracker.hostname:8021");
    conf.set("fs.default.name", "hdfs://namenode.hostname:9000");
    conf.set("keep.failed.task.files", "true");
    conf.set("mapred.child.java.opts", "-Xmx1024m");

    FileSystem fs = FileSystem.get(conf);
    fs.delete(new Path("/myfolder/dump_output/"), true);
    fs.mkdirs(new Path("/myfolder/libs/"));

    // Ship my job jar and the JDBC driver jar to HDFS...
    fs.copyFromLocalFile(
            new Path("C:/Users/me/.m2/repository/org/mylib/0.1-SNAPSHOT/myproject-0.1-SNAPSHOT-hadoop.jar"),
            new Path("/myfolder/libs/myproject-0.1-SNAPSHOT-hadoop.jar"));
    fs.copyFromLocalFile(
            new Path("C:/Users/me/.m2/repository/mysql/mysql-connector-java/5.1.17/mysql-connector-java-5.1.17.jar"),
            new Path("/myfolder/libs/mysql-connector-java-5.1.17.jar"));

    // ...and add them to the task classpath via the distributed cache.
    DistributedCache.addArchiveToClassPath(
            new Path("/myfolder/libs/myproject-0.1-SNAPSHOT-hadoop.jar"), conf, fs);
    DistributedCache.addArchiveToClassPath(
            new Path("/myfolder/libs/mysql-connector-java-5.1.17.jar"), conf, fs);

    JobConf job = new JobConf(conf);
    job.setJobName("Exporting Job");
    job.setJarByClass(MyMapper.class);
    job.setMapperClass(MyMapper.class);

    // Sanity check: the driver is loadable on the client side
    // (if it were not, Class.forName would throw here, before submission).
    Class<?> clazz = Class.forName("com.mysql.jdbc.Driver");
    if (clazz == null) {
        throw new RuntimeException("wow...");
    }

    Configuration.dumpConfiguration(conf, new PrintWriter(System.out));

    DBConfiguration.configureDB(job,
            "com.mysql.jdbc.Driver",
            "jdbc:mysql://mydbserver:3306/test?autoReconnect=true",
            "user", "password");

    String[] fields = { "employee_id", "name" };
    DBInputFormat.setInput(job, MyRecord.class, "employees", null,
            "employee_id", fields);
    FileOutputFormat.setOutputPath(job, new Path("/myfolder/dump_output/"));

    System.out.println("Submitting job....");
    JobClient.runJob(job);
    System.out.println("job info: " + job.getNumMapTasks());
    return 0;
}

public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new SimpleDriver(), args);
    System.out.println("Completed.");
    System.exit(exitCode);
}
I'm building against the hadoop-core 0.20.205.0 Maven dependency and running
the program from Eclipse. The myproject-0.1-SNAPSHOT-hadoop.jar contains my
classes, with its dependencies included under its /lib folder.
Any help would be greatly appreciated.
Thanks