We use a mix of libraries dumped on the Hadoop classpath and jars included within the job's jar.
BTW the blog does mention, when using option 3, to: "Restart the TaskTrackers when you are done. Do not forget to update the jar when the underlying software changes."

Getting the classpath config right can be a PITA; it helps to print it when you start a task (if you use HBase, the CP will be printed when it starts its ZK client).

J-D

On Sun, Sep 25, 2011 at 10:43 PM, Steinmaurer Thomas <[email protected]> wrote:
> Hello,
>
> regarding MR-job deployment, I read this Cloudera blog article:
> http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>
> In my case, I have to deploy the Oracle JDBC driver. I've tried the various
> options discussed in the article, and the only one which worked out of the box
> was including the JDBC jar file in my JAR file, in the lib folder. Copying
> the JDBC jar into HADOOP_HOME/lib etc. didn't work. Whenever the MR-job
> wasn't able to locate the JDBC driver, I got the infamous exception:
>
> java.io.IOException
>         at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat.getRecordWriter(DBOutputFormat.java:180)
>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:559)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>         at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
> While I can embed the JDBC library with each build of our MR-job, I would
> rather deploy the JDBC library into HADOOP_HOME/lib, because it is
> rather static and other MR-jobs might depend on it as well. The interesting
> thing is, when working with the Cloudera VMWare, a reboot after copying the
> library into HADOOP_HOME/lib helped.
>
> So, how are you deploying your MR-jobs into a real/live cluster without
> the need to restart something?
>
> Thanks a lot!
>
> Thomas
>
> _______________________________________________________
> DI Thomas Steinmaurer
> Industrial Researcher
> Software Competence Center Hagenberg GmbH
> Softwarepark 21, A-4232 Hagenberg, Austria
> UID: ATU 48056909 - FN: 184145b, Landesgericht Linz
> Tel. +43 7236 3343-896
> Fax +43 7236 3343-888
> mailto:[email protected]
> http://www.scch.at/
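
PS: for the "print the classpath when a task starts" suggestion, here is a rough sketch of what I mean. It is plain Java with no Hadoop dependency; the class name ClasspathCheck and its two helpers are made up for illustration, and oracle.jdbc.OracleDriver is only my assumption about which driver class the job needs. Something like this could be called from a mapper's or reducer's setup code so that the output ends up in the task logs. Note that it only reports what the java.class.path property says; jars pulled in through other class loaders may not show up there.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: dumps the JVM classpath and
// checks whether a given class (e.g. a JDBC driver) can be found.
public class ClasspathCheck {

    // Split the java.class.path system property into its non-empty entries.
    static List<String> classpathEntries() {
        String cp = System.getProperty("java.class.path", "");
        List<String> entries = new ArrayList<>();
        for (String entry : cp.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // True if the named class is visible to this class's class loader.
    // The class is looked up without being initialized.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String entry : classpathEntries()) {
            System.err.println("classpath entry: " + entry);
        }
        // Assumed driver class name; pass a different one as the first argument.
        String driver = args.length > 0 ? args[0] : "oracle.jdbc.OracleDriver";
        System.err.println(driver + " loadable: " + isLoadable(driver));
    }
}
```

If the driver shows up as not loadable inside a task while the jar sits in HADOOP_HOME/lib, that would at least be consistent with the TaskTrackers not having been restarted since the jar was copied there.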
