It's a workaround, but you should be able to set the correct configuration manually with the Source.inputConf(...) [1] method: set the additional property that the MR1 code reads, "mapred.jdbc.driver.class", to your JDBC driver class.
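
To make the mismatch concrete, here is a minimal sketch in plain Java. It is not Crunch or Hadoop code: java.util.Properties stands in for the job configuration, the class and method names are made up for illustration, and the driver class name is a placeholder. The two property keys are the ones named in the report quoted below.

```java
import java.util.Properties;

// Stand-in model only: Properties plays the role of the Hadoop job configuration.
final class DriverPropertyDemo {
    // Key that DataBaseSource sets, according to the report quoted below.
    static final String KEY_SET_BY_SOURCE = "mapreduce.jdbc.driver.class";
    // Key that MR1's DBConfiguration#getConnection reads, per the same report.
    static final String KEY_READ_BY_MR1 = "mapred.jdbc.driver.class";

    // Models the current behaviour: only the new-style key is written.
    static Properties configureSource(String driverClassName) {
        Properties conf = new Properties();
        conf.setProperty(KEY_SET_BY_SOURCE, driverClassName);
        return conf;
    }

    // Models the workaround: additionally write the key that MR1 looks up.
    // With Crunch this step would be source.inputConf(KEY_READ_BY_MR1, driverClassName).
    static Properties applyWorkaround(Properties conf, String driverClassName) {
        Properties patched = new Properties();
        patched.putAll(conf);
        patched.setProperty(KEY_READ_BY_MR1, driverClassName);
        return patched;
    }

    // Models the MR1 lookup: returns null when the key is absent.
    static String driverSeenByMr1(Properties conf) {
        return conf.getProperty(KEY_READ_BY_MR1);
    }

    public static void main(String[] args) {
        String driver = "com.example.Driver"; // hypothetical driver class name
        Properties before = configureSource(driver);
        Properties after = applyWorkaround(before, driver);
        System.out.println("before workaround: " + driverSeenByMr1(before));
        System.out.println("after workaround:  " + driverSeenByMr1(after));
    }
}
```

Run as is, the lookup yields null before the workaround and the driver class name afterwards. In a real pipeline the only change needed should be the single inputConf call on the source, passing the fully qualified name of your actual driver class.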

[1] - http://crunch.apache.org/apidocs/0.8.2/org/apache/crunch/Source.html#inputConf(java.lang.String, java.lang.String)

On Tue, May 20, 2014 at 6:06 PM, Josh Wills <[email protected]> wrote:

> crunch-contrib was likely the Hadoop 2.0.0 APIs, not the MR1 APIs. I
> didn't realize there was a difference between the two in the value of that
> property, which is certainly my bad. I rarely (ever?) read anything from
> databases as part of MR jobs, and hadn't run into that one before.
>
>
> On Tue, May 20, 2014 at 3:21 PM, Nathan Schile <[email protected]> wrote:
>
>> I am having trouble using the DataBaseSource class from crunch-contrib. I
>> am using version 0.8.2+32-cdh4.4.0 of crunch-contrib and 2.0.0-mr1-cdh4.4.0
>> of hadoop-core. The DataBaseSource class is setting the property
>> "mapreduce.jdbc.driver.class" on the Hadoop configuration [1] to specify
>> the JDBC driver to use, while when trying to get a connection to the
>> database in DBConfiguration#getConnection [2] it is reading property
>> "mapred.jdbc.driver.class" to retrieve the driver class to use. This
>> property mismatch is causing the connection to not be established. I would
>> have expected "mapred.jdbc.driver.class" property to be used within
>> DataBaseSource since MR1 is being used. I decompiled
>> crunch-contrib:2.0.0-mr1-cdh4.4.0 jar using [3] and looked at the
>> DataBaseSource class and it was using "mapreduce.jdbc.driver.class". It
>> makes me think that crunch-contrib:2.0.0-mr1-cdh4.4.0 was compiled with a
>> hadoop-core version that was not 2.0.0-mr1-cdh4.4.0. Has anyone ran into
>> this issue before? Thanks.
>>
>>
>> [1]
>> https://github.com/apache/crunch/blob/master/crunch-contrib/src/main/java/org/apache/crunch/contrib/io/jdbc/DataBaseSource.java#L55
>>
>> [2]
>> https://repository.cloudera.com/cloudera/public/org/apache/hadoop/hadoop-core/2.0.0-mr1-cdh4.4.0/
>>
>> [3] http://jd.benow.ca/
>>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
