crunch-contrib was likely the Hadoop 2.0.0 APIs, not the MR1 APIs. I didn't realize there was a difference between the two in the value of that property, which is certainly my bad. I rarely (ever?) read anything from databases as part of MR jobs, and hadn't run into that one before.
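A possible stopgap is to mirror the driver class under both property names, so that whichever DBConfiguration is on the classpath finds it. The sketch below is illustrative only: it uses java.util.Properties as a stand-in for the Hadoop Configuration, the class and method names are hypothetical, and the only things taken from this thread are the two property names.

```java
import java.util.Properties;

public class DriverPropertyBridge {
    // The two property names quoted in this thread.
    static final String MR2_DRIVER_KEY = "mapreduce.jdbc.driver.class";
    static final String MR1_DRIVER_KEY = "mapred.jdbc.driver.class";

    /**
     * If only one of the two keys is set, copy its value to the other,
     * so a reader using either name sees the same driver class.
     * Existing values are never overwritten.
     */
    static void mirrorDriverProperty(Properties conf) {
        String mr2 = conf.getProperty(MR2_DRIVER_KEY);
        String mr1 = conf.getProperty(MR1_DRIVER_KEY);
        if (mr1 == null && mr2 != null) {
            conf.setProperty(MR1_DRIVER_KEY, mr2);
        } else if (mr2 == null && mr1 != null) {
            conf.setProperty(MR2_DRIVER_KEY, mr1);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Hypothetical driver class name, standing in for the real JDBC driver.
        conf.setProperty(MR2_DRIVER_KEY, "org.example.Driver");
        mirrorDriverProperty(conf);
        System.out.println(MR1_DRIVER_KEY + "=" + conf.getProperty(MR1_DRIVER_KEY));
    }
}
```

With a real job the same idea would mean setting the second key on the Hadoop Configuration before DBConfiguration#getConnection runs. Whether that can be done after DataBaseSource has configured the job is an assumption that has not been verified here.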

On Tue, May 20, 2014 at 3:21 PM, Nathan Schile <[email protected]> wrote:

> I am having trouble using the DataBaseSource class from crunch-contrib. I
> am using version 0.8.2+32-cdh4.4.0 of crunch-contrib and 2.0.0-mr1-cdh4.4.0
> of hadoop-core. The DataBaseSource class is setting the property
> "mapreduce.jdbc.driver.class" on the Hadoop configuration [1] to specify
> the JDBC driver to use, while when trying to get a connection to the
> database in DBConfiguration#getConnection [2] it is reading property
> "mapred.jdbc.driver.class" to retrieve the driver class to use. This
> property mismatch is causing the connection to not be established. I would
> have expected "mapred.jdbc.driver.class" property to be used within
> DataBaseSource since MR1 is being used. I decompiled
> crunch-contrib:2.0.0-mr1-cdh4.4.0 jar using [3] and looked at the
> DataBaseSource class and it was using "mapreduce.jdbc.driver.class". It
> makes me think that crunch-contrib:2.0.0-mr1-cdh4.4.0 was compiled with a
> hadoop-core version that was not 2.0.0-mr1-cdh4.4.0. Has anyone run into
> this issue before? Thanks.
>
> [1] https://github.com/apache/crunch/blob/master/crunch-contrib/src/main/java/org/apache/crunch/contrib/io/jdbc/DataBaseSource.java#L55
> [2] https://repository.cloudera.com/cloudera/public/org/apache/hadoop/hadoop-core/2.0.0-mr1-cdh4.4.0/
> [3] http://jd.benow.ca/

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
