I am running Hadoop 2.9.1 and doing a reduce-side join in which the reduce function performs the local join using SQL, but I am getting this error (for MySQL):

    java.sql.SQLException: No suitable driver found for jdbc:mysql://localhost:3306/acm_ex

It comes from this line of code:

    Connection connection = DriverManager.getConnection("jdbc:mysql://localhost:3306/acm_ex", "root", "root");

*Each computer on the cluster has MySQL installed with the database acm_ex.

I have a Maven project with the SQL dependencies as follows:

    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.39</version>
    </dependency>
    <dependency>
        <groupId>com.microsoft.sqlserver</groupId>
        <artifactId>mssql-jdbc</artifactId>
        <version>7.0.0.jre8</version>
    </dependency>

I compile the project, build a jar from it, and try to run it with the following reduce function:

    public void reduce(TextPair key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        try {
            Class.forName("com.mysql.jdbc.Driver").newInstance();
        } catch (Exception e) {
            System.out.println(e.toString());
        }
        try {
            Connection connection = DriverManager.getConnection("jdbc:mysql://localhost:3306/acm_ex", "root", "root");
            Statement statement = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
            LOG.info("SQL- connection: " + connection + " statement: " + statement);
            //create 3 tables names
            . . .
        } //try
    }//reduce

The reduce code works perfectly when I run it locally with Eclipse (user and password are both "root"), but the same code fails when it runs inside Hadoop's reduce function.

I have tried adding the mysql-connector-java jar to the classpath myself, although Maven had already done so, and it didn't help. I am not sure whether the cause is:

- permissions on port 3306 for the reduce container,
- a Maven problem, or
- a hostname problem.

Does anyone know how to solve this particular issue, or know another way to do a reduce-side join with SQL? I am familiar with MySQL, but I can switch if you believe another database would make a difference.

*Using Hive or a map-side join is not an option, and naive for loops work but are of course not as fast as SQL.
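
One way to tell a classpath problem apart from a port or hostname problem might be to log, from inside the reduce task, whether the driver class can be loaded at all and which JDBC drivers DriverManager currently sees. The following is only a minimal sketch: the class DriverCheck and its method names are hypothetical and not part of the project above.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original project.
public class DriverCheck {

    /** Returns true if the named class can be found by the current class loader. */
    public static boolean isClassPresent(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = DriverCheck.class.getClassLoader();
        }
        try {
            // 'false' = locate the class without running its static initializer.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the class names of the JDBC drivers visible to this caller. */
    public static List<String> registeredDrivers() {
        List<String> names = new ArrayList<>();
        for (Driver driver : Collections.list(DriverManager.getDrivers())) {
            names.add(driver.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("com.mysql.jdbc.Driver loadable: "
                + isClassPresent("com.mysql.jdbc.Driver"));
        System.out.println("Registered JDBC drivers: " + registeredDrivers());
    }
}
```

Calling the two methods from the reducer and writing the results to the task log would show what the container actually sees. If the driver class is not loadable there, the connector jar presumably never reached the task's classpath, which would point away from port 3306 or the hostname.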