That's correct: MS SQL Server is not supported through the JDBC data source at this time. In my environment, we've been using Hadoop streaming to extract data from multiple SQL Servers, push it into HDFS, and then create Hive tables and/or convert the data to Parquet, so that Spark can access it directly. Because I use SQL Server heavily, I've been thinking about helping to extend the JDBC data source so that it can be supported - but alas, I haven't found the time yet ;)

On Tue, Apr 7, 2015 at 6:52 AM ARose <ashley.r...@telarix.com> wrote:

> I am having the same issue with my java application.
>
> String url = "jdbc:sqlserver://" + host + ":1433;DatabaseName=" +
>         database + ";integratedSecurity=true";
> String driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
>
> SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
> JavaSparkContext sc = new JavaSparkContext(conf);
> SQLContext sqlContext = new SQLContext(sc);
>
> Map<String, String> options = new HashMap<>();
> options.put("driver", driver);
> options.put("url", url);
> options.put("dbtable", "tbTableName");
>
> DataFrame jdbcDF = sqlContext.load("jdbc", options);
> jdbcDF.printSchema();
> jdbcDF.show();
>
> It prints the schema of the DataFrame just fine, but as soon as it tries to
> evaluate it for the show() call, I get a ClassNotFoundException for the
> driver. But the driver is definitely included as a dependency, so is MS SQL
> Server just not supported?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Microsoft-SQL-jdbc-support-from-spark-sql-tp22399p22404.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
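
One thing that may be worth ruling out for the ClassNotFoundException in the quoted message is whether the driver class is actually loadable in the JVM that runs the code. Below is a minimal sketch using only the plain JDK. The class name JdbcOptionsCheck, its method names, and the "dbhost"/"mydb" values are hypothetical placeholders of mine; the URL format, driver class name, and option keys are copied from the quoted code. It only checks the JVM it runs in, so it says nothing about whether the driver is visible on the Spark executors.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spark or of the SQL Server driver.
class JdbcOptionsCheck {

    // Driver class name taken from the quoted message.
    static final String SQLSERVER_DRIVER = "com.microsoft.sqlserver.jdbc.SQLServerDriver";

    // Builds the same JDBC URL as the quoted code (port 1433, integrated security).
    static String buildUrl(String host, String database) {
        return "jdbc:sqlserver://" + host + ":1433;DatabaseName=" + database
                + ";integratedSecurity=true";
    }

    // Collects the options that the quoted code passes to sqlContext.load("jdbc", options).
    static Map<String, String> buildOptions(String host, String database, String table) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("driver", SQLSERVER_DRIVER);
        options.put("url", buildUrl(host, database));
        options.put("dbtable", table);
        return options;
    }

    // Returns true if the named class can be loaded and initialized in this JVM.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "dbhost" and "mydb" are placeholder values.
        Map<String, String> options = buildOptions("dbhost", "mydb", "tbTableName");
        System.out.println("options: " + options);
        System.out.println("driver loadable in this JVM: " + isLoadable(options.get("driver")));
    }
}
```

If this reports that the driver is loadable locally while show() still fails, that would at least suggest the class is missing somewhere other than the JVM where the check ran, though I haven't verified that against this particular setup.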