On Wed, May 28, 2014 at 11:39 PM, Venkat Subramanian <vsubr...@gmail.com> wrote:

> We are planning to use the latest Spark SQL on RDDs. If a third party
> application wants to connect to Spark via JDBC, does Spark SQL have
> support?
> (We want to avoid going through Shark/Hive JDBC layer as we need good
> performance).

We don't have a full release yet, but there is a branch on the Shark GitHub
repository that has a version of SharkServer2 that uses Spark SQL. We also
plan to port the Shark CLI, but this is not yet finished. You can find this
branch along with documentation here:

https://github.com/amplab/shark/tree/sparkSql

Note that this version has not yet received much testing (outside of the
integration tests that are run on Spark SQL). That said, I would love for
people to test it out and report any problems or missing features. Any help
here would be greatly appreciated!

> BTW, we also want to do the same for Spark Streaming - Will Spark SQL work
> on DStreams (since the underlying structure is RDD anyway) and can we
> expose
> the streaming DStream RDD through JDBC via Spark SQL for Realtime
> analytics.

We have talked about doing this, but this is not currently on the near-term
road map.
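
For the JDBC question above, a third-party client would presumably talk to the SharkServer2 branch the same way it talks to HiveServer2. The following is only a rough sketch under that assumption: the `jdbc:hive2://` URL scheme, the empty credentials, and the need for a Hive JDBC driver on the classpath are assumptions, not something confirmed for this branch, and the class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical client sketch; not part of Shark or Spark SQL.
final class SharkJdbcSketch {

    // Assumption: the server accepts HiveServer2-style JDBC URLs.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs one query and prints the first column of every returned row.
    // Assumption: a Hive JDBC driver is on the classpath and the server
    // accepts an empty user name and password.
    static void printFirstColumn(String url, String sql) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, "", "");
             Statement stmt = conn.createStatement();
             ResultSet rows = stmt.executeQuery(sql)) {
            while (rows.next()) {
                System.out.println(rows.getString(1));
            }
        }
    }
}
```

With a server running, a call might look like `SharkJdbcSketch.printFirstColumn(SharkJdbcSketch.jdbcUrl("localhost", 10000, "default"), "SELECT key FROM src LIMIT 10")`, where the host, port, database, and table name are placeholders to replace with your own.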