Hi Todd,

Thanks for the help. I was able to get DSE working with Tableau per the link Mohammed provided. Now I am trying to figure out whether I can write Spark SQL queries from Tableau and get data from DSE. My end goal is a web-based tool where I can write SQL queries that pull data from Cassandra.
With Zeppelin, I was able to build and run it on EC2, but I am not sure the configuration is right. I am pointing to a Spark master on a remote DSE node, and all the Spark and Spark SQL dependencies are on that remote node. I am not sure whether I also need to install Spark and its dependencies on the web UI (Zeppelin) node, and I am not sure whether this thread is the right place to discuss Zeppelin. Thanks once again for all the help.

Thanks,
Pawan Venugopal

On Fri, Apr 3, 2015 at 11:48 AM, Todd Nist <tsind...@gmail.com> wrote:

> @Pawan
>
> Not sure if you have seen this or not, but here is a good example by
> Jonathan Lacefield of DataStax on hooking up Spark SQL with DSE; adding
> Tableau is as simple as Mohammed stated with DSE:
> https://github.com/jlacefie/sparksqltest
>
> HTH,
> Todd
>
> On Fri, Apr 3, 2015 at 2:39 PM, Todd Nist <tsind...@gmail.com> wrote:
>
>> Hi Mohammed,
>>
>> Not sure if you have tried this or not. You could try using the below
>> API to start the Thrift server with an existing context:
>>
>> https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala#L42
>>
>> The one thing that Michael Armbrust @ Databricks recommended was this:
>>
>>> You can start a JDBC server with an existing context.
>>> See my answer here:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Standard-SQL-tool-access-to-SchemaRDD-td20197.html
>>
>> So something like this, based on the example from Cheng Lian:
>>
>> *Server*
>>
>> import org.apache.spark.sql.hive.HiveContext
>> import org.apache.spark.sql.catalyst.types._
>>
>> val sparkContext = sc
>> import sparkContext._
>> val sqlContext = new HiveContext(sparkContext)
>> import sqlContext._
>> makeRDD((1, "hello") :: (2, "world") :: Nil).toSchemaRDD.cache().registerTempTable("t")
>> // replace the above with C* + the spark-cassandra-connector to generate
>> // the SchemaRDD and registerTempTable
>>
>> import org.apache.spark.sql.hive.thriftserver._
>> HiveThriftServer2.startWithContext(sqlContext)
>>
>> Then start up:
>>
>> ./bin/beeline -u jdbc:hive2://localhost:10000/default
>> 0: jdbc:hive2://localhost:10000/default> select * from t;
>>
>> I have not tried this yet from Tableau. My understanding is that the
>> temp table is only valid as long as the sqlContext is, so if one terminates
>> the code representing the *Server* and then restarts the standard
>> Thrift server (sbin/start-thriftserver ...), the table won't be available.
>>
>> Another possibility is to perhaps use the tuplejump cash project,
>> https://github.com/tuplejump/cash.
>>
>> HTH.
>>
>> -Todd
>>
>> On Fri, Apr 3, 2015 at 11:11 AM, pawan kumar <pkv...@gmail.com> wrote:
>>
>>> Thanks, Mohammed. Will give it a try today. We would also need the
>>> Spark SQL piece, as we are migrating our data store from Oracle to C* and it
>>> would be easier to maintain all the reports rather than recreating each one
>>> from scratch.
>>>
>>> Thanks,
>>> Pawan Venugopal
>>>
>>> On Apr 3, 2015 7:59 AM, "Mohammed Guller" <moham...@glassbeam.com> wrote:
>>>
>>>> Hi Todd,
>>>>
>>>> We are using Apache C* 2.1.3, not DSE. We got Tableau to work directly
>>>> with C* using the ODBC driver, but now would like to add Spark SQL to the
>>>> mix.
>>>> I haven’t been able to find any documentation on how to make this
>>>> combination work.
>>>>
>>>> We are using the Spark-Cassandra-Connector in our applications, but
>>>> haven’t been able to figure out how to get the Spark SQL Thrift Server to
>>>> use it and connect to C*. That is the missing piece. Once we solve that
>>>> piece of the puzzle, Tableau should be able to see the tables in C*.
>>>>
>>>> Hi Pawan,
>>>>
>>>> Tableau + C* is pretty straightforward, especially if you are using
>>>> DSE. Create a new DSN in Tableau using the ODBC driver that comes with DSE.
>>>> Once you connect, Tableau lets you use a C* keyspace as the schema and
>>>> column families as tables.
>>>>
>>>> Mohammed
>>>>
>>>> *From:* pawan kumar [mailto:pkv...@gmail.com]
>>>> *Sent:* Friday, April 3, 2015 7:41 AM
>>>> *To:* Todd Nist
>>>> *Cc:* user@spark.apache.org; Mohammed Guller
>>>> *Subject:* Re: Tableau + Spark SQL Thrift Server + Cassandra
>>>>
>>>> Hi Todd,
>>>>
>>>> Thanks for the link. I would be interested in this solution. I am using
>>>> DSE for Cassandra. Would you provide me with info on connecting to DSE,
>>>> either through Tableau or Zeppelin? The goal here is to query Cassandra
>>>> through Spark SQL so that I can perform joins and group-bys in my queries.
>>>> Are you able to perform Spark SQL queries with Tableau?
>>>>
>>>> Thanks,
>>>> Pawan Venugopal
>>>>
>>>> On Apr 3, 2015 5:03 AM, "Todd Nist" <tsind...@gmail.com> wrote:
>>>>
>>>> What version of Cassandra are you using? Are you using DSE or the
>>>> stock Apache Cassandra version? I have connected it with DSE, but have not
>>>> attempted it with the standard Apache Cassandra version.
>>>>
>>>> FWIW,
>>>> http://www.datastax.com/dev/blog/datastax-odbc-cql-connector-apache-cassandra-datastax-enterprise
>>>> provides an ODBC driver for accessing C* from Tableau. Granted, it does not
>>>> provide all the goodness of Spark.
>>>> Are you attempting to leverage the spark-cassandra-connector for this?
>>>>
>>>> On Thu, Apr 2, 2015 at 10:20 PM, Mohammed Guller <moham...@glassbeam.com> wrote:
>>>>
>>>> Hi –
>>>>
>>>> Is anybody using Tableau to analyze data in Cassandra through the Spark
>>>> SQL Thrift Server?
>>>>
>>>> Thanks!
>>>>
>>>> Mohammed
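[Editor's note] The placeholder comment in Todd's *Server* snippet earlier in the thread ("replace the above with C* + the spark-cassandra-connector") can be filled in roughly as below. This is a sketch only, not a tested recipe: it assumes a 2015-era spark-cassandra-connector (1.2+) on the driver classpath, `spark.cassandra.connection.host` set to a reachable C* node, and a hypothetical keyspace `test_ks` containing a table `users`.

```scala
// Sketch: expose a Cassandra table through the Spark SQL Thrift server
// by registering it on a HiveContext and starting the server with that
// context. Keyspace/table names below are hypothetical.
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

val sqlContext = new HiveContext(sc)

// Uses the connector's Data Sources API provider
// (org.apache.spark.sql.cassandra) instead of makeRDD(...):
sqlContext.sql(
  """CREATE TEMPORARY TABLE users
    |USING org.apache.spark.sql.cassandra
    |OPTIONS (keyspace "test_ks", table "users")""".stripMargin)

// beeline/Tableau can now query `users` for as long as this context lives.
HiveThriftServer2.startWithContext(sqlContext)
```

As Todd notes above, the temp table disappears when this context is shut down; the standard `sbin/start-thriftserver.sh` creates its own context and will not see it.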
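[Editor's note] On the Zeppelin question at the top of the thread: in the 0.5-era builds, pointing Zeppelin at a remote Spark master was done in `conf/zeppelin-env.sh`, and the Zeppelin node does need a local Spark distribution for `SPARK_HOME` even when the master is remote. The host name and paths below are hypothetical; this is a configuration sketch, not a verified setup.

```shell
# conf/zeppelin-env.sh (sketch; host name and paths are hypothetical)

# Remote Spark master running on the DSE analytics node:
export MASTER=spark://dse-node1.example.com:7077

# Local Spark install used by Zeppelin's Spark interpreter:
export SPARK_HOME=/opt/spark

# Point the connector at a Cassandra node (connector jars must also be
# on the interpreter classpath):
export ZEPPELIN_JAVA_OPTS="-Dspark.cassandra.connection.host=dse-node1.example.com"
```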