Hi Todd,

Thanks for the help. I was able to get DSE working with Tableau as per
the link provided by Mohammed. Now I am trying to figure out whether I can
write Spark SQL queries from Tableau and get data from DSE. My end goal is
a web-based tool where I can write SQL queries that pull data from
Cassandra.

With Zeppelin, I was able to build and run it in EC2, but I am not sure
the configuration is right. I am pointing to a Spark master on a remote
DSE node, and all the Spark and Spark SQL dependencies live on that remote
node. I am not sure whether I also need to install Spark and its
dependencies on the web UI (Zeppelin) node.
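
For reference, my conf/zeppelin-env.sh currently looks roughly like this
(the host and path are placeholders for my setup):

# conf/zeppelin-env.sh -- host and path below are placeholders
export MASTER=spark://<remote-dse-node>:7077   # remote DSE Spark master
export SPARK_HOME=/opt/spark                   # local Spark install, if one turns out to be needed here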

I am not sure this thread is the right place to be discussing Zeppelin.

Thanks once again for all the help.

Thanks,
Pawan Venugopal


On Fri, Apr 3, 2015 at 11:48 AM, Todd Nist <tsind...@gmail.com> wrote:

> @Pawan
>
> Not sure if you have seen this or not, but here is a good example by
> Jonathan Lacefield of DataStax on hooking up Spark SQL with DSE; adding
> Tableau is then as simple as Mohammed stated:
> https://github.com/jlacefie/sparksqltest.
>
> HTH,
> Todd
>
> On Fri, Apr 3, 2015 at 2:39 PM, Todd Nist <tsind...@gmail.com> wrote:
>
>> Hi Mohammed,
>>
>> Not sure if you have tried this or not.  You could try using the API
>> below to start the Thrift server with an existing context.
>>
>>
>> https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala#L42
>>
>> The one thing that Michael Armbrust @ Databricks recommended was this:
>>
>>> You can start a JDBC server with an existing context.  See my answer
>>> here:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Standard-SQL-tool-access-to-SchemaRDD-td20197.html
>>
>> So something like this, based on an example from Cheng Lian:
>>
>> *Server*
>>
>> import org.apache.spark.sql.hive.HiveContext
>> import org.apache.spark.sql.catalyst.types._
>>
>> val sparkContext = sc
>> import sparkContext._
>> val sqlContext = new HiveContext(sparkContext)
>> import sqlContext._
>>
>> makeRDD((1, "hello") :: (2, "world") :: Nil).toSchemaRDD.cache().registerTempTable("t")
>> // replace the above with C* + the spark-cassandra-connector to generate
>> // the SchemaRDD and registerTempTable
>>
>> import org.apache.spark.sql.hive.thriftserver._
>> HiveThriftServer2.startWithContext(sqlContext)
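>>
>> For the C* piece, a rough (untested) sketch with the
>> spark-cassandra-connector might look like the following. The keyspace
>> test and table kv are made-up names, and spark.cassandra.connection.host
>> must already be set on the SparkConf:
>>
>> import com.datastax.spark.connector._
>>
>> // made-up schema: CREATE TABLE test.kv (k int PRIMARY KEY, v text)
>> case class KV(k: Int, v: String)
>> // read the table as RDD[KV], convert to a SchemaRDD via the implicit
>> // from "import sqlContext._" above, and register it for SQL access
>> sc.cassandraTable[KV]("test", "kv").toSchemaRDD.cache().registerTempTable("kv")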
>>
>> Then start beeline and query the temp table:
>>
>> ./bin/beeline -u jdbc:hive2://localhost:10000/default
>> 0: jdbc:hive2://localhost:10000/default> select * from t;
>>
>>
>> I have not tried this yet from Tableau.  My understanding is that the
>> temp table is only valid as long as the sqlContext is, so if one
>> terminates the code representing the *Server* above and then restarts the
>> standard Thrift server, sbin/start-thriftserver ..., the table will no
>> longer be available.
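>>
>> If instead one wants the standalone Thrift server to reach C*, a rough
>> (untested) sketch would be to put the spark-cassandra-connector assembly
>> on its classpath and register the table from beeline. This assumes a
>> connector build that implements the data source API; the jar path, host,
>> keyspace, and table names below are placeholders:
>>
>> ./sbin/start-thriftserver.sh \
>>   --jars /path/to/spark-cassandra-connector-assembly.jar \
>>   --conf spark.cassandra.connection.host=127.0.0.1
>>
>> then, from beeline:
>>
>> CREATE TEMPORARY TABLE kv
>> USING org.apache.spark.sql.cassandra
>> OPTIONS (keyspace "test", table "kv");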
>>
>> Another possibility might be the tuplejump cash project,
>> https://github.com/tuplejump/cash.
>>
>> HTH.
>>
>> -Todd
>>
>> On Fri, Apr 3, 2015 at 11:11 AM, pawan kumar <pkv...@gmail.com> wrote:
>>
>>> Thanks, Mohammed. Will give it a try today. We would also need the
>>> Spark SQL piece, as we are migrating our data store from Oracle to C* and
>>> it would be easier to maintain all the reports rather than recreating each
>>> one from scratch.
>>>
>>> Thanks,
>>> Pawan Venugopal.
>>> On Apr 3, 2015 7:59 AM, "Mohammed Guller" <moham...@glassbeam.com>
>>> wrote:
>>>
>>>>  Hi Todd,
>>>>
>>>>
>>>>
>>>> We are using Apache C* 2.1.3, not DSE. We got Tableau to work directly
>>>> with C* using the ODBC driver, but now would like to add Spark SQL to the
>>>> mix. I haven’t been able to find any documentation for how to make this
>>>> combination work.
>>>>
>>>>
>>>>
>>>> We are using the Spark-Cassandra-Connector in our applications, but
>>>> haven’t been able to figure out how to get the Spark SQL Thrift Server to
>>>> use it and connect to C*. That is the missing piece. Once we solve that
>>>> piece of the puzzle, Tableau should be able to see the tables in C*.
>>>>
>>>>
>>>>
>>>> Hi Pawan,
>>>>
>>>> Tableau + C* is pretty straightforward, especially if you are using
>>>> DSE. Create a new DSN in Tableau using the ODBC driver that comes with DSE.
>>>> Once you connect, Tableau lets you use a C* keyspace as a schema and column
>>>> families as tables.
>>>>
>>>>
>>>>
>>>> Mohammed
>>>>
>>>>
>>>>
>>>> *From:* pawan kumar [mailto:pkv...@gmail.com]
>>>> *Sent:* Friday, April 3, 2015 7:41 AM
>>>> *To:* Todd Nist
>>>> *Cc:* user@spark.apache.org; Mohammed Guller
>>>> *Subject:* Re: Tableau + Spark SQL Thrift Server + Cassandra
>>>>
>>>>
>>>>
>>>> Hi Todd,
>>>>
>>>> Thanks for the link. I would be interested in this solution. I am using
>>>> DSE for Cassandra. Could you provide me with info on connecting to DSE,
>>>> either through Tableau or Zeppelin? The goal here is to query Cassandra
>>>> through Spark SQL so that I can perform joins and group-bys in my queries.
>>>> Are you able to run Spark SQL queries from Tableau?
>>>>
>>>> Thanks,
>>>> Pawan Venugopal
>>>>
>>>> On Apr 3, 2015 5:03 AM, "Todd Nist" <tsind...@gmail.com> wrote:
>>>>
>>>> What version of Cassandra are you using?  Are you using DSE or the
>>>> stock Apache Cassandra version?  I have connected it with DSE, but have not
>>>> attempted it with the standard Apache Cassandra version.
>>>>
>>>>
>>>>
>>>> FWIW,
>>>> http://www.datastax.com/dev/blog/datastax-odbc-cql-connector-apache-cassandra-datastax-enterprise
>>>> provides an ODBC driver for accessing C* from Tableau.  Granted, it does
>>>> not provide all the goodness of Spark.  Are you attempting to leverage the
>>>> spark-cassandra-connector for this?
>>>>
>>>>
>>>>
>>>> On Thu, Apr 2, 2015 at 10:20 PM, Mohammed Guller <
>>>> moham...@glassbeam.com> wrote:
>>>>
>>>> Hi –
>>>>
>>>>
>>>>
>>>> Is anybody using Tableau to analyze data in Cassandra through the Spark
>>>> SQL Thrift Server?
>>>>
>>>>
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>> Mohammed
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
