@Pawan,

It's been a couple of months since I've had a chance to do anything with
Zeppelin, but here is a link to a post describing what I did to get it working:
https://groups.google.com/forum/#!topic/zeppelin-developers/mCNdyOXNikI.
It may or may not work with newer Zeppelin releases.

-Todd

On Fri, Apr 3, 2015 at 3:02 PM, pawan kumar <pkv...@gmail.com> wrote:

> Hi Todd,
>
> Thanks for the help. I was able to get DSE working with Tableau as per the
> link Mohammed provided. Now I am trying to figure out whether I can write
> Spark SQL queries from Tableau and pull data from DSE. My end goal is a
> web-based tool where I can write SQL queries that pull data from Cassandra.
>
> With Zeppelin, I was able to build and run it on EC2, but I am not sure the
> configuration is right. I am pointing to a Spark master on a remote DSE
> node, and all the Spark and Spark SQL dependencies are on that remote node.
> I am not sure whether I also need to install Spark and its dependencies on
> the web UI (Zeppelin) node.
>
> I am also not sure whether this thread is the right place to discuss Zeppelin.
>
> Thanks once again for all the help.
>
> Thanks,
> Pawan Venugopal
>
>
> On Fri, Apr 3, 2015 at 11:48 AM, Todd Nist <tsind...@gmail.com> wrote:
>
>> @Pawan
>>
>> Not sure if you have seen this or not, but here is a good example by
>> Jonathan Lacefield of DataStax on hooking up Spark SQL with DSE; adding
>> Tableau is as simple as Mohammed stated with DSE:
>> https://github.com/jlacefie/sparksqltest.
>>
>> HTH,
>> Todd
>>
>> On Fri, Apr 3, 2015 at 2:39 PM, Todd Nist <tsind...@gmail.com> wrote:
>>
>>> Hi Mohammed,
>>>
>>> Not sure if you have tried this or not.  You could try using the API
>>> below to start the Thrift server with an existing context:
>>>
>>>
>>> https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala#L42
>>>
>>> One thing that Michael Armbrust @ Databricks recommended was this:
>>>
>>>> You can start a JDBC server with an existing context.  See my answer
>>>> here:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Standard-SQL-tool-access-to-SchemaRDD-td20197.html
>>>
>>> So, something like this, based on an example from Cheng Lian:
>>>
>>> *Server*
>>>
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.catalyst.types._

val sparkContext = sc
import sparkContext._
val sqlContext = new HiveContext(sparkContext)
import sqlContext._
makeRDD((1, "hello") :: (2, "world") :: Nil).toSchemaRDD.cache().registerTempTable("t")
// Replace the above with C* + the spark-cassandra-connector to generate a
// SchemaRDD and register it as a temp table.

import org.apache.spark.sql.hive.thriftserver._
HiveThriftServer2.startWithContext(sqlContext)
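
>>> As a sketch of what the comment in the code gestures at, the
>>> spark-cassandra-connector's cassandraTable API could replace makeRDD. The
>>> keyspace ("ks"), table ("users"), and case class below are hypothetical
>>> placeholders, not from this thread, and this assumes the sqlContext and
>>> its implicits from the block above are in scope:

```scala
import com.datastax.spark.connector._   // spark-cassandra-connector

case class User(id: Int, name: String)  // hypothetical C* table schema

// Read the C* table as an RDD of case classes, then register it so the
// Thrift server started with this sqlContext can query it over JDBC.
val users = sc.cassandraTable[User]("ks", "users")  // placeholder keyspace/table
users.toSchemaRDD.cache().registerTempTable("users")
```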
>>>
Then connect with beeline:
>>>
>>> ./bin/beeline -u jdbc:hive2://localhost:10000/default
>>> 0: jdbc:hive2://localhost:10000/default> select * from t;
>>>
>>>
>>> I have not tried this from Tableau yet.  My understanding is that the
>>> temp table is only valid as long as the sqlContext is, so if the code
>>> representing the *Server* above is terminated and the standard thrift
>>> server is then restarted, sbin/start-thriftserver ..., the table won't be
>>> available.
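>>>
>>> For reference, one possible (unverified, illustrative) way to point the
>>> standard thrift server at C* would be to pass the connector jar and the
>>> C* contact point at startup; the script forwards these flags to
>>> spark-submit, and spark.cassandra.connection.host is the connector's
>>> contact-point property, but the jar path and host below are placeholders:

```shell
# Hypothetical invocation: hand the spark-cassandra-connector jar and the
# C* contact point to the stock Thrift server startup script.
sbin/start-thriftserver.sh \
  --jars /path/to/spark-cassandra-connector-assembly.jar \
  --conf spark.cassandra.connection.host=10.0.0.1
```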
>>>
>>> Another possibility is the tuplejump "cash" project:
>>> https://github.com/tuplejump/cash.
>>>
>>> HTH.
>>>
>>> -Todd
>>>
>>> On Fri, Apr 3, 2015 at 11:11 AM, pawan kumar <pkv...@gmail.com> wrote:
>>>
>>>> Thanks, Mohammed. Will give it a try today. We would also need the
>>>> Spark SQL piece, as we are migrating our data store from Oracle to C*, and
>>>> it would be easier to maintain all the reports rather than recreating each
>>>> one from scratch.
>>>>
>>>> Thanks,
>>>> Pawan Venugopal.
>>>> On Apr 3, 2015 7:59 AM, "Mohammed Guller" <moham...@glassbeam.com>
>>>> wrote:
>>>>
>>>>>  Hi Todd,
>>>>>
>>>>>
>>>>>
>>>>> We are using Apache C* 2.1.3, not DSE. We got Tableau to work directly
>>>>> with C* using the ODBC driver, but now would like to add Spark SQL to the
>>>>> mix. I haven’t been able to find any documentation for how to make this
>>>>> combination work.
>>>>>
>>>>>
>>>>>
>>>>> We are using the Spark-Cassandra-Connector in our applications, but
>>>>> haven’t been able to figure out how to get the Spark SQL Thrift Server to
>>>>> use it and connect to C*. That is the missing piece. Once we solve that
>>>>> piece of the puzzle then Tableau should be able to see the tables in C*.
>>>>>
>>>>>
>>>>>
>>>>> Hi Pawan,
>>>>>
>>>>> Tableau + C* is pretty straightforward, especially if you are using
>>>>> DSE. Create a new DSN in Tableau using the ODBC driver that comes with
>>>>> DSE. Once you connect, Tableau lets you use a C* keyspace as a schema and
>>>>> column families as tables.
>>>>>
>>>>>
>>>>>
>>>>> Mohammed
>>>>>
>>>>>
>>>>>
>>>>> *From:* pawan kumar [mailto:pkv...@gmail.com]
>>>>> *Sent:* Friday, April 3, 2015 7:41 AM
>>>>> *To:* Todd Nist
>>>>> *Cc:* user@spark.apache.org; Mohammed Guller
>>>>> *Subject:* Re: Tableau + Spark SQL Thrift Server + Cassandra
>>>>>
>>>>>
>>>>>
>>>>> Hi Todd,
>>>>>
>>>>> Thanks for the link. I would be interested in this solution. I am
>>>>> using DSE for Cassandra. Could you provide info on connecting with DSE,
>>>>> either through Tableau or Zeppelin? The goal here is to query Cassandra
>>>>> through Spark SQL so that I can perform joins and group-bys in my
>>>>> queries. Are you able to run Spark SQL queries from Tableau?
>>>>>
>>>>> Thanks,
>>>>> Pawan Venugopal
>>>>>
>>>>> On Apr 3, 2015 5:03 AM, "Todd Nist" <tsind...@gmail.com> wrote:
>>>>>
>>>>> What version of Cassandra are you using?  Are you using DSE or the
>>>>> stock Apache Cassandra version?  I have connected it with DSE, but have 
>>>>> not
>>>>> attempted it with the standard Apache Cassandra version.
>>>>>
>>>>>
>>>>>
>>>>> FWIW,
>>>>> http://www.datastax.com/dev/blog/datastax-odbc-cql-connector-apache-cassandra-datastax-enterprise,
>>>>> provides an ODBC driver for accessing C* from Tableau.  Granted, it does
>>>>> not provide all the goodness of Spark.  Are you attempting to leverage the
>>>>> spark-cassandra-connector for this?
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Apr 2, 2015 at 10:20 PM, Mohammed Guller <
>>>>> moham...@glassbeam.com> wrote:
>>>>>
>>>>> Hi –
>>>>>
>>>>>
>>>>>
>>>>> Is anybody using Tableau to analyze data in Cassandra through the
>>>>> Spark SQL Thrift Server?
>>>>>
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>>> Mohammed
>>>>>
>>>>
>>>
>>
>
