Yes, it is very simple to access Cassandra data from the Spark shell.

Step 1: Launch spark-shell with the spark-cassandra-connector package:

    $SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0
Step 2: Create a DataFrame pointing to your Cassandra table:

    val dfCassTable = sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map(
        "table" -> "your_column_family",
        "keyspace" -> "your_keyspace"))
      .load()

From this point onward, you have complete access to the DataFrame API. You can even register it as a temporary table if you would prefer to use SQL/HiveQL.

Mohammed
Author: Big Data Analytics with Spark <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Ben Slater [mailto:ben.sla...@instaclustr.com]
Sent: Monday, May 9, 2016 9:28 PM
To: user@cassandra.apache.org; user
Subject: Re: Accessing Cassandra data from Spark Shell

You can use spark-shell to access Cassandra via the Spark Cassandra connector. The getting-started article on our support page will probably give you a good steer even if you’re not using Instaclustr:
https://support.instaclustr.com/hc/en-us/articles/213097877-Getting-Started-with-Instaclustr-Spark-Cassandra-

Cheers
Ben

On Tue, 10 May 2016 at 14:08 Cassa L <lcas...@gmail.com> wrote:

Hi,
Has anyone tried accessing Cassandra data using spark-shell? How do you do it? Can you use HiveContext for Cassandra data? I'm using the community version of Cassandra 3.0.

Thanks,
LCassa

--
————————
Ben Slater
Chief Product Officer, Instaclustr
+61 437 929 798
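[Editor's note] The two steps above, plus the temporary-table registration Mohammed mentions, can be tied together in a single spark-shell session. This is a minimal sketch, assuming a Spark 1.5.x shell launched as in Step 1 (so `sqlContext` is already in scope and the connector is on the classpath) and a reachable Cassandra cluster; `your_keyspace`, `your_column_family`, and the column name `id` are placeholders, not names from the thread:

```scala
// Assumes: spark-shell 1.5.x started with
//   --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0
// and spark.cassandra.connection.host pointing at your cluster.
// "your_keyspace" / "your_column_family" / "id" are placeholder names.

// Step 2: build a DataFrame backed by the Cassandra table.
val dfCassTable = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "table"    -> "your_column_family",
    "keyspace" -> "your_keyspace"))
  .load()

// Register it as a temporary table so it can be queried with SQL/HiveQL
// (registerTempTable is the Spark 1.5 DataFrame API).
dfCassTable.registerTempTable("cass_table")

// Query via SQL instead of the DataFrame API.
sqlContext.sql("SELECT id FROM cass_table LIMIT 10").show()
```

Note this addresses LCassa's HiveContext question indirectly: in Spark 1.5 the default `sqlContext` in spark-shell is a `HiveContext` when Spark is built with Hive support, so the registered table can be queried with HiveQL as well.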