Hi Rohit,

We are big users of the Spark Shell - our analytics team uses it for the same purposes that Hive used to serve. The SparkContext provided at startup would, I assume, be configured for either HDFS or Cassandra - I take it we would then manually create a second context for the other?
Thanks,
Gary

On Sat, Oct 26, 2013 at 1:07 PM, Rohit Rai <[email protected]> wrote:

> Hello Gary,
>
> This is very easy to do. You can read your data from HDFS using
> FileInputFormat, transform it into the desired rows, and write them to
> Cassandra using ColumnFamilyOutputFormat.
>
> Our library Calliope (Apache licensed),
> http://tuplejump.github.io/calliope/, can make the task of writing to C*
> easier.
>
> In case you don't want to convert the data to rows and would rather keep
> it as files in Cassandra, our lightweight Cassandra-backed, HDFS-compatible
> filesystem SnackFS can help you. SnackFS will be part of the next Calliope
> release later this month, but we can provide you access if you would like
> to try it out.
>
> Feel free to mail me directly in case you need any assistance.
>
> Regards,
> Rohit
> founder @ tuplejump
>
> On Sat, Oct 26, 2013 at 5:45 AM, Gary Malouf <[email protected]> wrote:
>
>> We have a use case in which much of our raw data is stored in HDFS today.
>> We'd like to write our Spark jobs such that they read/aggregate data from
>> HDFS and can output to our Cassandra cluster.
>>
>> Is there any way of doing this in spark 0.7.3?
>
> --
> ____________________________
> www.tuplejump.com
> *The Data Engineering Platform*
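
[Editor's note: for readers landing on this thread, below is a rough, untested sketch of the approach Rohit describes - read from HDFS with sc.textFile, build Thrift Mutations, and write via Cassandra's ColumnFamilyOutputFormat through saveAsNewAPIHadoopFile. It assumes Spark 0.7.x (hence the old `spark` package) and the Cassandra 1.x Hadoop/Thrift API; the host, port, keyspace, column family, partitioner, and HDFS path are all placeholders you would replace with your own values.]

  import java.nio.ByteBuffer
  import java.util.{Collections, List => JList}

  import org.apache.cassandra.hadoop.{ColumnFamilyOutputFormat, ConfigHelper}
  import org.apache.cassandra.thrift.{Column, ColumnOrSuperColumn, Mutation}
  import org.apache.cassandra.utils.ByteBufferUtil
  import org.apache.hadoop.mapreduce.Job

  import spark.SparkContext
  import spark.SparkContext._  // implicit conversions for pair-RDD saves

  object Hdfs2Cassandra {
    def main(args: Array[String]) {
      val sc = new SparkContext("local[4]", "hdfs2cassandra")

      // Configure the Cassandra output side via a Hadoop Job object.
      val job = new Job()
      ConfigHelper.setOutputInitialAddress(job.getConfiguration, "cassandra-host")
      ConfigHelper.setOutputRpcPort(job.getConfiguration, "9160")
      ConfigHelper.setOutputColumnFamily(job.getConfiguration, "MyKeyspace", "MyColumnFamily")
      // Must match the partitioner your cluster actually uses.
      ConfigHelper.setOutputPartitioner(job.getConfiguration,
        "org.apache.cassandra.dht.RandomPartitioner")

      // Read raw tab-separated records from HDFS.
      val lines = sc.textFile("hdfs://namenode:8020/path/to/raw")

      // Turn each record into (rowKey, list of mutations), the key/value
      // pair ColumnFamilyOutputFormat expects.
      val rows = lines.map { line =>
        val fields = line.split("\t")
        val col = new Column()
        col.setName(ByteBufferUtil.bytes("value"))
        col.setValue(ByteBufferUtil.bytes(fields(1)))
        col.setTimestamp(System.currentTimeMillis)
        val cosc = new ColumnOrSuperColumn()
        cosc.setColumn(col)
        val mutation = new Mutation()
        mutation.setColumn_or_supercolumn(cosc)
        val key: ByteBuffer = ByteBufferUtil.bytes(fields(0))
        (key, Collections.singletonList(mutation): JList[Mutation])
      }

      // The path argument is ignored by ColumnFamilyOutputFormat but is
      // required by the saveAsNewAPIHadoopFile signature.
      rows.saveAsNewAPIHadoopFile(
        "unused",
        classOf[ByteBuffer],
        classOf[JList[Mutation]],
        classOf[ColumnFamilyOutputFormat],
        job.getConfiguration)
    }
  }

[The same pattern works from the Spark Shell by reusing the shell's SparkContext instead of constructing one; Calliope wraps this boilerplate behind a friendlier API.]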
