Re: Cassandra row count grouped by multiple columns

2015-09-11 Thread Eric Walker
Hi Chirag,

Maybe something like this?

  import org.apache.spark.sql._
  import org.apache.spark.sql.types._

  val rdd = sc.parallelize(Seq(
    Row("A1", "B1", "C1"),
    Row("A2", "B2", "C2"),
    Row("A3", "B3", "C2"),
    Row("A1", "B1", "C1")
  ))

  val schema = StructType(Seq("a", "b", "c").map(c =>
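The archived message is cut off mid-expression above. A plausible continuation, assuming string-typed columns and the Spark 1.3+ DataFrame API (the `org.apache.spark.sql.types` import suggests 1.3+; on Chirag's Spark 1.2 the equivalent path would be sqlContext.applySchema plus registerTempTable and a GROUP BY query), might look like this sketch:

  val schema = StructType(Seq("a", "b", "c").map(c =>
    StructField(c, StringType, nullable = true)))

  // Build a DataFrame over the Row RDD and count per distinct (a, b, c).
  val df = sqlContext.createDataFrame(rdd, schema)
  df.groupBy("a", "b", "c").count().show()

For the sample data this yields a count of 2 for (A1, B1, C1) and 1 for each of the other combinations.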

Cassandra row count grouped by multiple columns

2015-09-10 Thread Chirag Dewan
Hi,

I am using Spark 1.2.0 with Cassandra 2.0.14. I have a problem where I need a count of rows unique to a combination of multiple columns. I have a column family with 3 columns, i.e. a, b, c, and for each distinct value of (a, b, c) I want the row count.

For example, given the rows:

  A1,B1,C1
  A2,B2,C2
  A3,B3,C2
  A1,B1,C1

The output should be the row count for each distinct (a, b, c) combination.
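One way to do this on Spark 1.2, without the DataFrame API, is a plain RDD aggregation over the table read through the DataStax Spark Cassandra Connector. A minimal sketch, assuming string columns; the keyspace and table names here are placeholders:

  import com.datastax.spark.connector._   // DataStax Spark Cassandra Connector

  // "my_keyspace" / "my_table" are hypothetical; substitute your own.
  val counts = sc.cassandraTable("my_keyspace", "my_table")
    .select("a", "b", "c")                 // pull only the grouping columns
    .map(row => (row.getString("a"), row.getString("b"), row.getString("c")))
    .map(key => (key, 1L))
    .reduceByKey(_ + _)                    // count per distinct (a, b, c)

  counts.collect().foreach { case ((a, b, c), n) => println(s"$a,$b,$c -> $n") }

reduceByKey keeps the aggregation distributed, which avoids pulling every row to the driver the way countByValue would.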