from:"asethia"

Spark GroupBy Save to different files

2017-09-01 Thread asethia

Hi, I have a list of person records in the following format:

case class Person(fName: String, city: String)
val l = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))
val rdd: RDD[Person] = sc.parallelize(l)
val groupBy: RDD[(String, Iterable[Person])] = rdd.groupBy(_.city)

I would like to sav
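The question above is cut off, so the following is only a guess at the goal suggested by the thread title: one output location per city. It is a minimal sketch that swaps the RDD `groupBy` for the Dataset writer's `partitionBy`, assuming Spark 2.x with Spark SQL on the classpath. The object name `SaveByCity`, the JSON format, and the default output path are assumptions for illustration, not part of the original post.

```scala
import org.apache.spark.sql.SparkSession

// Same record shape as in the post.
case class Person(fName: String, city: String)

// Hypothetical driver object; the name and default path are illustrative only.
object SaveByCity {
  def main(args: Array[String]): Unit = {
    val outputDir = if (args.nonEmpty) args(0) else "/tmp/people-by-city"

    val spark = SparkSession.builder()
      .appName("SaveByCity")
      .master("local[*]") // local mode, for trying the sketch out
      .getOrCreate()
    import spark.implicits._

    val people = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))

    // partitionBy writes one sub-directory per distinct city value
    // (city=City1, city=City2) instead of grouping in memory first.
    people.toDS()
      .write
      .mode("overwrite")
      .partitionBy("city")
      .json(outputDir)

    spark.stop()
  }
}
```

With this layout the `city` value is encoded in the directory name rather than in each record, and no `Iterable[Person]` per key has to be materialized on a single executor.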

Market Basket Analysis by deploying FP Growth algorithm

2017-04-05 Thread asethia

Hi, We are currently working on a Market Basket Analysis by deploying FP Growth algorithm on Spark to generate association rules for product recommendation. We are running on close to 24 million invoices over an assortment of more than 100k products. However, whenever we relax the support threshol

GenericRowWithSchema to case class

2016-04-03 Thread asethia

Hi, my Cassandra table has a custom user-defined type, for example:

CREATE TYPE address ( addressline1 text, addressline2 text, city text, state text, country text, pincode text )
create table person ( id text, name text, addresses set<frozen<address>>, PRIMARY KEY (id));

val rdd=sqlContext.read.format("

transformation - spark vs cassandra

2016-03-31 Thread asethia

Hi, I am working with Cassandra and Spark and would like to know which performs better: filtering in Cassandra on the primary key and clustering key, or using Spark DataFrame transformations/filters. For example, in Spark: val rdd = sqlContext.read.format("org.apache.spark.sql.cassandra") .op

Re: DataFrame vs RDD

2016-03-22 Thread asethia

Creating an RDD is done via the SparkContext, whereas a DataFrame is created from the SQLContext. So DataFrame is part of Spark SQL, whereas RDD is part of Spark Core. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/DataFrame-vs-RDD-tp26570p26573.html Sent from the Apache Spa
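The distinction in this reply can be sketched as follows, assuming the Spark 1.x-style `SQLContext` API that was current when the thread was written (it still exists, deprecated, in Spark 2.x). The `Person` case class, the sample data, and the object and method names are assumptions for illustration only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

// Illustrative record type, not taken from this thread.
case class Person(fName: String, city: String)

// Hypothetical object; names are for illustration only.
object RddVsDataFrame {
  // Runs the same filter through the RDD API and the DataFrame API.
  def namesInCity(sc: SparkContext, city: String): (Seq[String], Seq[String]) = {
    // The SQLContext (Spark SQL) wraps an existing SparkContext (Spark Core).
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // RDD: created from the SparkContext, transformed with plain Scala functions.
    val rdd: RDD[Person] = sc.parallelize(Seq(Person("A", "City1"), Person("B", "City2")))
    val viaRdd = rdd.filter(_.city == city).map(_.fName).collect().toSeq

    // DataFrame: obtained through the SQLContext implicits, queried by column name.
    val df: DataFrame = rdd.toDF()
    val viaDf = df.filter($"city" === city).select("fName").collect().map(_.getString(0)).toSeq

    (viaRdd, viaDf)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RddVsDataFrame").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val (viaRdd, viaDf) = namesInCity(sc, "City1")
    println(s"RDD: $viaRdd, DataFrame: $viaDf")
    sc.stop()
  }
}
```

Both paths return the same names; the difference is that the DataFrame carries a schema, so the filter is expressed against a named column instead of an opaque Scala function.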

DataFrame vs RDD

2016-03-22 Thread asethia

Hi, I am new to Spark and would like to know of any guidelines on when to use a DataFrame vs. an RDD. Thanks, As -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/DataFrame-vs-RDD-tp26570.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

6 matches
