Re: Convert DStream to DataFrame

2015-04-24 Thread Yin Huai
Hi Sergio, I missed this thread somehow... For the error "case classes cannot have more than 22 parameters.", it is a limitation of Scala (see https://issues.scala-lang.org/browse/SI-7296). You can follow the instructions at https://spark.apache.org/docs/latest/sql-programming-guide.html#programm
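
For readers landing on this thread: the workaround Yin points to is to describe the schema as data (a StructType) and build Row objects, instead of declaring a case class. A minimal sketch for Spark 1.3 follows. The column names (field1 to field30), the comma-separated input format, and the names WideSchemaSketch and toDataFrame are assumptions made for illustration; none of them come from Sergio's code.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object WideSchemaSketch {
  // Hypothetical column names: 30 columns, more than the 22 a case class allows here.
  val fieldNames: Seq[String] = (1 to 30).map(i => s"field$i")

  // The schema is built as a value, so the case class limit never applies.
  val schema: StructType =
    StructType(fieldNames.map(name => StructField(name, StringType, nullable = true)))

  // Assumes every line is a comma-separated record with exactly one value per column.
  def toDataFrame(sqlContext: SQLContext, lines: RDD[String]): DataFrame = {
    val rows: RDD[Row] = lines.map { line =>
      // limit = -1 keeps trailing empty values so the column count stays stable
      Row.fromSeq(line.split(",", -1).toSeq)
    }
    sqlContext.createDataFrame(rows, schema)
  }
}
```

Records whose value count differs from the schema length would need to be filtered or padded before this step; the sketch does not handle that.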

Re: Convert DStream to DataFrame

2015-04-24 Thread Sergio Jiménez Barrio
Solved! I solved the problem by combining both solutions. The result is this: messages.foreachRDD { rdd => val message: RDD[String] = rdd.map { y => y._2 } val sqlContext = SQLContextSingleton.getInstance(rdd.sparkContext) import
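
The message above is cut off in this archive, so the following is not Sergio's code. It is only a sketch of the foreachRDD plus SQLContextSingleton pattern described in the streaming programming guide, written against the Spark 1.3 API. The object name StreamToDataFrameSketch, the method name process, the table name "messages" and the sample query are all hypothetical, and the use of jsonRDD is an assumption about how the JSON payload could be turned into a DataFrame.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.dstream.DStream

// Lazily created, shared SQLContext, following the pattern in the streaming guide.
object SQLContextSingleton {
  @transient private var instance: SQLContext = null

  def getInstance(sparkContext: SparkContext): SQLContext = synchronized {
    if (instance == null) {
      instance = new SQLContext(sparkContext)
    }
    instance
  }
}

object StreamToDataFrameSketch {
  // `messages` is assumed to be the (key, value) stream from createDirectStream.
  def process(messages: DStream[(String, String)]): Unit = {
    messages.foreachRDD { rdd =>
      // Keep only the message value; the Kafka key is dropped.
      val message: RDD[String] = rdd.map(_._2)

      // Skip empty batches so schema inference never runs on no data.
      if (message.take(1).nonEmpty) {
        val sqlContext = SQLContextSingleton.getInstance(rdd.sparkContext)
        val df = sqlContext.jsonRDD(message) // schema inferred per batch
        df.registerTempTable("messages")     // hypothetical table name
        sqlContext.sql("SELECT COUNT(*) FROM messages").show()
      }
    }
  }
}
```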

Re: Convert DStream to DataFrame

2015-04-23 Thread Sergio Jiménez Barrio
Thank you very much, Tathagata! On Wednesday, April 22, 2015, Tathagata Das wrote: > Aaah, that. That is probably a limitation of the SQLContext (cc'ing Yin > for more information). > > > On Wed, Apr 22, 2015 at 7:07 AM, Sergio Jiménez Barrio < > drarse.a...@gmail.com > > wrote: > >> Sorr

Re: Convert DStream to DataFrame

2015-04-22 Thread Tathagata Das
Aaah, that. That is probably a limitation of the SQLContext (cc'ing Yin for more information). On Wed, Apr 22, 2015 at 7:07 AM, Sergio Jiménez Barrio < drarse.a...@gmail.com> wrote: > Sorry, this is the error: > > [error] /home/sergio/Escritorio/hello/streaming.scala:77: Implementation > restric

Re: Convert DStream to DataFrame

2015-04-22 Thread Sergio Jiménez Barrio
Sorry, this is the error: [error] /home/sergio/Escritorio/hello/streaming.scala:77: Implementation restriction: case classes cannot have more than 22 parameters. 2015-04-22 16:06 GMT+02:00 Sergio Jiménez Barrio: > I tried the solution from the guide, but I exceeded the size of case class > Row:

Re: Convert DStream to DataFrame

2015-04-22 Thread Sergio Jiménez Barrio
I tried the solution from the guide, but I exceeded the size of case class Row: 2015-04-22 15:22 GMT+02:00 Tathagata Das: > Did you check out the latest streaming programming guide? > > > http://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations > > You also

Re: Convert DStream to DataFrame

2015-04-22 Thread Tathagata Das
Did you check out the latest streaming programming guide? http://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations You also need to be aware that to convert JSON RDDs to a DataFrame, sqlContext has to make a pass on the data to learn the schema. This will
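
To illustrate the point about the schema pass: in Spark 1.3, jsonRDD can either infer the schema by scanning the data or take a schema supplied by the caller, in which case no inference pass is needed. The sketch below shows both forms side by side. The schema fields ("id", "user") and the names JsonSchemaSketch, inferred and withSchema are made up for the example and are not from this thread.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object JsonSchemaSketch {
  // Hypothetical schema for the incoming JSON documents.
  val eventSchema: StructType = StructType(Seq(
    StructField("id", LongType, nullable = true),
    StructField("user", StringType, nullable = true)))

  // Spark scans the RDD once to work out the schema, then builds the DataFrame.
  def inferred(sqlContext: SQLContext, json: RDD[String]): DataFrame =
    sqlContext.jsonRDD(json)

  // The schema is given up front, so the inference pass is skipped.
  def withSchema(sqlContext: SQLContext, json: RDD[String]): DataFrame =
    sqlContext.jsonRDD(json, eventSchema)
}
```

In a streaming job the explicit-schema form is likely the better fit when the message layout is known and stable, since inference would otherwise repeat on every batch.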

Re: Convert DStream to DataFrame

2015-04-22 Thread ayan guha
What about sqlContext.createDataFrame(rdd)? On 22 Apr 2015 23:04, "Sergio Jiménez Barrio" wrote: > Hi, > > I am using Kafka with Apache Stream to send JSON to Apache Spark: > > val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, > StringDecoder](ssc, kafkaParams, topicsSe

Convert DStream to DataFrame

2015-04-22 Thread Sergio Jiménez Barrio
Hi, I am using Kafka with Apache Stream to send JSON to Apache Spark: val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicsSet) Now I want to parse the resulting DStream into a DataFrame, but I don't know if Spark 1.3 has some easy way for t