Re: Writing RDD to a csv file

2015-02-03 Thread Gerard Maas
This is more of a Scala question, so next time you may want to ask on a
Scala forum, e.g. http://stackoverflow.com/questions/tagged/scala

val optArrStr: Option[Array[String]] = ???
// Falls back to an empty string; substitute whatever default you need.
optArrStr.map(arr => arr.mkString(",")).getOrElse("")
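
For reference, here is a minimal sketch of how the Option handling above could be folded into the full line-building function from the question. The names Record and toCsvLine and the sample data are made up for illustration, and fields are assumed to contain no commas or quotes (no CSV escaping is done).

```scala
// Hypothetical alias for the element type of the RDD in the question.
type Record = (String, (Array[String], Option[Array[String]]))

// Hypothetical helper: turns one record into a single comma-separated line.
def toCsvLine(rec: Record): String = {
  val (key, (required, optional)) = rec
  // None becomes an empty string, so the line ends with a trailing comma.
  val optionalPart = optional.map(_.mkString(",")).getOrElse("")
  s"$key,${required.mkString(",")},$optionalPart"
}

// Made-up sample records standing in for the RDD contents.
val sample: Seq[Record] = Seq(
  ("k1", (Array("a", "b"), Some(Array("c", "d")))),
  ("k2", (Array("e"), None))
)
sample.map(toCsvLine).foreach(println)
```

On the RDD itself this would presumably be used as myrdd.map(toCsvLine).saveAsTextFile("hdfs://..."), with the path left as in the question.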

kr, Gerard.

On Tue, Feb 3, 2015 at 2:09 PM, kundan kumar iitr.kun...@gmail.com wrote:

 I have a RDD which is of type

 org.apache.spark.rdd.RDD[(String, (Array[String], Option[Array[String]]))]

 I want to write it as a csv file.

 Please suggest how this can be done.

 myrdd.map(line => (line._1 + "," + line._2._1.mkString(",") + "," +
 line._2._2.mkString(','))).saveAsTextFile("hdfs://...")

 Doing mkString on line._2._1 works but does not work for the Option type.

 Please suggest how this can be done.


 Thanks
 Kundan





Writing RDD to a csv file

2015-02-03 Thread kundan kumar
I have a RDD which is of type

org.apache.spark.rdd.RDD[(String, (Array[String], Option[Array[String]]))]

I want to write it as a csv file.

Please suggest how this can be done.

myrdd.map(line => (line._1 + "," + line._2._1.mkString(",") + "," +
line._2._2.mkString(','))).saveAsTextFile("hdfs://...")

Doing mkString on line._2._1 works but does not work for the Option type.

Please suggest how this can be done.


Thanks
Kundan


Re: Writing RDD to a csv file

2015-02-03 Thread kundan kumar
Thanks Gerard !!

This is working.

On Tue, Feb 3, 2015 at 6:44 PM, Gerard Maas gerard.m...@gmail.com wrote:

 This is more of a Scala question, so next time you may want to ask on a
 Scala forum, e.g. http://stackoverflow.com/questions/tagged/scala

 val optArrStr: Option[Array[String]] = ???
 // Falls back to an empty string; substitute whatever default you need.
 optArrStr.map(arr => arr.mkString(",")).getOrElse("")

 kr, Gerard.

 On Tue, Feb 3, 2015 at 2:09 PM, kundan kumar iitr.kun...@gmail.com
 wrote:

 I have a RDD which is of type

 org.apache.spark.rdd.RDD[(String, (Array[String], Option[Array[String]]))]

 I want to write it as a csv file.

 Please suggest how this can be done.

 myrdd.map(line => (line._1 + "," + line._2._1.mkString(",") + "," +
 line._2._2.mkString(','))).saveAsTextFile("hdfs://...")

 Doing mkString on line._2._1 works but does not work for the Option type.

 Please suggest how this can be done.


 Thanks
 Kundan






Re: Writing RDD to a csv file

2015-02-03 Thread Charles Feduke
In case anyone needs to merge all of their part-n files (small result
set only) into a single *.csv file or needs to generically flatten case
classes, tuples, etc., into comma separated values:

http://deploymentzone.com/2015/01/30/spark-and-merged-csv-files/
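
As a rough illustration of the merging idea (not the method from the linked post), here is a sketch that concatenates part-* files into one file. It assumes the output directory has already been copied to the local filesystem, so it does not touch HDFS; mergeParts and its parameters are hypothetical names.

```scala
import java.io.File
import java.nio.file.{Files, StandardOpenOption}

// Hypothetical helper: concatenates the part-* files of a saveAsTextFile
// output directory, in name order, into a single local file.
def mergeParts(outputDir: File, merged: File): Unit = {
  // listFiles() returns null for a missing directory, hence the Option wrapper.
  val parts = Option(outputDir.listFiles()).getOrElse(Array.empty[File])
    .filter(f => f.isFile && f.getName.startsWith("part-"))
    .sortBy(_.getName)
  // Start from an empty target file, then append each part in turn.
  Files.deleteIfExists(merged.toPath)
  Files.createFile(merged.toPath)
  parts.foreach { part =>
    Files.write(merged.toPath, Files.readAllBytes(part.toPath),
      StandardOpenOption.APPEND)
  }
}
```

Each part is read fully into memory, so this only suits small result sets, as noted above.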

On Tue Feb 03 2015 at 8:23:59 AM kundan kumar iitr.kun...@gmail.com wrote:

 Thanks Gerard !!

 This is working.

 On Tue, Feb 3, 2015 at 6:44 PM, Gerard Maas gerard.m...@gmail.com wrote:

 This is more of a Scala question, so next time you may want to ask on a
 Scala forum, e.g. http://stackoverflow.com/questions/tagged/scala

 val optArrStr: Option[Array[String]] = ???
 // Falls back to an empty string; substitute whatever default you need.
 optArrStr.map(arr => arr.mkString(",")).getOrElse("")

 kr, Gerard.

 On Tue, Feb 3, 2015 at 2:09 PM, kundan kumar iitr.kun...@gmail.com
 wrote:

 I have a RDD which is of type

 org.apache.spark.rdd.RDD[(String, (Array[String],
 Option[Array[String]]))]

 I want to write it as a csv file.

 Please suggest how this can be done.

 myrdd.map(line => (line._1 + "," + line._2._1.mkString(",") + "," +
 line._2._2.mkString(','))).saveAsTextFile("hdfs://...")

 Doing mkString on line._2._1 works but does not work for the Option type.

 Please suggest how this can be done.


 Thanks
 Kundan