Hi,

I want to filter the rows by value. This is what is in the array:

74,20160905-133143,98.11218069128827594148
I want to filter anything > 50.0 in the third column.

Thanks

On Monday, 5 September 2016, 15:07, ayan guha <guha.a...@gmail.com> wrote:

Hi

x.split returns an array. So, after the first map, you will get an RDD of arrays. What is your expected outcome of the 2nd map?

On Mon, Sep 5, 2016 at 11:30 PM, Ashok Kumar <ashok34...@yahoo.com.invalid> wrote:

Thank you sir. This is what I get:

scala> textFile.map(x => x.split(","))
res52: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[27] at map at <console>:27

How can I work on individual columns? I understand they are strings.

scala> textFile.map(x => x.split(",")).map(x => (x.getString(0)))
<console>:27: error: value getString is not a member of Array[String]
       textFile.map(x => x.split(",")).map(x => (x.getString(0)))

regards

On Monday, 5 September 2016, 13:51, Somasundaram Sekar <somasundar.sekar@tigeranalytics.com> wrote:

Basic error: you get back an RDD on transformations like map.

sc.textFile("filename").map(x => x.split(","))

On 5 Sep 2016 6:19 pm, "Ashok Kumar" <ashok34...@yahoo.com.invalid> wrote:

Hi,

I have a text file as below that I read in:

74,20160905-133143,98.11218069128827594148
75,20160905-133143,49.52776998815916807742
76,20160905-133143,56.08029957123980984556
77,20160905-133143,46.63689526544407522777
78,20160905-133143,84.88227141164402181551
79,20160905-133143,68.72408602520662115000

val textFile = sc.textFile("/tmp/mytextfile.txt")

Now I want to split the rows separated by ",":

scala> textFile.map(x => x.toString).split(",")
<console>:27: error: value split is not a member of org.apache.spark.rdd.RDD[String]
       textFile.map(x => x.toString).split(",")

However, the above throws an error. Any ideas what is wrong, or how I can do this if I want to avoid converting it to String?

Thanking you

--
Best Regards,
Ayan Guha
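Putting the thread's answers together: call split inside map (not on the RDD itself), index the resulting Array[String] with cols(i) rather than getString, and filter on the third column parsed as a Double. A minimal sketch on a plain Scala Seq, using sample rows from the file above; the Spark RDD map/filter calls have the same shape, with sc.textFile("/tmp/mytextfile.txt") in place of the local Seq:

```scala
object FilterColumns {
  def main(args: Array[String]): Unit = {
    // Sample rows from the thread; with Spark this would come from
    // sc.textFile("/tmp/mytextfile.txt") instead of a local Seq.
    val lines = Seq(
      "74,20160905-133143,98.11218069128827594148",
      "75,20160905-133143,49.52776998815916807742",
      "76,20160905-133143,56.08029957123980984556"
    )

    // split returns Array[String]; index columns with cols(i), not getString.
    val kept = lines
      .map(_.split(","))
      .filter(cols => cols(2).toDouble > 50.0) // keep rows whose third column > 50.0

    // Rows 74 and 76 survive; 75 (49.52...) is dropped.
    kept.foreach(cols => println(cols.mkString(",")))
  }
}
```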