Hello Victor,
I want to do this on multiple columns. I was able to do it on one column, with
Sean's help, using the code below:
val matData = file.map(_.split(";"))
val stats = matData.map(_(2).toDouble).stats()
stats.mean
stats.max
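
For several columns in one pass, one possible approach is to keep a small
accumulator per column and fold each split row into it. The block below is
only a sketch under assumptions that are not from this thread: the names
ColumnStats, Acc, cols, seqOp and combOp are made up for illustration, the
column indices in cols are hypothetical, and fields that are missing or not
numeric are simply skipped. It uses plain Scala collections so it can be tried
without a cluster.

```scala
import scala.util.Try

object ColumnStats {
  // Running totals for one column: count, sum and maximum seen so far.
  final case class Acc(count: Long, sum: Double, max: Double) {
    def add(x: Double): Acc = Acc(count + 1, sum + x, math.max(max, x))
    def merge(o: Acc): Acc = Acc(count + o.count, sum + o.sum, math.max(max, o.max))
    def mean: Double = if (count == 0) Double.NaN else sum / count
  }
  val empty: Acc = Acc(0L, 0.0, Double.NegativeInfinity)

  // Hypothetical choice of columns to summarise (0-based indices).
  val cols: Seq[Int] = Seq(1, 2)

  // One empty accumulator per selected column.
  val zero: Vector[Acc] = Vector.fill(cols.size)(empty)

  // seqOp: fold one split row into the per-column accumulators.
  // A field that is missing or does not parse as a Double is skipped.
  def seqOp(accs: Vector[Acc], row: Array[String]): Vector[Acc] =
    accs.zip(cols).map { case (acc, i) =>
      Try(row(i).trim.toDouble).toOption.fold(acc)(acc.add)
    }

  // combOp: merge the accumulators computed on two separate chunks of rows.
  def combOp(a: Vector[Acc], b: Vector[Acc]): Vector[Acc] =
    a.zip(b).map { case (x, y) => x.merge(y) }

  // Local stand-in for the distributed case: split each line and fold.
  def summarize(lines: Seq[String]): Vector[Acc] =
    lines.map(_.split(";")).foldLeft(zero)(seqOp)
}
```

On the RDD itself, the same two functions should fit RDD.aggregate, along the
lines of matData.aggregate(ColumnStats.zero)(ColumnStats.seqOp,
ColumnStats.combOp), although that call has not been run against this data.
The mean and max of the n-th selected column are then result(n).mean and
result(n).max.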
Thank you
Vineet
From: Victor Tso-Guillen [mailto:[email protected]]
Sent: Monday, 25 August 2014 18:34
To: Hingorani, Vineet
Cc: [email protected]
Subject: Re: Manipulating columns in CSV file or Transpose of
Array[Array[String]] RDD
Do you want to do this on one column or all numeric columns?
On Mon, Aug 25, 2014 at 7:09 AM, Hingorani, Vineet
<[email protected]> wrote:
Hello all,
Could someone help me with manipulating CSV file data? I have
semicolon-separated CSV data containing both doubles and strings, and I want
to calculate the maximum/average of a column. When I read the file using
sc.textFile("test.csv").map(_.split(";")), each field is read as a string.
Could someone explain how to do this?
Or is there some way to take the transpose of the data and then manipulate
the rows instead?
Thank you in advance. I have been struggling with this for quite some time.
Regards,
Vineet