Re: How to calculate row by row and output results in Spark

2015-10-19 Thread Ted Yu
Under core/src/test/scala/org/apache/spark , you will find a lot of
examples for map function.

FYI

On Mon, Oct 19, 2015 at 10:35 AM, Shepherd wrote:

> Hi all, I am new to Spark and Scala. I have a question about doing a
> calculation. I am using "groupBy" to generate key-value pairs, where each
> value points to a subset of the original RDD. The RDD has four columns,
> and each subset may have a different number of rows. The original code
> looks like this:
>
>     val b = a.groupBy(_._2)
>     val res = b.map { case (k, v) => v.map(func) }
>
> Here, I don't know how to write func. I have to process each row in v and
> calculate a statistic. How can I do that? And how can I write the function
> inside map? Thanks a lot.


How to calculate row by row and output results in Spark

2015-10-19 Thread Shepherd
Hi all, I am new to Spark and Scala. I have a question about doing a
calculation. I am using "groupBy" to generate key-value pairs, where each
value points to a subset of the original RDD. The RDD has four columns, and
each subset may have a different number of rows. The original code looks
like this:

    val b = a.groupBy(_._2)
    val res = b.map { case (k, v) => v.map(func) }

Here, I don't know how to write func. I have to process each row in v and
calculate a statistic. How can I do that? And how can I write the function
inside map? Thanks a lot.
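The pattern in the question can be sketched with plain Scala collections,
whose groupBy and map have the same shape as the RDD calls above. The
four-column layout and the particular statistic (a mean of the third
column) are assumptions, since the thread does not give the real schema:

```scala
object GroupStats {
  // Hypothetical rows (id, key, x, y); the actual schema is not
  // given in the thread, so this layout is only an assumption.
  val a = List(
    (1, "a", 2.0, 3.0),
    (2, "a", 4.0, 5.0),
    (3, "b", 6.0, 7.0)
  )

  // func runs once per row of a group and pulls out the value to summarize.
  def func(row: (Int, String, Double, Double)): Double = row._3

  // Group by the second column, as in a.groupBy(_._2), then reduce each
  // group's rows to a single statistic (here, the mean of the values
  // that func extracts).
  val res: Map[String, Double] = a.groupBy(_._2).map { case (k, v) =>
    val xs = v.map(func)
    k -> xs.sum / xs.size
  }
}
```

On an RDD the same shape should apply: a.groupBy(_._2) yields
RDD[(K, Iterable[Row])], and the closure passed to map can fold each
group's Iterable into a statistic exactly as above.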



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-calculate-row-by-now-and-output-retults-in-Spark-tp25122.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: How to calculate row by row and output results in Spark

2015-10-19 Thread Adrian Tanase
Are you by any chance looking for reduceByKey? If you're trying to collapse
all the values in V into an aggregate, that's what you should be looking at.

-adrian
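A minimal sketch of the reduceByKey pattern, written here as a helper over
plain Scala pairs (the sample data is made up); Spark's rdd.reduceByKey
takes the same (V, V) => V merge function:

```scala
object ReduceByKeyDemo {
  // Plain-collections stand-in for Spark's rdd.reduceByKey(f): group the
  // pairs by key, then fold each group's values with f. On a real RDD,
  // reduceByKey also combines map-side, so no full group is materialized
  // the way it is with groupBy.
  def reduceByKey[K, V](pairs: List[(K, V)])(f: (V, V) => V): Map[K, V] =
    pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(f) }

  // Made-up (key, value) pairs, e.g. one column of the four-column rows.
  val pairs = List(("a", 2.0), ("a", 4.0), ("b", 6.0))

  // Collapse all values per key into an aggregate (here, a sum).
  val sums: Map[String, Double] = reduceByKey(pairs)(_ + _)
}
```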

From: Ted Yu
Date: Monday, October 19, 2015 at 9:16 PM
To: Shepherd
Cc: user
Subject: Re: How to calculate row by row and output results in Spark

Under core/src/test/scala/org/apache/spark , you will find a lot of examples 
for map function.

FYI

On Mon, Oct 19, 2015 at 10:35 AM, Shepherd wrote:
Hi all, I am new to Spark and Scala. I have a question about doing a
calculation. I am using "groupBy" to generate key-value pairs, where each
value points to a subset of the original RDD. The RDD has four columns, and
each subset may have a different number of rows. The original code looks
like this:

    val b = a.groupBy(_._2)
    val res = b.map { case (k, v) => v.map(func) }

Here, I don't know how to write func. I have to process each row in v and
calculate a statistic. How can I do that? And how can I write the function
inside map? Thanks a lot.
