Re: how to specify columns in groupby

2014-08-29 Thread MEETHU MATHEW
Thank you, Yanbo, for the reply.

I have another query related to cogroup. I want to iterate over the results of
a cogroup operation.

My code is:
* grp = RDD1.cogroup(RDD2)
* map((lambda (x,y): (x,list(y[0]),list(y[1]))), list(grp))
My result looks like:

[((u'764', u'20140826'), [0.70146274566650391], [ ]),
 ((u'863', u'20140826'), [0.368011474609375], [ ]),
 ((u'9571520', u'20140826'), [0.0046129226684570312], [0.60009])]
 
When I do one more cogroup operation like 

grp1 = grp.cogroup(RDD3)

I am not able to see the results. All my RDDs are of the form ((x,y),z). Can
somebody help me solve this?

Thanks & Regards, 
Meethu M
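
A plain-Python sketch (not Spark itself) of the shape a second cogroup produces, which may be why the result is hard to inspect. The helper `cogroup` is a hypothetical local stand-in that works on lists of (key, value) pairs; the sample values are shortened from the output above, and `rdd3` is made up for illustration.

```python
from collections import defaultdict

def cogroup(left, right):
    """Hypothetical local stand-in for cogroup on lists of (key, value) pairs."""
    grouped = defaultdict(lambda: ([], []))
    for key, value in left:
        grouped[key][0].append(value)
    for key, value in right:
        grouped[key][1].append(value)
    return sorted(grouped.items())

# All inputs have the form ((x, y), z), as in the question.
rdd1 = [(("764", "20140826"), 0.70), (("9571520", "20140826"), 0.0046)]
rdd2 = [(("9571520", "20140826"), 0.60009)]
rdd3 = [(("764", "20140826"), 1.5)]  # made-up third data set

grp = cogroup(rdd1, rdd2)
grp1 = cogroup(grp, rdd3)

# After the second cogroup each value is ([(values1, values2)], values3),
# so the first element has to be unpacked one level further.
flat = [
    (key,
     [v for pair in left for v in pair[0]],
     [v for pair in left for v in pair[1]],
     right)
    for key, (left, right) in grp1
]

for row in flat:
    print(row)
```

In PySpark itself, cogroup is a lazy transformation and the grouped values come back as iterables rather than lists, so an action such as collect() plus a list(...) conversion is needed before anything readable is printed.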


On Thursday, 28 August 2014 5:59 PM, Yanbo Liang  wrote:
 


For your reference:

val d1 = textFile.map(line => {
  val fields = line.split(",")
  ((fields(0), fields(1)), fields(2).toDouble)
})

val d2 = d1.reduceByKey(_+_)
d2.foreach(println)




2014-08-28 20:04 GMT+08:00 MEETHU MATHEW :

> Hi all,
>
> I have an RDD which has values in the format "id,date,cost".
>
> I want to group the elements based on the id and date columns and get the
> sum of the cost for each group.
>
> Can somebody tell me how to do this?
>
>
> Thanks & Regards,
> Meethu M

Re: how to specify columns in groupby

2014-08-28 Thread Yanbo Liang
For your reference:

val d1 = textFile.map(line => {
  val fields = line.split(",")
  ((fields(0), fields(1)), fields(2).toDouble)
})

val d2 = d1.reduceByKey(_+_)
d2.foreach(println)
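
For readers working in Python, here is a plain-Python sketch of what the map plus reduceByKey above computes: one running sum of cost per (id, date) key. It runs without Spark, and the sample rows are made up for illustration.

```python
from collections import defaultdict

# Made-up sample rows in the "id,date,cost" format from the question.
lines = [
    "764,20140826,0.5",
    "764,20140826,0.25",
    "863,20140826,1.0",
]

totals = defaultdict(float)
for line in lines:
    fields = line.split(",")
    key = (fields[0], fields[1])      # composite key: (id, date)
    totals[key] += float(fields[2])   # same role as reduceByKey(_+_)

result = dict(totals)
for key, cost in sorted(result.items()):
    print(key, cost)
```

The point to carry over is the composite key: putting id and date together in a tuple is what makes the grouping work on both columns at once.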


2014-08-28 20:04 GMT+08:00 MEETHU MATHEW :

> Hi all,
>
> I have an RDD which has values in the format "id,date,cost".
>
> I want to group the elements based on the id and date columns and get the
> sum of the cost for each group.
>
> Can somebody tell me how to do this?
>
>
> Thanks & Regards,
> Meethu M
>