Re: Efficient Aggregation over DB data

2014-05-01 Thread Andrea Esposito
Hi Sai, I honestly can't figure out where you are using RDDs (the split method isn't defined on them). In any case, you should use the map function instead of foreach: the function you pass to foreach is NOT idempotent, and some partitions could be recomputed, which would execute it multiple times.
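
A minimal sketch of the map-based version, using plain Scala collections rather than RDDs. The sample `input` string and the `total` aggregate are made up for illustration; `add` is the column extractor from the original post. An RDD's `map` would take the same function, though that is not shown here.

```scala
// Hypothetical sample input: whitespace-separated rows, numeric third column.
val input = "a b 10\nc d 20\ne f 30"

// Pure function: extracts the third column (index 2) of a row as an Int.
val add = (x: String) => x.split("\\s+")(2).toInt

// map builds a new collection instead of mutating a shared variable,
// so evaluating it again on the same input yields the same result.
val result: List[Int] = input.split("\n").toList.map(add)

// One possible aggregate over the extracted column.
val total = result.sum
```

Unlike the foreach version, which prepends with `::=` and so ends up with the rows in reverse order, map keeps the values in input order.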

Efficient Aggregation over DB data

2014-04-22 Thread Sai Prasanna
Hi All, I want to access a particular column of a DB table stored in CSV format and perform some aggregate queries over it. I wrote the following query in Scala as a first step:

    var add = (x: String) => x.split("\\s+")(2).toInt
    var result = List[Int]()
    input.split("\\n").foreach(x => result ::= add(x))