You could do a mapPartitions (lapplyPartition in SparkR) and call R's Filter within each partition. Something like:
numbers <- parallelize(sc, 1:20, 4L)
evenNumbers <- lapplyPartition(numbers, function(part) {
  Filter(function(x) { x %% 2 == 0 }, part)
})
collect(evenNumbers)
It should be simple to add this to the SparkR API as well. Let me
know if you want to send a PR!
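
As a rough sketch of what such a helper might look like: the names filterPartition and isEven below are made up for illustration and are not part of SparkR. To keep the snippet runnable without a Spark context, a plain list of lists stands in for an RDD with 4 partitions.

```r
# Hypothetical helper (not part of SparkR): given a predicate, return a
# function that keeps only the elements of one partition satisfying it.
filterPartition <- function(pred) {
  function(part) Filter(pred, part)
}

isEven <- function(x) x %% 2 == 0

# Local stand-in for an RDD: 1:20 split into 4 partitions of 5 elements.
parts <- split(as.list(1:20), rep(1:4, each = 5))

# Apply the filter to each partition, then flatten, as collect() would.
evenParts <- lapply(parts, filterPartition(isEven))
evens <- unlist(evenParts, use.names = FALSE)
print(evens)
```

Assuming lapplyPartition hands each partition to the function as a list, the same closure should work on a real RDD as lapplyPartition(numbers, filterPartition(isEven)).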
Thanks
Shivaram
On Wed, Jan 29, 2014 at 5:45 PM, Justin Lent <[email protected]> wrote:
> Any idea when the Scala filter() function equivalent will be available
> in the SparkR implementation? Or is there a simple way to implement it
> in R with the existing functions?
>
> Thanks!
> -Justin