I wonder whether a generic monad-style container would help here: one
that wraps either an RDD or a DStream and provides map, flatMap,
foreach and filter methods.

case class DataMonad[A](data: A) {
  def map[B](f: A => B): DataMonad[B] = {
    DataMonad(f(data))
  }
  def flatMap[B](f: A => DataMonad[B]): DataMonad[B] = {
    f(data)
  }
  def foreach ...
  def withFilter ...
  :
  :
  // etc., something like that
}
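
To make the idea a bit more concrete, here is a minimal sketch of one
way the wrapper could abstract over the underlying container instead of
holding a single value. Everything in it is an assumption for
illustration: DataOps is a hypothetical type class, and List stands in
for RDD/DStream so the snippet runs without Spark. One caveat I am
aware of: the real RDD.flatMap takes a function that returns a local
collection rather than another RDD, so a genuine RDD instance would
probably need a different flatMap signature than the one shown here.

```scala
// Hypothetical type class: the operations a wrapped container must support.
trait DataOps[F[_]] {
  def map[A, B](fa: F[A])(f: A => B): F[B]
  def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  def filter[A](fa: F[A])(p: A => Boolean): F[A]
  def foreach[A](fa: F[A])(f: A => Unit): Unit
}

object DataOps {
  // Stand-in instance: List plays the role of RDD or DStream here.
  implicit val listOps: DataOps[List] = new DataOps[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
    def flatMap[A, B](fa: List[A])(f: A => List[B]): List[B] = fa.flatMap(f)
    def filter[A](fa: List[A])(p: A => Boolean): List[A] = fa.filter(p)
    def foreach[A](fa: List[A])(f: A => Unit): Unit = fa.foreach(f)
  }
}

// The wrapper exposes the four methods that for-comprehensions desugar to
// and delegates each one to the DataOps instance for F.
case class DataMonad[F[_], A](data: F[A])(implicit ops: DataOps[F]) {
  def map[B](f: A => B): DataMonad[F, B] =
    DataMonad(ops.map(data)(f))

  def flatMap[B](f: A => DataMonad[F, B]): DataMonad[F, B] =
    DataMonad(ops.flatMap(data)(a => f(a).data))

  def withFilter(p: A => Boolean): DataMonad[F, A] =
    DataMonad(ops.filter(data)(p))

  def foreach(f: A => Unit): Unit =
    ops.foreach(data)(f)
}

object Demo {
  // Keep the even inputs, expand each into two values, then add one.
  val result: DataMonad[List, Int] = for {
    x <- DataMonad(List(1, 2, 3, 4))
    if x % 2 == 0
    y <- DataMonad(List(x, x * 10))
  } yield y + 1
}
```

With a wrapper along these lines, code written as a for-comprehension
would not need to know which container it runs against, as long as a
DataOps instance exists for it.
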
On Wed, Dec 18, 2013 at 10:42 PM, Reynold Xin <[email protected]> wrote:
>
> On Wed, Dec 18, 2013 at 12:17 PM, Nathan Kronenfeld <
> [email protected]> wrote:
>
>>
>>
>> Since many of the functions exist in parallel between the two, I guess I
>> would expect something like:
>>
>> trait BasicRDDFunctions {
>> def map...
>> def reduce...
>> def filter...
>> def foreach...
>> }
>>
>> class RDD extends BasicRDDFunctions...
>> class DStream extends BasicRDDFunctions...
>>
>
> I like this idea. We should discuss more about it on the dev list. It
> would require refactoring some APIs, but does lead to better unification.
>
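
For comparison, the shared-trait approach quoted above might look
roughly like the sketch below. The names BasicFunctions, FakeRDD and
FakeDStream are hypothetical local stand-ins, not Spark classes, and
the signatures are simplified. In particular, as far as I know the real
DStream.reduce returns a DStream rather than a plain value, so the
actual APIs would not line up this neatly without the refactoring
mentioned above.

```scala
// Hypothetical shared interface; Repr is the concrete collection type,
// so that map and filter return the same kind of collection they ran on.
trait BasicFunctions[A, Repr[_]] {
  def map[B](f: A => B): Repr[B]
  def filter(p: A => Boolean): Repr[A]
  def reduce(f: (A, A) => A): A
  def foreach(f: A => Unit): Unit
}

// Local stand-in for an RDD: a single batch of items.
final case class FakeRDD[A](items: Vector[A])
    extends BasicFunctions[A, FakeRDD] {
  def map[B](f: A => B): FakeRDD[B] = FakeRDD(items.map(f))
  def filter(p: A => Boolean): FakeRDD[A] = FakeRDD(items.filter(p))
  def reduce(f: (A, A) => A): A = items.reduce(f)
  def foreach(f: A => Unit): Unit = items.foreach(f)
}

// Local stand-in for a DStream: a finite list of batches.
final case class FakeDStream[A](batches: List[Vector[A]])
    extends BasicFunctions[A, FakeDStream] {
  def map[B](f: A => B): FakeDStream[B] = FakeDStream(batches.map(_.map(f)))
  def filter(p: A => Boolean): FakeDStream[A] =
    FakeDStream(batches.map(_.filter(p)))
  // Simplification: reduces across all batches to a single value.
  def reduce(f: (A, A) => A): A = batches.flatten.reduce(f)
  def foreach(f: A => Unit): Unit = batches.foreach(_.foreach(f))
}

object SharedCode {
  // Written once against the trait; works for either stand-in.
  def sumOfEvenSquares[R[X] <: BasicFunctions[X, R]](xs: R[Int]): Int =
    xs.filter(_ % 2 == 0).map(x => x * x).reduce(_ + _)
}
```

The trade-off compared with a wrapper is that the trait has to be mixed
into the classes themselves, which is where the API refactoring comes
in.
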