+1

We could maintain a contrib component, much as Akka and many other projects 
do.  With the help of Scala implicits, extensions to the RDD API can be added 
in a non-intrusive way that leaves spark-core untouched.
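
For example, a minimal sketch of the pattern (RichRDD and tapEach are 
hypothetical names, purely for illustration):

import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object ContribRDDFunctions {
  // The method is added through an implicit class, so spark-core stays
  // untouched; callers opt in with an explicit import.
  implicit class RichRDD[T: ClassTag](self: RDD[T]) {
    // Runs a side effect on each element and returns an RDD with the
    // same elements.
    def tapEach(f: T => Unit): RDD[T] = self.map { x => f(x); x }
  }
}

// Usage:
//   import ContribRDDFunctions._
//   rdd.tapEach(println).count()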

On Feb 23, 2014, at 2:06 PM, Mridul Muralidharan <mri...@gmail.com> wrote:

> Hi,
> 
>  Over the past few months, I have seen a bunch of pull requests which have
> extended the Spark API ... most commonly RDD itself.
> 
> Most of them are either relatively niche cases of specialization (which
> might not be useful in most situations) or idioms that can be expressed
> (sometimes with a minor perf penalty) using the existing API.
> 
> While all of them have non-zero value (hence the effort to contribute, and
> gladly welcomed!), they extend the API in nontrivial ways and carry a
> maintenance cost ... and we already have a pending effort to clean up our
> interfaces prior to 1.0.
> 
> I believe there is a need to keep the exposed API succinct, expressive and
> functional in Spark, while at the same time encouraging extensions and
> specialization within the Spark codebase so that other users can benefit
> from the shared contributions.
> 
> One approach could be to start something akin to Piggybank in Pig to
> contribute user-generated specializations, helper utils, etc.: bundled as
> part of Spark, but not part of core itself.
> 
> Thoughts, comments ?
> 
> Regards,
> Mridul
