I'm also curious: what will the vetting process be for this spark-contrib
code? Does inclusion in spark-contrib mean that it has received some sort
of review and official blessing, or is contrib just a dumping ground for
code of questionable quality, utility, maintenance, etc.?


On Sat, Feb 22, 2014 at 10:23 PM, Amandeep Khurana <ama...@gmail.com> wrote:

> Mridul,
>
> Can you give examples of APIs that people have contributed (or wanted
> to contribute) but that you would categorize as belonging in a
> piggybank-like library (sparkbank)? Curious to know how you'd decide
> what should go where.
>
> Amandeep
>
> > On Feb 22, 2014, at 10:06 PM, Mridul Muralidharan <mri...@gmail.com> wrote:
> >
> > Hi,
> >
> > Over the past few months, I have seen a bunch of pull requests which
> > have extended the Spark API ... most commonly RDD itself.
> >
> > Most of them are either relatively niche cases of specialization (which
> > might not be useful in most cases) or idioms which can be expressed
> > (sometimes with a minor perf penalty) using the existing API.
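> >
> > For example (a hypothetical sketch; filterNot is invented here to
> > illustrate the point, it is not one of the actual pull requests): a
> > proposed RDD.filterNot would widen the API surface, yet it is a
> > one-liner over the existing filter:
> >
> >   import org.apache.spark.rdd.RDD
> >
> >   // hypothetical helper of the kind such PRs propose, expressed
> >   // entirely via the existing filter API
> >   object RDDIdioms {
> >     def filterNot[T](rdd: RDD[T])(f: T => Boolean): RDD[T] =
> >       rdd.filter(x => !f(x))
> >   }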
> >
> > While all of them have non-zero value (hence the effort to contribute,
> > which is gladly welcomed!), they extend the API in nontrivial ways and
> > carry a maintenance cost ... and we already have a pending effort to
> > clean up our interfaces prior to 1.0.
> >
> > I believe there is a need to keep Spark's exposed API succinct,
> > expressive and functional, while at the same time encouraging
> > extensions and specialization within the Spark codebase so that other
> > users can benefit from the shared contributions.
> >
> > One approach could be to start something akin to Piggybank in Pig to
> > collect user-generated specializations, helper utils, etc.: bundled as
> > part of Spark, but not part of core itself.
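> >
> > Concretely (a sketch only; the "SparkBank" object and countDistinct
> > helper are hypothetical names), such a module could enrich RDD from
> > outside core with implicit conversions, similar to how core already
> > layers on PairRDDFunctions:
> >
> >   import org.apache.spark.rdd.RDD
> >
> >   // hypothetical bundled-but-not-core module
> >   object SparkBank {
> >     implicit class RichRDD[T](self: RDD[T]) {
> >       // convenience idiom layered on existing distinct + count
> >       def countDistinct(): Long = self.distinct().count()
> >     }
> >   }
> >
> > Users would opt in with "import SparkBank._" and call
> > rdd.countDistinct(), while the core RDD stays untouched.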
> >
> > Thoughts, comments?
> >
> > Regards,
> > Mridul
>
