Mridul,

Can you give examples of APIs that people have contributed (or wanted
to contribute) but that you would categorize as belonging in a
piggybank-like module (sparkbank)? Curious to know how you'd decide what
should go where.

Amandeep

> On Feb 22, 2014, at 10:06 PM, Mridul Muralidharan <mri...@gmail.com> wrote:
>
> Hi,
>
>  Over the past few months, I have seen a bunch of pull requests that have
> extended the Spark API ... most commonly RDD itself.
>
> Most of them are either relatively niche cases of specialization (which
> might not be useful in most situations) or idioms that can be expressed
> (sometimes with a minor perf penalty) using the existing API.
>
> While all of them have nonzero value (hence the effort to contribute, which
> is gladly welcomed!), they extend the API in nontrivial ways and carry a
> maintenance cost ... and we already have a pending effort to clean up our
> interfaces prior to 1.0.
>
> I believe there is a need to keep the exposed API succinct, expressive, and
> functional in Spark, while at the same time encouraging extensions and
> specialization within the Spark codebase so that other users can benefit
> from the shared contributions.
>
> One approach could be to start something akin to piggybank in Pig for
> contributing user-generated specializations, helper utils, etc.: bundled as
> part of Spark, but not part of core itself.
>
> Thoughts, comments?
>
> Regards,
> Mridul