@Holden, looking forward to the blog post - I think a user guide PR based on it would also be super useful :)
On Fri, 18 Nov 2016 at 05:29 Holden Karau <holden.ka...@gmail.com> wrote:

> I've been working on a blog post around this and hope to have it published
> early next month 😀
>
> On Nov 17, 2016 10:16 PM, "Joseph Bradley" <jos...@databricks.com> wrote:
>
> Hi Georg,
>
> It's true we need better documentation for this. I'd recommend checking
> out simple algorithms within Spark for examples:
> ml.feature.Tokenizer
> ml.regression.IsotonicRegression
>
> You should not need to put your library in Spark's namespace. The shared
> Params in SPARK-7146 are not necessary to create a custom algorithm; they
> are just niceties.
>
> Though there aren't great docs yet, you should be able to follow existing
> examples. And I'd like to add more docs in the future!
>
> Good luck,
> Joseph
>
> On Wed, Nov 16, 2016 at 6:29 AM, Georg Heiler <georg.kf.hei...@gmail.com>
> wrote:
>
> Hi,
>
> I want to develop a library with custom Estimators / Transformers for
> Spark. So far, not much documentation can be found beyond
> http://stackoverflow.com/questions/37270446/how-to-roll-a-custom-estimator-in-pyspark-mllib
> which suggests:
>
> Generally speaking, there is no documentation because, as of Spark 1.6 /
> 2.0, most of the related API is not intended to be public. That should
> change in Spark 2.1.0 (see SPARK-7146
> <https://issues.apache.org/jira/browse/SPARK-7146>).
>
> Where can I already find documentation today?
> Is it true that my library would have to reside in Spark's namespace,
> similar to https://github.com/collectivemedia/spark-ext, to use all
> the handy functionality?
>
> Kind Regards,
> Georg