Re: [mllib] State of Multi-Model training

Kyle Ellrott Tue, 16 Sep 2014 21:43:02 -0700

I'd be interested in helping to test your code as soon as it's available.
The version I wrote used a paired RDD and combined by key; it worked best
with a custom partitioner that put all the samples for a key in the same
partition.
Running things in batched matrices would probably speed things up greatly.
You probably won't need my training code, but I did write some stuff
related to calculating binary classification metrics (
https://github.com/apache/spark/pull/1292/files#diff-6) and AUC (
https://github.com/apache/spark/pull/1292/files#diff-5) for multiple models
that you might be able to use.
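
To make the keyed approach concrete, here is a minimal sketch in plain Python
rather than Spark, and it is not the code from the pull request. It assumes
predictions arrive as (model_id, (score, label)) pairs, groups them by key the
way a combine-by-key on a paired RDD would, and then computes one AUC per
model. The names combine_by_key, auc and auc_per_model are hypothetical, and
the pairwise AUC is quadratic in the number of samples, so it is only meant
for illustration.

```python
from collections import defaultdict


def combine_by_key(pairs):
    """Group (model_id, (score, label)) pairs by model_id.

    Plain-Python stand-in for a keyed combine on a paired RDD (assumption:
    this is an illustration, not Spark's API).
    """
    grouped = defaultdict(list)
    for model_id, score_label in pairs:
        grouped[model_id].append(score_label)
    return dict(grouped)


def auc(score_labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair, which matches the trapezoidal area
    under the ROC curve.
    """
    positives = [s for s, y in score_labels if y == 1]
    negatives = [s for s, y in score_labels if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def auc_per_model(pairs):
    """Compute one AUC per model id from interleaved predictions."""
    return {m: auc(samples) for m, samples in combine_by_key(pairs).items()}


# Hypothetical predictions from two models, interleaved as they might be
# before the samples for each key are brought together.
predictions = [
    ("model_a", (0.9, 1)), ("model_b", (0.2, 1)),
    ("model_a", (0.1, 0)), ("model_b", (0.8, 0)),
    ("model_a", (0.7, 1)), ("model_b", (0.6, 1)),
    ("model_a", (0.4, 0)), ("model_b", (0.3, 0)),
]

print(auc_per_model(predictions))
```

Grouping by model id first is what makes a partitioner that co-locates each
key's samples pay off: every per-model metric can then be computed without
further shuffling.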


Kyle


On Tue, Sep 16, 2014 at 4:09 PM, Burak Yavuz <[email protected]> wrote:

> Hi Kyle,
>
> I'm actively working on it now. It's pretty close to completion; I'm just
> trying to figure out bottlenecks and optimize as much as possible.
> As Phase 1, I implemented multi-model training on Gradient Descent.
> Instead of performing Vector-Vector operations on rows (examples) and
> weights,
> I've batched them into matrices so that we can use Level 3 BLAS to speed
> things up. I've also added support for Sparse Matrices (
> https://github.com/apache/spark/pull/2294) as making use of sparsity will
> allow you to train more models at once.
>
> Best,
> Burak
>
> ----- Original Message -----
> From: "Kyle Ellrott" <[email protected]>
> To: [email protected]
> Sent: Tuesday, September 16, 2014 3:21:53 PM
> Subject: [mllib] State of Multi-Model training
>
> I'm curious about the state of development of multi-model learning in MLlib
> (training sets of models during the same training session, rather than one
> at a time). The JIRA lists it as in progress, targeting Spark 1.2.0 (
> https://issues.apache.org/jira/browse/SPARK-1486 ), but there haven't been
> any notes on it in over a month.
> I submitted a pull request for a possible method to do this work a little
> over two months ago (https://github.com/apache/spark/pull/1292), but
> haven't received any feedback on the patch yet.
> Is anybody else working on multi-model training?
>
> Kyle
>
>
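
The batching described in the quoted message, replacing per-row vector-vector
operations with matrix operations, can be sketched as follows. This is a
plain-Python illustration under stated assumptions and not the MLlib
implementation: it uses a least-squares loss, all models share the same
examples X and targets y and differ only in their weights, and the
hypothetical helper matmul stands in for a Level 3 BLAS matrix-matrix
multiply.

```python
def matmul(a, b):
    """Plain-Python matrix product; stands in for a Level 3 BLAS call."""
    inner = len(b)
    assert all(len(row) == inner for row in a)
    cols = len(b[0])
    return [[sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
            for row in a]


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def dot(u, v):
    """Vector-vector product, the per-row operation being replaced."""
    return sum(x * y for x, y in zip(u, v))


def gradients_one_at_a_time(X, y, weights_per_model):
    """Least-squares gradient per model using vector-vector ops on each row."""
    n = len(X)
    grads = []
    for w in weights_per_model:
        g = [0.0] * len(w)
        for row, target in zip(X, y):
            err = dot(row, w) - target
            for j, x in enumerate(row):
                g[j] += err * x / n
        grads.append(g)
    return grads


def gradients_batched(X, y, weights_per_model):
    """Same gradients for all models via two matrix-matrix products."""
    n = len(X)
    W = transpose(weights_per_model)             # d x k, one column per model
    predictions = matmul(X, W)                   # n x k
    errors = [[p - t for p in row] for row, t in zip(predictions, y)]
    G = matmul(transpose(X), errors)             # d x k
    return [[g / n for g in col] for col in transpose(G)]  # k x d


# Hypothetical data: two examples, two features, two models.
X = [[1.0, 2.0], [3.0, 4.0]]
y = [1.0, 2.0]
weights = [[0.0, 0.0], [1.0, -1.0]]

print(gradients_one_at_a_time(X, y, weights))
print(gradients_batched(X, y, weights))
```

Both functions compute the same gradients; the batched form simply expresses
the work for all models as matrix-matrix products, which is the shape of
computation that an optimized BLAS library can accelerate.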
