That is an option too. Implementing convolutions with FFTs (Mathieu et al., "Fast Training of Convolutional Networks through FFTs") should be considered as well: http://arxiv.org/pdf/1312.5851.pdf
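The gist of that paper is the convolution theorem: convolution in the spatial domain becomes pointwise multiplication in the frequency domain, so the cost of applying large filters reduces to the cost of the transforms. A minimal sketch of the idea in plain Scala, using a naive O(n^2) DFT purely for illustration (a real implementation would use a proper FFT library; none of this is the paper's code):

```scala
object FftConvSketch {
  // Naive O(n^2) DFT for illustration only; a real implementation
  // would use an FFT (e.g. JTransforms or breeze.signal).
  def dft(x: Array[Double]): (Array[Double], Array[Double]) = {
    val n = x.length
    val re = new Array[Double](n)
    val im = new Array[Double](n)
    for (k <- 0 until n; t <- 0 until n) {
      val ang = -2.0 * math.Pi * k * t / n
      re(k) += x(t) * math.cos(ang)
      im(k) += x(t) * math.sin(ang)
    }
    (re, im)
  }

  // Inverse DFT, keeping only the real part (our signals are real).
  def idft(re: Array[Double], im: Array[Double]): Array[Double] = {
    val n = re.length
    Array.tabulate(n) { t =>
      var acc = 0.0
      for (k <- 0 until n) {
        val ang = 2.0 * math.Pi * k * t / n
        acc += re(k) * math.cos(ang) - im(k) * math.sin(ang)
      }
      acc / n
    }
  }

  // Circular convolution via the convolution theorem:
  // conv(x, h) = IDFT(DFT(x) .* DFT(h)), a pointwise complex product.
  def fftConv(x: Array[Double], h: Array[Double]): Array[Double] = {
    val (xr, xi) = dft(x)
    val (hr, hi) = dft(h)
    val re = Array.tabulate(x.length)(k => xr(k) * hr(k) - xi(k) * hi(k))
    val im = Array.tabulate(x.length)(k => xr(k) * hi(k) + xi(k) * hr(k))
    idft(re, im)
  }

  def main(args: Array[String]): Unit = {
    val x = Array(1.0, 2.0, 3.0, 4.0)
    val h = Array(0.5, 0.25, 0.0, 0.0) // filter zero-padded to signal length
    println(fftConv(x, h).mkString(", "))
  }
}
```

Note this yields circular convolution; zero-padding the signal and filter to a common length recovers the linear convolution used in convnets.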
From: Feynman Liang [mailto:fli...@databricks.com]
Sent: Tuesday, September 08, 2015 12:07 PM
To: Ulanov, Alexander
Cc: Ruslan Dautkhanov; Nick Pentreath; user; na...@yandex.ru
Subject: Re: Spark ANN

Just wondering, why do we need tensors? Is the implementation of convnets using im2col (see http://cs231n.github.io/convolutional-networks/) insufficient?

On Tue, Sep 8, 2015 at 11:55 AM, Ulanov, Alexander <alexander.ula...@hpe.com> wrote:

Ruslan, thanks for including me in the discussion!

Dropout and other features such as Autoencoder were implemented, but not merged yet, in order to leave room for improving the internal Layer API. For example, there is ongoing work on a convolutional layer that consumes/outputs 2D arrays. We'll probably need to change the Layer's input/output type to tensors, which will affect dropout too: it will need some refactoring to handle tensors. Also, all new components should have an ML pipeline public interface. There is an umbrella issue for deep learning in Spark, https://issues.apache.org/jira/browse/SPARK-5575, which includes various features, in particular the Autoencoder work in https://issues.apache.org/jira/browse/SPARK-10408. You are very welcome to join and contribute, since there is a lot of work to be done.

Best regards, Alexander

From: Ruslan Dautkhanov [mailto:dautkha...@gmail.com]
Sent: Monday, September 07, 2015 10:09 PM
To: Feynman Liang
Cc: Nick Pentreath; user; na...@yandex.ru
Subject: Re: Spark ANN

Found a dropout commit from avulanov: https://github.com/avulanov/spark/commit/3f25e26d10ef8617e46e35953fe0ad1a178be69d

It probably hasn't made its way into MLlib (yet?).

-- Ruslan Dautkhanov

On Mon, Sep 7, 2015 at 8:34 PM, Feynman Liang <fli...@databricks.com> wrote:

Unfortunately, not yet... Deep learning support (autoencoders, RBMs) is on the roadmap for 1.6 (https://issues.apache.org/jira/browse/SPARK-10324), though, and there is a Spark package (http://spark-packages.org/package/rakeshchalasani/MLlib-dropout) for dropout-regularized logistic regression.

On Mon, Sep 7, 2015 at 3:15 PM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:

Thanks! It doesn't look like Spark ANN supports dropout/dropconnect or any other techniques that help avoid overfitting yet?

http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf
https://cs.nyu.edu/~wanli/dropc/dropc.pdf

P.S. There is a small copy-paste typo in https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/BreezeUtil.scala#L43: it should read B&C :)

-- Ruslan Dautkhanov

On Mon, Sep 7, 2015 at 12:47 PM, Feynman Liang <fli...@databricks.com> wrote:

Backprop is used to compute the gradient here: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala#L579-L584, which is then optimized by SGD or LBFGS here: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala#L878

On Mon, Sep 7, 2015 at 11:24 AM, Nick Pentreath <nick.pentre...@gmail.com> wrote:

Haven't checked the actual code, but that doc says "MLPC employs backpropagation for learning the model..."?
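On the dropout papers linked above, the training-time operation itself is only a few lines. A generic sketch of the "inverted" variant in Scala, for illustration only (this is not the code from avulanov's commit):

```scala
import scala.util.Random

object DropoutSketch {
  // "Inverted" dropout: at training time, zero each activation with
  // probability p and scale the survivors by 1/(1-p), so the expected
  // activation is unchanged and no rescaling is needed at test time.
  def dropout(activations: Array[Double], p: Double, rng: Random): Array[Double] =
    activations.map(a => if (rng.nextDouble() < p) 0.0 else a / (1.0 - p))

  def main(args: Array[String]): Unit = {
    val rng = new Random(42)
    val hidden = Array(0.8, 0.1, 0.5, 0.9, 0.3)
    println(dropout(hidden, 0.5, rng).mkString(", "))
  }
}
```

DropConnect, from the second paper, zeroes individual weights rather than whole activations; in both cases the unmodified forward pass is used at test time.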
On Mon, Sep 7, 2015 at 8:18 PM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:

http://people.apache.org/~pwendell/spark-releases/latest/ml-ann.html

The implementation seems to be missing backpropagation? Was there a good reason to omit BP? What are the drawbacks of a pure feedforward-only ANN?

Thanks!

-- Ruslan Dautkhanov
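To the original question: backpropagation is not an alternative network topology but the way gradients are computed to train a feedforward network, and MLPC does use it (see the Layer.scala links above). A minimal sketch of one backprop/SGD step for a tiny 2-3-1 sigmoid network with squared-error loss, in plain Scala, with all sizes and names purely illustrative (biases omitted for brevity):

```scala
object BackpropSketch {
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Forward pass through a 2-3-1 sigmoid network; returns (hidden, output).
  def forward(x: Array[Double], w1: Array[Array[Double]],
              w2: Array[Double]): (Array[Double], Double) = {
    val h = w1.map(row => sigmoid(dot(row, x)))
    (h, sigmoid(dot(h, w2)))
  }

  // One SGD step on squared-error loss: backprop is the chain rule applied
  // layer by layer, starting from the error at the output.
  def step(x: Array[Double], y: Double,
           w1: Array[Array[Double]], w2: Array[Double], lr: Double): Unit = {
    val (h, out) = forward(x, w1, w2)
    val dOut = (out - y) * out * (1 - out)                    // dL/dz, output
    val dH = w2.zip(h).map { case (w, hi) =>                  // dL/dz, hidden
      dOut * w * hi * (1 - hi)
    }
    // Gradient-descent weight updates.
    for (j <- w2.indices) w2(j) -= lr * dOut * h(j)
    for (i <- w1.indices; j <- x.indices) w1(i)(j) -= lr * dH(i) * x(j)
  }

  def main(args: Array[String]): Unit = {
    val rng = new scala.util.Random(0)
    val w1 = Array.fill(3, 2)(rng.nextGaussian() * 0.5)
    val w2 = Array.fill(3)(rng.nextGaussian() * 0.5)
    for (_ <- 1 to 2000) step(Array(1.0, 0.0), 1.0, w1, w2, lr = 0.5)
    println(forward(Array(1.0, 0.0), w1, w2)._2) // should be close to 1.0
  }
}
```

A network that is only ever run forward cannot learn at all; the feedforward pass defines the model, while backprop plus SGD or L-BFGS fits the weights, which is exactly the split in Layer.scala.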