That is an option too. Implementing convolutions with FFTs (Mathieu et al., "Fast Training of Convolutional Networks through FFTs") should be considered as well: http://arxiv.org/pdf/1312.5851.pdf
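The gist of that paper is the convolution theorem: convolution in the spatial domain becomes pointwise multiplication in the frequency domain, so the cost of applying large filters reduces to the cost of the transforms. A minimal sketch of the idea in plain Scala, using a naive O(n^2) DFT purely for illustration (a real implementation would use a proper FFT library; none of this is the paper's code):

```scala
object FftConvSketch {
  // Naive O(n^2) DFT for illustration only; a real implementation
  // would use an FFT (e.g. JTransforms or breeze.signal).
  def dft(x: Array[Double]): (Array[Double], Array[Double]) = {
    val n = x.length
    val re = new Array[Double](n)
    val im = new Array[Double](n)
    for (k <- 0 until n; t <- 0 until n) {
      val ang = -2.0 * math.Pi * k * t / n
      re(k) += x(t) * math.cos(ang)
      im(k) += x(t) * math.sin(ang)
    }
    (re, im)
  }

  // Inverse DFT, keeping only the real part (our signals are real).
  def idft(re: Array[Double], im: Array[Double]): Array[Double] = {
    val n = re.length
    Array.tabulate(n) { t =>
      var acc = 0.0
      for (k <- 0 until n) {
        val ang = 2.0 * math.Pi * k * t / n
        acc += re(k) * math.cos(ang) - im(k) * math.sin(ang)
      }
      acc / n
    }
  }

  // Circular convolution via the convolution theorem:
  // conv(x, h) = IDFT(DFT(x) .* DFT(h)), a pointwise complex product.
  def fftConv(x: Array[Double], h: Array[Double]): Array[Double] = {
    val (xr, xi) = dft(x)
    val (hr, hi) = dft(h)
    val re = Array.tabulate(x.length)(k => xr(k) * hr(k) - xi(k) * hi(k))
    val im = Array.tabulate(x.length)(k => xr(k) * hi(k) + xi(k) * hr(k))
    idft(re, im)
  }

  def main(args: Array[String]): Unit = {
    val x = Array(1.0, 2.0, 3.0, 4.0)
    val h = Array(0.5, 0.25, 0.0, 0.0) // filter zero-padded to signal length
    println(fftConv(x, h).mkString(", "))
  }
}
```

Note this yields circular convolution; zero-padding the signal and filter to a common length recovers the linear convolution used in convnets.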
From: Feynman Liang [mailto:fli...@databricks.com]
Sent: Tuesday, September 08, 2015 12:07 PM
To: Ulanov, Alexander
Cc: Ruslan Dautkhanov; Nick Pentreath; user; na...@yandex.ru
Subject: Re: Spark ANN

Just wondering, why do we need tensors? Is the implementation of convnets using im2col (see http://cs231n.github.io/convolutional-networks/) insufficient?

On Tue, Sep 8, 2015 at 11:55 AM, Ulanov, Alexander <alexander.ula...@hpe.com> wrote:

Ruslan, thanks for including me in the discussion!

Dropout and other features such as Autoencoder were implemented, but not merged yet, in order to leave room for improving the internal Layer API. For example, there is ongoing work on a convolutional layer that consumes/outputs 2D arrays. We'll probably need to change the Layer's input/output type to tensors, which will affect dropout too: it will need some refactoring to handle tensors. Also, all new components should have an ML pipeline public interface. There is an umbrella issue for deep learning in Spark, https://issues.apache.org/jira/browse/SPARK-5575, which includes various features, in particular the Autoencoder work in https://issues.apache.org/jira/browse/SPARK-10408. You are very welcome to join and contribute, since there is a lot of work to be done.

Best regards, Alexander

From: Ruslan Dautkhanov [mailto:dautkha...@gmail.com]
Sent: Monday, September 07, 2015 10:09 PM
To: Feynman Liang
Cc: Nick Pentreath; user; na...@yandex.ru
Subject: Re: Spark ANN

Found a dropout commit from avulanov: https://github.com/avulanov/spark/commit/3f25e26d10ef8617e46e35953fe0ad1a178be69d

It probably hasn't made its way into MLlib (yet?).

-- Ruslan Dautkhanov

On Mon, Sep 7, 2015 at 8:34 PM, Feynman Liang <fli...@databricks.com> wrote:

Unfortunately, not yet... Deep learning support (autoencoders, RBMs) is on the roadmap for 1.6 (https://issues.apache.org/jira/browse/SPARK-10324), though, and there is a Spark package (http://spark-packages.org/package/rakeshchalasani/MLlib-dropout) for dropout-regularized logistic regression.

On Mon, Sep 7, 2015 at 3:15 PM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:

Thanks! It doesn't look like Spark ANN supports dropout/dropconnect or any other techniques that help avoid overfitting yet?

http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf
https://cs.nyu.edu/~wanli/dropc/dropc.pdf

P.S. There is a small copy-paste typo in https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/BreezeUtil.scala#L43: it should read B&C :)

-- Ruslan Dautkhanov

On Mon, Sep 7, 2015 at 12:47 PM, Feynman Liang <fli...@databricks.com> wrote:

Backprop is used to compute the gradient here: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala#L579-L584, which is then optimized by SGD or LBFGS here: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala#L878

On Mon, Sep 7, 2015 at 11:24 AM, Nick Pentreath <nick.pentre...@gmail.com> wrote:

Haven't checked the actual code, but that doc says "MLPC employs backpropagation for learning the model..."?
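On the dropout papers linked above, the training-time operation itself is only a few lines. A generic sketch of the "inverted" variant in Scala, for illustration only (this is not the code from avulanov's commit):

```scala
import scala.util.Random

object DropoutSketch {
  // "Inverted" dropout: at training time, zero each activation with
  // probability p and scale the survivors by 1/(1-p), so the expected
  // activation is unchanged and no rescaling is needed at test time.
  def dropout(activations: Array[Double], p: Double, rng: Random): Array[Double] =
    activations.map(a => if (rng.nextDouble() < p) 0.0 else a / (1.0 - p))

  def main(args: Array[String]): Unit = {
    val rng = new Random(42)
    val hidden = Array(0.8, 0.1, 0.5, 0.9, 0.3)
    println(dropout(hidden, 0.5, rng).mkString(", "))
  }
}
```

DropConnect, from the second paper, zeroes individual weights rather than whole activations; in both cases the unmodified forward pass is used at test time.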
On Mon, Sep 7, 2015 at 8:18 PM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:

http://people.apache.org/~pwendell/spark-releases/latest/ml-ann.html

The implementation seems to be missing backpropagation? Was there a good reason to omit BP? What are the drawbacks of a pure feedforward-only ANN?

Thanks!

-- Ruslan Dautkhanov
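To the original question: backpropagation is not an alternative network topology but the way gradients are computed to train a feedforward network, and MLPC does use it (see the Layer.scala links above). A minimal sketch of one backprop/SGD step for a tiny 2-3-1 sigmoid network with squared-error loss, in plain Scala, with all sizes and names purely illustrative (biases omitted for brevity):

```scala
object BackpropSketch {
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Forward pass through a 2-3-1 sigmoid network; returns (hidden, output).
  def forward(x: Array[Double], w1: Array[Array[Double]],
              w2: Array[Double]): (Array[Double], Double) = {
    val h = w1.map(row => sigmoid(dot(row, x)))
    (h, sigmoid(dot(h, w2)))
  }

  // One SGD step on squared-error loss: backprop is the chain rule applied
  // layer by layer, starting from the error at the output.
  def step(x: Array[Double], y: Double,
           w1: Array[Array[Double]], w2: Array[Double], lr: Double): Unit = {
    val (h, out) = forward(x, w1, w2)
    val dOut = (out - y) * out * (1 - out)                    // dL/dz, output
    val dH = w2.zip(h).map { case (w, hi) =>                  // dL/dz, hidden
      dOut * w * hi * (1 - hi)
    }
    // Gradient-descent weight updates.
    for (j <- w2.indices) w2(j) -= lr * dOut * h(j)
    for (i <- w1.indices; j <- x.indices) w1(i)(j) -= lr * dH(i) * x(j)
  }

  def main(args: Array[String]): Unit = {
    val rng = new scala.util.Random(0)
    val w1 = Array.fill(3, 2)(rng.nextGaussian() * 0.5)
    val w2 = Array.fill(3)(rng.nextGaussian() * 0.5)
    for (_ <- 1 to 2000) step(Array(1.0, 0.0), 1.0, w1, w2, lr = 0.5)
    println(forward(Array(1.0, 0.0), w1, w2)._2) // should be close to 1.0
  }
}
```

A network that is only ever run forward cannot learn at all; the feedforward pass defines the model, while backprop plus SGD or L-BFGS fits the weights, which is exactly the split in Layer.scala.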