Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-68241121
I have compared the ANN with Support Vector Machine (SVM) and Logistic
Regression.
I have tested using a master local(5) configuration, and applied the
MNIST
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-67915148
@jkbradley @avulanov
Agree that we should refrain from adding too many options at this point in
time, and keep the implementation simple but robust
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-67584041
@bgreeven I have cloned your branch and am trying to run the MNIST
dataset.
I can't quite understand how to set the number of output neurons though
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-67585200
Addendum: notice that the ANNClassifier.train function has several
instances, and the number of nodes in the hidden layer(s) is quite critical.
Hence I would prefer
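The comment above notes that the number of hidden nodes is quite critical. One quick way to reason about a topology's capacity is to count its trainable weights, including one bias weight per node in each non-input layer. A minimal pure-Scala sketch (the function name is illustrative, not part of the PR's API):

```scala
// Count trainable weights in a fully connected network with a bias
// input to every non-input layer. `layerSizes` lists the number of
// (non-bias) nodes per layer, input layer first, output layer last.
def weightCount(layerSizes: Seq[Int]): Int =
  layerSizes.zip(layerSizes.tail).map { case (in, out) => (in + 1) * out }.sum

// MNIST-style topology: 784 inputs, one hidden layer of 32, 10 outputs.
val mnist32 = weightCount(Seq(784, 32, 10))  // (784+1)*32 + (32+1)*10
```

Doubling the hidden layer roughly doubles this count, which is one reason the hidden layer size drives both training cost and overfitting risk.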
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-67422794
@jkbradley @Lewuathe
Indeed I have been thinking about such an interface as well. I quite like it,
but...:
GradientDescent is private in MLlib, so you
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-67267236
@avulanov @jkbradley
The issue is that some optimisers use different parameters than others.
For example, LBFGS uses tolerance, whereas SGD has
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-67267487
@avulanov @jkbradley
An advantage of the string is that you can pass it as an opaque string from
the ANNClassifier class to the ArtificialNeuralNetwork class, i.e
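The pass-through idea above is that only the innermost class interprets the string; the outer class forwards it untouched. A minimal sketch of that style, where the "name,param,param" format is invented purely for illustration and is not the PR's convention:

```scala
// Hypothetical opaque config string: the outer class forwards it
// unparsed; only the inner class interprets the format. The formats
// "sgd,<iters>,<step>" and "lbfgs,<iters>,<tol>" are made up here.
def parseOptimizer(config: String): (String, Int, Double) =
  config.split(",") match {
    case Array("sgd", iters, step)  => ("sgd", iters.toInt, step.toDouble)
    case Array("lbfgs", iters, tol) => ("lbfgs", iters.toInt, tol.toDouble)
    case _ => throw new IllegalArgumentException(s"unknown optimizer config: $config")
  }
```

The trade-off, raised later in the thread, is that malformed strings fail only at runtime, which is what a typed alternative avoids.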
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-67281152
So you mean something as follows?
---
abstract class OptimizerInfo {}
case class OptimizerInfoSGD extends OptimizerInfo {
var noIterations
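The OptimizerInfo sketch quoted above could be completed as a small sealed hierarchy, one case class per optimiser carrying only the parameters that optimiser actually uses. The parameter names below are illustrative, not the PR's API:

```scala
// One case class per optimiser; each carries only its own parameters,
// so the "LBFGS has tolerance, SGD has step size" mismatch disappears.
sealed abstract class OptimizerInfo
case class OptimizerInfoSGD(noIterations: Int, stepSize: Double,
                            miniBatchFraction: Double) extends OptimizerInfo
case class OptimizerInfoLBFGS(noIterations: Int, tolerance: Double,
                              noCorrections: Int) extends OptimizerInfo

// Callers pattern-match, so adding an optimiser is a compile-checked change.
def describe(info: OptimizerInfo): String = info match {
  case OptimizerInfoSGD(n, step, frac) =>
    s"SGD: $n iterations, step $step, fraction $frac"
  case OptimizerInfoLBFGS(n, tol, corr) =>
    s"LBFGS: $n iterations, tolerance $tol, $corr corrections"
}
```

Unlike the opaque-string approach discussed earlier, a misconfigured optimiser fails at compile time rather than at runtime.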
Github user bgreeven commented on a diff in the pull request:
https://github.com/apache/spark/pull/1290#discussion_r21603916
--- Diff: docs/mllib-ann.md ---
@@ -0,0 +1,239 @@
+---
+layout: global
+title: Artificial Neural Networks - MLlib
+displayTitle: <a href="mllib
Github user bgreeven commented on a diff in the pull request:
https://github.com/apache/spark/pull/1290#discussion_r20620263
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/ann/ArtificialNeuralNetwork.scala
---
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-61749598
Let's discuss a bit more about making the optimiser, updater, gradient, and
error function customizable.
Notice that for the current LBFGS algorithm, the error
Github user bgreeven commented on a diff in the pull request:
https://github.com/apache/spark/pull/1290#discussion_r19722711
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/ann/ArtificialNeuralNetwork.scala
---
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache
Github user bgreeven commented on a diff in the pull request:
https://github.com/apache/spark/pull/1290#discussion_r19723239
--- Diff: docs/mllib-ann.md ---
@@ -0,0 +1,223 @@
+---
+layout: global
+title: Artificial Neural Networks - MLlib
+displayTitle: <a href="mllib
Github user bgreeven commented on a diff in the pull request:
https://github.com/apache/spark/pull/1290#discussion_r19733377
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/ann/ArtificialNeuralNetwork.scala
---
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache
Github user bgreeven commented on a diff in the pull request:
https://github.com/apache/spark/pull/1290#discussion_r19733442
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/ann/ArtificialNeuralNetwork.scala
---
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-56343909
Changed optimiser to LBFGS. Works much faster, but has the disadvantage
(due to the increased convergence speed per iteration) that it also starts to
exhibit
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-56344396
I also needed to change the demo, as the fast convergence doesn't give an
interesting convergence graph anymore. I moved the demo to the examples
directory, but we can
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-54927726
Thanks for your feedback. Your points are very helpful indeed.
Here is my response:
1. The user guide is for normal users and it should focus on how
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-54102428
Now updated such that the code supports true back-propagation.
Thanks to Alexander Ulanov (avulanov) for implementing true
back-propagation in his repository
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-53031845
Added documentation
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-53037400
Joining efforts / cooperation is always good of course. :-)
Let me have a closer look at your code first, and see how it differs from
mine. I'll try it with my
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-52889153
I have updated the code. Indeed the LeastSquaresGradientANN.compute
function was the culprit.
I removed the Breeze instructions, and replaced them by simple
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-52456503
Thanks for your feedback. I'll write some documentation, and also add some
comments. I'll try with similar size data.
The internal data structure of the weights
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-51997639
The ANN uses the existing GradientDescent from mllib.optimization for back
propagation. It uses the gradient from the new LeastSquaresGradientANN class,
and updates
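The gradient/update loop described above can be illustrated in miniature. The sketch below performs one SGD step for a plain linear model under squared error, in pure Scala; the PR's LeastSquaresGradientANN and GradientDescent do the analogous computation over RDDs of examples, which this does not attempt to reproduce:

```scala
// One SGD step for a linear model under squared error.
// loss = 1/2 * (w.x - y)^2, so d(loss)/dw_i = (w.x - y) * x_i,
// and the update is w_i <- w_i - stepSize * (w.x - y) * x_i.
def sgdStep(w: Array[Double], x: Array[Double], y: Double,
            stepSize: Double): Array[Double] = {
  val err = w.zip(x).map { case (wi, xi) => wi * xi }.sum - y
  w.zip(x).map { case (wi, xi) => wi - stepSize * err * xi }
}
```

In the ANN case the gradient of the same squared-error loss is propagated back through the layers instead of applying directly to a single weight vector.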
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-51875281
SteepestDescend -> SteepestDescent can be changed. Thanks for noticing.
Hung Pham, did it work out for you now?
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-50851021
Thanks a lot! I have added the extension now.
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-50569408
I updated the two sources to comply with sbt/sbt scalastyle. Maybe retry
the unit tests with the new modifications?
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-50421747
Hi Matthew,
Sure, I can. I was on holiday during the last two weeks, but now back in
office. I'll update the code this week.
Best regards,
Bert
Github user bgreeven commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-50436968
Jenkins, retest this please.
GitHub user bgreeven opened a pull request:
https://github.com/apache/spark/pull/1290
[spark-2352] Implementation of an 1-hidden layer Artificial Neural Network
(ANN)
The code contains a single-hidden-layer ANN, with a variable number of
inputs, outputs and hidden nodes. It takes as input
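The network shape the PR describes — one hidden layer, variable input/output/hidden sizes — can be sketched as a pure-Scala forward pass with sigmoid activations and a bias weight per node. This is an illustration of the architecture only; the PR's actual weight layout and vector types differ:

```scala
// Forward pass of a 1-hidden-layer network with sigmoid activations.
// Weight matrices are (hidden x (in+1)) and (out x (hidden+1)); the
// last entry of each row is that node's bias weight.
def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

def layer(weights: Array[Array[Double]], input: Array[Double]): Array[Double] =
  weights.map { row =>
    sigmoid(row.init.zip(input).map { case (w, x) => w * x }.sum + row.last)
  }

def predict(wHidden: Array[Array[Double]], wOutput: Array[Array[Double]],
            input: Array[Double]): Array[Double] =
  layer(wOutput, layer(wHidden, input))
```

Training then amounts to choosing the two weight matrices to minimise the squared error between `predict` outputs and targets, which is what the back-propagation discussed earlier in the thread computes gradients for.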