GitHub user sethah commented on the issue:
https://github.com/apache/spark/pull/13621
@avulanov
I used this implementation to run a simple single-layer autoencoder on the
MNIST dataset. I also implemented the same autoencoder with Keras/Theano and
ran it on the MNIST data. With Spark, I got very poor results. First, here are
the results of encode/decode using Keras with a cross-entropy loss function on
the output and sigmoid activations.
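For reference, here is a minimal sketch of the Keras setup described above; the hidden-layer size (32), optimizer (adadelta), and training schedule are my choices for illustration, not details from this PR:

```python
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense

# Load MNIST and scale pixel values into [0, 1] so the sigmoid decoder
# and binary cross-entropy loss are well matched to the targets.
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0

# Single hidden layer autoencoder with sigmoid activations throughout,
# trained to reconstruct its own input.
model = Sequential([
    Dense(32, activation='sigmoid', input_shape=(784,)),  # encoder
    Dense(784, activation='sigmoid'),                      # decoder
])
model.compile(optimizer='adadelta', loss='binary_crossentropy')
model.fit(x_train, x_train, epochs=10, batch_size=256,
          validation_data=(x_test, x_test))
```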

The implementation in this patch yielded very similar results.

Finally, here is the Keras implementation using ReLU activations.
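A sketch of the ReLU variant, assuming only the hidden-layer activation changes relative to the sigmoid setup above:

```python
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(32, activation='relu', input_shape=(784,)),  # ReLU encoder
    Dense(784, activation='sigmoid'),                   # sigmoid decoder keeps outputs in [0, 1]
])
model.compile(optimizer='adadelta', loss='binary_crossentropy')
```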

It appears the sigmoid activations are saturating during training: once units
are pushed into the flat regions of the sigmoid, their gradients are essentially
zero and the algorithm stops learning. If you have any thoughts/suggestions to
improve these results, I'd really appreciate it.
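To illustrate the saturation point numerically (this small demo is mine, not part of the patch): the sigmoid gradient s'(x) = s(x)(1 - s(x)) peaks at 0.25 and vanishes for inputs of even moderate magnitude, so saturated units contribute almost nothing to backprop.

```python
import numpy as np

# Sigmoid derivative at a few pre-activation values.
x = np.array([0.0, 2.0, 5.0, 10.0])
s = 1.0 / (1.0 + np.exp(-x))
print(s * (1 - s))  # ~[0.25, 0.105, 0.0066, 4.5e-05]
```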
Does it make sense to add another algorithm based on MLP/NN when the
current functionality is so limited? If the autoencoder is not useful with only
sigmoid activations, I'd vote for focusing on adding new activations before
adding another algorithm. I'm not an expert here, so I would really appreciate
your thoughts. Thanks!