Hi @Olivier, you are absolutely right that scipy.optimize.fmin_l_bfgs_b would not be suitable for MLP, since some practitioners will want on-line updating (partial_fit()) rather than batch training. However, what's your opinion on using fmin_l_bfgs_b in naturally BATCH algorithms like sparse autoencoders? Refinements such as weight decay can still be supported, because the optimizer is simply fed a function that returns the cost and a function (or the same function) that computes the gradient, and those can carry all the beta, decay, etc. parameters (a rough sketch of what I mean is just below).

One thing I noticed that is making me excited about fmin_l_bfgs_b is its performance. I benchmarked the Sparse Autoencoder (SAE, using fmin_l_bfgs_b) against the RBM (the one in the pull request) in fitting, with 400 iterations, on scikit-learn's digits dataset.
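To make concrete what I mean by feeding the optimizer a single callable that returns both the cost and the gradient, here is a rough sketch -- a toy weight-decay-regularized least-squares cost standing in for the real SAE objective; X, y, decay and the function name are made up for illustration, not taken from the PR:

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    # Toy data and a weight-decay hyperparameter (illustrative only).
    rng = np.random.RandomState(0)
    X = rng.randn(100, 20)
    y = rng.randn(100)
    decay = 0.01

    def cost_and_grad(theta):
        # Quadratic cost plus weight decay; the actual SAE cost/gradient
        # (with beta, sparsity target, etc.) would go here instead.
        resid = X.dot(theta) - y
        cost = 0.5 * resid.dot(resid) + 0.5 * decay * theta.dot(theta)
        grad = X.T.dot(resid) + decay * theta
        return cost, grad

    # When the callable returns (cost, gradient), no separate fprime
    # or approx_grad argument is needed.
    theta0 = np.zeros(X.shape[1])
    theta_opt, final_cost, info = fmin_l_bfgs_b(cost_and_grad, theta0, maxiter=400)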
I found that SAE with fmin_l_bfgs_b took 3.45 seconds, whereas the RBM took 16.61 seconds. Considering that the two algorithms' updates are somewhat similar in cost, fmin_l_bfgs_b seemed to prevail. Thanks!

PS: I have just pushed a pull request that adds "Sparse Autoencoder", a feature-extraction algorithm :) https://github.com/scikit-learn/scikit-learn/pull/2099

I will now work on larsmans' MLP and on devising/reusing an SGD that enables efficient partial_fit(). Thanks again!

~Issam