2011/11/3 Mathieu Blondel <[email protected]>:
> On Thu, Nov 3, 2011 at 6:28 AM, David Warde-Farley
> <[email protected]> wrote:
>
>> I wonder how this compares to learning a linear tied-weights autoencoder
>> with SGD and then just orthogonalizing the weight vectors (I suppose you'd
>> also need to do one run with a single "neuron" in order to orient the basis
>> with respect to the first p.c.).
>
> I was thinking of something similar: just a least-squares objective
> minimized by SGD. Would be nice to compare with RandomizedPCA both in
> terms of training time and performance on the final supervised
> objective.
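
Something along those lines could look like the untested sketch below: a
linear tied-weights autoencoder with a least-squares objective, plain numpy
SGD, the learned filters orthonormalized afterwards, and the spanned
subspace checked against RandomizedPCA through principal angles (the toy
data and hyperparameters are arbitrary, purely for illustration):

import numpy as np
from sklearn.decomposition import RandomizedPCA

rng = np.random.RandomState(0)

# Toy data: 1000 points close to a 5-dimensional subspace of R^20.
n_samples, n_features, n_components = 1000, 20, 5
X = np.dot(rng.randn(n_samples, n_components),
           rng.randn(n_components, n_features))
X += 0.01 * rng.randn(n_samples, n_features)
X -= X.mean(axis=0)
X /= X.std(axis=0)

# Linear tied-weights autoencoder: minimize 0.5 * ||W.T W x - x||^2 by SGD.
W = 0.01 * rng.randn(n_components, n_features)
learning_rate = 1e-3
for epoch in range(20):
    for i in rng.permutation(n_samples):
        x = X[i]
        h = np.dot(W, x)        # code (hidden units)
        r = np.dot(W.T, h) - x  # reconstruction error
        # gradient of the tied-weights squared error w.r.t. W
        W -= learning_rate * (np.outer(h, r) + np.outer(np.dot(W, r), x))

# Orthonormalize the learned filters (QR instead of explicit Gram-Schmidt).
Q, _ = np.linalg.qr(W.T)   # (n_features, n_components), orthonormal columns

pca = RandomizedPCA(n_components=n_components).fit(X)
C = pca.components_        # (n_components, n_features), orthonormal rows

# Cosines of the principal angles between the two subspaces (~1 if they agree).
print(np.linalg.svd(np.dot(C, Q), compute_uv=False))

This only compares the spanned subspaces; to recover the individual
components in order you would still need something like the sequential
single-"neuron" runs David mentioned.
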
I wonder if Averaged SGD would work for autoencoders. If the number of
units in the hidden layer is small enough, I think the problem should be
convex up to a permutation of the components (just a loose intuition),
hence the convergence bounds from ASGD might hold.

> By the way, how would you go about the orthogonalization? Gram-Schmidt?

Is there a differentiable proxy regularizer for orthogonalization? E.g. the
nuclear norm is a proxy for the rank and the L1 norm is a proxy for the L0
norm.
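
One differentiable candidate might be a soft penalty ||W W^T - I||_F^2
added to the reconstruction loss with some weight, something like (again an
untested sketch):

import numpy as np

def orthogonality_penalty(W):
    # Soft orthogonality penalty ||W W^T - I||_F^2 and its gradient w.r.t. W.
    # W has shape (n_components, n_features); the penalty is zero iff the
    # rows of W are orthonormal, and it is differentiable everywhere.
    G = np.dot(W, W.T) - np.eye(W.shape[0])
    return (G ** 2).sum(), 4.0 * np.dot(G, W)

It only pushes the rows of W towards orthonormality rather than enforcing
it exactly, so a final QR / Gram-Schmidt pass would probably still be
needed.
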
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel