2011/11/3 Mathieu Blondel <[email protected]>:
> On Thu, Nov 3, 2011 at 6:28 AM, David Warde-Farley
> <[email protected]> wrote:
>
>> I wonder how this compares to learning a linear tied-weights autoencoder
>> with SGD and then just orthogonalizing the weight vectors (I suppose you'd
>> also need to do one run with a single "neuron" in order to orient the basis
>> with respect to the first p.c.).
>
> I was thinking of something similar: just a least-squares objective
> minimized by SGD. Would be nice to compare with RandomizedPCA both in
> terms of training time and performance on the final supervised
> objective.
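
Something like this maybe (very rough, untested numpy sketch of a
tied-weights linear autoencoder trained by plain SGD on the squared
reconstruction error; all names are made up, nothing scikit-learn
specific):

import numpy as np

def sgd_linear_autoencoder(X, n_components, lr=0.01, n_epochs=10, seed=0):
    # Minimize 0.5 * ||x - W.T W x||^2 over W by plain SGD (tied weights).
    rng = np.random.RandomState(seed)
    n_samples, n_features = X.shape
    W = 0.01 * rng.randn(n_components, n_features)
    for epoch in range(n_epochs):
        for i in rng.permutation(n_samples):
            x = X[i]
            h = np.dot(W, x)            # hidden code
            r = np.dot(W.T, h) - x      # reconstruction residual
            # gradient of 0.5 * ||r||^2 w.r.t. W with tied weights
            grad = np.outer(h, r) + np.outer(np.dot(W, r), x)
            W -= lr * grad
    return W

The subspace spanned by the learned W could then be compared to the one
found by RandomizedPCA on the same (centered) data, both in training
time and in downstream accuracy.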

I wonder if Averaged SGD would work for autoencoders. If the number of
units in the hidden layer is small enough, I think the problem should be
convex up to a permutation of the components (just a loose intuition),
hence the convergence bounds from ASGD might hold.
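
By Averaged SGD I just mean the Polyak-Ruppert running average of the
SGD iterates, i.e. something along these lines (sketch, hypothetical
helper, not an existing API):

import numpy as np

def asgd_average(iterates):
    # Polyak-Ruppert averaging: running mean of the SGD iterates.
    w_avg, t = None, 0
    for w in iterates:
        t += 1
        if w_avg is None:
            w_avg = np.array(w, dtype=float)
        else:
            w_avg += (w - w_avg) / t    # incremental mean update
    return w_avg

If the problem really were convex, the usual ASGD bounds would apply to
this average; whether that carries over to autoencoders is exactly the
part I'm unsure about.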

> By the way, how would you go about the orthogonalization? Gram-Schmidt?

Is there a differentiable proxy regularizer for orthogonalization? E.g.
in the same way that the nuclear norm is a proxy for the rank and the
L1 norm is a proxy for the L0 norm.
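
To make the question concrete, one candidate I can think of is a
Frobenius penalty on W W^T - I added to the objective (just an
illustration of what such a proxy could look like, no claim that it
enjoys the same guarantees as the nuclear norm / L1 cases), with a QR
factorization playing the role of Gram-Schmidt for the hard
orthogonalization:

import numpy as np

def orthogonalize_rows(W):
    # Orthonormalize the rows of W; QR is basically Gram-Schmidt done stably.
    Q, _ = np.linalg.qr(W.T)
    return Q.T

def ortho_penalty(W):
    # Differentiable surrogate: ||W W^T - I||_F^2 is zero iff the rows
    # of W are orthonormal.
    D = np.dot(W, W.T) - np.eye(W.shape[0])
    return np.sum(D ** 2)

def ortho_penalty_grad(W):
    # Gradient of ||W W^T - I||_F^2 with respect to W.
    D = np.dot(W, W.T) - np.eye(W.shape[0])
    return 4.0 * np.dot(D, W)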

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
