If you're going the embedding route, please consider trying wsabie first.
AWE (Affinity Weighted Embedding) is built on top of wsabie.

http://www.thespermwhale.com/jaseweston/papers/wsabie-ijcai.pdf

And so is the following ICML paper (preprint not online yet). Btw, anyone
going?

http://icml.cc/2013/?page_id=43
*Label Partitioning For Sublinear Ranking*
Jason Weston, Ameesh Makadia, Hector Yee

I was going to modify https://issues.apache.org/jira/browse/MAHOUT-703 to
do this when I was at a startup. Essentially, wsabie is very similar to a
two-layer NN without the sigmoid, trained with the WARP update rule
(described in the wsabie paper), which optimizes for precision rather than
AUC. People may prefer high precision at the top of the ranking order when
ranking millions of items for recommendation; a rough sketch of the update
is below.
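
To make that concrete, here's a minimal self-contained Java sketch of the
two pieces: the dot-product scoring model and the WARP step. This is my own
illustration, not Mahout or wsabie code; the class name, learning rate, and
margin are assumptions, and I've left out the max-norm projection the paper
uses for regularization.

import java.util.Random;

// Hypothetical sketch of a wsabie-style model: users and items share a
// low-dimensional embedding space scored by a plain dot product (the
// "two-layer NN without the sigmoid"). The WARP step samples negatives
// until one violates the margin, then applies a rank-weighted SGD update.
public class WarpSketch {
  private final double[][] userVecs;  // U: one embedding row per user
  private final double[][] itemVecs;  // V: one embedding row per item
  private final int dim;
  private final double rate = 0.01;   // learning rate (assumed value)
  private final Random rng = new Random(42);

  public WarpSketch(int numUsers, int numItems, int dim) {
    this.dim = dim;
    userVecs = new double[numUsers][dim];
    itemVecs = new double[numItems][dim];
    // small random initialization of both embedding tables
    for (double[] v : userVecs) {
      for (int k = 0; k < dim; k++) v[k] = 0.01 * rng.nextGaussian();
    }
    for (double[] v : itemVecs) {
      for (int k = 0; k < dim; k++) v[k] = 0.01 * rng.nextGaussian();
    }
  }

  public double score(int user, int item) {
    double s = 0.0;
    for (int k = 0; k < dim; k++) s += userVecs[user][k] * itemVecs[item][k];
    return s;
  }

  // One WARP step for an observed (user, positive item) pair.
  public void update(int user, int pos) {
    int numItems = itemVecs.length;
    double posScore = score(user, pos);
    for (int tries = 1; tries < numItems; tries++) {
      int neg = rng.nextInt(numItems);
      if (neg == pos) continue;
      if (score(user, neg) <= posScore - 1.0) continue;  // margin holds
      // Rank estimate: the harder it was to find a violator, the higher
      // the positive already ranks. L(r) = sum_{j<=r} 1/j weights mistakes
      // near the top of the list most, which is what pushes precision@k
      // up rather than AUC.
      int rank = (numItems - 1) / tries;
      double weight = 0.0;
      for (int j = 1; j <= rank; j++) weight += 1.0 / j;
      // SGD on the weighted hinge loss: weight * (1 - f(pos) + f(neg)).
      for (int k = 0; k < dim; k++) {
        double u = userVecs[user][k];
        double dv = itemVecs[neg][k] - itemVecs[pos][k];
        itemVecs[pos][k] += rate * weight * u;
        itemVecs[neg][k] -= rate * weight * u;
        userVecs[user][k] -= rate * weight * dv;
      }
      return;  // one violating negative per step, as in the paper
    }
  }
}

Looping update() over the observed (user, item) pairs is essentially the
whole training procedure.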

An implementation of wsabie is in http://torch5.sourceforge.net somewhere,
I think.

Hope that helps.


On Sat, Mar 30, 2013 at 7:15 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> SOM doesn't have to be constrained to two dimensions.
>
> That said, there are bunches of non-linear embedding methods that are more
> current than SOMs.  SOMs were part of the neural plausibility movement of
> the late '80s, which in retrospect can be seen as a step toward modern
> formulations of stochastic gradient descent.
>
> For one example, Hector Yee was just recommending that Affinity Weighted
> Embedding [1] would be a useful thing to look at.  I would find it hard to
> say what would be a useful project in that regard.
>
> More central to Mahout's general areas of excellence would be an
> implementation of latent factor log-linear models [2].  These would provide
> a very interesting complement to the alternating least squares methods that
> have been developed lately in Mahout.
>
> Either of these would strike me as more useful in the Mahout context than
> SOMs.
>
> [1] http://arxiv.org/abs/1301.4171
>
> [2] http://arxiv.org/abs/1006.2156
>
>
> On Sat, Mar 30, 2013 at 12:21 PM, Sean Owen <sro...@gmail.com> wrote:
>
> > Are SOMs actually good at dimension reduction? I had understood them to
> > be just a visualization technique. You end up with a mapping with the
> > property that things that are near are similar, but no guarantee that
> > things that are similar are near.
> >
> > On Sat, Mar 30, 2013 at 12:06 PM, Dan Filimon
> > <dangeorge.fili...@gmail.com> wrote:
> > > Hi,
> > >
> > > I have a larger assignment to work on for my Machine Learning course this
> > > semester and I can pick one of 4 problems to solve.
> > >
> > > One of them is implementing self-organizing maps, using them to cluster
> > > the Localization Data for Person Activity Data Set [1], and evaluating
> > > the clustering with the Dunn Index and F-measure.
> > >
> > > I vaguely recall talking to Ted about self-organizing maps as a way of
> > > achieving dimensionality reduction, so that's where it could be useful.
> > >
> > > I need to pick a problem anyway and was wondering if there's any sort of
> > > interest in this one.
> > > If so, I could work on an implementation for Mahout (likely
> > > non-MapReduce, at least for the purposes of this assignment).
> > >
> > > Thoughts?
> > >
> > > [1] http://archive.ics.uci.edu/ml/datasets/Localization+Data+for+Person+Activity
> >
>



-- 
Yee Yang Li Hector <https://plus.google.com/106746796711269457249>
Professional Profile <http://www.linkedin.com/in/yeehector>
http://hectorgon.blogspot.com/
