On Wed, Mar 22, 2017 at 03:32:45AM +0530, Sahil Chelaramani wrote:
> Hi Ryan,
> 
> Continuing our earlier thread and CC'ing the mlpack list. Please refer
> to the figures attached for reference. The file mvu.png is the
> original MVU formulation from the paper. The file mvu2.png has the
> formulation using the inner product matrix K. Both these formulations
> only consider the N nearest neighbours of the input points (this is
> the Eta = 1 term in the above formulations). I realized that the
> original formulation of the problem uses the L2 norm (Euclidean
> distance), which makes it a non-linear program. If we reformulate the
> problem using the L1 norm (Manhattan distance), we get the
> formulation shown in alt_form.png.

The original formulation is actually using squared Euclidean distance,
but that is not a big difference.
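To make the distinction concrete, here is a toy sketch (NumPy, purely illustrative, with made-up points) of the quantity each neighbor constraint would preserve under squared Euclidean distance versus the proposed L1 variant:

```python
import numpy as np

# Two neighboring input points (toy data, purely illustrative).
x_i = np.array([0.0, 1.0])
x_j = np.array([2.0, 3.0])

# Original MVU preserves *squared* Euclidean distances between neighbors:
#   ||y_i - y_j||_2^2 = ||x_i - x_j||_2^2
sq_euclidean = np.sum((x_i - x_j) ** 2)

# The proposed reformulation would preserve L1 (Manhattan) distances:
#   ||y_i - y_j||_1 = ||x_i - x_j||_1
l1 = np.sum(np.abs(x_i - x_j))

print(sq_euclidean)  # 8.0
print(l1)            # 4.0
```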

> The following are some observations about this formulation of MVU:-
> 
> a. The main advantage of this formulation is that we can convert it
> directly into a linear program, and linear programs have known
> polynomial-time algorithms, such as the Ellipsoid method.
> 
> b. The number of constraints in this system is exponential in the
> worst case, but for a reasonable setting of the number of neighbours,
> we should do reasonably well.
> 
> c. This linear program, if solved with the Ellipsoid method, needs a
> separation oracle that finds a violated constraint in linear time.
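For point (c), such an oracle could just scan the neighbor-pair constraints once and return the first violated one. A rough sketch (Python, hypothetical function name, assuming the Gram-matrix constraints K_ii - 2 K_ij + K_jj = d_ij^2 from the mvu2.png formulation):

```python
import numpy as np

def find_violated_constraint(K, neighbor_pairs, sq_dists, tol=1e-9):
    """Scan the neighbor-pair constraints K_ii - 2 K_ij + K_jj = d_ij^2
    once, returning the first violated pair (or None).  Runs in time
    linear in the number of constraints."""
    for (i, j), d2 in zip(neighbor_pairs, sq_dists):
        if abs(K[i, i] - 2.0 * K[i, j] + K[j, j] - d2) > tol:
            return (i, j)
    return None

# Toy example: K is the Gram matrix of a 1-D embedding y = [0, 1, 3].
y = np.array([0.0, 1.0, 3.0])
K = np.outer(y, y)
pairs = [(0, 1), (1, 2)]
print(find_violated_constraint(K, pairs, [1.0, 4.0]))  # None: both hold
print(find_violated_constraint(K, pairs, [1.0, 5.0]))  # (1, 2) violated
```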
> 
> Other than this formulation, I have also tried to figure out whether
> an alternative definition of the matrix K may help solve this problem
> better. I think we can do much better than a simple dot product:
> since this concept is similar to a kernel function, we could explore
> using a Gaussian kernel or other similar kernel functions.
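For concreteness, replacing the plain dot-product Gram matrix with a Gaussian (RBF) kernel matrix would look something like the following (a sketch only, with an arbitrary bandwidth; not a worked-out proposal):

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma=1.0):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)): a Gaussian (RBF)
    kernel matrix as a possible replacement for the dot-product matrix."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

X = np.array([[0.0, 0.0],
              [1.0, 0.0]])
K = gaussian_kernel_matrix(X)
print(K[0, 0])  # 1.0 (diagonal entries are always 1)
print(K[0, 1])  # exp(-0.5), about 0.6065
```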

I have not dug too deeply into your formulation, but in either case,
unfortunately GSoC projects must include a code portion, so they can't
be entirely research.  Therefore, if this is what you wanted to
implement, you would first need to show that the idea works: that it
produces sufficiently good approximations to the original MVU, and that
the solution can be found much more quickly than solving the original
MVU with an SDP solver.

> I would also be very interested in exploring other more recent
> non-linear dimensionality reduction techniques which could be
> implemented for mlpack. Could we try and figure out some additional
> techniques that would be worth looking into?

Yes, this is another idea.  There are already two very old PRs open for
multidimensional scaling.  It could be useful to add some nonlinear
dimensionality reduction techniques to mlpack.  LLE and Isomap might be
two useful examples, but those are not particularly new; maybe you have
some newer techniques in mind?

-- 
Ryan Curtin    | "Excuse me.  I don't mean to impose, but I am the
[email protected] | Ocean."  - Bobby
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack