Hi Steve,
Thanks for the hints and clarifications.
Unfortunately, I will not be able to use the approach you suggest. The distances I generate are distances between VERY large matrices (say 100000x100000 and more), each of different dimensions (not necessarily square either), and the columns carry no meaning as features; the matrices are basically graphs of a sort.

Is there a way forward with the SVM, or should I just forget about it?
Martin

On 10/21/2010 5:42 PM, Steve Lianoglou wrote:
Hi,

On Thu, Oct 21, 2010 at 9:42 AM, Martin Tomko <martin.to...@geo.uzh.ch> wrote:
Dear all,
I am exploring the possibilities for automated classification of my
data. I have successfully used KNN, but was thinking about looking at
SVM (which I have not used before).
I have a pairwise distance matrix of training observations which are
classified in set classes, and a distance matrix of new observations to
the training ones.

It seems to me that since you have some pairwise distance metric, your
original data is in some "vector form".

Why not just try using your original data (forget the pairwise
distance for now) and try a few different kernels for the SVM, such as
a linear kernel or an RBF/Gaussian?

Is it possible to use distance matrices for SVM, and if yes, which
package would do so (e1071?).

I guess you can think of a "kernel matrix" as something like a
distance matrix -- actually, it's more like a similarity matrix.

I don't recall if e1071 allows you to use a kernel matrix as input, but
I'm pretty sure the svm functions from kernlab do. It was a pain to
use, though.
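
To make the "distance vs. similarity" point concrete, here is a minimal
sketch (plain Python, not tied to e1071 or kernlab) of one common way a
distance matrix could be turned into a similarity matrix: the Gaussian
transform K[i][j] = exp(-gamma * D[i][j]^2). The function names, the toy
matrix and the gamma value are all assumptions made for illustration.
Note that for distances that are not Euclidean the result is not
guaranteed to be positive semidefinite, so it may not be a valid SVM
kernel.

```python
import math


def distance_to_similarity(dist, gamma=1.0):
    """Map a square distance matrix to similarities via exp(-gamma * d^2).

    Hypothetical helper for illustration; gamma is an assumed tuning value.
    """
    n = len(dist)
    if any(len(row) != n for row in dist):
        raise ValueError("distance matrix must be square")
    return [[math.exp(-gamma * d * d) for d in row] for row in dist]


def is_symmetric(mat, tol=1e-12):
    """Check symmetry, a necessary (not sufficient) condition for a kernel."""
    n = len(mat)
    return all(abs(mat[i][j] - mat[j][i]) <= tol
               for i in range(n) for j in range(i + 1, n))


# Toy 3x3 distance matrix (made-up numbers): zero diagonal, symmetric.
D = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 1.5],
     [2.0, 1.5, 0.0]]

K = distance_to_similarity(D, gamma=0.5)
```

Small distances map to similarities near 1 and large distances to
similarities near 0, which is why a kernel matrix behaves like a
similarity matrix rather than a distance matrix.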

But anyway -- don't use your distance matrix :-)

I have little experience with SVM, and I had the impression that it is
a/ usually used with data that have observations in terms of a number of
variables (hence, not pairwise distances);

With the exception of "plugging in" a kernel matrix (which was
calculated from data in its original feature space) that's pretty much
correct.

b/ it is not well suited for large multidimensional spaces (I have a
distance matrix of 200*200 observations, a part of this could be used as
training data, but still, we are looking at say 50 distances per
observation).

But your distance matrix isn't really the same multidimensional space
your data lives in, right?

Anyway, like I said before, try the SVM on your original data with
some different kernels. I think the RBF kernel should be closest in
spirit to your distance matrix, and will likely perform better than
your kNN ;-).
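
As a rough illustration of why the RBF kernel is "closest in spirit" to
a distance matrix: it is computed from the original feature vectors,
but its value depends only on the Euclidean distance between them,
k(x, y) = exp(-gamma * ||x - y||^2). The sketch below is plain Python
with made-up data; the function names and the gamma value are
assumptions, not part of any SVM package.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian/RBF kernel between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def rbf_kernel_matrix(rows, gamma=1.0):
    """Pairwise RBF kernel matrix for a list of feature vectors."""
    return [[rbf_kernel(a, b, gamma) for b in rows] for a in rows]


# Three toy observations with two features each (made-up numbers).
X = [[0.0, 0.0],
     [3.0, 4.0],
     [0.0, 1.0]]

K_rbf = rbf_kernel_matrix(X, gamma=0.1)
```

Observations that are close in feature space get kernel values near 1,
and distant ones get values near 0, so the kernel matrix carries much of
the same neighbourhood information that kNN reads off a distance matrix.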

Hope that helps,
-steve
