We already support sparse vectors and matrices.  That should be pretty much
all you need.
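
As a rough sketch of the idea, assuming relational facts about an instance can be written as feature strings: each distinct feature gets an index in a sparse vector, and an ordinary distance is computed over the non-zero entries. This is plain Java, not the actual Vector or DistanceMeasure classes; `SparseSketch`, `encode`, and `squaredEuclidean` are names made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the library's own API.
final class SparseSketch {

    // Encode feature strings as a sparse vector (index -> count).
    // The shared dictionary assigns each new feature the next free index;
    // features that are absent are implicitly zero and take no space.
    public static Map<Integer, Double> encode(Map<String, Integer> dictionary,
                                              String... features) {
        Map<Integer, Double> vector = new HashMap<>();
        for (String feature : features) {
            Integer index = dictionary.get(feature);
            if (index == null) {
                index = dictionary.size();
                dictionary.put(feature, index);
            }
            vector.merge(index, 1.0, Double::sum);
        }
        return vector;
    }

    // Squared Euclidean distance that only visits non-zero entries.
    public static double squaredEuclidean(Map<Integer, Double> a,
                                          Map<Integer, Double> b) {
        double sum = 0.0;
        // Entries present in a (b may or may not have them).
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            double diff = e.getValue() - b.getOrDefault(e.getKey(), 0.0);
            sum += diff * diff;
        }
        // Entries present only in b.
        for (Map.Entry<Integer, Double> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                sum += e.getValue() * e.getValue();
            }
        }
        return sum;
    }
}
```

Two instances encoded against the same dictionary can then be compared directly, and any distance-based method can work from that.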

There is emerging support for SVM and on-line logistic regression.  A little
less mature is support for very large scale SVD, which would give you a
reasonable basis for clustering or categorization.

On Wed, May 5, 2010 at 6:29 AM, Pedro Oliveira <cpdom...@gmail.com> wrote:

> From a quick look at the code, a straightforward solution would be to define
> a new type of Vector (it wouldn't be a vector in the mathematical sense,
> just a way to save relational information about an instance), and some
> DistanceMeasures to work with that vector. Then we could use distance based
> techniques, such as canopy clustering and k-means.
> Are there any plans to implement more distance-based (or kernel-based)
> algorithms, such as SVMs and KNN?
>
