Thank you all for the pointers. This is a subject that I'll probably have to explore in a few weeks, and your help is much appreciated. I'll keep in touch if something interesting comes from this work.
Cheers,
Pedro

On Wed, May 5, 2010 at 11:16 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> We already support sparse vectors and matrices. That should be pretty much
> all you need.
>
> There is emerging support for SVM and on-line logistic regression. A little
> less mature is support for very large scale SVD, which would give you a
> reasonable basis for clustering or categorization.
>
> On Wed, May 5, 2010 at 6:29 AM, Pedro Oliveira <cpdom...@gmail.com> wrote:
>
> > From a quick look at the code, a straightforward solution would be to
> > define a new type of Vector (it wouldn't be a vector in the mathematical
> > sense, just a way to save relational information about an instance), and
> > some DistanceMeasures to work with that vector. Then we could use
> > distance-based techniques, such as canopy clustering and k-means.
> > Are there any plans to implement more distance-based (or kernel-based)
> > algorithms, such as SVMs and KNN?
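
For concreteness, the idea in the quoted message (a non-mathematical "vector" that just stores relational facts about an instance, plus a distance measure over it) could look roughly like the sketch below. It is plain Python, not Mahout code: the `Instance` representation, the choice of Jaccard distance, the function names, and the sample data are all assumptions made for illustration. Canopy clustering is used because it only needs pairwise distances; k-means would also need a notion of centroid, which this representation does not define.

```python
from typing import Callable, FrozenSet, List, Tuple

# Hypothetical representation: an instance is a set of (relation, target)
# facts rather than a numeric vector.
Instance = FrozenSet[Tuple[str, str]]


def jaccard_distance(a: Instance, b: Instance) -> float:
    """Return 1 - |a & b| / |a | b|; two empty instances have distance 0."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def canopy_clusters(
    points: List[Instance],
    t1: float,
    t2: float,
    distance: Callable[[Instance, Instance], float] = jaccard_distance,
) -> List[Tuple[Instance, List[Instance]]]:
    """Canopy clustering with loose threshold t1 and tight threshold t2."""
    if t1 < t2:
        raise ValueError("t1 (loose) must be >= t2 (tight)")
    remaining = list(points)
    canopies = []
    while remaining:
        # Take the next unprocessed point as a canopy center.
        center = remaining.pop(0)
        members = [center]
        still_remaining = []
        for p in remaining:
            d = distance(center, p)
            if d <= t1:
                members.append(p)          # inside the loose threshold
            if d > t2:
                still_remaining.append(p)  # may still seed or join other canopies
        remaining = still_remaining
        canopies.append((center, members))
    return canopies


# Made-up sample data, for illustration only.
alice = frozenset({("works_at", "acme"), ("knows", "bob"), ("lives_in", "lisbon")})
bob = frozenset({("works_at", "acme"), ("knows", "alice"), ("lives_in", "lisbon")})
carol = frozenset({("works_at", "initech"), ("lives_in", "porto")})
dave = frozenset({("works_at", "initech"), ("lives_in", "porto"), ("knows", "carol")})

canopies = canopy_clusters([alice, bob, carol, dave], t1=0.8, t2=0.6)
```

With these thresholds, `alice` and `bob` land in one canopy and `carol` and `dave` in another. Any other distance over the same representation could be passed through the `distance` parameter.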