Re: [R] SVM. How to use categorical attributes?

2012-03-28 Thread Steve Lianoglou
Sorry -- I should add that I'm pointing out the potential shogun implementation because I suspect their implementation of a bag-of-words-like kernel would use the kernel trick, so you won't have to map all of your data explicitly into some huge feature space that will blow your memory away. I'm n

Re: [R] SVM. How to use categorical attributes?

2012-03-28 Thread Steve Lianoglou
Hi, These suggestions still require you to explicitly compute your feature space or kernel matrix first, which might kill you memory-wise. You might consider taking a look at the shogun toolbox: http://www.shogun-toolbox.org/ With some digging, I'm pretty sure you'll find a bag-of-words type of

Re: [R] SVM. How to use categorical attributes?

2012-03-28 Thread Alekseiy Beloshitskiy
Sorry, I forgot to mention the following: all I wrote is only valid as long as your number of samples is smaller than the number of different words. If the number of samples exceeds the total number of different words, you are better off with the explicit matrix

Re: [R] SVM. How to use categorical attributes?

2012-03-28 Thread Ulrich Bodenhofer
Sorry, I forgot to mention the following: all I wrote is only valid as long as your number of samples is smaller than the number of different words. If the number of samples exceeds the total number of different words, you are better off with the explicit matrix representation and some kernel (e.
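
A minimal sketch of the explicit matrix representation mentioned here, for the case where samples outnumber distinct words. It assumes binary word indicators (a word is either present in a query or not) and uses base R only. The queries and the names vocab and X are invented for illustration and do not come from the thread.

```r
# Five toy queries over a vocabulary of only four words (samples > words).
# The queries are made up for illustration.
queries <- list(
  c("buy", "tree"),
  c("grow", "tree"),
  c("buy", "cheap", "tree"),
  c("cheap", "tree"),
  c("grow", "cheap")
)

# Vocabulary: every distinct word seen in any query.
vocab <- sort(unique(unlist(queries)))

# Explicit bag-of-words matrix: one row per query, one column per word,
# 1 if the word occurs in the query and 0 otherwise.
X <- t(vapply(queries,
              function(q) as.numeric(vocab %in% q),
              numeric(length(vocab))))
colnames(X) <- vocab

dim(X)  # rows = samples, columns = distinct words
```

Here X has fewer columns than a samples-by-samples kernel matrix would have, so handing X to an SVM with a linear kernel is the cheaper route in this regime.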

Re: [R] SVM. How to use categorical attributes?

2012-03-28 Thread Ulrich Bodenhofer
Alex, To avoid the memory issue, you can directly use a "bag of words" kernel (which corresponds to using the linear kernel on the sparse bag-of-words matrix Steve suggested). Just a little toy example of how this is done for two : > x1 <- c("how", "to", "grow", "tree") > x2 <- c("where", "to", "go
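
A minimal sketch of the bag-of-words kernel idea described above, assuming each query is a character vector of words and that only the presence or absence of a word matters (so a repeated word counts once). Under that assumption, the linear kernel on the binary bag-of-words matrix equals the number of distinct words two queries share. The helper name bow_kernel and the example queries are hypothetical and are not taken from the thread.

```r
# Hypothetical helper: bag-of-words kernel on binary word indicators.
# intersect() returns the distinct words common to both queries.
bow_kernel <- function(a, b) length(intersect(a, b))

# Toy queries, made up for illustration.
queries <- list(
  c("how", "to", "grow", "tree"),
  c("where", "to", "buy", "tree"),
  c("cheap", "flights")
)

# Kernel matrix: entry [i, j] is the number of words shared by
# query i and query j. No explicit feature matrix is ever built.
K <- sapply(queries, function(a)
  sapply(queries, function(b) bow_kernel(a, b)))

K
```

The matrix K is samples-by-samples regardless of vocabulary size, which is why this route saves memory when there are fewer samples than distinct words. SVM packages that accept a precomputed kernel matrix (kernlab is one) could take K as input; check the package documentation for the exact interface.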

Re: [R] SVM. How to use categorical attributes?

2012-03-28 Thread Alekseiy Beloshitskiy
://stats.stackexchange.com/questions/25355/multi-value-categorical-attributes-how-r Thank you, -Alex From: Steve Lianoglou [mailinglist.honey...@gmail.com] Sent: 27 March 2012 21:47 To: Alekseiy Beloshitskiy Cc: r-help@r-project.org Subject: Re: [R] SVM. How to use categorical

Re: [R] SVM. How to use categorical attributes?

2012-03-27 Thread Steve Lianoglou
Hi, On Tue, Mar 27, 2012 at 6:05 AM, Alekseiy Beloshitskiy wrote: > Hi All, > > Here is the case. I want to build a classification model (SVM). Some of the > variables for this model are categorical attributes which represent words > (usually 3-10 words: a Google search query). For example: >

[R] SVM. How to use categorical attributes?

2012-03-27 Thread Alekseiy Beloshitskiy
Hi All, Here is the case. I want to build a classification model (SVM). Some of the variables for this model are categorical attributes which represent words (usually 3-10 words: a Google search query). For example: search_id | query_words|..| result ---+