On Thu, Dec 29, 2011 at 09:18:38PM +0100, Bronco Zaurus wrote:
>    I have a beginner's question: how do you classify using non-numerical
>    features, concretely strings (for example: 'audi', 'bmw',
>    'chevrolet')?

You are in trouble as your input space is not metric: what's .5*('audi' +
'chevrolet')? Standard continuous mathematical formulations do not apply.

I do believe that there are algorithms to deal with this kind of problem,
but the scikit does not implement any, and this is quite far from my area
of expertise. My approach would be to look for other kinds of features.
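
Looking for other features could mean, for instance, replacing each string
with a set of 0/1 indicator columns, one per distinct value. Below is a
minimal sketch of that idea in plain Python. It is an illustration under
assumptions, not a scikit function: the helper name `one_hot` is
hypothetical, and the sample values are the ones from the question.

```python
def one_hot(values):
    """Map each string to a 0/1 indicator vector, one column per category.

    Hypothetical helper for illustration; not part of the scikit.
    """
    # Fix a column order by sorting the distinct category names.
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one column is set per sample
        rows.append(row)
    return categories, rows


categories, rows = one_hot(['audi', 'bmw', 'chevrolet', 'audi'])
for value, row in zip(['audi', 'bmw', 'chevrolet', 'audi'], rows):
    print(value, row)
```

Each row is now an ordinary numerical vector, so it can be handed to a
classifier that expects numbers. Whether this encoding suits a given
problem depends on the data, for example on how many distinct values occur.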

Sorry for the bad news.

Gaël

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
