Re: [Scikit-learn-general] Strings as features

Andy Sat, 21 Jun 2014 06:23:27 -0700

Hi Abijith.

It depends on how you want to interpret the strings.

If they are texts and you want to interpret them based on their content, Brian's suggestion is the right one. If you want to consider each possible string as a distinct feature, the OneHotEncoder would be the right choice.
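
To make the two interpretations concrete, here is a minimal sketch. The data and variable names are made up for illustration, and CountVectorizer is only an assumed stand-in for a content-based approach, since the earlier suggestion is not quoted in this message. It also assumes a scikit-learn release in which OneHotEncoder accepts string input directly (0.20 or later, as far as I know); older releases needed the strings mapped to integers first.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical samples: a free-text field, a categorical string field,
# and a numeric field for the same three records.
texts = ["red apple pie", "green apple", "red wine"]
colors = [["red"], ["green"], ["red"]]
numbers = np.array([[1.0], [2.5], [0.3]])

# Interpretation 1: the string is text, so build one column per word
# and count how often each word occurs in each sample.
text_vec = CountVectorizer()
X_text = text_vec.fit_transform(texts)

# Interpretation 2: each distinct string is its own category,
# so build one indicator column per distinct string.
onehot = OneHotEncoder(handle_unknown="ignore")
X_cat = onehot.fit_transform(colors)

# Stack the encoded strings next to the numeric column to get a
# single feature matrix that an estimator can be fitted on.
X = sp.hstack([X_text, X_cat, sp.csr_matrix(numbers)]).tocsr()

print(sorted(text_vec.vocabulary_))  # words that became columns
print(onehot.categories_[0])         # distinct strings that became columns
print(X.shape)
```

Which of the two encoders fits depends on the answer to the question below: short labels with few distinct values usually suit the one-hot route, while longer free text usually suits a word-based representation.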

Could you give an example of what the strings and the semantics of the strings are?


Andy



On 06/20/2014 06:05 PM, Abijith Kp wrote:

Can anyone help me with the problem of dealing with features that are both strings of varying length (say from 0 to 100-150 characters) and numbers?
What are the most widely used techniques in such situations? And can it be solved using only scikit-learn?
PS: Initially I have to convert a JSON file to a feature list, and then use it.
Any help is appreciated.

Regards,
Abijith

--
Abijith KP
github.com/abijith-kp <http://github.com/abijith-kp>
kpabijith.wordpress.com <http://kpabijith.wordpress.com>


------------------------------------------------------------------------------
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing & Easy Data Exploration
http://p.sf.net/sfu/hpccsystems


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
