Hi Abijith,

This should get you started: http://scikit-learn.org/dev/tutorial/text_analytics/working_with_text_data.html
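
In case a concrete starting point helps, here is a minimal sketch of one common approach (it is not taken from the tutorial): turn each string into character n-gram TF-IDF features and stack the numeric column next to them. The JSON content and the field names "text" and "count" are made up for illustration, and character n-grams are just one reasonable choice for short strings, so treat this as an assumption-laden example rather than the recommended way.

```python
import json

import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical JSON input; the field names "text" and "count" are invented.
raw = """[
  {"text": "cheap watches for sale", "count": 3},
  {"text": "", "count": 0},
  {"text": "meeting notes for the project", "count": 12}
]"""
records = json.loads(raw)

# Split each record into its string part and its numeric part.
texts = [r["text"] for r in records]
numbers = np.array([[r["count"]] for r in records], dtype=float)

# Character n-grams within word boundaries; an empty string simply
# becomes an all-zero row.
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
X_text = vectorizer.fit_transform(texts)

# Append the numeric column to the sparse text features.
X = hstack([X_text, csr_matrix(numbers)]).tocsr()
print(X.shape)
```

The resulting X can be passed to any scikit-learn estimator that accepts sparse input. Scaling the numeric column before stacking is often worth trying.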
Brian

On 6/20/14, 12:05 PM, Abijith Kp wrote:
> Can anyone help me with the problem of dealing with features that are
> both strings of varying length (say, from 0 to 100-150 characters) and
> numbers?
>
> What are the most widely used techniques in this kind of situation?
> And can it be solved using only scikit-learn?
>
> PS: Initially I have to convert a JSON file to a feature list, and
> then use it.
>
> Any help is appreciated.
>
> Regards,
> Abijith
>
> --
> Abijith KP
> github.com/abijith-kp <http://github.com/abijith-kp>
> kpabijith.wordpress.com <http://kpabijith.wordpress.com>
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
