On Jan 3, 2012, at 17:02, Olivier Grisel wrote:

> 2012/1/3 Lars Buitinck <[email protected]>:
>> 
>>> We probably need to extend the sklearn.feature_extraction.text package
>>> to make it more user friendly to work with pure categorical
>>> feature occurrences:
>> 
>> I'm not sure this belongs in feature_extraction.text; it's much more
>> broadly applicable.
>> 
>> If you poke around my branches on GitHub, you'll find some preliminary
>> work on both a one-hot transformer and an ARFF (Weka format) reader. I
>> think the latter would be very convenient for those wanting mixed
>> numerical/categorical data sets.
> 
> Noted. I don't plan to work on this in the short term but I'll make
> sure to check your work on ARFF if I ever change my mind.
> Indeed such a generic mixed numerical / categorical feature extractor
> would be a very useful contrib to the scikit.

Isn't this the kind of data that one might store as a pandas DataFrame with 
some categorical columns? Maybe we should be able to load from this format as 
well?
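
To make the idea concrete, here is a rough sketch of the kind of one-hot
expansion being discussed for mixed numerical/categorical records. It is plain
Python only and is not the code in Lars's branch nor an existing scikit-learn
or pandas API; the function name `one_hot_encode` and the sample records are
made up for illustration, and I am assuming that string values are categorical
and everything else is numeric.

```python
def one_hot_encode(records):
    """Expand a list of dicts with mixed numeric/string values into dense rows.

    Hypothetical helper: string values are treated as categorical and get one
    indicator column per observed (name, value) pair; other values are assumed
    to be numeric and keep a single column.
    """
    # First pass: collect the output column names.
    columns = set()
    for record in records:
        for name, value in record.items():
            if isinstance(value, str):
                columns.add(f"{name}={value}")
            else:
                columns.add(name)
    columns = sorted(columns)
    index = {col: i for i, col in enumerate(columns)}

    # Second pass: fill one dense row per record (missing features stay 0.0).
    rows = []
    for record in records:
        row = [0.0] * len(columns)
        for name, value in record.items():
            if isinstance(value, str):
                row[index[f"{name}={value}"]] = 1.0
            else:
                row[index[name]] = float(value)
        rows.append(row)
    return columns, rows


# Made-up sample data: one numeric and one categorical feature.
records = [
    {"age": 31, "city": "Paris"},
    {"age": 45, "city": "London"},
    {"age": 27, "city": "Paris"},
]
columns, rows = one_hot_encode(records)
```

A loader for a DataFrame-like container would presumably do the same thing
column by column, using the column type to decide what is categorical
rather than guessing from the Python type of each value.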

> -- 
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
> 
> ------------------------------------------------------------------------------
> Write once. Port to many.
> Get the SDK and tools to simplify cross-platform app development. Create 
> new or port existing apps to sell to consumers worldwide. Explore the 
> Intel AppUpSM program developer opportunity. appdeveloper.intel.com/join
> http://p.sf.net/sfu/intel-appdev
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

