If you have data in the form of a list of dictionaries like this:
data = [{'target': 0 , 'featureVector' : [...]}, {'target': 1,
'featureVector': [...]}, ... ]
you can use pandas to convert it easily into something that scikit-learn
will accept:
In [18]: import pandas
In [19]: from sklearn import naive_bayes
In [20]: data = [{'target': 0 , 'featureVector' : [1,2,3]}, {'target': 1,
'featureVector': [2,1,2]}]
In [21]: dframe = pandas.DataFrame(data)
In [24]: list(dframe.target)
Out[24]: [0, 1]
In [25]: list(dframe.featureVector)
Out[25]: [[1, 2, 3], [2, 1, 2]]
In [33]: nb = naive_bayes.MultinomialNB().fit(list(dframe.featureVector),
list(dframe.target))
In [34]: nb.predict([[1,2,3]])
Out[34]: array([0])
I haven't managed to feed dframe.featureVector directly into the fit method,
most likely because the column holds Python lists and so converts to a 1-D
object array rather than the 2-D numeric array that fit expects; hence the
list(...) calls above. Still, using pandas can help a lot, especially if you
need to do some preprocessing.
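
As a rough sketch of the same idea in script form, the list-valued column can
be stacked into a plain 2-D NumPy array before calling fit. This assumes every
featureVector has the same length; the names X, y and prediction are only
illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.naive_bayes import MultinomialNB

data = [
    {'target': 0, 'featureVector': [1, 2, 3]},
    {'target': 1, 'featureVector': [2, 1, 2]},
]
dframe = pd.DataFrame(data)

# Stack the per-row lists into one array of shape (n_samples, n_features).
# Assumption: all feature vectors have the same length.
X = np.array(dframe['featureVector'].tolist())

# The target column is already a flat numeric column.
y = dframe['target'].to_numpy()

nb = MultinomialNB().fit(X, y)

# predict expects a 2-D input: one row per sample.
prediction = nb.predict(np.array([[1, 2, 3]]))
```

Once X is a regular array, the usual NumPy and scikit-learn preprocessing
tools can be applied to it before fitting.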
Rafael Calsaverini
Data Scientist @ Apontador.com
Ph.D. Student @ Instituto de Física - USP
cell: +55 11 7525.6222
work: +55 11 3845.0845
On Fri, Apr 5, 2013 at 6:37 AM, Bill Power <[email protected]> wrote:
> i know this is going to sound a little silly, but I was thinking there
> that it might be nice to be able to do this with scikit learn
>
> clf = sklearn.anyClassifier()
> clf.fit( { 0: dataWithLabel0,
> 1: dataWithLabel1 } )
>
> instead of having to separate the data/labels manually. i guess fit would
> do that internally, but it might be nice to have this
>
> bill
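
For what it's worth, the dictionary-style call can be approximated today with
a small wrapper. The sketch below is only an illustration: fit_from_label_dict
is a hypothetical helper, not part of scikit-learn, and it assumes each
dictionary value is a list of equal-length feature vectors.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB


def fit_from_label_dict(clf, samples_by_label):
    """Hypothetical helper: fit clf from a {label: samples} mapping."""
    labels = list(samples_by_label)
    # Stack every group of samples into one (n_samples, n_features) array.
    X = np.vstack([np.asarray(samples_by_label[label]) for label in labels])
    # Repeat each label once per sample in its group, in the same order.
    y = np.concatenate(
        [np.full(len(samples_by_label[label]), label) for label in labels]
    )
    return clf.fit(X, y)


clf = fit_from_label_dict(
    MultinomialNB(),
    {0: [[1, 2, 3], [1, 3, 4]], 1: [[2, 1, 2], [3, 1, 1]]},
)
prediction = clf.predict([[1, 2, 3]])
```

The wrapper just does the separation of data and labels in one place, so the
classifier itself still sees the usual (X, y) pair.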
>
>
> ------------------------------------------------------------------------------
> Minimize network downtime and maximize team effectiveness.
> Reduce network management and security costs.Learn how to hire
> the most talented Cisco Certified professionals. Visit the
> Employer Resources Portal
> http://www.cisco.com/web/learning/employer_resources/index.html
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>