Hi,

I switched to using the LabelBinarizer, but I'm not sure where to go from
there. I am still getting only one category per text.

My updated code is below:

y_train = ('New York', 'London')
train_set = ("new york nyc big apple", "london uk great britain")
vocab = {'new york': 0, 'nyc': 1, 'big apple': 2, 'london': 3, 'uk': 4,
         'great britain': 5}
count = CountVectorizer(analyzer=WordNGramAnalyzer(min_n=1, max_n=2),
                        vocabulary=vocab)
test_set = ('nice day in nyc', 'london town',
            'hello welcome to the big apple. enjoy it here and london too')

X_vectorized = count.transform(train_set).todense()
smatrix2 = count.transform(test_set).todense()

Y_indicator = LabelBinarizer().fit(y_train).transform(y_train)
base_clf = MultinomialNB(alpha=1)
clf = OneVsRestClassifier(base_clf).fit(X_vectorized, Y_indicator)
Y_pred = clf.predict(smatrix2)
print Y_pred


Thanks for your time and help

bilal


On Fri, May 11, 2012 at 5:30 AM, Olivier Grisel <[email protected]> wrote:

> Hi,
>
> To make OneVsRest work as a multilabel classifier instead of a
> multiclass classifier you need to "binarize" the label representation
> using LabelBinarizer as demonstrated in this example:
>
>
> http://scikit-learn.org/dev/auto_examples/plot_multilabel.html#example-plot-multilabel-py
>
> Also beware that most classifiers in scikit-learn expect integer
> labels instead of strings (you need to define a mapping from one
> representation to the other), although string labels might work for
> some of them. A new LabelEncoder class will soon be merged into master
> to help with that kind of representation switch.
>
> If you have further questions, please ask them on the project mailing
> list so that other scikit-learn users can benefit from your experience.
>
> Best,
>
> --
> Olivier
>
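
For reference, here is a minimal sketch of the binarization step described
above, written against a current scikit-learn release rather than the 2012
API used in this thread. Several things in it are assumptions and not part
of the original code: MultiLabelBinarizer (which did not exist at the time)
stands in for LabelBinarizer, CountVectorizer's ngram_range parameter stands
in for WordNGramAnalyzer, and each training label is wrapped in its own list
so that the target is a two-column indicator matrix. In current releases,
LabelBinarizer treats a flat sequence of strings as a multiclass target,
which may be why only one category comes back per text.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MultiLabelBinarizer

# Same toy data as in the thread, but each document now carries a *list*
# of labels (assumption: this is the multilabel layout that is wanted).
y_train = [["New York"], ["London"]]
train_set = ["new york nyc big apple", "london uk great britain"]
test_set = [
    "nice day in nyc",
    "london town",
    "hello welcome to the big apple. enjoy it here and london too",
]

vocab = {"new york": 0, "nyc": 1, "big apple": 2,
         "london": 3, "uk": 4, "great britain": 5}

# ngram_range=(1, 2) replaces the old WordNGramAnalyzer(min_n=1, max_n=2).
# With a fixed vocabulary, transform() can be called without fit().
count = CountVectorizer(ngram_range=(1, 2), vocabulary=vocab)
X_train = count.transform(train_set)
X_test = count.transform(test_set)

# One indicator column per label; columns follow mlb.classes_ (sorted).
mlb = MultiLabelBinarizer()
Y_indicator = mlb.fit_transform(y_train)

# One binary MultinomialNB per label column.
clf = OneVsRestClassifier(MultinomialNB(alpha=1.0)).fit(X_train, Y_indicator)
Y_pred = clf.predict(X_test)

# Map each predicted indicator row back to a tuple of label strings.
predicted_labels = mlb.inverse_transform(Y_pred)
print(mlb.classes_)
print(Y_pred)
print(predicted_labels)
```

With only two training documents, the per-label probabilities for a text
that mentions both cities can come out tied, so the third test sentence is
not guaranteed to receive both labels. More training examples, including
some that carry several labels at once, would probably be needed for
reliable multilabel output.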
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general