Hi Paul.
Sorry, I don't completely follow the code.
So you say dataActs_array has entries zero and one and there are 1505 ones
and 774 zeros?
An easier way would be to use np.unique(dataActs_array) and 
np.bincount(dataActs_array).
Could it be that the entries of dataActs_array are strings? That is what 
the keys in your printed dict look like.
Can you give us the dtype of y_test and y_predict and their content 
(np.unique)?
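
For illustration, here is a minimal sketch of that check. The array below is a made-up stand-in for dataActs_array (string labels with the counts from your post), not your actual data. Note that np.bincount only accepts non-negative integers, so string labels have to be converted first; the return_counts argument of np.unique assumes a reasonably recent NumPy.

```python
import numpy as np

# Hypothetical stand-in for dataActs_array: string labels with the
# class counts reported in the original post.
labels = np.array(['1'] * 1505 + ['0'] * 774)

# A string dtype here (kind 'U' or 'S') means the labels are not integers.
print(labels.dtype)

# np.unique works on strings and can return the per-class counts.
classes, counts = np.unique(labels, return_counts=True)
print(classes, counts)

# np.bincount requires non-negative integers, so convert before counting.
int_labels = labels.astype(int)
print(np.bincount(int_labels))
```

The same three calls on y_test and y_predict should show whether both hold the same label values and dtype.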

I think there was an issue with the confusion matrix not handling string 
labels, but I am not sure if it was fixed.
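
If string labels do turn out to be the problem, one possible workaround is to map them to integer codes before building the matrix. The sketch below does this with plain NumPy and tallies the matrix by hand, so it does not depend on how any particular scikit-learn version treats strings. The y_test and y_predict values are invented for the example.

```python
import numpy as np

# Invented example labels; in practice these would be y_test and y_predict.
y_test = np.array(['1', '0', '1', '1', '0'])
y_predict = np.array(['1', '1', '1', '0', '0'])

# Map every distinct label (from both arrays) to an integer code 0..n-1.
classes, codes = np.unique(np.concatenate([y_test, y_predict]),
                           return_inverse=True)
true_codes = codes[:len(y_test)]
pred_codes = codes[len(y_test):]

# Rows are true classes, columns are predicted classes.
n_classes = len(classes)
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (true_codes, pred_codes), 1)

print(classes)
print(cm)
```

With two distinct labels this gives a 2x2 matrix whose entries sum to the number of test samples, which is a quick sanity check against the 4x4 all-zero output.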

Best,
Andy



On 09.11.2012 16:48, [email protected] wrote:
> Dear SciKitters,
>
> given a dataset (2200 samples, 90 features), I want to train an RF but run
> into an interesting issue.
>
> My array containing the labels (dataActs_array) only shows 2 classes:
> "
> from collections import defaultdict
> d = defaultdict(int)
> for elt in dataActs_array:
>      d[elt] += 1
> print d
> defaultdict(<type 'int'>, {'1': 1505, '0': 774})
> "
>
> However, after splitting into test and train:
> "
> from sklearn.cross_validation import train_test_split
> X_train, X_test, y_train, y_test = train_test_split(
>     dataDescrs_array, dataActs_array, test_size=.4)
> "
>
> the confusion matrix outputs 4 classes:
> "
> from sklearn.ensemble import RandomForestClassifier
> from sklearn import metrics
> clf_RF = RandomForestClassifier()
> clf_RF = clf_RF.fit(X_train,y_train)
> y_predict = clf_RF.predict(X_test)
> print metrics.confusion_matrix(y_test,y_predict)
> [[0 0 0 0]
>   [0 0 0 0]
>   [0 0 0 0]
>   [0 0 0 0]]
> "
>
> Where have all my classes gone?
> Why do I end up in a 4*4 array?
>
>
> Cheers & Thanks,
> Paul
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

