[Scikit-learn-general] Manual categories/separate classifiers

2014-05-23 Thread Tim Head
Hello, a naive question about what I should do and what already exists in scikit-learn. I have a classification problem with two classes, and I know that one of my features has two different distributions for one of the classes. Example made up on the spot (real life is more complicate

Re: [Scikit-learn-general] Manual categories/separate classifiers

2014-05-31 Thread Tim Head
Hi Gilles, On 23 May 2014 15:06, Gilles Louppe wrote: > Hi Tim, > > In principle, what you describe corresponds exactly to the decision tree > algorithm. You partition the input space into smaller subspaces, on which > you recursively build sub-decision trees. > Exactly. What I was wondering wa

[Scikit-learn-general] CV scores vs scores on a manual split

2015-02-18 Thread Tim Head
Hello, I was comparing scores from CV with a score obtained from training on a subset of the data used in the CV, and I get very different answers. This surprised me; should it? If not, how do I understand how/why this happens? I run: scores = cross_validation.cross_val_score(clf, X_dev, y_dev, sc

Re: [Scikit-learn-general] CV scores vs scores on a manual split

2015-02-19 Thread Tim Head
Hi Gilles, On Thu Feb 19 2015 at 8:35:35 AM Gilles Louppe wrote: > Hi Tim, > > By default, cross_val_score uses StratifiedKFold(shuffle=False) to > create the train/test folds while train_test_split uses ShuffleSplit. > The discrepancy you observe might therefore come from either > shuffling,
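
The contrast Gilles draws between unshuffled folds and a shuffled split can be illustrated with a minimal pure-Python sketch. It does not call scikit-learn and deliberately leaves out stratification; the helper names `contiguous_folds` and `shuffled_split` are hypothetical and only mimic the index bookkeeping involved.

```python
import random


def contiguous_folds(n_samples, n_folds):
    """Yield (train, test) index lists; test folds follow the original sample order."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def shuffled_split(n_samples, test_fraction, seed):
    """Return one (train, test) pair drawn after shuffling the indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


folds = list(contiguous_folds(12, 3))
train, test = shuffled_split(12, 0.25, seed=0)

for fold_train, fold_test in folds:
    print("unshuffled test fold:", fold_test)
print("shuffled test split: ", test)
```

If the rows are stored in a meaningful order (for example sorted by acquisition time), the unshuffled folds each test on one contiguous block, while the shuffled split samples from the whole range, so scores from the two procedures need not agree.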

Re: [Scikit-learn-general] CV scores vs scores on a manual split

2015-02-19 Thread Tim Head
Hi, On Thu Feb 19 2015 at 10:58:26 PM Andy wrote: > You give the roc_auc_score the result of "predict". You should give it > the result of "predict_proba". > > Yes! Though for me roc_auc_score complains if I pass the result of predict_proba (wrong shape) but the output of decision_function() w
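
Andy's point, that a ROC AUC computed from hard class labels differs from one computed from continuous scores, can be sketched in pure Python. The `auc` function below is a hand-rolled pairwise-ranking stand-in, not scikit-learn's `roc_auc_score`, and the data are made up. The two-column `proba` list imitates the one-column-per-class layout that `predict_proba` returns for a binary problem, which is a likely source of the shape complaint; selecting the positive-class column is the usual remedy.

```python
def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and two-column class probabilities (column 1 = positive class).
y_true = [0, 0, 0, 1, 1, 1]
proba = [[0.9, 0.1], [0.7, 0.3], [0.55, 0.45], [0.6, 0.4], [0.4, 0.6], [0.1, 0.9]]

# Keep only the positive-class column, analogous to predict_proba(X)[:, 1].
scores = [row[1] for row in proba]

# Hard labels at a 0.5 threshold, analogous to what predict() would return.
hard = [1 if s >= 0.5 else 0 for s in scores]

auc_scores = auc(y_true, scores)
auc_hard = auc(y_true, hard)
print("AUC from scores:     ", auc_scores)
print("AUC from hard labels:", auc_hard)
```

Thresholding collapses the ranking to two levels, so every pair on the same side of the threshold becomes a tie and the resulting AUC generally differs from the score-based one.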

Re: [Scikit-learn-general] User Survey

2013-02-04 Thread Tim Head
Hi Andreas, On Sun, Feb 3, 2013 at 6:47 PM, Andreas Mueller wrote: > > On 02/03/2013 06:34 PM, Ronnie Ghose wrote: > > just wondering... what do the % signs mean? iirc they should sum to 100, > right? in this case the top sums to 38862? > > Sorry, the HTML formatting is not so great. > Ther