[Scikit-learn-general] LogisticRegression: sample vs class weights

2015-04-20 Thread iBayer
Hi, I was surprised to read that class weights are implemented via sampling for LogisticRegression. Is this really the case? From the LR doc:

    class_weight : {dict, 'auto'}, optional
        Over-/undersamples the samples of each class according to the
        given weights. If not given,
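[A minimal sketch of the class_weight interface. Hedged: to my understanding the weights rescale each class's contribution to the penalized loss (e.g. a per-class C in liblinear) rather than literally resampling rows, despite the docstring's wording; the toy data and weight values are invented for illustration:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Imbalanced toy problem: ~90% negatives, ~10% positives.
    X, y = make_classification(n_samples=200, weights=[0.9, 0.1],
                               random_state=0)

    # Upweight the rare positive class 5x relative to the negative class.
    clf = LogisticRegression(class_weight={0: 1, 1: 5}).fit(X, y)

    # 'auto' picks weights inversely proportional to class frequencies
    # (renamed 'balanced' in later scikit-learn releases).
    clf_auto = LogisticRegression(class_weight='auto').fit(X, y)
]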

Re: [Scikit-learn-general] randomized grid search

2015-04-20 Thread Pagliari, Roberto
Got it, thanks! One last question: are there heuristics or rules of thumb about which distributions should be used, or tend to work best, with gradient boosting classifiers (tree depth, minimum number of samples, learning rate, etc.)? Thank you,
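[Not an official rule, but one common pattern, sketched below with parameter ranges invented for illustration: a discrete uniform (randint) for integer-valued parameters, and a continuous distribution for rate-like parameters such as the learning rate:

    from scipy.stats import randint, uniform
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.grid_search import RandomizedSearchCV  # sklearn.model_selection in later releases

    # Rough heuristic, not an official rule: discrete uniform for
    # integer-valued parameters, continuous for rate-like ones.
    param_dist = {
        "max_depth": randint(2, 8),            # integers in {2, ..., 7}
        "min_samples_split": randint(2, 20),
        "learning_rate": uniform(0.01, 0.19),  # continuous on [0.01, 0.2)
    }

    X, y = make_classification(n_samples=500, random_state=0)
    search = RandomizedSearchCV(GradientBoostingClassifier(), param_dist,
                                n_iter=20, random_state=0)
    search.fit(X, y)
    print(search.best_params_)
]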

Re: [Scikit-learn-general] LogisticRegression: sample vs class weights

2015-04-20 Thread Mathieu Blondel
Last time I checked, liblinear didn't support sample weights, just class weights (one for positive samples and another for negative samples). Mathieu
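[If true per-sample weights are what's needed, one workaround at the time was SGDClassifier with logistic loss, whose fit() accepts sample_weight. A sketch (not the liblinear path; weight values invented for illustration):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier

    X, y = make_classification(n_samples=200, random_state=0)
    weights = np.where(y == 1, 5.0, 1.0)  # e.g. upweight positive samples

    # loss="log" gives logistic regression (renamed "log_loss" in recent
    # releases); fit() takes per-sample weights, unlike the
    # liblinear-backed LogisticRegression.
    clf = SGDClassifier(loss="log", random_state=0)
    clf.fit(X, y, sample_weight=weights)
]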

Re: [Scikit-learn-general] logistic regression: need p-values

2015-04-20 Thread Gael Varoquaux
More important than the statement from Sturla, with which I may or may not agree depending on the modeling assumption (and every p-value rests on a modeling assumption): the logistic regression in scikit-learn is a penalized logistic model, so the closed-form formulas for p-values are not valid. G
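[If closed-form p-values are genuinely needed, the usual route is an unpenalized maximum-likelihood fit in a statistics package; a sketch using statsmodels rather than scikit-learn, on invented toy data:

    import statsmodels.api as sm
    from sklearn.datasets import make_classification

    # Wald p-values assume an *unpenalized* maximum-likelihood fit;
    # statsmodels provides one (scikit-learn's LogisticRegression is
    # penalized, so these formulas do not apply to it).
    X, y = make_classification(n_features=4, n_informative=4,
                               n_redundant=0, random_state=0)
    result = sm.Logit(y, sm.add_constant(X)).fit()
    print(result.pvalues)  # one p-value per coefficient (incl. intercept)
]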

Re: [Scikit-learn-general] TSNE Memory Error

2015-04-20 Thread Alexander Fabisch
Oh, I mean that it is a problem of the t-SNE implementation, not of the MNIST implementation. I don't know how that could happen. :D

Re: [Scikit-learn-general] randomized grid search

2015-04-20 Thread Vlad Niculae
Hi Roberto,

> what does None do for max_depth?

Copy-pasted from http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html: “If None, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples.” In particular,

Re: [Scikit-learn-general] randomized grid search

2015-04-20 Thread Pagliari, Roberto
Hi Vlad, when using randomized grid search, does sklearn look into intermediate values, or does it sample from the values provided in the parameter grid? Thank you,

Re: [Scikit-learn-general] Performance of LSHForest

2015-04-20 Thread Daniel Vainsencher
On 04/19/2015 08:18 AM, Joel Nothman wrote: On 17 April 2015 at 13:52, Daniel Vainsencher daniel.vainsenc...@gmail.com wrote: On 04/16/2015 05:49 PM, Joel Nothman wrote: I more or less agree. Certainly we only need to do one searchsorted per

[Scikit-learn-general] randomized grid search

2015-04-20 Thread Pagliari, Roberto
From the example in the documentation:

    # specify parameters and distributions to sample from
    param_dist = {"max_depth": [3, None],
                  "max_features": sp_randint(1, 11),
                  "min_samples_split": sp_randint(1, 11),
                  "min_samples_leaf": sp_randint(1, 11),
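[The preview cuts off mid-dict; completed into a self-contained form below (a sketch: the documentation example this comes from used RandomForestClassifier on the digits data, if I recall correctly):

    from scipy.stats import randint as sp_randint
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.grid_search import RandomizedSearchCV  # sklearn.model_selection in later releases

    # specify parameters and distributions to sample from
    param_dist = {"max_depth": [3, None],
                  "max_features": sp_randint(1, 11),
                  "min_samples_split": sp_randint(1, 11),  # >= 2 required in later releases
                  "min_samples_leaf": sp_randint(1, 11)}

    digits = load_digits()
    search = RandomizedSearchCV(RandomForestClassifier(), param_dist,
                                n_iter=20, random_state=0)
    search.fit(digits.data, digits.target)
    print(search.best_params_)
]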

Re: [Scikit-learn-general] randomized grid search

2015-04-20 Thread Andreas Mueller
If you have continuous parameters you should really, really, really use continuous distributions!

Re: [Scikit-learn-general] randomized grid search

2015-04-20 Thread Pagliari, Roberto
Yes, I agree. From the example, though, my understanding is that you can only pass arrays, not functions. Isn't that true? Thank you,

Re: [Scikit-learn-general] randomized grid search

2015-04-20 Thread Vlad Niculae
The example you cite contains these lines:

    "max_features": sp_randint(1, 11),
    "min_samples_split": sp_randint(1, 11),
    "min_samples_leaf": sp_randint(1, 11),

Those are not lists, but distribution objects from scipy (see at the top of the example, `from
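[The mechanic worth knowing, sketched below: RandomizedSearchCV draws from any object exposing .rvs(), so each of the n_iter candidates gets a fresh sample, while plain lists are sampled uniformly from their entries:

    from scipy.stats import randint as sp_randint

    # sp_randint(1, 11) is a frozen scipy distribution over {1, ..., 10};
    # RandomizedSearchCV calls its .rvs() once per sampled candidate.
    dist = sp_randint(1, 11)
    print(dist.rvs(size=5))  # five random integers from 1 to 10 inclusive
]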

Re: [Scikit-learn-general] randomized grid search

2015-04-20 Thread Vlad Niculae
The User Guide has an example that better illustrates what Andy meant: for continuous parameters such as C and gamma in a Gaussian-kernel SVM, you should use a continuous distribution (e.g. exponential):
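[From memory, that User Guide snippet looks roughly like the sketch below; the scale values are illustrative, not prescriptive:

    from scipy.stats import expon
    from sklearn.datasets import load_iris
    from sklearn.grid_search import RandomizedSearchCV  # sklearn.model_selection in later releases
    from sklearn.svm import SVC

    # Exponential distributions let the search probe several orders of
    # magnitude for C and gamma instead of a handful of fixed values.
    param_dist = {"C": expon(scale=100),
                  "gamma": expon(scale=0.1),
                  "kernel": ["rbf"]}

    iris = load_iris()
    search = RandomizedSearchCV(SVC(), param_dist, n_iter=20, random_state=0)
    search.fit(iris.data, iris.target)
    print(search.best_params_)
]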

Re: [Scikit-learn-general] TSNE Memory Error

2015-04-20 Thread Jason Wolosonovich
Oh wow, very cool. Thank you very much for the assistance and info Alexander! -Original Message- From: afabisch [mailto:afabi...@mailhost.informatik.uni-bremen.de] Sent: Saturday, April 18, 2015 9:15 AM To: scikit-learn-general@lists.sourceforge.net Subject: Re: [Scikit-learn-general]