four
thousand times a month after launch.
All the best,
Sebastian Flennerhag
Would you be willing to share the notebook? It sounds interesting.
On May 17, 2016 5:33 AM, "Andreas Mueller" wrote:
> Hm we need to update the websites. Maybe the stable one, too.
> I kind of forgot about that.
>
> Mathieu: via the mailman web interface.
> Though I have no idea how I extracted t
You may also be interested in the 'powerlaw' Python package, which detects
the tail cutoff.
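For illustration, a minimal sketch of how the third-party 'powerlaw' package is typically used (the synthetic data and the specific calls below are my own assumption, not part of the original message):

```python
import numpy as np
import powerlaw

data = np.random.pareto(2.5, 10000) + 1.0        # synthetic heavy-tailed data
fit = powerlaw.Fit(data)                          # estimates the tail cutoff (xmin) automatically
print(fit.power_law.xmin, fit.power_law.alpha)    # estimated cutoff and tail exponent
R, p = fit.distribution_compare('power_law', 'lognormal')
```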
On May 26, 2016 5:46 AM, "Warren Weckesser"
wrote:
>
>
> On Thu, May 26, 2016 at 2:08 AM, Startup Hire
> wrote:
>
>> Hi all,
>>
>> Hope you are doing good.
>>
>> I am working on a project where I need to
may want to run
import numpy
numpy.test('full')
import scipy
scipy.test('full')
to narrow down the problem further.
And how did you compile & install scikit-learn?
Best,
Sebastian
> On Jun 1, 2016, at 1:24 PM, Ruchika Nayyar wrote:
>
>
> Thanks,
> Ruchika
> -
ssues/6706)!
Like Maniteja suggested, it is likely due to "a mismatch between numpy
installed and the one scikit-learn is compiled with".
Best,
Sebastian
> On Jun 1, 2016, at 1:55 PM, Ruchika Nayyar wrote:
>
> Hello Sebastian
>
> Thanks for some insight.. So here ar
python --version
Python 3.5.1 :: Continuum Analytics, Inc.
> On Jun 1, 2016, at 2:07 PM, Matthew Brett wrote:
>
> Hi,
>
> On Wed, Jun 1, 2016 at 11:00 AM, Sebastian Raschka
> wrote:
>> Sorry,
>>
>> $ python -c 'import scipy; print(scipy.__version__)'
1, 2016, at 2:39 PM, Matthew Brett wrote:
>
> On Wed, Jun 1, 2016 at 11:17 AM, Sebastian Raschka
> wrote:
>>> I think you're using system Python on the Mac. I'd really strongly
>>> recommend against that, because system Python
>>
>> Yeah, but I
's own method of managing environments:
>
> On Wed, Jun 1, 2016 at 2:43 PM, Andrea Bravi wrote:
>
> Hi guys,
>
>
> I recommend using https://virtualenv.pypa.io to solve those issues!
>
>
> Best regards,
>
> Andrea
>
>
> On Wednesday, 1 June 20
to the reviewers, everyone gave their okay, the
CI tests pass, I think there’s nothing against summarizing it to a single
commit:
- implement EstimatorX
In my opinion, it makes it easier to track down code in the commit history in
the long run, but that's just my personal preference.
Best,
Sebastian
Oh wow, that looks like a neat feature, didn’t know about this, thanks for
sharing!
(And I would be in favor of this)
> On Jun 14, 2016, at 5:34 AM, Tom DLT wrote:
>
> We could stop squashing during development, and use the new Squash-and-Merge
> button on GitHub.
> What do you think?
> Tom
>
ehaviour demonstrates:
Best,
Sebastian
> On Jun 17, 2016, at 11:01 AM, Philip Tully wrote:
>
> Hi all,
>
> I notice when I train a model and expose the predict function through a web
> API, predict takes longer to run in a multi-threaded environment than a
> single-thr
am I too
conservative?
Best,
Sebastian
> On Jun 17, 2016, at 11:01 AM, Philip Tully wrote:
>
> Hi all,
>
> I notice when I train a model and expose the predict function through a web
> API, predict takes longer to run in a multi-threaded environment than a
> single-th
I think
> FeatureUnion[n_jobs=1] + GridSearch[n_jobs <= cores]
would be better regarding the nested parallelism limitation
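As an illustration (a toy example of my own, not code from the thread), keeping the FeatureUnion serial and giving the cores only to the outer grid search could look like this:

```python
# Sketch: serial FeatureUnion inside the pipeline, parallelism only in GridSearchCV.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline

X, y = load_iris(return_X_y=True)

union = FeatureUnion([('pca', PCA(n_components=2)),
                      ('kbest', SelectKBest(k=1))], n_jobs=1)    # keep this serial
pipe = Pipeline([('features', union),
                 ('clf', LogisticRegression(max_iter=1000))])

gs = GridSearchCV(pipe, {'clf__C': [0.1, 1.0, 10.0]}, n_jobs=2)  # parallelize only here
gs.fit(X, y)
```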
> On Jun 17, 2016, at 11:46 AM, Philip Tully wrote:
>
> Gotcha - so perhaps I should ensure FeatureUnion[n_jobs] + GridSearch[n_jobs]
> < # cores?
>
> On Fri, Jun 17,
typically memory capacity,
especially if you are using multiprocessing via the cv param.
PS:
> regular numpy matrix
I think you mean "numpy array"? (There is also a numpy matrix data structure in
numpy, but almost no one uses it.)
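(A tiny illustration of that distinction, added here for clarity; the values are arbitrary:)

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])     # the usual ndarray, i.e., the "numpy array"
m = np.matrix([[1, 2], [3, 4]])    # the legacy np.matrix class, rarely used
print(type(a), type(m))
```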
Best,
Sebastian
> On Jun 30, 2016, at 6:2
running Windows XP). E.g., via conda you could do
# Create
set CONDA_FORCE_32BIT=1
conda create -n 32bit_py27 python=2
# Activate
set CONDA_FORCE_32BIT=1
activate 32bit_py27
Best,
Sebastian
> On Jul 20, 2016, at 1:05 AM, lin yenchen wrote:
>
> Hi all,
>
> currently the CI te
/stable/
- A different browser
- clearing the browser cache
Hope one of these works!
Best,
Sebastian
> On Jul 21, 2016, at 12:27 PM, Rahul Ahuja wrote:
>
> Yes I can open github pages.
>
>
>
>
>
> Kind regards,
> Rahul Ahuja
>
>
, sounds tricky
… Another thing you could try is visiting the site via a proxy. E.g., try to
go to
https://hide.me/en/proxy
and type "scikit-learn.org" into the form field.
Best,
Sebastian
> On Jul 21, 2016, at 2:18 PM, Rahul Ahuja wrote:
>
>
>
> yes it does via that link
Glad to hear that it works at least.
> but it may not be permanent solution?
Yeah, that’s probably not ideal, and I am not sure if there’s a better solution
if your country’s government prohibits the use of github :(.
> On Jul 21, 2016, at 3:29 PM, Rahul Ahuja wrote:
>
>
ython 2.7, 3.4 etc?
Or are you only thinking about the "comment" syntax? E.g.,

def hello(r, c=5):
    s = 'hello'  # type: str
    return '(%d + %d) times %s' % (r, c, s)

which should work on all Python versions.
Best,
Sebastian
> On Jul 28, 2016, at 12:49 PM, Andreas Muelle
I think that should work fine for the `pip install scikit-learn`, however, I
think the problem was with upgrading, right?
E.g., if you run
pip install scikit-learn --upgrade
it would try to upgrade numpy and scipy as well, which may not be desired. I
think the only workaround would be to run
xample, in Jupyter Notebooks/IPython regarding the
shift-tab function help. However, I’d say that your suggestion is the best bet
for now to maintain Py 2.x compatibility (until 2020 maybe :P).
Cheers,
Sebastian
> On Jul 29, 2016, at 12:55 PM, Daniel Moisset wrote:
>
> @Andreas,
virtualenv
(http://docs.python-guide.org/en/latest/dev/virtualenvs/).
Best,
Sebastian
> On Aug 1, 2016, at 3:55 PM, luizfgoncal...@dcc.ufmg.br wrote:
>
> I'm looking for the best way to install sklearn into a specific folder so
> I can make changes for my work, without worrying
Hm, that's an "interesting" approach by SO; I guess their idea is to build a
collection of code-and-example based snippets for less well-documented
libraries, especially libraries that want to keep their documentation lean.
> But I assume that copying without attribution is actually plagiaris
ixture models
(http://scikit-learn.org/stable/modules/mixture.html)
Best,
Sebastian
> On Aug 5, 2016, at 2:55 PM, Jared Gabor wrote:
>
> Lots of great suggestions on how to model your problem. But this might be
> the kind of problem where you seriously ask how hard it would be to g
gsearch1 = GridSearchCV(estimator=pipe, param_grid=grid)
gsearch1.fit(X, y)
Then, you can put in your desired preprocessing stuff into fit and transform.
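For illustration, a sketch of my own (with stand-in data and a hypothetical `MyPreprocessor` class) of how such custom fit/transform logic could be plugged into the pipeline:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class MyPreprocessor(BaseEstimator, TransformerMixin):
    def fit(self, X, y=None):
        self.mean_ = X.mean(axis=0)    # learn whatever statistics you need
        return self

    def transform(self, X):
        return X - self.mean_          # apply the preprocessing


X, y = np.random.rand(100, 4), np.random.randint(0, 2, 100)   # stand-in data
pipe = Pipeline([('prep', MyPreprocessor()),
                 ('clf', LogisticRegression())])
grid = {'clf__C': [0.1, 1.0, 10.0]}
gsearch1 = GridSearchCV(estimator=pipe, param_grid=grid)
gsearch1.fit(X, y)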
Best,
Sebastian
> On Sep 7, 2016, at 2:03 PM, Piotr Bialecki wrote:
>
> Hi all,
>
> I am currently tuning some parameters of
with my particular mailing
list account? (besides the mailing list, my usually arrives within 1-2 seconds,
so it’s not a problem with my email client or server in general).
Best,
Sebastian
Thanks! So it must be something on my side (or sth. weird with this email
account in combination with the Python mailing list). Sorry for spamming, but
let me try using my gmail account and send 2 mails simultaneously (I will later
delete one of the two).
9:30:30 AM EDT (from gmail)
> On Sep 8
Thanks! So it must be something on my side (or sth. weird with this email
account in combination with the Python mailing list). Sorry for spamming, but
let me try using my gmail account and send 2 mails simultaneously (I will later
delete one of the two).
9:29:50 AM EDT
> On Sep 8, 2016, at 9:
the bother :P
> On Sep 8, 2016, at 9:29 AM, Sebastian Raschka
> wrote:
>
> Thanks! So it must be something on my side (or sth. weird with this email
> account in combination with the Python mailing list). Sorry for spamming, but
> let me try using my gmail account and send 2 ma
StandardScaler attached to
it.
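A minimal sketch of that idea (stand-in data and alpha grid of my own; this assumes plain Lasso inside GridSearchCV rather than LassoCV, so the scaler is re-fit on each training fold):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = np.random.rand(100, 5), np.random.rand(100)        # stand-in data
pipe = make_pipeline(StandardScaler(), Lasso())
grid = GridSearchCV(pipe, {'lasso__alpha': [0.01, 0.1, 1.0]}, cv=5)
grid.fit(X, y)    # scaling is fit on each training fold only
```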
Best,
Sebastian
> On Sep 13, 2016, at 8:16 AM, Brenet, Yoann wrote:
>
> Hi all,
>
> I was trying to use scikit-learn LassoCV/RidgeCV while applying a
> 'StandardScaler' on each fold set. I do not want to apply the scaler before
>
really simple here.
I created a gist of 2 simple examples with images attached:
https://gist.github.com/rasbt/6fb65bba38b70e28e60a9842b988cc67
I think it is very likely that it is not a bug in scikit-learn but rather a
matplotlib contourf bug? In case it is a bug at all …
Best,
Sebastian
Thanks a lot, Jake,
‘viridis’ seems to work, indeed. I guess I should move this to the matplotlib
bug tracker then.
Best,
Sebastian
> On Sep 13, 2016, at 10:58 AM, Jacob Vanderplas
> wrote:
>
> It seems to work correctly if you replace the colormap with a continuous one
> li
the
book ;) I hope the release date in October is fixed! :).
Cheers,
Sebastian
> On Sep 14, 2016, at 7:26 PM, Andreas Mueller wrote:
>
> Hi all.
> We just published the 0.18-rc2 release candidate on PyPI and anaconda.org.
> Please go ahead and test it, so we can iron out the
ing the
> book ;) I hope the release date in October is fixed! :).
>
> Except that now it requires substantial revisions
>
> On 15 September 2016 at 09:43, Sebastian Raschka wrote:
> Thanks for all the effort putting it together! Looks like a nice set of
> features a
Scikit-learn’s GitHub repo already makes use of these templates. I think the
issue is more a technical one arising from their latest “style” changes.
> On Sep 16, 2016, at 8:25 AM, Dale T Smith wrote:
>
> A form – with required, pre-defined fields – can help when people submit
> bugs, issues,
ore important issues; I am sometimes a bit hesitant to
submit/tackle pull requests or issues since I feel like they are somewhat
distracting the core contributors from the more important stuff.
Best,
Sebastian
> On Sep 16, 2016, at 9:11 AM, Sebastian Raschka wrote:
>
> Scikit-learn’s
0.,
0., 0., 1.],
[ 1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 1., 0., 0.,
0., 0., 1.]])
Best,
Sebastian
> On Sep 19, 2016, at 5:45 PM, Lee Zamparo wrote:
>
> Hi sklearners,
>
> A lab-mate came to me with a problem about encoding DNA seq
I remember that there was a discussion regarding stacking in general after we
implemented the majority voting classifier, and I just found a PR with some
stacking implementation that seems to be in progress
https://github.com/scikit-learn/scikit-learn/pull/6674
> On Sep 20, 2016, at 8:02 PM, J
Have been playing around with the new functionality tonight. There are so many
great additions, especially the new CV functionality in the model_selection
module is super great. Nested CV is much more convenient now! Congratulations
to everyone, and thanks for this great new version! :)
> On S
)
lr.coef_
> Should I be coding my predictors as +1/-1?
0 and 1 should be just fine and is the expected default.
Best,
Sebastian
> On Sep 29, 2016, at 6:09 PM, Kristen M. Altenburger
> wrote:
>
> Hi All,
>
> I am trying to understand Python’s code [function ‘_fit_liblinear'
Maybe it’s worth switching to LOOCV since you may have a bit of a pessimistic
bias here due to the small training set size (in bootstrap you only have
asymptotically 0.632 unique samples for training). I would try both linear and
nonlinear models; instead of adding more features maybe also try t
Congrats Raghav! And thanks a lot for all the great work on the model_selection
module!
> On Oct 3, 2016, at 12:53 PM, Siddharth Gupta
> wrote:
>
> Congrats Raghav! :D
>
>
> On Oct 3, 2016 10:22 PM, "Aakash Agarwal" wrote:
> Congrats Raghav!
>
> On Mon, Oct 3, 2016 at 9:54 PM, Manoj Kumar
ally 0.632 * n unique samples in your
bootstrap set. Or in other words 0.368 * n samples are not used for growing the
respective tree (to compute the OOB). As far as I understand, the random forest
OOB score is then computed as the average OOB of each tree (correct me if I am
wrong!).
Best,
Sebast
bootstrap sample.
This is asymptotically "1/e approx. 0.368" (i.e., for very, very large n).
Then, you can compute the probability of a sample being chosen as
P(chosen) = 1 - (1 - 1/n)^n approx. 0.632
Best,
Sebastian
> On Oct 3, 2016, at 3:05 PM, Ibrahim Dalal via scikit-learn
> wro
import numpy as np
import matplotlib.pyplot as plt

n = np.arange(1, 201)
plt.plot(n, 1 - (1 - 1.0/n)**n, alpha=0.5)
plt.xlabel('n')
plt.ylabel('1 - (1 - 1/n)^n')
plt.xlim([0, 210])
plt.show()
> On Oct 3, 2016, at 3:15 PM, Sebastian Raschka wrote:
>
> Say the probability that a given sample from a dataset of size n is *not*
> drawn as a bootstrap sample is
>
>
ples are left out
> (theoretically at least), some of the samples in B must be repeated?
>
> On Tue, Oct 4, 2016 at 12:50 AM, Sebastian Raschka
> wrote:
> Or maybe more intuitively, you can visualize this asymptotic behavior e.g.,
> via
>
> import matplotlib.pyplot as
Ibrahim Dalal via scikit-learn
> wrote:
>
> So what is the point of having duplicate entries in your training set? This
> seems just a pure overhead. Sorry but you will again have to help me here.
>
> On Tue, Oct 4, 2016 at 1:29 AM, Sebastian Raschka
> wrote:
> > H
'virginica'], where 0 -> 'setosa', 1 -> 'versicolor',
2 -> ‘virginica’.
Best,
Sebastian
> On Oct 24, 2016, at 10:18 AM, greg g wrote:
>
>>> from sklearn.preprocessing import LabelEncoder
>>> le = LabelEncoder()
>>> y = le.fit_transform(labels)
>>> le.classes_
array(['Setosa', 'Versicolor', 'Virginica'], dtype='<U10')
>>> import numpy as np
>>> np.bincount(y)
array([50, 50, 50])
Best,
Sebastian
> On Oct 25, 2016, at 3:00 AM, greg g
om/rasbt/mlxtend/blob/master/docs/sources/user_guide/evaluate/mcnemar.ipynb
Best
Sebastian
> On Oct 30, 2016, at 3:24 PM, Suranga Kasthurirathne
> wrote:
>
>
> Hi folks!
>
> I'm using scikit-learn to build two neural networks using 10% holdout, and
> compare the
ork:
import matplotlib.pyplot as plt

model_1 = [0.85,  # experiment 1
           0.84]  # experiment 2
model_2 = [0.84,  # experiment 1
           0.83]  # experiment 2
plt.boxplot([model_1, model_2])
plt.show()
However, a boxplot based on only 2 values doesn't make sense imho; you could
just plot the range instead.
Best,
Sebastian
> On Oct 30
Yeah, there are many useful resources and implementations scattered around the
web. However, a good, brief overview of the general ideas and concepts would be
this one, for example: http://www.svds.com/learning-imbalanced-classes/
> On Nov 16, 2016, at 3:54 PM, Dale T Smith wrote:
>
> Unbala
or under-sampling would be more
> suitable?
>
> https://dl.dropboxusercontent.com/u/48168252/PCA_of_features.png
>
> thanks for your advices
> Thomas
>
>
> On 16 November 2016 at 22:20, Sebastian Raschka wrote:
> Yeah, there are many useful resources and implement
> If you keep everything at their default values, it seems to work -
>
> ```py
> from sklearn.neural_network import MLPClassifier
> X = [[0, 0], [0, 1], [1, 0], [1, 1]]
> y = [0, 1, 1, 0]
> clf = MLPClassifier(max_iter=1000)
> clf.fit(X, y)
> res = clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
Cheers,
Sebastian
> On Nov 24, 2016, at 8:08 PM, lin...@ruijie.com.cn wrote:
>
> @ Sebastian Raschka
> thanks for your analyzing ,
> here is another question, when I use neural network lib routine, can I save
> the trained network for use at the next time?
> Just like t
y of them need
> the number of outliers and a distance as input parameters in advance; is there
> a more intelligent algorithm?
>
>
>
>
>
> -----Original Message-----
> From: scikit-learn
> [mailto:scikit-learn-bounces+linjia=ruijie.com...@python.org] On Behalf Of Sebastian
> Raschka
>
At first glance, the plot shown in the image and the code example seem to
show the same thing? Maybe it would be worth adding an explanatory figure
like this to the docs to clarify?
> On Nov 28, 2016, at 7:07 PM, Joel Nothman wrote:
>
> If that clarifies, please offer changes to the exampl
I have an ipynb where I did the nested CV more “manually” in sklearn 0.17 vs
sklearn 0.18 — I intended to add it as an appendix to a blog article (model
eval part 4), which I had no chance to write, yet. Maybe the sklearn 0.17 part
is a bit more obvious (although way less elegant) than the sklea
transform(X_test)
Good luck!
Sebastian
> On Dec 6, 2016, at 6:12 AM, lin...@ruijie.com.cn wrote:
>
> Hi all:
> I uses a ‘Car Evaluation’ dataset from
> http://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data to test
> the effect of MLP. (I transfer some
surface).
Best,
Sebastian
> The default is set 100 units in the hidden layer, but theoretically, it
> should work with 2 hidden logistic units (I think that’s the typical
> textbook/class example). I think what happens is that it gets stuck in local
> minima depending on the r
n is available via the loss_ attribute:
mlp = MLPClassifier(…)
# after training:
mlp.loss_
> On Dec 8, 2016, at 9:55 AM, Thomas Evangelidis wrote:
>
> Hello Sebastian,
>
> I did normalization of my training set and used the same mean and stdev
> values to normalize my test set, ins
dard deviation to get
“z” scores (e.g., this can be done by the StandardScaler()).
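A minimal sketch of that procedure (the train/test arrays here are random stand-ins):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.random.rand(80, 5)     # stand-in training data
X_test = np.random.rand(20, 5)      # stand-in test data

sc = StandardScaler().fit(X_train)  # mean/std estimated on the training set only
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)   # test set scaled with the training statistics
```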
Best,
Sebastian
> On Dec 15, 2016, at 4:02 PM, Rachel Melamed wrote:
>
> I just tried it and it did not appear to change the results at all?
> I ran it as follows:
> 1) Normalize dummy variables (by
d for the LogisticRegressionCV, and should the n_jobs
docstring in LogisticRegression
be described as "Number of CPU cores used for model fitting” instead of “during
cross-validation,” or am I getting this wrong?
Best,
Sebastian
Thanks, Thomas, that makes sense! Will submit a PR then to update the docstring.
Best,
Sebastian
> On Dec 19, 2016, at 11:06 AM, Thomas Evangelidis wrote:
>
>
> Greetings,
>
> My dataset consists of objects which are characterised by their structural
> features whi
representations, e.g., learning from
the graphs directly:
http://papers.nips.cc/paper/5954-convolutional-networks-on-graphs-for-learning-molecular-fingerprints.pdf
http://pubs.acs.org/doi/abs/10.1021/ci400187y
Best,
Sebastian
> On Dec 19, 2016, at 4:56 PM, Thomas Evangelidis wrote:
>
> t
Thanks, Tom, that makes sense. Submitted a PR to fix that.
Best,
Sebastian
> On Dec 19, 2016, at 10:14 AM, Tom DLT wrote:
>
> Hi,
>
> In LogisticRegression, n_jobs is only used for one-vs-rest parallelization.
> In LogisticRegressionCV, n_jobs is used for both one-vs
small the
sample/feature ratio), I think there are way too many (hyper/)parameters to fit
in an MLP to get good results. I think you could be better off with a kernel
SVM (if linear models don’t work well) or ensemble learning.
Best,
Sebastian
> On Dec 19, 2016, at 6:51 PM, Thomas Evangeli
the estimator that you
can initialize with “refit=False” to avoid refitting if it helps.
http://rasbt.github.io/mlxtend/user_guide/classifier/EnsembleVoteClassifier/#example-5-using-pre-fitted-classifiers
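A minimal sketch of that usage (toy data and base classifiers of my own choosing; the `refit=False` option follows the linked mlxtend example):

```python
from mlxtend.classifier import EnsembleVoteClassifier
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf1 = LogisticRegression(max_iter=1000).fit(X, y)   # pre-fitted classifier
clf2 = DecisionTreeClassifier().fit(X, y)            # pre-fitted classifier

eclf = EnsembleVoteClassifier(clfs=[clf1, clf2], voting='soft', refit=False)
eclf.fit(X, y)             # with refit=False, clf1 and clf2 are not re-trained
print(eclf.predict(X[:3]))
```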
Best,
Sebastian
> On Jan 7, 2017, at 11:15 AM, Thomas Evangelidis wrote:
>
>
]))
However, it may be better to use stacking, and use the output of r.predict(X)
as meta features to train a model based on these?
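For illustration, a rough sketch of that stacking idea (stand-in data and base models of my own; out-of-fold predictions serve as the meta-features):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

X, y = np.random.rand(200, 10), np.random.rand(200)     # stand-in data
base_models = [MLPRegressor(max_iter=2000), SVR()]

# Column i holds the out-of-fold predictions of base model i.
meta_features = np.column_stack(
    [cross_val_predict(m, X, y, cv=5) for m in base_models])
meta_model = Ridge().fit(meta_features, y)               # second-level model
```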
Best,
Sebastian
> On Jan 7, 2017, at 1:49 PM, Thomas Evangelidis wrote:
>
> Hi Sebastian,
>
> Thanks, I will try it in another classification
between the MLPs and the meta-estimator. However, I'd
definitely also recommend simpler models as an
alternative.
Best,
Sebastian
> On Jan 7, 2017, at 4:36 PM, Thomas Evangelidis wrote:
>
>
>
>> On 7 January 2017 at 21:20, Sebastian Raschka wrote:
>> Hi, Thomas,
>
s
set a max-norm constraint on the weights in combination with dropout, e.g.,
"||w||_2 < constant", which worked even better than dropout alone (the constant
becomes another hyperparameter to tune, though).
Best,
Sebastian
> On Jan 9, 2017, at 1:21 PM, Jacob Schreiber wrote:
>
> Thomas, it
24, where they talk about alternative
(the more classic) representations of protein-ligand complexes or interactions
as inputs to either random forests or multi-layer perceptrons.
Best,
Sebastian
> On Jan 10, 2017, at 7:46 AM, Thomas Evangelidis wrote:
>
> Jacob,
>
> The featur
Hi guys,
I'm new to NIR measurement as well as chemometrics. My current project
involves the recognition of certain spectra (of a reference system)
among others.
The materials are currently not really set. So I try to add a
predetermined mixture of substances into another matrix and group t
://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html#sklearn.preprocessing.MultiLabelBinarizer).
Also, the RandomForestClassifier should support multilabel classification.
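A minimal sketch of the MultiLabelBinarizer part (toy labels of my own):

```python
from sklearn.preprocessing import MultiLabelBinarizer

labels = [('rock', 'pop'), ('jazz',), ('rock',)]
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)      # (3, 3) indicator matrix
print(mlb.classes_)
print(Y)
```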
Best,
Sebastian
> On Jan 21, 2017, at 12:59 PM, Carlton Banks wrote:
>
> Mo
Oh okay. But that shouldn’t be a problem, the RandomForestRegressor also
supports multi-output regression; same expected target array shape:
[n_samples, n_outputs]
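For example (a sketch with random stand-in data, just to show the expected shapes):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

X = np.random.rand(100, 8)                 # 100 samples, 8 features
y = np.random.rand(100, 3)                 # 3 regression targets per sample
rf = RandomForestRegressor(n_estimators=100).fit(X, y)
print(rf.predict(X[:2]).shape)             # (2, 3): one prediction per target
```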
Best,
Sebastian
> On Jan 21, 2017, at 1:27 PM, Carlton Banks wrote:
>
> Not classifiication… but regression..
>
PM, Carlton Banks wrote:
>
> Thanks for the Info!..
> How do you set it up..
>
> There doesn't seem to be an example available for regression purposes..
>> Den 21. jan. 2017 kl. 19.32 skrev Sebastian Raschka :
>>
>> Oh okay. But that shouldn’t be a problem, the RandomFor
k_fold = StratifiedKFold(n_splits=5, shuffle=True, random_state=i)
gs = GridSearchCV(..., cv=k_fold)
...
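Spelled out as a loop (a sketch with toy data and a parameter grid of my own; `i` is the repeat index):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)
param_grid = {'C': [0.1, 1.0, 10.0]}

scores = []
for i in range(20):                                          # 20 repeats
    k_fold = StratifiedKFold(n_splits=5, shuffle=True, random_state=i)
    gs = GridSearchCV(clf, param_grid, cv=k_fold).fit(X, y)
    scores.append(gs.best_score_)
print(np.mean(scores), np.std(scores))
```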
Best,
Sebastian
> On Jan 26, 2017, at 5:39 PM, Raga Markely wrote:
>
> Hello,
>
> I was trying to do repeated Grid Search CV (20 repeats). I thought that each
> time I call GridSearchCV
u haven't touched before. I often use
the "training, validation, and testing" approach as well, though, especially
when working with large datasets and for early stopping on neural nets.
Best,
Sebastian
> On Jan 26, 2017, at 1:19 PM, Raga Markely wrote:
>
> Thank you, Guillaume.
>
do a McNemar test.
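A minimal sketch using the mlxtend implementation mentioned earlier in the archive (the label arrays below are toy stand-ins):

```python
import numpy as np
from mlxtend.evaluate import mcnemar, mcnemar_table

y_true   = np.array([0, 0, 1, 1, 0, 1, 1, 0, 1, 1])
y_model1 = np.array([0, 1, 1, 1, 0, 1, 0, 0, 1, 1])
y_model2 = np.array([0, 0, 1, 1, 0, 1, 1, 1, 1, 0])

tb = mcnemar_table(y_target=y_true, y_model1=y_model1, y_model2=y_model2)
chi2, p = mcnemar(ary=tb, corrected=True)
print(chi2, p)
```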
Best,
Sebastian
> On Jan 26, 2017, at 8:09 PM, Raga Markely wrote:
>
> Ahh.. nice.. I will use that.. thanks a lot, Sebastian!
>
> Best,
> Raga
>
> On Thu, Jan 26, 2017 at 6:34 PM, Sebastian Raschka
> wrote:
> Hi, Raga,
>
> I th
.) model selection based on best algo via k-fold on whole training set
3.) fit best algo w. best hyperparams (from 2.) to whole training set
4.) evaluate on test set
5.) fit classifier to whole dataset, done
Best,
Sebastian
> On Jan 27, 2017, at 10:23 AM, Raga Markely wrote:
>
> Sounds good,
.) model selection based on best algo via k-fold on whole training set
3.) fit best algo w. best hyperparams (from 2.) to whole training set
4.) evaluate on test set
5.) fit classifier to whole dataset, done
Best,
Sebastian
> On Jan 27, 2017, at 12:49 PM, Sebastian Raschka
> wrote:
>
&
Hm, which version of scikit-learn are you using? Are you running this on
sklearn 0.18?
Best,
Sebastian
> On Jan 30, 2017, at 2:48 PM, Raga Markely wrote:
>
> Hi Sebastian,
>
> Following up on the original question on repeated Grid Search CV, I tried to
> do repeated nes
Cool, glad to hear that it was such an easy fix :)
> On Jan 30, 2017, at 3:49 PM, Raga Markely wrote:
>
> Nice catch!! The sklearn was 0.18, but I used sklearn.grid_search instead of
> sklearn.model_selection.
>
> Error is gone now.
>
> Thank you, Sebastian!
> Rag
In my opinion, Slack can be quite useful for discussing things “live.” However,
one of the main problems I have with Slack — I am using it for some other
projects — is that it is easy to lose track of important discussions when one
is not constantly online and checking the timeline. In a
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
for name, gs_est in sorted(gridcvs.items()):
    nested_score = cross_val_score(gs_est,
                                   X=X_train,
                                   y=y_train,
                                   cv=outer_cv,
                                   n_jobs=1)
# 3rd round
list(my_gen)[2][1]  # stores an array of indices used as the test fold in the 3rd round
Hope that helps.
Best,
Sebastian
> The following did not work. This is what we get --> ValueError: too many
> values to unpack
> On Feb 27, 2017, at 5:13 PM, Ludovico Coletta wrote:
&g
Hi, Raga,
I have a short section on this here
(https://sebastianraschka.com/blog/2016/model-evaluation-selection-part2.html#the-bootstrap-method-and-empirical-confidence-intervals)
if it helps.
Best,
Sebastian
> On Mar 1, 2017, at 3:07 PM, Raga Markely wrote:
>
> Hi everyone,
>
mation rate for regression would be ...
> On Mar 1, 2017, at 5:39 PM, Raga Markely wrote:
>
> Thanks a lot, Sebastian! Very nicely written.
>
> I have a few follow-up questions:
> 1. Just to make sure I understand correctly, using the .632+ bootstrap
> method, the ACC_l
:07 PM, Raga Markely wrote:
>
> No worries, Sebastian :) .. thank you very much for your help.. I learned a
> lot of new things from your site today.. it led me to some relevant chapters
> in "The Elements of Statistical Learning", which then led me to chapter 8
> p
from sklearn.model_selection import cross_val_score
cross_val_score(estimator=lda, X=X, y=y, cv=loo)
Best,
Sebastian
> On Mar 7, 2017, at 10:01 AM, Serafeim Loukas wrote:
>
> Dear Mahesh,
>
> Thank you for your response.
>
> I read the documentation however I did not fin
Hi, Stuart,
I think the only way to do that right now would be through the SGD classifier,
e.g.,
sklearn.linear_model.SGDClassifier(loss='log', penalty='elasticnet' …)
Best,
Sebastian
> On Mar 13, 2017, at 12:57 PM, Stuart Reynolds
> wrote:
>
> Is the
completely for now. And when you run the LogisticRegression, maybe run it
multiple times with different random seeds to see if your solutions are
generally stable.
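As a rough sketch of that check (stand-in data; the solver choice and the coefficient-spread comparison are my own illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
coefs = []
for seed in range(5):
    lr = LogisticRegression(solver='sag', max_iter=10000, random_state=seed)
    coefs.append(lr.fit(X, y).coef_.ravel())
print(np.std(coefs, axis=0))    # small spread -> solution is stable across seeds
```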
Best,
Sebastian
> On Mar 13, 2017, at 1:06 PM, Stuart Reynolds
> wrote:
>
> Both libraries are heavily parameterized. You
ive.
Best,
Sebastian
> On Mar 16, 2017, at 12:00 AM, Carlton Banks wrote:
>
> Hi…
>
> I currently trying to optimize my CNN model using gridsearchCV, but seem to
> have some problems feeding my input data..
>
> My training data is stored as a list of Np.ndarr
gb ram..
>
>> Den 16. mar. 2017 kl. 05.30 skrev Sebastian Raschka :
>>
>> Sklearn estimators typically assume 2d inputs (as numpy arrays) with
>> shape=[n_samples, n_features].
>>
>>> list of Np.ndarrays of shape (6,3,3)
>>
>> I assume you
a super computer, and seem to
>> have problems with memory.. already used 62 gb ram..
>>
>> > Den 16. mar. 2017 kl. 05.30 skrev Sebastian Raschka :
>> >
>> > Sklearn estimators typically assume 2d inputs (as numpy arrays) with
>> > shape=[n_samples,
;
> I changed it to -48?.. and it seem to be running..
>> Den 16. mar. 2017 kl. 06.06 skrev Sebastian Raschka :
>>
>> the “-1” means that it will run on all processors that are available
>>
>>> On Mar 16, 2017, at 1:01 AM, Carlton Banks wrote:
>>>
>