[scikit-learn] Re: when fold is not an integer, how does refit work?

2025-09-19 Thread Guillaume Lemaître via scikit-learn
> My question is, when refit=True, what does it use to train the final model? X? As the documentation says: "Refit an estimator using the best found parameters on the whole dataset." So it refits on the entire `X` using the best parameters found during the cross-validation procedure with respec
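
A minimal sketch of the refit behaviour described above. The estimator (`Ridge`), the parameter grid, and the synthetic data are illustrative assumptions, not taken from the thread. It checks that `best_estimator_` matches a model fitted by hand on the whole `X` with `best_params_`.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data and grid are assumptions made for illustration only.
X, y = make_regression(n_samples=120, n_features=5, noise=1.0, random_state=0)

search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=3, refit=True)
search.fit(X, y)

# Refit by hand on the entire X with the best parameters found by CV.
manual = clone(search.estimator).set_params(**search.best_params_).fit(X, y)

# The refitted best_estimator_ should carry the same coefficients.
same_model = np.allclose(manual.coef_, search.best_estimator_.coef_)
```

The cross-validation splits are only used to pick `best_params_`; the final model sees every row of `X`.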

[scikit-learn] Re: [ANN] scikit-learn 1.7.0 release

2025-06-06 Thread Guillaume Lemaître via scikit-learn
Thanks Jeremie for taking care of this release and to all contributors for making it happen. On Fri, 6 Jun 2025 at 10:45, Jérémie du Boisberranger < jeremie.du-boisberran...@inria.fr> wrote: > Hi everyone, > > We're happy to announce the 1.7.0 release which you can install via pip > or conda: > >

[scikit-learn] (Many) new contributors in the scikit-learn teams 🎉🎉🎉

2024-12-24 Thread Guillaume Lemaître via scikit-learn
Dear all, We are super excited to announce new contributors to different scikit-learn teams: - Maren Westermann (https://github.com/marenwestermann) is now part of the Documentation Team in addition to her role in the Contributor Experience Team. - Stefanie Senger (https://github.com/StefanieSe

Re: [scikit-learn] [ANN] scikit-learn 1.6.0 release

2024-12-09 Thread Guillaume Lemaître via scikit-learn
Thanks Jérémie for taking care of this release. -- Guillaume Lemaitre Open source engineer at :probabl. > On Dec 9, 2024, at 7:34 PM, Jérémie du Boisberranger > wrote: > > Hi everyone, > > We're happy to announce the 1.6.0 release which you can install via pip or > conda: > > pip install

[scikit-learn] [ANN] New core maintainer: Lucy Liu

2024-10-14 Thread Guillaume Lemaître via scikit-learn
We are excited to welcome Lucy Liu as a core maintainer of the scikit-learn project. Lucy was already a part of the documentation and contributor experience teams. Lucy will continue some work as part of the Chan Zuckerberg Initiative's Essential Open Source Software

[scikit-learn] [ANN] scikit-learn 1.5.2 is online!

2024-09-11 Thread Guillaume Lemaître
Hello everyone, We're happy to announce the 1.5.2 release ! It contains fixes for a few regressions introduced in 1.5. You can see the changelog here: https://scikit-learn.org/stable/whats_new/v1.5.html#version-1-5-2 You can upgrade with pip as usual: pip install -U scikit-learn The conda-forg

Re: [scikit-learn] [ANN] scikit-learn 1.5.1 is online!

2024-07-03 Thread Guillaume Lemaître
Thanks Jérémie for this one. On Wed, 3 Jul 2024 at 11:26, Jérémie du Boisberranger < jeremie.du-boisberran...@inria.fr> wrote: > Hello everyone, > > We're happy to announce the 1.5.1 release ! > > > It contains fixes for a few regressions introduced in 1.5. > > You can see the changelog here: > h

Re: [scikit-learn] round robin triage

2024-02-27 Thread Guillaume Lemaître
LGTM on my side. On Tue, 27 Feb 2024 at 17:50, Adrin wrote: > Hi, > > In the last scikit-learn monthly meeting we talked about doing round robin > triage, with Tim, Loïc, Jeremie, Guillaume, Olivier, and myself. To get the > ball rolling, I'm suggesting this random order, starting next week: > >

[scikit-learn] [ANN] New core contributor: Yao Xiao

2024-02-18 Thread Guillaume Lemaître
We are excited to welcome Yao Xiao (https://github.com/Charlie-XIAO) as a core contributor of the scikit-learn project. Your past contributions are greatly appreciated, and I'm looking forward to working further with you. On behalf of the scikit-learn team. -- Guillaume Lemaitre Open source engi

[scikit-learn] [ANN] scikit-learn 1.4.1.post1 is online!

2024-02-16 Thread Guillaume Lemaître
scikit-learn 1.4.1.post1 is out on pypi.org and conda-forge! This is a maintenance release that fixes several regressions introduced in version 1.4.0 https://scikit-learn.org/stable/whats_new/v1.4.html#version-1-4-1-post1 You can

[scikit-learn] scikit-learn 1.3.2 is online!

2023-10-25 Thread Guillaume Lemaître
scikit-learn 1.3.2 is out on pypi.org and conda-forge! This is a maintenance release that fixes several regressions introduced in version 1.3 https://scikit-learn.org/stabl

[scikit-learn] [ANN] scikit-learn 1.3.1 is online!

2023-09-21 Thread Guillaume Lemaître
scikit-learn 1.3.1 is out on pypi.org and conda-forge! This is a maintenance release that fixes several regressions introduced in version 1.3 https://scikit-learn.org/

[scikit-learn] ANN: imbalanced-learn 0.11.0 released

2023-07-08 Thread Guillaume Lemaître
Hi all, We are happy to announce the 0.11.0 version of imbalanced-learn. imbalanced-learn is a toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced data sets frequently encountered in machine learning and data mining. This release makes sure to be fully compa

Re: [scikit-learn] [ANN] scikit-learn 1.3.0 release

2023-06-30 Thread Guillaume Lemaître
Thank you Jeremie for taking care of this release. On Fri, 30 Jun 2023 at 10:13, Jeremie du Boisberranger < jeremie.du-boisberran...@inria.fr> wrote: > Hi everyone, > > We're happy to announce the 1.3.0 release which you can install via pip > or conda: > > pip install -U scikit-learn > >

Re: [scikit-learn] [ANN] scikit-learn 1.3.0rc1 is online!

2023-06-15 Thread Guillaume Lemaître
Thank you Jeremie for taking care of this release. On Thu, 15 Jun 2023 at 21:33, Jeremie du Boisberranger < jeremie.du-boisberran...@inria.fr> wrote: > Hi everyone, > > Please help us test the first release candidate for scikit-learn 1.3.0: > > pip install scikit-learn==1.3.0rc1 > > Changelog

Re: [scikit-learn] version warning - do I have to fix the minor version when unpickling?

2023-06-15 Thread Guillaume Lemaître
ith it, you should be in an environment that has the same version as the pickle to be safe. > > Cheers, Martin > > > Am 14.06.2023 18:24 schrieb Guillaume Lemaître: > > Hi Martin, > > > > The public API is stable but the internal can change which can affect > > the

Re: [scikit-learn] version warning - do I have to fix the minor version when unpickling?

2023-06-14 Thread Guillaume Lemaître
Hi Martin, The public API is stable but the internals can change, which can affect the pickle. For instance, an unpickled estimator can end up calling a private function that no longer exists. Since we don’t guarantee any support in this regard, a warning is raised even between minor or p
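
One way to act on this advice, sketched under assumptions: store the scikit-learn version next to the pickled bytes and compare it before unpickling. The `record` layout and the error message are hypothetical conventions for this example, not a scikit-learn API.

```python
import pickle

import sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=50, random_state=0)
model = LogisticRegression().fit(X, y)

# Hypothetical record layout: the version is kept outside the model bytes
# so it can be checked without unpickling the estimator itself.
record = {
    "sklearn_version": sklearn.__version__,
    "model_bytes": pickle.dumps(model),
}

# At load time, refuse to unpickle in an environment with another version.
if record["sklearn_version"] != sklearn.__version__:
    raise RuntimeError(
        "pickle was created with scikit-learn "
        + record["sklearn_version"]
        + ", current version is "
        + sklearn.__version__
    )
loaded_model = pickle.loads(record["model_bytes"])
```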

Re: [scikit-learn] CFP: GitHub copilot for PRs

2023-03-25 Thread Guillaume Lemaître
I assume that we need to check which features could be used. For instance, providing automatic descriptions in PRs could be something that I kind of like. Proposing non-regression tests for newcomers who have never written any could also be useful. In the end, we will always add manual reviews before

Re: [scikit-learn] classification model that can handle missing values w/o learning from missing values

2023-03-10 Thread Guillaume Lemaître
Hi Martin, I think that you could use `imbalanced-learn` and a bit of Pandas/NumPy to get the behaviour that you want. You can use a `FunctionSampler` ( https://imbalanced-learn.org/stable/references/generated/imblearn.FunctionSampler.html) in which you remove the samples containing missing values.

Re: [scikit-learn] [ANN] scikit-learn 1.2.2 is online!

2023-03-09 Thread Guillaume Lemaître
Thanks for taking care of this release Jeremie. Cheers, On Thu, 9 Mar 2023 at 11:17, Jeremie du Boisberranger < jeremie.du-boisberran...@inria.fr> wrote: > scikit-learn 1.2.2 is out on pypi.org and conda-forge! > This is a maintenance release that fixes several regressions introduced in > versio

Re: [scikit-learn] obtaining intervals from the decision tree struture

2023-03-07 Thread Guillaume Lemaître
Hi Sole, You can use `apply` on the training `X` to get the leaf that each sample falls into. Then a groupby should allow you to get the statistic that you want. Cheers, -- Guillaume Lemaitre Scikit-learn @ Inria Foundation https://glemaitre.github.io/ > On 7 Mar 2023, at 15:53, Sole Galli vi
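
The suggestion above can be sketched as follows. The single-feature regression data, the tree depth, and the column names are assumptions for illustration; the thread does not show the original code.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# One feature so that each leaf corresponds to a simple interval on x.
X, y = make_regression(n_samples=200, n_features=1, noise=5.0, random_state=0)
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# apply() returns, for every sample, the index of the leaf it ends up in.
leaf_ids = tree.apply(X)

frame = pd.DataFrame({"leaf": leaf_ids, "x": X[:, 0], "y": y})

# Group by leaf to recover the observed interval and a statistic per leaf.
intervals = frame.groupby("leaf").agg(
    x_min=("x", "min"),
    x_max=("x", "max"),
    y_mean=("y", "mean"),
)
```

Note that `x_min` and `x_max` are the bounds observed in the training data for each leaf, not the split thresholds stored in the tree.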

[scikit-learn] ANN: imbalanced-learn 0.10.0 released

2022-12-09 Thread Guillaume Lemaître
Hi all, We are happy to announce the 0.10.0 version of imbalanced-learn. imbalanced-learn is a toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced data sets frequently encountered in machine learning and data mining. This release makes sure to be fully compa

[scikit-learn] ANN: New member in the Contributor Experience Team

2022-11-17 Thread Guillaume Lemaître
We are excited to welcome a new member to the Contributor Experience Team: - Tim Head: https://github.com/betatim Looking forward to furthering interactions within the scikit-learn community. On behalf of the scikit-learn team. -- Guillaume Lemaitre Scikit-learn @ Inria Foundation https:

[scikit-learn] [ANN] scikit-learn 1.1.3 is online!

2022-10-29 Thread Guillaume Lemaître
scikit-learn 1.1.3 is out on pypi.org and conda-forge! This bugfix release only includes fixes for compatibility with the latest SciPy release >= 1.9.2 and wheels for Python 3.11. Note that support for 32-bit Python on Windows has been dropped in this release. This is due to the fact that SciPy 1.

[scikit-learn] ANN: New members in the Contributor Experience Team

2022-08-25 Thread Guillaume Lemaître
We are excited to welcome new members to the Contributor Experience Team: - Meekail Zain: https://github.com/Micky774 - Maxwell Liu: https://github.com/MaxwellLZH Both Meekail and Maxwell made tremendous contributions to scikit-learn. Looking forward to furthering interactions within the sci

[scikit-learn] [ANN] scikit-learn 1.1.2 is online!

2022-08-05 Thread Guillaume Lemaître
scikit-learn 1.1.2 is out on pypi.org and conda-forge! This is a small maintenance release that fixes a couple of regressions: https://scikit-learn.org/dev/whats_new/v1.1.html#version-1-1-2 You can upgrade with pip as usual: pip install -U scikit-learn The conda-forge builds will be available s

Re: [scikit-learn] question regarding 'RANSACRegressor' object has no attribute 'inlier_mask_'

2022-07-29 Thread Guillaume Lemaître
You need to fit the estimator to access the fitted attribute: In [1]: from sklearn.linear_model import RANSACRegressor ...: from sklearn.datasets import make_regression ...: X, y = make_regression( ...: n_samples=200, n_features=2, noise=4.0, random_state=0) ...: reg = RANSACRegres
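
The snippet above is cut off, so here is a separate, complete sketch of the same point. It reuses the `make_regression` call from the message; `random_state=0` on the regressor is an assumption added for reproducibility.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RANSACRegressor

X, y = make_regression(n_samples=200, n_features=2, noise=4.0, random_state=0)

reg = RANSACRegressor(random_state=0)

# Fitted attributes (trailing underscore) do not exist before fit().
has_mask_before_fit = hasattr(reg, "inlier_mask_")

reg.fit(X, y)

# After fitting, inlier_mask_ is a boolean array with one entry per sample.
inlier_mask = reg.inlier_mask_
```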

Re: [scikit-learn] New contributer

2022-06-23 Thread Guillaume Lemaître
You can start by reading the contributing guide: https://scikit-learn.org/dev/developers/contributing.html We also have additional materials on YouTube to get started contributing: https://www.youtube.com/watch?v=5OL8XoMMOfA&list=PLM-1Q

[scikit-learn] Announcement of EuroSciPy 2022

2022-05-25 Thread Guillaume Lemaître
Dear all, The following conference could be of interest. Please find the CfP announcement of EuroSciPy 2022 below. --- We are happy to announce that EuroSciPy, the 14th European Conference on Python in Science, will be back as an in-person event. EuroSciPy 2022 will take place from Monday, August

[scikit-learn] [ANN] scikit-learn 1.1.1 is online!

2022-05-19 Thread Guillaume Lemaître
scikit-learn 1.1.1 is out on pypi.org and conda-forge! This is a small maintenance release that fixes a couple of regressions: https://scikit-learn.org/dev/whats_new/v1.1.html#version-1-1-1 Notably, if you are using tree-based models (i.e. decision tree, random forest, gradient boosting), we corr

Re: [scikit-learn] Experience with black formatting in scikit-learn for astropy

2022-05-19 Thread Guillaume Lemaître
I have answered inline in the text below. These are my 2c. Hope this helps. Cheers, On Wed, 18 May 2022 at 22:08, Tom Aldcroft wrote: > Hi - > > The astropy core is currently considering implementing code formatting > with black, much as scikit-learn did in 2020 ( > https://github.com/scikit-learn/scikit-

[scikit-learn] ANN: imbalanced-learn 0.9.1 released

2022-05-16 Thread Guillaume Lemaître
Hi all, We are happy to announce the 0.9.1 version of imbalanced-learn. imbalanced-learn is a toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced data sets frequently encountered in machine learning and data mining. This release makes sure to be fully comp

Re: [scikit-learn] willing to attend to the sprint on March 12

2022-02-15 Thread Guillaume Lemaître
The primary organiser is WiMLDS Paris. All information is available there: https://www.meetup.com/fr-FR/Paris-Women-in-Machine-Learning-Data-Science/events/283918976/ Cheers, -- Guillaume Lemaitre Scikit-learn @ Inria Foundation https://glemaitre.github.io/ > On 15 Feb 2022, at 17:05, Şeyma Ba

Re: [scikit-learn] Looking for sklearn.neighbors.kde

2022-01-24 Thread Guillaume Lemaître
Looking at the documentation: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KernelDensity.html The public import is: `from sklearn.neighbors import KernelDensity` omitting the kde. C
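
A short usage sketch of the public import mentioned above. The sample data, kernel, and bandwidth are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.neighbors import KernelDensity  # public import path

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 1))  # illustrative 1-D sample

kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

# score_samples returns the log-density at each query point.
log_density = kde.score_samples(np.array([[0.0], [3.0]]))
```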

[scikit-learn] [ANN] scikit-learn 1.0.2 is online!

2021-12-25 Thread Guillaume Lemaître
scikit-learn 1.0.2 is out on pypi.org and conda-forge! This is a small maintenance release that fixes a couple of regressions. Binaries and wheels are available for Python 3.10. https://scikit-learn.org/dev/whats_new/v1.0.html#version-1-0-2 You can upgrade with pip as usual: pip install -U scik

[scikit-learn] scikit-learn office hours on Monday Dec. 20, 2021

2021-12-17 Thread Guillaume Lemaître
Hi all, Some of us will be online on the scikit-learn discord next Monday at 10:00 PT / 13:00 ET / 18:00 UTC / 19:00 CET for about an hour or so. First time and occasional contributors are welcome to join us to discord using this invitation link: https://discord.gg/N8dGHPpq

[scikit-learn] scikit-learn office hours on Monday Dec. 6 2021

2021-12-01 Thread Guillaume Lemaître
Hi all, Some of us will be online on the scikit-learn discord next Monday at 10:00 PT / 13:00 ET / 18:00 UTC / 19:00 CET for about an hour or so. First time and occasional contributors are welcome to join us to discord using this invitation link: https://discord.gg/YyYRXMju

Re: [scikit-learn] (no subject)

2021-11-10 Thread Guillaume Lemaître
You can refer to https://scikit-learn.org/stable/about.html#citing-scikit-learn depending what is the scope of your research paper. -- Guillaume Lemaitre Scikit-learn @ Inria Foundation https://glemaitre.github.io/ > On 10 Nov 2021

Re: [scikit-learn] scikit-learn office hours on Monday Nov. 8 2021

2021-11-07 Thread Guillaume Lemaître
Dear all, Please find a new discord invite since the previous invitation expired: https://discord.gg/84atnsdjTa Cheers, -- Guillaume Lemaitre Scikit-learn @ Inria Foundation https://glemaitre.github.io/ > On 7 Nov 2021, at 14:35, Guillaume Lemaît

[scikit-learn] scikit-learn office hours on Monday Nov. 8 2021

2021-11-07 Thread Guillaume Lemaître
Hi all, Some of us will be online on the scikit-learn discord this Monday at 11:00 ET / 16:00 UTC / 17:00 CET for about an hour or so. First time and occasional contributors are welcome to join us to discord using this invitation link: https://discord.gg/YBdN45kD Th

[scikit-learn] New core dev: Julien Jerphanion

2021-10-30 Thread Guillaume Lemaître
The scikit-learn core development team has welcomed a new member, Julien Jerphanion, who has contributed code, reviews, and documentation since this March (aside from occasional contributions in the past). Congratulations and welcome Julien! On behalf of the scikit-learn team -- Guillaume Le

[scikit-learn] [ANN] scikit-learn 1.0.1 is online!

2021-10-25 Thread Guillaume Lemaître
scikit-learn 1.0.1 is out on pypi.org and conda-forge! This is a small maintenance release that fixes a couple of regressions: https://scikit-learn.org/dev/whats_new/v1.0.html#version-1-0-1 You can upgrade with pip as usual:

Re: [scikit-learn] scikit-learn office hours on Friday Oct. 8 2021

2021-10-08 Thread Guillaume Lemaître
I see that Olivier made a small mistake. I will hold the office hours from 18:00 to 19:00 UTC. So there is no office hour from 19:00 to 20:00 UTC. Cheers, -- Guillaume Lemaitre Scikit-learn @ Inria Foundation https://glemaitre.github.io/ > On 6 Oct 2021, at 16:42, Olivier Grisel wrote: > > Hi

Re: [scikit-learn] scikit-learn for Apple Silicon M1 Macs

2021-08-02 Thread Guillaume Lemaître
There is currently no wheel available on PyPI because NumPy and SciPy do not provide wheels either: https://github.com/scikit-learn/scikit-learn/issues/19137 However, one can use `miniforge` or `mambaforge` to install binaries witho

Re: [scikit-learn] scikit-learn monthly meeting: Monday 26 July 2021 - 8 pm UTC

2021-07-28 Thread Guillaume Lemaître
Dear all, Please find the notes of our monthly meeting: https://github.com/scikit-learn/administrative/blob/master/meeting_notes/2021-07-26.md Cheers, On Sun, 25 Jul 2021 at 14:27, Guillaume Lemaître wrote: > Please find the correct local times: > > https://www.timeanddate.com/w

Re: [scikit-learn] random forests and multil-class probability

2021-07-27 Thread Guillaume Lemaître
ulti-output-problems> >> It's not a one-vs-rest strategy and can be summed up as: >> >> >>> Store n output values in leaves, instead of 1; >>> >>> Use splitting criteria that compute the average reduction across all n >>> outputs. &
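
To make the point about leaves concrete, here is a small sketch on assumed synthetic three-class data. It shows that a single forest handles all classes: the number of trees equals `n_estimators`, and each tree node stores one value per class.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic 3-class problem; all sizes are illustrative assumptions.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)

forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# No one-vs-rest: the forest holds n_estimators trees, not 3 times as many.
n_trees = len(forest.estimators_)

# Each node of a tree stores one value per class (last axis of tree_.value).
values_per_node = forest.estimators_[0].tree_.value.shape[-1]

# predict_proba yields one column per class, and each row sums to one.
proba = forest.predict_proba(X)
rows_sum_to_one = bool(abs(proba.sum(axis=1) - 1.0).max() < 1e-9)
```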

Re: [scikit-learn] random forests and multil-class probability

2021-07-27 Thread Guillaume Lemaître
> On 27 Jul 2021, at 11:08, Sole Galli via scikit-learn > wrote: > > Hello community, > > Do I understand correctly that Random Forests are trained as a 1 vs rest when > the target has more than 2 classes? Say the target takes values 0, 1 and 2, > then the model would train 3 estimators 1 p

Re: [scikit-learn] scikit-learn monthly meeting: Monday 26 July 2021 - 8 pm UTC

2021-07-25 Thread Guillaume Lemaître
Please find the correct local times: https://www.timeanddate.com/worldclock/meetingdetails.html?year=2021&month=7&day=26&hour=20&min=0&sec=0&p1=1440&p2=240&p3=248&p4=195&p5=179&p6=224 On Sun, 25 Jul 2021 at 13:22, Guillaume Lemaître wrote: > D

[scikit-learn] scikit-learn monthly meeting: Monday 26 July 2021 - 8 pm UTC

2021-07-25 Thread Guillaume Lemaître
Dear all, The scikit-learn developer monthly meeting will take place on Monday July 26th at 8 pm UTC - Video call link: https://meet.google.com/qbg-ucpe-ngz - Meeting notes / agenda: https://hackmd.io/0yokz72CTZSny8y3Re648Q - Local times: https://www.timeanddate.com/worldclock/meetingdetails.html

[scikit-learn] New member of the triage team: Norbert

2021-06-17 Thread Guillaume Lemaître
We are excited to welcome a new member of the triage team: * Norbert Preining https://github.com/norbusan The thorough work of the triage team on helping the scikit-learn community by triaging issues and PRs, organizing sprints, responding to discussions, is extremely valuable and helpful in the

[scikit-learn] New member of the triage team: Julien

2021-06-10 Thread Guillaume Lemaître
We are excited to welcome a new member of the triage team: * Julien Jerphanion https://github.com/jjerphan The thorough work of the triage team on helping the community is much appreciated. Cheers, -- Guillaume Lemaitre Scikit-learn @ Inria Foundation https://glemaitre.github.io/ __

Re: [scikit-learn] check_estimator _NotAnArray

2021-05-12 Thread Guillaume Lemaître
A scikit-learn estimator should validate X and y. This validation converts the input X and y into NumPy arrays to carry out the numerical operations of the estimator. This check makes sure that passing an array-like (but not a NumPy array) works the same as passing an array. It basically ensures that t
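
A small sketch of the kind of validation described above, using the public helper `check_X_y` on plain Python lists. Whether a given estimator calls this exact helper is not stated in the message; the example only shows the array-like to NumPy conversion.

```python
import numpy as np
from sklearn.utils import check_X_y

# Array-likes that are not NumPy arrays (plain nested lists here).
X_list = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
y_list = [0, 1, 0]

# Validation converts both inputs to NumPy arrays and checks their shapes.
X_checked, y_checked = check_X_y(X_list, y_list)
```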

Re: [scikit-learn] Issue regarding Feature Union

2021-05-06 Thread Guillaume Lemaître
You can get the pipeline (with optimized hyperparameters) using grid_search.best_estimator_. Applying the code of Chris on this estimator will work. On Thu, 6 May 2021 at 13:45, mitali katoch wrote: > Hi Chris, > I forgot to mention that this pipeline I have used within the GridSearchCV. > I hav
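
A sketch of retrieving the fitted pipeline from a grid search. The pipeline layout, step names (`features`, `pca`, `kbest`, `clf`), and parameter grid are assumptions; neither the original pipeline nor Chris's code is shown in this snippet.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline

X, y = make_classification(n_samples=150, n_features=8, random_state=0)

# Hypothetical pipeline with a FeatureUnion, for illustration only.
union = FeatureUnion([("pca", PCA(n_components=2)), ("kbest", SelectKBest(k=2))])
pipe = Pipeline([("features", union), ("clf", LogisticRegression(max_iter=1000))])

grid_search = GridSearchCV(pipe, {"features__kbest__k": [1, 2, 3]}, cv=3)
grid_search.fit(X, y)

# best_estimator_ is the whole pipeline, refitted with the best parameters.
best_pipe = grid_search.best_estimator_
fitted_union = best_pipe.named_steps["features"]

# The union concatenates the PCA components and the selected features.
n_combined = fitted_union.transform(X).shape[1]
```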

[scikit-learn] [ANN] scikit-learn 0.24.2 is online!

2021-04-28 Thread Guillaume Lemaître
scikit-learn 0.24.2 is out on pypi.org and conda-forge! This is a small maintenance release that fixes a couple of regressions: https://scikit-learn.org/stable/whats_new/v0.24.html#version-0-24-2 You can upgrade with pip as usual: pip install -U scikit-learn The conda-forge builds will be avai

Re: [scikit-learn] GridSearchCV not working properly

2021-04-16 Thread Guillaume Lemaître
Hi Mujeebur, Off the top of my head, I don't recall any drastic changes in GradientBoostingClassifier and GridSearchCV in the last version. We have started to continuously benchmark for performance regressions and we did not notice anything, but it is possible that we missed it (or we do not test with the sor

Re: [scikit-learn] ANN: scikit-learn-extra 0.2.0 released

2021-04-15 Thread Guillaume Lemaître
Cool work guys. Thanks to the team that takes care about these extensions. On Wed, 14 Apr 2021 at 22:54, Timothee Mathieu < timothee.math...@universite-paris-saclay.fr> wrote: > Hello, > > We're happy to announce the 0.2.0 version of scikit-learn-extra. > scikit-learn-extra is a Python module for

Re: [scikit-learn] Crypto project to fund open source

2021-02-08 Thread Guillaume Lemaître
On Mon, 8 Feb 2021 at 10:20, Adrin wrote: > I had a chat with Guillaume and he raised the concern of energy > consumption on blockchain platforms. > > I had a little look, and realized this platform runs on Etherium, which > has a significantly lower energy footprint > than bitcoin, but still a r

Re: [scikit-learn] LassoCV.coef not implemented (I think)

2021-01-31 Thread Guillaume Lemaître
On Sun, 31 Jan 2021 at 21:37, Guillaume Lemaître wrote: > > > On Sun, 31 Jan 2021 at 21:24, Robert Slater wrote: > >> Appreciate the clarification. I definitely think the docs need some >> polish as coef_ only returns a single fitting of coefficients and not the >&g

Re: [scikit-learn] LassoCV.coef not implemented (I think)

2021-01-31 Thread Guillaume Lemaître
kit-learn repository because the documentation is actually the docstring from the classes and functions. The user guide documentation is located in the /doc folder and the contributing guide will be helpful to start with: https://scikit-learn.org/stable/developers/contributing.html > > On

Re: [scikit-learn] LassoCV.coef not implemented (I think)

2021-01-31 Thread Guillaume Lemaître
Hi Robert, > I do have a .coef_ variable which I believe is the coefficient for the best fit only. `coef` never existed. Fitted attributes always end with underscore. We do not store coefficients for all fitted `alphas_`. We provide some information regarding the MSE path for all tried alphas: ht
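
A brief sketch of the attributes mentioned above, on assumed synthetic data. It shows that `coef_` holds a single coefficient vector (for the selected `alpha_`), while `mse_path_` holds the cross-validated error for every tried alpha and fold.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Illustrative data; sizes and noise level are assumptions.
X, y = make_regression(n_samples=100, n_features=5, noise=2.0, random_state=0)

lasso = LassoCV(cv=5, random_state=0).fit(X, y)

coef_shape = lasso.coef_.shape          # one coefficient per feature
path_shape = lasso.mse_path_.shape      # (number of alphas, number of folds)
n_alphas_tried = len(lasso.alphas_)
has_plain_coef = hasattr(lasso, "coef")  # there is no attribute named coef
```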

Re: [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'

2021-01-22 Thread Guillaume Lemaître
2.3 MB) > > However, using the API sc.install_pypi_package("scikit-learn") still uses > the tar file instead of the whl file (even after the pip upgrade). > > Collecting scikit-learn > Using cached > https://files.pythonhosted.org/packages/f4/7b/d415b0c89babf23dcd8ee631015f043e2d76795edd9c7359d6e632

Re: [scikit-learn] Finding the PC that captures a specific variable

2021-01-22 Thread Guillaume Lemaître
I do not really understand the question, sorry. Are you looking for the `explained_variance_ratio_` attribute that gives you a relative value of the eigenvalues associated with the eigenvectors? On Fri, 22 Jan 2021 at 10:16, Mahmood Naderan wrote: > Hi > I have a question about PCA and that is,
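
For reference, a small sketch of the attribute in question. The data are random and scaled per column purely for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Three features with deliberately different spreads (assumed for the demo).
X = rng.normal(size=(200, 3)) * np.array([5.0, 2.0, 0.5])

pca = PCA().fit(X)

# Fraction of the total variance carried by each principal component,
# i.e. each eigenvalue divided by the sum of all eigenvalues.
ratios = pca.explained_variance_ratio_
```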

Re: [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'

2021-01-22 Thread Guillaume Lemaître
@Bertrand Could you tell us which version of `pip` you use (you need pip >= 19.0 for manylinux2010 and pip >= 19.3 for manylinux2014)? On Fri, 22 Jan 2021 at 09:49, Guillaume Lemaître wrote: > We might experience an issue with PyPI not selecting the manylinux2010 > wheel: https:

Re: [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'

2021-01-22 Thread Guillaume Lemaître
he right questions!" > (Robert Helmbold, 2013) > > > On Wednesday, January 20, 2021, 04:16:15 PM MST, Guillaume Lemaître < > g.lemaitr...@gmail.com> wrote: > > > Basically it get the tar with the source and recompile instead of using > the wheel. Could you force

Re: [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'

2021-01-20 Thread Guillaume Lemaître
Basically it gets the tar with the source and recompiles instead of using the wheel. Could you force an install from PyPI without using the cached file? We pushed wheels yesterday for 0.24.1 as well so it should not get the 0.24.0 version. For 0.23.2, you can see that it used the wheel (.whl).

[scikit-learn] [ANN] scikit-learn 0.24.1 is online!

2021-01-19 Thread Guillaume Lemaître
scikit-learn 0.24.1 is out on pypi.org and conda-forge! This is a small maintenance release that fixes the macOS wheels and small bugs in SelfTrainingClassifier and adjusted_mutual_info_score: https://scikit-learn.org/stable/whats_new/v0.24.html#version-0-24-1 You can upgrade with pip as usual:

Re: [scikit-learn] 2 million samples dataset caused python and OS crash

2021-01-06 Thread Guillaume Lemaître
And it seems that the piece of traceback refers to NumPy. On Wed, 6 Jan 2021 at 12:48, Andrew Howe wrote: > A core dump generally happens when a process tries to access memory > outside it's allocated address space. You've not specified what estimator > you were using, but I'd guess it attempted

Re: [scikit-learn] Comparing Scikit and Xlstat for PCA analysis

2021-01-05 Thread Guillaume Lemaître
anks for the reply. May I know if I can choose different solvers in the > scikit package or not. > > Regards, > Mahmood > > > > > On Mon, Dec 28, 2020 at 4:30 PM Guillaume Lemaître > wrote: > >> n_components set to 'auto' is a strategy that will

Re: [scikit-learn] Comparing Scikit and Xlstat for PCA analysis

2020-12-28 Thread Guillaume Lemaître
n_components set to 'auto' is a strategy that will pick the number of components. The sign of a PC does not matter much since the components are still orthogonal. So the differences will depend on the solver, which is probably different in the two programs. Sent from my phone - sorry to be brief and potential

[scikit-learn] ANN scikit-learn 0.24.0 release

2020-12-22 Thread Guillaume Lemaître
We're happy to announce the 0.24.0 release, which is already out on PyPI and conda-forge. You can read the release highlights under https://scikit-learn.org/stable/auto_examples/release_highlights/plot_release_highlights_0_24_0.html and the long version of the change log under https://scikit-learn.org/s

Re: [scikit-learn] Changes in Travis billing

2020-12-03 Thread Guillaume Lemaître
ARM support On Thu, 3 Dec 2020 at 01:37, Andreas C. Mueller wrote: > Sorry I'm probably missing some detail but what does travis provide that > github actions and azure pipeline don't provide? > > -- Original Message -- > From: "Gael Varoquaux" > To: "Scikit-learn mailing list" > Sent:

Re: [scikit-learn] Creating dataset

2020-11-08 Thread Guillaume Lemaître
I would not recommend the solution of Alex. Do not modify the scikit-learn source code. Write it in your own Python module. But most probably the solution of Nicolas should be enough for 99% of the use-cases. Cheers, On Sun, 8 Nov 2020 at 12:41, Alex Levin wrote: > Hi Mahmood > You can add you

Re: [scikit-learn] Issue with Sklearn.Logistic Regression

2020-11-01 Thread Guillaume Lemaître
; (Robert Helmbold, 2013) > > > On Sunday, November 1, 2020, 02:58:46 PM MST, Guillaume Lemaître < > g.lemaitr...@gmail.com> wrote: > > > You forgot the parentheses to instantiate the object LogisticRegression > > On Sun, 1 Nov 2020 at 22:55, The Helmbolds via scikit-

Re: [scikit-learn] Issue with Sklearn.Logistic Regression

2020-11-01 Thread Guillaume Lemaître
You forgot the parentheses to instantiate the object LogisticRegression On Sun, 1 Nov 2020 at 22:55, The Helmbolds via scikit-learn < scikit-learn@python.org> wrote: > Here's my ynp and Xnp arrays: > >Print ynp > [0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 > 0
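
A minimal illustration of the difference, on assumed synthetic data since the original arrays are truncated above. Without parentheses the name refers to the class itself; with parentheses you get an instance that can be fitted.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=60, random_state=0)

# Missing parentheses: this is the class object, not an estimator instance.
not_instantiated = LogisticRegression

# With parentheses: an instance, which can be fitted as usual.
clf = LogisticRegression()
clf.fit(X, y)
```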

Re: [scikit-learn] Generating biased dataset using make_classification

2020-08-31 Thread Guillaume Lemaître
It depends on which type of biases you want to induce. I would think that the current function is pretty limited when it comes to introducing biases, though. On Sat, 29 Aug 2020 at 01:12, Saha, Prashanta wrote: > To generate a biased dataset using the make_classfication method which > parameter is needed to be

Re: [scikit-learn] Feature Request

2020-08-25 Thread Guillaume Lemaître
In scikit-learn, you have the agglomerative approach (bottom-up): https://scikit-learn.org/stable/modules/clustering.html#hierarchical-clustering On Tue, 25 Aug 2020 at 04:06, Jayanth B wrote: > Is there a class / module for Hierarchical Divisive Clustering in sci-kit > learn ? > -- > Jayanth Bo

Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners

2020-08-16 Thread Guillaume Lemaître
One needs to define what a weak learner is. In boosting, if I recall the literature correctly, a weak learner refers to a learner that underfits, performing only slightly better than a random learner. In this regard, a tree with shallow depth will be a weak learner and is used in adaboost or gradien
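
As a rough illustration of shallow trees as weak learners, the sketch below fits `AdaBoostClassifier` with its default base estimator and inspects the depth of the fitted trees. The data are synthetic and the accuracy values are computed only for comparison; none of this comes from the thread.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# A single depth-1 tree (a "stump") on its own.
stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X, y)
stump_accuracy = stump.score(X, y)

# AdaBoost with its default base estimator, left untouched on purpose.
ada = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
ada_accuracy = ada.score(X, y)

# Depths of the fitted base learners inside the ensemble.
base_depths = {est.get_depth() for est in ada.estimators_}
```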

[scikit-learn] ANN: scikit-learn 0.23.2 release

2020-08-04 Thread Guillaume Lemaître
We are happy to announce the 0.23.2 release which fixes a couple of issues. You can see the changelog here: https://scikit-learn.org/stable/whats_new/v0.23.html#version-0-23-2 You can check this version out via pip: pip install -U scikit-learn The conda-forge builds will be available shortl

Re: [scikit-learn] major league hacking summer internship program

2020-05-29 Thread Guillaume Lemaître
Hey, I can dedicate some time to review. Cheers, On Fri, 29 May 2020 at 11:43, Adrin wrote: > Thanks Andy, sounds pretty cool. > > I can commit some reviewing time. There should be maybe two of us at least > that they know they can ping, and we can ping others if needed. > > Cheers, > Adrin >

Re: [scikit-learn] [GridSearchCV] Reduction of elapsed time at the second interation

2020-05-27 Thread Guillaume Lemaître
Regarding scikit-learn, the only thing that we cache is the transformer processing in the pipeline (see the memory parameter in Pipeline). It seems that you are passing a different set of features at each iteration. Is the number of features different? On Sun, 29 Mar 2020 at 19:23, Pedro Cardoso
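
A sketch of the `memory` parameter mentioned above. The steps, data, and temporary cache directory are assumptions for illustration. With caching enabled, refitting with a changed final-step parameter can reuse the cached transformer fit.

```python
import tempfile

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=100, n_features=10, random_state=0)

with tempfile.TemporaryDirectory() as cache_dir:
    pipe = Pipeline(
        [("pca", PCA(n_components=3)), ("clf", LogisticRegression())],
        memory=cache_dir,  # fitted transformers are cached in this directory
    )
    pipe.fit(X, y)

    # Only the classifier changes; the PCA step is identical to the first fit
    # (same parameters, same data), so its cached result can be reused.
    pipe.set_params(clf__C=0.1).fit(X, y)
    accuracy = pipe.score(X, y)
```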

Re: [scikit-learn] Class weight SVC

2020-05-27 Thread Guillaume Lemaître
I don't think that we rescale the sample_weight and therefore the results should be different. On Fri, 24 Apr 2020 at 12:41, Francesco basciani wrote: > Hi, i have a question regarding the class weights in SVC. I have an > imbalanced binary classification problem. In my case the ratio between th

Re: [scikit-learn] Random Binning Features

2020-05-27 Thread Guillaume Lemaître
The algorithms in scikit-learn-extra are usually algorithms which did not meet the inclusion criteria (too early publication, not enough citations, etc.) However, the code quality is as good and as well tested as in scikit-learn (usually they were PRs in the main repository). Doing it this way allows us to

Re: [scikit-learn] Fwd: StackingClassifier

2020-05-05 Thread Guillaume Lemaître
Your analysis is correct: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/_stacking.py#L59 It will be the predictions of each learner, in the order of the given list, followed by the features that are passed through. It would be nice once we are able to propagate feature na
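
The column order described above can be checked with a small sketch. The base estimators and data are assumptions; the check only relies on `passthrough=True` and `transform`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=120, n_features=4, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression()),
        ("tree", DecisionTreeClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    passthrough=True,  # append the original features to the predictions
    cv=3,
).fit(X, y)

# Input seen by the final estimator: learner predictions first, then X.
meta_features = stack.transform(X)
passthrough_is_last = np.allclose(meta_features[:, -X.shape[1]:], X)
```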

Re: [scikit-learn] Vote: Add Adrin Jalali to the scikit-learn technical committee

2020-05-02 Thread Guillaume Lemaître
+1 On Tue, 28 Apr 2020 at 20:59, Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > +1 > > On Tue, 28 Apr 2020 at 01:34, Joel Nothman wrote: > >> +1 >> >> On Tue, 28 Apr 2020 at 02:23, Tom DLT wrote: >> >>> +1 >>> >>> Le lun. 27 avr. 2020, à 07 h 00, Alexandre Gramfort < >>> alexand

Re: [scikit-learn] Why does sklearn require one-hot-encoding for categorical features? Can we have a "factor" data type?

2020-05-01 Thread Guillaume Lemaître
OrdinalEncoder is the equivalent of pd.factorize and will work in the scikit-learn ecosystem. However, be aware that you should not swap OneHotEncoder for OrdinalEncoder at will. It depends on your machine learning pipeline. As mentioned by Gael, tree-based algorithms will be fine wi
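To make the difference between the two encoders concrete, here is a minimal sketch on a toy column. The `colors` data is invented for the example.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# A single categorical feature with three levels (toy data).
colors = np.array([["red"], ["green"], ["blue"], ["green"]])

# OrdinalEncoder maps each category to one integer code (categories are
# sorted, so blue=0, green=1, red=2). This imposes an arbitrary order.
ordinal_codes = OrdinalEncoder().fit_transform(colors)

# OneHotEncoder creates one binary column per category and no order.
one_hot = OneHotEncoder().fit_transform(colors).toarray()

print(ordinal_codes.ravel())
print(one_hot)
```

The integer codes carry an ordering that the original categories do not have, which is why the choice of encoder depends on the model that consumes the feature.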

Re: [scikit-learn] Monthly meetings

2020-02-21 Thread Guillaume Lemaître
Hi all, I attached the notes that I prepared: notes <https://bit.ly/2SOWpYe> We might have to prioritize if we want to make the meeting in an hour. Cheers, On Fri, 21 Feb 2020 at 17:15, Guillaume Lemaître wrote: > Thanks, Nicolas for the recall. I will prepare a sort of agend

Re: [scikit-learn] Monthly meetings

2020-02-21 Thread Guillaume Lemaître
Thanks, Nicolas, for the reminder. I will prepare a sort of agenda for the meeting, which I will post beforehand. On Fri, 21 Feb 2020 at 00:08, Nicolas Hug wrote: > Hi all, > > The next scikit-learn monthly meeting will take place on Monday at the > usual time ( > https://www.timeanddate.com

[scikit-learn] SLEP011: Change of governance

2020-01-30 Thread Guillaume Lemaître
Dear all, I would like to propose a change of governance in the decision process to make it possible to retract a SLEP and to not escalate a mandatory TC vote. I opened a SLEP where we can discuss it: https://github.com/scikit-learn/enhancement_proposals/pull/28/files Cheers, -- Guillaume L

Re: [scikit-learn] Which sparse matrix should be use for fit?

2020-01-29 Thread Guillaume Lemaître
If you could open an issue on GitHub, that would be great, because this info would be useful in the docstring. On Wed, 29 Jan 2020 at 10:58, Guillaume Lemaître wrote: > Looking at check_array in the SVR and SVC, we convert to CSR format if the > sparse matrices are not from this format: >

Re: [scikit-learn] Which sparse matrix should be use for fit?

2020-01-29 Thread Guillaume Lemaître
Looking at check_array in the SVR and SVC, we convert to CSR format if the sparse matrices are not already in this format: https://github.com/scikit-learn/scikit-learn/blob/b194674c4/sklearn/svm/_base.py#L146 Basically, this is more efficient because we are going to perform operations that access rows,
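The conversion described above can be reproduced with `check_array` directly. This sketch only shows the behaviour of `accept_sparse="csr"` on a random COO matrix; the exact arguments used inside the SVM code are not reproduced here.

```python
from scipy import sparse
from sklearn.utils import check_array

# A random sparse matrix in COO format (toy data).
X_coo = sparse.random(50, 10, density=0.2, format="coo", random_state=0)

# When the input format is not in `accept_sparse`, check_array converts it
# to the requested format, here CSR, which gives cheap row access.
X_checked = check_array(X_coo, accept_sparse="csr")

print(X_coo.format, "->", X_checked.format)
```

Passing a CSR matrix from the start avoids this conversion at fit time.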

Re: [scikit-learn] ask a question about weights for features in svc with rbf kernel

2020-01-20 Thread Guillaume Lemaître
You can look at the attribute coef_ once your model is fitted.  Sent from my phone - sorry to be brief and potential misspell.

Re: [scikit-learn] logistic regression results are not stable between solvers

2020-01-08 Thread Guillaume Lemaître
t doesn't raise a convergence warning should probably be considered a bug. > It uses the maximum weight change as a stopping criterion right now. > We could probably compute the dual objective once in the end to see if we > converged, right? Or is that not possible with SAGA? If not, w

Re: [scikit-learn] Vote on SLEP010: n_features_in_ attribute

2019-12-16 Thread Guillaume Lemaître
I am +1 as well. I think that what was proposed by @Joel Nothman should be considered. It seems that we have cases where we know that it is not meant to have the parameters (e.g., Vectorizer). I think that it would make sense to have an estimator tag. Thus, the FutureWarning for a third-party library m

Re: [scikit-learn] scikit-learn twitter account

2019-11-05 Thread Guillaume Lemaître
We stopped tweeting PRs in January, if I am not mistaken. So the current channel does not have a real purpose :) On Tue, 5 Nov 2019 at 09:02, Roman Yurchak wrote: > Maybe re-purposing? I'm not sure if people find useful the current > approach of a tweet per PR. > It would make things less confu

Re: [scikit-learn] scikit-learn twitter account

2019-11-04 Thread Guillaume Lemaître
+1 for outreach / -1 for support. FWIW, several people asked us at the Man AHL sprint how they could find out about future sprints. The Twitter account could be a nice channel to relay info about such public events. Communicating about the releases would also be great. Sent from my pho

Re: [scikit-learn] Decision tree results sometimes different with scaled data

2019-10-22 Thread Guillaume Lemaître
Even with the same random state, it can happen that several features lead to an equally good best split, and this split is then chosen randomly (even with the seed fixed - this is reported as an issue I think). Therefore, the rest of the tree could be different, leading to different predictions. Another possibilit

Re: [scikit-learn] logistic regression results are not stable between solvers

2019-10-09 Thread Guillaume Lemaître
Oops, I did not see Roman's answer. Sorry about that. It comes back to the same conclusion :) On Wed, 9 Oct 2019 at 23:37, Guillaume Lemaître wrote: > Uhm actually increasing to 1 samples solve the convergence issue. > SAGA is not designed to work with a so small sample siz

Re: [scikit-learn] logistic regression results are not stable between solvers

2019-10-09 Thread Guillaume Lemaître
Uhm, actually increasing to 1 samples solves the convergence issue. SAGA is most probably not designed to work with such a small sample size. On Wed, 9 Oct 2019 at 23:36, Guillaume Lemaître wrote: > I slightly change the bench such that it uses pipeline and plotted the > coefficient: >

Re: [scikit-learn] logistic regression results are not stable between solvers

2019-10-09 Thread Guillaume Lemaître
I slightly changed the benchmark so that it uses a pipeline and plotted the coefficients: https://gist.github.com/glemaitre/8fcc24bdfc7dc38ca0c09c56e26b9386 I only see one of the 10 splits where SAGA is not converging; otherwise the coefficients look very close (I don't attach the figure here but they

Re: [scikit-learn] logistic regression results are not stable between solvers

2019-10-09 Thread Guillaume Lemaître
Could you generate more samples, set penalty to none, reduce the tolerance, and check the coefficients instead of the predictions? This is just to be sure that this is not only a numerical error. Sent from my phone - sorry to be brief and potential misspell.   Original Message   Fro
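The check suggested above could look roughly like the following sketch. Everything here is an assumption made for illustration: the synthetic dataset, the scaling step, and the use of a very large `C` to approximate an unpenalized fit (chosen instead of the `penalty` option because its spelling differs across scikit-learn versions). It is not the benchmark from the thread.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# More samples than features, with label noise so the classes are not
# perfectly separable (otherwise unpenalized coefficients diverge).
X, y = make_classification(
    n_samples=2000, n_features=5, n_informative=3, n_redundant=0,
    flip_y=0.1, random_state=0,
)

coefs = {}
for solver in ("lbfgs", "saga"):
    model = make_pipeline(
        StandardScaler(),
        # Large C: assumption standing in for "no penalty".
        # Small tol and large max_iter: give each solver room to converge.
        LogisticRegression(C=1e6, solver=solver, tol=1e-6, max_iter=5000,
                           random_state=0),
    )
    model.fit(X, y)
    coefs[solver] = model[-1].coef_.ravel()

# Compare the fitted coefficients rather than the predictions.
max_abs_diff = np.max(np.abs(coefs["lbfgs"] - coefs["saga"]))
print(coefs["lbfgs"], coefs["saga"], max_abs_diff)
```

A small difference between the two coefficient vectors would point to numerical noise rather than a real disagreement between solvers.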
