Re: [scikit-learn] unsubscribe from the mailing list

2016-05-17 Thread Matthieu Brucher
Just read the end of the email. 2016-05-17 22:02 GMT+01:00 Jianjian Jin : > Hi, > >Looks like there are too many emails, so I'll have to unsubscribe to > the mailing list, would you help me with that? Thanks a lot. > > best > > Jianjian > > ___ > sc

Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Matthieu Brucher
I don't even think that squashing them before the merge is actually sound. You will still need the history of why something happened several years down the road (and rebasing actually has a similar issue). This bit me quite often (having just one big commit to analyze after a merge from ancient VCS

[scikit-learn] Question about error of LLE and backtransformation of coordinates

2016-06-16 Thread Matthieu Brucher
Hi! The errors are quite small compared to the machine precision. As the reduction is also an approximation of the underlying manifold, not an "isotropic" one as well (you can see in the example that red points are less squashed together than blue ones), you won't have a perfect reconstruction eit

Re: [scikit-learn] New Contributor

2016-06-18 Thread Matthieu Brucher
You can also try one, and if you are stuck, just ask for help. Someone should be able to help you out ;) 2016-06-18 10:33 GMT+01:00 Sagar Kar : > Thanks Gael, > I read it. But I am having hard time finding an issue to work on. Frankly, > I am unable to understand how to approach the easy issues a

Re: [scikit-learn] Model trained in 0.17 gives entirely different results in 0.15

2016-08-03 Thread Matthieu Brucher
More often than not, forward compatibility is not possible. I don't think there are lots of companies doing so, as even backward compatibility is tricky to achieve. Even with serializing the version, if the previous version doesn't know about the additional data structures that have an impact on the

Re: [scikit-learn] Model trained in 0.17 gives entirely different results in 0.15

2016-08-03 Thread Matthieu Brucher
True! 2016-08-03 20:38 GMT+01:00 Andreas Mueller : > > > On 08/03/2016 03:16 PM, Matthieu Brucher wrote: > >> More often than not, forward compatibility is not possible. I don't think >> there are lots of companies doing so, as even backward compatibility is >&

[scikit-learn] Recurrent questions about speed for TfidfVectorizer

2018-11-25 Thread Matthieu Brucher
Hi all, I've noticed a few questions online (mainly SO) on TfidfVectorizer speed, and I was wondering about the global effort on speeding up sklearn. Is there something I can help with on this topic (Cython?), as well as a discussion on this tough subject? Cheers, Matthieu -- Quantitative analyst, P

Re: [scikit-learn] Recurrent questions about speed for TfidfVectorizer

2018-12-05 Thread Matthieu Brucher
Hi all, Sorry for the late reply, lots of things to work on currently. I'll have a look at the roadmap and the pointers to see what could be done to enhance the situation. Cheers, Matthieu On Mon, 26 Nov 2018 at 20:09, Roman Yurchak via scikit-learn < scikit-learn@python.org> wrote: > Trie

Re: [scikit-learn] How is linear regression in scikit-learn done? Do you need train and test split?

2019-06-04 Thread Matthieu Brucher
Hi CW, It's not about the concept of the black box; none of the algorithms in sklearn is a black box. The question is about model validity. Is linear regression a valid representation of your data? That's what the train/test split answers. You may think so, but only this process will answer it properly.
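
The point about model validity can be illustrated with a minimal sketch. It uses only the Python standard library and is not scikit-learn's implementation; the helper names (fit_line, r2_score, held_out_r2) and the synthetic data are assumptions made for the example. A straight line is fitted on a training subset and scored on held-out points, once for data that really is linear and once for data that is quadratic.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def held_out_r2(xs, ys, test_fraction=0.25, seed=0):
    """Fit on a random training subset, then score on the unseen test subset."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = max(2, int(len(idx) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return r2_score([xs[i] for i in test], [ys[i] for i in test], slope, intercept)


# Hypothetical synthetic data: one linear relation, one quadratic relation.
rng = random.Random(42)
xs = [i / 10 for i in range(-50, 51)]
linear_ys = [3.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]
quadratic_ys = [x * x + rng.gauss(0, 0.5) for x in xs]

linear_r2 = held_out_r2(xs, linear_ys)
quadratic_r2 = held_out_r2(xs, quadratic_ys)
print("held-out R^2, linear data:   ", linear_r2)
print("held-out R^2, quadratic data:", quadratic_r2)
```

The held-out score should come out close to 1 for the linear data and much lower for the quadratic data, which is the signal that a linear model is not a valid representation there. With scikit-learn itself, the same workflow would typically go through train_test_split and LinearRegression.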

Re: [scikit-learn] Opinion on reference mentioning that RF uses weak learners

2020-08-16 Thread Matthieu Brucher
Hi, What are you wondering? The individual tree is weakened by design (accepts more errors), so indeed, the individual trees are weak learners and the combination of them (the forest) becomes the strong learner. You can have a strong tree as well (deeper, more parameters), but that's not what is s

Re: [scikit-learn] Creating dataset

2020-11-08 Thread Matthieu Brucher
data_file["data"], this works only if you have such a column as well. load_csv can perfectly do what you need, but you have to adapt the script to what you have in the csv (which is something only you know!). You need to understand what the different statements are doing; just as you need to unders

Re: [scikit-learn] Inquiry on Genetic Algorithm

2022-10-30 Thread Matthieu Brucher
GAs are not a machine learning model; they are a way of minimizing a cost function, so there are probably modules that are dedicated to this elsewhere. Matthieu On Sun, 30 Oct 2022 at 12:19, Thomas Evangelidis wrote: > Hi, > > I am not aware of any *official* scikit-learn implementation of a