Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Matthieu Brucher
I don't even think that squashing them before the merge is actually sound. You will still need the history of why something happened several years down the road (and rebasing actually has a similar issue). This bit me quite often (having just one big commit to analyze after a merge from ancient VCS

Re: [scikit-learn] Adding BM25 relevance function to sklearn.feature_extraction.text

2016-06-13 Thread Joel Nothman
Hi Basil, Scikit-learn isn't a library for information retrieval. The question is: how useful is the BM25 feature reweighting in a machine learning context? This has been previously discussed at https://www.mail-archive.com/scikit-learn-general@lists.sourceforge.net/msg11353.html. The whole threa

Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Juan Nunez-Iglesias
I think the main idea behind commit squashes is to make sure that every *commit* passes testing, rather than only merge commits. This is important because there's no way to tell git bisect to only look at merge commits. So when you are doing a git bisect to hunt down a regression or bug, it is very

Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Joel Nothman
My concern is that people are responding to being asked to squash on one PR by squashing during development on the next (as if merge were always imminent). I want that to stop. Is part of the solution to stop squashing, or make the person merging always perform the squash? On 14 June 2016 at 12:53

Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Sebastian Raschka
Hi, Joel, in my opinion, it really depends on the particular case, but in general I am pro squashing — that is, when it happens at the very end. I also agree that squashing and force pushing while there’s still a review going on is clearly disruptive. Say there’s a new estimator being added. Th

Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Andy
I agree that it adds some annoying overhead. For me, one of the main motivations is to make cherry picks for bugfix releases easier. It's very hard to cherry pick things that are spread out over many commits, and it's hard to find the relevant bug fixes among hundreds of minor commits. This rea

Re: [scikit-learn] Adding BM25 relevance function to sklearn.feature_extraction.text

2016-06-13 Thread Basil Beirouti
Hello all, You can use sklearn.feature_extraction.text.TfidfVectorizer to learn a corpus of documents and rank them in order of relevance to a new previously unseen query. BM25 works in a similar manner to TfidfVectorizer, but is more complex and considered one of the most successful information
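To make the ranking idea above concrete, here is a minimal sketch of BM25 scoring in plain Python. It is not the proposed scikit-learn implementation and not an existing sklearn API: the function name `bm25_scores`, the whitespace tokenizer, the toy corpus, and the defaults `k1=1.5` and `b=0.75` are all assumptions for illustration, and the IDF term uses one common smoothed (non-negative) variant of the Okapi BM25 formula.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query (hypothetical helper, not sklearn)."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        # Length normalization: longer-than-average documents are penalized.
        length_norm = k1 * (1 - b + b * len(toks) / avg_len)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF variant that never goes negative.
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Toy corpus, made up for this example.
corpus = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "machine learning with text features",
]
scores = bm25_scores("text features", corpus)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```

Unlike plain TF-IDF weighting, the term-frequency contribution here saturates as `tf` grows, and `b` controls how strongly document length is normalized.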

Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Andy
On 06/13/2016 09:36 PM, Joel Nothman wrote: (And apologies for wasting your time on such a silly issue, but I'm sick of clicking links in emails to find the commit's disappeared.) I really see no reason why someone should squash something before it is ready to be merged (as Jacob suggested).

Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Jacob Schreiber
My research work involves frequently contributing small changes. I like to keep these around as a record of what I've done, until I've finished with that part of the code. However, I also hate having large numbers of commits (I can frequently commit 50+ times a day without much substantive progress)

[scikit-learn] The culture of commit squashing

2016-06-13 Thread Joel Nothman
For the last few years, there's been a notion that we should squash PRs down to a single commit before merging. Squashing can give a cleaner commit history, and avoid overrepresentation of minor work given silly commit count metrics used by GitHub and others. I'm not sure if there are other motivat

Re: [scikit-learn] how to create and execute a machine learning models in Java/JVM based application (in production) using Python

2016-06-13 Thread 颜发才
How about Spark? It contains some common machine learning algorithms and supports a Java API. On Jun 13, 2016 01:26, "Gaurav gupta" wrote: > Hi All, > Could you please guide me on how to *create and execute* a machine learning models/statistical models (regression, Decision tree, K means