A form – with required, pre-defined fields – can help when people submit bugs, 
issues, or requests for new features. Perhaps creating an issue template for 
scikit-learn is a good first step.


Pull requests also have a template


I am not sure how these fit into the team’s review and release workflow.

If this doesn’t quite fit your needs, perhaps engaging Github Support will 
yield something interesting.

Dale Smith | Macy's Systems and Technology | IFS eCommerce | Data Science
770-658-5176 | 5985 State Bridge Road, Johns Creek, GA 30097 | 

From: scikit-learn 
[mailto:scikit-learn-bounces+dale.t.smith=macys....@python.org] On Behalf Of 
Joel Nothman
Sent: Friday, September 16, 2016 1:15 AM
To: Scikit-learn user and developer mailing list
Subject: Re: [scikit-learn] Github project management tools

I think we're quite close to the intended users of Github; they just started 
simple and, as all these more feature-complete competitors appear, are adding 
those features but haven't quite got it right yet. I'm not convinced that it's 
the perfect tool (although I haven't seen this threading problem; gmail seems 
to still be keeping one thread per PR?), but its simplicity and 
familiarity/popularity is a great advantage for handling new contributors. In 
terms of contributor familiarity, most of the projects that we integrate with 
use the same: numpy, scipy, cython (recently), pandas, matplotlib, ipython. While I 
appreciate that we are somewhat arbitrarily supporting a near-monopoly, the 
case for moving away from, or even wrapping, github seems poor to me.

Apart from distinguishing between possible bug, actual bug and other (which are 
fairly static categories), classifying issues by status is too hard to manage. 
What I'd like to suggest is that we choose a way to highlight high-priority 
issues for the next release, either through the milestone feature or the project 
feature. Other issues will still get attention by way of random traffic, but we 
care less about the timing of their resolution.

(I'm sure there must be a way using the API to find issues linked to by PRs or 
not, but I don't think that's available in the UI.)
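
One way this could be approximated offline, as a rough sketch only: assume the PR descriptions have already been fetched (for example through the Github API) and scan them for Github's closing keywords followed by an issue number. The function names and the sample data here are hypothetical, and a PR that mentions an issue without a closing keyword would be missed.

```python
import re

# Github's closing keywords (close/fix/resolve and their variants),
# followed by "#<issue number>". An optional colon is tolerated.
_CLOSING_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s*:?\s+#(\d+)",
    re.IGNORECASE,
)


def issues_referenced_by(pr_bodies):
    """Return the set of issue numbers referenced with a closing keyword."""
    linked = set()
    for body in pr_bodies:
        # A PR description may be empty (None), so fall back to "".
        linked.update(int(n) for n in _CLOSING_REF.findall(body or ""))
    return linked


def split_by_linked(issue_numbers, pr_bodies):
    """Split issues into (linked to by some PR, not linked to by any PR)."""
    linked = issues_referenced_by(pr_bodies)
    with_pr = sorted(n for n in issue_numbers if n in linked)
    without_pr = sorted(n for n in issue_numbers if n not in linked)
    return with_pr, without_pr


# Hypothetical sample data, not real scikit-learn issues.
with_pr, without_pr = split_by_linked(
    [1, 2, 3, 45],
    ["Fixes #1", "closes: #45 and see #2", None],
)
print(with_pr, without_pr)
```

Only the text matching is shown; how the issue numbers and PR descriptions are retrieved is left open.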

On 16 September 2016 at 09:35, Andreas Mueller <t3k...@gmail.com> wrote:
Hey Joel.
Thanks for bringing this up. I have a really hard time keeping up with what's 
on the issue tracker and I have no idea how you manage.

The current tags are certainly not always helpful. Also, they are rarely 

I have been very frustrated by github. I used email to track all issues, but 
their new "upgrade"
made that impossible as issues are no longer email threads - each review is 
its own thread.

It might make sense to switch to something like reviewable or gerrit.
These sit on top of github, and people can interact with them without using 
I haven't really worked with either, but heard only good things about them.

Any way of prioritizing issues and putting them into the buckets that you listed 
would be a great step forward.
That would require someone manually going through 470 PRs and 762 issues, 
I would be happy to do that if we actually stick to the system. A single person 
is not enough to keep the tags (or whatever we end up using)
up to date, though.

Your statuses only apply to PRs, too, and we need to have something similar for 
issues, which have maybe these statuses

* random idea / feature request
* feature request with consensus to implement
* possible bug
* confirmed bug
* feature request or bug with active PR
* feature request or bug with stale PR

One problem with these is that many feature requests never get any comments, 
similar for PRs.
Is a PR without comment waiting for review? Or in dispute?
A PR could be reviewed but dispute could happen later, as we don't always agree 
on what to do.

I agree that we should try to organize ourselves better. I'm doubtful the new 
github features will help.
They certainly already have tremendously hindered me in keeping up in the 
couple of hours they've been online.

There is still no way to mark a comment as addressed, and comments are still 
more or less randomly hidden
(and links to them become dead). Both of these issues are fixed in the other 
review platforms.

I don't think we are the intended users of github, though I'm not sure who is.

On 09/15/2016 07:14 PM, Joel Nothman wrote:
One of the biggest issues with scikit-learn as a project is managing its 
backlog of issues; another is release scheduling. Some of this cannot be fixed 
as long as our model of voluntary contribution (with a couple of important 
exceptions) does not change. However, it may be worth considering the new 
project management features in Github.

At the moment we have the following management:
* labels corresponding to type (bug, enhancement, new feat, question), scope 
(API, Build/CI, ?Large Scale, Documentation), difficulty (easy, moderate), 
status/scheduling (needs contributor, needs review, sprint).
* PR status management with title prefixes [WIP], [MRG], [MRG+1], [MRG+2]
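
As an illustration of how the title prefixes above could be read mechanically, here is a minimal sketch. The helper name `title_status` and the example titles are hypothetical; it only recognises the [WIP], [MRG] and [MRG+N] forms listed here.

```python
import re

# Matches a leading [WIP], [MRG] or [MRG+N] prefix in a PR title.
_PREFIX = re.compile(r"^\s*\[(WIP|MRG)(?:\s*\+\s*(\d+))?\]", re.IGNORECASE)


def title_status(title):
    """Return (prefix, approvals) parsed from a PR title.

    The prefix is "WIP", "MRG", or None when no known prefix is present.
    The approval count is only meaningful for MRG titles.
    """
    match = _PREFIX.match(title)
    if match is None:
        return (None, 0)
    kind = match.group(1).upper()
    if kind == "WIP":
        return ("WIP", 0)
    return ("MRG", int(match.group(2) or 0))


# Hypothetical PR titles.
for title in ["[WIP] Add foo", "[MRG] Fix bar", "[MRG+1] Fix baz", "Untagged"]:
    print(title, "->", title_status(title))
```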

Firstly, we might benefit from prefixing labels by category, i.e. 
difficulty:easy so that complementary labels appear together.
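
A small sketch of what the prefixed names might look like, using some of the labels listed above. The category names ("type", "scope", "difficulty", "status") and the mapping itself are assumptions for illustration, not an agreed scheme.

```python
# Partial, assumed grouping of the current labels by category.
CATEGORIES = {
    "type": ["bug", "enhancement", "new feat", "question"],
    "scope": ["API", "Build/CI", "Documentation"],
    "difficulty": ["easy", "moderate"],
    "status": ["needs contributor", "needs review", "sprint"],
}


def prefixed_names(categories):
    """Map each current label to a "category:label" name."""
    return {
        label: f"{category}:{label}"
        for category, labels in categories.items()
        for label in labels
    }


renamed = prefixed_names(CATEGORIES)

# Sorting the new names alphabetically keeps each category together.
for name in sorted(renamed.values(), key=str.lower):
    print(name)
```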

In truth, PRs have roughly these statuses:
* WIP (not ready for review)
* waiting for review
* waiting for changes (with or without one of the following)
* in dispute (i.e. fundamental doubts about the PR)
* the above together with 1 or 2 "official" approvals
* ready for merge (pending minor changes such as what's new documentation)

New github features:

* reviews with "approved" or "request changes". A list of approvers can be 
found in the merge/CI panel. We could replace the MRG+1 annotation with this 
and use it to track disputation too. I'm not sure how it works with changes 
that are added after approval. I think it would have avoided one improper merge 
by me... One downside is that there does not yet seem to be a way to search for 
PRs with a specified level of approval (while searching for "MRG+1" sort-of 
* Milestone prioritising: issues in a milestone, such as 
https://github.com/scikit-learn/scikit-learn/milestone/21, can be ranked with 
drag-and-drop. I think this could help with release scheduling as it would 
allow us to identify the top priorities for a release and see when enough of 
them are completed.
* The Kanban-style workflow management of the new Projects tool 
https://github.com/scikit-learn/scikit-learn/projects is another way of 
managing status and, I think, priority, for a small set of related issues. This 
might be an alternative way of managing milestone scope, or of working towards 
big changes like the one just completed for model selection; like the proposed 
expansions to get_feature_names; like estimator tags; or making 
utilities public/private...

So with the goal of making it easier to track where attention is most needed, 
and when to move to release: What's worth trying?


scikit-learn mailing list


