Re: [scikit-learn] Github project management tools

2016-09-19 Thread Joel Nothman
On 17 September 2016 at 01:21, Gael Varoquaux wrote: > On Fri, Sep 16, 2016 at 09:14:12AM +1000, Joel Nothman wrote: > > One downside is that there does not yet seem to be a way to search for > > PRs with a specified level of approval (while searching for "MRG+1" > sort-of > > works). > > Yes, I

Re: [scikit-learn] Github project management tools

2016-09-19 Thread Tim Head
On Mon, Sep 19, 2016 at 4:06 PM Joel Nothman wrote: > I think it would be worth trying to have a rough *priority ranking for > things we'd like to see in 0.19*. However the Github Milestones feature > is a bit crippled in UI: you can rank issues, but cannot filter by anything > but open/closed, s

Re: [scikit-learn] Github project management tools

2016-09-19 Thread Joel Nothman
Another bot-able tool might be pinging inactive PRs to ask if they're being worked on, and labelling "Needs contributor" if there's no reply within n days...! On 20 September 2016 at 00:05, Joel Nothman wrote: > On 17 September 2016 at 01:21, Gael Varoquaux < > gael.varoqu...@normalesup.org> wro

Re: [scikit-learn] Github project management tools

2016-09-19 Thread Nelle Varoquaux
> Another bot-able tool might be pinging inactive PRs to ask if they're being > worked on, and labelling "Needs contributor" if there's no reply within n > days...! If PRs are inactive, it might also be interesting to tag them as easy_fix when there is little to do. > > On 20 September 2016 at 00

[scikit-learn] behaviour of OneHotEncoder somewhat confusing

2016-09-19 Thread Lee Zamparo
Hi sklearners, A lab-mate came to me with a problem about encoding DNA sequences using preprocessing.OneHotEncoder, and I find it to produce confusing results. Suppose I have a DNA string: myguide = 'ACGT' He’d like to use OneHotEncoder to transform DNA strings, character by character, into a one

Re: [scikit-learn] behaviour of OneHotEncoder somewhat confusing

2016-09-19 Thread Sebastian Raschka
Hi, Lee, maybe set `n_values=4`, this seems to do the job. I think the problem you encountered is due to the fact that the one-hot encoder infers the number of values for each feature (column) from the dataset. In your example, each column had only 1 unique value > array([[0, 1,
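
The column-wise inference described in this reply can be illustrated with a small pure-Python sketch. This is a simplified model written for illustration, not scikit-learn's implementation: the helper names `fit_columns` and `transform` are hypothetical, and the label-encoded row `[0, 1, 2, 3]` is assumed to stand for 'ACGT' passed as one sample with four features.

```python
def fit_columns(rows):
    """Collect, for each column, the sorted distinct values seen in the data."""
    return [sorted(set(col)) for col in zip(*rows)]


def transform(rows, categories):
    """One-hot encode each column against that column's own category list."""
    encoded = []
    for row in rows:
        out = []
        for value, cats in zip(row, categories):
            out.extend(1 if value == c else 0 for c in cats)
        encoded.append(out)
    return encoded


# Assumption: 'ACGT' label-encoded as a single sample with four features.
one_sample = [[0, 1, 2, 3]]

# Inferring categories from the data: every column has seen exactly one value,
# so every column collapses to a single indicator.
inferred = fit_columns(one_sample)
narrow = transform(one_sample, inferred)

# Stating up front that each column may take four values (the role played by
# the suggested setting) gives four indicators per column instead.
fixed = [[0, 1, 2, 3]] * 4
wide = transform(one_sample, fixed)

print(narrow)
print(wide)
```

With inferred categories the single row encodes to four columns, one per position; with four values declared per column it encodes to sixteen.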

Re: [scikit-learn] behaviour of OneHotEncoder somewhat confusing

2016-09-19 Thread Joel Nothman
OneHotEncoder has issues, but I think all you want here is ohe.fit_transform(np.transpose(le.fit_transform([c for c in myguide]))) Still, this seems like it is far from the intended use of OneHotEncoder (which should not really be stacked with LabelEncoder), so it's not surprising it's tricky. On

Re: [scikit-learn] behaviour of OneHotEncoder somewhat confusing

2016-09-19 Thread Lee Zamparo
Hi Sebastian, Great, thanks! The docstring doesn’t make it very clear that using the default `n_values='auto'` infers the number of different values column-wise; maybe I could do a quick PR to update it? Or, maybe I could make your example into a, well, example for the documentation online? Alte

Re: [scikit-learn] behaviour of OneHotEncoder somewhat confusing

2016-09-19 Thread Lee Zamparo
Hi Joel, Yea, seems that the one-hot encoding of the transpose solves the issue. As you say, and as I mentioned to Sebastian, it seems a bit off-usage for OneHotEncoder. Thanks for the solution all the same though. -- Lee Zamparo On September 19, 2016 at 7:48:15 PM, Joel Nothman (joel.noth...
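
Since the thread agrees that this is somewhat off-usage for OneHotEncoder, a fixed-alphabet encoding can also be written directly. The following is a minimal sketch under stated assumptions, not something proposed in the thread: the function name `one_hot_dna` and the alphabet order "ACGT" are hypothetical choices, and each character is treated as one sample, which is the same effect the transpose achieves.

```python
ALPHABET = "ACGT"  # assumed base order; any fixed order works


def one_hot_dna(seq, alphabet=ALPHABET):
    """Return one row per character, with a single 1 at that base's index."""
    index = {base: i for i, base in enumerate(alphabet)}
    rows = []
    for base in seq:
        row = [0] * len(alphabet)
        row[index[base]] = 1  # raises KeyError for bases outside the alphabet
        rows.append(row)
    return rows


myguide = "ACGT"
for base, row in zip(myguide, one_hot_dna(myguide)):
    print(base, row)
```

Because the alphabet is fixed in advance, the width of the encoding does not depend on which bases happen to appear in a given string.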