Re: [scikit-learn] Welcome Raghav to the core-dev team

2016-10-11 Thread Joaquin Vanschoren
A bit late, but heartfelt congrats to Raghav :) On Tue, Oct 4, 2016 at 12:43 PM Joel Nothman wrote: > Congratulations, Raghav! Thanks for your dedication, as a student and > mentor in GSoC, but at all other times too! > > On 4 October 2016 at 19:14, Jaques Grobler > wrote: > > Congrats Raghav!

[scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Gabriel Trautmann
Hi, After upgrading to scikit-learn 0.18 HashingVectorizer is about 10 times slower. Before: scikit-learn 0.17. Numpy 1.11.2. Python 3.5.2 AMD64 Vectorizing 20newsgroup 11314 documents Vectorization completed in 4.594092130661011 seconds, resulting shape (11314, 1048576) After upgrade: scik

Re: [scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Olivier Grisel
I cannot reproduce such a degradation on my machine: (sklearn-0.17)ogrisel@is146148:~/code/scikit-learn$ python ~/tmp/bench_vectorizer.py scikit-learn 0.17.1. Numpy 1.11.2. Python 3.5.0 x86_64 Vectorizing 20newsgroup 11314 documents Vectorization completed in 4.033604383468628 seconds, resulting
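The `bench_vectorizer.py` script run above is not shown in the thread. As a rough sketch of how such a timing could be taken, the snippet below times `HashingVectorizer` on a small synthetic corpus and prints a version line in the spirit of the quoted output. The helper names `timed` and `environment_line` and the synthetic corpus are assumptions made for illustration only; to repeat the measurement discussed here, the corpus would be replaced by the 20 newsgroups training set.

```python
import platform
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def environment_line():
    """Build a one-line summary of the library and interpreter versions."""
    parts = []
    for label, module_name in (("scikit-learn", "sklearn"), ("Numpy", "numpy")):
        try:
            module = __import__(module_name)
            parts.append("%s %s." % (label, module.__version__))
        except ImportError:
            parts.append("%s not installed." % label)
    parts.append("Python %s %s" % (platform.python_version(), platform.machine()))
    return " ".join(parts)


if __name__ == "__main__":
    print(environment_line())
    try:
        from sklearn.feature_extraction.text import HashingVectorizer
    except ImportError:
        print("scikit-learn is not available; skipping the timing run")
    else:
        # Assumption: a synthetic corpus stands in for the real dataset.
        docs = ["the quick brown fox jumps over the lazy dog %d" % i
                for i in range(5000)]
        X, elapsed = timed(HashingVectorizer().fit_transform, docs)
        print("Vectorization completed in %s seconds, resulting shape %s"
              % (elapsed, X.shape))
```

Running the same script in two environments that differ only in the scikit-learn version gives a like-for-like comparison of the kind reported in this thread.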

Re: [scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Gabriel Trautmann
Thank you for your response. I have Windows 7 Enterprise 64-bit with an Intel Xeon E5 2640 CPU, and I see the same problem on two similar machines. python-3.5.2-amd64.exe - fresh installation numpy-1.11.2+mkl-cp35-cp35m-win_amd64.whl - from Christoph Gohlke scipy-0.18.1-cp35-cp35m-win_amd64.whl pip install scikit-lean

Re: [scikit-learn] ANN Scikit-learn 0.18 released

2016-10-11 Thread Piotr Bialecki
Congratulations to all contributors! I would like to update to the new version using conda, but apparently it is not available: ~$ conda update scikit-learn Fetching package metadata ... Solving package specifications: .. # All requested packages already installed. # packages in env

Re: [scikit-learn] ANN Scikit-learn 0.18 released

2016-10-11 Thread Maciek Wójcikowski
Hi Piotr, I've been there. Most probably some package is blocking the update through its numpy dependency. Try to update numpy first and the conflicting package should pop up: "conda update numpy=1.11" Best regards, Maciek Wójcikowski mac...@wojcikowski.pl 2016-10-11 14:32 GMT+0

Re: [scikit-learn] ANN Scikit-learn 0.18 released

2016-10-11 Thread Piotr Bialecki
Hi Maciek, thank you very much! Numpy and opencv were indeed the conflicting packages. Apparently my version of opencv was using numpy 1.10, so I uninstalled opencv, updated numpy and updated scikit-learn to 0.18. Thanks for the fast help! Best regards, Piotr On 11.10.2016 14:39, Maciek Wójcikowski

Re: [scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Olivier Grisel
That's really weird. I don't have a Windows machine handy at the moment. It would be nice if someone else could confirm. Could you please run the Python profiler on this to see where the time is spent on the slow setup? -- Olivier
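A minimal sketch of how the profiling requested above might be done with the standard-library `cProfile` and `pstats` modules follows. The helper `profile_call` and the synthetic corpus are assumptions for illustration, not code from the thread; on the slow setup, the real benchmark call would be profiled instead.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    try:
        from sklearn.feature_extraction.text import HashingVectorizer
    except ImportError:
        print("scikit-learn is not available; skipping the profiling run")
    else:
        # Assumption: a synthetic corpus stands in for the real dataset.
        docs = ["the quick brown fox jumps over the lazy dog %d" % i
                for i in range(5000)]
        X, report = profile_call(HashingVectorizer().fit_transform, docs)
        print(X.shape)
        print(report)
```

Comparing the reports from the fast and the slow environment should show which functions account for the extra time.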

Re: [scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Piotr Bialecki
I just tested it on my Ubuntu machine and could not see any performance issues (5.68 seconds in scikit-learn 0.17 vs. 6.67 seconds in scikit-learn 0.18). However, on another Windows 10 machine I could indeed see this issue: scikit-learn 0.17.1. Numpy 1.11.1. Python 2.7.12 AMD64 Vectorizing 20new

Re: [scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Gael Varoquaux
Could it be a case of compilation: it seems to me that we are compiling MKL vs non MKL builds. ___ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn

Re: [scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Mathieu Blondel
On Tue, Oct 11, 2016 at 10:49 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > Could it be a case of compilation: it seems to me that we are compiling > MKL vs non MKL builds. > The hashing vectorizer is written in Cython and doesn't use BLAS, though. Mathieu

Re: [scikit-learn] HashingVectorizer slow in version 0.18

2016-10-11 Thread Andreas Mueller
Please open an issue on the issue tracker: https://github.com/scikit-learn/scikit-learn/issues On 10/11/2016 08:19 AM, Gabriel Trautmann wrote: Thank you for your response, have Windows 7 Enterprise 64 bit / Intel Xeon E5 2640 CPU, same problem on two similar machines python-3.5.2-amd64.exe -