Hi all,
Sorry for the late reply, lots of things to work on currently.
I'll have a look at the roadmap and the pointers to see what could be done
to enhance the situation.
Cheers,
Matthieu
On Mon, 26 Nov 2018 at 20:09, Roman Yurchak via scikit-learn <
scikit-learn@python.org> wrote:
> Trie
Tries are interesting, but it appears that while they use less memory
than dicts/maps, they are generally slower than dicts for a large number
of elements. See e.g.
https://github.com/pytries/marisa-trie/blob/master/docs/benchmarks.rst.
This is also consistent with the results in the benchmarks linked above.
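For a rough illustration of that lookup-speed trade-off, here is a minimal
sketch (assuming marisa-trie is installed via pip; the vocabulary size and
key names below are arbitrary):

# Compare lookup time of a plain dict vs. a marisa-trie for a
# vocabulary-style string -> integer id mapping.
import timeit

import marisa_trie

words = ["token_%d" % i for i in range(200000)]

vocab_dict = {w: i for i, w in enumerate(words)}
vocab_trie = marisa_trie.Trie(words)  # assigns each key a unique integer id

dict_time = timeit.timeit(lambda: vocab_dict["token_123456"], number=100000)
trie_time = timeit.timeit(lambda: vocab_trie["token_123456"], number=100000)

print("dict lookups: %.3fs" % dict_time)
print("trie lookups: %.3fs" % trie_time)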
I think tries might be an interesting data structure, but it really
depends on where the bottleneck is.
I'm really surprised they are not used more, but maybe that's just
because implementations are missing?
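As a starting point for locating the bottleneck, something like the
following profiling sketch could help (assuming only scikit-learn and the
standard library; the synthetic corpus and its size are arbitrary):

# Profile TfidfVectorizer on a toy corpus to see which internal steps
# dominate the runtime.
import cProfile
import pstats

from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["the quick brown fox jumps over the lazy dog %d" % i
          for i in range(10000)]

vectorizer = TfidfVectorizer()

profiler = cProfile.Profile()
profiler.enable()
X = vectorizer.fit_transform(corpus)
profiler.disable()

# Print the 10 most time-consuming calls by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)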
On 11/26/18 8:39 AM, Roman Yurchak via scikit-learn wrote:
Hi Matthieu,
if you are interested in general questions regarding improving
scikit-learn performance, you might want to have a look at the draft
roadmap
https://github.com/scikit-learn/scikit-learn/wiki/Draft-Roadmap-2018 --
there are a lot of topics where suggestions / PRs on improving performance are welcome.
Hi all,
I've noticed a few questions online (mainly on SO) about TfidfVectorizer speed,
and I was wondering about the overall effort to speed up scikit-learn.
Is there something I could help with on this topic (Cython?), or a
discussion to be had on this tough subject?
Cheers,
Matthieu
--
Quantitative analyst, P