On Thu, Nov 05, 2015 at 07:05:11AM +0000, Raphael C wrote:
> https://github.com/szilard/benchm-ml
> The upshot is that in some cases it seems that the scikit-learn
> versions have room for improvement.
The main lessons that I can see from those results are:
* Linear models (aka LogisticRegression) don't scale very well:
- The page benches the default, which is liblinear.
I would be very curious to see how the other solvers (Newton and
SAG) fare on this dataset.
It would be useful to introduce a 'solver="auto"' for logistic
regression, based on heavy benchmarks and heuristics.
I have created an issue to discuss whether we want to do this:
https://github.com/scikit-learn/scikit-learn/issues/5736
- Having fused types to avoid increased memory usage would be useful.
For this we first need to finish adding Cython as a build dependency:
https://github.com/scikit-learn/scikit-learn/pull/5492
* In tree-based models, not handling categorical variables as such hurts us a lot.
There's a PR to fix that; it still needs a bit of love:
https://github.com/scikit-learn/scikit-learn/pull/4899
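For the solver comparison mentioned above, here is a rough sketch of how one
could time the LogisticRegression solvers side by side. It is only an
illustration under my own assumptions: the data is synthetic (from
make_classification, not the benchmark's dataset), the helper name
time_solvers and the problem size are made up, and I have added lbfgs to the
list even though I did not name it above. Timings on a toy problem like this
say little about behaviour at scale.

```python
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def time_solvers(n_samples=2000, n_features=20,
                 solvers=("liblinear", "newton-cg", "lbfgs", "sag"),
                 seed=0):
    """Fit LogisticRegression once per solver; return {solver: (seconds, accuracy)}.

    Hypothetical helper: synthetic binary data stands in for a real benchmark.
    """
    X, y = make_classification(n_samples=n_samples, n_features=n_features,
                               random_state=seed)
    # SAG converges much more reliably when features share a common scale.
    X = StandardScaler().fit_transform(X)

    results = {}
    for solver in solvers:
        clf = LogisticRegression(solver=solver, max_iter=1000,
                                 random_state=seed)
        start = time.perf_counter()
        clf.fit(X, y)
        elapsed = time.perf_counter() - start
        # Training accuracy only, as a sanity check that the fit worked.
        results[solver] = (elapsed, clf.score(X, y))
    return results


results = time_solvers()
for solver, (elapsed, acc) in results.items():
    print(f"{solver:>10}: {elapsed:.4f} s, train accuracy {acc:.3f}")
```

Something along these lines, run over a grid of n_samples and n_features on
real data, is the kind of evidence a 'solver="auto"' heuristic would need.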
Gaël