Paolo,

I noticed that too - maybe @glouppe can comment on this - I think the
reason was a change in the ``n_features`` heuristic but I might be
mistaken.

Concerning the GaussianNB - there's a PR [1] addressing a critical bug
in the estimator - it should be merged ASAP. Furthermore, test time is
quite low - this might be due to memory layout issues - SGDClassifier
converts ``coef_`` to fortran-style for increased test-time
performance.
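
To illustrate the memory layout point, here is a minimal NumPy sketch. It
is not the actual SGDClassifier or GaussianNB code; the variable names and
the random data are made up, and only the shapes (54 features, 7 classes)
follow covertype. It shows that converting a coefficient matrix to
fortran-style changes how the values are stored, not the values or the
resulting scores, so any speed difference would come purely from memory
access patterns.

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical data: 1000 samples with 54 features, 7 classes.
X = rng.rand(1000, 54)

# Coefficients in the default C (row-major) layout.
coef_c = rng.rand(7, 54)

# Same values stored in Fortran (column-major) layout.
coef_f = np.asfortranarray(coef_c)

# Decision scores computed from either layout.
scores_c = np.dot(X, coef_c.T)
scores_f = np.dot(X, coef_f.T)

print(coef_c.flags['C_CONTIGUOUS'], coef_f.flags['F_CONTIGUOUS'])
print(np.allclose(scores_c, scores_f))
```

Timing the two ``np.dot`` calls (e.g. with ``timeit``) on the real data
would show whether the layout matters on a given machine.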

best,
 Peter

[1] https://github.com/scikit-learn/scikit-learn/pull/731

2012/3/27 Paolo Losi <[email protected]>:
> Hi all,
>
> I've just run bench_covertype on today's master.
> I needed to uncomment the ExtraTrees and RandomForest benchmarks.
>
> The results are quite unexpected:
>
> Classifier   train-time test-time error-rate
> --------------------------------------------
> Liblinear     13.5609s   0.0683s     0.2307
> GaussianNB    3.6565s    0.1753s     0.6367
> SGD           0.4522s    0.0170s     0.2300
> CART          35.3378s   0.0375s     0.0476
> RandomForest 246.8737s   0.6908s     0.0807
> Extra-Trees  182.0412s   0.6269s     0.1986
>
>
> with respect to what is reported in bench_covertype.py:
>
> Classifier    train-time test-time error-rate
> ---------------------------------------------
> Liblinear      11.8977s   0.0285s     0.2305
> GaussianNB      3.5931s   0.6645s     0.3633
> SGD             0.2924s   0.0114s     0.2300
> CART           39.9829s   0.0345s     0.0476
> RandomForest  794.6232s   1.0526s     0.0249
> Extra-Trees  1401.7051s   1.1181s     0.0230
>
> Unless I'm missing something obvious I'll open a ticket
> and try to give git bisect a run ...
>
> Thanks!
> Paolo
>
> PS: I just noticed that also GaussianNB results are worse...



-- 
Peter Prettenhofer

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
