2013/1/14 Andreas Mueller <[email protected]>:
> Hi Peter.
> I only skimmed your mail, but I understood you said that the problem is
> the use of a boolean mask.
> Wouldn't it be possible to do the subsampling explicitly before training
> the tree if the sample_fraction is low?

absolutely, when I wrote the code I hadn't thought about very low
values of ``subsample`` (<< 0.5). But again, this would incur high
memory costs: we would need to fancy index, which copies the selected
rows.
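
To make the memory trade-off concrete, here is a minimal NumPy sketch.
It is not the actual GBRT code; the array sizes and the names
``n_inbag`` and ``X_sub`` are made up for illustration. It only shows
that a boolean mask costs one byte per row, while indexing with it
materializes a copy of the selected rows.

```python
import numpy as np

rng = np.random.RandomState(0)
n_samples, n_features = 10_000, 20          # illustrative sizes only
X = rng.rand(n_samples, n_features)
subsample = 0.1                             # low sampling fraction

# Option A: boolean sample mask. One byte per row, X itself is untouched.
n_inbag = int(subsample * n_samples)
sample_mask = np.zeros(n_samples, dtype=bool)
sample_mask[rng.permutation(n_samples)[:n_inbag]] = True

# Option B: explicit subsampling. Indexing with the mask (or with an
# index array) is "fancy" indexing and always returns a copy.
X_sub = X[sample_mask]

print("mask bytes:", sample_mask.nbytes)
print("copy bytes:", X_sub.nbytes)
print("shares memory with X:", np.shares_memory(X, X_sub))
```

The copy is small when ``subsample`` is small, but it has to be
rebuilt at every boosting stage because each stage draws a new sample.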

> Or is the complexity of applying the sample mask higher than training
> the tree?

by applying the sample mask, do you mean fancy indexing with the sample
mask? In general, if you build deep trees the cost of fancy indexing
can be amortized by the subsequent split computations; if you build
shallow trees it often cannot be amortized, so you're better off with
the sparse sample_mask. I had the impression that fancy indexing
didn't help much for GBRT and shallow trees (below depth 6).

>
> Also: would it be possible to speed this up using the recently
> introduced sample weights?
> That helped for the random forests, right?

no, unfortunately not: in GBRT we sample w/o replacement (RF samples
w/ replacement)
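
A small sketch of why the distinction matters, under my own
assumptions (this is plain NumPy, not the RF or GBRT implementation,
and the variable names are hypothetical). A draw with replacement
collapses into integer counts that can be passed as sample weights; a
draw without replacement only ever yields weights of 0 or 1, which is
just the sample mask again.

```python
import numpy as np

rng = np.random.RandomState(0)
n_samples = 10

# With replacement (bootstrap): repeated draws become integer weights.
boot_idx = rng.randint(0, n_samples, n_samples)
boot_weight = np.bincount(boot_idx, minlength=n_samples)

# Without replacement (subsample): every row is drawn at most once.
n_inbag = 5
sub_idx = rng.permutation(n_samples)[:n_inbag]
sub_weight = np.bincount(sub_idx, minlength=n_samples)

# The 0/1 weight vector carries the same information as a boolean mask.
sample_mask = np.zeros(n_samples, dtype=bool)
sample_mask[sub_idx] = True

print("bootstrap weights:", boot_weight)
print("subsample weights:", sub_weight)
print("same as mask:", np.array_equal(sub_weight.astype(bool), sample_mask))
```

So expressing the subsample as weights would still make the tree
visit every row, including those with weight 0.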

>
> Best,
> Andy
>
> ------------------------------------------------------------------------------
> Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
> MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
> with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
> MVPs and experts. SALE $99.99 this month only -- learn more at:
> http://p.sf.net/sfu/learnmore_122412
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



-- 
Peter Prettenhofer
