Re: [Scikit-learn-general] on-demand feature for Random Decision Forests

Andy Thu, 20 Nov 2014 07:22:37 -0800

Hi Alex.

I am not super familiar with the internals of the trees, but I think it might be possible to implement this based on the scikit-learn trees without any patches. There are "splitter" and "criterion" classes that handle the feature processing, and it might be possible to define your own to implement feature sampling.

That said, I am not sure this is a good idea. The trees are heavily optimized for working with features and for sorting the features. For on-demand features, using a bucketing strategy is much more common, as no feature is seen twice.
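
To make the bucketing idea concrete, here is a minimal pure-Python sketch of how a single node could pick a split when features are computed on demand. It is not scikit-learn code and not taken from any of the libraries mentioned in this thread. The names best_bucketed_split, sample_params and feature_fn are hypothetical, and the equal-width buckets and Gini criterion are assumptions chosen for illustration. Each candidate feature is evaluated once on the node's samples, its values are dropped into a small histogram per class, and thresholds are tried only at bucket boundaries, so nothing needs to be sorted or cached.

```python
import random


def gini(counts):
    """Gini impurity of a list of per-class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def best_bucketed_split(samples, labels, n_classes, sample_params, feature_fn,
                        n_candidates=10, n_buckets=16, seed=0):
    """Return (impurity, params, threshold) of the best split found.

    sample_params(rng) draws the parameters of one candidate feature and
    feature_fn(sample, params) computes its value on demand (both are
    hypothetical callbacks supplied by the caller).
    """
    rng = random.Random(seed)
    n = len(samples)
    best = (float("inf"), None, None)
    for _ in range(n_candidates):
        params = sample_params(rng)
        # Compute the feature once for the samples at this node.
        values = [feature_fn(s, params) for s in samples]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # constant feature, no split possible
        width = (hi - lo) / n_buckets
        # One histogram row per bucket, one column per class.
        hist = [[0] * n_classes for _ in range(n_buckets)]
        for v, y in zip(values, labels):
            b = min(int((v - lo) / width), n_buckets - 1)
            hist[b][y] += 1
        total = [sum(col) for col in zip(*hist)]
        left = [0] * n_classes
        # Try a threshold at every inner bucket boundary.
        for b in range(n_buckets - 1):
            left = [l + h for l, h in zip(left, hist[b])]
            right = [t - l for t, l in zip(total, left)]
            score = (sum(left) * gini(left) + sum(right) * gini(right)) / n
            if score < best[0]:
                best = (score, params, lo + (b + 1) * width)
    return best


# Toy usage: one scalar "sample" per entry, two classes split at 50.
samples = list(range(100))
labels = [int(s >= 50) for s in samples]
score, params, threshold = best_bucketed_split(
    samples, labels, n_classes=2,
    sample_params=lambda rng: 1,      # trivial candidate parameter
    feature_fn=lambda s, p: s * p,    # trivial on-demand feature
)
```

The trade-off is that thresholds are limited to bucket boundaries, which is usually acceptable when many random candidate features are drawn per node.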

If your goal is vision applications, there is another gotcha, which is that input and output formats are quite different from what is used in scikit-learn. If you want to use random forests on vision applications, I would really recommend looking into the link that I posted earlier to curfil, or looking at the work done at Microsoft Cambridge. I think there is also an implementation in the Point Cloud Library.

To summarize, in principle I think you might be able to reuse the scikit-learn code to create trees with on-demand features. I think this is not a great idea, though, and in particular if you want to do computer vision applications, I'd highly recommend looking into other existing implementations.



Best,
Andy


On 11/20/2014 09:54 AM, Alexander Rüsch wrote:

Hey Andreas,
thanks a lot for your quick reply. The GIL-released functions are a bit difficult to handle, so I guess we should provide the most common functions. Maybe it is possible to make an add-on to enable the on-demand functionality?
One idea is to make a spin-off of the latest stable version and implement the on-demand functionality with a set of the most common functions to choose from. That way, the user only needs to implement GIL-released functions when really trying something new.
Or is it possible to make a patch to add the functions? This would probably be the most practical way to give easy access to new functionality. The need to generate a new patch for every new release of sklearn is a disadvantage that should be mentioned.
As you can see, I'm searching for a way to use the scikit-learn library as a strong basis and add my functionality. Because I am only interested in RDFs, I wonder whether there will be trouble if I just copy the "tree" section of sklearn to add my new DecisionTreeClassifier and import the rest of the scikit-learn library with respect to Cython? This way I get a small library on its own.
Or is there another safe way to create such an add-on for scikit-learn?


Best,
Alex
PS: I hope this email reaches the right thread; otherwise, note that I am replying to this message: <http://sourceforge.net/p/scikit-learn/mailman/message/33052675/>.

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
