Hey Deneche,

So here is yet another nice-to-have for DT :) It would be helpful to have a toDot method on the Node class that writes a dot file, which can then be visualized with Graphviz. I generated some of these, and while the graphs can sometimes be very large, they are also helpful.
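To make the request concrete, here is a minimal sketch of what such a method might look like. The classes below (SketchNode, SketchLeaf, SketchNumericalSplit) are hypothetical stand-ins made up for illustration, not the actual Mahout node classes, and the attribute/threshold layout is an assumption.

```java
// Hypothetical stand-ins for the tree classes; NOT Mahout's actual Node API.
abstract class SketchNode {
    /** Appends this subtree to out and returns the DOT id assigned to this node. */
    abstract int appendDot(StringBuilder out, int[] nextId);

    /** Renders the tree rooted at this node as a Graphviz digraph. */
    String toDot() {
        StringBuilder out = new StringBuilder("digraph tree {\n");
        appendDot(out, new int[] {0}); // one-element array used as a mutable id counter
        return out.append("}\n").toString();
    }
}

final class SketchLeaf extends SketchNode {
    private final int label;

    SketchLeaf(int label) {
        this.label = label;
    }

    @Override
    int appendDot(StringBuilder out, int[] nextId) {
        int id = nextId[0]++;
        out.append("  n").append(id)
           .append(" [shape=box, label=\"class ").append(label).append("\"];\n");
        return id;
    }
}

// Assumed split type: go to 'lo' when attribute < split, otherwise to 'hi'.
final class SketchNumericalSplit extends SketchNode {
    private final int attr;
    private final double split;
    private final SketchNode lo;
    private final SketchNode hi;

    SketchNumericalSplit(int attr, double split, SketchNode lo, SketchNode hi) {
        this.attr = attr;
        this.split = split;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    int appendDot(StringBuilder out, int[] nextId) {
        int id = nextId[0]++;
        out.append("  n").append(id)
           .append(" [label=\"attr ").append(attr).append(" < ").append(split).append("\"];\n");
        // Emit each child subtree, then the edge from this node to it.
        int loId = lo.appendDot(out, nextId);
        out.append("  n").append(id).append(" -> n").append(loId).append(" [label=\"yes\"];\n");
        int hiId = hi.appendDot(out, nextId);
        out.append("  n").append(id).append(" -> n").append(hiId).append(" [label=\"no\"];\n");
        return id;
    }
}
```

Calling toDot() on the root and saving the string to, say, tree.dot should let Graphviz render it with `dot -Tpng tree.dot -o tree.png`. The sketch recurses once per tree level, so very deep trees would probably want an iterative traversal or a depth cutoff.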
Andrey

On Wed, Nov 3, 2010 at 8:28 AM, deneche abdelhakim <[email protected]> wrote:

> I will have to investigate this further, just to make sure I didn't
> introduce any new bug. I will keep you informed. I shall add the depth
> limit as soon as possible.
>
> Deneche
>
> On Wed, Nov 3, 2010 at 2:22 AM, Andrey Gusev <[email protected]> wrote:
> > Just a note, I am observing a slight change in the accuracy numbers but
> > they are very small and probably just a result of slight changes at the
> > long branches of the trees. So overall, I think the fix works. Thanks again!
> > Andrey
> >
> > On Tue, Nov 2, 2010 at 1:49 PM, Andrey Gusev <[email protected]> wrote:
> >>
> >> Hey Deneche - that works on my dataset. Thanks!
> >> I am also still using the limited depth - this also helps limit the size
> >> of the model, but overall this fix does address infinite recursion that I
> >> was observing.
> >> Andrey
> >>
> >> On Mon, Nov 1, 2010 at 10:10 AM, deneche abdelhakim <[email protected]> wrote:
> >>>
> >>> Hey Andrey,
> >>>
> >>> I committed my changes to the trunk, this should -hopefully- fix your
> >>> infinite recursion problems. Please let me know if it worked for you.
> >>>
> >>> On Thu, Oct 28, 2010 at 5:38 AM, Andrey Gusev <[email protected]> wrote:
> >>> > Thanks Deneche!
> >>> >
> >>> > On Tue, Oct 26, 2010 at 9:12 PM, deneche abdelhakim <[email protected]> wrote:
> >>> >>
> >>> >> Hi Andrey,
> >>> >> Yes, this would be great. Actually, it was on my todo list for some
> >>> >> time now. And now that you have requested it, it should become top
> >>> >> priority for me. I'm just waiting for the release of Mahout-0.4 and
> >>> >> the end of the code freeze and I will add both the "maxDepth" to
> >>> >> DefaultTreeBuilder and start working on this one.
> >>> >> I also found what was causing the infinite recursion on your dataset,
> >>> >> a patch is available here:
> >>> >> https://issues.apache.org/jira/browse/MAHOUT-526
> >>> >> I should commit it as soon as the code freeze ends.
> >>> >>
> >>> >> Thanks for your feedback,
> >>> >> Deneche
> >>> >>
> >>> >> On Tue, Oct 26, 2010 at 10:00 PM, Andrey Gusev <[email protected]> wrote:
> >>> >> > Hey Deneche,
> >>> >> > I wanted to also let you know about another feature that may be
> >>> >> > useful for bagged decision trees. It would be nice to have an option
> >>> >> > of getting a confidence value (probability) along with the
> >>> >> > prediction. This could help for cases where precision needs to be
> >>> >> > increased with possibly lower recall values.
> >>> >> > For example, I modified the code to include confidence as the ratio
> >>> >> > of trees that have predicted a particular label - i.e. get counts for
> >>> >> > each label from all the trees and return confidence as the number of
> >>> >> > predictions for the label with the most predictions divided by the
> >>> >> > total number of bagged trees. What do you think?
> >>> >> > Andrey
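
For reference, the confidence idea at the bottom of the quoted thread (votes for the winning label divided by the total number of bagged trees) could be sketched roughly as follows. VoteResult and VoteConfidence are hypothetical names used only for illustration, not part of Mahout, and the sketch assumes every tree returns a label index in [0, numLabels).

```java
// Hypothetical helper types for illustration; NOT part of Mahout's API.
final class VoteResult {
    final int label;          // label predicted by the most trees
    final double confidence;  // fraction of trees that voted for that label

    VoteResult(int label, double confidence) {
        this.label = label;
        this.confidence = confidence;
    }
}

final class VoteConfidence {
    private VoteConfidence() {
    }

    /**
     * Majority vote over per-tree predictions.
     * Assumes each prediction is a label index in [0, numLabels).
     */
    static VoteResult vote(int[] treePredictions, int numLabels) {
        if (treePredictions.length == 0) {
            throw new IllegalArgumentException("need at least one tree prediction");
        }
        // Count how many trees voted for each label.
        int[] counts = new int[numLabels];
        for (int prediction : treePredictions) {
            counts[prediction]++;
        }
        // Pick the label with the most votes; ties go to the lowest index.
        int best = 0;
        for (int label = 1; label < numLabels; label++) {
            if (counts[label] > counts[best]) {
                best = label;
            }
        }
        double confidence = (double) counts[best] / treePredictions.length;
        return new VoteResult(best, confidence);
    }
}
```

A caller that cares more about precision than recall could then keep only the predictions whose confidence is above some threshold and treat the rest as unclassified.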
