While reviewing the Decision Forest code, I noticed that computing the "out-of-bag" (OOB) error of the forest while training it made the implementation really messy. I had to make a lot of assumptions about the way Hadoop works internally (especially the way it splits the data), and this has proven buggy many times: with each new version of Hadoop I had to tweak the code to make it run.
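For reference, the OOB estimate itself is simple when all the data sits on one machine: each instance is scored only by the trees whose bootstrap sample did not contain it, and the majority vote is compared to the true label. Here is a minimal single-machine sketch of that idea (the Tree interface, inBag() and classify() are hypothetical, not the actual DecisionForest API); the mess comes from doing the same thing across Hadoop's input splits, where no single mapper sees the whole dataset.

import java.util.BitSet;
import java.util.List;

public class OobErrorSketch {

  // Hypothetical tree abstraction: predicts a label and remembers
  // which rows of the training data were in its bootstrap sample.
  interface Tree {
    int classify(double[] instance);
    BitSet inBag();
  }

  // For each instance, aggregate votes only from trees that did NOT see it
  // during training, then compare the majority vote with the true label.
  static double oobError(List<Tree> forest, double[][] data, int[] labels, int numLabels) {
    int errors = 0;
    int counted = 0;
    for (int i = 0; i < data.length; i++) {
      int[] votes = new int[numLabels];
      for (Tree tree : forest) {
        if (!tree.inBag().get(i)) {      // only out-of-bag trees may vote
          votes[tree.classify(data[i])]++;
        }
      }
      int best = argMax(votes);
      if (best < 0) continue;            // instance was in-bag for every tree
      counted++;
      if (best != labels[i]) errors++;
    }
    return counted == 0 ? Double.NaN : (double) errors / counted;
  }

  private static int argMax(int[] votes) {
    int best = -1, bestCount = 0;
    for (int label = 0; label < votes.length; label++) {
      if (votes[label] > bestCount) {
        bestCount = votes[label];
        best = label;
      }
    }
    return best;
  }
}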
So I am asking users and developers alike: is computing the OOB error really necessary? If yes, I will spend the time to figure out a better way to compute it; if no, I will just get rid of it for now and leave a JIRA issue about bringing it back if someone actually needs it.
