Hi, I tried to use Mahout's PartialBuilder to build a random forest from our training data. In our data, the values of the first two attributes are all 0.0. However, after sampling by Mahout, some of the sampled datasets have the values -0.0 and NaN for the first two attributes, respectively. Consequently, the class OptIgSplit throws an ArrayIndexOutOfBoundsException in the method computeFrequencies(). The sizes of the sampled datasets also vary drastically: in our case, we got sizes of 9783, 19, 3, 4, 144, 12, …. Are these issues with the partial implementation? Thanks.
Ey-Chih Chow
