The current implementation doesn't support the ARFF format out of the box. As described in the wiki, you need to remove the header from the file and leave only the data. This implementation is fully compatible with UCI's datasets, which are comma-separated text files. You'll also need to run the dataset description tool (see the wiki) to generate a proper description file, which records the nature of each attribute (Numerical or Categorical).
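If it helps, here is a minimal Python sketch of the header-stripping step. It is not part of Mahout; the function names `strip_arff_header` and `convert_file` are made up for illustration. It assumes a dense ARFF file where everything up to and including the `@data` line is header and `%` starts a comment line, and it does not handle sparse ARFF rows or quoted values.

```python
def strip_arff_header(lines):
    """Return only the data rows of a dense ARFF file.

    Everything up to and including the '@data' line is treated as header.
    Blank lines and '%' comment lines are dropped everywhere.
    """
    data_rows = []
    in_data = False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        if not in_data:
            # ARFF keywords are case-insensitive, so compare in lower case
            if line.lower().startswith("@data"):
                in_data = True
            continue  # still inside the header
        data_rows.append(line)
    return data_rows


def convert_file(src_path, dst_path):
    """Write the data rows of the ARFF file at src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for row in strip_arff_header(src):
            dst.write(row + "\n")
```

Calling `convert_file("train.arff", "train.data")` (both file names are placeholders) should leave a plain comma-separated file of the kind described above. You still have to generate the description file separately.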
Yes, you can use BuildForest and TestForest to generate and use Random Forest models from the command line.

On Tue, Jul 12, 2011 at 2:19 PM, Xiaobo Gu <[email protected]> wrote:
> Hi,
>
> The Random Forest partial implementation in
> https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation
> use the ARFF file format, is ARFF the only supportted file format when
> using the BuildForest and TestForest program, and are BuildForest and
> TestForest program are official tools to build Random Forest models
> from the command line?
>
> Regards,
>
> Xiaobo Gu
>
