That would be very nice. Actually, I haven't tested most of the Mahout algorithms for that reason...
2011/7/25 Xiaobo Gu <[email protected]>
> Hi,
>
> Most of the time, Mahout algorithms take Vectors as the model training
> input but do not take care of how the instance vectors are generated, so
> every algorithm has its own way of doing it, which ties the required
> input file format to the specific algorithm. That causes a lot of
> work for actual users, especially command-line users. For
> example, if we want to build a Logistic Regression model and a Naïve
> Bayes model on the same data, we must prepare the data in two formats.
>
> Hence this request: could you provide a universal mechanism for
> handling input data, such as CSV together with a CSV-to-Vector
> encoder? Then all algorithms could use it, and users would only have
> to prepare their data as CSV.
>
> Regards,
>
> Xiaobo Gu
