Hi, I'm new to Mahout and to many machine learning ideas. From my reading of the Wikipedia entry on the Naive Bayes classifier (http://en.wikipedia.org/wiki/Naive_Bayes_classifier), it should be possible to train a Naive Bayes model on continuous, categorical and word-like features.
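To make the continuous-feature case concrete, here is a minimal sketch in plain Java of the Gaussian variant described in that Wikipedia article, applied to its person-classification example (height, weight, foot size). It deliberately uses no Mahout classes, since I don't know which Mahout API, if any, supports this. The class and method names (`GaussianNaiveBayesSketch`, `fit`, `classify`, `trainExample`) are my own, and the training numbers are copied as I understand them from the article, so treat them as illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Gaussian naive Bayes over continuous features only. Plain Java, no Mahout classes. */
final class GaussianNaiveBayesSketch {

    /** Class prior plus per-feature mean and sample variance for one class. */
    static final class ClassStats {
        final double prior;
        final double[] mean;
        final double[] variance;

        ClassStats(double prior, double[] mean, double[] variance) {
            this.prior = prior;
            this.mean = mean;
            this.variance = variance;
        }
    }

    /** Estimates mean and sample variance (n - 1 denominator) of each feature. */
    static ClassStats fit(double[][] rows, double prior) {
        int d = rows[0].length;
        double[] mean = new double[d];
        double[] variance = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < d; j++) {
            mean[j] /= rows.length;
        }
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                double diff = row[j] - mean[j];
                variance[j] += diff * diff;
            }
        }
        for (int j = 0; j < d; j++) {
            variance[j] /= (rows.length - 1);
        }
        return new ClassStats(prior, mean, variance);
    }

    /** log P(class) + sum over features of log N(x_j; mean_j, variance_j). */
    static double logScore(ClassStats s, double[] x) {
        double score = Math.log(s.prior);
        for (int j = 0; j < x.length; j++) {
            double diff = x[j] - s.mean[j];
            score += -0.5 * Math.log(2.0 * Math.PI * s.variance[j])
                    - (diff * diff) / (2.0 * s.variance[j]);
        }
        return score;
    }

    /** Returns the label whose log score is highest. */
    static String classify(Map<String, ClassStats> model, double[] x) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, ClassStats> e : model.entrySet()) {
            double s = logScore(e.getValue(), x);
            if (s > bestScore) {
                bestScore = s;
                best = e.getKey();
            }
        }
        return best;
    }

    /** Illustrative training data: {height (ft), weight (lbs), foot size (in)}. */
    static Map<String, ClassStats> trainExample() {
        double[][] male = {
            {6.00, 180, 12}, {5.92, 190, 11}, {5.58, 170, 12}, {5.92, 165, 10}
        };
        double[][] female = {
            {5.00, 100, 6}, {5.50, 150, 8}, {5.42, 130, 7}, {5.75, 150, 9}
        };
        Map<String, ClassStats> model = new LinkedHashMap<>();
        model.put("male", fit(male, 0.5));     // equal priors are an assumption
        model.put("female", fit(female, 0.5));
        return model;
    }

    public static void main(String[] args) {
        Map<String, ClassStats> model = trainExample();
        // Sample to classify: 6 ft, 130 lbs, foot size 8
        System.out.println(classify(model, new double[] {6.0, 130.0, 8.0})); // prints "female"
    }
}
```

Categorical and word-like features would need a different per-feature likelihood (for example, smoothed counts) added to the same log score; that part is not shown here.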
As far as I can tell, the 20news and wikipedia examples currently in Mahout use only a categorical target variable and text-like features. I'm trying to replicate the person-gender-guesser from the Wikipedia article using Mahout.

Can anyone give me any tips about how to:

* format input files (train and test) for different data types
* inform the trainer and classifier which features are continuous, categorical and word-like

My dataset is quite small, so I'd like to be able to process it in code (using Vectors, Models, etc.), but I'm quite confused about how to use the classifier.bayes packages to train a model with all my feature types.

Thanks in advance for any guidance.

Cheers,
--
Vijay Santhanam
Software Engineer
http://au.linkedin.com/in/vijaysanthanam
0407525087
