Hi,

A few quick Qs about classifiers in Mahout:

* Are any of the classifiers we have in Mahout MR-free? That is, do all classifier implementations run exclusively on top of Hadoop?
* Do any classifiers offer the option of basing classification on linguistic rules?

Concretely, I'm wondering if there is anything in Mahout that would let me classify very short text comments (about 300 bytes on average).

Here is an example from another system that uses linguistic rules (of some kind -- I don't have the details):

There's a rule that classifies items as "Customer Wants a Callback". It identifies comments such as "I want a manager to call me about my engine issue." It also finds comments such as "I want a refund." It has dictionaries and rules to discover parts of speech indicating that a callback is needed.

Another example is a rule that finds comments about Staff Speed. It identifies comments indicating that the staff was slow in performing their duties.

I think we have nothing in Mahout that would do the above, but I thought I'd ask.

Also, I am *guessing* the existing classifiers in Mahout would not do well with very short pieces of text?

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/
