Hi Mahouters, I just posted part 1 of a series on extracting text features for machine learning…
http://www.scaleunlimited.com/2013/07/10/text-feature-selection-for-machine-learning-part-1/

I decided it was time to try to share some of what I'd learned over the years in processing text for classification, clustering and other related ML tasks.

It undoubtedly has some things that are unclear or even incorrect, so please comment :)

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
