Hi, I am new to Java and to machine learning. I am looking for a way to extract keywords (names of people, organizations, places, etc.) from news stories, sorted by relevance. I found several web services, such as OpenCalais, that provide a similar service, but they don't detect most of my terms. I have a list of approved keywords and only need to detect terms from that list.
I found out about machine learning and got interested in the concept. I read somewhere that Mahout's classification feature can be used to detect keywords by classifying terms as keywords or non-keywords. I have been trying to learn Mahout for the past 30 hours but haven't gotten anywhere. I don't want to spend more time learning it if Mahout is not the right tool for my problem. Can someone provide details on using Mahout for term extraction? Is it possible to do this with little to medium knowledge of Java? Is it overkill to use Mahout for this? Should I go for an NLP solution instead?
Thanks,
Joyce
