Hi! Could someone please shed some light on the following newbie questions? I'd definitely like to use Mahout (currently I'm using Kea) for keyphrase extraction based on a controlled vocabulary, so here's my situation:
1. I'm using a modified version of Kea (http://www.nzdl.org/Kea/) that is capable of getting its controlled vocabulary from any SAIL-RDF-Repository (http://www.openrdf.org). Kea has two modes for extracting keyphrases from a text document: a free one and a controlled one (which checks keyphrase candidates against a SKOS thesaurus).
2. It's possible to train such an extraction model on the fly via a web interface. (I should admit that I'm not convinced that "training" is the correct term for giving an extraction model new input data: people would assume it's getting better and better, but in most cases it's only getting different.)
3. What I'd also like to achieve: if someone creates a new skos:Concept with some skos:prefLabel, I'd like to suggest where to place this new concept in the thesaurus (i.e. suggest what could be its skos:narrower or skos:broader concepts). Currently I'm doing this via the bridge of indexed documents. For example: a new skos:Concept gets the skos:prefLabel "house"; I search for all documents containing "house", count the already existing skos:Concepts these documents are tagged with, and print out the list of concepts with the number of their occurrences.

- Does anyone have experience with extracting keyphrases from a document using Mahout?
- Does anyone have experience with extracting keyphrases based on a controlled vocabulary from a document using Mahout?
- Does anyone have experience with making thesaurus suggestions? E.g. in my thesaurus there's a concept with prefLabel "xml", and someone enters a new concept "extensible markup language": how could I suggest not creating a new concept, but using "extensible markup language" as an altLabel for "xml" instead?
- Could someone point me in the right direction for a basic intro to keyphrase extraction using Mahout?

Any help or comments are really appreciated.

wkr
www.turnguard.com
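
P.S. To make the counting approach from point 3 concrete, here is a rough Python sketch of the idea. It is only an illustration under assumptions, not the actual Kea/Sesame code: the document structure, the function name `suggest_related_concepts`, and the toy data are all made up, and a plain substring match stands in for a real search index.

```python
from collections import Counter


def suggest_related_concepts(label, documents):
    """Count the concepts attached to every document whose text contains `label`.

    `documents` is a hypothetical in-memory index: a list of dicts with a
    "text" string and a "concepts" list of already assigned concept labels.
    """
    needle = label.lower()
    counts = Counter()
    for doc in documents:
        # Naive substring match; a real setup would query a search index.
        if needle in doc["text"].lower():
            # Count each concept at most once per matching document.
            counts.update(set(doc["concepts"]))
    # Most frequent first; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up toy data, for illustration only.
documents = [
    {"text": "A house with a garden", "concepts": ["building", "garden"]},
    {"text": "House prices are rising", "concepts": ["building", "economy"]},
    {"text": "XML parsing basics", "concepts": ["xml"]},
]

for concept, occurrences in suggest_related_concepts("house", documents):
    print(concept, occurrences)
```

The concepts at the top of the list are the candidates for where the new concept might belong in the thesaurus; whether a candidate should be a broader or a narrower concept is not something this counting can decide on its own.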
