Hi Dave,
Both types of key artifacts are part of the default KIM pipeline, i.e. they run in a standard GATE pipeline. The key phrase extraction was originally developed by Kalina Bontcheva (USFD) and probably others at USFD. We took it some years ago and worked together to extend it. It is now available in GATE: check the available CREOLE plugins and search for Keyphrase. It is in /plugins/Keyphrase_Extraction_Algorithm

The module is based on TF.IDF, where the document frequency in IDF is calculated on a pre-defined corpus during the training of the model. You can limit the size of the model and the number of tokens in a phrase (e.g. taking only phrases 2 to 3 tokens long). At runtime you can specify how many keyphrases you'd like to get per document.
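
To make the idea concrete, here is a rough Python sketch of TF.IDF keyphrase scoring with the document frequencies taken from a fixed training corpus. It is only an illustration and not the code of the GATE plugin: the function names (ngrams, train, keyphrases), the parameters (min_len, max_len, max_model_size, top_k), the simple tokenizer, the smoothed IDF formula and the "keep the most frequent phrases" rule for limiting the model size are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def ngrams(text, min_len=2, max_len=3):
    """Yield lowercase token n-grams that are min_len..max_len tokens long."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def train(corpus, min_len=2, max_len=3, max_model_size=None):
    """Count, for each phrase, how many training documents contain it."""
    df = Counter()
    for doc in corpus:
        # set() so that a phrase is counted once per document
        df.update(set(ngrams(doc, min_len, max_len)))
    if max_model_size is not None:
        # Assumption: limit the model by keeping the most frequent phrases
        df = Counter(dict(df.most_common(max_model_size)))
    return {"df": df, "n_docs": len(corpus),
            "min_len": min_len, "max_len": max_len}


def keyphrases(model, text, top_k=5):
    """Return the top_k phrases of text ranked by TF * IDF."""
    tf = Counter(ngrams(text, model["min_len"], model["max_len"]))
    total = sum(tf.values())
    scores = {}
    for phrase, count in tf.items():
        # Smoothed IDF; phrases unseen in training get df = 0
        idf = math.log((1 + model["n_docs"]) / (1 + model["df"][phrase]))
        scores[phrase] = (count / total) * idf
    # Highest score first, ties broken alphabetically
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_k]
```

Here train() corresponds to building the model on the pre-defined corpus, and top_k in keyphrases() is the runtime setting for how many keyphrases to return per document.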

Although we've changed it, I'm pretty certain that you would be able to get similar results easily with what is available in GATE.

The key entity identification components are derived from this one, but they rely on a unique (across the entire corpus) identifier for each entity, in our case the URIs of instances in a knowledge base. Without it, you cannot compute the statistics. I do not think that this functionality is available in GATE, mainly because you do not have this unique ID capability there. However, with all the ontology extensions that the community has introduced in recent years, I might be wrong, so please check with the GATE list.
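
As a rough illustration of why the unique identifier matters, the sketch below counts entities by URI rather than by surface text, so that different mentions of the same instance are pooled before the statistics are computed. It is not the KIM implementation: the function name key_entities, the (surface_text, uri) input format, the scoring formula and the example.org URIs in the usage note are all hypothetical.

```python
import math
from collections import Counter


def key_entities(doc_mentions, corpus_mentions, top_k=3):
    """Rank the entity URIs of one document by a TF.IDF-style score.

    doc_mentions: list of (surface_text, uri) pairs for the document.
    corpus_mentions: one such list per document of the reference corpus.
    """
    df = Counter()
    for mentions in corpus_mentions:
        # Count each URI once per corpus document
        df.update({uri for _, uri in mentions})
    # Mentions with different surface text but the same URI are pooled
    tf = Counter(uri for _, uri in doc_mentions)
    n_docs = len(corpus_mentions)
    scores = {
        uri: count * math.log((1 + n_docs) / (1 + df[uri]))
        for uri, count in tf.items()
    }
    # Highest score first, ties broken by URI
    return sorted(scores, key=lambda u: (-scores[u], u))[:top_k]
```

For example, mentions "Ontotext" and "Ontotext AD" that both resolve to a made-up URI such as http://example.org/Ontotext would be counted as two occurrences of one entity, which is not possible if you only have the text of the annotations.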
        
all the best
 borislav

On Mar 2, 2010, at 4:49 PM, Harrill, David C wrote:

To whom it may concern,

In working with the KIM tool, I came across the Document Detail screen, which displays both the Features associated with the document and the document content. Within the Features section, there exist two Features (KeyEntities and KeyPhrases). Are these two features derived from the GATE application and, if so, using what GATE plug-in? Otherwise, how do these entities and phrases get populated on this screen? I appreciate any information you can provide on this matter and I look forward to hearing from you.

Thanks,
Dave

_______________________________________________
Kim-discussion mailing list
[email protected]
http://ontotext.com/mailman/listinfo/kim-discussion

