[Kim-discussion] Fwd: KIM_Entity_Search

borislav popov Tue, 02 Mar 2010 09:43:05 -0800

sorry guys. sent this only to dave. fwding to the list as well
b


Begin forwarded message:

From: borislav popov <[email protected]>
Date: March 2, 2010 6:19:10 PM GMT+02:00
To: "Harrill, David C" <[email protected]>
Subject: Re: [Kim-discussion] KIM_Entity_Search

Dave
        you came to an existential question. the simple answer is:
- if you need pure text analysis and you know what to do with the results afterwards - you need only GATE.
- if you need annotations with respect to some structured data sets - like knowledge bases, conceptual models ... that allows search and navigation based on FTS, structure of the data set, co-occurrence and a combination of those; if you need ways to obtain content through rss feeds or focussed crawling; etc. - you need KIM.
So briefly - text analysis we have by default or produce for customers is GATE compliant - and in most cases can be executed within a pure GATE environment. GATE embedded is integral part of KIM for modeling documents, annotations, corpora, etc.
The rest are content feeding services, semantic annotation, indexing & search. The Web UI you know is just an example one - often customers choose completely different route.
Beside this we provide customizations of everything in a kim-based system - from crawlers, IE pipelines to background knowledge and search. And support our customers as they go on.
for GATE only: also there you might stumble upon stuff that is not present with the main package - like parallel processing, annotation patterns based search, manual curation infrastructure
for these and more we have joint offerings with the GATE group and/or other partners, like Matrixware.
So the answer is not that simple as the dependency is multi-layered. There are also several products that are still not public in which we heavily cooperate with GATE, but bits and pieces go into custom projects for customers already.
i remember a project several years ago where we had to explain how in GATE there is a KIM Client calling a KIM Server which is based on GATE. phew.
So when you know what you would like to do - we hope this info will help you take the right route and not waste time.
all the best
b


On Mar 2, 2010, at 5:46 PM, Harrill, David C wrote:
Borislav,
Thanks for your quick reply. I wanted to ask an additional question pertaining to the overall KIM application. I have also been working with GATE and have been attempting to differentiate the two applications i.e. What overall capability that KIM provides that GATE does not (with associative plug-ins). I have been reviewing the vast documentation for both applications and have been unable to come up with a clear cut difference (excluding the wonderful search capabilities in KIM). Could you potentially provide me with information on why an individual would primarily use GATE over KIM or vice versa. Again, thanks for your assistance in regard to this matter.
Dave

From: borislav popov [mailto:[email protected]]
Sent: Tuesday, March 02, 2010 10:28 AM
To: Harrill, David C
Cc: [email protected]
Subject: Re: [Kim-discussion] KIM_Entity_Search

Hi Dave,
both types of key artifacts are a part of the default kim pipeline, i.e. they are running in a standard GATE pipeline.
The key phrase extraction has been originally developed by Kalina Bontcheva (USFD) and probably others at USFD. We took it some years ago and worked together to extend it. It is now available in GATE - check the creole plugins available and search for Keyphrase. It is in /plugins/Keyphrase_Extraction_Algorithm
The module is based on TF.IDF, where the document frequency in IDF is calculated on a pre-defined corpus during the training of the model. You can limit the size of the model, the number of tokens in a phrase (e.g. taking only phrases 2 to 3 tokens of length). During runtime you can specify how many keyphrases you'd like to get per doc.
I'm pretty certain, although we've changed it, that you would be able to get similar results easily with what is available in GATE.
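To make the TF.IDF description above concrete, here is a minimal Python sketch of the idea: document frequencies are collected once from a training corpus, phrases are limited to a token-length range, the model size can be capped, and at runtime you ask for the top N phrases of one document. This is not the code of the GATE plugin or of KIM. The function names (train_model, keyphrases), the whitespace tokenization and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, min_len=2, max_len=3):
    """Yield every phrase of min_len..max_len consecutive tokens."""
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def train_model(corpus, min_len=2, max_len=3, max_size=None):
    """Count, per phrase, how many training documents contain it.

    max_size (hypothetical parameter) keeps only the most frequent
    phrases, standing in for "limit the size of the model".
    """
    df = Counter()
    for doc in corpus:
        # set(): a phrase counts once per document, however often it occurs
        df.update(set(ngrams(doc.lower().split(), min_len, max_len)))
    if max_size is not None:
        df = Counter(dict(df.most_common(max_size)))
    return {"df": df, "n_docs": len(corpus),
            "min_len": min_len, "max_len": max_len}


def keyphrases(model, text, top_n=5):
    """Rank the phrases of one document by TF * IDF, return the top_n."""
    tf = Counter(ngrams(text.lower().split(),
                        model["min_len"], model["max_len"]))
    n = model["n_docs"]
    # Smoothed IDF (an assumption): phrases unseen in training score highest,
    # phrases present in every training document score zero.
    scores = {p: c * math.log((1 + n) / (1 + model["df"].get(p, 0)))
              for p, c in tf.items()}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [p for p, _ in ranked[:top_n]]
```

For example, train_model(docs, min_len=2, max_len=3) followed by keyphrases(model, text, top_n=10) would correspond to "phrases 2 to 3 tokens of length" and "how many keyphrases you'd like to get per doc".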
The key entities identification components are derived from this one, but they count on unique (for the entire corpus) identifier of entities - in our case URIs of instances in a knowledge base. Without it - you can not do the stats. I do not think that this functionality is available in GATE - mainly because you do not have this unique ID capability there - although with all the ontology extensions that the community introduced in the recent years - i might be wrong - so please check with the gate list.
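A small Python sketch of why the unique identifier matters: when every mention carries the URI of a knowledge base instance, different surface forms of the same entity are counted together, and the same TF.IDF-style statistic can be computed over URIs instead of phrases. This is only an illustration, not KIM's implementation. The (surface text, URI) pair format, the function names and the ex: URIs are hypothetical.

```python
import math
from collections import Counter


def entity_df(corpus_annotations):
    """Per URI, count the documents in which that entity is mentioned."""
    df = Counter()
    for doc in corpus_annotations:
        df.update({uri for _surface, uri in doc})
    return df


def key_entities(doc_annotations, df, n_docs, top_n=3):
    """Rank the entities of one document by TF * IDF over URIs."""
    # Counting by URI merges "IBM", "Big Blue", ... into one entity.
    tf = Counter(uri for _surface, uri in doc_annotations)
    scores = {u: c * math.log((1 + n_docs) / (1 + df.get(u, 0)))
              for u, c in tf.items()}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [u for u, _ in ranked[:top_n]]


# Hypothetical annotations: (surface text, URI of the KB instance)
corpus = [
    [("IBM", "ex:IBM")],
    [("Sofia", "ex:Sofia")],
    [("Sofia", "ex:Sofia")],
]
doc = [
    ("IBM", "ex:IBM"),
    ("International Business Machines", "ex:IBM"),
    ("Big Blue", "ex:IBM"),
    ("Ontotext", "ex:Ontotext"),
    ("Sofia", "ex:Sofia"),
]
top = key_entities(doc, entity_df(corpus), n_docs=len(corpus), top_n=2)
```

Counting the surface strings instead would treat the three IBM mentions as three unrelated items, which is the "you can not do the stats" problem described above.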
all the best
 borislav

On Mar 2, 2010, at 4:49 PM, Harrill, David C wrote:


To whom it may concern,
In working with the KIM tool, I came across the Document Detail screen which displays both the Features associated with the document as well as the document content. Within the Features section, there exists two Features (KeyEntities and KeyPhrases). Are these two features derived from the GATE application and if so using what GATE plug-in? Otherwise how do these entities and phrases get populated on this screen. I appreciate any information you can provide on this matter and I look forward to hearing from you in regard to this matter.
Thanks,
Dave

_______________________________________________
Kim-discussion mailing list
[email protected]
http://ontotext.com/mailman/listinfo/kim-discussion
