Hello Philip, thank you for the response. Can you please provide a list of
languages that can be handled by KIM out of the box or following the given
process?

Regards,
Naaman Musawwir.

-----Original Message-----
From: Philip Alexiev [mailto:philip.alex...@ontotext.com] 
Sent: Wednesday, February 15, 2012 5:35 PM
To: Naaman Musawwir
Cc: kim-discussion@ontotext.com
Subject: Re: [Kim-discussion] Processing Spanish Laguange Documents

Hello Naaman,

KIM does not support Spanish out of the box. The knowledge base of the
public distribution is aimed at generic information extraction, such as
world news, and contains the most famous Persons, Locations and
Organizations around the world. This knowledge is probably too general
for your domain of interest, so some amount of tuning will be required.

There are generally two aspects of the named entity recognition process. 

Gazetteer Lookups
This is the process of recognizing well-known objects in the text. You need
to supply a comprehensive set of named entities in Spanish, which will feed
the gazetteer. The gazetteer will then be able to match them in the texts
and create the corresponding annotations over them. More information on
this can be found in KIM's system documentation* under Administration ->
Extending the KIM ontology and knowledge base.
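
To make the idea concrete, here is a minimal Python sketch of what a
gazetteer lookup does conceptually. It is not KIM's or GATE's API: the
entries, the gazetteer_lookup function and the tuple layout are all
hypothetical, and in KIM the entities would come from the knowledge base
rather than from a dictionary in code.

```python
import re

# Hypothetical sample entries (assumption); a real gazetteer would be fed
# from a much larger set of Spanish named entities.
GAZETTEER = {
    "Madrid": "Location",
    "Banco de España": "Organization",
    "Miguel de Cervantes": "Person",
}

def gazetteer_lookup(text, gazetteer):
    """Return (start, end, matched_text, entity_type) for each known name found."""
    # Try longer names first so a multi-word entry wins over a shorter one.
    names = sorted(gazetteer, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # The lookarounds keep a name from matching inside a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    return [
        (m.start(), m.end(), m.group(0), gazetteer[m.group(0)])
        for m in pattern.finditer(text)
    ]

sample_text = "Miguel de Cervantes nació cerca de Madrid."
annotations = gazetteer_lookup(sample_text, GAZETTEER)
```

Each tuple plays the role of an annotation: a span of the text plus the
class of the entity that was matched there.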

New Entities
Recognizing entities that are not yet in the knowledge base is a more
complicated, composite approach that combines different techniques and
rules. For example, titles like "Mayor" and "Mrs." can signal that a
person's name follows. You can start by looking at the grammars that are
being loaded (KIM/context/default/resources/grammar/main/main.jape) and the
rules that create annotations. These rules may include direct text matching
of the context around the annotation or matching previously created
annotations.
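
As a rough illustration of a title-based rule, here is a small Python
sketch. It is not JAPE and not the content of main.jape; the TITLES list,
the find_titled_persons function and the "capitalised words after a title"
heuristic are assumptions made only for this example. The real rules work
on annotations inside the pipeline and are considerably richer.

```python
import re

# Hypothetical title list (assumption); a Spanish setup would need its own.
TITLES = ["Mayor", "Mrs.", "Sr.", "Sra.", "Alcalde"]

def find_titled_persons(text, titles=TITLES):
    """Return (start, end, name) for capitalised names that follow a title."""
    title_alt = "|".join(re.escape(t) for t in titles)
    # One capitalised word, optionally followed by more capitalised words.
    name = r"[A-ZÁÉÍÓÚÑ][\w'-]*(?:\s+[A-ZÁÉÍÓÚÑ][\w'-]*)*"
    pattern = re.compile(r"(?<!\w)(?:" + title_alt + r")\s+(" + name + r")")
    return [(m.start(1), m.end(1), m.group(1)) for m in pattern.finditer(text)]

sample = "Ayer la Sra. Carmen Ruiz habló con Mayor Tom Reed en Madrid."
candidates = find_titled_persons(sample)
```

The names found this way are only candidates for new Person entities; the
title is the context that triggers the rule, not part of the name.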

* http://www.ontotext.com/sites/default/files/kim/KimDocs-3.0-EN.zip

Hope this helps
Philip Alexiev
Software Engineer, KIM team


On 14 Feb 2012, at 3:30 PM, Naaman Musawwir wrote:

> Hello,
> 
> We are going to try keyword extraction on some documents that are in
> Spanish. Please advise how to configure my KIM instance to do that.
> 
> Regards,
> Naaman Musawwir.
> 
> 
> _______________________________________________
> Kim-discussion mailing list
> Kim-discussion@ontotext.com
> http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion




