Re: [Kim-discussion] need help with Integrating DBpedia in KIM

2013-10-21 Thread Philip Alexiev

Hello Jeroen,

The file was incomplete indeed. I have updated it. Now you can download it.
The dbpedia instances are inside.

All the best

Philip Alexiev
Senior Software Engineer
Ontotext

On 10/21/2013 08:27 AM, Jeroen Lapre' wrote:

I'm trying to follow the Example - Integrating DBpedia in KIM:

https://confluence.ontotext.com/display/KimDocs37EN/Example+-+Integrating+DBpedia+in+KIM

But I can't seem to find the *dbpedia_instances.nt* file it mentions, which is 
supposed to go in /context/default/kb/dbpedia/.


Also, the link to dbpedia-ontology.zip 
<https://confluence.ontotext.com/download/attachments/33327689/dbpedia-ontology.zip?version=1&modificationDate=1335269372000> appears 
to be an incomplete file.


Can someone help me with these issues?

thanks
Jeroen



___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion




Re: [Kim-discussion] KIM IE w/ extended knowledge base

2013-10-03 Thread Philip Alexiev

Glad I could help.

Good luck with your PhD.

All the best

Philip Alexiev
Senior Software Engineer
Ontotext

On Oct 3, 2013, at 6:31 PM, Karl Hammar  wrote:

> Hi,
> 
> That explains the problem. I've now tested loading the data at initial launch 
> through the owlim config file, and entity recognition works just as expected. 
> I'm very impressed, this will make for some nice demos w/ project 
> stakeholders. Many thanks for your quick and helpful response!
> 
> /Karl
> 
> --
> Karl Hammar, Tekn. Lic., M.Sc.
> Program Manager, Software Engineering and Mobile Platforms
> School of Engineering, Jönköping University
> phone +46 (0)36 101611
> mobile +46 (0)73 509 5910
> 
> 3 okt 2013 kl. 16:31 skrev Philip Alexiev :
> 
>> Hello Karl,
>> 
>> The RDF you generated seems perfectly valid.
>> 
>> I will explain a little more about what happens under the hood when KIM 
>> starts.
>> 
>> When the platform starts, it first loads the RDF data described in the Owlim 
>> repository template in KIM/config/owlim.ttl. After this, a GATE pipeline 
>> is run, including the resources in its configuration file: 
>> KIM/context/default/resources/IE.gapp. One of the resources is a gazetteer. 
>> It creates an in-memory dictionary of terms. This dictionary is filled by a 
>> query to the semantic store (Owlim). The same dictionary is then used to 
>> recognize the entities in the analyzed documents.
>> 
>> In short:  you should put your persons RDF in KIM and add it to the 
>> owlim:imports section in KIM/config/owlim.ttl  (don't forget to add a 
>> default namespace for the import in owlim:defaultNS).  After this, clear the 
>> old indexes. Do that by deleting the contents of the 
>> KIM/context/default/populated folder.  This will also remove all your 
>> previously added documents.  Then start KIM again and reprocess the 
>> documents.
>> 
>> Ask if you have more problems.
>> 
>> Thank you for your interest in KIM.
>> 
>> Philip Alexiev
>> Senior Software Engineer
>> Ontotext
>> 
>> On Oct 3, 2013, at 5:05 PM, Karl Hammar  wrote:
>> 
>>> Hi,
>>> 
>>> I'm testing out KIM, to see whether it is suitable for a project in the 
>>> scope of my PhD work. I'm having a bit of trouble getting it to correctly 
>>> identify entities in text that I populate it with, and would like to check 
>>> if I'm missing something obvious. 
>>> 
>>> What I've done is taken a dataset of some 1600 people, originally expressed 
>>> using FOAF, and adapted it to suit PROTON, before adding it to the KIM KB 
>>> using the rdf import tool. The adaptation included adding mainAlias, 
>>> mainLabel, and generatedBy predicates as described on 
>>> (https://confluence.ontotext.com/display/KimDocs37EN/Extending+the+instance+base),
>>>  and changing their type from foaf:Person to PROTON Person. When using the 
>>> Structure tab of the KIM Web UI I can query for my added entities and find 
>>> the people in the system with no problem. KIM even finds pictures of the 
>>> people in question, which is kind of cool, as these pictures were not 
>>> linked in the original data!
>>> 
>>> However, when adding documents using the populater tool, rather than 
>>> annotating those documents with the existing people entities, additional 
>>> new entities are created and used for annotation instead. These new 
>>> entities are typed as Person and have only a label and no other additional 
>>> data. 
>>> 
>>> Attached you find example data - one file contains the triples associated 
>>> with me, as resulting from the initial RDF import. I'd prefer if the 
>>> entities in this file were used when annotating documents. The other file 
>>> contains the triples generated by KIM/GATE when it comes across the string 
>>> "Karl Hammar" in an input document. I'd prefer if this was not used.
>>> 
>>> What am I missing?
>>> 
>>> Best regards,
>>> 
>>> Karl Hammar
>>> 
>>> --
>>> Karl Hammar, Tekn. Lic., M.Sc.
>>> Program Manager, Software Engineering and Mobile Platforms
>>> School of Engineering, Jönköping University
>>> phone +46 (0)36 101611
>>> mobile +46 (0)73 509 5910


Re: [Kim-discussion] KIM IE w/ extended knowledge base

2013-10-03 Thread Philip Alexiev
Hello Karl,

The RDF you generated seems perfectly valid.

I will explain a little more about what happens under the hood when KIM starts.

When the platform starts, it first loads the RDF data described in the Owlim 
repository template in KIM/config/owlim.ttl. After this, a GATE pipeline is 
run, including the resources in its configuration file: 
KIM/context/default/resources/IE.gapp. One of the resources is a gazetteer. It 
creates an in-memory dictionary of terms. This dictionary is filled by a query 
to the semantic store (Owlim). The same dictionary is then used to recognize 
the entities in the analyzed documents.

In short:  you should put your persons RDF in KIM and add it to the 
owlim:imports section in KIM/config/owlim.ttl  (don't forget to add a default 
namespace for the import in owlim:defaultNS).  After this, clear the old 
indexes. Do that by deleting the contents of the KIM/context/default/populated 
folder.  This will also remove all your previously added documents.  Then start 
KIM again and reprocess the documents.
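
Editor's sketch of the bookkeeping described above, in Python. It assumes 
that the owlim:imports and owlim:defaultNS values are semicolon-separated 
lists with one namespace per imported file; check that against your own 
KIM/config/owlim.ttl before relying on it. The helper name, file names and 
namespaces are hypothetical.

```python
def add_import(imports, default_ns, rdf_file, namespace, sep=";"):
    """Append an RDF file and its default namespace to two parallel lists.

    `imports` and `default_ns` are the current string values of the two
    settings. ASSUMPTION: both are separator-joined lists of equal length.
    Returns the two updated strings; nothing is written to disk.
    """
    files = [f for f in imports.split(sep) if f.strip()]
    spaces = [n for n in default_ns.split(sep) if n.strip()]
    if len(files) != len(spaces):
        raise ValueError("imports and defaultNS must have the same number of entries")
    if rdf_file not in files:  # adding the same file twice is a no-op
        files.append(rdf_file)
        spaces.append(namespace)
    return sep.join(files), sep.join(spaces)


if __name__ == "__main__":
    # Hypothetical existing values and a hypothetical persons file.
    new_imports, new_ns = add_import(
        "kb/owl/protons.owl;kb/wkb.nt",
        "http://example.org/protons#;http://example.org/wkb#",
        "kb/persons.nt",
        "http://example.org/persons#",
    )
    print(new_imports)
    print(new_ns)
```

The function only computes the new values; pasting them into the 
configuration file and clearing KIM/context/default/populated remain manual 
steps.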

Ask if you have more problems.

Thank you for your interest in KIM.

Philip Alexiev
Senior Software Engineer
Ontotext

On Oct 3, 2013, at 5:05 PM, Karl Hammar  wrote:

> Hi,
> 
> I'm testing out KIM, to see whether it is suitable for a project in the scope 
> of my PhD work. I'm having a bit of trouble getting it to correctly identify 
> entities in text that I populate it with, and would like to check if I'm 
> missing something obvious. 
> 
> What I've done is taken a dataset of some 1600 people, originally expressed 
> using FOAF, and adapted it to suit PROTON, before adding it to the KIM KB 
> using the rdf import tool. The adaptation included adding mainAlias, 
> mainLabel, and generatedBy predicates as described on 
> (https://confluence.ontotext.com/display/KimDocs37EN/Extending+the+instance+base),
>  and changing their type from foaf:Person to PROTON Person. When using the 
> Structure tab of the KIM Web UI I can query for my added entities and find 
> the people in the system with no problem. KIM even finds pictures of the 
> people in question, which is kind of cool, as these pictures were not linked 
> in the original data!
> 
> However, when adding documents using the populater tool, rather than 
> annotating those documents with the existing people entities, additional new 
> entities are created and used for annotation instead. These new entities are 
> typed as Person and have only a label and no other additional data. 
> 
> Attached you find example data - one file contains the triples associated 
> with me, as resulting from the initial RDF import. I'd prefer if the entities 
> in this file were used when annotating documents. The other file contains the 
> triples generated by KIM/GATE when it comes across the string "Karl Hammar" 
> in an input document. I'd prefer if this was not used.
> 
> What am I missing?
> 
> Best regards,
> 
> Karl Hammar
> 
> --
> Karl Hammar, Tekn. Lic., M.Sc.
> Program Manager, Software Engineering and Mobile Platforms
> School of Engineering, Jönköping University
> phone +46 (0)36 101611
> mobile +46 (0)73 509 5910


Re: [Kim-discussion] Editing KIM Ontology

2013-07-03 Thread Philip Alexiev
Hello Naaman,

Whether you use the World Knowledge Base (WKB) that comes with KIM is a matter 
of choice. It contains only named entities. The taxonomy (definitions of the 
classes of objects and the relations between them) is actually Proton and is 
located in KIM/context/default/kb/owl/. Most parts of KIM work with Proton, 
which is why your new instances should also map to it. You can create your own 
domain-specific knowledge base, which is actually the preferred approach. The 
WKB is too generic and is intended for recognizing entities in global news.

You can check the documentation here:
https://confluence.ontotext.com/display/SSDC/Semantic+Solutions+Docs+Collection
It has a whole section about extending the data and the ontology.

All the best,
Philip

On Jul 3, 2013, at 6:14 PM, Naaman Musawwir  wrote:

> Hello Philip, thank you for the response.
>  
> I went through the included files. Looks like data instances are mainly in 
> the file wkb.nt.
>  
> Our target is to extract from documents only the terms related to the 
> “Consumer Electronics” field. For this purpose we need to clean up the 
> current ontology and feed in a related ontology.
>  
> Do you know of an existing knowledge base that can fulfill our need? Also, 
> please explain how we can add or remove a term in the existing knowledge base.
>  
> Regards,
> Naaman Musawwir.
>  
> From: Philip Alexiev [mailto:philip.alex...@ontotext.com] 
> Sent: Thursday, June 20, 2013 12:26 PM
> To: Naaman Musawwir
> Cc: kim-discussion@ontotext.com
> Subject: Re: Editing KIM Ontology
>  
> Hello Naaman,
>  
> On first start, KIM reads the list of RDF files to import into the semantic 
> repository from the repository configuration file: KIM/config/owlim.ttl. You 
> can change the list of files to be loaded or modify their content.
>  
> When the system is started for the first time, a semantic repository is 
> created based on the settings in this configuration file. Subsequent 
> stop/start actions do not read the file again; they load the image of the 
> already created repository. So if you change the configuration or data, you 
> will have to clear the state of the system. This is done by deleting 
> everything under KIM/context/default/populated/. It will wipe all data 
> currently in the system, including all documents, and the next start will 
> create a fresh repository.
>  
> Hope this helps,
> Philip Alexiev
> Senior Software Engineer
> Ontotext
>  
> On Jun 19, 2013, at 9:02 PM, "Naaman Musawwir"  
> wrote:
> 
> 
> Hello there,
>  
> We are in need of adding/deleting some terms in the existing Ontology. Please 
> guide how to do that? Are there some editable files to do that and how can we 
> use those?
>  
> Regards,
> Naaman Musawwir.
>  


Re: [Kim-discussion] Editing KIM Ontology

2013-06-20 Thread Philip Alexiev
Hello Naaman,

On first start, KIM reads the list of RDF files to import into the semantic 
repository from the repository configuration file: KIM/config/owlim.ttl. You 
can change the list of files to be loaded or modify their content.

When the system is started for the first time, a semantic repository is created 
based on the settings in this configuration file. Subsequent stop/start actions 
do not read the file again; they load the image of the already created 
repository. So if you change the configuration or data, you will have to clear 
the state of the system. This is done by deleting everything under 
KIM/context/default/populated/. It will wipe all data currently in the system, 
including all documents, and the next start will create a fresh repository.
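
Editor's sketch of the reset step described above, in Python. It deletes 
everything under context/default/populated but keeps the folder itself. The 
function name is hypothetical, and running it only while KIM is stopped is an 
assumption on my part; the only detail taken from the message is the folder 
path. The deletion is irreversible.

```python
import shutil
from pathlib import Path


def clear_populated(kim_home):
    """Delete the contents of <kim_home>/context/default/populated.

    The folder itself is kept. Returns the names of the removed entries,
    or an empty list if the folder does not exist.
    ASSUMPTION: KIM is not running while this is called.
    """
    populated = Path(kim_home) / "context" / "default" / "populated"
    removed = []
    if not populated.is_dir():
        return removed
    for entry in sorted(populated.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a sub-folder and everything in it
        else:
            entry.unlink()        # remove a plain file or a symlink
        removed.append(entry.name)
    return removed
```

Calling clear_populated with the path of the KIM installation returns the 
list of removed entries, which can be logged before KIM is started again.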

Hope this helps,
Philip Alexiev
Senior Software Engineer
Ontotext

On Jun 19, 2013, at 9:02 PM, "Naaman Musawwir"  wrote:

> Hello there,
>  
> We are in need of adding/deleting some terms in the existing Ontology. Please 
> guide how to do that? Are there some editable files to do that and how can we 
> use those?
>  
> Regards,
> Naaman Musawwir.
>  



Re: [Kim-discussion] KIM Client in GATE GUI

2013-03-18 Thread Philip Alexiev
Hello Mike,

You will have no problems running KIM as a background service on a server 
without a graphical interface. KIM is intended to run this way.

The GATE developer interface is just a development tool to build the 
information extraction logic. You need a graphical interface for it. But once 
you are finished with building the pipeline, KIM does not need a GUI to use it.

The populater tool has a graphical and a non-graphical interface. The 
non-graphical uses the configuration in  KIM/config/populater.xml .

Generally it depends on the specific tool whether it needs a GUI or not. The 
tools connect to the KIM server through RMI . The process is described in the 
documentation.

Hope this helps,
Philip Alexiev
Software Engineer, KIM team


On Mar 18, 2013, at 3:05 PM, Michael Patek  wrote:

> Hi Philip,
> 
> Thanks.  Do you know if there are any KIM tools that can only be run as a GUI?  
> I'm trying to determine if there's anything that cannot easily be done on a 
> hosted server via the command line.  If it is the case that there is some KIM 
> functionality that is not available on the command line, please let me know.
> 
> Thanks,
> -Mike
> 
> 
> On Fri, Mar 15, 2013 at 12:05 PM, Philip Alexiev 
>  wrote:
> Hello Mike,
> 
> If you only want to look at the information extraction process performed by 
> KIM internally, you can do this easily. Run an instance of KIM locally; there 
> is no need to connect to a separate one. Just run the GATE Developer 
> interface of that KIM installation locally. This can be done by running the 
> following command:
> 
> KIM/bin/kim  gate
> 
> You can observe and customize the different aspects of the process from this 
> interface.
> 
> Hope this helps.
> Philip Alexiev
> Software Engineer, KIM team
> 
> On Mar 14, 2013, at 10:05 PM, Michael Patek  wrote:
> 
>> Hi Phil,
>> 
>> Our remote KIM instance is running on a hosted server.  We have access to 
>> the command line on that server via ssh, but we can't run the kim-gate-gui 
>> directly on that server.  Is there a way to run the kim-gate-gui on a remote 
>> installation of KIM?
>> 
>> Thanks,
>> -Mike
>> 
>> 
>> On Thu, Mar 14, 2013 at 3:48 PM, Philip Alexiev 
>>  wrote:
>> Hello Michael,
>> 
>> Why are you trying to have one KIM instance connect to another? Please share 
>> the big picture. Most probably the goal can be achieved with one running KIM.
>> 
>> Best,
>> Phil
>> 
>> 
>> 
>> 
>> On Mar 14, 2013, at 6:17 PM, Michael Patek  wrote:
>> 
>> > Hello,
>> >
>> > I have a fresh install of the KIM platform running on a local computer, 
>> > and another install of KIM running on a remote server.  I run the GATE GUI 
>> > (bin/tools/kim-gate-gui), and try to add a new KIM Client (under 
>> > Processing Resources).  I provide a name, and the ip address of the remote 
>> > server.  When I click on 'OK', I get the following error:
>> >
>> > gate.creole.ResourceInstantiationException: Couldn't find parameter named 
>> > outputASName in com.ontotext.kim.gate.KIMClient
>> >   at 
>> > gate.creole.AbstractResource.setParameterValue(AbstractResource.java:227)
>> >   at 
>> > gate.creole.AbstractResource.setParameterValues(AbstractResource.java:255)
>> >   at 
>> > gate.creole.AbstractResource.setParameterValues(AbstractResource.java:414)
>> >   at gate.Factory.createResource(Factory.java:259)
>> >   at gate.gui.NewResourceDialog$4.run(NewResourceDialog.java:225)
>> >   at java.lang.Thread.run(Thread.java:679)
>> >
>> > Any help would be greatly appreciated.
>> >
>> > Thanks,
>> > -Mike
>> >
>> > BTW, here is the debugging output that I see when I start up kim-gate-gui:
>> >
>> > KIM_HOME=/home/michael/projects/kw/kim-platform-3.7
>> > KIM_CONTEXT=/home/michael/projects/kw/kim-platform-3.7/context/default
>> > KIM_MAX_JAVA_HEAP=2g
>> > KIM_LOG_FOLDER=/home/michael/projects/kw/kim-platform-3.7/log
>> > CREOLE Directory file:///home/michael/projects/kw/kim-platform-3.7/config 
>> > queued for registration
>> > 12:15:14.257 [main] INFO  gate.Gate - Using 
>> > /home/michael/projects/kw/kim-platform-3.7 as GATE home
>> > 12:15:14.267 [main] INFO  gate.Gate - Using 
>> > /home/michael/projects/kw/kim-platform-3.7/plugins as installed plug-ins 
>> > directory.
>> > 12:15:14.267 [main] INFO  gate.Gate - Using 
>> > /home/michael/projects/kw/kim-platform-3.

Re: [Kim-discussion] KIM Client in GATE GUI

2013-03-15 Thread Philip Alexiev
Hello Mike,

If you only want to look at the information extraction process performed by 
KIM internally, you can do this easily. Run an instance of KIM locally; there 
is no need to connect to a separate one. Just run the GATE Developer interface 
of that KIM installation locally. This can be done by running the following 
command:

KIM/bin/kim  gate

You can observe and customize the different aspects of the process from this 
interface.

Hope this helps.
Philip Alexiev
Software Engineer, KIM team

On Mar 14, 2013, at 10:05 PM, Michael Patek  wrote:

> Hi Phil,
> 
> Our remote KIM instance is running on a hosted server.  We have access to the 
> command line on that server via ssh, but we can't run the kim-gate-gui 
> directly on that server.  Is there a way to run the kim-gate-gui on a remote 
> installation of KIM?
> 
> Thanks,
> -Mike
> 
> 
> On Thu, Mar 14, 2013 at 3:48 PM, Philip Alexiev  
> wrote:
> Hello Michael,
> 
> Why are you trying to have one KIM instance connect to another? Please share 
> the big picture. Most probably the goal can be achieved with one running KIM.
> 
> Best,
> Phil
> 
> 
> 
> 
> On Mar 14, 2013, at 6:17 PM, Michael Patek  wrote:
> 
> > Hello,
> >
> > I have a fresh install of the KIM platform running on a local computer, and 
> > another install of KIM running on a remote server.  I run the GATE GUI 
> > (bin/tools/kim-gate-gui), and try to add a new KIM Client (under Processing 
> > Resources).  I provide a name, and the ip address of the remote server.  
> > When I click on 'OK', I get the following error:
> >
> > gate.creole.ResourceInstantiationException: Couldn't find parameter named 
> > outputASName in com.ontotext.kim.gate.KIMClient
> >   at 
> > gate.creole.AbstractResource.setParameterValue(AbstractResource.java:227)
> >   at 
> > gate.creole.AbstractResource.setParameterValues(AbstractResource.java:255)
> >   at 
> > gate.creole.AbstractResource.setParameterValues(AbstractResource.java:414)
> >   at gate.Factory.createResource(Factory.java:259)
> >   at gate.gui.NewResourceDialog$4.run(NewResourceDialog.java:225)
> >   at java.lang.Thread.run(Thread.java:679)
> >
> > Any help would be greatly appreciated.
> >
> > Thanks,
> > -Mike
> >
> > BTW, here is the debugging output that I see when I start up kim-gate-gui:
> >
> > KIM_HOME=/home/michael/projects/kw/kim-platform-3.7
> > KIM_CONTEXT=/home/michael/projects/kw/kim-platform-3.7/context/default
> > KIM_MAX_JAVA_HEAP=2g
> > KIM_LOG_FOLDER=/home/michael/projects/kw/kim-platform-3.7/log
> > CREOLE Directory file:///home/michael/projects/kw/kim-platform-3.7/config 
> > queued for registration
> > 12:15:14.257 [main] INFO  gate.Gate - Using 
> > /home/michael/projects/kw/kim-platform-3.7 as GATE home
> > 12:15:14.267 [main] INFO  gate.Gate - Using 
> > /home/michael/projects/kw/kim-platform-3.7/plugins as installed plug-ins 
> > directory.
> > 12:15:14.267 [main] INFO  gate.Gate - Using 
> > /home/michael/projects/kw/kim-platform-3.7/gate.xml as site configuration 
> > file.
> > 12:15:14.267 [main] INFO  gate.Gate - Using /home/michael/.gate.xml as user 
> > configuration file
> > 12:15:14.268 [main] INFO  gate.Gate - Using /home/michael/.gate.session as 
> > user session file
> > 12:15:15.528 [main] DEBUG gate.Gate - user config loaded; DBCONFIG={}
> > 12:15:16.515 [main] INFO  gate.creole.CreoleRegisterImpl - CREOLE plugin 
> > loaded: file:/home/michael/projects/kw/kim-platform-3.7/config/
> >


Re: [Kim-discussion] KIM Client in GATE GUI

2013-03-14 Thread Philip Alexiev
Hello Michael,

Why are you trying to have one KIM instance connect to another? Please share 
the big picture. Most probably the goal can be achieved with one running KIM.

Best,
Phil




On Mar 14, 2013, at 6:17 PM, Michael Patek  wrote:

> Hello,
> 
> I have a fresh install of the KIM platform running on a local computer, and 
> another install of KIM running on a remote server.  I run the GATE GUI 
> (bin/tools/kim-gate-gui), and try to add a new KIM Client (under Processing 
> Resources).  I provide a name, and the ip address of the remote server.  When 
> I click on 'OK', I get the following error:
> 
> gate.creole.ResourceInstantiationException: Couldn't find parameter named 
> outputASName in com.ontotext.kim.gate.KIMClient
>   at 
> gate.creole.AbstractResource.setParameterValue(AbstractResource.java:227)
>   at 
> gate.creole.AbstractResource.setParameterValues(AbstractResource.java:255)
>   at 
> gate.creole.AbstractResource.setParameterValues(AbstractResource.java:414)
>   at gate.Factory.createResource(Factory.java:259)
>   at gate.gui.NewResourceDialog$4.run(NewResourceDialog.java:225)
>   at java.lang.Thread.run(Thread.java:679)
> 
> Any help would be greatly appreciated.
> 
> Thanks,
> -Mike
> 
> BTW, here is the debugging output that I see when I start up kim-gate-gui:
> 
> KIM_HOME=/home/michael/projects/kw/kim-platform-3.7
> KIM_CONTEXT=/home/michael/projects/kw/kim-platform-3.7/context/default
> KIM_MAX_JAVA_HEAP=2g
> KIM_LOG_FOLDER=/home/michael/projects/kw/kim-platform-3.7/log
> CREOLE Directory file:///home/michael/projects/kw/kim-platform-3.7/config 
> queued for registration
> 12:15:14.257 [main] INFO  gate.Gate - Using 
> /home/michael/projects/kw/kim-platform-3.7 as GATE home
> 12:15:14.267 [main] INFO  gate.Gate - Using 
> /home/michael/projects/kw/kim-platform-3.7/plugins as installed plug-ins 
> directory.
> 12:15:14.267 [main] INFO  gate.Gate - Using 
> /home/michael/projects/kw/kim-platform-3.7/gate.xml as site configuration 
> file.
> 12:15:14.267 [main] INFO  gate.Gate - Using /home/michael/.gate.xml as user 
> configuration file
> 12:15:14.268 [main] INFO  gate.Gate - Using /home/michael/.gate.session as 
> user session file
> 12:15:15.528 [main] DEBUG gate.Gate - user config loaded; DBCONFIG={}
> 12:15:16.515 [main] INFO  gate.creole.CreoleRegisterImpl - CREOLE plugin 
> loaded: file:/home/michael/projects/kw/kim-platform-3.7/config/
> 


Re: [Kim-discussion] help about kim

2013-03-11 Thread Philip Alexiev
No such functionality exists out of the box. You can implement it using a 
custom document handler and attach it to the system to be executed on various 
events.

Here is the part of KIM's public documentation describing the document handlers:
https://confluence.ontotext.com/display/KimDocs37EN/Custom+KIM+Document+Handlers

The document structure is represented by the KIMDocument java class. You can 
look it up in the API documentation which comes with KIM. The handlers also 
work with this class.

How do you plan to use the generated RDF? Can you share the big picture?

Best,
Philip

On Mar 11, 2013, at 4:36 PM, Lydia Khelifa  wrote:

> Hello Philip,
> 
> I would like to thank you for your answer :).
> I would like to annotate a corpus of documents with my wordlist (a manually 
> built dictionary, i.e. a list of words) and store the annotated documents as 
> RDF triples.
> You say that your semantic annotations are stored in Owlim.
> Could I have the model or structure of this storage? I saw your example, and 
> I would like, if possible, to get the storage model, which I will probably 
> transform into an RDF one.
> My goal is to obtain an instance of the annotations of my document corpus.
> Thank you very much.
> 
> Lydia K
> PHD student
> CNAM paris
> 
> 
> 2013/3/11 Philip Alexiev 
> Hello Lydia,
> 
> The simple answer to your question is - you can't. 
> 
> For the complete explanation, I will reveal a little more about KIM's 
> internals and how exactly the data is stored and where.
> 
> We can look at the process of annotating a document in KIM as a two step 
> process:
> 1. Annotate the document (Information Extraction phase)
> 2. Store the annotated document in the persistent store
> 
> The information extraction phase is performed by the GATE framework 
> (http://gate.ac.uk/). It analyzes the text of the documents and recognizes 
> entities that the system already knows about. It can also recognize new 
> entities based on rules or machine learning algorithms. The output of this 
> process is a standard GATE document with annotations over the content. The 
> GATE configuration KIM uses is customized so that, at the end, semantic 
> annotations are created. This means that each annotation represents an entity 
> from the semantic database and has features relating it to that entity (uri 
> and class features).
> 
> The second stage is storing the document in a persistent store. KIM uses a 
> combination of a semantic store and a content store to most efficiently 
> achieve this goal.
> The document object (without the actual content) is stored in the semantic 
> store (Owlim). Further, all the features of this document and all the 
> relations to entities found in the document are also stored. This allows us 
> to see which documents mention which entities. Here is a sample document and 
> the information in the semantic store about this document:
> 
> 
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> <http://proton.semanticweb.org/2006/05/protont#Document> .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> <http://proton.semanticweb.org/2006/05/protonu#NewsArticle> .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://www.w3.org/2000/01/rdf-schema#label> "Bumper North Sea oil profits 
> pose taxing questions for the Chancellor" .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://proton.semanticweb.org/2006/05/protont#hasDate> 
> "99730440"^^<http://www.w3.org/2001/XMLSchema#long> .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://proton.semanticweb.org/2006/05/protont#title> "Bumper North Sea oil 
> profits pose taxing questions for the Chancellor" .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://proton.semanticweb.org/2006/05/protont#derivedFromSource> "news 
> agency" .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://ordi.ontotext.com/sar#hasFeature> 
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1_features_0> .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://ordi.ontotext.com/sar#hasFeature> 
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1_features_2> .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://ordi.ontotext.com/sar#hasFeature> 
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1_features_3> .
> <http://www.ontotext.com/kim/2006/05/wkb#Doc1> 
> <http://proton.semanticweb.org/2006/05/protonkm#mentions> 
> <http://www.ontotext.com/kim/2006/05/wkb#Minist

Re: [Kim-discussion] help about kim

2013-03-11 Thread Philip Alexiev
Hello Lydia,

The simple answer to your question is - you can't. 

For the complete explanation, I will reveal a little more about KIM's internals 
and how exactly the data is stored and where.

We can look at the process of annotating a document in KIM as a two-step 
process:
1. Annotate the document (Information Extraction phase)
2. Store the annotated document in the persistent store

The information extraction phase is performed by the GATE framework 
(http://gate.ac.uk/). It analyzes the text of the documents and recognizes 
entities that the system already knows about. It can also recognize new 
entities based on rules or machine learning algorithms. The output of this 
process is a standard GATE document with annotations over the content. The GATE 
configuration KIM uses is customized so that, at the end, semantic annotations 
are created. This means that each annotation represents an entity from the 
semantic database and has features relating it to that entity (uri and class 
features).

The second stage is storing the document in a persistent store. KIM uses a 
combination of a semantic store and a content store to most efficiently achieve 
this goal.
The document object (without the actual content) is stored in the semantic 
store (Owlim). Further, all the features of this document and all the relations 
to entities found in the document are also stored. This allows us to see which 
documents mention which entities. Here is a sample document and the information 
in the semantic store about this document:


<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://proton.semanticweb.org/2006/05/protont#Document> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://proton.semanticweb.org/2006/05/protonu#NewsArticle> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://www.w3.org/2000/01/rdf-schema#label> "Bumper North Sea oil profits pose taxing questions for the Chancellor" .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protont#hasDate> "99730440"^^<http://www.w3.org/2001/XMLSchema#long> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protont#title> "Bumper North Sea oil profits pose taxing questions for the Chancellor" .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protont#derivedFromSource> "news agency" .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://ordi.ontotext.com/sar#hasFeature> <http://www.ontotext.com/kim/2006/05/wkb#Doc1_features_0> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://ordi.ontotext.com/sar#hasFeature> <http://www.ontotext.com/kim/2006/05/wkb#Doc1_features_2> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://ordi.ontotext.com/sar#hasFeature> <http://www.ontotext.com/kim/2006/05/wkb#Doc1_features_3> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#Ministry_T.9> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#Government_T.52> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#CalendarMonth_T.3> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#Newspaper_T.2> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#Newspaper_T.20> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#Number_T.11> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#InternationalOrganization_T.10> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#Person_T.1> .


The actual content of the document and the positions of the annotations 
are not stored in the semantic store. When the document content and 
annotations need to be visualized, the content store is queried. In our case 
this is Lucene.
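
To illustrate how the document-level statements above can be read back, here is a minimal Python sketch. It is not KIM code: the regular expression is a simplified N-Triples pattern that covers only the statement shapes in the sample, and the function name mentioned_entities is hypothetical.

```python
import re

# Simplified N-Triples pattern: <subject> <predicate> <object-or-literal> .
# It covers only the statement shapes that appear in the sample above.
TRIPLE = re.compile(
    r'<([^>]+)>\s+<([^>]+)>\s+(<[^>]+>|"[^"]*"(?:\^\^<[^>]+>)?)\s*\.'
)

MENTIONS = "http://proton.semanticweb.org/2006/05/protonkm#mentions"

# Three statements copied from the sample document description.
SAMPLE = '''
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://www.w3.org/2000/01/rdf-schema#label> "Bumper North Sea oil profits pose taxing questions for the Chancellor" .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#Person_T.1> .
<http://www.ontotext.com/kim/2006/05/wkb#Doc1> <http://proton.semanticweb.org/2006/05/protonkm#mentions> <http://www.ontotext.com/kim/2006/05/wkb#Newspaper_T.2> .
'''


def mentioned_entities(ntriples, doc_uri):
    """Return the URIs of the entities that doc_uri mentions (hypothetical helper)."""
    found = []
    for subj, pred, obj in TRIPLE.findall(ntriples):
        # Keep only "mentions" statements whose object is a URI, not a literal.
        if subj == doc_uri and pred == MENTIONS and obj.startswith("<"):
            found.append(obj[1:-1])
    return found


if __name__ == "__main__":
    doc = "http://www.ontotext.com/kim/2006/05/wkb#Doc1"
    for entity in mentioned_entities(SAMPLE, doc):
        print(entity)
```

A real application would query the semantic store instead of parsing text. The sketch only shows that "which entities does this document mention" is answered by the document-level statements alone.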

So no RDF for the concrete annotations is kept in the RDF store. Just document 
level relations (document 

Re: [Kim-discussion] a problem occurred in starting kim

2013-03-05 Thread Philip Alexiev
> I think the main reason is that I do not have good knowledge of Semantic Web 
> technologies. For instance, I can't find a way to add files to 
> "KIM/config/owlim.ttl".
> 
> I feel ashamed of my ignorance of this kind of knowledge. If you have more 
> detailed files on how to customize KIM, please send some to me.
> 
> Or could you please offer some other ways to help me out? I'm so sorry if I 
> put you to any inconvenience.
> 
> With thanks again and hope you have a nice day.
> 
> Best Regards,
> 
> Estella
> 
> 
> 
> 
> From: philip.alex...@ontotext.com
> Subject: Re: a problem occurred in starting kim
> Date: Tue, 12 Feb 2013 17:18:35 +0200
> To: estella...@hotmail.com
> 
> Hello Estella,
> 
> We have just provided an updated version of the KIM platform, 3.7. Please 
> update your version, as 3.6 had a problem with the licenses.
> 
> You can change the value of the Java maximum heap size by editing the 
> following script:
> 
> Windows:
>   KIM/bin/config/config.bat
> 
> Linux/Unix:
>   KIM/bin/config/config
> 
> 
> You can change the setting in the following line:
> Windows:
>   if "%KIM_MAX_JAVA_HEAP%"=="" set KIM_MAX_JAVA_HEAP=2g
> 
> Linux/Unix:
>   export KIM_MAX_JAVA_HEAP="2g"
> 
> 
> Feel free to contact me anytime if you have  further difficulties.
> 
> All the best,
> Philip
> 
> On Feb 12, 2013, at 5:12 PM, 金碧漪  wrote:
> 
> Dear Philip,
> 
> Thank you! Your suggestions really helped me a lot! I do use a 32-bit system. 
> 
> One of my friends is running KIM successfully on a 32-bit system. When I 
> asked him about this problem, he told me that his KIM_MAX_HEAP_SIZE value is 
> 1G, and his KIM version is 3.0. So I think I have found the key to this 
> problem.
> 
> But please forgive my ignorance: I cannot figure out how to edit the value of 
> this parameter. I can't find the correct file in the KIM folder.
> 
> Or could you tell me how to download a lower version of KIM? The current 
> version is 3.6.
> 
> I appreciate your kindness!
> 
> Thanks and best regards,
> 
> Estella
> 
> From: philip.alex...@ontotext.com
> Subject: Re: a problem occurred in starting kim
> Date: Mon, 11 Feb 2013 11:33:12 +0200
> To: estella...@hotmail.com
> 
> Hello Estella,
> 
> One guess is that you are using a 32-bit system. There is a limit on the 
> amount of memory a 32-bit process can allocate. It is about 1.5 GB (it 
> differs between OSes and JVM implementations). So if you are running a 
> 32-bit system, please try lowering the KIM_MAX_JAVA_HEAP value to 1.5 GB, 
> 1.2 GB, etc. This variable holds the -Xmx value that is given to the JVM 
> when the server starts.
> 
> Also, please make sure that all the desired memory is available when you 
> start the server. If not, the JVM will try to allocate it and fail with 
> this very error.
> 
> These are my suggestions.
> If the problem still persists, please contact me.
> 
> All the best,
> Philip
> 
> On Feb 9, 2013, at 10:26 AM, 金碧漪 <estella...@hotmail.com> wrote:
> 
> Dear Philip,
> 
> Thank you for your timely and enthusiastic reply. Actually, I do have 3 GB of 
> total system RAM, so I think that's enough for running KIM.
> 
> But this problem still cannot be solved.
> 
> It would be great if you could recommend another way to deal with this 
> problem.
> 
> Thank you so much.
> 
> Estella
> 
> 
> 
> Subject: Re: a problem occurred in starting kim
> From: philip.alex...@ontotext.com
> Date: Thu, 7 Feb 2013 13:21:12 +0200
> CC: kim-i...@ontotext.com
> To: estella...@hotmail.com
> 
> Hello Estella,
> 
> This is a typical Java virtual machine error log when you do not have enough 
> memory on the machine to satisfy the provided requirements.
> 
> Do you have 2 GB of RAM free and available prior to starting the KIM 
> server?
> 
> You can try lowering the heap size for the JVM, but I strongly discourage 
> running KIM with less than 2 GB.
> 
> Hope this helps,
> Philip Alexiev
> Software Engineer, KIM team
> 
> On Feb 7, 2013, at 1:16 PM, 金碧漪  wrote:
> 
> Dear Sir,
> 
> I'm a new user of the KIM platform. I downloaded and installed KIM 3 for my 
> own research a few days ago.
> 
> However, a problem occurred when I was trying to start KIM. After trying 
> several things without success, I finally decided to write to ask for your 
> help.
> 
> Here is my question.
> 
> When I executed the command "kim start" in the bin folder of KIM, the 
> response was "
> 
> 'KIM_MAX_JAVA_HEAP=2g'
> 'JAVA_HOME=D:\Java\jdk1.6.0_38'
> 'KIM_HOME=D:\kim-platform-3.6\bin\..'
> error occurred during initialization of VM
> could not reserve enough space for object heap
> could not create the Java virtual machine"
> 
> 
> I really want to solve this problem, so please do me a favor and  tell me how 
> to deal with it ! 
> 
> Thank you so much and hope you will write back as soon as possible.
> 
> Best regards.
> 
> Estella
> 
> 



Re: [Kim-discussion] Kim Client war files

2013-03-04 Thread Philip Alexiev
Hello Cong,

You are correct. We have moved the interface to a new location: 
KIM/services/httpd/webapps/KIM.war. This is for convenience. KIM now has the 
option to run an embedded Jetty instance, which will deploy the interface 
from this location automatically. You can always disable the internal Jetty and 
deploy the WAR in a Tomcat instance.

Hope this helps,
Philip Alexiev
Software Engineer, KIM team

On Mar 2, 2013, at 2:38 PM, Onto Genesis  wrote:

> Hello,
> 
> It appears that the 'KIM Clients' directory, the one that gets deployed to a 
> servlet engine is missing from the kim-platform-3.7 bundle for windows.
> 
> Thanks
> --G


Re: [Kim-discussion] Is KIM-3.6 free for non-comercial use?

2013-02-12 Thread Philip Alexiev
Hello Cong Nguyen,

We have updated the KIM installation provided for evaluation purposes on our 
site. You can download KIM platform 3.7 now with evaluation licenses until 
01.05.2013 .

All the best,
Philip Alexiev
Software Engineer, KIM team

On Feb 6, 2013, at 9:07 AM, Cong Nguyen  wrote:

> Dear KIM team
> I had used a previous version of KIM (3.0RC4 for Windows) for research 
> (a semantic QA system for sports news, for which I customized KIM for 
> annotation), but my HDD died and I couldn't recover my data, so I 
> re-downloaded KIM (3.6), but the platform expired after 3 days of running.
> 
> [INFO] Extracting license properties
> The license key expired on 01-02-2013
> [ERROR] Software license validation has failed: The license key expired on 
> 01-02-2013
> 
> As far as I know, KIM is free for non-commercial use. Is that still true for 
> KIM 3.6? If it is, how can I obtain KIM for free?
> I'm sorry if anything is unclear; my English is not good.
> Best regards
> Cong
> 
> -- 
> Cong Hoang Nguyen
> University: Hanoi University of Science and Techonology.
> Email: congnh0...@gmail.com
> Facebook: http://www.facebook.com/monday0rsunday
> YH: congnh0902
> Skype: monday0rsunday
> Phone: (+84)1678565200


Re: [Kim-discussion] KIM annotation pipeline

2013-02-11 Thread Philip Alexiev
Hello Srecko,

I believe you will find the GATE documentation very useful. It is a 
comprehensive description of the extraction process and the components that 
are also used in KIM. The major difference between the two, and the reason why 
some of the components are slightly customized for KIM, is that GATE is a tool 
for information extraction and is annotation-centric. That means that the 
building blocks are annotations and their features. KIM uses the same 
information extraction and annotation blocks, but it also enriches the 
annotations with a unique ID, which is the ID of the object in the semantic 
database. In KIM, each annotation may have a "class" and an "inst" feature, 
which means that the annotation represents an object from the semantic store 
with a certain class and URI. Let me provide a short example:

A pure GATE pipeline may recognize "Nelson Mandela" as a person, and will 
create a Person annotation over the phrase. In the general case this annotation 
will have no features, and means that the phrase under the annotation - "Nelson 
Mandela" is actually a person.

In KIM, "Nelson Mandela" will exist in the semantic repository as an object, 
and this object will have several labels (e.g. "Mandela", "N. Mandela", etc.). 
So when the object is recognized, there will again be a Person annotation over 
"Nelson Mandela", but this time the annotation will have "class" and "inst" 
features with the values of the ontology class and the instance's URI, 
respectively.

Most of the GATE resources work with the type of the annotations and the 
semantic properties of the annotation are not important for them. That is why 
they are still applicable for semantic annotations. But there are also some 
cases where we need to tune the resource.

The logic behind GATE and each individual resource is very well described in 
GATE's documentation: http://gate.ac.uk/sale/tao/ . The plugins are described 
here: http://gate.ac.uk/gate/doc/plugins.html .

The resources are described in their corresponding creole.xml files, where 
descriptions are provided for the configuration options (the descriptions are 
also visible in GATE Developer).

Hope this helps.
Philip


On Feb 8, 2013, at 8:09 PM, srecko joksimovic  
wrote:

> Hi Philip,
> 
> I have been using KIM for some time, and usually I used tool with my 
> ontology. Most of the time, KIM provided very good results in short time. 
> But, I need to clarify few things regarding annotation pipeline. If I am 
> correct, this would be the default pipeline: Document Reset PR, TextCat PR 
> (Language identification), ANNIE English Tokeniser, RegEx Sentence Splitter, 
> POS Tagger, Morphological analyzer, ANNIE Gazetteer, Large KB Gazetteer 
> (created one for my instances), LingPipe NER PR, Jape Transducer, Annotation 
> Filter, Annotation Set Transfer, TF.IDF Entity Extractor, KIM OrthoMatcher, 
> Instance Generator, and Annotation Cleaner.
> 
> I read about GATE and these components, but as I understand, they are 
> slightly adjusted for KIM annotation pipeline. Could you please explain me 
> the main purpose of each of these components, and what are their default 
> values?
> 
> Thank you!
> Srecko


Re: [Kim-discussion] Strange behaviour for two PRs in KIM customized IE pipeline

2012-09-04 Thread Philip Alexiev

Hello Jie Gao,

Actually, what happens is that the JAPE rules correctly generate 
potential persons (TempPerson). The rules are not sufficient to deduce 
with enough certainty that those entities are valid persons. Some 
further analysis is required.


For example, "Mr Bush" by itself says nothing about the concrete person. 
But if the full form, "George W. Bush", is mentioned earlier in the 
document, then the OrthoMatcher concludes that this is the same 
person and associates the two, using the URI of the long one. That is 
why, in the example document you sent, just "Miss Putran" is not 
sufficient for us to know which Miss Putran the document is referring 
to. But later in the text, when you see "Fariha Nadia" and "Miss Nadia", 
we have enough information to know that Miss Nadia is exactly Fariha Nadia.
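
The matching idea described above can be sketched in a few lines of Python. This is only a simplified illustration under my own assumptions, not the actual KIM OrthoMatcher: the title list, the function names and the "last word must match an earlier full name" rule are all hypothetical.

```python
# Hypothetical list of titles; the real resource may use a different one.
TITLES = {"mr", "mrs", "miss", "ms", "dr"}


def strip_title(name):
    """Split a name into words and drop a leading title, if any."""
    tokens = name.replace(".", "").split()
    if tokens and tokens[0].lower() in TITLES:
        tokens = tokens[1:]
    return tokens


def resolve_mentions(mentions):
    """Map each mention to an earlier full name, or None if unresolved.

    A mention with two or more words (after removing the title) counts as
    a full name. A one-word mention is linked to the most recent earlier
    full name that ends with the same word.
    """
    full_names = []  # (original text, words) in document order
    resolved = {}
    for mention in mentions:
        tokens = strip_title(mention)
        if len(tokens) >= 2:
            full_names.append((mention, tokens))
            resolved[mention] = mention
            continue
        match = None
        for original, full_tokens in full_names:
            if tokens and full_tokens[-1].lower() == tokens[-1].lower():
                match = original
        resolved[mention] = match
    return resolved


if __name__ == "__main__":
    doc_mentions = ["Miss Putran", "Fariha Nadia", "Miss Nadia"]
    for mention, target in resolve_mentions(doc_mentions).items():
        print(mention, "->", target)
```

In this sketch "Miss Putran" stays unresolved because no full name ending in "Putran" precedes it, while "Miss Nadia" is linked to "Fariha Nadia".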


An important thing to understand is that the information extraction 
process that comes with the default KIM installation is not a complete 
solution ready to work out of the box. The rules and mechanisms there 
are too general and will not be sufficient for the majority of use 
cases. Further tuning, taking into account the specific domain and 
document sets, is always necessary.


After some point, tuning is a matter of moving the slider towards 
precision or towards recall. It is difficult to increase both 
together. Either more phrases will be annotated, but many of 
them will be incorrect (false positives), or fewer annotations will be 
created (at the risk of missing some), but with more certainty that they 
are relevant and exact.
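
To make the trade-off concrete, here is a small Python sketch that computes precision and recall for two made-up pipelines. The annotation sets below are invented for illustration only and are not results from KIM.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of predicted annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold standard: the persons really mentioned in a document.
gold = {"Fariha Nadia", "Miss Nadia", "George W. Bush", "Mr Bush"}

# A strict pipeline annotates little, but everything it finds is correct.
strict = {"Fariha Nadia", "George W. Bush"}

# A loose pipeline finds more of the persons, but also adds noise.
loose = {"Fariha Nadia", "Miss Nadia", "George W. Bush",
         "Miss Universe", "Mr Right", "Dr Pepper"}

if __name__ == "__main__":
    print("strict:", precision_recall(strict, gold))
    print("loose: ", precision_recall(loose, gold))
```

With these invented sets, the strict pipeline reaches precision 1.0 at recall 0.5, while the loose one reaches recall 0.75 at precision 0.5.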


With that said, identifying a person just by knowing that the name may 
be prefixed by a title is not a very good idea. There is a good chance 
of introducing more noise than benefit. But it can be done anyway. A 
simple approach is to add a JAPE rule that creates a Person annotation 
from the TempPerson annotation created by the rule "PersonTitlePror". 
This is how it should look:



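// Promote TempPerson annotations created by the "PersonTitlePror" rule
// to Person annotations.
Phase: PersonWithTitle
Input: TempPerson
Options: control = appelt

// The class and originalName features are copied from the matched TempPerson.
Rule: PersonWithTitle
(
  {TempPerson.rule == "PersonTitlePror"}
):person
-->
  :person.Person = {
    class = :person.TempPerson.class,
    originalName = :person.TempPerson.originalName,
    rule = "PersonWithTitle"
  }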



You will notice right after you run the process that many of the 
TempPersons are still not recognized as persons. This is because there is a 
strict rule in the InstanceGenerator that a Person annotation containing 
just one word is never sufficient to identify a concrete person. 
The InstanceGenerator will never put an "inst" feature on such an 
annotation, and it will eventually be removed from the annotation set.
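
The effect of that rule can be pictured with the following Python sketch. It is an illustration under assumptions, not the InstanceGenerator's real logic: the function names are hypothetical, and the URI pattern simply imitates the instance URI quoted in this thread.

```python
# Namespace taken from the instance URIs quoted in this thread.
WKB = "http://www.ontotext.com/kim/2006/05/wkb#"


def make_person_uri(name):
    """Build a URI that imitates the observed pattern (hypothetical helper)."""
    return WKB + "Person_" + "_".join(name.split())


def assign_instances(person_annotations):
    """Keep only annotations that can receive an "inst" feature.

    Assumption: an annotation whose originalName is a single word never
    gets an "inst" feature and is therefore dropped from the result.
    """
    kept = []
    for annotation in person_annotations:
        name = annotation["originalName"]
        if len(name.split()) < 2:
            continue  # one word is not enough to identify a concrete person
        kept.append({**annotation, "inst": make_person_uri(name)})
    return kept


if __name__ == "__main__":
    annotations = [
        {"originalName": "Putran"},
        {"originalName": "Mr Elsandabesee"},
    ]
    for annotation in assign_instances(annotations):
        print(annotation["originalName"], "->", annotation["inst"])
```

Under this assumption, "Putran" alone is dropped, while a two-word originalName such as "Mr Elsandabesee" receives a URI, which matches the behaviour reported in the quoted message.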



About putting the title in the original name, it is not semantically 
correct. Our names do not include our titles. If it is important, there 
are different mechanisms to use it in KIM, similar to the JobTitle 
recognition.


Hope this helps,

Philip Alexiev
Software Engineer, KIM team


On 09/03/2012 09:08 PM, JIE GAO wrote:

Hi, Philip:

Thanks for your prompt response. I should have provided the test 
documents so that you may be able to re-produce the scenario.


Actually, the following two problems are easy to reproduce 
with the KIM basic pipeline:


1. Failure to generate a URI for some entities

Test data: Please refer to "TestPersonTitlePror.xml" in the 
attachment. I have also provided the KIM basic pipeline in the attachment 
for your convenience.


Scenario: Run the KIM basic IE pipeline to annotate all the persons 
mentioned in the document (I disabled the "Annotation Cleaner" in 
order to facilitate the analysis).


Expected result: I expect around 18 person entities to be 
recognised, including the duplicated entities.

Actual result: However, very few of them are recognised.

Analysis: If we check the "TempPerson" annotation set, most of the 
expected person entities are recognised by the "PersonTitlePror" 
rule. If we debug the pipeline further, the reason those entities are 
missing from the final annotation sets is that the Instance Generator 
cannot generate a URI for them and finally removes them from the 
"Person" annotation set. This means that the Instance Generator not 
only generates URIs for entities but also removes those entity instances 
that have no valid URI.


Debug & experiment: But if I change "originalName" to 
include the person's title, the IE pipeline recognises more entities, as 
I previously expected. The instance URI includes the title, e.g., 
"http://www.ontotext.com/kim/2006/05/wkb#Person_Mr_Elsandabesee".


The experimental change is easy to make in the rule 
"PersonTitlePror" (I have attached the changed person_name.jape as well for 
your reference):
 

Re: [Kim-discussion] Strange behaviour for two PRs in KIM customized IE pipeline

2012-09-03 Thread Philip Alexiev

Hello Jie Gao,

Can you send an example pipeline (including the documents in it) that 
demonstrates these cases? It will be very helpful for understanding the 
context.


Thank you,
Philip Alexiev
Software Engineer, KIM team

On 08/31/2012 01:16 PM, JIE GAO wrote:

Hi, ontotext team:

I am currently evaluating KIM 3.6-SNAPSHOT.

I've found that two KIM-customized GATE PRs behave strangely for 
me. One is the KIM OrthoMatcher and the other is the Instance Generator.


The scenario is that I customized many JAPE grammars based on the KIM 
default JAPE grammar PR. There is a typical rule that is used to 
extract a person entity with a title appearing in the text. The default rule 
in the original KIM grammar is called "PersonTitle". As in the default 
grammar definition, I set "originalName" to the person's name, while 
annotating the person entity as the name combined with the title. The 
OrthoMatcher performs well with this practice. However, the Instance 
Generator failed to generate an instance URI for the entity. Then I 
changed the grammar to set "originalName" to the full name (e.g., "Miss 
Putran" rather than "Putran"). This change stops the KIM OrthoMatcher 
from working. In other words, the entity "Miss Putran" cannot be 
matched with the entity labeled "Santosh Putran", whereas this time 
"Miss Putran" is given a URI by the Instance Generator.


Based on my analysis of the KIM source code, the KIM OrthoMatcher 
always uses "originalName" to retrieve the entity label, which is 
hardcoded and not changeable. I have found the "stripPersonTitle" 
function, but haven't found any usage of it in the source code so far. The 
experiment shows that the KIM OrthoMatcher fails to strip the 
person's title from "originalName", which causes the OrthoMatcher 
to fail to remove the title before matching.

Meanwhile, I have found that the Instance Generator first 
checks for duplicates in the knowledge base; if there is a matching 
entity, it is directly assigned the same URI. If there is 
no such entity in the knowledge base, the entity is processed by 
"IEMetadataAppender" and the context is set to "http://newEntity=true" 
if no specific one is given. However, my test shows that if the 
original name is not the same as the full name of the extracted 
entity (i.e., the original name is "Putran" rather than "Miss Putran", and 
"Putran" is not identified as an existing entity in the KB), the Instance 
Generator does not generate a URI for that entity.


I look forward to your kind help.

Thanks & Regards

JIE GAO




Re: [Kim-discussion] KIM questions

2012-08-29 Thread Philip Alexiev

I will be happy to help.

All the best,
Philip

On 08/29/2012 11:57 AM, Arshad Ali Khan wrote:


Hi Philip, Eleni/All

Thanks for sharing this. Actually, I too am working my way through this 
stage and would be happy to share my findings with Eleni over the course 
of the next few weeks. At the same time, I would appreciate it if Eleni 
kept in touch through the discussion forum, which would be a great source 
of learning from each other's experience.

many thanks once again

Arshad Ali Khan

Researcher, University of Southampton





Date: Wed, 29 Aug 2012 11:26:25 +0300
From: philip.alex...@ontotext.com
To: afiont...@aueb.gr
CC: kim-discussion@ontotext.com
Subject: Re: [Kim-discussion] KIM questions

Hello Eleni,

KIM uses internally the GATE framework, which uses the TIKA toolkit 
(http://tika.apache.org) to parse the documents and extract the 
unified text content from them. Basically, KIM should be able to 
support all the formats TIKA supports, but we have restricted the set 
to the ones that make sense. You can see the list of supported file 
extensions in the populater configuration file -  
$KIM_HOME/config/populater.xml . Here is the line itself:


doc,docx,rtf,xhtml,odf,ods,odp,htm,html,txt,pdf,page,xml,gzip


As for the second topic, there is no automatic and easy way to just 
drop the ontology and update the system. You can learn more details at 
our online KIM documentation:

http://www.ontotext.com/kim/getting-started/documentation

Hope this helps,
Philip Alexiev
Software Engineer, KIM team

On 08/25/2012 06:48 PM, afiont...@aueb.gr wrote:


Dear Sir or Madam,

I am writing to ask a few questions about KIM platform. In the
context of my dissertation I need to annotate documents in PDF
format. Is it possible to use KIM for that, or is it used only for
web pages?

In addition, I would like to use my own ontology but I have read
that KIM uses its own. Can I import an ontology (.owl file) to KIM?

Yours faithfully,
Eleni Afiontzi
Athens University of Economics and Business





Re: [Kim-discussion] KIM questions

2012-08-29 Thread Philip Alexiev

Hello Eleni,

KIM uses internally the GATE framework, which uses the TIKA toolkit 
(http://tika.apache.org) to parse the documents and extract the unified 
text content from them. Basically, KIM should be able to support all the 
formats TIKA supports, but we have restricted the set to the ones that 
make sense. You can see the list of supported file extensions in the 
populater configuration file -  $KIM_HOME/config/populater.xml . Here is 
the line itself:


doc,docx,rtf,xhtml,odf,ods,odp,htm,html,txt,pdf,page,xml,gzip
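
As a rough illustration of the restriction above, the following Python sketch checks whether a file would pass such an extension list. It assumes the check is a simple case-insensitive comparison of the file extension, which may differ from how KIM actually applies the setting; the function name is hypothetical.

```python
from pathlib import Path

# The extension list quoted above from populater.xml.
SUPPORTED = "doc,docx,rtf,xhtml,odf,ods,odp,htm,html,txt,pdf,page,xml,gzip".split(",")


def is_supported(filename):
    """Return True if the file extension is in the supported list.

    Assumption: matching is case-insensitive and uses only the last
    extension of the file name.
    """
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED


if __name__ == "__main__":
    for name in ["dissertation.pdf", "page.HTML", "ontology.owl"]:
        print(name, is_supported(name))
```

Under this assumption a PDF document is accepted, while an .owl ontology file is not something the populater would treat as a document.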


As for the second topic, there is no automatic and easy way to just drop 
the ontology and update the system. You can learn more details at our 
online KIM documentation:

http://www.ontotext.com/kim/getting-started/documentation

Hope this helps,
Philip Alexiev
Software Engineer, KIM team

On 08/25/2012 06:48 PM, afiont...@aueb.gr wrote:

Dear Sir or Madam,

I am writing to ask a few questions about KIM platform. In the context 
of my dissertation I need to annotate documents in PDF format. Is it 
possible to use KIM for that, or is it used only for web pages?


In addition, I would like to use my own ontology but I have read that 
KIM uses its own. Can I import an ontology (.owl file) to KIM?


Yours faithfully,
Eleni Afiontzi
Athens University of Economics and Business





Re: [Kim-discussion] Large_KB_gazatteer

2012-07-10 Thread Philip Alexiev
Hello Aditya,

We provide documentation on running the KIM installation, which also includes 
customizing the information extraction and in that sense - customizing the LKB 
Gazetteer. 

You can find it at our public site:

http://www.ontotext.com/kim/getting-started/documentation

Hope this helps.

all the best
Philip Alexiev
Software Engineer, KIM team



On 27 Jun 2012, at 2:57 PM, Reneta Popova wrote:

> 
> 
> Begin forwarded message:
> 
>> From: Aditya Pathak 
>> Subject: Large_KB_gazatteer
>> Date: 27 June 2012 14:55:36 GMT+0300
>> To: 
>> 
>> Hi Reneta,
>> 
>> I am Aditya from India. 
>> I am trying to index a large amount of textual data (mostly PDF files) to 
>> create a full-text search.
>> Earlier I was using the standard mimir-demo application, but it doesn't work 
>> properly with a large amount of data. I tried using your Large KB Gazetteer, 
>> but I am unable to run it properly. Please help me in this regard, as I am a 
>> newbie in this field.
>> 
>> Regards,
>> 
>> Aditya Pathak
> 


Re: [Kim-discussion] Large KB Gazetteer

2012-06-18 Thread Philip Alexiev
Keith,

Please make sure you are executing it with the "DISTINCT" keyword. When I run 
it without "DISTINCT", I also get 329 results.
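
A tiny Python sketch of why the two counts can differ: without DISTINCT, identical result rows are counted every time they occur. The rows below are invented for illustration and are not actual DBpedia results.

```python
# Hypothetical (Name, Country, Cls) rows as a SPARQL endpoint might return
# them without DISTINCT; the second and third rows are identical.
rows = [
    ("Austria", "http://dbpedia.org/resource/Austria", "http://dbpedia.org/ontology/Country"),
    ("Australia", "http://dbpedia.org/resource/Australia", "http://dbpedia.org/ontology/Country"),
    ("Australia", "http://dbpedia.org/resource/Australia", "http://dbpedia.org/ontology/Country"),
]

count_without_distinct = len(rows)    # every row is counted
count_with_distinct = len(set(rows))  # identical rows are counted once

if __name__ == "__main__":
    print(count_without_distinct, count_with_distinct)
```

One common source of repeated rows is a pattern variable that is matched but not selected, such as ?capital in the query: a country with two capital values produces the same selected row twice.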

I will look into the other problem in a moment.

Hth,
Philip

On 18 Jun 2012, at 4:02 PM, Keith Cortis wrote:

> Hi Philip,
>  
> I just double checked it more than once and I got 329 countries if I execute 
> the sparql query directly with the dbpedia sparql endpoint.
>  
> To be honest I’m more concerned about the issue related to the Large KB 
> Gazetteer, i.e. to why it does not recognise names containing a special 
> character (as provided in the examples below), although even the mentioned 
> issue is important.
>  
> Thanks a lot for your replies.
>  
> Keith
>  
>  
> From: Philip Alexiev [mailto:philip.alex...@ontotext.com] 
> Sent: 18 June 2012 13:52
> To: Keith Cortis
> Cc: 'KIM discussion'
> Subject: Re: [Kim-discussion] Large KB Gazetteer
>  
> Keith,
>  
> When I executed this sparql query over the provided sparql endpoint 
> (http://dbpedia.org/sparql)  I got exactly 305 results. Could you double 
> check to confirm that you get 329 ?
>  
> Thanks,
> Philip
>  
> On 18 Jun 2012, at 1:56 PM, Keith Cortis wrote:
> 
> 
> Hi Philip,
>  
> Thanks for your quick reply.
>  
> The following is the SPARQL query:
>  
> SELECT DISTINCT ?Name ?Country ?Cls
> WHERE {
> ?Country a ?Cls ; rdfs:label ?Name ;
> <http://dbpedia.org/property/capital> ?capital .
> OPTIONAL { ?Country dbpedia-owl:dissolutionYear ?year } .
> FILTER(!BOUND(?year))
> FILTER (?Cls = <http://dbpedia.org/ontology/Country>)
> FILTER ( langMatches( lang(?Name), "es") )
>  
> }
> ORDER BY (?Name)
>  
> The DBPedia SPARQL Endpoint (http://dbpedia.org/sparql) returns a total of 
> 329 country names for the query above, whilst the same query returns 305 
> country names only within the Large KB Gazetteer.
>  
> From the tests conducted I noticed that all the Spanish country names that do 
> not contain any special character such as Austria, Australia, etc. are all 
> recognised (since they have been populated in the gazetteer), whilst the ones 
> containing special characters such as Brunéi, Camerún, etc. are not 
> recognised as Countries, even though some of the country names are within the 
> gazetteer.
>  
> I can’t figure out why the names containing special characters are not being 
> recognised by the Large KB Gazetteer, even though some of the names are 
> listed within.
>  
> Regards,
>  
> Keith
>  
> From: Philip Alexiev [mailto:philip.alex...@ontotext.com] 
> Sent: 18 June 2012 11:11
> To: Keith Cortis
> Cc: KIM discussion
> Subject: Re: [Kim-discussion] Fwd: Large KB Gazetteer
>  
> Hi Keith,
>  
> Most probably the gazetteer query is not matching the RDF for those labels.
>  
> Please provide the RDF for some of the missed countries and also the 
> gazetteer query, in case you customized it.
>  
> Regards,
> Philip Alexiev
> Software Engineer, KIM team
>  
>  
> On 18 Jun 2012, at 12:55 PM, Philip Alexiev wrote:
> 
> 
> 
>  
>  
> Begin forwarded message:
> 
> 
> 
> I have been testing out the Large KB Gazetteer module in GATE (v 7.0), where 
> I noticed that the country names having a special character, are not being 
> imported into the newly created gazetteer. For example, if I want to create a 
> Gazetteer containing all the countries in the world, in Spanish (rdfs:label 
> ="es"), the gazetteer is only loading 299 instances from a possible 324. 
> Therefore, country names such as: Afganistán, Azerbaiyán, Benín, Brunéi, 
> etc.. are not being loaded, thus not recognised as a Country entity. The same 
> problem is occurring for city names, where all the names are being imported 
> into the gazetteer, but the ones containing any special character (like the 
> example provided above), are not being recognised as being an entity.
>  
> Do you know what might be causing this issue please?
>  
> Thanks a lot for your help.
>  
> Regards,
>  
> Keith
>  
> 
> Keith Cortis
> Digital Enterprise Research Institute (DERI) Galway,
> Semantic Collaborative Software Unit (USCS)
> National University of Ireland, Galway
> Lower Dangan
> Galway, Ireland
>  


Re: [Kim-discussion] Large KB Gazetteer

2012-06-18 Thread Philip Alexiev
Keith,

When I executed this sparql query over the provided sparql endpoint 
(http://dbpedia.org/sparql)  I got exactly 305 results. Could you double check 
to confirm that you get 329 ?

Thanks,
Philip

On 18 Jun 2012, at 1:56 PM, Keith Cortis wrote:

> Hi Philip,
>  
> Thanks for your quick reply.
>  
> The following is the SPARQL query:
>  
> SELECT DISTINCT ?Name ?Country ?Cls
> WHERE {
> ?Country a ?Cls ; rdfs:label ?Name ;
> <http://dbpedia.org/property/capital> ?capital .
> OPTIONAL { ?Country dbpedia-owl:dissolutionYear ?year } .
> FILTER(!BOUND(?year))
> FILTER (?Cls = <http://dbpedia.org/ontology/Country>)
> FILTER ( langMatches( lang(?Name), "es") )
>  
> }
> ORDER BY (?Name)
>  
> The DBPedia SPARQL Endpoint (http://dbpedia.org/sparql) returns a total of 
> 329 country names for the query above, whilst the same query returns 305 
> country names only within the Large KB Gazetteer.
>  
> From the tests conducted I noticed that all the Spanish country names that do 
> not contain any special character such as Austria, Australia, etc. are all 
> recognised (since they have been populated in the gazetteer), whilst the ones 
> containing special characters such as Brunéi, Camerún, etc. are not 
> recognised as Countries, even though some of the country names are within the 
> gazetteer.
>  
> I can’t figure out why the names containing special characters are not being 
> recognised by the Large KB Gazetteer, even though some of the names are 
> listed within.
>  
> Regards,
>  
> Keith
>  
> From: Philip Alexiev [mailto:philip.alex...@ontotext.com] 
> Sent: 18 June 2012 11:11
> To: Keith Cortis
> Cc: KIM discussion
> Subject: Re: [Kim-discussion] Fwd: Large KB Gazetteer
>  
> Hi Keith,
>  
> Most probably the gazetteer query is not matching the RDF for those labels.
>  
> Please provide the RDF for some of the missed countries and also the 
> gazetteer query, in case you customized it.
>  
> Regards,
> Philip Alexiev
> Software Engineer, KIM team
>  
>  
> On 18 Jun 2012, at 12:55 PM, Philip Alexiev wrote:
> 
> 
>  
>  
> Begin forwarded message:
> 
> 
> I have been testing the Large KB Gazetteer module in GATE (v 7.0), and I 
> noticed that country names containing a special character are not being 
> imported into the newly created gazetteer. For example, if I want to create a 
> gazetteer containing all the countries in the world in Spanish (rdfs:label 
> with language tag "es"), the gazetteer loads only 299 instances out of a 
> possible 324. Country names such as Afganistán, Azerbaiyán, Benín, Brunéi, 
> etc. are therefore not loaded and not recognised as Country entities. The 
> same problem occurs for city names: all the names are imported into the 
> gazetteer, but those containing a special character (like the examples 
> above) are not recognised as entities.
>  
> Do you know what might be causing this issue please?
>  
> Thanks a lot for your help.
>  
> Regards,
>  
> Keith
>  
> 
> Keith Cortis
> Digital Enterprise Research Institute (DERI) Galway,
> Semantic Collaborative Software Unit (USCS)
> National University of Ireland, Galway
> Lower Dangan
> Galway, Ireland
>  

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] Fwd: Large KB Gazetteer

2012-06-18 Thread Philip Alexiev
Hi Keith,

Most probably the gazetteer query is not matching the RDF for those labels.

Please provide the RDF for some of the missed countries and also the gazetteer 
query, in case you customized it.

Regards,
Philip Alexiev
Software Engineer, KIM team


On 18 Jun 2012, at 12:55 PM, Philip Alexiev wrote:

> 
> 
> Begin forwarded message:
> 
>> I have been testing the Large KB Gazetteer module in GATE (v 7.0), and I 
>> noticed that country names containing a special character are not being 
>> imported into the newly created gazetteer. For example, if I want to create 
>> a gazetteer containing all the countries in the world in Spanish 
>> (rdfs:label with language tag "es"), the gazetteer loads only 299 instances 
>> out of a possible 324. Country names such as Afganistán, Azerbaiyán, Benín, 
>> Brunéi, etc. are therefore not loaded and not recognised as Country 
>> entities. The same problem occurs for city names: all the names are 
>> imported into the gazetteer, but those containing a special character (like 
>> the examples above) are not recognised as entities.
>>  
>> Do you know what might be causing this issue please?
>>  
>> Thanks a lot for your help.
>>  
>> Regards,
>>  
>> Keith
>>  
>> 
>> Keith Cortis
>> Digital Enterprise Research Institute (DERI) Galway,
>> Semantic Collaborative Software Unit (USCS)
>> National University of Ireland, Galway
>> Lower Dangan
>> Galway, Ireland
> 

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


[Kim-discussion] Fwd: Large KB Gazetteer

2012-06-18 Thread Philip Alexiev


Begin forwarded message:

> I have been testing the Large KB Gazetteer module in GATE (v 7.0), and I 
> noticed that country names containing a special character are not being 
> imported into the newly created gazetteer. For example, if I want to create a 
> gazetteer containing all the countries in the world in Spanish (rdfs:label 
> with language tag "es"), the gazetteer loads only 299 instances out of a 
> possible 324. Country names such as Afganistán, Azerbaiyán, Benín, Brunéi, 
> etc. are therefore not loaded and not recognised as Country entities. The 
> same problem occurs for city names: all the names are imported into the 
> gazetteer, but those containing a special character (like the examples 
> above) are not recognised as entities.
>  
> Do you know what might be causing this issue please?
>  
> Thanks a lot for your help.
>  
> Regards,
>  
> Keith
>  
> 
> Keith Cortis
> Digital Enterprise Research Institute (DERI) Galway,
> Semantic Collaborative Software Unit (USCS)
> National University of Ireland, Galway
> Lower Dangan
> Galway, Ireland

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] Adding Domain Ontology

2012-06-08 Thread Philip Alexiev
Hello Naaman,

I think you will find the documentation on this page useful:
http://www.ontotext.com/kim/getting-started/documentation

In particular, see the Customizing KIM Guide.

Hope this helps,
Philip

On 8 Jun 2012, at 8:58 AM, Naaman Musawwir wrote:

> Hello there,
> 
> I have to add domain ontology into KIM's knowledge base and want to know the
> process of doing so. I have tried importing RDFs using rdf import tool but
> that does not seem to work. Please guide.
> 
> Regards,
> Naaman Musawwir.
> 
> 

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] How to get instance properties !

2012-05-21 Thread Philip Alexiev
Hello Minh Hoang,

Semantic annotations in documents serve only to link specific parts of the 
texts to some instances in the semantic repository. The phrase is just 
identified as a specific instance from the knowledge base. Once you have the id 
of the object (the URI), you can query the semantic repository to retrieve the 
complete molecule (its properties and relations). You can use the 
SemanticRepository service to do that.

Hth,
Philip
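
As a minimal sketch of the lookup described above, the query for an instance's properties and relations can be built from its URI as a plain SPARQL string. The class name, method name, and example URI below are hypothetical, and the sketch deliberately stops short of calling the SemanticRepository service, because its method names are not given in this thread.

```java
// Hypothetical helper for illustration only; it builds a SPARQL query string
// and does not call any KIM API.
final class InstanceQuery {

    // Returns a SPARQL SELECT query listing every property/value pair of the instance.
    static String propertiesOf(String instanceUri) {
        // Reject input that would break out of the <...> IRI delimiters.
        if (instanceUri.indexOf('<') >= 0
                || instanceUri.indexOf('>') >= 0
                || instanceUri.indexOf(' ') >= 0) {
            throw new IllegalArgumentException("not a plain URI: " + instanceUri);
        }
        return "SELECT ?property ?value WHERE { <" + instanceUri + "> ?property ?value }";
    }

    public static void main(String[] args) {
        // The URI is a made-up example, standing in for the id found in an annotation.
        System.out.println(propertiesOf("http://example.org/kb#rooney"));
    }
}
```

The resulting string would then be handed to whatever query facility the repository exposes.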

On 19 May 2012, at 2:20 PM, Minh Hoang wrote:

> Hi all,
> 
> In my ontology, I have created a "rooney" instance with some data properties 
> (name, age, ...).
> I made a search form to find documents that contain the "rooney" instance. In 
> the returned results (KIMDocument objects) I get the content, annotations, 
> features, etc., but no information about data properties.
> 
> So how can I get the data properties of an instance using the web service API?
> 
> Thanks in advance.

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] KIM Installation Error : Server Starter thread failed when connecting with remote OWLIM repository

2012-05-04 Thread Philip Alexiev
Hello Jerry,

Apologies for the long delay.

Most probably the cause is a mismatch between the versions of the Sesame 
library used by OWLIM and by KIM. Since you are running them separately, this 
is quite possible. The following page pointed me to this idea:

http://sesame-general.435816.n3.nabble.com/Malformed-query-result-from-server-td3451630.html

Hope this helps,
Philip Alexiev
Software Engineer, KIM team

On 26 Apr 2012, at 1:42 AM, Jerry Gao wrote:

> Hi, ontotext team:
>  
> I am currently evaluating the KIM Platform. The versions provided by your 
> team are kim-platform-3.0-RC5-Windows / kim-platform-3.0-RC5-UnixCompatible 
> and owlim-se-4.3.4824.
>  
> However, an exception occurs when I try to configure the KIM platform to 
> connect to a remote OWLIM repository via a SPARQL endpoint. I haven't found 
> any documentation on how to configure the KIM platform to work with a remote 
> semantic repository (or a remote SPARQL endpoint) rather than the embedded 
> OWLIM instance.
>  
> My detailed configuration is as follows:
>  
> 1. Create the kim Sesame repository via openrdf-console, using a configured 
> template file "owlim-se-kim.ttl".
>  
> The configuration in "owlim-se-kim.ttl" is:
> (PS: dbpedia_3.7.owl has already been deployed on the Tomcat server and can 
> be accessed via the URL)
> ==
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
> @prefix rep: <http://www.openrdf.org/config/repository#>.
> @prefix sr: <http://www.openrdf.org/config/repository/sail#>.
> @prefix sail: <http://www.openrdf.org/config/sail#>.
> @prefix owlim: <http://www.ontotext.com/trree/owlim#>.
>  
> [] a rep:Repository ;
>     rep:repositoryID "kim" ;
>     rdfs:label "kim Repository" ;
>     rep:repositoryImpl [
>         rep:repositoryType "openrdf:SailRepository" ;
>         sr:sailImpl [
>             sail:sailType "owlim:Sail" ;
>             owlim:ruleset "owl-max-optimized" ;
>             owlim:storage-folder "owlim-storage" ;
>             owlim:repository-type "weighted-file-repository" ;
>             owlim:base-URL "http://www.ontotext.com/kim/2006/05/wkb#" ;
>             owlim:imports "http://localhost:8083/dbpedia/dbpedia_3.7.owl;" ;
>             owlim:defaultNS "http://dbpedia.org/resource/" ;
>             owlim:entity-index-size "9000" ;
>             owlim:cache-memory "3000m" ;
>             owlim:tuple-index-memory "1200m" ;
>             owlim:enablePredicateList "true" ;
>             owlim:predicate-memory "1000m" ;
>             owlim:fts-memory "800m" ;
>             owlim:ftsIndexPolicy "onStartup" ;
>             owlim:ftsLiteralsOnly "true" ;
>             owlim:build-pcsot "true" ;
>             owlim:build-ptsoc "true" ;
>             owlim:in-memory-literal-properties "true" ;
>             owlim:journaling "true" ;
>         ]
>     ].
> =
>  
> 2. then, change the "owlim.ttl" in "kim-platform/config" to connect to the 
> repository:
>  
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
> @prefix rep: <http://www.openrdf.org/config/repository#>.
> @prefix sr: <http://www.openrdf.org/config/repository/sail#>.
> @prefix sail: <http://www.openrdf.org/config/sail#>.
> @prefix owlim: <http://www.ontotext.com/trree/owlim#>.
> @prefix hr: <http://www.openrdf.org/config/repository/http#>.
>  
> rep:kim a rep:Repository ;
>     rep:repositoryImpl [
>         rep:repositoryType "openrdf:HTTPRepository" ;
>         hr:repositoryID "kim" ;
>         hr:repositoryURL <http://localhost:8083/openrdf-sesame/repositories/kim>
>     ] ;
>     rep:repositoryID "kim" ;
>     rdfs:label "my remote sesame repository" .
>  
> 3. when running the kim service, i encountered exception as follows:
>  
> the detailed log is:
> 
> E:\DevSoftware\KIM-Evaluation\kim-platform-3.0-RC5-Windows\kim-platform-3.0-RC5\bin>kim
> "KIM_MAX_JAVA_HEAP = 4g"
> "JAVA_HOME = E:\DevSoftware\Program Files\Java\jdk1.6.0_27"
> "KIM_HOME = 
> E:\DevSoftware\KIM-Evaluation\kim-platform-3.0-RC5-Windows\kim-platform-3.0-RC5\bin\.."
> [INFO] : : : : : : : : KIM SERVER START : : : : : : : :
> [INFO] KIMService registered on port 1099
> [ERROR] Server Starter thread failed!
> java.rmi.RemoteException: Malformed query result from server
> at 
> com.ontotext.kim.semanticrepository.SemanticRepositoryBase.loadNam

Re: [Kim-discussion] ceate KIMDocument from URL?

2012-04-13 Thread Philip Alexiev
Hello Minh Hoang,

You can look at the KIM documentation provided on Ontotext's site, in 
ManageDocumentsExamples.html.

The code you pasted worked for me. Please check the package of the KIMDocument 
type returned by the createDocument invocation against the package of the 
KIMDocument you have imported (the one you are trying to cast to). My guess is 
that they are different.

Hope this helps.
Philip Alexiev
Software Engineer, KIM team
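
One way to check this is to list the interfaces that the returned proxy object actually implements and compare their fully qualified names with the imported type. The sketch below is self-contained and uses a stand-in interface rather than the real KIM classes. `RemoteDocument`, `ProxyInspector`, and their methods are assumptions made for illustration only.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of the KIM API.
final class ProxyInspector {

    // Stand-in for a remote document interface (illustration only).
    interface RemoteDocument {
        String getContent();
    }

    // Fully qualified names of the interfaces implemented by the object's class.
    static List<String> interfaceNames(Object obj) {
        List<String> names = new ArrayList<String>();
        for (Class<?> c : obj.getClass().getInterfaces()) {
            names.add(c.getName());
        }
        return names;
    }

    // True when a cast of obj to the given type would succeed.
    static boolean canCast(Object obj, Class<?> type) {
        return type.isInstance(obj);
    }

    // Builds a dynamic proxy similar in kind to the $ProxyNNN objects returned over RMI.
    static Object newProxy() {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) {
                if (method.getReturnType() == String.class) {
                    return "stub";
                }
                throw new UnsupportedOperationException(method.getName());
            }
        };
        return Proxy.newProxyInstance(
                RemoteDocument.class.getClassLoader(),
                new Class<?>[] { RemoteDocument.class },
                handler);
    }

    public static void main(String[] args) {
        Object doc = newProxy();
        System.out.println("Implements: " + interfaceNames(doc));
        System.out.println("Castable to RemoteDocument: " + canCast(doc, RemoteDocument.class));
        System.out.println("Castable to Runnable: " + canCast(doc, Runnable.class));
    }
}
```

Applied to the object returned by createDocument, the printed interface names show which package the KIMDocument type comes from, which can then be compared with the import in the client code.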


On 13 Apr 2012, at 9:47 AM, Minh Hoang wrote:

> hi all,
> 
> I want to create a KIMDocument from a URL (for example: 
> file:///C:/KIM/corpus/Wayne_Rooney.htm), but the web service API does not 
> seem to have a function such as createDocument().
> 
> I have tried the KIM Java RMI API, but I get an error: INFO: 
> java.lang.ClassCastException: $Proxy185 cannot be cast to 
> SemanticAnnotationAPI.KIMDocument
> 
> My RMI code example:
> 
> KIMService service1 = null;
> DocumentRepositoryAPI repository = null;
> CorporaAPI corpora = null;
> 
> service1 = GetService.from("localhost", 1099);
> repository = service1.getDocumentRepositoryAPI();
> corpora = service1.getCorporaAPI();
> 
> URL url = new URL("http://en.wikipedia.org/wiki/Wayne_Rooney");
> // the ClassCastException is thrown on the next line
> KIMDocument kdocFromUrl = (KIMDocument) corpora.createDocument(url, "UTF-8");
> 
> help me please :(
> 
> thank you!

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] KIM installation issue

2012-04-12 Thread Philip Alexiev
I forgot to mention that KIM does not currently run on JDK 7. Please use the 
latest JDK 6 instead.

All the best,
Philip
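
A quick way to confirm which Java version a given JAVA_HOME actually runs is to print the standard `java.specification.version` system property, which is "1.6" on JDK 6 and "1.7" on JDK 7. This is a minimal sketch with an illustrative class name (`JdkCheck`). It is not something shipped with KIM.

```java
// Hypothetical check for illustration only; not part of the KIM distribution.
final class JdkCheck {

    // True when the given specification version string denotes Java 6.
    static boolean isJava6(String specVersion) {
        return "1.6".equals(specVersion);
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.specification.version");
        System.out.println("java.specification.version = " + version
                + (isJava6(version) ? " (JDK 6)" : " (not JDK 6)"));
    }
}
```

Compiling and running this with the `java` binary under the JAVA_HOME printed by the KIM start script shows whether the server would be launched on JDK 6.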

On 12 Apr 2012, at 7:20 PM, Philip Alexiev wrote:

> Hello Yves Dassas,
> 
> It looks like the cause for the problem is not in KIM or its underlying 
> services.  Most likely the Java installation you use is incomplete and is 
> missing a localization for javac (resource bundle). You may consider 
> downloading the latest JDK from the official site and trying with it.
> 
> Thank you for using KIM,
> Philip Alexiev
> Software Engineer, KIM team
> 
> On 12 Apr 2012, at 7:43 PM, yves HT wrote:
> 
>> Hello,
>> 
>> I downloaded your KIM platform in order to assess it.
>> 
>> The installation (kim.bat) goes well (see below) until I get the following 
>> java error:
>> 
>> 'Exception in thread "ComponentStarter-Thread-2" java.lang.InternalError: 
>> Cannot find javac resource bundle for locale en_US'
>> 
>> I would be grateful if you could let me know whether this is a known issue?
>> 
>> 
>> "KIM_MAX_JAVA_HEAP = 1g"
>> "JAVA_HOME = C:\Program Files\Java\jdk1.7.0_03"
>> "KIM_HOME = G:\Yves\Matlab\kim-platform-3.0-RC4\bin\.."
>> [INFO] : : : : : : : : KIM SERVER START : : : : : : : :
>> [INFO] KIMService registered on port 1099
>> [INFO] OwlimSchemaRepository: 3.3
>> [INFO] Build date:  06-22-2010 11:57
>> [INFO] Configured parameter 'imports' to 'kb/owl/owl.rdfs;
>>kb/owl/protons.owl;
>>kb/owl/protont.owl;
>>kb/owl/protonu.owl;
>>kb/owl/kimso.owl;
>>kb/owl/kimlo.owl;
>>kb/skos-owl1-dl.rdf;
>>kb/wkb.nt;
>>kb/wkbx.nt;'
>> [INFO] Configured parameter 'defaultNS' to 'http://www.w3.org/2002/07/owl#;
>>   http://proton.semanticweb.org/2006/05/protons#;
>>   http://proton.semanticweb.org/2006/05/protont#;
>>   http://proton.semanticweb.org/2006/05/protonu#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;'
>> [INFO] Configured parameter 'base-URL' to 
>> 'http://www.ontotext.com/kim/2006/05/wkb#'
>> [INFO] Configured parameter 'ruleset' to 'kb/KIMRules.pie'
>> [INFO] Configured parameter 'ftsLiteralsOnly' to 'true'
>> [INFO] Configured parameter 'console-thread' to 'false'
>> [INFO] Configured parameter 'useShutdownHooks' to 'false'
>> [INFO] Configured parameter 'entity-index-size' to '40'
>> [INFO] Configured parameter 'ftsIndexPolicy' to 'onStartup'
>> [INFO] Tokenization regular expression: [\p{L}\d_]+
>> [INFO] Repository fragments: 1
>> [INFO] Inferencer threads: 1
>> [INFO] ftsPolicy = on-startup
>> [INFO] fts: indexing literals only
>> [INFO] Configured parameter 'tuple-index-memory' to '100M'
>> [INFO] Configured parameter 'fts-memory' to '80M'
>> [INFO] Cache pages for tuples: 5241
>> [INFO] Cache pages for predicates: 0
>> [INFO] Cache pages for FTS: 4193
>> [INFO] Configured parameter 'storage-folder' to 'populated'
>> [INFO] Configured parameter 'repository-type' to 'file-repository'
>> Compiled: 
>> 'G:\Yves\Matlab\kim-platform-3.0-RC4\context\default\kb\KIMRules.pie'
>> Exception in thread "ComponentStarter-Thread-2" java.lang.InternalError: 
>> Cannot find javac resource bundle for locale en_US
>>   at 
>> com.sun.tools.javac.util.JavacMessages.getBundles(JavacMessages.java:124)
>>   at 
>> com.sun.tools.javac.util.JavacMessages.setCurrentLocale(JavacMessages.java:73)
>>   at com.sun.tools.javac.util.JavacMessages.<init>(JavacMessages.java:98)
>>   at com.sun.tools.javac.util.JavacMessages.<init>(JavacMessages.java:88)
>>   at com.sun.tools.javac.main.Main.getLocalizedString(Main.java:564)
>>   at com.sun.tools.javac.m

[Kim-discussion] Fwd: KIM installation issue

2012-04-12 Thread Philip Alexiev


Begin forwarded message:

> From: Philip Alexiev 
> Subject: Re: KIM installation issue
> Date: 12 April 2012 7:20:28 PM GMT+03:00
> To: yves HT 
> Cc: KIM discussion 
> 
> Hello Yves Dassas,
> 
> It looks like the cause for the problem is not in KIM or its underlying 
> services.  Most likely the Java installation you use is incomplete and is 
> missing a localization for javac (resource bundle). You may consider 
> downloading the latest JDK from the official site and trying with it.
> 
> Thank you for using KIM,
> Philip Alexiev
> Software Engineer, KIM team
> 
> On 12 Apr 2012, at 7:43 PM, yves HT wrote:
> 
>> Hello,
>> 
>> I downloaded your KIM platform in order to assess it.
>> 
>> The installation (kim.bat) goes well (see below) until I get the following 
>> java error:
>> 
>> 'Exception in thread "ComponentStarter-Thread-2" java.lang.InternalError: 
>> Cannot find javac resource bundle for locale en_US'
>> 
>> I would be grateful if you could let me know whether this is a known issue?
>> 
>> 
>> "KIM_MAX_JAVA_HEAP = 1g"
>> "JAVA_HOME = C:\Program Files\Java\jdk1.7.0_03"
>> "KIM_HOME = G:\Yves\Matlab\kim-platform-3.0-RC4\bin\.."
>> [INFO] : : : : : : : : KIM SERVER START : : : : : : : :
>> [INFO] KIMService registered on port 1099
>> [INFO] OwlimSchemaRepository: 3.3
>> [INFO] Build date:  06-22-2010 11:57
>> [INFO] Configured parameter 'imports' to 'kb/owl/owl.rdfs;
>>kb/owl/protons.owl;
>>kb/owl/protont.owl;
>>kb/owl/protonu.owl;
>>kb/owl/kimso.owl;
>>kb/owl/kimlo.owl;
>>kb/skos-owl1-dl.rdf;
>>kb/wkb.nt;
>>kb/wkbx.nt;'
>> [INFO] Configured parameter 'defaultNS' to 'http://www.w3.org/2002/07/owl#;
>>   http://proton.semanticweb.org/2006/05/protons#;
>>   http://proton.semanticweb.org/2006/05/protont#;
>>   http://proton.semanticweb.org/2006/05/protonu#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;
>>   http://www.ontotext.com/kim/2006/05/wkb#;'
>> [INFO] Configured parameter 'base-URL' to 
>> 'http://www.ontotext.com/kim/2006/05/wkb#'
>> [INFO] Configured parameter 'ruleset' to 'kb/KIMRules.pie'
>> [INFO] Configured parameter 'ftsLiteralsOnly' to 'true'
>> [INFO] Configured parameter 'console-thread' to 'false'
>> [INFO] Configured parameter 'useShutdownHooks' to 'false'
>> [INFO] Configured parameter 'entity-index-size' to '40'
>> [INFO] Configured parameter 'ftsIndexPolicy' to 'onStartup'
>> [INFO] Tokenization regular expression: [\p{L}\d_]+
>> [INFO] Repository fragments: 1
>> [INFO] Inferencer threads: 1
>> [INFO] ftsPolicy = on-startup
>> [INFO] fts: indexing literals only
>> [INFO] Configured parameter 'tuple-index-memory' to '100M'
>> [INFO] Configured parameter 'fts-memory' to '80M'
>> [INFO] Cache pages for tuples: 5241
>> [INFO] Cache pages for predicates: 0
>> [INFO] Cache pages for FTS: 4193
>> [INFO] Configured parameter 'storage-folder' to 'populated'
>> [INFO] Configured parameter 'repository-type' to 'file-repository'
>> Compiled: 
>> 'G:\Yves\Matlab\kim-platform-3.0-RC4\context\default\kb\KIMRules.pie'
>> Exception in thread "ComponentStarter-Thread-2" java.lang.InternalError: 
>> Cannot find javac resource bundle for locale en_US
>>   at 
>> com.sun.tools.javac.util.JavacMessages.getBundles(JavacMessages.java:124)
>>   at 
>> com.sun.tools.javac.util.JavacMessages.setCurrentLocale(JavacMessages.java:73)
>>   at com.sun.tools.javac.util.JavacMessages.<init>(JavacMessages.java:98)
>>   at com.sun.tools.javac.util.JavacMessages.<init>(JavacMessages.java:88)
>>   at com.sun.tools.javac.main.Main.getLocalizedString(Main.java:564)
>>   at com

Re: [Kim-discussion] KIM API webservice function

2012-04-09 Thread Philip Alexiev
Hello Minh Hoang,

The documentation is available online at http://www.ontotext.com/kim. You can 
find the link to the system documentation at the bottom of the page:
http://www.ontotext.com/sites/default/files/kim/KimDocs-3.0-EN.zip

The Web Services API is described there.

All the best,
Philip Alexiev
Software Engineer, KIM team


On 8 Apr 2012, at 1:11 PM, Minh Hoang wrote:

> hi Philip,
> 
> I'm trying to test the KIM web service API to build my client application. I 
> have followed KimDocs-3.0-EN, but the function descriptions are incomplete 
> and I can't understand the input and output of the methods. Can you help me 
> test these functions (with more detailed documentation or an example)?
> 
> Thanks in advance!

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] Rép. : Re: Problem with Large KB Gazetteer and chinese caracters

2012-04-05 Thread Philip Alexiev
Hi Fabian,

Any tokenizer that supports GATE can easily be used with the successor of the 
LKB gazetteer, the LD gazetteer.

We are currently discussing the future and licensing of this component. Expect 
an outcome from us in the coming weeks.

All the best,
Philip

On 5 Apr 2012, at 3:54 PM, Fabian Cretton wrote:

> Hi Philip,
>  
> Sorry to trouble you again
>  
> Is there any way to give the large KB gazetteer a text in Chinese, for 
> instance with white spaces between the characters, or anything else I could 
> do to make it work?
>  
> Or would you have another suggestion for a gazetteer I could use for a large 
> Chinese gazetteer?
>  
> Or could I modify the large KB gazetteer to make it work with Chinese?
>  
> Thanks a lot, any information is welcome as I don't see any other workaround 
> so far and it is quite a problem for me
> Fabian
> 
> >>> Philip Alexiev  05.04.2012 13:44 >>>
> Hello Fabian,
> 
> The LKB gazetteer uses its own tokenization, which is generally 
> whitespace-based. This is why it does not work on Asian texts. 
> 
> Unfortunately we no longer support it. 
> 
> All the best,
> Philip
> 
> On 4 Apr 2012, at 10:59 AM, Fabian Cretton wrote:
> 
>> Dear all,
>>  
>> I am using GATE 6.1 and the large KB Gazetteer. It works just fine with 
>> French and English.
>>  
>> But when I include Chinese 'aliases', no lookups appear for the Chinese words.
>>  
>> Where should I look for a mistake?
>>  
>> The ontology does have labels in Chinese, such as "你好"@zh.
>> They are loaded in OWLIM 4.3 from a .ttl file with encoding "UTF-8 without 
>> BOM"
>> (using the Sesame workbench, if the file is only "UTF-8", I get an error 
>> "Not a valid (absolute) URI: nullhttp [line 1]", so I kept UTF-8 without 
>> BOM).
>>  
>> The display in the Workbench is fine. Chinese characters look correct and 
>> can be queried with SPARQL.
>>  
>> The Large KB Gazetteer is created without an error. I tried both mixing 
>> French/English/Chinese aliases and having only Chinese aliases.
>>  
>> The files for the corpus are loaded in GATE specifying "UTF-8" as encoding. 
>> They are displayed correctly in GATE.
>>  
>> The pipeline runs well, but no lookups on Chinese characters are created.
>> With a gazetteer mixing English/Chinese, the English words do have lookups, 
>> but not the Chinese ones.
>>  
>> Thanks for any help or pointer
>> Fabian
>>  
> 

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] Problem with Large KB Gazetteer and chinese caracters

2012-04-05 Thread Philip Alexiev
Hello Fabian,

The LKB gazetteer uses its own tokenization, which is generally 
whitespace-based. This is why it does not work on Asian texts. 

Unfortunately we no longer support it. 

All the best,
Philip
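
The effect of whitespace-based tokenization on unsegmented text can be illustrated with a much simplified tokenizer. This sketch is not the LKB gazetteer's actual implementation. `WhitespaceTokens` and its methods are hypothetical names, and the only assumption taken from the thread is that tokens are delimited by whitespace.

```java
// Simplified illustration only; this is not the LKB gazetteer's tokenizer.
final class WhitespaceTokens {

    // Splits text into tokens on runs of whitespace.
    static String[] tokens(String text) {
        return text.trim().split("\\s+");
    }

    // True when the alias appears as a complete token of the text.
    static boolean hasToken(String text, String alias) {
        for (String token : tokens(text)) {
            if (token.equals(alias)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String english = "hello world";
        // Four Chinese characters written without spaces; the first two are the
        // alias from the thread ("\u4f60\u597d").
        String chinese = "\u4f60\u597d\u4e16\u754c";
        String alias = "\u4f60\u597d";

        System.out.println("English tokens: " + tokens(english).length);
        System.out.println("Chinese tokens: " + tokens(chinese).length);
        System.out.println("Alias found as a token: " + hasToken(chinese, alias));
    }
}
```

Because the Chinese sentence contains no whitespace, it comes out as a single token, so an alias that covers only part of it never matches a token boundary. Whether inserting spaces would make the real gazetteer produce lookups is not confirmed in this thread.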

On 4 Apr 2012, at 10:59 AM, Fabian Cretton wrote:

> Dear all,
>  
> I am using GATE 6.1 and the large KB Gazetteer. It works just fine with 
> French and English.
>  
> But when I include Chinese 'aliases', no lookups appear for the Chinese words.
>  
> Where should I look for a mistake?
>  
> The ontology does have labels in Chinese, such as "你好"@zh.
> They are loaded in OWLIM 4.3 from a .ttl file with encoding "UTF-8 without 
> BOM"
> (using the Sesame workbench, if the file is only "UTF-8", I get an error "Not 
> a valid (absolute) URI: nullhttp [line 1]", so I kept UTF-8 without BOM).
>  
> The display in the Workbench is fine. Chinese characters look correct and can 
> be queried with SPARQL.
>  
> The Large KB Gazetteer is created without an error. I tried both mixing 
> French/English/Chinese aliases and having only Chinese aliases.
>  
> The files for the corpus are loaded in GATE specifying "UTF-8" as encoding. 
> They are displayed correctly in GATE.
>  
> The pipeline runs well, but no lookups on Chinese characters are created.
> With a gazetteer mixing English/Chinese, the English words do have lookups, 
> but not the Chinese ones.
>  
> Thanks for any help or pointer
> Fabian
>  

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] Rép. : Re: Large KB Gazetteer - information about the gazetteer size/system configuration

2012-03-28 Thread Philip Alexiev
Hello,

My opinion is that you will have no problems loading 10-30 million aliases in 
the dictionary. It will just take longer to initialize and load. Nevertheless, 
the task should be completely feasible. The difference in matching performance 
(if any) should be insignificant, as the gazetteer uses a hash to store the 
entries.

In other words: Just go ahead and do it.

Hope this helps,
Philip
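
The reasoning about hash-based storage can be sketched with a plain `HashMap`: a lookup costs roughly the same on average whether the map holds thousands or millions of entries, while memory use and load time grow with the dictionary. `AliasDictionary` is a hypothetical stand-in for illustration and does not reflect the gazetteer's internal data structures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the gazetteer's implementation.
final class AliasDictionary {

    private final Map<String, String> aliasToUri = new HashMap<String, String>();

    // Registers an alias for an instance URI.
    void add(String alias, String uri) {
        aliasToUri.put(alias, uri);
    }

    // Returns the URI for a token, or null when the token is not a known alias.
    String lookup(String token) {
        return aliasToUri.get(token);
    }

    int size() {
        return aliasToUri.size();
    }

    public static void main(String[] args) {
        AliasDictionary dictionary = new AliasDictionary();
        // Synthetic entries; a real dictionary would be loaded from the knowledge base.
        for (int i = 0; i < 100000; i++) {
            dictionary.add("alias" + i, "http://example.org/instance/" + i);
        }
        System.out.println("Entries: " + dictionary.size());
        System.out.println("Lookup: " + dictionary.lookup("alias4242"));
    }
}
```

Raising the loop bound mainly increases the time and heap needed to build the map, which matches the expectation above that larger dictionaries take longer to initialize and load.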

On 28 Mar 2012, at 4:36 PM, Fabian Cretton wrote:

> Thank you very much Philip.
>  
> But do you mean that it works ok with 50'000 to 100'000 aliases ?
>  
> For my tests, the ANNIE standard gazetteer works very well with 200'000 
> aliases (wordnet).
>  
> But now, I will need much more.
> Is it stupid to try the large KB Gazetteer with 10'000'000 aliases (10 
> millions) or even 30 millions aliases  ? should I right away look for another 
> option ?
>  
> Thanks
> Fabian
> 
> >>> Philip Alexiev  28.03.2012 15:22 >>>
> Hello Fabian,
> 
> Unfortunately we do not have benchmarks for the gazetteer. The typical setup 
> uses a dictionary of 50-100k aliases, in which case the performance is good 
> even on a desktop machine. 
> 
> The performance of the gazetteer depends on the dictionary size and does not 
> depend on the number of documents processed. The gazetteer resource is 
> stateless: it behaves the same on one document as across 1000 documents. 
> 
> When annotating huge sets of documents, the important thing is to delete the 
> GATE document resource after it is annotated. This example code from the GATE 
> group demonstrates how to annotate documents in a client:
> 
> http://gate.ac.uk/wiki/code-repository/src/sheffield/examples/BatchProcessApp.java
> 
> And deleting the document resource:
> Factory.deleteResource(doc);
> 
> 
> For further questions on this matter you can also refer directly to Gate's 
> mailing list.
> 
> Hth,
> Philip Alexiev
> Software Engineer, KIM team
> 
> 
> 
> On 28 Mar 2012, at 9:41 AM, Fabian Cretton wrote:
> 
>> Dear all,
>>  
>> I am running the large KB Gazetteer smoothly on my desktop computer with 8GB 
>> RAM, for 2 million aliases, on a small text file of 20 lines.
>> With this configuration, the standard GATE gazetteer already won't do the 
>> job.
>>  
>> Is it possible to have some small examples of what can be done with bigger 
>> sets?
>> On small text files, on my desktop computer, will I be able to run the 
>> gazetteer for 10 million or 50 million entries?
>> If not, what kind of server would I need?
>>  
>> Thanks a lot for that information
>> Fabian
> 

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion


Re: [Kim-discussion] Large KB Gazetteer - information about the gazetteer size/system configuration

2012-03-28 Thread Philip Alexiev
Hello Fabian,

Unfortunately we do not have benchmarks for the gazetteer. The typical setup 
uses a dictionary of 50-100k aliases, in which case the performance is good 
even on a desktop machine. 

The performance of the gazetteer depends on the dictionary size and does not 
depend on the number of documents processed. The gazetteer resource is 
stateless: it behaves the same on one document as across 1000 documents. 

When annotating huge sets of documents, the important thing is to delete the 
GATE document resource after it is annotated. This example code from the GATE 
group demonstrates how to annotate documents in a client:

http://gate.ac.uk/wiki/code-repository/src/sheffield/examples/BatchProcessApp.java

And deleting the document resource:
Factory.deleteResource(doc);


For further questions on this matter you can also refer directly to Gate's 
mailing list.

Hth,
Philip Alexiev
Software Engineer, KIM team



On 28 Mar 2012, at 9:41 AM, Fabian Cretton wrote:

> Dear all,
>  
> I am running the large KB Gazetteer smoothly on my desktop computer with 8GB 
> RAM, for 2 million aliases, on a small text file of 20 lines.
> With this configuration, the standard GATE gazetteer already won't do the job.
>  
> Is it possible to have some small examples of what can be done with bigger 
> sets?
> On small text files, on my desktop computer, will I be able to run the 
> gazetteer for 10 million or 50 million entries?
> If not, what kind of server would I need?
>  
> Thanks a lot for that information
> Fabian


Re: [Kim-discussion] Rép. : Re: Large KB Gazetteer

2012-03-28 Thread Philip Alexiev
Hello Fabian,

The matter is being discussed at the moment.  When we come out with a final 
statement I will inform you directly.

Thank you for your interest.

All the best,
Philip

On 26 Mar 2012, at 1:51 PM, Fabian Cretton wrote:

> Hello Philip,
>  
> That helped a lot!
>  
> Will the Linked Data Gazetteer really become available ? if yes, do you know 
> when ?
>  
> Thanks again
> Fabian
> 
> >>> Philip Alexiev  26.03.2012 11:50 >>>
> Hello Fabian,
> 
> This is the right place to ask questions related to the LKB gazetteer, as it 
> is developed by Ontotext. I will apply my answers inline - under your 
> questions.
> 
> On 26 Mar 2012, at 9:18 AM, Fabian Cretton wrote:
> 
>> Dear all,
>>  
>> I write to this list as this seems to be the place to ask questions about 
>> the large KB Gazetteer. This question is not directly related to KIM, thank 
>> you to redirect me to a more appropriate list if needed.
>>  
>> I have only done little tests with the large KB Gazetteer in Gate (not with 
>> KIM), and there are a few things about which I couldn't find more 
>> information:
>> - can the large KB Gazetteer use the output of a lemmatisation ? if not, is 
>> there so far no way to use Gate with a very large gazetteer, but doing 
>> lookups also on lemmas ?
> 
> We have also met with this requirement. The LKB Gazetteer does not support 
> this functionality. Internally, we are using its successor - the Linked Data 
> Gazetteer (LD Gazetteer). Unfortunately it is still not publicly released. 
> 
>> - when the large KB Gazetteer is used with a traditional gate gazetteer (a 
>> .lst file), is it possible to add features to each entry in the list, as in 
>> the ANNIE gazetteer ?
> 
> The LKB Gazetteer has the capability to only set class and instance features 
> of the annotations, in this way relating them to instances in the semantic 
> repository. Thus the name - semantic annotations. You can use JAPE rules, or 
> write your own resource to set the features you desire. If you provide some 
> more information about the scenario, we could perhaps help you more.
> 
>> - I see a strange behaviour working with Gate 6.1, the large KB gazetteer, 
>> and connecting to an OWLIM 4.3 store:
>> the initial SPARQL query gives:
>> ** Loading completed: 2414240 aliases in 213 second(s).
>> but when restarting the project:
>> ** Aliases in IGNORE list:0
>> ** Loading of trusted entities from C:\Fab\Semantic 
>> Web\SoftCust\Ontology\gate_largeKB_remoteRep\kim.trusted.entities.cache
>> ** 1662791 elements loaded.
>> Is it normal those figures are not the same ?
> 
> My opinion is that this is caused either by an ignore list (although the log 
> shows that this list is empty)  or  duplicate records. A bunch of identical 
> statements in the result will cause a single record in the dictionary to be 
> created.  Try putting a "distinct" to your gazetteer query.
> 
>>  
>> Thank you for any information
>> Fabian
> 
> Hope this helps,
> Philip Alexiev
> Software Engineer, KIM team
> 


Re: [Kim-discussion] Large KB Gazetteer

2012-03-26 Thread Philip Alexiev
Hello Fabian,

This is the right place to ask questions related to the LKB Gazetteer, as it is 
developed by Ontotext. I will put my answers inline, under your questions.

On 26 Mar 2012, at 9:18 AM, Fabian Cretton wrote:

> Dear all,
>  
> I write to this list as this seems to be the place to ask questions about the 
> large KB Gazetteer. This question is not directly related to KIM, thank you 
> to redirect me to a more appropriate list if needed.
>  
> I have only done little tests with the large KB Gazetteer in Gate (not with 
> KIM), and there are a few things about which I couldn't find more information:
> - can the large KB Gazetteer use the output of a lemmatisation ? if not, is 
> there so far no way to use Gate with a very large gazetteer, but doing 
> lookups also on lemmas ?

We have also run into this requirement. The LKB Gazetteer does not support this 
functionality. Internally, we are using its successor, the Linked Data 
Gazetteer (LD Gazetteer). Unfortunately, it has not yet been publicly released. 

> - when the large KB Gazetteer is used with a traditional gate gazetteer (a 
> .lst file), is it possible to add features to each entry in the list, as in 
> the ANNIE gazetteer ?

The LKB Gazetteer can set only the class and instance features of the 
annotations, thereby relating them to instances in the semantic repository; 
hence the name "semantic annotations". You can use JAPE rules, or write your 
own resource, to set any other features you need. If you provide some more 
information about the scenario, we could perhaps help you more.

> - I see a strange behaviour working with Gate 6.1, the large KB gazetteer, 
> and connecting to an OWLIM 4.3 store:
> the initial SPARQL query gives:
> ** Loading completed: 2414240 aliases in 213 second(s).
> but when restarting the project:
> ** Aliases in IGNORE list:0
> ** Loading of trusted entities from C:\Fab\Semantic 
> Web\SoftCust\Ontology\gate_largeKB_remoteRep\kim.trusted.entities.cache
> ** 1662791 elements loaded.
> Is it normal those figures are not the same ?

I think this is caused either by an ignore list (although the log shows that 
this list is empty) or by duplicate records. Several identical rows in the 
query result produce only a single record in the dictionary. Try adding 
"distinct" to your gazetteer query.
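
A minimal Python sketch of why the two counts can differ, assuming the 
dictionary keeps each (alias, instance, class) row only once. This is an 
illustration only, not the gazetteer's actual implementation; the row values 
and the build_dictionary helper are made up for the example.

```python
# Hypothetical query result rows: (alias, instance, class).
# The names below are invented for illustration only.
rows = [
    ("Rooney", "ex:Rooney", "ex:Player"),
    ("Rooney", "ex:Rooney", "ex:Player"),        # identical row returned twice
    ("Wayne Rooney", "ex:Rooney", "ex:Player"),
]


def build_dictionary(result_rows):
    """Collapse identical rows into one record each (assumed behaviour)."""
    return set(result_rows)


dictionary = build_dictionary(rows)
print(len(rows), "rows returned,", len(dictionary), "distinct records kept")
```

With "distinct" in the query, the repository would return the two distinct 
rows in the first place, so both counts should agree.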

>  
> Thank you for any information
> Fabian

Hope this helps,
Philip Alexiev
Software Engineer, KIM team



Re: [Kim-discussion] true way to merge my class with protont class?

2012-03-22 Thread Philip Alexiev
Hi,

The best way to see what happens is to check what goes into the gazetteer 
dictionary, that is, what the gazetteer query returns.

From what I see in your gazetteer setup, the path does not seem to be correct. 
Your euro.gapp file should be located in KIM/context/default/resources and the 
FeedSetupPath should be something like:
FeedSetupPath=$relpath$../../../context/default/resources/gazetteer/euro


You can set the log level of the semantic repository logger to DEBUG in order 
to track every query that is executed in the semantic repository:


#SEMANTIC_REPOSITORY
log4j.category.SEMANTIC_REPOSITORY = DEBUG, aSEMANTIC_REPOSITORY


Then, if you clear the cache (rm -rf KIM/context/default/populated) and run 
the server, you should see your own gazetteer query being executed in the logs 
(and on the standard output). If the path in FeedSetupPath is not correct, the 
gazetteer will use the default query, and that is what you will see in the log.

The output that follows should look similar to this:

...
[INFO] Loaded 1 aliases in 0 second(s).
[INFO] Loaded 1 aliases in 3 second(s).
[INFO] Loaded 2 aliases in 5 second(s).
[INFO] Loaded 3 aliases in 6 second(s).
...

showing that records are being added to the dictionary.

It is advisable to execute the gazetteer query separately to see the exact 
information that will form the gazetteer dictionary. You can do this in at 
least two ways:
- Use some of the tests provided in the documentation and evaluate the query 
through the SemanticRepositoryAPI.
- Use JVisualVM with the MBeans plugin to connect through JMX to the KIM 
server. There is a bean, com.ontotext.kim.client.SemanticRepositoryMgmt, that 
has some useful methods to evaluate queries against KIM's semantic repository. 
You can use the List evaluate(String query, String language) operation to 
execute the gazetteer query.

You should then see the records for "Rooney" in the output. There may be more 
than one.

You can then inspect the RDF for the entity Rooney itself using the same 
mechanisms. This will give you a hint as to its type. 
Keep in mind that all the statements from your imported RDF files go into 
the "explicit" graph and all the inferred statements go into the "implicit" 
graph in OWLIM. That is, if you want to get the real class of an entity, you 
can do it with the query:

select ?type 
from <http://www.ontotext.com/explicit> 
where { my:entity  a ?type }

All the inferred types will go in the "implicit" graph. Usually these are the 
parent classes of the direct entity type. You can query them accordingly:

select ?type 
from <http://www.ontotext.com/implicit> 
where { my:entity  a ?type }
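
The two queries above differ only in the graph they read from, so they can be 
generated from one small helper. The following Python sketch only builds the 
query text; the type_query function and the example entity IRI are assumptions 
made for illustration, and running the query against KIM is left to whichever 
of the mechanisms above you prefer.

```python
# Graph IRIs as given in the queries above.
EXPLICIT = "http://www.ontotext.com/explicit"
IMPLICIT = "http://www.ontotext.com/implicit"


def type_query(entity_iri, graph_iri):
    """Return a SPARQL query selecting the types of entity_iri from one graph."""
    return (
        "select ?type\n"
        f"from <{graph_iri}>\n"
        f"where {{ <{entity_iri}> a ?type }}"
    )


# Hypothetical entity IRI, used only for illustration.
rooney = "http://example.org/euro#Rooney"
print(type_query(rooney, EXPLICIT))   # asserted (real) class
print(type_query(rooney, IMPLICIT))   # inferred classes
```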

This is actually the process we use when implementing a new scenario. These 
tools will give you insight into what is actually going on under the hood.


Hope this helps.
Philip Alexiev
Software Engineer , KIM team



On 22 Mar 2012, at 5:55 AM, Minh Hoang wrote:

> here is my config:
> + my euro ontology:
> https://lh4.googleusercontent.com/--93t-13XbRI/T2qYm7eaVcI/KLw/f1ujZ0HcMRw/s1600/Aloxovn.com-Capture.PNG
> + in KIM\context\default\kb\euro have folow file:
>   euro.owl
>   euro_kim.nt
>   euro_labels.nt
>   euro_proton.nt
> 
> + in euro_kim.nt:
> <http://www.semanticweb.org/ontologies/2011/10/Euro.owl> 
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> <http://proton.semanticweb.org/2006/05/protons#Trusted> .
> 
> + in euro_labels.nt has nothing (because i don't create any instance yet, 
> just want to annotate with true class)
> 
> + in euro_protont.nt:
> <http://www.semanticweb.org/ontologies/2011/10/Euro.owl#Agent> 
> <http://www.w3.org/2000/01/rdf-schema#subClassOf> 
> <http://proton.semanticweb.org/2006/05/protont#Agent> .
> <http://www.semanticweb.org/ontologies/2011/10/Euro.owl#Location> 
> <http://www.w3.org/2000/01/rdf-schema#subClassOf> 
> <http://proton.semanticweb.org/2006/05/protont#Location> .
> <http://www.semanticweb.org/ontologies/2011/10/Euro.owl#FootballEvent> 
> <http://www.w3.org/2000/01/rdf-schema#subClassOf> 
> <http://proton.semanticweb.org/2006/05/protont#Event> .
> <http://www.semanticweb.org/ontologies/2011/10/Euro.owl#FootballConcept> 
> <http://www.w3.org/2000/01/rdf-schema#subClassOf> 
> <http://proton.semanticweb.org/2006/05/protont#GeneralTerm> .
> <http://www.semanticweb.org/ontologies/2011/10/Euro.owl#FootballOrganization> 
> <http://www.w3.org/2000/01/rdf-schema#subClassOf> 
> <http://proton.semanticweb.org/2006/05/protont#Organization> .
> 
> + include these file in owlim.ttl:
>   ...
>   kb/euro/euro.owl;
>   kb/euro/euro_kim.nt;
>   kb/euro/euro_labels.nt;
>   kb/euro/euro_proton.

Re: [Kim-discussion] true way to merge my class with protont class?

2012-03-21 Thread Philip Alexiev
Can you provide more information? What exact customizations are you 
applying to the default installation?  

On 21 Mar 2012, at 6:43 PM, Minh Hoang wrote:

> hi Philip,
> 
> thank you for quick reply.
> how about my first question : "What happen if i sub-class to protons:Entity 
> with all of class in euro ontology?is this a reason to make KIM annotating 
> with wrong class?" . i want to know what i only need to do making it matches.
> 
> thank!
> 
> On Wed, Mar 21, 2012 at 11:14 PM, Philip Alexiev 
>  wrote:
> Hi  Minh Hoang,
> 
> What types are the annotations depends on several resources. 
> 
> First the gazetteer matches the objects and creates Lookup annotations. What 
> phrases are filled in the gazetteer's dictionary depends on the query it 
> executes to the semantic repository (Owlim).  The default query is:
> 
> select LA, I, DC from 
> (
>   {TI} rdf:type 
> {<http://proton.semanticweb.org/2006/05/protons#Trusted>}, 
>   {I} <http://proton.semanticweb.org/2006/05/protons#generatedBy> {TI}, 
>   {I} rdf:type {<http://proton.semanticweb.org/2006/05/protons#Entity>}, 
>   {I} serql:directType {DC}, 
>   {I} <http://proton.semanticweb.org/2006/05/protons#hasAlias> {} 
> rdfs:label {LA}; 
>   [<http://www.ontotext.com/kim/2006/05/kimlo#ignoredAlias> {IG}]
> ) 
> UNION 
> (
>   {I} <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> {<http://www.ontotext.com/kim/2006/05/kimlo#NERLexica>},  
>   {I} serql:directType {DC},  
>   {I} <http://www.w3.org/2000/01/rdf-schema#label> {LA}
> ) 
> WHERE 
>   IG = NULL 
>   AND isLiteral(LA) 
>   AND DC != <http://proton.semanticweb.org/2006/05/protont#JobPosition> 
>   AND NOT LA = ""
> 
> It can be overwritten by  providing a  query.txt file  containing either a 
> SERQL or SPARQL query in the config/ folder of the KIM installation.  
> Executing this query directly will show you what will be filled in the 
> dictionaries and what ontology class will be associated with the newly 
> created Lookup annotation. 
> 
> This line from the query:
> 
> {I} serql:directType {DC},
> 
> stands to say, that we want only the direct type of the entity, and not the 
> inferred types. If we used rdf:type, then the gazetteer will create 
> additional Lookup for each inferred type also, because they will be in the 
> result of the query.   So what type will be associated with the lookup you 
> can control by changing the query and you can check by executing the query.
> 
> This is the first step. The other is to create a jape rule that actually 
> transforms the Lookup annotation with this class to an annotation of a 
> specific type. Like Lookup with type protont:Person to a Person annotation.  
> An important thing to note is that in order for this annotation to stay, it 
> must be included in the whitelist of entity annotation in KIM/nerc.properties 
>  in the property -  com.ontotext.kim.KIMConstants.IE_ANN_TYPES .
> 
> Another thing to have in mind: when the query for the gazetteer is changed, 
> the already generated dictionary should be cleared, in order for it to be 
> generated again. You can check where is the cache by checking the 
> staticDictSerializationPath property of the LKB Gazetteer gate resource.  In 
> the default KIM installation it is in KIM/context/default/populated/cache.  
> So deleting this folder and running the server again will cause the 
> dictionaries to be regenerated.
> 
> 
> Hope this helps,
> Philip Alexiev
> Software Engineer, KIM team
> 
> 
> On 21 Mar 2012, at 4:25 PM, Minh Hoang wrote:
> 
>> Hi all,
>> 
>> I'm new to KIM platform. i have folowed KimDocs-3.0-EN to merge my ontology 
>> (euro ontology) with protont. euro ontology has some class that i am not 
>> sure to set subclass with any class in proton. KimDoc said that i should 
>> sub-class with protons:Entity. but i am not sure it right because KIM alway 
>> annotate with wrong class (for example: Rooney->person, but in euro ontology 
>> Rooney must be a player). so this is my question:
>> What happen if i sub-class to protons:Entity with all of class in euro 
>> ontology?is this a reason to make KIM annotating with wrong class?if not, 
>> tell me how to make it true?
>> 
>> p/s : sorry for some wrong english :)


Re: [Kim-discussion] true way to merge my class with protont class?

2012-03-21 Thread Philip Alexiev
Hi  Minh Hoang,

The types of the annotations depend on several resources. 

First, the gazetteer matches the objects and creates Lookup annotations. Which 
phrases end up in the gazetteer's dictionary depends on the query it 
executes against the semantic repository (OWLIM). The default query is:

select LA, I, DC from 
(
  {TI} rdf:type {<http://proton.semanticweb.org/2006/05/protons#Trusted>}, 
  {I} <http://proton.semanticweb.org/2006/05/protons#generatedBy> {TI}, 
  {I} rdf:type {<http://proton.semanticweb.org/2006/05/protons#Entity>}, 
  {I} serql:directType {DC}, 
  {I} <http://proton.semanticweb.org/2006/05/protons#hasAlias> {} rdfs:label {LA}; 
      [<http://www.ontotext.com/kim/2006/05/kimlo#ignoredAlias> {IG}]
) 
UNION 
(
  {I} <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> {<http://www.ontotext.com/kim/2006/05/kimlo#NERLexica>},  
  {I} serql:directType {DC},  
  {I} <http://www.w3.org/2000/01/rdf-schema#label> {LA}
) 
WHERE 
  IG = NULL 
  AND isLiteral(LA) 
  AND DC != <http://proton.semanticweb.org/2006/05/protont#JobPosition> 
  AND NOT LA = ""

It can be overridden by providing a query.txt file, containing either a 
SeRQL or a SPARQL query, in the config/ folder of the KIM installation. 
Executing this query directly will show you what goes into the dictionaries 
and which ontology class will be associated with the newly created Lookup 
annotation. 

This line from the query:

{I} serql:directType {DC},

says that we want only the direct type of the entity, and not the inferred 
types. If we used rdf:type, the gazetteer would create an additional Lookup 
for each inferred type as well, because those types would be in the result of 
the query. So you control which type is associated with the Lookup by 
changing the query, and you can check it by executing the query.
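
To make the difference concrete, here is a small Python sketch that mimics the 
rows the gazetteer query would return in the two cases. It does not talk to 
OWLIM; the class hierarchy, the alias and the function names are hypothetical 
and chosen only to mirror the Rooney example.

```python
# Hypothetical class hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "euro:Player": "protont:Person",
    "protont:Person": "protont:Agent",
}


def all_types(direct_type):
    """Return the direct type followed by every inferred superclass."""
    types = [direct_type]
    while types[-1] in SUBCLASS_OF:
        types.append(SUBCLASS_OF[types[-1]])
    return types


def lookups(alias, direct_type, direct_only):
    """One (alias, class) pair per row the gazetteer query would return."""
    classes = [direct_type] if direct_only else all_types(direct_type)
    return [(alias, cls) for cls in classes]


# serql:directType: a single Lookup with the most specific class.
print(lookups("Rooney", "euro:Player", direct_only=True))
# rdf:type: one Lookup per asserted or inferred class.
print(lookups("Rooney", "euro:Player", direct_only=False))
```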

This is the first step. The second is to create a JAPE rule that actually 
transforms the Lookup annotation with this class into an annotation of a 
specific type, for example a Lookup with type protont:Person into a Person 
annotation. An important thing to note is that, in order for this annotation 
to be kept, its type must be included in the whitelist of entity annotations 
in KIM/nerc.properties, in the property 
com.ontotext.kim.KIMConstants.IE_ANN_TYPES .

Another thing to keep in mind: when the gazetteer query is changed, the 
already generated dictionary must be cleared so that it is generated again. 
You can find the cache location by checking the staticDictSerializationPath 
property of the LKB Gazetteer GATE resource. In the default KIM installation 
it is KIM/context/default/populated/cache. Deleting this folder and running 
the server again will cause the dictionaries to be regenerated.
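
If you change the query often, the cache clean-up can be scripted. A minimal 
Python sketch, assuming the default cache location mentioned above; the 
clear_gazetteer_cache helper is made up for this example and is not part of 
KIM. Stop the server before running it.

```python
import shutil
from pathlib import Path


def clear_gazetteer_cache(kim_home):
    """Delete <kim_home>/context/default/populated/cache.

    Returns True if the folder existed and was removed, False otherwise.
    The location is the default one; adjust it if your
    staticDictSerializationPath points elsewhere.
    """
    cache_dir = Path(kim_home) / "context" / "default" / "populated" / "cache"
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False

# Example call (the path is a placeholder for your installation directory):
# clear_gazetteer_cache("/opt/KIM")
```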


Hope this helps,
Philip Alexiev
Software Engineer, KIM team


On 21 Mar 2012, at 4:25 PM, Minh Hoang wrote:

> Hi all,
> 
> I'm new to the KIM platform. I have followed KimDocs-3.0-EN to merge my 
> ontology (euro ontology) with protont. The euro ontology has some classes 
> that I am not sure how to sub-class from any class in PROTON. KimDocs said 
> that I should sub-class protons:Entity, but I am not sure this is right, 
> because KIM always annotates with the wrong class (for example: 
> Rooney->person, but in the euro ontology Rooney must be a player). So this is 
> my question: what happens if I sub-class all of the classes in the euro 
> ontology from protons:Entity? Is this the reason KIM annotates with the wrong 
> class? If not, how can I make it work correctly?
> 
> p/s : sorry for some wrong english :)


Re: [Kim-discussion] Problem in getting ontology class using KIM API

2012-03-08 Thread Philip Alexiev
Hello Cong,

I would need to see the actual ontology to get a clearer picture, but generally 
it is not a good idea to identify classes (that is, to create annotations for 
classes). Classes serve only to group and classify objects. Objects are the 
entities detected in texts. So I suggest you make "coach" an instance of 
some class (for example, protont:JobPosition). This position will then be 
identified by the gazetteer and a Lookup annotation will be created for it. 
You can then have a JAPE rule transform this Lookup into a JobPosition 
annotation. This annotation type should also be present in the annotation 
types whitelist in the nerc.properties file, in the property 
com.ontotext.kim.KIMConstants.IE_ANN_TYPES .

If you send the ontology, I will try to give more specific advice.

Hope this helps

Philip Alexiev
Software Engineer, KIM team

On 7 Mar 2012, at 1:09 AM, Cong Nguyen wrote:

> Hello Philip and Kim team,
> I want to annotate all of text which are the same as label of all classes in 
> my ontology, for example, i have class http://example#Coach rdfs:label 
> "Coach"@en, and a document "Alex Ferguson is a coach" will have annotation 
> "coach". I try to use jape rule, but, it doesn't work, even though with 
> example in Gate Jape Tutorial!!!. So i use KIM API, configure KIM not 
> removing Token annotation, and compare all the token with classes retrieving 
> from KIM API (Ontology.getSubClasses), but i must also configure all of my 
> class being visible in KIM. It'll be ok if there is only a small ontology, 
> but not if there is 2-3 ontology with hundreds of class
> Please show me how to automatic configuring class's visibility in KIM, or 
> suggest me some advices.
> Thank you and regards,
> Cong
> 
> -- 
> Cong Hoang Nguyen
> University: Hanoi University of Science and Techonology.
> Email: congnh0...@gmail.com
> Facebook: http://www.facebook.com/monday0rsunday
> YH: congnh0902
> Skype: monday0rsunday
> Phone: (+84)1678565200
> 



Re: [Kim-discussion] Inst feature doesn't match Class feature

2012-02-21 Thread Philip Alexiev
Hello Cong,

Using separate data sources almost always introduces the problem of identical 
objects coming from different sources.

Naturally, the best quality is achieved through manual evaluation and merging 
of the identical records. Such entities can be detected with a simple query 
that retrieves each set of entities sharing an identical label. The conflicts 
can then be either merged or filtered. In reality, this process is slow and 
ineffective for big data sets and for datasets that are updated regularly. It 
is preferred for data of manageable size that does not change over time.
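
As an illustration of the detection step, the following Python sketch groups 
entities by label and keeps only the labels shared by more than one entity. 
It works on an in-memory list rather than on the repository, and the 
wkb:Person_T.1 entry is a hypothetical duplicate added for the example.

```python
from collections import defaultdict


def entities_sharing_a_label(labelled_entities):
    """Map each label to the sorted entities carrying it, keeping only conflicts."""
    by_label = defaultdict(set)
    for entity, label in labelled_entities:
        by_label[label].add(entity)
    return {
        label: sorted(entities)
        for label, entities in by_label.items()
        if len(entities) > 1
    }


# (entity, label) pairs as they might come from two data sources.
pairs = [
    ("football:Coach_AlexFerguson", "Alex Ferguson"),
    ("wkb:Person_T.1", "Alex Ferguson"),          # hypothetical duplicate
    ("wkb:Person_T.668.0", "Derek Ferguson"),
]
print(entities_sharing_a_label(pairs))
```

Each entry in the result is a candidate set for manual review before merging 
or filtering.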

A possible automated approach is to identify those sets of identical entities 
and then, after some human evaluation, mark them as identical for OWLIM. This 
can be done with a special predicate from OWL, owl:sameAs. OWLIM supports it 
and even has some optimizations for it. More information can be found here: 
http://www.ontotext.com/owlim/owl-sameas-optimisation . 
Nevertheless, this is not a perfect solution, because the resulting merged 
object will be of type Person as well as Head-Coach. So the gazetteer query 
will return one record for "Alex Ferguson" as Person and one as Head-Coach. 

Various techniques can be used to manage this case, such as tuning the 
gazetteer query to return only the lowest subclass in the hierarchy (assuming 
that identical entities are of identical types or subclass types), or managing 
the conflicts using JAPE rules. Nevertheless, the knowledge base is usually 
static over time, so manually resolving those cases is the preferred approach.
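
The "lowest subclass" idea can be sketched in a few lines of Python. This is 
only an illustration of the selection logic, not a gazetteer query; the class 
hierarchy and the function names are assumptions made for the example.

```python
# Hypothetical hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "football:Head-Coach": "protont:Person",
    "protont:Person": "protont:Agent",
}


def ancestors(cls):
    """Return the set of all superclasses of cls."""
    found = set()
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.add(cls)
    return found


def most_specific(classes):
    """Drop every class that is a superclass of another candidate."""
    classes = set(classes)
    covered = set().union(*(ancestors(c) for c in classes))
    return sorted(classes - covered)


# The merged "Alex Ferguson" entity carries both types; keep the lowest one.
print(most_specific(["protont:Person", "football:Head-Coach"]))
```

If the candidate classes are unrelated in the hierarchy, all of them are 
kept, and the conflict still has to be resolved by other means.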

Hope this helps,
Philip Alexiev


On 20 Feb 2012, at 11:26 PM, Cong Nguyen wrote:

> Hello Philip,
> In my KB, there are only statements about Coach and Player:
> football:Coach_AlexFerguson a football:Head-Coach;
>   rdfs:label "Alex Ferguson";
>   protons:mainLabel "Alex Ferguson";
>   protons:hasMainAlias football:SirAlexFerguson.
> football:SirAlexFerguson  a protons:Alias;
>   rdfs:Label "Sir Alex Ferguson";
>   protons:mainLabel "Sir Alex Ferguson".
> football:Coach_AlexFerguson protons:generatedBy 
> <http://www.semanticweb.org/ontologies/2011/10/Ontology1321965120856>.
> football:SirAlexFerguson protons:generatedBy 
> <http://www.semanticweb.org/ontologies/2011/10/Ontology1321965120856>.
> 
> and i found that there are also some statements related to Ferguson in 
> default KB wkb.nt, for example:
> <http://www.ontotext.com/kim/2006/05/wkb#PersonFirstMale_T.2403> 
> <http://www.w3.org/2000/01/rdf-schema#label> "Ferguson" .
> <http://www.ontotext.com/kim/2006/05/wkb#Person_T.668.0> 
> <http://www.w3.org/2000/01/rdf-schema#label> "Derek Ferguson" .
> But i want to customize KIM using all of KBs, so i don't want to remove 
> statement like this. Is there any way to change the priority of statement 
> among them?
> 
> 
> 2012/2/20 Philip Alexiev 
> Hello Cong,
> 
> The LKB gazetteer is the resource , which sets the class and inst features of 
> these annotations. The gazetteer uses the knowledge base to fill its 
> dictionaries. The query which retrieves the data for the dictionaries by 
> default is this one:
> 
> select LA, I, DC from 
> (
>   {TI} rdf:type {<http://proton.semanticweb.org/2006/05/protons#Trusted>}, 
>   {I} <http://proton.semanticweb.org/2006/05/protons#generatedBy> {TI}, 
>   {I} rdf:type {<http://proton.semanticweb.org/2006/05/protons#Entity>}, 
>   {I} serql:directType {DC}, 
>   {I} <http://proton.semanticweb.org/2006/05/protons#hasAlias> {} rdfs:label 
> {LA}; 
>   [<http://www.ontotext.com/kim/2006/05/kimlo#ignoredAlias> {IG}]
> ) 
> UNION 
> (
>   {I} <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> {<http://www.ontotext.com/kim/2006/05/kimlo#NERLexica>},  
>   {I} serql:directType {DC},  
>   {I} <http://www.w3.org/2000/01/rdf-schema#label> {LA}
> ) 
> 
> WHERE IG = NULL AND isLiteral(LA) AND DC != 
> <http://proton.semanticweb.org/2006/05/protont#JobPosition> AND NOT LA = ""
> 
> You can also specify your own query in a  query.txt  file in KIM's  config/  
> directory.
> 
> The type of every instance is retrieved with this construct:
>   {I} serql:directType {DC}
> 
> DC is the explicitly set class of this instance. 
> 
> In your case it looks like the Ferguson has an implicitly set type - Person. 
> Make sure the only explicit type statement for the coaches is like this one:
> 
> example:Furguson  rdf:type  example:Coach
> 
> 
> Hope this helps,
> Philip Alexiev
> Software Engineer, KIM team
> 
> 
> On 18 Feb 2012, at 10:18 PM, Cong Nguyen wrote:
> 

Re: [Kim-discussion] Inst feature doesn't match Class feature

2012-02-20 Thread Philip Alexiev
Hello Cong,

The LKB Gazetteer is the resource that sets the class and inst features of 
these annotations. The gazetteer uses the knowledge base to fill its 
dictionaries. By default, the query that retrieves the data for the 
dictionaries is this one:

select LA, I, DC from 
(
  {TI} rdf:type {<http://proton.semanticweb.org/2006/05/protons#Trusted>}, 
  {I} <http://proton.semanticweb.org/2006/05/protons#generatedBy> {TI}, 
  {I} rdf:type {<http://proton.semanticweb.org/2006/05/protons#Entity>}, 
  {I} serql:directType {DC}, 
  {I} <http://proton.semanticweb.org/2006/05/protons#hasAlias> {} rdfs:label {LA}; 
      [<http://www.ontotext.com/kim/2006/05/kimlo#ignoredAlias> {IG}]
) 
UNION 
(
  {I} <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> {<http://www.ontotext.com/kim/2006/05/kimlo#NERLexica>},  
  {I} serql:directType {DC},  
  {I} <http://www.w3.org/2000/01/rdf-schema#label> {LA}
) 
WHERE 
  IG = NULL 
  AND isLiteral(LA) 
  AND DC != <http://proton.semanticweb.org/2006/05/protont#JobPosition> 
  AND NOT LA = ""

You can also specify your own query in a  query.txt  file in KIM's  config/  
directory.

The type of every instance is retrieved with this construct:
  {I} serql:directType {DC}

DC is the explicitly set class of this instance. 

In your case it looks like Ferguson has an implicitly set type, Person. 
Make sure the only explicit type statement for the coaches is like this one:

example:Ferguson  rdf:type  example:Coach


Hope this helps,
Philip Alexiev
Software Engineer, KIM team


On 18 Feb 2012, at 10:18 PM, Cong Nguyen wrote:

> Hello everyone!
> I've mapped concepts (Player, Coach) of a sport ontology  to Proton ontology 
> (Person), created some instances (Alex Ferguson...), and annotated documents, 
> for example  http://www.bbc.co.uk/sport/0/football/16237330 (I followed the 
> CaseStudy-IntegrationDbPedia in KIMDocs-3.0). All annotations about Player 
> are ok, but annotations about Coach are not, for example: Inst = Ferguson 
> have Class = Person, not sport ontology's concept Coach.
> Please help me to solve this problem.
> Thank you,
> Cong
> -- 
> Cong Hoang Nguyen
> University: Hanoi University of Science and Techonology.
> Email: congnh0...@gmail.com
> Facebook: http://www.facebook.com/monday0rsunday
> YH: congnh0902
> Skype: monday0rsunday
> Phone: (+84)1678565200
> 


Re: [Kim-discussion] Processing Spanish Laguange Documents

2012-02-15 Thread Philip Alexiev
Hello Naaman,

KIM does not support Spanish out of the box. The knowledge base of the public 
distribution is aimed at generic information extraction, such as world news: 
it contains the most famous Persons, Locations and Organizations around the 
world. This knowledge is probably too general for your domain of interest, so 
some amount of tuning will be required.

There are generally two aspects of the named entity recognition process. 

Gazetteer Lookups
This is the process of recognizing well-known objects in the text. You need to 
supply a comprehensive set of named entities in Spanish, which will feed the 
gazetteer. The gazetteer will then be able to match them in the texts and 
create the corresponding annotations over them. More information on this can 
be found in KIM's system documentation* under Administration -> Extending the 
KIM ontology and knowledge base.

New Entities
This is a more complicated and composite approach, combining different 
techniques and rules. An example is using titles like "Mayor" and "Mrs.".  You 
can start by looking at the grammars that are being loaded 
(KIM/context/default/resources/grammar/main/main.jape)  and the rules that 
create annotations. These rules may include direct text matching of the context 
around the annotation or matching previously created annotations. 

* http://www.ontotext.com/sites/default/files/kim/KimDocs-3.0-EN.zip

Hope this helps
Philip Alexiev
Software Engineer, KIM team


On 14 Feb 2012, at 3:30 PM, Naaman Musawwir wrote:

> Hello,
> 
> We are going to try keyword extraction on some documents that are in
> Spanish. Please advise how to configure my KIM instance to do that.
> 
> Regards,
> Naaman Musawwir.
> 
> 


Re: [Kim-discussion] Annotate Unicode text

2012-01-23 Thread Philip Alexiev
Hello Srecko,

The short answer is - it is possible, but not directly out of the box.

The paper you mentioned addresses exactly this problem. In order to recognize 
named entities in Cyrillic, you need to use resources tuned for the 
target language.

The paper you mentioned:
http://www.aclweb.org/anthology/W/W11/W11-4205.pdf

and also this paper:
http://www.dcs.shef.ac.uk/intranet/research/resmes/CS0201.pdf

which was written by the GATE team and is dedicated to "SLAVONIC NAMED ENTITIES 
IN GATE", may serve as a starting point and reference for further exploration 
of the task.  The task will not be easy, as there are not many specialized 
resources and it will require a lot of manual work. But the above documents 
provide a good roadmap.

As KIM uses GATE internally as a module to create semantic annotations,  
everything in these papers is valid and directly applicable. The role of KIM is 
to annotate, using GATE, then to create indexes, based on the annotations.

Hope this helps,
Philip

On 22 Jan 2012, at 2:04 AM, Srecko Joksimovic wrote:

> Hi,
>  
> Is it possible to annotate Cyrillic text using the KIM platform? I have read 
> this publication
>  
> [GeorgievEtAl2011] Georgi Georgiev, Valentin Zhikov, Borislav Popov, and 
> Preslav Nakov. Building a Named Entity Recognizer in Three Days: Application 
> to Disease Name Recognition in Bulgarian Epicrises. In Proceedings of the 
> RANLP'2011 Workshop on Biomedical Natural Language Processing 
> (BiomedicalNLP'11). The paper.
>  
> Could you please explain to me if something like that is possible using 
> Cyrillic text?
>  
> Best,
> Srecko

___
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontotext.com/mailman/listinfo/kim-discussion


Re: [Kim-discussion] [Help] Haven't seen Lookup annotation in KIM GATE

2011-12-30 Thread Philip Alexiev
Hello Cong,

Do you see the annotations you expect under a different type, for example 
Person, Location or Organization?  The way the pipeline works is that it 
uses the gazetteer to create Lookups. After this, further resources process 
them. Some of these create the corresponding Person, Location or Organization 
annotations from the Lookups of that type.

A simple example:

Let's assume the gazetteer has recognized  "John Smith" in the text and created 
a Lookup with feature class=protont:Person. Then a grammar sees this Lookup and 
its type and creates a Person annotation over the same span.

So two variations here:

1. The annotations do not exist as a Person/Location/Organization  annotation.
In this case, the gazetteer failed to recognize the phrase in text. You should 
check the setup.

2. The annotation exists as a Person/Location/Organization annotation.
In this case everything is OK: the gazetteer recognized them and they were 
transformed from Lookup to one of these types.

In order to see the Lookup annotations, you have to disable the Annotation 
Cleaner processing resource at the end of the pipeline.  It serves to remove 
the temporary annotations. Only annotations in the whitelist are left. The 
whitelist is defined in  KIM/config/nerc.properties  with the 
com.ontotext.kim.KIMConstants.IE_ANN_TYPES  setting. Any annotation type not in 
this list is removed. The reason is that some annotations,  Lookup included, 
serve only to create annotations of the meaningful types in the whitelist.
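
The two steps described above (a grammar turning Lookups into typed 
annotations, then the cleaner dropping everything outside the whitelist) can 
be sketched in plain Java as follows. This is a simplified model for 
illustration only, not KIM or GATE code; the classes LookupPipelineSketch and 
Ann and the whitelist contents are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative model of "Lookup -> Person" followed by whitelist cleaning.
class LookupPipelineSketch {
    // A minimal annotation: type, character span and the gazetteer class feature.
    static final class Ann {
        final String type;
        final int start;
        final int end;
        final String entityClass;

        Ann(String type, int start, int end, String entityClass) {
            this.type = type;
            this.start = start;
            this.end = end;
            this.entityClass = entityClass;
        }
    }

    // Grammar step: add a Person annotation over every Lookup of class Person.
    static List<Ann> promoteLookups(List<Ann> anns) {
        List<Ann> out = new ArrayList<>(anns);
        for (Ann a : anns) {
            if ("Lookup".equals(a.type) && "protont:Person".equals(a.entityClass)) {
                out.add(new Ann("Person", a.start, a.end, a.entityClass));
            }
        }
        return out;
    }

    // Cleaner step: keep only annotations whose type is whitelisted.
    static List<Ann> clean(List<Ann> anns, Set<String> whitelist) {
        List<Ann> out = new ArrayList<>();
        for (Ann a : anns) {
            if (whitelist.contains(a.type)) {
                out.add(a);
            }
        }
        return out;
    }
}
```

With the cleaner in place only the Person annotation survives, which matches 
the behaviour of not seeing any Lookup in the final result.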

Hope this makes the horizon a little clearer.
Please feel free to ask for more clarification.

Happy holidays.
Philip Alexiev
Software Engineer, KIM team


On 30 Dec 2011, at 2:11 PM, Cong Nguyen wrote:

> Hi everyone.
> I followed KimDocs-3.0-EN/CaseStudy-IntegrationDbPedia.html and Customize KIM 
> 3, and in the step for setting up the gazetteers, when I run KIM GATE, I 
> don't see any Lookup annotations. Is there any problem in my result?
> Please see my attached files.
> Thank you and regards.
> -- 
> Cong Hoang Nguyen
> University: Hanoi University of Science and Technology.
> Email: congnh0...@gmail.com
> Facebook: http://www.facebook.com/monday0rsunday
> YH: congnh0902
> Skype: monday0rsunday
> Phone: (+84)1678565200
> 


Re: [Kim-discussion] Issues with running sample pipeline.gapp from behind firewall

2011-12-21 Thread Philip Alexiev
Hello,

Using Java behind a Proxy  often leads to problems. We do not have much 
experience running KIM in this context. What I can recommend is you try with 
the following parameters to the JVM:

-Dhttp.proxyHost=10.0.0.100 -Dhttp.proxyPort=8800
This address and port are for the sake of the example. Substitute with yours.

You can add these parameters to KIM_OPTS in KIM/bin/kim  if you run the server 
under Linux and  to KIM_RT_OPTS in KIM/bin/kim.bat if you run the server under 
Windows.
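
If changing the start scripts is not convenient, the same two standard Java 
networking properties can also be set programmatically, provided this happens 
before the first HTTP connection is opened. The sketch below shows only that; 
the class ProxySettings is a hypothetical helper and the address and port are 
again example values.

```java
// Illustrative helper: sets the standard JVM HTTP proxy properties in code.
class ProxySettings {
    // Equivalent to passing -Dhttp.proxyHost=... -Dhttp.proxyPort=... to the JVM.
    static void useHttpProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        useHttpProxy("10.0.0.100", 8800); // example values, substitute your own
        System.out.println(System.getProperty("http.proxyHost")
                + ":" + System.getProperty("http.proxyPort"));
    }
}
```

Note that HTTPS traffic uses a separate pair of properties, https.proxyHost 
and https.proxyPort.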

Hope this helps.

Philip Alexiev
Software Engineer, KIM team


On 20 Dec 2011, at 2:35 PM, borislav popov wrote:

> Thanks Genesis, 
>   i've posted the mail to the kim discussion. The registration takes some 
> time to be processed - that's the reason 
> thank you 
> borislav popov 
> head of semantic annotation and search
> 
> On Dec 20, 2011, at 2:14 PM, Onto Genesis wrote:
> 
>> Hello,
>> 
>> I just registered myself.  I submitted a question to the discussion forum.  
>> However, I am getting an error message as follows.  Please advise.
>> 
>> Thanks
>> --Genesis
>> 
>> On Tue, Dec 20, 2011 at 7:11 AM,  wrote:
>> You are not allowed to post to this mailing list, and your message has
>> been automatically rejected.  If you think that your messages are
>> being rejected in error, contact the mailing list owner at
>> kim-discussion-ow...@ontotext.com.
>> 
>> 
>> 
>> -- Forwarded message --
>> From: Onto Genesis 
>> To: kim-discussion@ontotext.com
>> Cc: 
>> Date: Tue, 20 Dec 2011 07:11:36 -0500
>> Subject: Issues with running sample pipeline.gapp from behind firewall
>> Hello,
>> 
>> I am looking for help regarding an issue I have with running GATE examples 
>> within a corporate environment where one has to provide a proxy to make an  
>> http request that is outside the firewall.
>> 
>> I am trying to run the sample sample_linked_data_mashup.gapp or 
>> pipeline.gapp.
>> 
>> The issue seems to be caused by
>> Caused by: java.net.UnknownHostException: factforge.net
>> I have tried to specify proxyHost and proxyport to JVM as a parameter; also 
>> as part of the ANT script but with no luck.
>> 
>> Here is a stack trace of the error file.
>> 
>> GATE 6.2-SNAPSHOT build 4141 started at Mon Dec 19 18:55:24 EST 2011
>> and using Java 1.6.0_27 Sun Microsystems Inc. on Windows XP x86 5.1.
>> CREOLE plugin loaded: 
>> file:/C:/apps_custom/gate-6.2-SNAPSHOT-build4141-ALL/plugins/ANNIE/
>> Logger com.ontotext.kim level set to INFO, overriding the default effective 
>> level of DEBUG. Set the level of com.ontotext.kim explictly if required.
>> Logger org.openrdf.sesame level set to INFO, overriding the default 
>> effective level of DEBUG. Set the level of org.openrdf.sesame explictly if 
>> required.
>> Logger httpclient level set to INFO, overriding the default effective level 
>> of DEBUG. Set the level of httpclient explictly if required.
>> Logger org.apache.commons.httpclient level set to INFO, overriding the 
>> default effective level of DEBUG. Set the level of 
>> org.apache.commons.httpclient explictly if required.
>> Query loaded from 
>> C:\apps_custom\gate-6.2-SNAPSHOT-build4141-ALL\plugins\Gazetteer_LKB\samples\dictionary_from_remote_repository\query.txt
>> Looking for changes in configuration ...
>> Aliases in IGNORE list:0
>> Loading of trusted entities from Sesame
>> Initialized Sesame repository: 
>> org.openrdf.repository.http.HTTPRepository@a42c31
>> Loading failed.
>> com.ontotext.kim.client.query.KIMQueryException: Error in repository 
>> connection.
>> at 
>> org.openrdf.repository.http.PrivateRepositoryFeed.feedTo(PrivateRepositoryFeed.java:106)
>> at 
>> com.ontotext.kim.model.AliasCacheImpl.loadTrustedMaps(AliasCacheImpl.java:364)
>> at com.ontotext.kim.model.AliasCacheImpl.initCache(AliasCacheImpl.java:278)
>> at 
>> com.ontotext.kim.model.AliasCacheImpl.createInstance(AliasCacheImpl.java:151)
>> at com.ontotext.kim.model.AliasCacheImpl.getInstance(AliasCacheImpl.java:107)
>> at com.ontotext.kim.gate.KimGazetteer.init(KimGazetteer.java:87)
>> at gate.Factory.createResource(Factory.java:372)
>> at 
>> gate.util.persistence.ResourcePersistence.createObject(ResourcePersistence.java:83)
>> at gate.util.persistence.PRPersistence.createObject(PRPersistence.java:76)
>> at 
>> gate.util.persistence.LanguageAnalyserPersistence.createObject(LanguageAnalyserPersistence.java:51)
>> at 
>> gate.util.persistence.PersistenceManager.getTransientRepresentation(PersistenceManager.java:

Re: [Kim-discussion] Strange KIMQuery problem

2011-11-25 Thread Philip Alexiev
Hello Jerry,

On the first question:

SeRQL uses the full URI syntax for the search literal. This means 
<64217a6726705e3e864e331074da011e:64217a6726705e3e864e331074da011e>
should conform to the URI specification. The specification can be found at  
http://www.ietf.org/rfc/rfc2396.txt . You can see there that the scheme (the 
first part of the URI, before the ':') can start only with a letter. 
This explains the error at query validation and why it works with the hash 
code that starts with a letter.

A possible solution to this will be to generate your hash code and put a letter 
character at the beginning of it. So that if it is 'x' for example, all your 
codes will start with 'x'. You will then search for the code with an 'x' in 
front.
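
A minimal sketch of this workaround, assuming the hash is a plain hex string: 
the check below encodes the RFC 2396 rule for the scheme part (a letter 
followed by letters, digits, "+", "-" or "."), and the key is always stored 
and searched with the same letter prefix. The class HashKeys and its methods 
are hypothetical helpers, not part of the KIM API.

```java
import java.util.regex.Pattern;

// Illustrative helper for building hash keys that are valid URI schemes.
class HashKeys {
    // RFC 2396 scheme: alpha *( alpha | digit | "+" | "-" | "." )
    private static final Pattern SCHEME = Pattern.compile("[A-Za-z][A-Za-z0-9+.-]*");

    static boolean isValidScheme(String s) {
        return SCHEME.matcher(s).matches();
    }

    // Always prefix, so stored values and search values stay consistent.
    static String toKey(String hash) {
        return "x" + hash;
    }

    public static void main(String[] args) {
        String hash = "64217a6726705e3e864e331074da011e";
        System.out.println(isValidScheme(hash));        // starts with a digit
        System.out.println(isValidScheme(toKey(hash))); // starts with 'x'
    }
}
```

The same toKey call would have to be used both when the document feature is 
written and when the AtomExpr for the search is built.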


On the second question:

This behavior is strange.  Please make sure you are working with identical 
document queries.

If you send an example document and a test case that reproduces the error, it 
will be easy for us to track it down.

Thank you for your interest in KIM.

Philip Alexiev
Software Engineer, KIM team



On 24 Nov 2011, at 6:35 PM, Jerry Gao wrote:

> Strange 1. Encounter with KIMQueryException:
> 
> 1. I put hashed web content into 'CONTENTHASHCODE'. The value is 
> 64217a6726705e3e864e331074da011e.
> Java code: 
> DocumentQuery docQuery = new DocumentQuery();
> try {
>     AtomExpr expr = new AtomExpr(
>             map.get("CONTENTHASHCODE").toString(), "CONTENTHASHCODE");
>     docQuery.setKeywordRestriction(expr);
>     docQuery.setMaxResultLength(2);
> 
>     // irApi is a DocumentRepositoryAPI
>     long size = irApi.getDocumentCount(docQuery);
>     if (size > 0L) {
>         System.out.println("The document was stored");
>         return true;
>     }
> } catch (Exception x) {
>     KimLogs.logPOPULATER.debug(
>             "Error checking isStored() in DuplicateHunter.");
> }
> return false;
> 
> Exception details:
> 
> com.ontotext.kim.client.query.KIMQueryException: Encountered " "<" "< "" at 
> line 2, column 278.
> Was expecting one of:
> "{" ...
> "}" ...
>  ...
>  ...
>  ...
>  ...
>  ...
>  ...
>  ...
>  ...
>  ...
>  ...
>  
>  while running this query: 
> SELECT DISTINCT D FROM 
> {D} <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
>   {<http://proton.semanticweb.org/2006/05/protont#Document>} , 
> {D} <http://ordi.ontotext.com/sar#hasFeature> 
> {F0} <http://ordi.ontotext.com/sar#hasKey> {"CONTENTHASHCODE"}, 
> {F0} <http://ordi.ontotext.com/sar#hasValue> {V0}, 
> {<64217a6726705e3e864e331074da011e:64217a6726705e3e864e331074da011e>} 
> <http://www.ontotext.com/prefixMatchIgnoreCase> {V0} LIMIT 1
> 
> However,...
> 
> when i change to set hashed URL value (e.g., 
> e22c60e9d8cf41d897b2fae7da041f35), it works!! 
> 
> Java Code:
> DocumentQuery docQuery = new DocumentQuery();
> try {
>     long docId = loadDocumentIdByContentHashCode(
>             map.get("LINKURLHASHCODE").toString());
> 
>     AtomExpr expr = new AtomExpr(
>             map.get("LINKURLHASHCODE").toString(), "LINKURLHASHCODE");
>     docQuery.setKeywordRestriction(expr);
>     docQuery.setMaxResultLength(2);
> 
>     long size = irApi.getDocumentCount(docQuery);
>     if (size > 0L) {
>         System.out.println("The document was stored");
>         return true;
>     }
> } catch (Exception x) {
>     KimLogs.logPOPULATER.debug(
>             "Error checking isStored() in DuplicateHunter.");
> }
> return false;
> 
> NO exception happens! 
> 
> Why???
> 
> Strange 2: We are not familiar with the difference between method 
> 'irApi.getDocumentCount(DocumentQuery)' and method 
> 'irApi.getDocumentIds(DocumentQuery)':
> 
> To the best of our knowledge, the two methods should respond with 
> consistent results.
> 
> However, the actual results are strange:
> 
> For example, we use the same query and value 
> (e22c60e9d8cf41d897b2fae7da041f35) to query:
> 1. first method 'isStored' (java code):
>   AtomEx

Re: [Kim-discussion] Dictionary life cycle for the Large KB Gazetteer

2011-11-22 Thread Philip Alexiev
Hello Mihaela,

  In the general case, changing the ontology on the fly introduces more 
complications than benefits. That is why it is not our practice and we haven't 
implemented the mechanisms for this process.  Currently, the LKB Gazetteer 
lacks this capability. Nevertheless, it is an interesting case, that we will 
take into consideration for the next releases.

  Usually we work with two kinds of knowledge.  
  One consists of well-known facts from trusted sources. Those include objects 
like countries, cities, famous people etc. Those facts are carefully examined 
and can be relied upon. This quality data forms the gazetteer dictionary.  It 
does not change while the system is running. It may improve between 
deployments.
  The other type of data consists of facts that have been recognized with the 
help of some rules and logic.  Their quality is lower, so they don't 
have a place in the gazetteer. These facts are added to the RDF store when 
annotating the documents.

  So the ontology schema and trusted data do not change often and there is no 
need to reload the gazetteer on the fly.  Such changes may also lead to serious 
inconsistencies, due to the complicated inference rules in RDF databases.  
That is why it is easy to add facts to them, but hard to modify or remove 
facts. This also answers the question about the differential update: it is 
easier to clear the dictionaries, thus forcing the gazetteer to generate them 
anew. 

  It would be interesting to know more about your use case and why you need 
such a flexible process. That way we might be able to help more.

Thank you for your interest in KIM

Philip Alexiev
Software Engineer, KIM team



On 22 Nov 2011, at 2:08 PM, Reneta Popova wrote:

> 
> 
> Begin forwarded message:
> 
>> From: Mihaela Olteanu 
>> Subject: Dictionary life cycle for the Large KB Gazetteer
>> Date: 22 November 2011 13:57:18 GMT+0200
>> To: marin.nozhc...@ontotext.com, da...@ontotext.com, 
>> reneta.pop...@ontotext.com
>> 
>> Hello,
>>  
>> I am using the Large KB Gazetteer with AllegroGraph. The ontology definition 
>> that I use can change while the pipeline is running.
>> In this case I could simply reinitialize the processing resource. The 
>> question is: this is the only way of reloading the dictionary for large KB 
>> gazetteer? Can't this operation be done automatically, maybe setting the 
>> gazetteer as listener to my updated ontology event and on message 
>> reinitialize the component automatically?
>> Is it possible to load only the newly added definitions, or the updates?
>>  
>> Thanks,
>> Mihaela
> 


Re: [Kim-discussion] Annotation with KimPlugin

2011-11-16 Thread Philip Alexiev
Hello Kele,

You cannot access the stored documents directly, only through the KIM 
interfaces.

The documents are stored in 2 separate places:
 - The content of the document and the annotations are stored in a Lucene index 
inside KIM. It is used for full text search and  visualization.
 - Information about occurrences  (which entities are mentioned in which 
documents) is stored in a separate Owlim index. It is used for the more 
complicated semantic search.

You can however observe the result of the annotation process by running the 
GATE interface of KIM and annotating a document. Do this by entering  './kim  
gate'  at the command line when inside KIM/bin .  This will load the 
configuration of resources used in the annotation phase of KIM.  Then add a 
document of your choice and execute the pipeline (additional information on how 
to work with GATE can be found at the official site:   http://gate.ac.uk ).  
Please note that annotating a document this way will not  insert it into KIM. 
That is, you will not see it the next time you start KIM and browse the visual 
interfaces.

I hope this answers your question. If not, please be more specific as to what 
you want to achieve.

All the best,

Philip Alexiev
Software Engineer, KIM team 
 

On 16 Nov 2011, at 4:09 PM, Kele Belloze wrote:

> Hi Philip,
> sorry for the delay. Thank you for contact.
> I was trying to run KIM again. I followed the Quick Start Guide and I 
> visualized the annotated documents in the KIM Web UI.
> 
> But, I have two questions:
> 1- Where do I access (which folder) the annotated documents? What format are 
> they stored?
> 2 - Can I load an arbitrary ontology?
>  
> Regards,
> Kele
> 2011/11/9 Philip Alexiev 
> Hi Kele,
> 
> Thank you for your interest in KIM.
> 
> The easiest way to start is with the Quick Start Guide from the 
> documentation.  I will provide the direct link for convenience: Quick Start 
> Guide .
> 
> All the best,
> Philip Alexiev
> Software Engineer, KIM team
> 
> On 8 Nov 2011, at 4:10 PM, Kele Belloze wrote:
> 
>> Hi,
>> i'm a student in the Oswaldo Cruz Institute (Brazil) and I'm trying KIM for 
>> semantic annotation of documents. I installed it. But, If the Kim Plugin is 
>> no longer supported, how to perform these annotations? It is not clear in 
>> the documentation.
>> 
>> Can you help me?
>> 
>> Thanks,
>> Kele
> 
> 



Re: [Kim-discussion] Annotation with KimPlugin

2011-11-09 Thread Philip Alexiev
Hi Kele,

Thank you for your interest in KIM.

The easiest way to start is with the Quick Start Guide from the documentation.  
I will provide the direct link for convenience: Quick Start Guide .

All the best,
Philip Alexiev
Software Engineer, KIM team

On 8 Nov 2011, at 4:10 PM, Kele Belloze wrote:

> Hi,
> i'm a student in the Oswaldo Cruz Institute (Brazil) and I'm trying KIM for 
> semantic annotation of documents. I installed it. But, If the Kim Plugin is 
> no longer supported, how to perform these annotations? It is not clear in the 
> documentation.
> 
> Can you help me?
> 
> Thanks,
> Kele


Re: [Kim-discussion] [Interested-in-kim] Help

2011-10-24 Thread Philip Alexiev
Hi again,

Thank you for the advice. It is very good.

All the best,
Philip

On 24 Oct 2011, at 11:28 PM, kalthoum kalthoum wrote:

> hello Philip,
> thanks it works now.
> Perhaps it would be preferable if you notice this problem in the quick start 
> guide of KIM.
> Sincerly,
> Kalthoum 
> 
> 
> --- On Mon 24.10.11, Philip Alexiev wrote:
> 
>> From: Philip Alexiev 
>> Subject: Re: [Interested-in-kim] Help
>> To: interested-in-...@ontotext.com
>> Cc: kalthoum_...@yahoo.fr
>> Date: Monday, 24 October 2011, 15:47
>> Hello Kalthoum,
>> 
>> KIM is not compatible with  jdk1.7  . 
>> Please try with the latest   jdk1.6 .
>> 
>> Thank you for your interest in KIM.
>> 
>> Philip Alexiev
>> Software Engineer, KIM team
>> 
>> 
>> 
>> On 24 Oct 2011, at 4:40 PM, KIM Platform info newsletter
>> wrote:
>> 
>>> 
>>> 
>>> --- On Sun 23.10.11, kalthoum kalthoum wrote:
>>> 
>>>> From: kalthoum kalthoum 
>>>> Subject: Help
>>>> To: interested-in-...@ontotext.com
>>>> Date: Sunday, 23 October 2011, 23:02
>>>> Hi,
>>>> I'm trying to start KIM3 server but I encounter
>> this
>>>> problem:
>>>> exception in thread "componentstarter-thread-2"
>>>> java.lang.internal error cannot find javac
>> resource bundle
>>>> for locale fr_FR.
>>>> Knowing that I installed jdk1.7.0_01 and Tomcat
>> 7.0.
>>>> Sincerly,
>>>> Kalthoum
>>>> 
>>> ___
>>> interested-in-KIM mailing list
>>> interested-in-...@ontotext.com
>>> http://ontotext.com/mailman/listinfo/interested-in-kim
>> 
>> 



Re: [Kim-discussion] Help - More about customizing IE of KIM

2011-10-24 Thread Philip Alexiev
Hello Cong Nguyen,

You have many choices here. 

The first decision is whether you want to use the ontology KIM provides (it is 
general enough to be usable in most cases). In this case you have to map 
your specific ontology to PROTON. We have an official guide on how this is done.
Or you can decide to use your ontology only. I will address this case below.

KIM knows what ontology to load on init by checking the configuration in 
KIM/config/owlim.ttl . There at the bottom you can see relative paths to RDF 
files and their corresponding namespaces. You can rewrite those settings and 
put your ontology files there (do not forget to clear the caches afterwards). 
When KIM starts next time, it will load your ontology instead of the PROTON + 
KIM world knowledge base.

Next, the gazetteer should be configured to fill its dictionary with the terms 
from the new ontology. The good news is that the query that provides the data 
for the dictionary is configurable. If you look at the configuration of the LKB 
Gazetteer  resource in the pipeline, you will see:

dictFeederParams  :  [AllUpperEnrichment=true, IgnoreAliasListPath=, 
FeedSetupPath=$relpath$../../../config]

FeedSetupPath is the location where the gazetteer will look for a file called 
query.txt . This file holds either a SPARQL or a SeRQL query, returning the 
concepts that will be recognizable by the gazetteer. The query should return, 
in this order: the label of the entity, the entity URI, and the entity's direct 
class. The order is important.  When the gazetteer is run over the text, it 
will match the 'label' and assign the 'uri' and 'class' to the newly created 
annotations.

So you can either set up your own path or create a file  query.txt  in  
KIM/config/  with your custom query. Here is a very basic SeRQL query that you 
can use as a starting point:

select LA, I, DC from 
(
{I} serql:directType {DC}, 
{I} rdfs:label {LA}
) 
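
If you prefer to generate the file from code, for example as part of a setup 
script, a minimal Java sketch could look like the one below. It only writes 
the query shown above into a file named query.txt in a directory you pass in; 
the class QueryFileWriter is a hypothetical helper, and the directory is 
assumed to be whatever FeedSetupPath points to in your pipeline.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative helper: writes the starting-point SeRQL query to query.txt.
class QueryFileWriter {
    // Column order matters: label, entity URI, direct class.
    static final String QUERY =
            "select LA, I, DC from\n"
          + "(\n"
          + "{I} serql:directType {DC},\n"
          + "{I} rdfs:label {LA}\n"
          + ")\n";

    // Creates the directory if needed and returns the path of the written file.
    static Path writeQuery(Path feedSetupDir) throws IOException {
        Files.createDirectories(feedSetupDir);
        Path target = feedSetupDir.resolve("query.txt");
        Files.write(target, QUERY.getBytes(StandardCharsets.UTF_8));
        return target;
    }
}
```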


Setting up the gazetteer is not sufficient to incorporate your ontology in the 
Information Extraction. Some of the other resources should be tuned also. The 
process is described in more details at our official site:  
http://www.ontotext.com/sites/default/files/Customizing%20KIM3.pdf .

Also please note, that we have some ontology conventions that are heavily used 
in the interface:

-  We expect that all the classes that we want to be visible in the interface, 
are marked visible. That is - each has such a statement:
  <http://www.ontotext.com/kim/2006/05/kimso#visibilityLevel1>  "" . 

-  We expect that each entity in the gazetteer lists is generated by a trusted 
source. That is - there are such statements for each such entity:
  <http://proton.semanticweb.org/2006/05/protons#generatedBy> 
 .

and the source is declared as trusted :
  <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://proton.semanticweb.org/2006/05/protons#Trusted> .
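
To show the shape of these three statements, here is a small Java sketch that 
prints them as N-Triples lines. The predicate and class URIs are the ones 
listed above; the subject and source URIs under http://example.org/ are 
hypothetical placeholders, and the reading of which resource is the subject of 
each statement is an interpretation of the description above, not something 
taken from the KIM sources.

```java
// Illustrative helper: builds the convention statements as N-Triples lines.
class KimConventions {
    static final String VISIBILITY =
            "http://www.ontotext.com/kim/2006/05/kimso#visibilityLevel1";
    static final String GENERATED_BY =
            "http://proton.semanticweb.org/2006/05/protons#generatedBy";
    static final String RDF_TYPE =
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String TRUSTED =
            "http://proton.semanticweb.org/2006/05/protons#Trusted";

    // Marks a class as visible in the interface (empty literal as object).
    static String visibleClass(String classUri) {
        return "<" + classUri + "> <" + VISIBILITY + "> \"\" .";
    }

    // States that an entity was generated by a given source.
    static String generatedBy(String entityUri, String sourceUri) {
        return "<" + entityUri + "> <" + GENERATED_BY + "> <" + sourceUri + "> .";
    }

    // Declares the source as trusted.
    static String trustedSource(String sourceUri) {
        return "<" + sourceUri + "> <" + RDF_TYPE + "> <" + TRUSTED + "> .";
    }

    public static void main(String[] args) {
        // Hypothetical URIs, for illustration only.
        System.out.println(visibleClass("http://example.org/sport#Team"));
        System.out.println(generatedBy("http://example.org/sport#team1",
                "http://example.org/sport#MySource"));
        System.out.println(trustedSource("http://example.org/sport#MySource"));
    }
}
```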


A work in progress is to make the visual components configurable and not tied 
to the concrete ontology. But these changes won't see daylight until the next 
release. 


Thank you for using KIM.
Philip Alexiev,
Software Engineer, KIM team


On 24 Oct 2011, at 7:39 AM, Cong Nguyen wrote:

> Hello everybody.
> I would like to customize KIM IE for only our domain ontology (about sport): 
> annotating only instances which classes are, or have relationship with 
> classes in our ontology. But i don't know how to do, and what is ongoing in 
> KIM IE processing. Please suggest me some advices and, if possible, let me 
> know more details about KIM IE processing.
> Thank you in advance.
> Cong
> 
> -- 
> Cong Hoang Nguyen
> University: Hanoi University of Science and Technology.
> Email: congnh0...@gmail.com
> Facebook: http://www.facebook.com/monday0rsunday
> YH: congnh0902
> Skype: monday0rsunday
> Phone: (+84)1678565200
> 


Re: [Kim-discussion] Unable to start KIM

2011-10-14 Thread Philip Alexiev
Hello,

Thank you for the feedback. 

All the best,
Philip

On 14 Oct 2011, at 8:01 PM, Ruben Costa wrote:

> Hello again,
>  
> It seems that the error is related with Java installation alright, but with 
> the version of Java installed.
> I’ve tried to downgrade to JDK ver. 1.6.0_27 and it’s working ok. Not working 
> with the latest JDK version (JDK 1.7.0)
> Thank you once again.
>  
> Best regards,
>  
> From: Philip Alexiev [mailto:philip.alex...@ontotext.com] 
> Sent: sexta-feira, 14 de Outubro de 2011 15:58
> To: Ruben Costa
> Cc: Kim-discussion@ontotext.com
> Subject: Re: [Kim-discussion] Unable to start KIM
>  
> Hello Mr. Costa,
>  
> It looks like the cause for the problem is not in KIM or its underlying 
> services.  Most likely the Java installation you use is incomplete and is 
> missing a localization for javac (resource bundle). You may consider 
> downloading the latest JDK from the official site.
>  
> Thank you for your interest in KIM,
> Philip Alexiev
> Software Engineer, KIM team
>  
>  
> On 14 Oct 2011, at 5:15 PM, Ruben Costa wrote:
> 
> 
> Hello all,
>  
> I’m facing some problems in running KIM on a Windows 7 x64 OS.
> I’m getting the following error message:
> Exception in thread "ComponentStarter-Thread-2" java.lang.InternalError:
> Cannot find javac resource bundle for locale en_US
>  
> Any help is very much appreciated.
>  
> Thank you all.
>  
> Best regards,


Re: [Kim-discussion] Unable to start KIM

2011-10-14 Thread Philip Alexiev
Hello Mr. Costa,

It looks like the cause for the problem is not in KIM or its underlying 
services.  Most likely the Java installation you use is incomplete and is 
missing a localization for javac (resource bundle). You may consider 
downloading the latest JDK from the official site.

Thank you for your interest in KIM,
Philip Alexiev
Software Engineer, KIM team


On 14 Oct 2011, at 5:15 PM, Ruben Costa wrote:

> Hello all,
>  
> I’m facing some problems in running KIM on a Windows 7 x64 OS.
> I’m getting the following error message:
> Exception in thread "ComponentStarter-Thread-2" java.lang.InternalError:
> Cannot find javac resource bundle for locale en_US
>  
> Any help is very much appreciated.
>  
> Thank you all.
>  
> Best regards,


Re: [Kim-discussion] SPARQL endpoint?

2011-10-10 Thread Philip Alexiev
Hi again,

Unfortunately, we do not support this in the public release. However, this 
functionality is already implemented, and KIM can be configured to work with a 
remote SPARQL endpoint instead of running an encapsulated Owlim inside.  This 
will also come with the new release.

All the best,
Philip Alexiev


On 10 Oct 2011, at 12:13 PM, Jamie Forth wrote:

> On Mon, 2011-10-10 at 10:32 +0300, Philip Alexiev wrote:
> 
>> In  KIM release 3.0-RC4 the way to communicate with the semantic
>> repository is through the API or using the exposed MBeans.
> 
> Thanks for the clarification.
> 
>> Forest also exposes a sparql endpoint.
> 
> This is what I'm looking for, I look forward to the next release.
> 
> In the meantime, we have OWLIM-SE and openrdf-sesame also installed on
> the same server as KIM. Is it possible to configure KIM to use this
> instance as the semantic repository, or are the KIM distributions
> customised in some way that would make this not possible?
> 
> Thanks again,
> 
> Jamie
> 
> 
> 


Re: [Kim-discussion] Help - Adding to document repository, importing new instances got from annotations into KB using Java API.

2011-10-06 Thread Philip Alexiev
Hi again,

KIM uses a combination of a semantic repository (Owlim) and a document 
repository (Lucene) to index and retrieve documents. Both store their indexes 
in  KIM/context/default/populated .  Exporting only the RDF will not be 
sufficient. So in order to back up, you will need to archive the whole  
"populated" folder. However, I would advise archiving the whole KIM 
installation.
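
Any archiving tool will do for this. As one possible sketch, the plain Java 
helper below zips every file under a given folder using only the standard 
library. The class FolderBackup is hypothetical, the target zip file is 
assumed to live outside the folder being archived, and it is my assumption 
that the server should be stopped first so the index files are not changing 
while they are copied.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Illustrative helper: archives a folder (e.g. "populated") into a zip file.
class FolderBackup {
    // Zips all regular files under sourceDir; returns how many were archived.
    static int zip(Path sourceDir, Path zipFile) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (Path file : files) {
                // Entry names are relative to sourceDir, with '/' separators.
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(name));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
        return files.size();
    }
}
```

Restoring is then a matter of unpacking the archive back into the same 
location before starting KIM.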

Hope this helps,
Philip

On 6 Oct 2011, at 1:53 PM, Cong Nguyen wrote:

> Hi, Philip
> Thank you very much for your helpful answers.
> I has another question in problem 2. I remember that i read somewhere in 
> kim-discussion which said that all new entities, also documents, stored in 
> /context/default/kb/populated. If there is any problem with KIM, i must 
> delete all thing in this folder and repopulate all documents. So does KIM 
> have any way to "backup" all entities, for example, export all entities to 
> N-triples file.
> Best regards.
> Cong
> 
> On 6 October 2011 at 16:39, Philip Alexiev wrote:
> Hi  Hoàng Công Nguyễn,
> 
> My apologies. The answer to the first question was not correct.  KIM uses 
> some indexes, which need to be synchronized when new documents are added. The 
> javadoc  provides a good description:
> 
> 
> /**
>  * Synchronizes the index
>  *
>  * Asks the index to sync the newly added documents with the search results.
>  * Before calling synchronizeIndex(true), recently added documents may not be
>  * included in the search results. Calling synchronizeIndex with force = false
>  * causes the index to evaluate the need for synchronization and possibly
>  * postpone it.
>  *
>  * @throws DocumentRepositoryException
>  * @return true - if the operation was executed; false - if conditional
>  *         execution was invoked (force == false) and the operation was not
>  *         initiated due to conditions
>  */
> @Description("Synchronizes the index")
> public boolean synchronizeIndex(@PName("force") boolean force)
>         throws DocumentRepositoryException;
> 
> 
> All the best.
> Philip
> 
> On 6 Oct 2011, at 12:29 PM, Philip Alexiev wrote:
> 
>> 
>> 
>> Begin forwarded message:
>> 
>>> From: Philip Alexiev 
>>> Subject: Re: [Kim-discussion] Help - Adding to document repository, 
>>> importing new instances got from annotations into KB using Java API.
>>> Date: 6 October 2011 12:26:11 PM GMT+03:00
>>> To: Hoàng Công Nguyễn 
>>> Cc: KIM discussion 
>>> 
>>> Hello  Hoàng Công Nguyễn,
>>> 
>>> I will answer the questions in order:
>>> 
>>> 1)  Adding the document through the DocumentRepositoryAPI  is sufficient to 
>>> be able to see it in the  UI.  Making a search for documents will show it.
>>> 
>>> 2) New entities, found in documents, are automatically added to the 
>>> semantic repository. They are assigned a label and a type. Also the 
>>> relation is created, that this document mentions that entity.  Adding them 
>>> manually is not necessary.
>>> 
>>> 3) The answer to this question is more a matter of style. You might want to 
>>> keep your ontology as a separate, complete module, so using the properties 
>>> from another ontology module might not be a good idea. You can relate the 
>>> properties by making them subproperties of PROTON's equivalents. In fact, 
>>> this can be done in a separate module whose sole purpose is to create 
>>> the mapping between PROTON and your ontology. You can map classes and 
>>> properties there.  Then you can use your classes and properties, as they 
>>> are more relevant to your classes.
>>> 
>>> Using any of the approaches will work  (if you plan to use mainly your 
>>> ontology). It is just a matter of good style to keep ontologies in separate 
>>> modules and map them with a dedicated mapping module.
>>> 
>>> There are functional differences in making your properties a subproperty of 
>>> PROTON ones. If you make a query, using your properties, you will get 
>>> statements with your properties only. If you make a query using the PROTON 
>>> equivalent, you will get both statements with PROTON property and yours.
>>> 
>>> Hope this helps,
>>> Philip Alexiev,
>>> Software Engineer, KIM team
>>> 
>>> 
>>> On 6 Oct 2011, at 11:07 AM, Hoàng Công Nguyễn wrote:
>>> 
>>>> Hi,
>>>> I'm using KIM platform 3.0-RC-4 and i've got some problems

Re: [Kim-discussion] Help - Adding to document repository, importing new instances got from annotations into KB using Java API.

2011-10-06 Thread Philip Alexiev
Hi  Hoàng Công Nguyễn,

My apologies. The answer to the first question was not correct.  KIM uses some 
indexes, which need to be synchronized when new documents are added. The 
javadoc  provides a good description:


/**
 * Synchronizes the index
 *
 * Asks the index to sync the newly added documents with the search results.
 * Before calling synchronizeIndex(true), recently added documents may not be
 * included in the search results. Calling synchronizeIndex with force = false
 * causes the index to evaluate the need for synchronization and possibly
 * postpone it.
 *
 * @throws DocumentRepositoryException
 * @return true - if the operation was executed; false - if conditional
 *         execution was invoked (force == false) and the operation was not
 *         initiated due to conditions
 */
@Description("Synchronizes the index")
public boolean synchronizeIndex(@PName("force") boolean force)
        throws DocumentRepositoryException;
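
To illustrate the add-then-synchronize pattern the javadoc describes, here is a small self-contained sketch. It does not use the real KIM classes: DocumentRepository and InMemoryRepository are hypothetical stand-ins written for this example, and the postponement rule for force == false (fewer than 10 pending documents) is invented. With the real API you would presumably call synchronizeIndex(true) on the DocumentRepositoryAPI instance after addDocument and handle DocumentRepositoryException.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "documents become searchable only after the index is synchronized".
class SyncIndexSketch {

    /** Hypothetical stand-in for the relevant part of KIM's DocumentRepositoryAPI. */
    interface DocumentRepository {
        void addDocument(String doc);
        boolean synchronizeIndex(boolean force);
        List<String> search();
    }

    /** In-memory implementation used only for this illustration. */
    static class InMemoryRepository implements DocumentRepository {
        private final List<String> pending = new ArrayList<>();
        private final List<String> indexed = new ArrayList<>();

        public void addDocument(String doc) {
            pending.add(doc);
        }

        public boolean synchronizeIndex(boolean force) {
            // Invented rule: without force, small batches are postponed.
            if (!force && pending.size() < 10) {
                return false;
            }
            indexed.addAll(pending);
            pending.clear();
            return true;
        }

        public List<String> search() {
            return new ArrayList<>(indexed);
        }
    }

    /** Adds a document and forces a sync so that it shows up in search results. */
    static List<String> addAndSync(DocumentRepository repo, String doc) {
        repo.addDocument(doc);
        repo.synchronizeIndex(true);
        return repo.search();
    }

    public static void main(String[] args) {
        DocumentRepository repo = new InMemoryRepository();
        repo.addDocument("first document");
        System.out.println("before sync: " + repo.search());
        System.out.println("after sync:  " + addAndSync(repo, "second document"));
    }
}
```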


All the best.
Philip

On 6 Oct 2011, at 12:29 PM, Philip Alexiev wrote:

> 
> 
> Begin forwarded message:
> 
>> From: Philip Alexiev 
>> Subject: Re: [Kim-discussion] Help - Adding to document repository, 
>> importing new instances got from annotations into KB using Java API.
>> Date: 6 October 2011 12:26:11 PM GMT+03:00
>> To: Hoàng Công Nguyễn 
>> Cc: KIM discussion 
>> 
>> Hello  Hoàng Công Nguyễn,
>> 
>> I will answer the questions in order:
>> 
>> 1)  Adding the document through the DocumentRepositoryAPI  is sufficient to 
>> be able to see it in the  UI.  Making a search for documents will show it.
>> 
>> 2) New entities, found in documents, are automatically added to the semantic 
>> repository. They are assigned a label and a type. Also the relation is 
>> created, that this document mentions that entity.  Adding them manually is 
>> not necessary.
>> 
>> 3) The answer to this question is more a matter of style. You might want to 
>> keep your ontology as a separate, complete module, so using the properties 
>> from another ontology module might not be a good idea. You can relate the 
>> properties by making them subproperties of PROTON's equivalents. In fact, 
>> this can be done in a separate module whose sole purpose is to create the 
>> mapping between PROTON and your ontology. You can map classes and 
>> properties there.  Then you can use your classes and properties, as they are 
>> more relevant to your classes.
>> 
>> Using any of the approaches will work  (if you plan to use mainly your 
>> ontology). It is just a matter of good style to keep ontologies in separate 
>> modules and map them with a dedicated mapping module.
>> 
>> There are functional differences in making your properties a subproperty of 
>> PROTON ones. If you make a query, using your properties, you will get 
>> statements with your properties only. If you make a query using the PROTON 
>> equivalent, you will get both statements with PROTON property and yours.
>> 
>> Hope this helps,
>> Philip Alexiev,
>> Software Engineer, KIM team
>> 
>> 
>> On 6 Oct 2011, at 11:07 AM, Hoàng Công Nguyễn wrote:
>> 
>>> Hi,
>>> I'm using KIM platform 3.0-RC-4 and i've got some problems, please help me 
>>> solving them:
>>> 1) I create new document, add to document repository
>>> 
>>> DocumentRepositoryAPI apiDocs = kimService.getDocumentRepositoryAPI();
>>> CorporaAPI apiCorpora = kimService.getCorporaAPI();
>>> String content = DocumentResource.TEST_URL_01;
>>>  KIMDocument doc = apiCorpora.createDocument(content, true);
>>> doc=apiSemnAnn.execute(doc);
>>> apiDocs.addDocument(doc);
>>> 
>>> but when i browse web ui, i can't see this document and i must restart KIM 
>>> to take effect. Is there any way adding new document to repository without 
>>> restarting or it is the same problem with deleting document in 
>>> http://www.mail-archive.com/kim-discussion@ontotext.com/msg00696.html?
>>> 
>>> 2) In documentation, to import new instances into KB, i must create new 
>>> file (nt..) , import it to owlim and restart to take effect. So with new 
>>> instances got from annotated documents, is there another way to import them 
>>> into KB using java API.
>>> 
>>> 3) When i add my own ontology to KIM, there may be some properties which 
>>> are the same as properties in PROTON, how i should do with them? Mapping it 
>>> to PROTON's properties using rdfs:subPropertyOf or using only my properties 
>>&

[Kim-discussion] Fwd: Help - Adding to document repository, importing new instances got from annotations into KB using Java API.

2011-10-06 Thread Philip Alexiev


Begin forwarded message:

> From: Philip Alexiev 
> Subject: Re: [Kim-discussion] Help - Adding to document repository, importing 
> new instances got from annotations into KB using Java API.
> Date: 6 October 2011 12:26:11 PM GMT+03:00
> To: Hoàng Công Nguyễn 
> Cc: KIM discussion 
> 
> Hello  Hoàng Công Nguyễn,
> 
> I will answer the questions in order:
> 
> 1)  Adding the document through the DocumentRepositoryAPI  is sufficient to 
> be able to see it in the  UI.  Making a search for documents will show it.
> 
> 2) New entities, found in documents, are automatically added to the semantic 
> repository. They are assigned a label and a type. Also the relation is 
> created, that this document mentions that entity.  Adding them manually is 
> not necessary.
> 
> 3) The answer to this question is more a matter of style. You might want to 
> keep your ontology as a separate, complete module, so using the properties 
> from another ontology module might not be a good idea. You can relate the 
> properties by making them subproperties of PROTON's equivalents. In fact, 
> this can be done in a separate module whose sole purpose is to create the 
> mapping between PROTON and your ontology. You can map classes and 
> properties there.  Then you can use your classes and properties, as they are 
> more relevant to your classes.
> 
> Using any of the approaches will work  (if you plan to use mainly your 
> ontology). It is just a matter of good style to keep ontologies in separate 
> modules and map them with a dedicated mapping module.
> 
> There are functional differences in making your properties a subproperty of 
> PROTON ones. If you make a query, using your properties, you will get 
> statements with your properties only. If you make a query using the PROTON 
> equivalent, you will get both statements with PROTON property and yours.
> 
> Hope this helps,
> Philip Alexiev,
> Software Engineer, KIM team
> 
> 
> On 6 Oct 2011, at 11:07 AM, Hoàng Công Nguyễn wrote:
> 
>> Hi,
>> I'm using KIM platform 3.0-RC-4 and i've got some problems, please help me 
>> solving them:
>> 1) I create new document, add to document repository
>> 
>> DocumentRepositoryAPI apiDocs = kimService.getDocumentRepositoryAPI();
>> CorporaAPI apiCorpora = kimService.getCorporaAPI();
>> String content = DocumentResource.TEST_URL_01;
>>  KIMDocument doc = apiCorpora.createDocument(content, true);
>> doc=apiSemnAnn.execute(doc);
>> apiDocs.addDocument(doc);
>> 
>> but when I browse the web UI, I can't see this document, and I must restart KIM 
>> for it to take effect. Is there any way to add a new document to the repository 
>> without restarting, or is it the same problem as with deleting a document in 
>> http://www.mail-archive.com/kim-discussion@ontotext.com/msg00696.html?
>> 
>> 2) According to the documentation, to import new instances into the KB I must 
>> create a new file (nt..), import it into OWLIM, and restart for it to take effect. 
>> So for new instances obtained from annotated documents, is there another way to 
>> import them into the KB using the Java API?
>> 
>> 3) When I add my own ontology to KIM, there may be some properties which are 
>> the same as properties in PROTON. What should I do with them? Map them to 
>> PROTON's properties using rdfs:subPropertyOf, use only my properties, or 
>> use only PROTON's properties...
>> 
>> Regards.
>> Cong
>> -- 
>> Cong Hoang Nguyen
>> University: Hanoi University of Techonology and Science.
>> Email: congnh0...@gmail.com
>> Facebook: http://www.facebook.com/monday0rsunday
>> YH: congnh0902
>> Skype: monday0rsunday
>> Phone: (+84)1678565200
>> 
> 
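
The functional difference described in point 3 of the quoted reply (querying by your own property versus querying by the PROTON super-property) can be illustrated with a toy in-memory model. This is only a sketch of rdfs:subPropertyOf semantics, not how OWLIM implements inference; the property names my:fullName and proton:someName and the resource names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of rdfs:subPropertyOf: a query on a super-property also matches
// statements made with its subproperties, but not the other way round.
class SubPropertySketch {

    /** A minimal subject-predicate-object statement. */
    static final class Statement {
        final String subject;
        final String predicate;
        final String object;

        Statement(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** True if predicate equals queried or is a (transitive) subproperty of it. */
    static boolean isSameOrSubPropertyOf(String predicate, String queried,
                                         Map<String, String> superOf) {
        String current = predicate;
        int hops = 0;
        // The hop limit guards against cycles in the mapping.
        while (current != null && hops++ <= superOf.size()) {
            if (current.equals(queried)) {
                return true;
            }
            current = superOf.get(current);
        }
        return false;
    }

    /** Returns all statements that answer a query on the given property. */
    static List<Statement> query(List<Statement> data, String queried,
                                 Map<String, String> superOf) {
        List<Statement> result = new ArrayList<>();
        for (Statement st : data) {
            if (isSameOrSubPropertyOf(st.predicate, queried, superOf)) {
                result.add(st);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Mapping module: my:fullName rdfs:subPropertyOf proton:someName (hypothetical names).
        Map<String, String> superOf = new HashMap<>();
        superOf.put("my:fullName", "proton:someName");

        List<Statement> data = Arrays.asList(
            new Statement("ex:a", "my:fullName", "Alice"),
            new Statement("ex:b", "proton:someName", "Bob"));

        System.out.println("query my:fullName     -> "
            + query(data, "my:fullName", superOf).size() + " statement(s)");
        System.out.println("query proton:someName -> "
            + query(data, "proton:someName", superOf).size() + " statement(s)");
    }
}
```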



Re: [Kim-discussion] UTF-8 problem

2011-10-05 Thread Philip Alexiev

Hi Tran Ngoc Duc,

The problem is actually caused by the format of the file. It seems that 
Java does not handle UTF-8 with a BOM well. Here is a reference to 
a thread in the Sesame forum (Sesame is the RDF framework we use). The 
bottom post describes the problem:


http://www.openrdf.org/forum/mvnforum/viewthread?thread=86

When I changed the encoding to standard UTF-8 without BOM it loaded OK.
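
If re-saving the file in an editor is not convenient, the BOM can also be removed programmatically. The sketch below is not part of KIM or Sesame; the class and method names are made up for illustration. It relies on the fact that a UTF-8 BOM is the three bytes EF BB BF at the very start of the file, which Java's UTF-8 decoder keeps as the character U+FEFF instead of discarding it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for detecting and removing a UTF-8 byte order mark.
class BomStripSketch {

    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** True if the data starts with the three-byte UTF-8 BOM. */
    static boolean hasUtf8Bom(byte[] data) {
        if (data.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (data[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the data without a leading UTF-8 BOM; data without a BOM is returned as-is. */
    static byte[] stripUtf8Bom(byte[] data) {
        return hasUtf8Bom(data)
            ? Arrays.copyOfRange(data, UTF8_BOM.length, data.length)
            : data;
    }

    public static void main(String[] args) {
        byte[] text = "Abstract".getBytes(StandardCharsets.UTF_8);
        byte[] withBom = new byte[UTF8_BOM.length + text.length];
        System.arraycopy(UTF8_BOM, 0, withBom, 0, UTF8_BOM.length);
        System.arraycopy(text, 0, withBom, UTF8_BOM.length, text.length);

        System.out.println("has BOM before: " + hasUtf8Bom(withBom));
        System.out.println("has BOM after:  " + hasUtf8Bom(stripUtf8Bom(withBom)));
    }
}
```

For a real ontology file you could read the bytes with Files.readAllBytes, pass them through stripUtf8Bom, and write the result back before starting the server.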

All the best,
Philip

On 5-10-2011 7:03 AM, Tran Ngoc Duc wrote:

Hi Philip Alexiev,

The attached file is in UTF-8 (with BOM) format. I have tried to 
change the "Abstract" class label to the Vietnamese language.

When I restart the KIM server, it can't work with that file.

Thanks,

On Tue, Oct 4, 2011 at 8:43 PM, <philip.alex...@ontotext.com> wrote:


Hi  Tran Ngoc Duc,

Loading an ontology in UTF-8 should not be a problem.

Can you provide the file that causes the problem?

Thank you,
Philip Alexiev
Software Engineer, KIM team

> Hi,
>
> I'm a newbie with KIM. I saved the PROTON ontology file in UTF-8 format and
> then started the KIM server, but it doesn't load this file.
> I need KIM to work with some UTF-8 files in my language (Vietnamese).
> How can I do that?
> Please give me some advice.
>
> Many thanks,






Re: [Kim-discussion] UTF-8 problem

2011-10-04 Thread philip . alexiev
Hi  Tran Ngoc Duc,

Loading an ontology in UTF-8 should not be a problem.

Can you provide the file that causes the problem?

Thank you,
Philip Alexiev
Software Engineer, KIM team

> Hi,
>
> I'm a newbie with KIM. I saved the PROTON ontology file in UTF-8 format and
> then started the KIM server, but it doesn't load this file.
> I need KIM to work with some UTF-8 files in my language (Vietnamese).
> How can I do that?
> Please give me some advice.
>
> Many thanks,


Re: [Kim-discussion] Keyword and description problem

2011-09-07 Thread Philip Alexiev @ Ontotext
Hi Srecko,

I noticed that in the first document the term appears as "Stainless Steel", 
and in the second as "Stainless steel" and "stainless steel". My guess is that 
you have a case-sensitive gazetteer and the term is in it either as 
"stainless steel" or "Stainless steel". Is this the case?

If not, please send the pipeline if it differs from the standard KIM 
pipeline, and also the part of the ontology that describes "Stainless steel".
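
To make the suspected cause concrete, here is a toy comparison of case-sensitive and case-insensitive lookup. It is not the KIM or GATE gazetteer implementation; the class and method names are invented, and a plain set of strings stands in for the gazetteer entries.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model: the same mention is found or missed depending on case handling.
class GazetteerCaseSketch {

    /** Exact, case-sensitive match against the gazetteer entries. */
    static boolean matchesCaseSensitive(Set<String> entries, String mention) {
        return entries.contains(mention);
    }

    /** Match after lower-casing both the entries and the mention. */
    static boolean matchesIgnoringCase(Set<String> entries, String mention) {
        String needle = mention.toLowerCase(Locale.ROOT);
        for (String entry : entries) {
            if (entry.toLowerCase(Locale.ROOT).equals(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> entries = new HashSet<>();
        entries.add("Stainless steel");

        String mention = "Stainless Steel"; // spelling used on the first page
        System.out.println("case-sensitive:   " + matchesCaseSensitive(entries, mention));
        System.out.println("case-insensitive: " + matchesIgnoringCase(entries, mention));
    }
}
```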

Thank you,
Philip Alexiev
Software Engineer, KIM team


On 2 Sep 2011, at 3:11 PM, srecko joksimovic wrote:

> Hello,
> 
> I have configured KIM to annotate using my ontology. And that part works 
> fine. But, I have noticed a strange behavior.
> 
> When I annotate this page:
> 
> http://www.keytometals.com/page.aspx?ID=CheckArticle&LN=EN&site=KTS&NM=67
> 
> I never get term "Stainless Steel" (which is in ontology). But I get 
> titanium, and few more terms that should not be there. But, when I annotate 
> this page:
> 
> http://en.wikipedia.org/wiki/Stainless_steel
> 
> there is "Stainless Steel", among other terms. I checked pages, and what I 
> saw is that the first page has defined keywords and description. I could find 
> term "Titanium" among keywords and in description, but not "Stainless Steel". 
> The other page does not have defined keywords and description, and annotation 
> is good.
> 
> Looks like when KIM finds keywords and description, it does not search the 
> rest of the text. In this case, keywords and description are wrong.
> Is this behavior "normal", and how can I make it search the rest of the text?
> 
> Best regards,
> Srecko Joksimovic


Re: [Kim-discussion] Topic

2011-07-14 Thread Philip Alexiev @ Ontotext
Hi Srecko,

Make sure you add the JAPE transducer after the  gazetteer.

It will also be much simpler to examine the process directly in the GATE 
interface. Create a file, add it to the corpus, set the corpus on the pipeline, 
and execute the pipeline.

Disable the last resource so that the temporary annotations are not removed. 
Run the pipeline and explore the annotations. Check whether Lookup annotations 
are created over the topics in the text.

Things to keep in mind:
- after you change the RDF that is imported in KIM, you should delete the 
KIM/context/default/populated folder (where the cache is stored) and restart 
the server
- the JAPE transducer should come after the Gazetteer, as it uses the 
annotations created by the Gazetteer as its input.

Hope this helps
Philip


On 14 Jul 2011, at 3:42 PM, srecko joksimovic wrote:

> Hi Boyan,
> I created the JAPE rule as you and Philip suggested, and stored it in the 
> context/default/resources/grammar/acm folder. Then I ran GATE and created a 
> JAPE transducer, topic_jape. I didn't specify inputASName, but I did add it 
> to the pipeline. I saved the application state, populated the KIMServer corpus, 
> and ran the application. I don't know how, but there is everything but Topic.
> 
> I'm still missing something, but I don't know what. Should I create Large KB 
> Gazetteer?
> 
> Best,
> Srecko
> 
> On Thu, Jul 14, 2011 at 2:15 PM, srecko joksimovic 
>  wrote:
> Hi Boyan,
> I didn't understand that I must create JAPE rule before I do everything else.
> I'll try this now.
> 
> Thank you!
> 
> Srecko
> 
> 
> On Thu, Jul 14, 2011 at 2:13 PM, Boyan Kukushev  
> wrote:
> Hi Srecko,
> 
> In order to see your Topic annotations, you must create the JAPE rule that
> Philip suggested:
> 
> Phase:  GazTopic
> Input: Lookup
> Options: control = appelt
> Rule: Topic
> (
>  {Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
> ):topic
> -->
> :topic.Topic = {rule=GazTopic, class=:topic.Lookup.class,
> inst=:topic.Lookup.inst}
> 
> and put that rule just after the gazetteer phases within the GATE pipeline.
> The easiest way to do this is using the KIM GATE interface by starting
>     KIM/bin/kim(.bat) gate
> 
> and modifying the pipeline.
> 
> You have already added the Topic annotation type to the list of allowed
> annotation types in KIM/config/nerc.properties. After you run the pipeline
> with this new resource included, Topic annotations should appear in the default
> annotation set for each document you process.
> 
> To be able to use again the pipeline, you should save it, again using the KIM
> GATE interface - right click on the pipeline and select 'Save application
> state'. Remember to remove (or empty) the document corpus used by the
> application. You choose whether to overwrite the default KIM pipeline
> (IE.gapp) or create a new one and point KIM to use it (setting the
> corresponding property in KIM/config/nerc.properties).
> 
> Hope this helps!
> 
> Regards,
> Boyan
> 
> P.S. What is happening exactly:
>  - the gazetteer phases use pre-defined knowledge base to find specific
> 'things' in the text you process; they produce annotations of type Lookup
>  - the JAPE rule would take all Lookup annotations that have the specific
> class (in your case that is
> "http://proton.semanticweb.org/2006/05/protont#Topic") and would create a new
> annotation of type Topic that is fully overlapping the current Lookup
> annotation
>  - the last phase in the pipeline removes all temporary annotations - the
> Lookup annotation is also a temporary annotation, but Topic (as it is added to
> the allowed annotations list) will not be removed.
> 
> On Thursday, July 14, 2011 14:50:51 srecko joksimovic wrote:
> > I configured nerc.properties, and now I have this:
> >
> > com.ontotext.kim.KIMConstants.IE_ANN_TYPES=Abstract, Brand, ContactInfo,
> > Date, Entity, Event, GeneralTerm, KeyLocation, KeyOrganization, KeyPerson,
> > KeyPhrase, Location, Money, Object, Organization, Percent, Person,
> > Position, Time, Acquirement, JobTitle, Number, Topic
> >
> > then I disabled last resource in pipeline, but I still can't see Topic.
> > Maybe I didn't understand well... should I first create Jape rule, or this
> > is enough to see Topic?
> >
> > Best,
> > Srecko
> >
> >
> >
> > On Thu, Jul 14, 2011 at 1:15 PM, Philip Alexiev @ Ontotext <
> >
> > philip.alex...@ontotext.com> wrote:
> > > The process is described in the customization guide you mentioned.
> > >
> > > You have added this RDF to the semantic repository.  This means that now
> > >

Re: [Kim-discussion] Topic

2011-07-14 Thread Philip Alexiev @ Ontotext
eratedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_4694954a-ba5e-4333-9ea9-8d5b94790c4e>
>  .
> 
> <http://www.lornet.org/acm-ccs/proton#D.3.2.10>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_d59afecf-26fc-4a5b-af92-e8c994542b23>
>  .
> 
> <http://www.lornet.org/acm-ccs/proton#G.3.16>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_33e428d6-3157-41ca-95ee-df79734c5a3d>
>  .
> 
> <http://www.lornet.org/acm-ccs/proton#C.1.1.2>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_2d3fd573-5a41-403b-bdf3-22332ad9d839>
>  .
> 
> <http://www.lornet.org/acm-ccs/proton#B.5.2.1>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_171d052f-553e-4990-bd15-8416d28f4cf1>
>  .
> 
> <http://www.lornet.org/acm-ccs/proton#Alias_d680fd76-0dcf-417b-9f4e-5cbda4616b72>
>   a   protons:Alias ;
>   <http://www.w3.org/2000/01/rdf-schema#label>
>   "Pixel Classification@en" .
> 
> I added this document to owlim.ttl and imported my instances.
> 
> I tried to follow document Customizing KIM 3.pdf, but as mapping has already 
> been done, I didn't know what else to do. Maybe I should create Jape rule, or 
> something like that, but I think that I should see Topic with or without my 
> instances. I'm not sure, that is only my opinion. 
> 
> Best,
> Srecko
> 
> On Thu, Jul 14, 2011 at 12:48 PM, Philip Alexiev @ Ontotext 
>  wrote:
> Can you describe the exact actions you take to add the topics to the  IE 
> logic ?  The exact customizations you have made to KIM.
> 
> Thanks,
> Philip
> 
> On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote:
> 
>> Hi Philip,
>> with GATE is same as with Java code. I get the same annotations. I tried to 
>> edit nerc.properties and add Topic to 
>> com.ontotext.kim.KIMConstants.IE_ANN_TYPES list, but nothing changed. 
>> 
>> Do I have to change something else?
>> 
>> Best,
>> Srecko
>> 
>> On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext 
>>  wrote:
>> Hi Srecko,
>> 
>> You can run the GATE interface to check exactly what annotations are created 
>> and their type. You can do this by running:
>> bash  KIM/bin/kim  gate
>> 
>> You probably use a JAPE rule to match the Lookup annotations with 
>> class="http://proton.semanticweb.org/2006/05/protont#Topic" and are 
>> creating one of the entity annotations over it (the entity annotations are 
>> a whitelist of annotations that remain after the annotation process 
>> finishes; all annotations not in this list are removed).
>> 
>> So check what type of annotation you are creating.
>> 
>> If this is not the case, please provide more details how you handle the  
>> topic lookups.
>> 
>> All the best,
>> Philip
>> 
>> 
>> On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote:
>> 
>> > Hello Philip,
>> >
>> > I included my instances in KIM. When I use web UI, I see them all, and 
>> > everything looks ok. But when I run code like this:
>> >
>> >     KIMDocument kimDoc = apiCorpora.createDocument(_string_to_annotate, true);
>> >
>> >     kimDoc = apiSemAnn.execute(kimDoc);
>> >
>> >     KIMAnnotationSet kimASet = kimDoc.getAnnotations();
>> >     Set typesSet = kimASet.getAllTypes();
>> >     Iterator iterator = typesSet.iterator();
>> >
>> >     // show annotations of every type separately
>> >     while (iterator.hasNext())
>> >     {
>> >         Object key = iterator.next();
>> >         KIMAnnotationSet kimFilteredASet = kimASet.get(String.valueOf(key));
>> >         Iterator annIterator = kimFilteredASet.iterator();
>> >         System.out.println(" = Annotations of type [" + String.valueOf(key) + "] :");
>> >
>> >         while (annIterator.hasNext())
>> >         {
>> >             System.out.println(" -- " + annIterator.next());
>> >         }
>> >     }
>> >     System.out.println("[ Document's Typed Annotations (end) ]");
>> >
>> > I don't see any annotation of type Topic. I see all of them when I use web 
>> > UI, like I said. But when I try to annotate string from Java application, 
>> > I don't get any Topic annotations.
>> >
>> > Could you please help me on this one?
>> >
>> > Best,
>> > Srecko
> 
> 



Re: [Kim-discussion] Topic

2011-07-14 Thread Philip Alexiev @ Ontotext
Can you describe the exact actions you take to add the topics to the  IE logic 
?  The exact customizations you have made to KIM.

Thanks,
Philip

On 14 Jul 2011, at 1:41 PM, srecko joksimovic wrote:

> Hi Philip,
> with GATE is same as with Java code. I get the same annotations. I tried to 
> edit nerc.properties and add Topic to 
> com.ontotext.kim.KIMConstants.IE_ANN_TYPES list, but nothing changed. 
> 
> Do I have to change something else?
> 
> Best,
> Srecko
> 
> On Thu, Jul 14, 2011 at 12:26 PM, Philip Alexiev @ Ontotext 
>  wrote:
> Hi Srecko,
> 
> You can run the GATE interface to check exactly what annotations are created 
> and their type. You can do this by running:
> bash  KIM/bin/kim  gate
> 
> You probably use a JAPE rule to match the Lookup annotations with 
> class="http://proton.semanticweb.org/2006/05/protont#Topic" and are creating 
> one of the entity annotations over it (the entity annotations are a 
> whitelist of annotations that remain after the annotation process finishes; 
> all annotations not in this list are removed).
> 
> So check what type of annotation you are creating.
> 
> If this is not the case, please provide more details how you handle the  
> topic lookups.
> 
> All the best,
> Philip
> 
> 
> On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote:
> 
> > Hello Philip,
> >
> > I included my instances in KIM. When I use web UI, I see them all, and 
> > everything looks ok. But when I run code like this:
> >
> >     KIMDocument kimDoc = apiCorpora.createDocument(_string_to_annotate, true);
> >
> >     kimDoc = apiSemAnn.execute(kimDoc);
> >
> >     KIMAnnotationSet kimASet = kimDoc.getAnnotations();
> >     Set typesSet = kimASet.getAllTypes();
> >     Iterator iterator = typesSet.iterator();
> >
> >     // show annotations of every type separately
> >     while (iterator.hasNext())
> >     {
> >         Object key = iterator.next();
> >         KIMAnnotationSet kimFilteredASet = kimASet.get(String.valueOf(key));
> >         Iterator annIterator = kimFilteredASet.iterator();
> >         System.out.println(" = Annotations of type [" + String.valueOf(key) + "] :");
> >
> >         while (annIterator.hasNext())
> >         {
> >             System.out.println(" -- " + annIterator.next());
> >         }
> >     }
> >     System.out.println("[ Document's Typed Annotations (end) ]");
> >
> > I don't see any annotation of type Topic. I see all of them when I use web 
> > UI, like I said. But when I try to annotate string from Java application, I 
> > don't get any Topic annotations.
> >
> > Could you please help me on this one?
> >
> > Best,
> > Srecko


Re: [Kim-discussion] Topic

2011-07-14 Thread Philip Alexiev @ Ontotext
Hi Srecko,

You can run the GATE interface to check exactly which annotations are created 
and their types. You can do this by running:
bash  KIM/bin/kim  gate

You probably use a JAPE rule to match the Lookup annotations with 
class="http://proton.semanticweb.org/2006/05/protont#Topic" and are creating 
one of the entity annotations over it (the entity annotations are a whitelist 
of annotations that remain after the annotation process finishes; all 
annotations not in this list are removed).

So check what type of annotation you are creating.

If this is not the case, please provide more details how you handle the  topic 
lookups.
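
To make the whitelist behaviour concrete, here is a minimal sketch in plain Java. It is only a model of the filtering idea described above and does not use the KIM or GATE API: annotation types are represented as strings, and the class name, method name and sample whitelist contents are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model only: plain strings stand in for annotation types.
// This does not use the KIM or GATE API.
class AnnotationWhitelistSketch {

    // Keep only the annotation types that appear in the whitelist,
    // preserving the original order.
    static List<String> keepWhitelisted(List<String> producedTypes, Set<String> whitelist) {
        List<String> kept = new ArrayList<>();
        for (String type : producedTypes) {
            if (whitelist.contains(type)) {
                kept.add(type);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical whitelist contents, chosen only for this example.
        Set<String> whitelist = new HashSet<>(Arrays.asList("Person", "Organization", "Location"));
        List<String> produced = Arrays.asList("Lookup", "Person", "Topic", "Location");
        // "Lookup" and "Topic" are dropped because they are not whitelisted.
        System.out.println(keepWhitelisted(produced, whitelist));
    }
}
```

In this model a Topic annotation survives only once "Topic" is added to the whitelist, which is the situation to rule out first when Topic annotations are missing from the API result.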

All the best,
Philip


On 14 Jul 2011, at 1:19 PM, srecko joksimovic wrote:

> Hello Philip,
> 
> I included my instances in KIM. When I use web UI, I see them all, and 
> everything looks ok. But when I run code like this:
> 
>    KIMDocument kimDoc = apiCorpora.createDocument(_string_to_annotate, true);
> 
>    kimDoc = apiSemAnn.execute(kimDoc);
> 
>    KIMAnnotationSet kimASet = kimDoc.getAnnotations();
>    Set typesSet = kimASet.getAllTypes();
>    Iterator iterator = typesSet.iterator();
> 
>    // show annotations of every type separately
>    while (iterator.hasNext())
>    {
>        Object key = iterator.next();
>        KIMAnnotationSet kimFilteredASet = kimASet.get(String.valueOf(key));
>        Iterator annIterator = kimFilteredASet.iterator();
>        System.out.println(" = Annotations of type [" + String.valueOf(key) + "] :");
> 
>        while (annIterator.hasNext())
>        {
>            System.out.println(" -- " + annIterator.next());
>        }
>    }
>    System.out.println("[ Document's Typed Annotations (end) ]");
> 
> I don't see any annotation of type Topic. I see all of them when I use web 
> UI, like I said. But when I try to annotate string from Java application, I 
> don't get any Topic annotations.
> 
> Could you please help me on this one?
> 
> Best,
> Srecko


Re: [Kim-discussion] Checking if a document already exists in the document repository -- KIMQueryException: Lucene special character

2011-07-09 Thread Philip Alexiev @ Ontotext
Hi Jeremy,

It is best if you provide a simple standalone class or a test case that works 
with some test data and reproduces the problem. That way we can track 
exactly what is happening.
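
Such a test case can be very small. The sketch below is one possible starting point: it isolates the escaping step so that the exact string that would be sent to the repository can be printed and attached to a report. The escape helper is a hand-written stand-in (an assumption modelled on the Lucene query syntax), not Lucene's own QueryParser.escape, and the class name and sample description are made up.

```java
class LuceneEscapeSketch {

    // Characters treated as special here; an assumption modelled on the
    // Lucene query syntax, not copied from the library.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Prefix every special character with a backslash.
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() * 2);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up description containing parentheses, similar to the failing input.
        String description = "Canalhopper (Duur)";
        System.out.println(escape(description));
    }
}
```

Replacing the println with the real query call against a running server would turn this into a full reproduction of the reported exception.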

Thank you,
Philip

On 8 Jul 2011, at 6:08 PM, Jeremy Raes wrote:

> Hey,
> 
> I am building an application upon KIM whereby I need to check if a document 
> already exists in the repository before deciding on adding it.
> 
> To do this, I wrote the following code:
> 
> private boolean itemNotInRepository(Item item){
>     assert(item != null);
>     DocumentQuery query = new DocumentQuery();
>     DocumentQueryResult queryResult = null;
>     try {
>         String escaped = QueryParser.escape(item.getDescription());
>         query.setKeywordRestriction(escaped);
>         queryResult = this.apiDR.getDocumentIds(query);
>     } catch (KIMQueryException e) {
>         e.printStackTrace();
>     }
>     return queryResult.isEmpty();
> }
> 
> Because some of the Strings, returned by item.getDescription(), might contain 
> special characters [mainly "(" and ")"], I added the String escaped = 
> QueryParser.escape(item.getDescription()) to my code, but nonetheless I get a 
> KIMQueryException: 
> 
> com.ontotext.kim.client.query.KIMQueryException: Lucene special characters in 
> field name in brackets: [Canalhopper.\(Duur\]
> at 
> com.ontotext.kim.lucene.LuceneDocumentRepositoryImpl.getDocumentIds(LuceneDocumentRepositoryImpl.java:429)
> at 
> com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
> at com.ontotext.kim.coredb.RdfCore.getDocumentIds(RdfCore.java:266)
> at 
> com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
> at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:513)
> at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
> at sun.rmi.transport.Transport$1.run(Transport.java:159)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
> at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:680)
> Exception in thread "main" java.lang.NullPointerException
> at 
> knowledgeAcquisition.KIMKnowledgeAcquisition.itemNotInRepository(KIMKnowledgeAcquisition.java:147)
> at 
> knowledgeAcquisition.KIMKnowledgeAcquisition.execute(KIMKnowledgeAcquisition.java:188)
> at run.Main.main(Main.java:21)
> 
> My guess is that KIM pre-checks the query (before processing it with Lucene) 
> and throws an error when a special character is found -- even though there is 
> an "\" before the special character. Any suggestions on how I can (1) either 
> avoid this error or (2) any other methods to check if a document already 
> exists in the document repository?
> 
> 
> 
> Any help is appreciated. Thanks in advance!
> 
> 
> 
> Best regards, 
> 
> Jeremy
> 


Re: [Kim-discussion] How to declare trusted?

2011-07-08 Thread Philip Alexiev @ Ontotext
Hi Srecko,

You can see how some trusted sources are defined in KIM's  KB in  
KIM/context/default/kb/wkb.nt :

 
 
 .
 
 
 .

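In order to declare your custom gazetteer as a trusted source, just add this 
statement to the RDF:

<http://www.lornet.org/acm-ccs/proton#TrustedSrc> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://proton.semanticweb.org/2006/05/protons#Trusted> .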

Best,
Philip

On 7 Jul 2011, at 10:08 PM, Srecko Joksimovic wrote:

> How can I make sure that:
>  
> http://www.lornet.org/acm-ccs/proton#TrustedSrc is declared as Trusted, when 
> I want to extend Proton ontology?
> I did everything like it says in Customizing KIM 3.0, but I’m not sure how to 
> check if this is declared as trusted?
>  
> Best,
> Srecko


Re: [Kim-discussion] KIM

2011-07-05 Thread Philip Alexiev @ Ontotext
Hello Soufiene,

Please check that you have followed all the steps in the guide. Make sure the 
KIM server is running. If the problem persists, please send me the logs of KIM, 
which are in  KIM/log/  folder.

All the best,
Philip Alexiev
Software Engineer, KIM team


On 5 Jul 2011, at 5:30 PM, borislav popov wrote:

> Hi Soufiene, 
>i fwd-ed your request to kim-discussion 
> Please sign in to the list from http://www.ontotext.com/kim/support
> all the best 
> borislav 
> 
> 
> On Jul 5, 2011, at 1:54 PM, soufiene katet wrote:
> 
>> Dear Mr ,
>> I'm using your guide 
>> http://www.ontotext.com/sites/default/files/kim/KIM_Getting_Started_Guide.pdf
>> to install KIM, but I have a problem: "We All Make Mistakes
>> A problem occurs while processing the request:
>> java.lang.NullPointerException
>> Cannot add null element to the form.
>> The issue has been logged. The cause may have been loss of connection to the 
>> KIM Server.
>> 
>> The connection to KIM cannot be established. Please verify the server is 
>> still running and is not reporting errors. Try connecting again after that."
>> 
>> Can you help me to resolve it .
>> 
>> Cordially 
>> Soufiene KATET
> 


[Kim-discussion] Fwd: Install Ontologies?

2011-07-03 Thread Philip Alexiev @ Ontotext


Begin forwarded message:

> From: "Philip Alexiev @ Ontotext" 
> Date: 4 July 2011 9:01:10 AM GMT+03:00
> To: Ben Fino-Radin 
> Cc: kim-i...@ontotext.com
> Subject: Re: Install Ontologies?
> 
> Hi Ben,
> 
> http://www.ontotext.com/sites/default/files/Customizing%20KIM3.pdf
> 
> This is a guide how to add a new ontology to KIM.  It is more complicated 
> than just importing it in the semantic repository. The resources in KIM 
> should be made aware of it and start working with it. The steps are described 
> in the guide.
> 
> Hope this helps
> 
> Philip Alexiev
> Software Engineer,  
> KIM team
> 
> On 4 Jul 2011, at 12:09 AM, Ben Fino-Radin wrote:
> 
>> Hi There,
>> 
>> Does KIM offer the option to install ontologies from RDF/XML?
>> 
>> For example: http://vocab.org/bio/0.1/.html
>> 
>> Best,
>> Ben
> 



Re: [Kim-discussion] Extend proton ontology

2011-07-02 Thread Philip Alexiev @ Ontotext
Hi Srecko,

The best way to check what will be filled in the gazetteer dictionary is to run 
the KIM server and use JVisualVM to execute the gazetteer query against it.

If you give more context, I can provide some more concrete guidelines. Which 
ontology are you using? Which concepts do you want to recognize?

All the best
Philip

On 2 Jul 2011, at 2:27 PM, Srecko Joksimovic wrote:

> I’m sorry, I forgot to attach screenshot in last email. Just in case, I’m 
> sending it again.
>  
>  
> Hi Philip!
>  
> I was out of work for few days, but now I have new question. I'm reading 
> Customizing KIM3.pdf, step by step. As a result, I got this:
> 
> As you can see, everything is there. Even Topic. Topic is class that I added. 
> But, looks like there is no annotations for Topic. Could it maybe be because 
> of query?
>  
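> prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> prefix protont: <http://proton.semanticweb.org/2006/05/protont#>
> PREFIX protons: <http://proton.semanticweb.org/2006/05/protons#>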
>  
> SELECT ?entity ?cl
> WHERE {  
> 
>  ?entity a ?cl ;  
>   
> protons:generatedBy 
>  .  
>  ?cl rdfs:subClassOf protont:Topic .  
>
>  OPTIONAL 
> 
>  { 
> ?sc rdfs:subClassOf ?cl.
> ?entity a ?sc .
> filter(?cl != ?sc)
>  } 
>  filter (!bound(?sc) && isURI (?cl))  
>  
> }  
>  
> I know that it could be anything. But I created Large KB Gazetter 
> (TopicLKBGazetter), added to pipeline, and as you can see, it is there. If 
> you have any idea, please let me know. I will try few more things.
>  
> Best,
> Srecko
>  


Re: [Kim-discussion] Extend proton ontology

2011-06-28 Thread Philip Alexiev @ Ontotext
Hi Srecko,

Have you run the KIM server?
Is it on your local machine?
Are you accessing it through a proxy ?
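
If the server is local, one way to check whether a JMX agent is reachable at all is a small client like the sketch below. It uses only the standard javax.management classes from the JDK. The port number is hypothetical: it has to be the port on which the server JVM actually exposes JMX, and 8080 is usually an HTTP port rather than a JMX one. The class and helper names are made up for this example.

```java
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class JmxProbeSketch {

    // Build the usual RMI-based JMX service URL for a host and port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default port; pass the real JMX port as the first argument.
        int jmxPort = args.length > 0 ? Integer.parseInt(args[0]) : 9999;
        JMXServiceURL url = new JMXServiceURL(serviceUrl("localhost", jmxPort));
        // try-with-resources closes the connector when the listing is done.
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            // Passing null for both arguments selects every registered MBean.
            Set<ObjectName> names = connection.queryNames(null, null);
            for (ObjectName name : names) {
                System.out.println(name);
            }
        }
    }
}
```

If this client cannot connect either, the problem is with the JMX agent or the port rather than with JVisualVM.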

best,
philip

On 28 Jun 2011, at 5:18 PM, Srecko Joksimovic wrote:

> Hi Philip,
> 
> Yes I have. It is exactly same as in your screenshot. When I select MBeans,
> all I see is this message:
> "Data not available because JMX connection to the JMX agent could not be
> established."
> 
> Maybe I should do something to enable this JMX agent?
> 
> Best,
> Srecko
> 
> -Original Message-
> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@gmail.com] 
> Sent: Tuesday, June 28, 2011 12:32
> To: srecko joksimovic
> Cc: kim-discussion@ontotext.com mailing-list
> Subject: Re: [Kim-discussion] Extend proton ontology
> 
> Hi Srecko
> 
> Have you activated the MBeans extension in JVisualVM ?  Once you do it, when
> you select the KIM server process,  you will be able to see the list of
> registered MBeans and choose from them as you see in my screenshot.
> 
> Hth,
> Philip
> 
> 
> On 28 Jun 2011, at 1:22 PM, srecko joksimovic wrote:
> 
>> Hi Philip,
>> 
>> I hope that you are still willing to help me :) 
>> 
>> I finished the first part - started KIM and installed JVisualVM. I didn't
> have any problem to get it up and running, but I couldn't get MBeans because
> of:
>> 
>> "Data not available because JMX connection to the JMX agent could not be
> established."
>> 
>> I tried "Add JMX Connection..." and I got this message:
>> 
>> "Cannot connect to localhost:8080 using
> service:jmx:rmi://jndi/rmi://localhost:8080/jmxrmi"
>> 
>> I probably missed something again... could you please tell me what?
>> 
>> 


Re: [Kim-discussion] Extend proton ontology

2011-06-22 Thread Philip Alexiev @ Ontotext
For step 2, you can take a look at the existing grammars that transform 
Lookup annotations into meaningful annotations. One such grammar is 
KIM/context/default/resources/grammar/main/gazrules.jape .

Your grammar for Topic will look like this:

Phase:  GazTopic
Input: Lookup
Options: control = appelt

Rule:   GazTopic
(
{Lookup.class == "http://proton.semanticweb.org/2006/05/protont#Topic"}
):topic
-->
:topic.Topic = { inst = :topic.Lookup.inst , class = :topic.Lookup.class }


You can add such a rule for each class you are interested in. Just remember to 
add  "Topic" to KIM's annotation types whitelist.

best
philip


On 22 Jun 2011, at 9:54 PM, Srecko Joksimovic wrote:

> Thank you Philip. I know something about GATE, but I have to learn more.
> Ok, this means that I have step 2. to solve...
>  
> Best,
> Srecko
>  
> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@ontotext.com] 
> Sent: Wednesday, June 22, 2011 8:17 PM
> To: Srecko Joksimovic
> Cc: kim-discussion@ontotext.com
> Subject: Re: [Kim-discussion] Extend proton ontology
>  
> Hi Srecko,
>  
> You state that a source is trusted by importing a statement like this in 
> owlim:
>  
> <http://www.lornet.org/acm-ccs/proton#TrustedSrc> 
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> <http://proton.semanticweb.org/2006/05/protons#Trusted> .
>  
> You can either add this to your custom RDF or you can use the already 
> predefined trusted sources (they are defined with similar statements in  
> KIM/context/default/kb/wkb.nt ) .
>  
> You can read about JAPE in the official GATE documentation. The documentation 
> is very comprehensive and helpful: 
> http://gate.ac.uk/sale/tao/splitch8.html#chap:jape
>  
> GATE is the semantic annotation platform KIM uses. If you want to customize 
> the default KIM information extraction process, getting familiar with GATE 
> will be very helpful.
>  
> HTH
> Philip
>  
>  
> On 22 Jun 2011, at 7:32 PM, Srecko Joksimovic wrote:
> 
> 
> Hi Philip,
>  
> I have one more question for now... I am beginning to doubt if I know, but 
> let’s assume that I know how to do what you said in your last answer (point 3 
> and 4). I know what should JAPE rule be, and I will probably be able to do 
> that.
> But how to make sure that  <http://www.lornet.org/acm-ccs/proton#TrustedSrc> 
> is declared as trusted?
>  
> This could be funny question, but I am not sure what to do.
>  
> Best,
> Srecko
>  
> From: Srecko Joksimovic [mailto:sreckojoksimo...@gmail.com] 
> Sent: Wednesday, June 22, 2011 5:47 PM
> To: 'Philip Alexiev @ Ontotext'
> Cc: 'kim-discussion@ontotext.com'
> Subject: RE: [Kim-discussion] Extend proton ontology
>  
> Thank you Philip!
> This was a really brief answer. I will try to do what you said. Thank you 
> again.
>  
> Best,
> Srecko
>  
> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@ontotext.com] 
> Sent: Wednesday, June 22, 2011 5:44 PM
> To: srecko joksimovic
> Cc: kim-discussion@ontotext.com
> Subject: Re: [Kim-discussion] Extend proton ontology
>  
> Hi,
>  
> 1. First make sure <http://www.lornet.org/acm-ccs/proton#TrustedSrc> is 
> declared as trusted. This is a statement from KIM's knowledge base:
> <http://www.ontotext.com/kim/2006/05/wkb#Gazetteer> 
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> <http://proton.semanticweb.org/2006/05/protons#Trusted> .
> You can either use some of the existing trusted sources, or declare yours as 
> trusted.
>  
> 2. Then create a JAPE rule to match the Lookup annotations with class 
> feature "http://proton.semanticweb.org/2006/05/protont#Topic" and to 
> create a Topic annotation.
>  
> 3. Add the Topic annotation type to KIM's whitelist of annotations in 
> KIM/config/nerc.properties, in the property 
> com.ontotext.kim.KIMConstants.IE_ANN_TYPES .
>  
> 4. Delete the cache by removing the KIM/context/default/populated folder and 
> start KIM again. You can start it with the Gate interface to check if your 
> annotations are created. To do that run  KIM/bin/kim gate .
>  
>  
> Hope this helps
> Philip
>  
> On 22 Jun 2011, at 6:29 PM, srecko joksimovic wrote:
>  
> 
> Hello Philip,
>  
> I think that I could send you a part of this file. This is what I have 
> defined:
>  
> <http://www.lornet.org/acm-ccs/proton#K.7.0>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_756db3b5-b66b-41fe-a82e-1012f18a6672>
>  

Re: [Kim-discussion] Extend proton ontology

2011-06-22 Thread Philip Alexiev @ Ontotext
Hi Srecko,

You state that a source is trusted by importing a statement like this in owlim:

<http://www.lornet.org/acm-ccs/proton#TrustedSrc> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://proton.semanticweb.org/2006/05/protons#Trusted> .

You can either add this to your custom RDF or you can use the already 
predefined trusted sources (they are defined with similar statements in  
KIM/context/default/kb/wkb.nt ) .
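
As a small illustration, the statement can also be generated rather than typed by hand. The sketch below only formats one N-Triples line from the two URIs shown above; the class and method names are made up, and writing the line into your RDF file and reloading it is left out.

```java
class TrustedStatementSketch {

    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String TRUSTED = "http://proton.semanticweb.org/2006/05/protons#Trusted";

    // Format one N-Triples line declaring the given source URI as trusted.
    static String trustedTriple(String sourceUri) {
        return "<" + sourceUri + "> <" + RDF_TYPE + "> <" + TRUSTED + "> .";
    }

    public static void main(String[] args) {
        // Prints the statement for the custom source discussed in this thread.
        System.out.println(trustedTriple("http://www.lornet.org/acm-ccs/proton#TrustedSrc"));
    }
}
```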

You can read about JAPE in the official GATE documentation. The documentation 
is very comprehensive and helpful: 
http://gate.ac.uk/sale/tao/splitch8.html#chap:jape

GATE is the semantic annotation platform KIM uses. If you want to customize the 
default KIM information extraction process, getting familiar with GATE will be 
very helpful.

HTH
Philip


On 22 Jun 2011, at 7:32 PM, Srecko Joksimovic wrote:

> Hi Philip,
>  
> I have one more question for now... I am beginning to doubt if I know, but 
> let’s assume that I know how to do what you said in your last answer (point 3 
> and 4). I know what should JAPE rule be, and I will probably be able to do 
> that.
> But how to make sure that  <http://www.lornet.org/acm-ccs/proton#TrustedSrc> 
> is declared as trusted?
>  
> This could be funny question, but I am not sure what to do.
>  
> Best,
> Srecko
>  
> From: Srecko Joksimovic [mailto:sreckojoksimo...@gmail.com] 
> Sent: Wednesday, June 22, 2011 5:47 PM
> To: 'Philip Alexiev @ Ontotext'
> Cc: 'kim-discussion@ontotext.com'
> Subject: RE: [Kim-discussion] Extend proton ontology
>  
> Thank you Philip!
> This was a really brief answer. I will try to do what you said. Thank you 
> again.
>  
> Best,
> Srecko
>  
> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@ontotext.com] 
> Sent: Wednesday, June 22, 2011 5:44 PM
> To: srecko joksimovic
> Cc: kim-discussion@ontotext.com
> Subject: Re: [Kim-discussion] Extend proton ontology
>  
> Hi,
>  
> 1. First make sure <http://www.lornet.org/acm-ccs/proton#TrustedSrc> is 
> declared as trusted. This is a statement from KIM's knowledge base:
> <http://www.ontotext.com/kim/2006/05/wkb#Gazetteer> 
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> <http://proton.semanticweb.org/2006/05/protons#Trusted> .
> You can either use some of the existing trusted sources, or declare yours as 
> trusted.
>  
> 2. Then create a JAPE rule to match the Lookup annotations with class 
> feature "http://proton.semanticweb.org/2006/05/protont#Topic" and to 
> create a Topic annotation.
>  
> 3. Add the Topic annotation type to KIM's whitelist of annotations in 
> KIM/config/nerc.properties, in the property 
> com.ontotext.kim.KIMConstants.IE_ANN_TYPES .
>  
> 4. Delete the cache by removing the KIM/context/default/populated folder and 
> start KIM again. You can start it with the Gate interface to check if your 
> annotations are created. To do that run  KIM/bin/kim gate .
>  
>  
> Hope this helps
> Philip
>  
> On 22 Jun 2011, at 6:29 PM, srecko joksimovic wrote:
>  
> 
> Hello Philip,
>  
> I think that I could send you a part of this file. This is what I have 
> defined:
>  
> <http://www.lornet.org/acm-ccs/proton#K.7.0>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_756db3b5-b66b-41fe-a82e-1012f18a6672>
>  .
>  
> <http://www.lornet.org/acm-ccs/proton#Alias_877dc2e2-c7cf-4188-a523-6ee9b7cbdd24>
>   a   protons:Alias ;
>   <http://www.w3.org/2000/01/rdf-schema#label>
>   "Optimization@en" .
>  
> <http://www.lornet.org/acm-ccs/proton#Alias_62107227-8c21-4ed8-99e0-bb2e4e1cb810>
>   a   protons:Alias ;
>   <http://www.w3.org/2000/01/rdf-schema#label>
>   "Assistive Technologies For Persons With Disabilities@en" .
>  
> <http://www.lornet.org/acm-ccs/proton#G.3.12>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_ff0e0512-f2e8-4b14-9a37-71a675dcd2eb>
>  .
>  
>  
> and many others... Could you please tell me what to do next?
>  
> Best, 
> Srecko
>  
> On Wed, Jun 22, 2011 at 5:01 PM, Philip Alexiev @ Ontotext 
>  wrote:
> Hello Srecko,
>  
> The steps are described in this guide:
> http://www.ontotext.com/sites/default/files/Customizing%20KIM3.pdf  . 
>  
> Depending on the specifics of your ontology, you could 

Re: [Kim-discussion] Extend proton ontology

2011-06-22 Thread Philip Alexiev @ Ontotext
Hi,

1. First make sure <http://www.lornet.org/acm-ccs/proton#TrustedSrc> is 
declared as trusted. This is a statement from KIM's knowledge base:
<http://www.ontotext.com/kim/2006/05/wkb#Gazetteer> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://proton.semanticweb.org/2006/05/protons#Trusted> .
You can either use some of the existing trusted sources, or declare yours as 
trusted.

2. Then create a JAPE rule to match the Lookup annotations with class feature 
"http://proton.semanticweb.org/2006/05/protont#Topic" and to create a Topic 
annotation.

3. Add the Topic annotation type to KIM's whitelist of annotations in 
KIM/config/nerc.properties, in the property 
com.ontotext.kim.KIMConstants.IE_ANN_TYPES .

4. Delete the cache by removing the KIM/context/default/populated folder and 
start KIM again. You can start it with the Gate interface to check if your 
annotations are created. To do that run  KIM/bin/kim gate .
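
For step 3, the edit can be done by hand, but a sketch of the same change in Java may make the intent clearer. The property key is the one named above; everything else is an assumption: the value is treated as a comma-separated list, the sample types are made up, and the helper and class names are hypothetical. Check the actual format in nerc.properties before relying on it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class WhitelistPropertySketch {

    static final String KEY = "com.ontotext.kim.KIMConstants.IE_ANN_TYPES";

    // Return the list value with the type appended, unless it is already present.
    // Assumes a comma-separated value; the real delimiter should be checked in the file.
    static String withType(String currentValue, String type) {
        if (currentValue == null || currentValue.trim().isEmpty()) {
            return type;
        }
        for (String existing : currentValue.split(",")) {
            if (existing.trim().equals(type)) {
                return currentValue;
            }
        }
        return currentValue + "," + type;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file contents; the listed types are made up.
        Properties props = new Properties();
        props.load(new StringReader(KEY + "=Person,Organization,Location\n"));
        props.setProperty(KEY, withType(props.getProperty(KEY), "Topic"));
        System.out.println(KEY + "=" + props.getProperty(KEY));
    }
}
```

The helper leaves the value untouched when the type is already listed, so running it twice does not duplicate the entry.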


Hope this helps
Philip

On 22 Jun 2011, at 6:29 PM, srecko joksimovic wrote:

> Hello Philip,
> 
> I think that I could send you a part of this file. This is what I have 
> defined:
> 
> <http://www.lornet.org/acm-ccs/proton#K.7.0>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_756db3b5-b66b-41fe-a82e-1012f18a6672>
>  .
> 
> <http://www.lornet.org/acm-ccs/proton#Alias_877dc2e2-c7cf-4188-a523-6ee9b7cbdd24>
>   a   protons:Alias ;
>   <http://www.w3.org/2000/01/rdf-schema#label>
>   "Optimization@en" .
> 
> <http://www.lornet.org/acm-ccs/proton#Alias_62107227-8c21-4ed8-99e0-bb2e4e1cb810>
>   a   protons:Alias ;
>   <http://www.w3.org/2000/01/rdf-schema#label>
>   "Assistive Technologies For Persons With Disabilities@en" .
> 
> <http://www.lornet.org/acm-ccs/proton#G.3.12>
>   a   protont:Topic ;
>   protons:generatedBy <http://www.lornet.org/acm-ccs/proton#TrustedSrc> ;
>   protons:hasMainAlias
>   
> <http://www.lornet.org/acm-ccs/proton#Alias_ff0e0512-f2e8-4b14-9a37-71a675dcd2eb>
>  .
> 
> 
> and many others... Could you please tell me what to do next?
> 
> Best, 
> Srecko
> 
> On Wed, Jun 22, 2011 at 5:01 PM, Philip Alexiev @ Ontotext 
>  wrote:
> Hello Srecko,
> 
> The steps are described in this guide:
> http://www.ontotext.com/sites/default/files/Customizing%20KIM3.pdf  . 
> 
> Depending on the specifics of your ontology, you could map it to proton or 
> not. If you can create a complete mapping to the proton classes, then 
> recognizing the new concepts in the texts will be a little easier. You just 
> need to create a statement for each new concept that it is generated by a 
> trusted source and also point its labels.
> 
> If you decide not to map to proton, then some additional steps are required . 
> Create a new gazetteer with the query to get your concepts and their type and 
> label. Then use a Jape Transducer and create your custom jape rules to 
> convert the resulting Lookup annotations to some of the annotation types in 
> KIM's whitelist. This is also described in the guide.
> 
> Hope this helps.
> philip
> 
> 
> On 22 Jun 2011, at 5:39 PM, Srecko Joksimovic wrote:
> 
>> Hello everyone!
>>  
>> I have extended Proton ontology, and created acm_proton.ttl file. This file 
>> contains my Concepts. When I annotate document, I want to see only these 
>> Concepts.
>> I saw tutorial, I have read few posts, but I could not find the solution. I 
>> have edited kim/config/owlim.ttl file, and added new line in import section. 
>> And  I also added new namespace. But when I run annotator, I do not see my 
>> Concepts.
>>  
>> Please, I need quick help on this one.
>>  
>> Best,
>> -  Lucky
>>  


Re: [Kim-discussion] KIM server version 3.0-RC4 missing components?

2011-06-21 Thread Philip Alexiev @ Ontotext
Hi Ha

At the bottom of the official page of the KIM project, you will find links to 
the documentation. In particular, you can download and examine the system 
documentation, where you will find descriptions and examples of how to interact 
with the server using RMI and web services.

hth
philip

On 20 Jun 2011, at 6:55 PM, Ha Pham wrote:

> Thanks for your email. Can you give me some pointer on how to 
> programmatically interact with KIM, e.g. submit search queries from external 
> system.
> 



Re: [Kim-discussion] KIM server version 3.0-RC4 missing components?

2011-06-20 Thread Philip Alexiev @ Ontotext
Hi Pham,

You can find the latest KIM documentation on the official web site:  
http://www.ontotext.com/kim . It includes the quick start guide and the system 
documentation. There you can get familiar with the interfaces to communicate 
with the KIM server and you can look at some examples how to do that. 

Please keep in mind that the Latest News server is not meant to provide a 
service; it is intended for demo purposes. The server is updated periodically 
and some of the examples may not be functional. You can use your own 
scenarios, according to the documents currently available there.

Hope this helps.
Philip Alexiev
Software Engineer, KIM team

On 20 Jun 2011, at 10:26 AM, Ha Pham wrote:

> 
> Hi All,
> 
> I'm evaluating KIM and the only version I can download is 3.0RC4. Once 
> downloaded and extracted, I run into a couple of issues:
> 
> - the doc/html-documentation is completely empty. Is this the intended 
> behaviour? 
> - The doc/quick-start-guide/KIM_Getting_Started_Guide.pdf mention that there 
> exist a folder call /sesame, which I also don't find in the unzipped package
> - When i follow the LatestNews_UI_Guide.pdf, there are a couple of places 
> where results won't get shown, and there are error related to sesame.
> - There seems to be no documentation about how to programmatically interact 
> with KIM.
> 
> So I'd like to ask for some clarification on the above, and if possible, let 
> me know where to download the package that's more complete or in sync with 
> the documentation.
> 
> Thanks & best regards,
> Ha.


Re: [Kim-discussion] [KIM3] Extending Proton with dbpedia: problem loading IE.gapp in GATE

2011-05-19 Thread Philip Alexiev @ Ontotext
Hi Jeremy

The DBpedia extract consists of the file  dbpedia_3.5.1.owl , containing the 
dbpedia taxonomy (which is the hierarchy of classes and properties) and the 
file  dbpedia_instances.nt  containing the instances. Instances are the actual 
objects that you want to recognize in the textual materials. For example - 
specific persons, organizations, locations etc.  Classes are how these objects 
are grouped and queried.

The statement that Aristotle is a Philosopher should be present in the  
dbpedia_instances.nt  file.  Please make sure that you include it in your setup 
and it is loaded in OWLIM.
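
A quick way to confirm that a type statement is present is to scan the N-Triples lines for it. The sketch below is a minimal illustration, not a real N-Triples parser: it assumes each triple sits on one line in the simple form "<s> <p> <o> .", and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InstanceTypeCheckSketch {

    static final String RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>";

    // Collect the class URIs asserted for a subject via rdf:type.
    // Handles only simple one-line "<s> <p> <o> ." triples.
    static List<String> typesOf(String subjectUri, List<String> ntLines) {
        String subject = "<" + subjectUri + ">";
        List<String> types = new ArrayList<>();
        for (String line : ntLines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 4 && parts[0].equals(subject) && parts[1].equals(RDF_TYPE)
                    && parts[2].startsWith("<") && parts[2].endsWith(">")) {
                // Strip the surrounding angle brackets from the object URI.
                types.add(parts[2].substring(1, parts[2].length() - 1));
            }
        }
        return types;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "<http://dbpedia.org/resource/Aristotle> " + RDF_TYPE
                + " <http://dbpedia.org/ontology/Philosopher> .");
        System.out.println(typesOf("http://dbpedia.org/resource/Aristotle", sample));
    }
}
```

For a real file the lines would be read from disk; a file as large as the instances dump is better streamed line by line than loaded into memory at once.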

All the best.
Philip

On 18 May 2011, at 10:44 PM, Jeremy Raes wrote:

> Dear Philip,
> 
> This helps a lot, thanks!
> 
> Additionally I also forgot to declare the rdf:type for Aristotles instance. 
> Adding 
> 
> <http://dbpedia.org/resource/Aristotle> 
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
> <http://dbpedia.org/ontology/Philosopher> .
> 
> to dbpedia_kim.nt solved the empty result set problem.
> 
> Kind regards, 
> Jeremy
> 
> On 16 May 2011 10:27, Philip Alexiev @ Ontotext  
> wrote:
> Hello Jeremy,
> 
> You can think of KIM as a big extension to GATE. The real picture is much 
> more complicated, but this will help you visualize the process.
> When you start KIM, a semantic repository (OWLIM) is activated, and then a 
> GATE process is initialized inside KIM, which loads the IE.gapp pipeline. 
> KIM also loads some custom GATE resources into memory; some of them 
> communicate with the semantic repository, others work with the semantic 
> information in the annotations, and so on. Almost all of these resources use 
> RMI to talk to the running KIM server through the public KIM API.
> That is why you cannot load a KIM pipeline directly in a standalone GATE 
> application.
> So, in order to load the IE.gapp pipeline in GATE, you need a running KIM 
> server and all the KIM-specific resources loaded in GATE. This is what KIM 
> does when it is started with the following command line:
> 
> $ bash  KIM/bin/kim  gate
> 
> It starts the semantic repository and the KIM server, and then starts an 
> internal instance of GATE Developer with the specified pipeline. Use this 
> interface to make changes to the pipeline, and make sure to save the 
> application afterwards to the same IE.gapp file.
> 
> Hope this helps
> 
> Philip Alexiev
> Software Engineer,
> KIM team
> 
> 
> On 15 May 2011, at 5:58 PM, Jeremy Raes wrote:
> 
>> Hello,
>> 
>> I want to use KIM for my master's thesis and am currently trying to work my 
>> way through the same tutorial as Stephanie (link).
>> 
>> So far I was able to extend the default KIM ontology with the 
>> dbpedia-extract, as instructed in the manual. Now I am trying to set up the 
>> gazetteer (step 3.6), but can't seem to get it working.
>> I tried running the query that is mentioned in the manual, but all it does 
>> is return an empty result set.
>> Nonetheless, I copied the aforementioned query into a text file 
>> (query.txt) and linked this file with my LKB-gazetteer via the FeedSetupPath.
>> I loaded /context/default/resources/IE.gapp in the GATE editor. This loads a 
>> series of processing resources, but when I check the application 
>> ("Conditional Corpus Pipeline_00018") there are no selected processing 
>> resources. Loading IE.gapp also generates a series of errors:
>> GATE 5.2-snapshot build 3553 started at Sat May 14 17:28:07 CEST 2011
>> and using Java 1.6.0_24 Apple Inc. on Mac OS X x86_64 10.6.7.
>> CREOLE plugin loaded: 
>> file:/usr/local/tomcat/webapps/kim-platform-3.0-RC4/plugins/LingPipe/
>> gate.creole.ResourceInstantiationException: 
>> com.ontotext.kim.client.KIMException: java.lang.NullPointerException
>>  at 
>> com.ontotext.kim.gate.KIMInstanceGeneratorWrapper.init(KIMInstanceGeneratorWrapper.java:28)
>>  at gate.Factory.createResource(Factory.java:384)
>>  at 
>> gate.util.persistence.ResourcePersistence.createObject(ResourcePersistence.java:83)
>>  at 
>> gate.util.persistence.PRPersistence.createObject(PRPersistence.java:76)
>>  at 
>> gate.util.persistence.LanguageAnalyserPersistence.createObject(LanguageAnalyserPersistence.java:51)
>>  at 
>> gate.util.persistence.PersistenceManager.getTransientRepresentation(PersistenceManager.java:347)
>>  at 
>> gate.util.persistence.CollectionPersistence.createObject(CollectionPersistence.java:74)
>>  at 
>> gate.util.persistence.PersistenceManager.getTransientRepresentation(PersistenceM

Re: [Kim-discussion] [KIM3] Extending Proton with dbpedia: problem loading IE.gapp in GATE

2011-05-16 Thread Philip Alexiev @ Ontotext
Hello Jeremy,

You can think of KIM as a big extension to GATE. The real picture is much more 
complicated, but this will help you visualize the process.
When you start KIM, a semantic repository (OWLIM) is activated, and then a 
GATE process is initialized inside KIM, which loads the IE.gapp pipeline. KIM 
also loads some custom GATE resources into memory; some of them communicate 
with the semantic repository, others work with the semantic information in the 
annotations, and so on. Almost all of these resources use RMI to talk to the 
running KIM server through the public KIM API.
That is why you cannot load a KIM pipeline directly in a standalone GATE 
application.
So, in order to load the IE.gapp pipeline in GATE, you need a running KIM 
server and all the KIM-specific resources loaded in GATE. This is what KIM 
does when it is started with the following command line:

$ bash  KIM/bin/kim  gate

It starts the semantic repository and the KIM server, and then starts an 
internal instance of GATE Developer with the specified pipeline. Use this 
interface to make changes to the pipeline, and make sure to save the 
application afterwards to the same IE.gapp file.

Hope this helps

Philip Alexiev
Software Engineer,
KIM team


On 15 May 2011, at 5:58 PM, Jeremy Raes wrote:

> Hello,
> 
> I want to use KIM for my master's thesis and am currently trying to work my 
> way through the same tutorial as Stephanie (link).
> 
> So far I was able to extend the default KIM ontology with the 
> dbpedia-extract, as instructed in the manual. Now I am trying to set up the 
> gazetteer (step 3.6), but can't seem to get it working.
> I tried running the query that is mentioned in the manual, but all it does is 
> return an empty result set.
> Nonetheless, I copied the aforementioned query into a text file 
> (query.txt) and linked this file with my LKB-gazetteer via the FeedSetupPath.
> I loaded /context/default/resources/IE.gapp in the GATE editor. This loads a 
> series of processing resources, but when I check the application 
> ("Conditional Corpus Pipeline_00018") there are no selected processing 
> resources. Loading IE.gapp also generates a series of errors:
> GATE 5.2-snapshot build 3553 started at Sat May 14 17:28:07 CEST 2011
> and using Java 1.6.0_24 Apple Inc. on Mac OS X x86_64 10.6.7.
> CREOLE plugin loaded: 
> file:/usr/local/tomcat/webapps/kim-platform-3.0-RC4/plugins/LingPipe/
> gate.creole.ResourceInstantiationException: 
> com.ontotext.kim.client.KIMException: java.lang.NullPointerException
>   at 
> com.ontotext.kim.gate.KIMInstanceGeneratorWrapper.init(KIMInstanceGeneratorWrapper.java:28)
>   at gate.Factory.createResource(Factory.java:384)
>   at 
> gate.util.persistence.ResourcePersistence.createObject(ResourcePersistence.java:83)
>   at 
> gate.util.persistence.PRPersistence.createObject(PRPersistence.java:76)
>   at 
> gate.util.persistence.LanguageAnalyserPersistence.createObject(LanguageAnalyserPersistence.java:51)
>   at 
> gate.util.persistence.PersistenceManager.getTransientRepresentation(PersistenceManager.java:347)
>   at 
> gate.util.persistence.CollectionPersistence.createObject(CollectionPersistence.java:74)
>   at 
> gate.util.persistence.PersistenceManager.getTransientRepresentation(PersistenceManager.java:347)
>   at 
> gate.util.persistence.ControllerPersistence.createObject(ControllerPersistence.java:58)
>   at 
> gate.util.persistence.ConditionalControllerPersistence.createObject(ConditionalControllerPersistence.java:44)
>   at 
> gate.util.persistence.ConditionalSerialAnalyserControllerPersistence.createObject(ConditionalSerialAnalyserControllerPersistence.java:53)
>   at 
> gate.util.persistence.PersistenceManager.getTransientRepresentation(PersistenceManager.java:347)
>   at 
> gate.util.persistence.PersistenceManager.loadObjectFromUrl(PersistenceManager.java:744)
>   at 
> gate.util.persistence.PersistenceManager.loadObjectFromFile(PersistenceManager.java:664)
>   at 
> gate.gui.MainFrame$LoadResourceFromFileAction$1.run(MainFrame.java:3451)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: com.ontotext.kim.client.KIMException: 
> java.lang.NullPointerException
>   at 
> com.ontotext.kim.ig.InstanceGenerator.<init>(InstanceGenerator.java:51)
>   at 
> com.ontotext.kim.gate.KIMInstanceGeneratorWrapper.init(KIMInstanceGeneratorWrapper.java:24)
>   ... 15 more
> Caused by: java.lang.NullPointerException
>   at 
> com.ontotext.kim.ig.InstanceGenerator.<init>(InstanceGenerator.java:44)
>   ... 16 more
> [ERROR] Some resources cannot be restored:
> com.ontotext.kim.client.KIMException: java.lang.NullPointerException
>

Re: [Kim-discussion] Download not working

2011-05-02 Thread Philip Alexiev @ Ontotext
Hi Dragan,

The form has been fixed. 

Thank you for your patience and we are sorry for the inconvenience.

All the best,
Philip Alexiev
Software Engineer, KIM Platform


On 2 May 2011, at 2:57 PM, Dragan Djuric wrote:

> Hi,
> 
> It seems that the KIM download page is broken. After filling in and
> submitting the form, I get a "Page not found" error.
> 
> Regards,
> Dragan


Re: [Kim-discussion] Knowledgebase update Web UI

2011-03-29 Thread Philip Alexiev @ Ontotext
Hello Sergey,

   Unfortunately, a tool that makes ontology editing an easy process does not 
exist yet. The utilities we use internally are mainly Protege 
(protege.stanford.edu), SWOOP (code.google.com/p/swoop/) and, most of all, 
text editors.

   Usually we have data in the form of lists of items that we want to 
transform to RDF and attach to the ontology. This can be done in many ways: 
with scripting languages (even the bash shell alone), any programming language, 
or even Excel. The idea is to go through the list, set each item as a label, 
and construct the instance URI from the text (after removing or replacing all 
URI special characters).
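
As a rough illustration of that idea, here is a small sketch in Java. It is 
not a KIM utility: the class name, the example namespace and the class URI are 
placeholders, and using rdfs:label as the label property is an assumption (your 
ontology and gazetteer query may expect a different property). It also keeps 
only ASCII letters and digits in the URI and does not \u-escape non-ASCII 
characters in the label.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a KIM class): turns a list of labels into N-Triples lines. */
final class ListToNTriples {
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    // Assumption: rdfs:label is the label property; adjust to your ontology.
    static final String RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";

    /** Builds a URI local name: every run of non-alphanumeric characters becomes "_". */
    static String localName(String label) {
        return label.trim()
                .replaceAll("[^A-Za-z0-9]+", "_")
                .replaceAll("^_+|_+$", "");
    }

    /** Escapes backslashes, double quotes and line breaks for an N-Triples literal. */
    static String escapeLiteral(String text) {
        return text.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    /** Emits one rdf:type triple and one label triple per non-empty item. */
    static List<String> toTriples(List<String> labels, String namespace, String classUri) {
        List<String> triples = new ArrayList<>();
        for (String label : labels) {
            String name = localName(label);
            if (name.isEmpty()) {
                continue; // nothing usable to build a URI from
            }
            String subject = "<" + namespace + name + ">";
            triples.add(subject + " <" + RDF_TYPE + "> <" + classUri + "> .");
            triples.add(subject + " <" + RDFS_LABEL + "> \"" + escapeLiteral(label.trim()) + "\" .");
        }
        return triples;
    }
}
```

Note that two different items can collapse to the same local name (for 
example "A-B" and "A B"), so for real data you may want to detect such 
collisions before loading the result.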

Hope this helps.
If not, please be a little more specific and we will try to help.

Philip Alexiev
Software Engineer, KIM Platform


On 29 Mar 2011, at 1:31 PM, Сергей Васяйчев wrote:

> Hello!
> 
> Does the KIM (or Ontotext) platform provide any application (preferably 
> web-based) to edit (add/update/delete) existing knowledge base (Sesame) 
> triples? If not, do third-party tools exist for this?
> 
> Thank you.


Re: [Kim-discussion] KIM knowledge base

2011-03-25 Thread Philip Alexiev @ Ontotext
Hello Monika

KIM's knowledge base resides in wkb.nt and a small extension, wkbx.nt.

We have the concept of Trusted and Recognized entities. 

Trusted entities are the ones that we have knowledge of and that are in the 
knowledge base. They are usually well-known named entities with a low level of 
ambiguity (although this depends on the domain). Trusted entities are 
recognized in texts by the gazetteer.

Recognized entities, on the other hand, are found by additional resources and 
rules based on the specifics of the words and the context around them. For 
example, such a rule might recognize Arthur Conan Doyle as a person in the 
sentence:
"Sir Arthur Conan Doyle was a Scottish physician and writer."
The rule assumes that if it meets "Sir" followed by several words beginning 
with a capital letter, then those words are the name of a person. Recognition 
of entities is usually not a very precise process (80% correct is considered 
good), so trusted entities have greater value.
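
To make the idea concrete, here is a toy version of such a rule written as a 
regular expression. It is only an illustration and not how KIM implements its 
rules (those are resources in the GATE pipeline); the class and method names 
are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy illustration (not KIM code) of a "title + capitalized words" rule. */
final class TitleRuleSketch {
    // "Sir" followed by one or more capitalized words, each preceded by a single space.
    private static final Pattern SIR_NAME =
            Pattern.compile("\\bSir((?: \\p{Lu}\\p{L}+)+)");

    /** Returns the capitalized word sequences that directly follow "Sir". */
    static List<String> findPersons(String text) {
        List<String> names = new ArrayList<>();
        Matcher matcher = SIR_NAME.matcher(text);
        while (matcher.find()) {
            names.add(matcher.group(1).trim());
        }
        return names;
    }
}
```

The same sketch also shows why such rules are imprecise: in a phrase like 
"Sir Arthur Conan Doyle Street" it would happily include "Street" in the name.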

The entities you see in the web interface after annotating some documents are 
the union of the trusted and recognized entities in the texts. That is why the 
set is richer than the knowledge base alone.

Hope this helps,
Philip Alexiev
Software Engineer, KIM team


On 25 Mar 2011, at 3:00 PM, Jingyu Wu wrote:

> Hello,
> 
> I want to use the KIM knowledge base to build an ontology database. When I 
> downloaded the KIM platform, I only found a "wkb.nt" file, but the dataset it 
> contains is smaller than the knowledge base in the KIM web user interface 
> http://ln.ontotext.com/KIM/screen/KWUIMain.jsp?m=0. I am trying to find a 
> person entity in the KIM KB, but I can't find any information about it in the 
> "wkb.nt" file, although this person is in the KIM UI. Where can I download a 
> complete copy of the KIM knowledge base? Is the "wkb.nt" file a part of the 
> KIM KB? 
> Regards,
> Jingyu Wu


Re: [Kim-discussion] extending dbpedia (Kim 3)

2011-03-21 Thread Philip Alexiev @ Ontotext
Hello Stephany,

This may be a problem with the query. Can you send what you have in:
E:\kim_semantic\kim\context\default\resources\gazetteer\person\query.txt
This is the path you set in FeedSetupPath. The gazetteer looks for query.txt 
there and uses its content to query the semantic repository and fill its 
dictionaries.

About the file KIM/config/query.txt: if you open the setup of the gazetteer 
that already exists in the default KIM installation, you will see that its 
FeedSetupPath is set to KIM/config, so the default gazetteer looks for 
query.txt there. If you change the query in that file, you will change the way 
the default gazetteer fills its dictionaries. 
This file is usually missing, in which case KIM uses a preset query.

FYI:  we are currently working on an improved version of this guide, which will 
be available soon on the site.

Hope this helps,
Philip Alexiev
Software Engineer, KIM Platform

On 21 Mar 2011, at 4:29 AM, Stefi Ghiran wrote:

> Hi,
> I'm new to KIM.
> I tried the steps from 
> http://ontotext.com/kim/doc/KimDocs-3.0-EN/CaseStudy-IntegrationDbPedia.html.
> At step "3.6 Setting up the gazetteers" I have some problems, and I've tried 
> to solve them but it seems I'm missing something.
> The problem is I don't get any Lookup annotations in KIM GATE-UI with LKB 
> Gazetteer on some text, though I do get them using
> KIM Web UI (with default IE.gapp).
> I think it's a problem regarding the params of the PR, but still I don't get 
> it. I'm stuck right now.
> 
> Also, I found a thread in which a dev. said that for filling the dictionary 
> one should put the "query (sparql or serql) in KIM/config/query.txt ."
> Wasn't that supposed to be put in (any) folder and just pass the folder path 
> in FeedSetupPath?
> 
> Attached:
> lkb.jpg   (screen shot of the params.)
> log.txt(output of Gate)
> dbpedia_instances.nt (the only .nt file, where I merged the dbpedia_*.nt 
> files from the case study)
> 
> I would be grateful if you could help me. Maybe a tip or a ref. to a thread, 
> I'm sure there's something small missing, just don't know what.
> 
> Thanks, Stephany.
> 


Re: [Kim-discussion] starting KIM server

2011-02-27 Thread Philip Alexiev @ Ontotext
Hi Leen,

It seems that KIM was stopped abnormally and the OWLIM index is corrupt. 
You can remove the "populated" content in KIM and make it regenerate its 
indexes by deleting "KIM/context/default/populated". Please note that this 
will also remove everything you have added to the semantic repository and all 
the documents you have populated. When you start KIM again, it will rebuild 
the indexes from the RDF files listed in the imports section of 
KIM/config/owlim.ttl.
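
If you would rather keep the old data around for inspection instead of 
deleting it outright, a cautious variant is to move the folder aside. This is 
only a sketch under assumptions, not a KIM tool: the class name and the backup 
folder name are invented, and it should only be run while the KIM server is 
stopped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not a KIM class): moves the "populated" folder out of the way. */
final class PopulatedReset {
    /**
     * Renames <kimHome>/context/default/populated to populated.bak-<millis>.
     * Returns the backup path, or null if there was no populated folder.
     */
    static Path moveAside(Path kimHome) {
        Path populated = kimHome.resolve("context").resolve("default").resolve("populated");
        if (!Files.isDirectory(populated)) {
            return null;
        }
        // Invented naming scheme: a timestamp keeps repeated backups apart.
        Path backup = populated.resolveSibling("populated.bak-" + System.currentTimeMillis());
        try {
            return Files.move(populated, backup);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

After the move, starting KIM should find no "populated" folder and rebuild it, 
just as if the folder had been deleted. The backup can be removed once you no 
longer need it.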

Hth
Philip Alexiev
Software Engineer, KIM platform

On 28 Feb 2011, at 3:46 AM, leen hashim wrote:

> here are the log
> 
> c:\KIM\bin>kim
> "KIM_MAX_JAVA_HEAP = 1g"
> "JAVA_HOME = C:\Program Files\Java\jdk1.6.0_24"
> "KIM_HOME = c:\KIM\bin\.."
> [INFO] : : : : : : : : KIM SERVER START : : : : : : : :
> [INFO] KIMService registered on port 1099
> [INFO] OwlimSchemaRepository: 3.3
> [INFO] Build date:  06-22-2010 11:57
> [INFO] Configured parameter 'imports' to 'kb/owl/owl.rdfs;
>  kb/owl/protons.owl;
>  kb/owl/protont.owl;
>  kb/owl/protonu.owl;
>  kb/owl/kimso.owl;
>  kb/owl/kimlo.owl;
>  kb/skos-owl1-dl.rdf;
>  kb/wkb.nt;
>  kb/wkbx.nt;'
> [INFO] Configured parameter 'defaultNS' to 'http://www.w3.org/2002/07/owl#;
> 
> http://proton.semanticweb.org/2006/05/protons#;
> 
> http://proton.semanticweb.org/2006/05/protont#;
> 
> http://proton.semanticweb.org/2006/05/protonu#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> 
> http://www.ontotext.com/kim/2006/05/wkb#;'
> [INFO] Configured parameter 'base-URL' to 
> 'http://www.ontotext.com/kim/2006/05/
> kb#'
> [INFO] Configured parameter 'ruleset' to 'kb/KIMRules.pie'
> [INFO] Configured parameter 'ftsLiteralsOnly' to 'true'
> [INFO] Configured parameter 'console-thread' to 'false'
> [INFO] Configured parameter 'useShutdownHooks' to 'false'
> [INFO] Configured parameter 'entity-index-size' to '40'
> [INFO] Configured parameter 'ftsIndexPolicy' to 'onStartup'
> [INFO] Tokenization regular expression: [\p{L}\d_]+
> [INFO] Repository fragments: 1
> [INFO] Inferencer threads: 1
> [INFO] ftsPolicy = on-startup
> [INFO] fts: indexing literals only
> [INFO] Configured parameter 'tuple-index-memory' to '100M'
> [INFO] Configured parameter 'fts-memory' to '80M'
> [INFO] Cache pages for tuples: 5241
> [INFO] Cache pages for predicates: 0
> [INFO] Cache pages for FTS: 4193
> [INFO] Configured parameter 'storage-folder' to 'populated'
> [INFO] Configured parameter 'repository-type' to 'file-repository'
> [ERROR] Server Starter thread failed!
> java.rmi.RemoteException: Error creating repository; nested exception is:
> java.lang.RuntimeException: Empty header page detected! The c:\KIM\context\default\.\populated/psois corrupted!
> at com.ontotext.kim.semanticrepository.UnmanagedRepositoryFactory.reconnect(UnmanagedRepositoryFactory.java:44)
> at com.ontotext.kim.KIMServiceImpl.getSemanticRepositoryAPI(KIMServiceImpl.java:228)
> at com.ontotext.kim.KIMServiceImpl.getSemanticRepositoryAPI(KIMServiceImpl.java:198)
> at com.ontotext.kim.KIMServiceImpl$ServerComponentStarter.run(KIMServiceImpl.java:82)
> Caused by: java.lang.RuntimeException: Empty header page detected! The c:\KIM\context\default\.\populated/psois corrupted!
> at com.ontotext.trree.big.PageCache.<init>(PageCache.java:47)
> at com.ontotext.trree.big.SortedCollection.<init>(SortedCollection.java:30)
> at com.ontotext.trree.big.AVLRepository.<init>(AVLRepository.java:92)
> at com.ontotext.trree.OwlimSchemaRepository.newRepository(OwlimSchemaRepository.java:647)
> at com.ontotext.trree.OwlimSchemaRepository.initialize(OwlimSchemaRepository.java:468)
> at 
> org.openrdf.repository.sail.SailRepository.initial

Re: [Kim-discussion] Maximum content size for a document

2011-02-26 Thread Philip Alexiev @ Ontotext
Hi Naaman,

Basically, KIM is strong at analyzing news and small documents, because some 
analysis resources cannot handle large amounts of data. For example, the 
patterns in the JAPE rules may perform greedy matches, which are very expensive 
over large content.

There is no exact maximum size: the bigger the document, the slower the 
extraction. A document of several pages is typical. 

If possible, we advise splitting the document into several smaller parts and 
analyzing them independently. This will not have a big impact on the quality of 
the information extraction, and it has the additional benefit that you can 
process the different parts in parallel.

If the documents are news, the most important information is generally 
contained in the beginning of the article, so trimming it is also a solution. 
You can experiment with this to see how it fits your needs.
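
Both approaches are preprocessing steps outside KIM. As a rough sketch of what 
they could look like (the class name, the method names and the idea of 
splitting on blank lines are assumptions for illustration, not KIM 
functionality):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical preprocessing helper (not a KIM class). */
final class DocumentChunker {
    /** Cuts text to at most maxChars, backing up to the last space if there is one. */
    static String trim(String text, int maxChars) {
        if (text.length() <= maxChars) {
            return text;
        }
        int cut = text.lastIndexOf(' ', maxChars);
        if (cut <= 0) {
            cut = maxChars; // no space found: hard cut
        }
        return text.substring(0, cut);
    }

    /**
     * Splits on blank lines and packs whole paragraphs into chunks of at most
     * maxChars. A single paragraph longer than maxChars becomes its own
     * (oversized) chunk rather than being cut in the middle.
     */
    static List<String> split(String text, int maxChars) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String paragraph : text.split("\\n\\s*\\n")) {
            String p = paragraph.trim();
            if (p.isEmpty()) {
                continue;
            }
            if (current.length() > 0 && current.length() + 2 + p.length() > maxChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append("\n\n");
            }
            current.append(p);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }
}
```

Each chunk can then be sent to KIM as a separate document. Splitting at 
paragraph boundaries is a design choice meant to keep sentences, and the names 
inside them, intact; a suitable chunk size is something to find by experiment.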

Hth
Philip


On 25 Feb 2011, at 7:11 PM, Naaman Musawwir wrote:

> Hello,
>  
> We have data in the form of documents. We extract text from these and add 
> it to the KIM repository for semantic analysis. Sometimes when we try to add 
> a document it takes forever. Also, if the text size is more than 20 KB, it 
> more or less hangs the KIM server, which stops responding to any further 
> requests.
>  
> Does the size of the document matter and, if so, what is the maximum content 
> size that KIM can process easily?
>  
> Regards,
> Naaman Musawwir.
>  


Re: [Kim-discussion] starting KIM server

2011-02-25 Thread Philip Alexiev @ Ontotext
Hi Leen

Could you provide the whole log? There is probably a problem somewhere 
preventing KIM from starting, and the log will point us to it.

Thank you,
Philip

On 25 Feb 2011, at 4:14 AM, leen hashim wrote:

> hi 
>   ..i've a problem in starting KIM server
>yesterday it works really fine ..right after installation
>   but today it prompt following message
> 
> [INFO] It is now safe to close the KIM Server.
> [INFO] KIMService unregistered from port 1099
> [INFO] It is now safe to close the KIM Server.
> 
>   even though i try to stop using kim stop command
>   it cannot close the kim server
> 
> how do i fix this ..and start successfully the KIM server
> ..p.s i'm not good in Java language
> pls explain step by step action to be done ..
> 
> tq in advance
> leen
> 
> 
> 
> 


Re: [Kim-discussion] UMBEL Help

2011-02-21 Thread Philip Alexiev @ Ontotext
Hi Naaman,

You will find the answers inline.

All the best,
Philip

On 18 Feb 2011, at 3:43 PM, Naaman Musawwir wrote:

> Hello there,
>  
> I came to know about the first production-grade release, 1.0, of UMBEL. Please 
> advise how I can use it with KIM 3.0 and what the effect of using it will be.
>  
> There were some other questions:
>  
> 1.   What is the ETA on Kim 3.5?

We estimate that this will not happen before the end of April.

> 2.   What is the default Ontology used by KIM 3.0? I went through 
> http://www.ontotext.com/kim/ontologies.html. Does it use 200K entity version 
> by default or 40K?

Currently we are using the 200K version. KIM comes with it by default.

> 3.   If we want to extend the default ontology, what is the required procedure?

The public documentation is a good place to look for information. You can find 
it here:
http://ontotext.com/kim/doc/KimDocs-3.0-EN/HomePage.html
For ontology extension and integration specifically, see "Extending the KIM 
Information Extraction capabilities" in the "Administrator's Guide".

At the moment we are actively working on improving the "Case Study: Integrating 
DBPedia". A new version will be available in a few days, which we believe is 
more descriptive and easier to follow. I will update you when it is ready.

>  
> Regards,
> Naaman Musawwir.
>  


Re: [Kim-discussion] OutOfMemory and connection Errors

2011-02-18 Thread Philip Alexiev @ Ontotext
Hi Naaman,

This is an init-time parameter, which means it is set when the resource is 
created. The easiest way to change this value is to open the pipeline in a 
text editor and edit it there. The pipeline description is in 
KIM/context/default/resources/IE.gapp. You will see a section there:

  
annotationLimit
-1
  

-1 means there is currently no limit. You can experiment and see what value 
suits you best.

Hth
Philip

On 18 Feb 2011, at 3:00 AM, Naaman Musawwir wrote:

> Hello, can you let me know how to pass this annotationLimit setting during 
> startup?
>  
> Regards,
> Naaman Musawwir.
> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@ontotext.com] 
> Sent: Monday, January 10, 2011 8:30 PM
> To: Naaman Musawwir
> Cc: kim-discussion@ontotext.com
> Subject: Re: [Kim-discussion] OutOfMemory and connection Errors
>  
> Hi,
>  
> You can actually set an annotation limit on the LKB gazetteer only. It is an 
> init-time parameter of the gazetteer called annotationLimit. If it is set, the 
> gazetteer will stop creating Lookups after the specified number. Try it with 
> different limits and see if this solves the problem. 
>  
> The other solutions I mentioned, splitting or trimming the document, are 
> still valid. They are not part of KIM, so you will have to implement them as 
> a preprocessing step. Alternatively, you can create your own GATE processing 
> resource that trims the document if it is longer than a given number of 
> characters.
>  
> Hope this helps,
> Philip
>  
> On Jan 10, 2011, at 3:38 PM, Naaman Musawwir wrote:
> 
> 
> Hello Philip,
>  
> Thank you for guiding in this regard.
>  
> Yes, you are right. One document is causing problem. KIM extracts information 
> from a given URL and builds a document. How can we split such a document into 
> pieces or instruct KIM to do that?
>  
> And second problem is related to the first. It happens when OOM error occurs.
>  
> Regards,
> Naaman Musawwir.
> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@ontotext.com] 
> Sent: Monday, January 10, 2011 1:46 PM
> To: Naaman Musawwir
> Cc: kim-discussion@ontotext.com
> Subject: Re: [Kim-discussion] OutOfMemory and connection Errors
>  
> Hi Naaman,
>  
> Setting the memory to 2g should be enough to process thousands of documents 
> without problem. But it depends on the size of the document.  Large documents 
> contain a huge amount of tokens. This considerably slows down processing 
> resources, which perform calculations over the whole content of the document 
> like the nominal and pronominal co-referencers for example.  
>  
> The best way to handle this is to split the document into parts, and process 
> each part as a separate document. 
>  
> Another popular approach, which works best with news, is to trim them to a 
> particular size. With news this works, because the important information is 
> mainly in the first part of the article and in the title. 
>  
> If you analyze the populater logs, you can see which document is causing the 
> problem, and examine it more carefully.
>  
>  
> I suppose the second problem is a direct consequence of the first one. Do 
> you fail to stop KIM only when it throws an OOM error?
>  
> Hope this helps.
> Philip
>  
>  
> On Jan 8, 2011, at 12:55 PM, Naaman Musawwir wrote:
> 
> 
> 
> Hello, I am facing two errors; one is related to memory and other is about 
> connection.
>  
> Following is the memory error:
>  
> [ERROR] Failed to execute KIM NERC.
> gate.creole.ExecutionException: java.lang.OutOfMemoryError: Java heap space
> at 
> com.ontotext.kim.semanticannotation.GateAnnotator.annotate(GateAnnotator.java:121)
> at 
> com.ontotext.kim.semanticannotation.SemanticAnnotationAPIImpl.nercExecute(SemanticAnnotationAPIImpl.java:58)
> at 
> com.ontotext.kim.semanticannotation.SemanticAnnotationAPIImpl.execute(SemanticAnnotationAPIImpl.java:39)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:513)
> at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.

Re: [Kim-discussion] Helping me on starting using KIM

2011-02-14 Thread Philip Alexiev @ Ontotext
Hi Helen,

We are glad to hear about your interest in KIM.

As you have discovered, those are the two options for communicating with the 
KIM server. Generally, RMI is the preferred one. It allows more sophisticated 
approaches and can be used from Java or any compatible language.
The web services, on the other hand, can be used from any language that 
supports them. The WS API is a fully functional subset of the RMI API.

The documentation describes how to set up your client in terms of 
configuration and classpath: 
http://ontotext.com/kim/doc/KimDocs-3.0-EN/KIMJavaRMIAPI.html . The best place 
to continue from there is the examples: 
http://ontotext.com/kim/doc/KimDocs-3.0-EN/Examples.html . The official RMI 
documentation is at Oracle's site: 
http://download.oracle.com/javase/tutorial/rmi/index.html

This is the code of a very simple client that retrieves the KIMService and then 
the CorporaAPI from it (much more detailed examples can be found in the 
documentation at the link I provided):

import com.ontotext.kim.client.GetService;
import com.ontotext.kim.client.KIMService;
import com.ontotext.kim.client.corpora.CorporaAPI;

public class ConnectToKimService {
    public static void main(String[] args) throws Exception {
        // Retrieve the KIMService from the running KIM server
        KIMService serviceKim = GetService.from();
        // Get the CorporaAPI from the service
        CorporaAPI apiCorpora = serviceKim.getCorporaAPI();
    }
}

About the UI: the functionality available in the UI does not exist as a direct 
API call. It is achieved by a series of calls to the server, so designing a new 
interface from scratch would not be effective. The best way would be to modify 
the existing interface to adopt your look and feel, which is a matter of 
editing JSPs and CSS files. Then use an iframe or link directly to the 
interface itself.

Hope this helps
Philip

On 11 Feb 2011, at 5:39 PM, Helen Wang wrote:

> Hi,
> 
> I am new to KIM therefore asking for your help about how should I use KIM.
> 
> I am working on a research project which has developed a simple domain 
> ontology. In the development of an e-Lib for this project, I would like to 
> add some semantic search functions.  
> 
> I downloaded the KIM platform and had a look at some docs, e.g. the 
> "Developer's Guide" on your website, but I do not know how to progress. I 
> have three questions:
> 
> 
> 1. My Java knowledge is quite limited. Can I use C# or PHP? Are there any 
> working examples?
> 
> 2. I should be able to learn Java quickly. I had a look at the "Developer's 
> Guide" and saw the KIM web service and RMI API. I do not know how to try 
> these examples. Where should I start?
> 
> 3. I have installed Tomcat and run the KIM web interface. The showcases on 
> your website all use the same interface. Can I just embed a semantic 
> search function in my e-Lib without using your interface?
> 
> 
> Sorry if these questions sound stupid for you.   I do need your help for me 
> to start my work.  
> 
> Any suggestions and tips are more than welcome!
> 
> Many thanks!
> 
> 
> Best regards,
> Helen
> 


Re: [Kim-discussion] Problem with getEntities method of CoreDbAPI

2011-01-27 Thread Philip Alexiev @ Ontotext
Hi Alistair

Sorry for the delay. We are looking at the issue. In the meantime, can you give 
us more information that could help us understand the case?
* Have you made any changes to the CORE index type feature? What type have you 
set?
* Have you managed to get results with the same method before?

All the best,
Philip Alexiev
Software Engineer, KIM Team

On Jan 26, 2011, at 6:47 PM,   
wrote:

> Hi there,
>  
> I’m getting a strange error message when calling the getEntities method of 
> the CORE Api. I’m calling:
>  
> getEntities(Set kimIds, CoreDbQuery query)
>  
> and get the error:
>  
> Exception in thread "main" java.lang.RuntimeException: not implemented
>   at com.ontotext.kim.coredb.RdfCore.getEntities(RdfCore.java:176)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:514)
>   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>   at sun.rmi.transport.Transport$1.run(Transport.java:159)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>   at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)
>  
> Can you let me know what’s throwing the error in RdfCore at line 176? 
> Weirdly, I’m sure this code used to work before I rebuilt the index.
>  
> The other strange thing is that when I run getEntities without specifying the 
> document set (i.e. for all documents) I get no results.
>  
> I’m running KIM version 2.5 – (I do plan to upgrade to 3.0 but need a short 
> term fix of this one really.)
>  
> Thanks,
>  
> Alistair


Re: [Kim-discussion] Update

2011-01-25 Thread Philip Alexiev @ Ontotext
Hi Juha

As those are interesting questions, I am forwarding the mail to the 
kim-discussion mailing list also.  My answers are inline.


On Jan 24, 2011, at 7:46 PM,  
 wrote:

> A couple of other questions:
>  
> In section 3 of the same document, first example box: why is the label of the 
> alias "wkb:Robot_R2D2_1" "R2D2", i.e. exactly the same as the label of the 
> main class "wkb:Robot_R2D2"? I thought that the whole idea of having aliases 
> is to also have different labels for them, which would allow the Gazetteer 
> to collect those different labels and annotate accordingly. Maybe label and 
> hasMainAlias together are redundant and only hasAlias would be enough? Is 
> this duplication required or just a typo? The text under the bullet point 
> "Source (generatedBy)" implies that entities must have at least one alias to 
> be included in the dictionary. Does that mean that just having a label is not 
> enough?
>  

We are touching on two separate concepts here. It is generally advisable for 
all classes to have labels. This is good RDF design practice. The label stores 
the human-readable form of the class.
As for the gazetteer, two different models are used: labels and aliases.
 - Labels have the advantage of being much lighter and simpler. Labels are 
sufficient in most cases to express your knowledge. This is the preferred model.
 - Aliases, on the other hand, are much heavier, as each alias is a separate 
instance itself. This model is used if there is a need to store some 
metadata for the labels, for example for multilingual support.
Labels are used for visualization, so they are recommended. Whether the 
instances will have aliases on top of that depends on the gazetteer model 
that will be used.

> In section 3, under the first example-box, you have the notation 
> "wkb:Robot_R2D2.1" - shouldn't the period be replaced by underscore?

This is a naming convention we use. The local name of the instance's URI is 
formed from the name of its class and the main label of the instance. The 
instance may have multiple aliases, whose URIs are formed from the name of the 
instance with .number appended at the end. This particular URI means 
that this is the first alias of the R2D2 instance of class Robot. As the 
documentation page says:
"The URIs of the labels, like wkb:Robot_R2D2.1, don't need to be in that exact 
format, ending in .<number>. They only need to be unique."
This is also valid for the URIs of the instances.

>  
> The notation under the next example box, where you are referring to 
> http://www.ontotext.com/kim/2006/05/wkb#Robot_T.1, is confusing to me. Is 
> this just a way of stating that "Robot" is a trusted entity? If that is the 
> case, where should this statement appear?
>  
> The box after this URI seems to be somewhat in contradiction with the Case 
> Study (DBpedia in KIM). Should it have another statement declaring rdf:type  
> //proton#Trusted or something?
>  
> This small part just before section 4 is overall a bit unclear to me and I am 
> wondering if it is missing something... I mean, "generatedBy" seems to serve 
> a different purpose than the property "Trusted". It might be helpful to spell 
> these out and maybe explain how exactly each one is being used by the 
> Gazetteer.

The documentation may be a little confusing here. The only requirement for an 
entity to be marked as trusted is that it is generated by a source 
of type protons:Trusted. In the example:

wkb:Robot_R2D2 protons:generatedBy wkb:Gazetteer .

If you look at the RDF for wkb:Gazetteer, you will notice:

<http://www.ontotext.com/kim/2006/05/wkb#Gazetteer> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://proton.semanticweb.org/2006/05/protons#Trusted> .

There are some other trusted sources, defined at the top of wkb.nt. You can 
also define your own trusted sources.
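
As a rough illustration of the two statements involved, here is a minimal Java 
sketch that prints N-Triples lines for a custom trusted source and for an entity 
generated by it. The namespaces and the Trusted / generatedBy names are taken 
from the statements above; the source name "MyImporter" and the choice to put 
it in the wkb namespace are assumptions made only for this example.

```java
// Sketch only: builds N-Triples text; it does not talk to KIM or OWLIM.
class TrustedSourceTriples {
    static final String PROTONS = "http://proton.semanticweb.org/2006/05/protons#";
    static final String WKB = "http://www.ontotext.com/kim/2006/05/wkb#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    // Formats one N-Triples statement whose subject, predicate and object are all URIs.
    public static String triple(String subject, String predicate, String object) {
        return "<" + subject + "> <" + predicate + "> <" + object + "> .";
    }

    // Declares a source (hypothetical local name) to be of type protons:Trusted.
    public static String declareTrustedSource(String sourceLocalName) {
        return triple(WKB + sourceLocalName, RDF_TYPE, PROTONS + "Trusted");
    }

    // States that an entity was generated by the given source.
    public static String markGeneratedBy(String entityLocalName, String sourceLocalName) {
        return triple(WKB + entityLocalName, PROTONS + "generatedBy", WKB + sourceLocalName);
    }

    public static void main(String[] args) {
        System.out.println(declareTrustedSource("MyImporter"));
        System.out.println(markGeneratedBy("Robot_R2D2", "MyImporter"));
    }
}
```

The printed lines could be appended to an .nt file that is loaded together with 
the rest of the knowledge base.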

>  
> Now I have to go home to mind my son but I will be back tomorrow with some 
> more questions. (the case-study is not entirely clear to me)
>  
> Cheers,
>  
> Juha
>  

Hope this helps.
Your feedback is valuable to us and helps us improve the documentation.
All the best,
Philip Alexiev
Software Engineer,  KIM team


> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@ontotext.com] 
> Sent: Monday, January 24, 2011 4:34 PM
> To: JUNTTILA Juha (SANCO)
> Cc: borislav.po...@ontotext.com; georgi.georg...@sirma.bg
> Subject: Re: Update
> 
> Hi Juha,
> 
> There seems to be a mistake in the documentation. Thank you for pointing it 
> out, and excuse us if it caused any difficulties. The Robot class, as 
> described in the RDF, will be a subclass of Object. I will correct this 
> now.
> 
> Greetings,
>

Re: [Kim-discussion] OutOfMemory and connection Errors

2011-01-10 Thread Philip Alexiev @ Ontotext
Hi,

You can actually set an annotation limit, but for the LKB gazetteer only. It is 
an init-time parameter of the gazetteer called annotationLimit. If it is set, 
the gazetteer will stop creating Lookup annotations once the specified number 
is reached. Try it with different limits and see whether this solves the 
problem.

The other solutions I mentioned are still valid: splitting or trimming the 
document. They are not part of KIM, so you will have to implement them as a 
preprocessing step. Alternatively, you can create your own GATE processing 
resource that trims the document if it is longer than a given number of 
characters.
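
A preprocessing step along those lines could look roughly like the sketch 
below. It works on plain strings only and does not use any KIM or GATE API; the 
class and method names (DocumentChunker, trimToSize, split) are made up for this 
example, and cutting at the last space before the limit is just one possible 
choice of boundary.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: plain-text trimming and splitting before the text is sent for annotation.
class DocumentChunker {

    // Returns at most maxChars characters, cutting at the last space at or before the limit if there is one.
    public static String trimToSize(String text, int maxChars) {
        if (maxChars < 1) {
            throw new IllegalArgumentException("maxChars must be at least 1");
        }
        if (text.length() <= maxChars) {
            return text;
        }
        int cut = text.lastIndexOf(' ', maxChars);
        if (cut <= 0) {
            cut = maxChars; // no usable space found: cut in the middle of the word
        }
        return text.substring(0, cut);
    }

    // Splits the text into consecutive parts of at most maxChars characters each.
    public static List<String> split(String text, int maxChars) {
        List<String> parts = new ArrayList<>();
        String rest = text.trim();
        while (!rest.isEmpty()) {
            String head = trimToSize(rest, maxChars);
            parts.add(head);
            rest = rest.substring(head.length()).trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        for (String part : split("alpha beta gamma", 10)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Each part can then be submitted as a separate document, or only the first part 
kept when trimming is enough (as with news articles).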

Hope this helps,
Philip

On Jan 10, 2011, at 3:38 PM, Naaman Musawwir wrote:

> Hello Philip,
>  
> Thank you for guiding in this regard.
>  
> Yes, you are right. One document is causing the problem. KIM extracts 
> information from a given URL and builds a document. How can we split such a 
> document into pieces or instruct KIM to do that?
>  
> And the second problem is related to the first. It happens when the OOM error 
> occurs.
>  
> Regards,
> Naaman Musawwir.
> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@ontotext.com] 
> Sent: Monday, January 10, 2011 1:46 PM
> To: Naaman Musawwir
> Cc: kim-discussion@ontotext.com
> Subject: Re: [Kim-discussion] OutOfMemory and connection Errors
>  
> Hi Naaman,
>  
> Setting the memory to 2g should be enough to process thousands of documents 
> without problems, but it depends on the size of the documents. Large documents 
> contain a huge number of tokens. This considerably slows down processing 
> resources that perform calculations over the whole content of the document, 
> such as the nominal and pronominal co-referencers.
>  
> The best way to handle this is to split the document into parts and process 
> each part as a separate document.
>  
> Another popular approach, which works best with news, is to trim the documents 
> to a particular size. With news this works because the important information 
> is mainly in the title and the first part of the article.
>  
> If you analyze the populater logs, you can see which document is causing the 
> problem and examine it more carefully.
>  
>  
> I suppose the second problem is a direct consequence of the first one. Does 
> KIM fail to stop only when it throws an OOM error?
>  
> Hope this helps.
> Philip
>  
>  
> On Jan 8, 2011, at 12:55 PM, Naaman Musawwir wrote:
> 
> 
> Hello, I am facing two errors; one is related to memory and the other is 
> about the connection.
>  
> Following is the memory error:
>  
> [ERROR] Failed to execute KIM NERC.
> gate.creole.ExecutionException: java.lang.OutOfMemoryError: Java heap space
> at com.ontotext.kim.semanticannotation.GateAnnotator.annotate(GateAnnotator.java:121)
> at com.ontotext.kim.semanticannotation.SemanticAnnotationAPIImpl.nercExecute(SemanticAnnotationAPIImpl.java:58)
> at com.ontotext.kim.semanticannotation.SemanticAnnotationAPIImpl.execute(SemanticAnnotationAPIImpl.java:39)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:513)
> at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
> at sun.rmi.transport.Transport$1.run(Transport.java:159)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
> at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at java.util.HashMap.addEntry(HashMap.java:753)
> at java.util.HashMap.put(HashMap.java:385)
> at java.util.HashMap.putAll(HashMap.java:524)
>

Re: [Kim-discussion] OutOfMemory and connection Errors

2011-01-10 Thread Philip Alexiev @ Ontotext
Hi Naaman,

Setting the memory to 2g should be enough to process thousands of documents 
without problems, but it depends on the size of the documents. Large documents 
contain a huge number of tokens. This considerably slows down processing 
resources that perform calculations over the whole content of the document, 
such as the nominal and pronominal co-referencers.

The best way to handle this is to split the document into parts and process 
each part as a separate document.

Another popular approach, which works best with news, is to trim the documents 
to a particular size. With news this works because the important information 
is mainly in the title and the first part of the article.

If you analyze the populater logs, you can see which document is causing the 
problem and examine it more carefully.


I suppose the second problem is a direct consequence of the first one. Does 
KIM fail to stop only when it throws an OOM error?

Hope this helps.
Philip


On Jan 8, 2011, at 12:55 PM, Naaman Musawwir wrote:

> Hello, I am facing two errors; one is related to memory and the other is 
> about the connection.
>  
> Following is the memory error:
>  
> [ERROR] Failed to execute KIM NERC.
> gate.creole.ExecutionException: java.lang.OutOfMemoryError: Java heap space
> at com.ontotext.kim.semanticannotation.GateAnnotator.annotate(GateAnnotator.java:121)
> at com.ontotext.kim.semanticannotation.SemanticAnnotationAPIImpl.nercExecute(SemanticAnnotationAPIImpl.java:58)
> at com.ontotext.kim.semanticannotation.SemanticAnnotationAPIImpl.execute(SemanticAnnotationAPIImpl.java:39)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:513)
> at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
> at sun.rmi.transport.Transport$1.run(Transport.java:159)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
> at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
> at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at java.util.HashMap.addEntry(HashMap.java:753)
> at java.util.HashMap.put(HashMap.java:385)
> at java.util.HashMap.putAll(HashMap.java:524)
> at gate.annotation.AnnotationSetImpl.<init>(AnnotationSetImpl.java:132)
> at gate.jape.SinglePhaseTransducer.attemptAdvance(SinglePhaseTransducer.java:648)
> at gate.jape.SinglePhaseTransducer.transduce(SinglePhaseTransducer.java:407)
> at gate.jape.MultiPhaseTransducer.transduce(MultiPhaseTransducer.java:180)
> at gate.jape.Batch.transduce(Batch.java:356)
> at gate.creole.Transducer.execute(Transducer.java:132)
> at gate.util.Benchmark.executeWithBenchmarking(Benchmark.java:299)
> at gate.creole.ConditionalSerialController.runComponent(ConditionalSerialController.java:153)
> at gate.creole.SerialController.executeImpl(SerialController.java:152)
> at gate.creole.ConditionalSerialAnalyserController.executeImpl(ConditionalSerialAnalyserController.java:119)
> at gate.creole.AbstractController.execute(AbstractController.java:62)
> at com.ontotext.kim.semanticannotation.GateAnnotator$1.run(GateAnnotator.java:39)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> ... 3 more
>  
> There are around 1500 documents in the repository and I have set 
> KIM_MAX_JAVA_HEAP to 2g. Is this normal, so that we just need to increase 
> the memory, or is there some other possible solution?
>  
>  
> And sometimes when I try to close the server it doesn’t. Please see following 
> error:
>  
> ./kim stop
> KIM_HOME=/usr/local/kim-platform-3.0-RC4
> KIM_CON

Re: [Kim-discussion] KIM Startup problem

2011-01-04 Thread Philip Alexiev @ Ontotext
We had a discussion with the OWLIM team. The functionality exists, but it is 
not packaged as a separate tool. Instead, the Java class DatabaseRestorer is 
used. You can invoke it like this:

Let's say you are in KIM's main directory. You can write on the command line:

java -cp lib/*:$JAVA_HOME/lib/tools.jar com.ontotext.trree.DatabaseRestorer

which will give you a short help message on how to use the tool. Here is a 
complete, functional command to restore an index:

java -cp lib/*:$JAVA_HOME/lib/tools.jar com.ontotext.trree.DatabaseRestorer context/default/populated/ 40 /put-the-full-path-here/kim-platform-3.0-RC4/context/default/kb/KIMRules.pie

In most cases the index can be repaired and used. But there are of 
course some extreme situations where nothing can be done. The restorer assumes 
that at least some of the index files are there and are complete and valid.

Hope this helps.
Philip

On Jan 3, 2011, at 5:37 PM, Naaman Musawwir wrote:

> Yes, that tool will be very helpful and is, in fact, needed. I will wait for 
> the information on that.
>  
> Regards,
> Naaman Musawwir.
> From: Philip Alexiev @ Ontotext [mailto:philip.alex...@ontotext.com] 
> Sent: Monday, January 03, 2011 8:37 PM
> To: Naaman Musawwir
> Cc: kim-discussion@ontotext.com
> Subject: Re: [Kim-discussion] KIM Startup problem
>  
> Hi Naaman,
>  
> We have had such issues with abnormal termination of KIM before. They occur 
> when the machine stops or the process is killed with a KILL signal. Just for 
> information: by default, killall sends a TERM signal, which causes 
> KIM to stop normally.
>  
> There is a tool to recover a broken OWLIM repository. I am currently 
> checking with the team to get its actual status.
>  
> I will update you when I have more information.
>  
> Thank you.
> Philip
>  
> On Dec 27, 2010, at 5:45 PM, Naaman Musawwir wrote:
> 
> 
> Hello,
>  
> I have noticed that if KIM service is stopped using killall java or the host 
> machine stops abnormally then KIM won’t start. It gives the following message:
>  
> r...@isbsvr2:/usr/local/kim-platform-3.0-RC4/bin# ./kim start &
> [1] 7521
> r...@isbsvr2:/usr/local/kim-platform-3.0-RC4/bin# 
> KIM_HOME=/usr/local/kim-platform-3.0-RC4
> KIM_CONTEXT=/usr/local/kim-platform-3.0-RC4/context/default
> KIM_MAX_JAVA_HEAP=1g
> KIM_LOG_FOLDER=/usr/local/kim-platform-3.0-RC4/log
> [INFO] The KIM Server will be remotely available for RMI connections at 
> 192.168.11.13
> [INFO] : : : : : : : : KIM SERVER START : : : : : : : :
> [INFO] OwlimSchemaRepository: 3.3
> [INFO] Build date:  06-22-2010 11:57
> [INFO] Configured parameter 'imports' to 'kb/owl/owl.rdfs;
>  kb/owl/protons.owl;
>  kb/owl/protont.owl;
>  kb/owl/protonu.owl;
>  kb/owl/kimso.owl;
>  kb/owl/kimlo.owl;
>  kb/skos-owl1-dl.rdf;
>  kb/wkb.nt;
>  kb/wkbx.nt;'
> [INFO] Configured parameter 'defaultNS' to 'http://www.w3.org/2002/07/owl#;
> 
> http://proton.semanticweb.org/2006/05/protons#;
> 
> http://proton.semanticweb.org/2006/05/protont#;
> 
> http://proton.semanticweb.org/2006/05/protonu#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;'
> [INFO] Configured parameter 'base-URL' to 
> 'http://www.ontotext.com/kim/2006/05/wkb#'
> [INFO] Configured parameter 'ruleset' to 'kb/KIMRules.pie'
> [INFO] Configured parameter 'ftsLiteralsOnly' to 'true'
> [INFO] Configured parameter 'console-thread' to 'false'
> [INFO] Configured parameter 'useShutdownHooks' to 'false'
> [INFO] Configured parameter 'entity-index-size' to '40'
> [INFO] Configured parameter 'ftsIndexPolicy' to 'onStartup'
> [INFO] Tokenization regular expression: [\p{L}\d_]+
> [INFO] Repository fragments: 1
> [INFO] Inferencer threads: 1
> [INFO] ftsPolicy = on-startup
> [INFO] fts: indexing literals only
> [INFO] Configured parameter 'tuple-index-memory' to '100M'
> [INFO] Con

Re: [Kim-discussion] KIM Startup problem

2011-01-03 Thread Philip Alexiev @ Ontotext
Hi Naaman,

We have had such issues with abnormal termination of KIM before. They occur 
when the machine stops or the process is killed with a KILL signal. Just for 
information: by default, killall sends a TERM signal, which causes KIM 
to stop normally.
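
The reason the two signals behave so differently is general JVM behavior: on 
TERM the JVM runs its registered shutdown hooks before exiting, while on KILL 
the process is ended immediately and no hook runs. The sketch below only 
demonstrates that mechanism. It is not KIM's or OWLIM's actual shutdown code, 
and the class name and the message it prints are made up for the example.

```java
// Sketch only: shows that a JVM shutdown hook runs on TERM but not on KILL.
class GracefulShutdownDemo {

    // Registers a hook that stands in for "flush and close the index files".
    public static Thread registerCleanupHook() {
        Thread hook = new Thread(
                () -> System.out.println("Shutdown hook: closing resources cleanly"),
                "cleanup-on-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws InterruptedException {
        registerCleanupHook();
        System.out.println("Running. Stop with 'kill <pid>' (TERM) or 'kill -9 <pid>' (KILL).");
        Thread.sleep(Long.MAX_VALUE); // wait until the process is signalled
    }
}
```

Stopping this program with a plain kill prints the hook message; stopping it 
with kill -9 prints nothing, which is the situation in which files on disk can 
be left in an inconsistent state.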

There is a tool to recover a broken OWLIM repository. I am currently 
checking with the team to get its actual status.

I will update you when I have more information.

Thank you.
Philip

On Dec 27, 2010, at 5:45 PM, Naaman Musawwir wrote:

> Hello,
>  
> I have noticed that if KIM service is stopped using killall java or the host 
> machine stops abnormally then KIM won’t start. It gives the following message:
>  
> r...@isbsvr2:/usr/local/kim-platform-3.0-RC4/bin# ./kim start &
> [1] 7521
> r...@isbsvr2:/usr/local/kim-platform-3.0-RC4/bin# 
> KIM_HOME=/usr/local/kim-platform-3.0-RC4
> KIM_CONTEXT=/usr/local/kim-platform-3.0-RC4/context/default
> KIM_MAX_JAVA_HEAP=1g
> KIM_LOG_FOLDER=/usr/local/kim-platform-3.0-RC4/log
> [INFO] The KIM Server will be remotely available for RMI connections at 
> 192.168.11.13
> [INFO] : : : : : : : : KIM SERVER START : : : : : : : :
> [INFO] OwlimSchemaRepository: 3.3
> [INFO] Build date:  06-22-2010 11:57
> [INFO] Configured parameter 'imports' to 'kb/owl/owl.rdfs;
>  kb/owl/protons.owl;
>  kb/owl/protont.owl;
>  kb/owl/protonu.owl;
>  kb/owl/kimso.owl;
>  kb/owl/kimlo.owl;
>  kb/skos-owl1-dl.rdf;
>  kb/wkb.nt;
>  kb/wkbx.nt;'
> [INFO] Configured parameter 'defaultNS' to 'http://www.w3.org/2002/07/owl#;
> 
> http://proton.semanticweb.org/2006/05/protons#;
> 
> http://proton.semanticweb.org/2006/05/protont#;
> 
> http://proton.semanticweb.org/2006/05/protonu#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;
> http://www.ontotext.com/kim/2006/05/wkb#;'
> [INFO] Configured parameter 'base-URL' to 
> 'http://www.ontotext.com/kim/2006/05/wkb#'
> [INFO] Configured parameter 'ruleset' to 'kb/KIMRules.pie'
> [INFO] Configured parameter 'ftsLiteralsOnly' to 'true'
> [INFO] Configured parameter 'console-thread' to 'false'
> [INFO] Configured parameter 'useShutdownHooks' to 'false'
> [INFO] Configured parameter 'entity-index-size' to '40'
> [INFO] Configured parameter 'ftsIndexPolicy' to 'onStartup'
> [INFO] Tokenization regular expression: [\p{L}\d_]+
> [INFO] Repository fragments: 1
> [INFO] Inferencer threads: 1
> [INFO] ftsPolicy = on-startup
> [INFO] fts: indexing literals only
> [INFO] Configured parameter 'tuple-index-memory' to '100M'
> [INFO] Configured parameter 'fts-memory' to '80M'
> [INFO] Cache pages for tuples: 5241
> [INFO] Cache pages for predicates: 0
> [INFO] Cache pages for FTS: 4193
> [INFO] Configured parameter 'storage-folder' to 'populated'
> [INFO] KIMService registered on port 1099
> [ERROR] Server Starter thread failed!
> java.rmi.RemoteException: Error creating repository; nested exception is:
> java.lang.IllegalArgumentException: Not a valid (absolute) URI:
> at com.ontotext.kim.semanticrepository.UnmanagedRepositoryFactory.reconnect(UnmanagedRepositoryFactory.java:44)
> at com.ontotext.kim.KIMServiceImpl.getSemanticRepositoryAPI(KIMServiceImpl.java:228)
> at com.ontotext.kim.KIMServiceImpl.getSemanticRepositoryAPI(KIMServiceImpl.java:198)
> at com.ontotext.kim.KIMServiceImpl$ServerComponentStarter.run(KIMServiceImpl.java:82)
> Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI:
> at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
> at org.openrdf.model.impl.URIImpl.<init>(URIImpl.java:57)
> at com.ontotext.trree.HashEntityPool.toObject(HashEntityPool.java:819)
> at com.ontotext.trree.HashEntityPool.buildFullTextSearchIndex(HashEntityPool.java:1154)
> at com.ontotext.trree.HashEntityPool.buildFullTextIndex(HashEntityPool.java:154)
> at com.ontotext.trree.HashEntityPool.<init>(HashEntityPool.java:143)
> at com.ontotext.trree.OwlimSchemaRepository.initialize(OwlimSchemaRepository.java:443)
> at org.openrdf.repository.sail.SailRepository.initialize(SailRepository.java:84)
> at com.ontotext.kim.semanticrepository.UnmanagedRepositoryFactory.reconnect(UnmanagedRepositoryFactory.java:40)
> ... 3 more
> [INFO] It is now safe to close the KIM Server.
> [INFO] KIMService unregistered from port 10

Re: [Kim-discussion] Accessing to KIM via Internet

2011-01-03 Thread Philip Alexiev @ Ontotext
Hi Sonja, 

Could you provide some output or screenshots? This will be very useful.

Greetings,
Philip

On Jan 2, 2011, at 4:01 AM, Sonja D. Radenkovic wrote:

> Hello,
> 
> 
> 
> I’ve installed KIM on a server and I need to access KIM over the Internet 
> using the Java RMI API. I’ve followed the instructions at 
> http://www.ontotext.com/kim/doc/sys-doc/ConfigRMI.html, but I still have 
> problems.
> 
> Do you have some other procedure for accessing KIM over the Internet, or am I 
> doing something wrong?
> 
>  
> Best,
> 
> Sonja d. Radenkovic
> 
> 
> 
> -- 
> Mr Sonja D. Radenkovic, Lecturer
> High Economic Professional School Pec-Leposavic
> 38218 Leposavic
> Serbia
> Email: sonja...@gmail.com
> URL:   http://www.goodoldai.org/sonja_radenkovic


Re: [Kim-discussion] Help - Search against custom feature

2010-12-20 Thread Philip Alexiev @ Ontotext
Hi Naaman,

This question is related to how Lucene creates its indexes. It uses different 
formatters for different types of fields, so non-string values require special 
treatment.

Lucene is intended to serve as a full-text search (FTS) index and query engine, 
not as a database, so our wrapper on top of it supports only String literals.

You should consider using strings instead of numeric values.

Hth
Philip Alexiev
Software Engineer, KIM team

On Dec 13, 2010, at 7:36 PM, Naaman Musawwir wrote:

> Hello, thank you. Here is part of the code that I am using for searching. 
> Please notice the lines where I set query restrictions. If I use title:women 
> it returns relevant documents but if I use BB_ID:2 it never returns anything 
> even if many documents exist. The feature BB_ID is there in documents as 
> verified by displaying that in the code.
>  
> static void searchDocuments(DocumentRepositoryAPI apiDR) {
>     // load documents from persistence
>     // --
>     System.out.println("\n\nLoading documents from persistance ...");
>     DocumentQueryResult listDocIDs = null;
>  
>     // example
>     DocumentQuery query = new DocumentQuery();
>     try {
>         //query = query.setKeywordRestriction("bb_id:153");
>         //query.setKeywordRestriction("lens_id:2");
>         //query = query.setKeywordRestriction("Using food to AND title:bank");
>         //query = query.setKeywordRestriction("Using food to");
>         //query.setMaxResultLength(1);
>  
>         listDocIDs = apiDR.getDocumentIds(query);
>         System.out.println("Documents Found: " + listDocIDs.size());
>     } catch (Exception ex1) {
>         ex1.printStackTrace();
>         return;
>     }
>  
>     int numDocIDs = listDocIDs.size();
>     int numReadDocs = 0;
>     for (int i = 0; i < numDocIDs; i++) {
>         long docID = listDocIDs.get(i).getDocumentId();
>         try {
>             KIMDocument kdoc = apiDR.loadDocument(docID);
>             if (kdoc != null) {
>                 KIMFeatureMap features = kdoc.getFeatures();
>                 Set featureNames = features.keySet();
>                 for (Iterator it = featureNames.iterator(); it.hasNext();) {
>                     String feature = (String) it.next();
>                     if (feature.equals("TIMESTAMP")) {
>                         System.out.print(" - " + feature);
>                         long timeStamp = (Long) features.get(feature);
>                         System.out.println(" : " + new Date(timeStamp));
>                     } else if (feature.equals("TITLE")) {
>                         System.out.print(" - " + feature);
>                         System.out.println(" : " + features.get(feature));
>                     } else if (feature.equals("CHALLENGE_ID")) {
>                         System.out.print(" - " + feature);
>                         System.out.println(" : " + Long.parseLong((String) features.get(feature)));
>                     } else if (feature.equals("LENS_ID")) {
>                         System.out.print(" - " + feature);
>                         System.out.println(" : " + Long.parseLong((String) features.get(feature)));
>                     } else if (feature.equals("BB_ID")) {
>                         System.out.print(" - " + feature);
>                         System.out.println(" : " + Long.parseLong((String) features.get(feature)));
>                     }
>                 }
>                 //System.out.println("Document: " + kdoc.getContent());
>                 numReadDocs += 1;
>             }
>  
>             System.out.println();
>         } catch (Exception ex) {
>             ex.printStackTrace();
>             //System.out.println(" - " + "Can NOT load a doc with docId=" + docID + "!!!");
>             continue;
>         }
>     }
>     System.out.println("Documents Successfully Read: " + numReadDocs);
> }
>  
>  
> Regards,
> Naaman Musawwir.
> From: Boyan Kukushev [mailto:boyan.kukus...@ontotext.com] 
> Sent: Monday, December 13, 2010 10:29 PM
> To: kim-discussion@ontotext.com
> Cc: Naaman Musawwir
> Subject: Re: [Kim-discussion] Help - Search against custom feature
>  
>

Re: [Kim-discussion] Help - Search against custom feature

2010-12-20 Thread Philip Alexiev @ Ontotext
Hi Naaman,

I will have a look at it now. 

Greetings,
Philip Alexiev
Software Engineer, KIM team

On Dec 20, 2010, at 8:00 AM, Naaman Musawwir wrote:

> Hello there, did you get a chance to have a look at this issue yet?
>  
> Regards,
> Naaman Musawwir.
> From: Boyan Kukushev [mailto:boyan.kukus...@ontotext.com] 
> Sent: Monday, December 13, 2010 10:29 PM
> To: kim-discussion@ontotext.com
> Cc: Naaman Musawwir
> Subject: Re: [Kim-discussion] Help - Search against custom feature
>  
> Hello, Naaman,
> 
> We are currently investigating your issue. Please, send sample code to 
> clarify the problem. Thanks!
> 
> Regards,
> Boyan
> 
> On Mon December 13 2010 06:54:13 Naaman Musawwir wrote:
> > Hello,
> >
> > 
> >
> > I followed the example
> > http://ontotext.com/kim/doc/KimDocs-3.0-EN/SearchForDocumentsExamples.html
> > to search documents against some custom document features. It works fine for
> > the default set of features but not for custom features. Do we need to do some
> > extra work in order to get search to work for custom features?
> >
> > 
> >
> > Regards,
> >
> > Naaman Musawwir.
> >
> > 
> >
> >
> 
> 
> 
> --
> Boyan Kukushev
> Senior Software Engineer / Java Developer
> Ontotext AD @ Sirma Group Corp.
> 


Re: [Kim-discussion] regarding adding own ontology in kim.

2010-12-09 Thread Philip Alexiev @ Ontotext
Hi Chris

kim-gate-ui starts GATE with all the libraries from KIM loaded. This means you 
can load and use all the extra plugins and resources developed by Ontotext. 
They will be visible in the UI.

You will have to load the lkb-gazetteer plugin in GATE. Then you will be able 
to create an lkb-gazetteer PR. When you create a new one, the dialog asks you 
to set up a number of parameters. FeedSetupPath is set in the dictFeederParams. 
Set it to something like FeedSetupPath=$relpath$../../../config .

Hth
Philip Alexiev
Software Engineer, KIM Team

On Dec 9, 2010, at 4:41 PM, Chris Shaw wrote:

> I've been working on the same problem.  I'm on the step now where I should 
> build a new gazetteer, but I can't find the screen where you have the option 
> to set the FeedSetupPath.  I'm following this tutorial: 
> http://ontotext.com/kim/doc/KimDocs-3.0-EN/CaseStudy-IntegrationDbPedia.html 
> and I'm on step 3.6.  I'm using the kim-gate-gui, is this correct?  When I 
> add a new ANNIE gazetteer I only get 4 fields.  The example screen shows 10 
> fields.
> 
> Thanks,
> Chris
> 
> On Sat, Dec 4, 2010 at 8:14 PM, Philip Alexiev @ Ontotext wrote:
> Hi Shelly,
> 
> One important step that usually exists in the IE process is using a list 
> of predefined terms that are meaningful for the specific domain. Such lists 
> are used by a Gazetteer. KIM uses a more complex Gazetteer, which extracts 
> its lists from the underlying semantic repository. Its default behavior is to 
> get all labels of all entities that meet three requirements:
> * have at least one alias (label)
>   KIM uses two models to associate entities with their labels: aliases and 
>   labels. Aliases are separate objects and allow some metadata to be 
>   associated with a concrete label. Labels are just datatype properties, 
>   simpler and more compact. The model is set via the 
>   com.ontotext.kim.KIMConstants.ENTITY_DESCR property in 
>   KIM/config/install.properties .
> * are of a type that is a subclass of protons:Entity
>   Example:
>     wkb:Person_Aristotle a protont:Person .
>     protont:Person rdfs:subClassOf protons:Entity .
> * are marked as Trusted
>   To mark an entity as trusted, the semantic repository must contain 
>   statements that this entity was generated by a trusted source. An example:
>     wkb:Gazetteer a protons:Trusted .
>     wkb:Person_Aristotle protons:generatedBy wkb:Gazetteer .
> 
> There are some trusted sources defined in the default KB, but new ones can 
> also be defined.
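
The three requirements above can be checked mechanically. The following is a minimal Python sketch, not KIM code: it assumes the repository contents are available as (subject, predicate, object) tuples of prefixed names, and the helper names and the ALIAS_PREDICATES set are hypothetical. A type that is protons:Entity itself is treated as satisfying the subclass requirement.

```python
# Hypothetical helpers, not part of KIM. Triples are (subject, predicate,
# object) tuples of prefixed names, mirroring the examples in the message.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
# Assumption: which predicates count as "has an alias (label)".
ALIAS_PREDICATES = {"protons:hasMainAlias", "rdfs:label"}


def superclasses(cls, triples):
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(o for s, p, o in triples
                     if s == current and p == SUBCLASS_OF)
    return seen


def gazetteer_eligible(entity, triples):
    """Check the three requirements for one entity."""
    has_alias = any(s == entity and p in ALIAS_PREDICATES
                    for s, p, o in triples)
    types = [o for s, p, o in triples if s == entity and p == RDF_TYPE]
    is_entity = any("protons:Entity" in superclasses(t, triples)
                    for t in types)
    sources = [o for s, p, o in triples
               if s == entity and p == "protons:generatedBy"]
    is_trusted = any((src, RDF_TYPE, "protons:Trusted") in triples
                     for src in sources)
    return has_alias and is_entity and is_trusted


triples = {
    ("wkb:Person_Aristotle", RDF_TYPE, "protont:Person"),
    ("protont:Person", SUBCLASS_OF, "protons:Entity"),
    ("wkb:Gazetteer", RDF_TYPE, "protons:Trusted"),
    ("wkb:Person_Aristotle", "protons:generatedBy", "wkb:Gazetteer"),
    ("wkb:Person_Aristotle", "protons:hasMainAlias", "wkb:Person_Aristotle.1"),
}
print(gazetteer_eligible("wkb:Person_Aristotle", triples))
```

Removing the protons:generatedBy statement from the set makes the check fail, which is the usual reason a newly loaded entity never shows up in the gazetteer lists.
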
> 
> 
> Many approaches exist for adding new entities to KIM's IE. Some of the most 
> common are:
> 
> * use the existing PROTON classes
> This is the quickest and easiest way. KIM already knows about most of 
> PROTON's classes and has grammars to create meaningful annotations over 
> them. So, for example, if we want to recognize "Aristotle" as a person in 
> the analyzed documents, a new person instance has to be defined like this:
> customkb:Person_Aristotle a protont:Person ;
>   protons:hasMainAlias customkb:Person_Aristotle.1 .
> customkb:Person_Aristotle.1 a protons:Alias ;
>   rdfs:label "Aristotle" .
> 
> Note: The format of the URI is not strict. The only requirement is that it 
> is unique.
> Note: An entity can have multiple aliases and one main alias (or, in the 
> labels model, multiple labels and one main label).
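
When many instances have to be added this way, the statements can be generated rather than written by hand. This is a minimal Python sketch that reproduces the pattern of the Aristotle example; the function name, the Person_<name> URI scheme, and the ".1" alias suffix are only assumptions copied from that example, not a KIM convention that has to be followed.

```python
def person_with_alias(local_name, label, prefix="customkb"):
    """Build the Turtle statements for a person with one main alias."""
    entity = f"{prefix}:Person_{local_name}"
    alias = f"{entity}.1"
    # Escape backslashes and double quotes so the label stays one literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        f"{entity} a protont:Person ;",
        f"  protons:hasMainAlias {alias} .",
        f"{alias} a protons:Alias ;",
        f'  rdfs:label "{escaped}" .',
    ])


print(person_with_alias("Aristotle", "Aristotle"))
```

The output still needs the prefix declarations for customkb, protont, protons and rdfs before it can be loaded into the repository.
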
> 
> The gazetteer will create Lookup annotations, which serve as input for other 
> resources and rules, so we want to transform them into meaningful 
> annotations. KIM has rules that match a Lookup annotation with the feature 
> class=http://proton.semanticweb.org/2006/05/protont#Person and create 
> a Person annotation over it. As a result, a new Person annotation will be 
> created over "Aristotle".
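
The effect of such a rule can be illustrated outside GATE. The sketch below is plain Python, not JAPE and not KIM's actual grammar: annotations are modelled as dictionaries, and promote_lookups and CLASS_TO_TYPE are hypothetical names. It only shows the idea that a Lookup carrying the Person class feature yields a Person annotation over the same span.

```python
PERSON_CLASS = "http://proton.semanticweb.org/2006/05/protont#Person"
# Assumption: one entry per class that has a rule; only Person is shown.
CLASS_TO_TYPE = {PERSON_CLASS: "Person"}


def promote_lookups(annotations):
    """Return a typed annotation for every Lookup with a known class."""
    promoted = []
    for ann in annotations:
        if ann["type"] != "Lookup":
            continue
        new_type = CLASS_TO_TYPE.get(ann["features"].get("class"))
        if new_type is not None:
            # Same span and features, only the annotation type changes.
            promoted.append({**ann, "type": new_type})
    return promoted


lookups = [
    {"type": "Lookup", "start": 0, "end": 9,
     "features": {"class": PERSON_CLASS}},
    {"type": "Lookup", "start": 20, "end": 26,
     "features": {"class": "http://example.org/custom#Thing"}},
]
print(promote_lookups(lookups))
```

The second Lookup is left alone because no rule knows its class, which is the situation the custom JAPE rules in the next approach are meant to address.
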
> 
> The drawback of this approach is that the new ontology is very closely tied 
> to PROTON. It is good for extending the instance base of already existing 
> classes, but not that good for extending KIM with a completely new ontology.
> 
> * default gazetteer with custom JAPE rules
> Suppose we have a complete ontology we want to adapt, for example:
> customkb:Person_Aristotle a customkb:Person .
> Then we can satisfy the requirements mentioned above for an entity to be 
> included in the gazetteer lists:
>   * define aliases
> customkb:Person_Aristotle a protont:Person ;
>   protons:hasMainAlias customkb:Person_Aristotle.1 .
> customkb:Person_Aristotle.1 a protons:Alias ;
>   rdfs:label "Aristotle" .
> 
>   * make the class subclass directly or indirectly protons
