Hello Nader,

a). First, you can use either the standard GATE or the GATE Developer that comes with KIM. When you process a document, the result is an annotation set. You can either save that annotation set as XML (after you run the pipeline) or use a datastore. With a datastore, the result is automatically saved back to the datastore. A datastore also lets you process a larger volume of documents, because documents are loaded into memory one at a time, which keeps memory use low.

b). When KIM processes a document, it performs information extraction and also adds the document to a full-text search (FTS) index. KIM supports different FTS indexers; the default one is Lucene.

KIM behaves differently depending on the "running strategy" parameter. With the default running strategy, you can proceed as follows:

1. You call the SemanticAnnotationAPI.execute method to add semantic annotations to your GATE document (let's call it kdoc).
kdoc = SemanticAnnotationAPI.execute(kdoc);

Semantic annotations are those that have a URI in the ontology you are using. Producing them requires a processing resource that can assign such URIs; in the KIM pipeline it is called "Instance Generator".

2. Next step - you call the DocumentRepository.addDocument method.
By default, this method creates an FTS index. In addition, it generates RDF from the semantic annotations from step 1. If there are no semantic annotations, it only creates an FTS index and stores the document (the storage type is also configurable). The generated RDF is stored in OWLIM and merged with the data already available there.
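
To make the idea in step 2 more concrete: the RDF generated from semantic annotations can be pictured as a couple of triples per annotation (the type of the instance and its label). The Java sketch below is only an illustration of that idea, not KIM's actual implementation. The AnnotationsToNTriples and SemanticAnnotation classes and the example.org URIs are made up for this example; real KIM output uses its own ontology URIs and more properties.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: NOT part of KIM or GATE. All names here are hypothetical.
final class AnnotationsToNTriples {

    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";

    // Hypothetical stand-in for a semantic annotation: the URI of the
    // recognised instance, the URI of its ontology class, and the matched text.
    public static final class SemanticAnnotation {
        final String instanceUri;
        final String classUri;
        final String label;

        public SemanticAnnotation(String instanceUri, String classUri, String label) {
            this.instanceUri = instanceUri;
            this.classUri = classUri;
            this.label = label;
        }
    }

    // Escapes backslashes, double quotes and line breaks for an N-Triples literal.
    public static String escapeLiteral(String text) {
        return text.replace("\\", "\\\\")
                   .replace("\"", "\\\"")
                   .replace("\n", "\\n")
                   .replace("\r", "\\r");
    }

    // Emits two triples per annotation: rdf:type and rdfs:label.
    public static List<String> toNTriples(List<SemanticAnnotation> annotations) {
        List<String> lines = new ArrayList<>();
        for (SemanticAnnotation a : annotations) {
            lines.add("<" + a.instanceUri + "> <" + RDF_TYPE + "> <" + a.classUri + "> .");
            lines.add("<" + a.instanceUri + "> <" + RDFS_LABEL + "> \""
                    + escapeLiteral(a.label) + "\" .");
        }
        return lines;
    }

    public static void main(String[] args) {
        List<SemanticAnnotation> found = new ArrayList<>();
        // Made-up URIs; a real setup would use URIs from its own ontology.
        found.add(new SemanticAnnotation(
                "http://example.org/kb#Company_1",
                "http://example.org/onto#Company",
                "Ontotext"));
        for (String line : toNTriples(found)) {
            System.out.println(line);
        }
    }
}
```

Running main prints two N-Triples lines for the single example annotation. A file of such lines is the kind of RDF data that a semantic repository can load and merge with what it already holds.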

You can use my answer here to achieve your specific goals.

Hope this helps and that I was able to explain it properly.

--
Anton Andreev
email: [email protected]
Account Manager at Ontotext
www.ontotext.com





On 23.3.2010 at 13:06, Nader Zaki wrote:
Dear Anton,

I would like to know, in detail, the role of GATE and Lucene in the KIM platform. How can I use each of them separately to extract the semantic information from an HTML page or file?
Also, what are the inputs and the outputs of each of them?
Thanks for your time.

Regards,


Nader Nassef Zaki


------------------------------------------------------------------------
Date: Mon, 22 Mar 2010 13:10:06 +0200
From: [email protected]
To: [email protected]
CC: [email protected]
Subject: Re: [Kim-discussion] Urgent Request

Hello Nader,

1. You need to supply a file that contains a SeRQL query. SeRQL is a language similar to SPARQL and it is used for semantic queries. I have attached such a file with a sample SeRQL query that extracts all the companies that are loaded in OWLIM/KIM by default. Your query must use the "construct" clause: http://www.openrdf.org/doc/sesame/users/ch06.html#d0e1371

Sample command line:
kim\bin\tools>toolRdfExport.cmd query.txt result.rdf RDF/XML

2. I have attached "2-Getting started.pdf", which is part of a KIM guide that has not been released yet. Consider it an almost-ready draft. You will find it useful for understanding what you can do with KIM in general. Check point 5: by setting up the Sesame UI, you will be able to query the OWLIM instance built into KIM. The Sesame UI might also provide the functionality you need.

Hope this helps,
--
Anton Andreev
email: [email protected]
Account Manager at Ontotext
www.ontotext.com


On 20.3.2010 at 01:00, Nader Zaki wrote:

    Dear sir,

    First of all, thanks a lot for your efforts. Second,
    I have some questions:

    1- I tried to use the RDF export tool, but something was
    missing: I couldn't get the SeRQL file, as I didn't know where to
    find it or what its extension is. Can you tell me?
    2- I tried to use OWLIM, but I couldn't even get it running. Can
    I have more guidance on how to use it?

    My overall goal is as follows:

    Taking any HTTP page as input and converting it from HTML to
    RDF or OWL format, so that I have the important information from
    the HTML page in an RDF file. Then I will build a
    semantic application that uses these RDF files in a specific
    domain, Mechanical for example. So what I need for now is a
    program that converts HTML to RDF. I would also like to know how
    this is done, if possible.

    Thanks a lot for your time. Waiting for your reply as soon as possible.

    Regards,


    Nader Nassef Zaki
    ------------------------------------------------------------------------
    Date: Mon, 15 Mar 2010 13:02:19 +0200
    From: [email protected]
    To: [email protected]
    CC: [email protected]
    Subject: Re: [Kim-discussion] Urgent Request

    Hello Nader,

    You can process documents and HTML pages with KIM, and the resulting
    RDF is stored in the OWLIM database built into KIM. You may also try
    this tool: http://ontotext.com/kim/doc/sys-doc/RDFExport.html.
    This tool will export the RDF from OWLIM.


    Cheers,

    Anton Andreev
    email:[email protected]
    Account Manager at Ontotext
    www.ontotext.com



    On 15.3.2010 at 11:56, Philip Alexiev wrote:

        Hello Nader,

        1 and 3: KIM does not provide functionality to get the output
        in rdf/xml format.  I don't recall older versions being able
        to do this either. Maybe it is achievable through the API. We
        haven't developed it in this direction.

        2. There are a number of efforts to make public data available
        in RDF format. There is also a big project which aims to
        connect the different disjoint datasets into one large Knowledge
        Base. The project is called Linked Open Data
        (http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData).
        In the data sets it uses you may find useful references for
        your task
        (http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets).
        For example: Geonames - http://www.geonames.org/ontology/ .

        If you have your own data which is in a different format, you
        may write your own custom tool to create RDF from it. You will
        have to tie it to the ontology KIM uses by default - PROTON
        (http://proton.semanticweb.org/). There is a section in the
        documentation of KIM which will be helpful: Creating
        Knowledge Bases and Ontologies.

        Some tools for designing and viewing ontologies are: Protege,
        Swoop, Top Braid Composer.

        Hope this helps
        Philip

        On 03/14/2010 10:42 PM, Nader Zaki wrote:

            Dear Philip,

            I want to ask a few questions about KIM:

            1- How can I get the output of the annotation as OWL or
            RDF/XML files?

            2- You told me before to go to *kim/config/sesame.conf* and
            edit the file by adding the namespaces of any new
            knowledge base, but how can I build this knowledge base?
            Can you explain in more detail?

            3- How can I get the older versions of the KIM platform?
            I need precisely an API that converts HTML to OWL or
            RDF-based files, and as far as I know the older versions of
            the KIM platform did that.

            Thanks a lot for your time. Waiting for your reply.

            Regards,

            Nader Nassef Zaki

            > Date: Tue, 9 Mar 2010 15:50:25 +0200
            > From: [email protected]
            > To: [email protected]
            > CC: [email protected]
            > Subject: Re: Urgent Request
            >
            > Hello Nader,
            >
            > If you are using KIM prior to 3.0, the knowledge base used is
            > described in kim/config/sesame.conf - there is an imports section.
            > You can add your custom files containing RDF data there. Keep in mind
            > that you will have to provide a corresponding default namespace below
            > as well.
            >
            > The files can be in N-Triples format or in RDF/XML. You can use any
            > ontology editor to achieve this. Protege is a good choice. Just take a
            > look at the resulting RDF/XML to make sure it is OK.
            >
            > Hope this helps
            > Philip
            >
            >




        -- Philip Alexiev <[email protected]>
        Software Engineer
        Ontotext AD


        _______________________________________________
        Kim-discussion mailing list
        [email protected]
        http://ontotext.com/mailman/listinfo/kim-discussion


