Hi all,

I forgot to include the stanbol-dev list in my reply ...
best
Rupert

---------- Forwarded message ----------
From: Rupert Westenthaler <[email protected]>
Date: Wed, Jul 25, 2012 at 9:43 AM
Subject: Re: NIF, NLP2RDF and Stanbol
To: Sebastian Hellmann <[email protected]>

Hi Sebastian,

I finally found the time to read through most of the resources
referenced in your original mail, and I think I now have a much better
understanding of nlp2rdf.

## Points of Interest (for Stanbol Developers/Users)

1. NIF URI scheme: Basically, this encodes the information of a
   fise:TextAnnotation within the URI (see [1] for details). There are
   two variants:

   * "{content-item-uri}#offset_{startindex}_{endindex}_{max-20chars-of-selected-text}"
   * "{content-item-uri}#hash_{context-length}_{length-selected}_{context-md5}_{max-20chars-of-selected-text}"

   This could be implemented relatively easily by adding support for
   those URIs to the EnhancementEngineHelper#createTextAnnotation(..)
   methods. Note that this would require an API change, because
   EnhancementEngines would need to pass the selected-text, context and
   offset values so that the NIF-compliant URI can be calculated
   correctly.

2. Ontologies of Linguistic Annotations (OLiA) provides URIs for things
   like "olia:Noun" and "olia:Verb". It might also be useful for
   expressing lemmas and other features (e.g. as provided by the CELI
   engines). However, this ontology is also quite big and therefore hard
   to grasp. If it could be used to determine whether a POS (Part of
   Speech) tag provided by some NLP tool for some language corresponds
   to a "Noun", "Verb" ..., it could be really useful. Currently the
   KeywordLinkingEngine manages this information as part of its
   configuration. But as I mentioned above, I do not understand OLiA
   well enough to be sure whether such a thing is possible/feasible.

3. NIF and the Open Annotation Core Data Model (OA) [2] (related to
   STANBOL-351): There was recently a suggestion to adopt OA for Apache
   Stanbol, and NIF seems to have quite an overlap with OA. Some more
   information about that would clearly help.
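As a rough illustration of the two URI variants in point 1, here is a minimal Python sketch. The function names, the default context length, the percent-encoding of the text snippet, and the exact string fed to MD5 (the selection in parentheses between its left and right context) are assumptions made for this example only; [1] remains the authoritative definition.

```python
import hashlib
from urllib.parse import quote


def offset_uri(content_item_uri: str, text: str, start: int, end: int) -> str:
    """Offset variant: #offset_{start}_{end}_{max 20 chars of the selection}."""
    selected = text[start:end]
    # ASSUMPTION: the snippet is percent-encoded so the fragment stays a valid URI.
    return f"{content_item_uri}#offset_{start}_{end}_{quote(selected[:20])}"


def hash_uri(content_item_uri: str, text: str, start: int, end: int,
             context_length: int = 10) -> str:
    """Hash variant: #hash_{context-length}_{length-selected}_{context-md5}_{snippet}."""
    selected = text[start:end]
    before = text[max(0, start - context_length):start]
    after = text[end:end + context_length]
    # ASSUMPTION: the digest covers "<left context>(<selection>)<right context>".
    digest = hashlib.md5(f"{before}({selected}){after}".encode("utf-8")).hexdigest()
    return (f"{content_item_uri}#hash_{context_length}_{len(selected)}"
            f"_{digest}_{quote(selected[:20])}")


if __name__ == "__main__":
    text = "Bob Marley was a singer."
    # "urn:content-item-1" is a made-up content item URI.
    print(offset_uri("urn:content-item-1", text, 0, 10))
    print(hash_uri("urn:content-item-1", text, 0, 10))
```

Both functions need the full text plus the start and end offsets of the selection, which is why an engine would have to hand these values over when creating a TextAnnotation.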
Finally, I would really like it if someone could actually translate the
FISE annotations depicted in [3] to NIF. I think this would make it much
easier for members of the Stanbol Community to grasp NIF. This should
include information like

* POS tags for "Bob" and "Marley"
* Chunk "Bob Marley"? Can chunks be connected to words?
* Are EntityAnnotations within the scope of NIF? If yes, how would they
  be encoded using NIF?
* How to express metadata (e.g. dc:creator, fise:extracted-from,
  fise:confidence) in NIF

best
Rupert

[1] http://svn.aksw.org/papers/2012/WWW_NIF/public/string_ontology.pdf
[2] http://www.openannotation.org/spec/core/
[3] http://incubator.apache.org/stanbol/docs/trunk/components/enhancer/enhancementstructure.html#overview-on-the-stanbol-enhancement-structure

On Wed, Jul 4, 2012 at 9:06 AM, Sebastian Hellmann <[email protected]> wrote:
> Hi Rupert,
> I found the dead links here:
> http://incubator.apache.org/stanbol/docs/trunk/enhancementusage.html
> Basically every link with "fise:" at the beginning, see the highlighted part
> here:
> http://pcai042.informatik.uni-leipzig.de/~swp12-9/vorprojekt/index.php?annotation_request=http%3A//incubator.apache.org/stanbol/docs/trunk/enhancementusage.html%23frag_f0935e4cd5920aa6c7c996a5ee53a70f
>
> I will have a look at the ontology soon.
>
> All the best,
> Sebastian
>
> On 04.07.2012 08:22, Rupert Westenthaler wrote:
>
> Hi Sebastian, all
>
> Thanks for this mail/proposal. It is definitely well received by
> myself and I think also the Stanbol Community as a whole.
>
> This is only a quick reply to the question about the wrong URL for
> the Enhancement Structure Documentation. For a detailed reply I will
> definitely need more time.
>
> On Wed, Jul 4, 2012 at 1:33 AM, <[email protected]> wrote:
>
> I got a 404 on
> http://incubator.apache.org/enhancer/enhancementstructure.html
> I read "fise" somewhere. What is it? How does it compare to NIF? What URIs
> do you use? How many triples do you have per annotation?
>
> This looks like a link to that page that unintentionally started with a '/'.
> Can you remember where this link occurred?
>
> The correct URL is
>
> http://incubator.apache.org/stanbol/docs/trunk/enhancer/enhancementstructure.html
>
> Regarding typical use cases, you should also have a look at this usage
> scenario
>
> http://incubator.apache.org/stanbol/docs/trunk/enhancementusage.html
>
> The ontologies can be found in the SVN (we will make them
> de-referenceable as soon as we are a full Apache project and own
> the URLs)
>
> http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/generic/servicesapi/src/main/resources/
>
> best
> Rupert
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Events: http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
> Projects: http://nlp2rdf.org , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org

--
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen
