NIF, NLP2RDF and Stanbol

hellmann Tue, 03 Jul 2012 16:33:53 -0700

Dear list,

this is an exploratory email to find possible intersections and points for collaboration. "Exploratory" because I would claim to have a rough idea of what Stanbol is, but I don't know enough yet to wrap my head around it completely and pinpoint the overlaps. This is why I will try to give a brief outline of what we have been working on in the LOD2 EU project, which direction it is going in, and what my initial ideas are. Feedback is very welcome. I am not a Stanbol expert yet, so I might be off track ;)


Since last year, we have been working on the NLP Interchange Format (NIF).

NIF is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations.


What NIF currently is:

1. In Sept. 2011, we published the specification 1.0: http://nlp2rdf.org/nif-1-0 . There are about 8-12 implementations out there that we know of (see the demo at 5.).

2. One of the latest draft papers about it can be found here: http://svn.aksw.org/papers/2012/WWW_NIF/public/string_ontology.pdf

3. The basic idea is to use # fragments to give URIs to strings, e.g.:

http://www.w3.org/DesignIssues/LinkedData.html#offset_717_729 represents the first occurrence of "Semantic Web" in http://www.w3.org/DesignIssues/LinkedData.html
Of course, you can then use this URI as a subject and add any annotation you want.

e.g.:
:offset_717_729 its:mentions dbpedia:Semantic_Web .
4. There is a Web annotator making use of the Hash URI scheme or NIF:
http://pcai042.informatik.uni-leipzig.de/~swp12-9/vorprojekt/index.php?annotation_request=http%3A//www.w3.org/DesignIssues/LinkedData.html%23frag_65b9eea6e1cc6bb9f0cd2a47751a186f

5. There is a demonstrator (will be much nicer in a couple of days): http://nlp2rdf.lod2.eu/demo.php

with eye candy, but minor bug: http://nlp2rdf.lod2.eu/demo_new.php

6. Apart from that, NIF also tries to find best practices for annotation, e.g. OLiA identifiers for part-of-speech tags http://www.sfb632.uni-potsdam.de/~chiarcos/ontologies.xml , NERD, or the lemon model.
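
To illustrate the offset-based URI idea from item 3, here is a minimal Python sketch. It assumes zero-based character offsets with an exclusive end, which is consistent with offset_717_729 covering the 12 characters of "Semantic Web"; the NIF 1.0 specification remains the authority on how offsets are actually counted. The function names and the example.org predicate URI are hypothetical placeholders, not part of NIF.

```python
import re

# Fragment pattern matching the example URI in item 3: "#offset_<begin>_<end>".
_OFFSET_FRAGMENT = re.compile(r"#offset_(\d+)_(\d+)$")


def offset_uri(document_uri: str, text: str, phrase: str) -> str:
    """Build a URI for the first occurrence of `phrase` in `text`.

    Assumption: offsets are zero-based character positions and the end
    offset is exclusive (end - begin == len(phrase)).
    """
    begin = text.find(phrase)
    if begin < 0:
        raise ValueError(f"{phrase!r} does not occur in the text")
    end = begin + len(phrase)
    return f"{document_uri}#offset_{begin}_{end}"


def resolve_offset_uri(uri: str, text: str) -> str:
    """Return the substring of `text` that an offset-based URI points to."""
    match = _OFFSET_FRAGMENT.search(uri)
    if match is None:
        raise ValueError(f"not an offset-based URI: {uri}")
    begin, end = int(match.group(1)), int(match.group(2))
    return text[begin:end]


def mention_triple(string_uri: str, predicate_uri: str, entity_uri: str) -> str:
    """Serialize one annotation as an N-Triples line with the string URI as subject."""
    return f"<{string_uri}> <{predicate_uri}> <{entity_uri}> ."


if __name__ == "__main__":
    # Toy document text; a real client would fetch the text of the page.
    text = "Linked Data is the basis of the Semantic Web."
    uri = offset_uri("http://example.org/doc.html", text, "Semantic Web")
    print(uri)
    print(resolve_offset_uri(uri, text))
    # The predicate URI is a hypothetical stand-in for its:mentions.
    print(mention_triple(uri, "http://example.org/mentions",
                         "http://dbpedia.org/resource/Semantic_Web"))
```

Because the offsets are stored in the fragment, anyone holding the same document text can map the URI back to the annotated string without a separate lookup table.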


What is planned for NIF:

a) A new spec, NIF 2.0, within this year. Discussion will be on this mailing list: http://lists.informatik.uni-leipzig.de/mailman/listinfo/nlp2rdf
NIF will be simplified (simpler URI schemes and annotations), consolidated (better implementations) and extended (ability to express confidence values and string sets, etc.).

b) We plan to have implementations for NERD http://nerd.eurecom.fr , DBpedia Spotlight, Zemanta.com and DKPro http://www.ukp.tu-darmstadt.de/research/current-projects/dkpro/

c) Inclusion of XPointer as a NIF URI scheme and creation of a mapping to "string URIs". This should somehow be compatible with the Internationalisation Tag Set (ITS) 2.0 http://www.w3.org/TR/its20/ , but we are still working together on a bidirectional bridge. There has been plenty of discussion, partly in this thread: http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0101.html

d) NIF should be compatible with PROV-AQ: Provenance Access and Query http://www.w3.org/TR/2012/WD-prov-aq-20120619/


What I am hoping for or my ideas about how Stanbol and NIF overlap:

I) Reading your documentation, you seem to be able to provide very good use cases and feedback for NIF 2.0. We would really like to include that and also tailor NIF 2.0 to your needs. We are currently setting up a wiki (still ugly, sorry): http://wiki.nlp2rdf.org/ Please mail me for accounts.

II) I would assume that you need some OWL model for all the enhancer output. NIF standardizes NLP tool output and tries to be blank-node free and lightweight, but still as expressive as possible. For you this would mean that you could really save time, as ontology modelling is really tedious. By reusing NIF you would get a free data model and spec, and you could focus on the implementation of the Stanbol engine.
I got a 404 on http://incubator.apache.org/enhancer/enhancementstructure.html
I read "fise" somewhere. What is it? How does it compare to NIF? What URIs do you use? How many triples do you have per annotation?

III) With NIF we focused on the RDF output for tools, not on the workflow. Stanbol seems to focus on the workflow as well, right? It might be easy to implement a NIF engine with Stanbol. This could be a good showcase for NIF and Stanbol. With a Debian package, we could include Stanbol in the LOD2 Stack http://stack.lod2.eu/


Sorry for the long email, please give some feedback about your ideas.
I am also willing to answer questions and provide examples.
All the best,
Sebastian

--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events: http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
