Stanbol Early Adopter proposals

Adrian Gschwend Thu, 21 Jun 2012 04:05:10 -0700

Hi group,

First, thanks to those who gave demos and presentations last week in
Salzburg. I was really impressed by the outcome of this project; I can't
remember another FP7 project that produced such an interesting
framework as a result.


John asked me if I would like to do an early adopter project this
summer. After studying the modules, I can think of the following two
use cases:

Rules/Reasoning/Ontology Manager modules
----------------------------------------
netlabs.org is working on a framework that facilitates interfacing
with RDF-based data. It goes in the direction of VIE but pushes the
concept quite a bit further, into an even stronger abstraction. I
showed a demo to Massimo Romanelli and had the impression he was
pretty impressed by the concept, so I'm confident that it will be useful
to the SemWeb community. VIE does things a bit differently, but I think
it could even fit as a UI library on top of our framework (I will have
to investigate that).

We rely heavily on ontologies in our framework, which means we cache
them and do quite a bit of reasoning on top of them to figure out how
data can be shown in the optimal way. This includes figuring out the
relationships between ontology classes and attributes. Right now this
reasoning is pretty dumb and mainly done in code, which means we
analyze the triples ourselves. After reading about rules, reasoning and
ontologies in Stanbol, I'm pretty sure that we could do all this and
much more in Stanbol.

So the proposal would be to use the mentioned modules of Stanbol to:

- cache ontologies. Right now we fetch them from the various
official sites, which are often very slow and unreliable
- reason about/infer (and cache the inferred) relationships in the
ontologies. This includes figuring out which attribute belongs to
which class, which class to which superclass etc. We could do much
smarter matching with Stanbol than we do now, which would improve our
interfaces
- implement several strategies to figure out matches between data,
which again improves our user interface. As an example, the interface
could figure out that foaf:based_near points to a spatial thing and
thus can be shown by an interface class which can show a map.
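
A minimal sketch of the kind of matching described in the last point,
in Python for brevity. It is an illustration under assumptions, not
Stanbol's reasoning and not our existing code: the triples are a tiny
hand-written sample, the ex: names, the widget registry and the widget
names are made up, and the only external fact it relies on is that
foaf:based_near has geo:SpatialThing as its range.

```python
from collections import defaultdict, deque

SUBCLASS_OF = "rdfs:subClassOf"
RANGE = "rdfs:range"

# Hand-written sample triples (subject, predicate, object).
# The ex: terms are hypothetical and exist only for this example.
TRIPLES = [
    ("foaf:based_near", RANGE, "geo:SpatialThing"),
    ("foaf:name", RANGE, "rdfs:Literal"),
    ("ex:Office", SUBCLASS_OF, "geo:SpatialThing"),
    ("ex:workplace", RANGE, "ex:Office"),
]

# Hypothetical registry: which interface class can display which RDF class.
WIDGET_FOR_CLASS = {"geo:SpatialThing": "MapWidget"}
DEFAULT_WIDGET = "TextWidget"


def superclasses(cls, triples):
    """Return cls followed by its transitive superclasses, nearest first."""
    parents = defaultdict(list)
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            parents[s].append(o)
    ordered, queue = [], deque([cls])
    while queue:
        current = queue.popleft()
        if current in ordered:  # also protects against cycles
            continue
        ordered.append(current)
        queue.extend(parents[current])
    return ordered


def widget_for_property(prop, triples):
    """Pick a widget by looking at the property's range and its superclasses."""
    for s, p, o in triples:
        if s == prop and p == RANGE:
            for cls in superclasses(o, triples):
                if cls in WIDGET_FOR_CLASS:
                    return WIDGET_FOR_CLASS[cls]
    return DEFAULT_WIDGET


print(widget_for_property("foaf:based_near", TRIPLES))
print(widget_for_property("ex:workplace", TRIPLES))
print(widget_for_property("foaf:name", TRIPLES))
```

The second call shows why the superclass walk matters: ex:workplace is
never mentioned in the registry, but its range is a subclass of
geo:SpatialThing, so the map widget is still chosen.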

Expected results for users:
- Much smarter user interfaces that can adapt to the selected
(RDF-based) data and choose the best representation for that data on a
particular device (a smartphone might look different from the desktop
web browser)
- Faster experience because we cache the ontologies and additional
inferred knowledge in Stanbol
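
To illustrate the caching idea in isolation, here is a small sketch in
Python. It says nothing about how Stanbol stores ontologies; the
OntologyCache class, its parameters and the fake fetch function are
assumptions made up for this example. The point is only that a slow
official site is contacted once per ontology instead of on every
request.

```python
import time


class OntologyCache:
    """Keep fetched ontologies in memory for a limited time (hypothetical helper)."""

    def __init__(self, fetch, max_age_seconds=86400.0, clock=time.monotonic):
        self._fetch = fetch            # callable: uri -> ontology document
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # uri -> (time fetched, document)

    def get(self, uri):
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]            # fresh enough, no remote call
        document = self._fetch(uri)    # slow path: ask the remote site
        self._entries[uri] = (now, document)
        return document


# Stand-in for a real HTTP fetch, so the example runs offline.
calls = []

def fake_fetch(uri):
    calls.append(uri)
    return "<ontology for %s>" % uri

cache = OntologyCache(fake_fetch)
cache.get("http://xmlns.com/foaf/0.1/")
cache.get("http://xmlns.com/foaf/0.1/")
print(len(calls))  # 1: the second request was served from the cache
```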

Expected results from a data perspective:
- Create "class trees and property trees" for ontologies. The goal is to
find out how classes and properties are related to each other. In our
framework widgets are matched to RDF properties. But we cannot and do
not want to implement a class for every property, so the relationships
between properties can help the system figure out which widget might be
the best choice to present a given piece of information, even if the
widget designer did not necessarily think of that up front.
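
A property tree and the widget fallback it enables could be sketched
like this in Python. Everything here is an assumption for illustration:
the ex: properties, the widget registry and the function names are
invented, and the tree is built from rdfs:subPropertyOf statements held
in a plain list rather than obtained from Stanbol.

```python
from collections import defaultdict, deque

SUBPROPERTY_OF = "rdfs:subPropertyOf"

# Hypothetical sample data, not taken from a real ontology.
PROPERTY_TRIPLES = [
    ("ex:mobilePhone", SUBPROPERTY_OF, "ex:phone"),
    ("ex:fax", SUBPROPERTY_OF, "ex:phone"),
    ("ex:phone", SUBPROPERTY_OF, "ex:contactPoint"),
]

# A widget was only written for ex:phone.
PROPERTY_WIDGETS = {"ex:phone": "PhoneNumberWidget"}


def property_tree(triples):
    """Map each property to the list of its direct subproperties."""
    children = defaultdict(list)
    for s, p, o in triples:
        if p == SUBPROPERTY_OF:
            children[o].append(s)
    return dict(children)


def resolve_widget(prop, triples, widgets):
    """Return the widget of prop or of its nearest superproperty, else None."""
    parents = defaultdict(list)
    for s, p, o in triples:
        if p == SUBPROPERTY_OF:
            parents[s].append(o)
    seen, queue = set(), deque([prop])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        if current in widgets:
            return widgets[current]
        queue.extend(parents[current])
    return None


print(property_tree(PROPERTY_TRIPLES))
print(resolve_widget("ex:mobilePhone", PROPERTY_TRIPLES, PROPERTY_WIDGETS))
```

Here ex:mobilePhone has no widget of its own but inherits the one
registered for ex:phone, which is the behaviour described above.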

I would use Stanbol via the REST API, as our reference implementation
is mainly written in Node.js. That way I could easily replace our own
REST interfaces, which do the work right now.

I think there is a huge potential in the reasoning part and I would
love to spend time on that and see where we can go with it.

As a last remark, our framework will be released under APL as well, and
I would of course make all Stanbol-related work available too.

Document Management Demo with ContentHub
----------------------------------------
I would use the ContentHub module to store and index all kinds of
documents using the Apache Solr backend integrated into Stanbol.

After analyzing the exposed RDF data provided by ContentHub we would
implement a demo of how Stanbol can be used to find documents in a
semantically enriched way via smart queries.

I assume we could use the rules/reasoning/ontology modules again to
match to other resources and ontologies and make the search more valuable.

In this demo we would focus more on the RDF data exposed by
ContentHub. Reasoning would be needed here too, but how far it gets us
depends heavily on the quality of the ContentHub output, so it's hard
to judge right now how much we could do.

Expected results for users:
- Demo of how enrichment via RDF data can support document management
in terms of categorization, grouping, archiving etc.
- Demo of how ontologies can help figure out relationships between
data which cannot be found with classical tagging (broader/narrower
search results)
-> proof of concept to show if and how Stanbol could be used for
document management systems like Nuxeo, Microsoft SharePoint etc.
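
As a rough illustration of broader/narrower search, the following
Python sketch expands a query concept to all of its narrower concepts
before matching documents. It is not based on ContentHub output: the
concept hierarchy (in the style of skos:narrower), the document names
and their tags are all invented for the example.

```python
# Hypothetical concept hierarchy: concept -> directly narrower concepts.
NARROWER = {
    "ex:Contract": ["ex:EmploymentContract", "ex:SupplierContract"],
    "ex:SupplierContract": ["ex:FrameworkAgreement"],
}

# Hypothetical documents with the concepts extracted from them.
DOCUMENTS = {
    "doc-1.pdf": {"ex:EmploymentContract"},
    "doc-2.docx": {"ex:FrameworkAgreement"},
    "doc-3.txt": {"ex:Invoice"},
}


def expand(concept, narrower):
    """Return the concept together with all transitively narrower concepts."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(narrower.get(current, []))
    return seen


def search(concept, documents, narrower):
    """Find documents tagged with the concept or any narrower concept."""
    wanted = expand(concept, narrower)
    return sorted(name for name, tags in documents.items() if tags & wanted)


def search_exact(concept, documents):
    """Classical tagging: only exact tag matches count."""
    return sorted(name for name, tags in documents.items() if concept in tags)


print(search_exact("ex:Contract", DOCUMENTS))
print(search("ex:Contract", DOCUMENTS, NARROWER))
```

With exact tag matching the query for ex:Contract finds nothing, while
the expanded query finds both contract documents, because no document
in the sample carries the general tag itself.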

Expected results from a data perspective:
- Insight into how powerful entity extraction is from an RDF data
perspective for "classical" documents used and found in every company
today
- See how well it fits with company-internal vocabularies and thesaurus
concepts
- Some idea of how difficult it is to gain additional knowledge from it
using reasoning/inference etc.


These are the things I can think of right now. Note that I will use
Stanbol anyway; there seems to be a lot of useful stuff in there!

As a last remark: we are not Java coders, so we would mainly use the
REST APIs as they are and talk to the developers on the list if we run
into issues. We are pretty good at reporting stuff; DBpedia fixed a
whole lot of things after we started using it and reported issues ;)

Let me know if you find any of these interesting. I am not sure how
much detail you need to decide whether we can participate in the
program. We do have the resources this summer, so we could start
spending time on it pretty soon.

The amount of time we invest can be discussed with you. I would report
to the list on a regular basis if you want.

cu

Adrian

-- 
Adrian Gschwend
@ netlabs.org

ktk [a t] netlabs.org
-------
Open Source Project
http://www.netlabs.org
