Hi Carsten,
On 2/22/2012 12:02 PM, Carsten Keßler wrote:
Dear LODers,
we are currently working on a project for the United Nations Office
for the Coordination of Humanitarian Affairs (OCHA) in Geneva to
develop a Humanitarian Exchange Language (HXL). Some information about
the project is available at https://sites.google.com/site/hxlproject/.
One of the core components of HXL will be an RDF vocabulary to
annotate the data that are exchanged between humanitarian
organizations. The current draft is available at
http://hxl.humanitarianresponse.info. It is far from complete, but I
think it already shows where we want to go with this. Any feedback on
the vocabulary draft is very welcome, of course.
At first glance, your ontology looks very interesting and well designed.
The aspect we are currently working on is a metadata section that will
include classes and properties to state who has reported a certain
piece of information, when it was reported, whether it was approved
(and at which level), and so forth. The current idea is to create
named graphs that can be described by these metadata elements. I'd
like to hear your comments on this approach, since this will lead to a
situation where we can have the same triple in several named graphs.
For example, graph A with all data reported on January 20, 2012 by an
OCHA information officer in Sudan, graph B with all data approved by
the OCHA regional office on January 21, and graph C with all data
approved by OCHA in Geneva on January 22. The rationale is to be able
to query based on these metadata elements via SPARQL, e.g., "give me
all figures about refugees in Sudan from January 2012 approved by OCHA
Geneva". Note that the regional office may only approve some of the
triples originally reported, and OCHA Geneva may only approve a subset
of those approved by the regional office. So basically we need to be
able to attach those metadata elements to every single triple.
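To make the named-graph idea above concrete, here is a minimal sketch in
plain Python (an in-memory stand-in, not a real triple store or SPARQL
endpoint). All names in it (the ex: identifiers, the metadata keys, the
figures, and the helper triples_where) are assumptions made up for
illustration; none of them come from the HXL draft.

```python
from datetime import date

# Hypothetical triples; the ex: names and counts are illustrative only.
t1 = ("ex:campA", "ex:refugeeCount", 1200)
t2 = ("ex:campB", "ex:refugeeCount", 800)
t3 = ("ex:campC", "ex:refugeeCount", 450)

# Each named graph is a set of triples. Each approval level holds a
# subset of the level below it, so triples are repeated across graphs.
graphs = {
    "ex:graphA": {t1, t2, t3},  # everything reported in the field
    "ex:graphB": {t1, t2},      # subset approved by the regional office
    "ex:graphC": {t1},          # subset approved by Geneva
}

# Metadata attached to each named graph (assumed keys: status, by, on).
graph_meta = {
    "ex:graphA": {"status": "reported", "by": "ex:OCHA-Sudan-officer",
                  "on": date(2012, 1, 20)},
    "ex:graphB": {"status": "approved", "by": "ex:OCHA-regional-office",
                  "on": date(2012, 1, 21)},
    "ex:graphC": {"status": "approved", "by": "ex:OCHA-Geneva",
                  "on": date(2012, 1, 22)},
}


def triples_where(**conditions):
    """Union of all graphs whose metadata matches every given condition."""
    result = set()
    for name, meta in graph_meta.items():
        if all(meta.get(key) == value for key, value in conditions.items()):
            result |= graphs[name]
    return result


def copies_of(triple):
    """Number of named graphs that contain the given triple."""
    return sum(1 for triples in graphs.values() if triple in triples)


# "Everything approved by OCHA Geneva"
approved_by_geneva = triples_where(status="approved", by="ex:OCHA-Geneva")
```

Here copies_of(t1) counts how many graphs repeat the same triple, which
is the duplication the approach implies once a triple has passed several
approval levels.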
We will probably run into a situation where we can have the same
triple in 10–20 graphs at the same time. Likewise, we will have a
pretty large number of named graphs in our store, and I'd like to know
whether you think this approach is problematic (e.g. in terms of query
performance), and whether you see an alternative approach?
I have also given this topic some thought in the past. It is also being
discussed by the current RDF WG Graphs TF (see [1]).
I think you have pointed out exactly the problems with duplicated triples
and single-triple named graphs. So there may be a (rather old) need
for statement identifiers, i.e., a URI (or maybe a bnode) that
identifies a single triple so that external context information can be
attached to it. You can find my proposal on the RDF WG comments mailing
list, see [2].
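As a rough illustration of the statement-identifier idea (a sketch in
plain Python under my own assumptions, not the syntax or model of the
proposal in [2]): each triple is stored once under an identifier, and
the context information refers to that identifier. The ex: names, the
annotation fields, and the helper statements_with are all hypothetical.

```python
from datetime import date

# Each triple is stored exactly once, keyed by a statement identifier.
# The identifiers and the ex: names are made up for illustration.
statements = {
    "ex:stmt1": ("ex:campA", "ex:refugeeCount", 1200),
    "ex:stmt2": ("ex:campB", "ex:refugeeCount", 800),
    "ex:stmt3": ("ex:campC", "ex:refugeeCount", 450),
}

# Context information points at statement identifiers instead of
# repeating the triple: (statement id, status, agent, date).
annotations = [
    ("ex:stmt1", "reported", "ex:OCHA-Sudan-officer", date(2012, 1, 20)),
    ("ex:stmt2", "reported", "ex:OCHA-Sudan-officer", date(2012, 1, 20)),
    ("ex:stmt3", "reported", "ex:OCHA-Sudan-officer", date(2012, 1, 20)),
    ("ex:stmt1", "approved", "ex:OCHA-regional-office", date(2012, 1, 21)),
    ("ex:stmt2", "approved", "ex:OCHA-regional-office", date(2012, 1, 21)),
    ("ex:stmt1", "approved", "ex:OCHA-Geneva", date(2012, 1, 22)),
]


def statements_with(status, agent):
    """Triples that carry an annotation with the given status and agent."""
    return {
        statements[stmt_id]
        for stmt_id, stmt_status, stmt_agent, _when in annotations
        if stmt_status == status and stmt_agent == agent
    }


approved_by_geneva = statements_with("approved", "ex:OCHA-Geneva")
```

The trade-off in this sketch is that every approval step adds one small
annotation record per statement, while the triple itself is never
duplicated.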
Cheers,
Bo
[1] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs
[2] http://lists.w3.org/Archives/Public/public-rdf-comments/2011Jan/0001.html