Re: Is this a good way to get started?

Rob Walpole Fri, 12 Dec 2014 02:17:48 -0800

Hi again Nate,

> When you talk about your T-box data, I think this would contain class
> hierarchies, information about which ones are disjoint, etc. Is that right?
>

Exactly, and it is a tiny dataset compared to the instance data on the
A-box side.

>
> Is there ever a risk that a change to the ontology component (T-box) can
> invalidate the data component (A-box)? If so, how do you manage that?
>

Definitely, but in that case we create a patch using SPARQL Update and
apply the patch to the data. We keep the patch in our version control
system so that we have a record of the changes we have made and can
re-apply them if necessary.
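
A patch of this kind might look roughly like the sketch below. It is an
illustration only, not the actual patch described above: the property IRIs
and the endpoint URL are made up, and it assumes the server accepts updates
POSTed with the standard SPARQL 1.1 Protocol content type.

```python
import urllib.request


def rename_property_patch(old_iri: str, new_iri: str) -> str:
    """Return a SPARQL Update that rewrites every use of old_iri as new_iri."""
    return (
        f"DELETE {{ ?s <{old_iri}> ?o }}\n"
        f"INSERT {{ ?s <{new_iri}> ?o }}\n"
        f"WHERE  {{ ?s <{old_iri}> ?o }}\n"
    )


def build_update_request(endpoint: str, update: str) -> urllib.request.Request:
    """Prepare (but do not send) a SPARQL 1.1 Protocol update request."""
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


# Hypothetical IRIs and endpoint, used only for this example.
patch = rename_property_patch(
    "http://example.org/onto#oldName",
    "http://example.org/onto#newName",
)
request = build_update_request("http://localhost:3030/ds/update", patch)
# urllib.request.urlopen(request)  # uncomment to apply against a running server
```

The text in `patch` is what would be saved as a file in version control, so
the same change can be replayed against a freshly loaded copy of the data.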

>
> When you load straight to the triple store, is there a single RDF file? If
> not, do you use an assembler to gather multiple files?
>

This depends. We happen to be using named graphs; I don't know whether
that is appropriate for you or not. We also happen to be using Jena TDB, so
if the data is held in N-Quads format then we load a single file which
contains the separate graphs. Jena allows you to do this using the tdbloader
command line tool. We could just as easily load separate RDF files that
were in a triple format such as Turtle and specify the graph name during
loading. The result is the same. I wouldn't get too hung up on named graphs
unless you think they are really appropriate for you though as they do add
some complexity to updating the data which it may be better to avoid at
first. The reason we chose to do this is that our ontology is still
developing and we wanted to be able to delete terms that we had decided to
dump without leaving cruft in the triplestore. Dropping the graph seemed to
be the best way to achieve this.
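
For illustration, the drop-and-reload step can also be expressed as a SPARQL
Update rather than with tdbloader. The sketch below only builds the update
text; the graph IRI and source URL are invented for the example, and whether
LOAD can actually reach the source depends on how the server is set up.

```python
def reload_graph_update(graph_iri: str, source_url: str) -> str:
    """Return a SPARQL Update that empties a named graph and reloads it."""
    return (
        # SILENT avoids an error if the graph does not exist yet.
        f"DROP SILENT GRAPH <{graph_iri}> ;\n"
        f"LOAD <{source_url}> INTO GRAPH <{graph_iri}>\n"
    )


# Hypothetical graph name and ontology location.
update = reload_graph_update(
    "http://example.org/graph/ontology",
    "http://example.org/files/ontology.ttl",
)
print(update)
```

Because the whole graph is dropped first, terms removed from the ontology
file do not linger in the store after the reload.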

>
> Does separating the T-box and A-box data have any downsides? Is it
> invisible to reasoners, for example?
>

Yes, as I say, using named graphs adds complexity to updates. We are using
Fuseki and we specify "<#dataset> tdb:unionDefaultGraph true" in the
Fuseki config file. This means that when we query the data we can forget
about the named graphs, as they are transparent to the query. When we do
updates, though, strange things can happen if we don't specify the graph
name in the right part of the update. I can't say how it impacts
reasoning as we don't use this at present.
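
One way to keep updates predictable is to name the target graph explicitly.
The sketch below is an assumption about how that could be written, not a
copy of any real update: the graph and property IRIs are invented. It uses
the SPARQL 1.1 Update WITH clause, which applies the named graph to the
DELETE, INSERT and WHERE parts alike.

```python
def scoped_rename_update(graph_iri: str, old_iri: str, new_iri: str) -> str:
    """Return a SPARQL Update that renames a property inside one named graph."""
    return (
        # WITH scopes DELETE, INSERT and WHERE to the same named graph.
        f"WITH <{graph_iri}>\n"
        f"DELETE {{ ?s <{old_iri}> ?o }}\n"
        f"INSERT {{ ?s <{new_iri}> ?o }}\n"
        f"WHERE  {{ ?s <{old_iri}> ?o }}\n"
    )


# Hypothetical graph name and property IRIs.
scoped = scoped_rename_update(
    "http://example.org/graph/data",
    "http://example.org/onto#oldName",
    "http://example.org/onto#newName",
)
print(scoped)
```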

>
> Finally, I'm obviously a complete neophyte.  Am I in the wrong group?  I
> don't want to put noise in the channel
>

Being a neophyte is cool - welcome! Whether this is the right group
depends on whether your questions relate to Jena specifics or not. It seems
to me they do, at least in part.

Rob

>
> Thanks again!
>
> On Thu, Dec 11, 2014 at 12:20 PM, Rob Walpole <[email protected]>
> wrote:
>
> > Hi Nate,
> >
> > I'm not sure what you mean by an "ontology management workflow" exactly
> > and I can't comment on whether your approach is a good one or not... but
> > what we have done is to create our own ontology which as far as possible
> > reuses or extends other pre-existing ontologies (e.g. central-government,
> > Dublin Core etc.). This ontology consists of a load of classes, object
> > properties and data properties which are used inside our actual data. The
> > ontology (or TBox - http://en.wikipedia.org/wiki/Tbox) and data (or ABox -
> > http://en.wikipedia.org/wiki/Abox) components exist as separate datasets
> > and we have found it convenient to store them as separate named graphs
> > within our triplestore - mainly so that the ontology component can be
> > updated easily by dropping and reloading the graph.
> >
> > We manage the ontology using Protege and I have to say I find modelling
> > things in Protege saves me from wasting huge amounts of time as it forces
> > me to model things up front before I start fiddling about with the data.
> > I find the OntoGraf plugin particularly helpful when I need to visualise
> > relationships and when discussing requirements with users. Protege also
> > allows you to save the ontology as an RDF file which you can load
> > straight into your triplestore (Jena TDB in our case).
> >
> > We also keep a number of named individuals in the ontology itself. These
> > are for things that are entities but what I think of (coming from a Java
> > background) as statics. They are the entities which are very unlikely to
> > change and if they do then I am happy to edit them within the ontology.
> >
> > Hope that helps in some way.
> >
> > Rob
> >
> > Rob Walpole
> > Email [email protected]
> > Tel. +44 (0)7969 869881
> > Skype: RobertWalpole
> > http://www.linkedin.com/in/robwalpole
> >
> >
> > On Thu, Dec 11, 2014 at 12:30 PM, Nate Marks <[email protected]> wrote:
> >
> > > I'm trying to get my arms around an ontology management workflow. I've
> > > been reading the docs on the Apache Jena site and a couple of books. I
> > > was hoping to test my understanding of the technology by sharing my
> > > current plan and gathering some feedback.
> > >
> > > Thanks in advance if you have the time to comment!
> > >
> > >
> > > I intend to tightly manage a pretty broad ontology. Let's say it
> > > includes assets, locations, people and workflows.
> > >
> > > I think I want to have a single "schema" file that describes the asset
> > > class hierarchy and the rules for validating assets based on
> > > properties, disjointness etc.
> > >
> > > Then I might have a bunch of other "data" files that enumerate all the
> > > assets using that first "schema" file.
> > >
> > > I'd repeat this structure using a schema file each for locations,
> > > people, workflows.
> > >
> > > Having created these files, I think I can use an assembler file to
> > > pull them into a single model.
> > >
> > > Ultimately, I expect to query the data using Fuseki and this is where I
> > > get a little hazy. I think the assembler can pull the files into a
> > > single memory model, then I can write it to TDB.
> > >
> > > Is that necessary, though? It's a simple bit of Java, but I have the
> > > nagging feeling that there's a shorter path to automatically
> > > load/validate those files for Fuseki.
> > >
> > >
> > > Is this approach to organizing the files sound?
> > >
> >
>
