Rob,

Thank you again for following up. This is great information and opens the door to some additional research I need to do as well.
I'm really excited about Jena. The only thing better than finding great technology is finding a vibrant and helpful user group.

I'll be back soon, and thanks again!

On Fri, Dec 12, 2014 at 5:16 AM, Rob Walpole <[email protected]> wrote:

> Hi again Nate,
>
> > When you talk about your T-box data, I think this would contain class
> > hierarchies, information about which ones are disjoint, etc. Is that
> > right?
>
> Exactly.. and it is a tiny dataset compared to the instance data on the
> A-box side.
>
> > Is there ever a risk that a change to the ontology component (T-box)
> > can invalidate the data component (A-box)? If so, how do you manage
> > that?
>
> Definitely.. but in that case we create a patch using SPARQL Update and
> apply the patch to the data. We keep the patch in our version control
> system so that we have a record of the changes we have made and can
> re-apply them if necessary.
>
> > When you load straight to the triple store, is there a single RDF
> > file? If not, do you use an assembler to gather multiple files?
>
> This depends. We happen to be using named graphs - I don't know whether
> this is appropriate for you or not. We also happen to be using Jena TDB,
> so if the data is held in N-Quads format then we load a single file
> which contains separate graphs. Jena allows you to do this using the
> tdbloader command line tool. We could just as easily load separate RDF
> files that were in a triple format such as Turtle and specify the graph
> name during loading. The result is the same. I wouldn't get too hung up
> on named graphs unless you think they are really appropriate for you,
> though, as they do add some complexity to updating the data which it
> may be better to avoid at first. The reason we chose to do this is that
> our ontology is still developing and we wanted to be able to delete
> terms that we had decided to dump without leaving cruft in the
> triplestore. Dropping the graph seemed to be the best way to achieve
> this.
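
For readers following along, the kind of SPARQL Update patch Rob mentions might look roughly like the sketch below. It only builds the update text for renaming a property inside one named graph, so that it can be kept in version control and re-applied. The graph IRI, the property IRIs, and the helper name `rename_property_patch` are all hypothetical and are not taken from Rob's setup.

```python
def rename_property_patch(graph_iri: str, old_prop: str, new_prop: str) -> str:
    """Build a SPARQL 1.1 Update that renames a property in one named graph.

    WITH scopes the DELETE, INSERT and WHERE parts to the same graph,
    so the patch does not touch triples held in other graphs.
    """
    return (
        f"WITH <{graph_iri}>\n"
        f"DELETE {{ ?s <{old_prop}> ?o }}\n"
        f"INSERT {{ ?s <{new_prop}> ?o }}\n"
        f"WHERE  {{ ?s <{old_prop}> ?o }}"
    )


if __name__ == "__main__":
    # Hypothetical IRIs, for illustration only.
    patch = rename_property_patch(
        "http://example.org/graph/data",
        "http://example.org/ont#ownedBy",
        "http://example.org/ont#heldBy",
    )
    print(patch)  # save this text as a patch file under version control
```

Naming the graph once with WITH is one way to avoid putting the graph name in the wrong part of the update; a GRAPH block inside each clause would be an alternative.
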
>
> > Does separating the T-box and A-box data have any down sides? Is it
> > invisible to reasoners, for example?
>
> Yes, as I say, using named graphs adds complexity to updates. We are
> using Fuseki and we specify "<#dataset> tdb:unionDefaultGraph true" in
> the Fuseki config file, and this means that when we query the data we
> can forget about the named graphs as it is transparent to the query.
> When we do updates, though, strange things can happen if we don't
> specify the graph name in the right part of the query. I can't say how
> it impacts on reasoning as we don't use this at present.
>
> > Finally, I'm obviously a complete neophyte. Am I in the wrong group? I
> > don't want to put noise in the channel
>
> Being a neophyte is cool - welcome! Whether this is the right group
> depends on whether your questions relate to Jena specifics or not.. it
> seems to me they do, at least in part..
>
> Rob
>
> > Thanks again!
> >
> > On Thu, Dec 11, 2014 at 12:20 PM, Rob Walpole <[email protected]>
> > wrote:
> >
> > > Hi Nate,
> > >
> > > I'm not sure what you mean by an "ontology management workflow"
> > > exactly and I can't comment on whether your approach is a good one
> > > or not... but what we have done is to create our own ontology which
> > > as far as possible reuses or extends other pre-existing ontologies
> > > (e.g. central-government, Dublin Core etc.). This ontology consists
> > > of a load of classes, object properties and data properties which
> > > are used inside our actual data. The ontology (or TBox -
> > > http://en.wikipedia.org/wiki/Tbox) and data (or ABox -
> > > http://en.wikipedia.org/wiki/Abox) components exist as separate
> > > datasets and we have found it convenient to store them as separate
> > > named graphs within our triplestore - mainly so that the ontology
> > > component can be updated easily by dropping and reloading the graph.
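
As a rough illustration of "dropping and reloading the graph", the sketch below builds a SPARQL 1.1 Update request that drops a named graph and loads it again from a file, and shows one way to POST it to an update endpoint following the SPARQL 1.1 Protocol. The graph IRI, the file IRI, the endpoint URL, and both helper names are assumptions made for the example, not details of Rob's installation, and the source IRI has to be readable by the server.

```python
import urllib.request


def reload_graph_update(graph_iri: str, source_iri: str) -> str:
    """Build an update that drops a named graph and reloads it from source_iri.

    DROP SILENT does not fail if the graph does not exist yet.
    """
    return (
        f"DROP SILENT GRAPH <{graph_iri}> ;\n"
        f"LOAD <{source_iri}> INTO GRAPH <{graph_iri}>"
    )


def post_update(endpoint: str, update: str) -> int:
    """POST the update text directly, as allowed by the SPARQL 1.1 Protocol."""
    request = urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical graph and file names, for illustration only.
    update = reload_graph_update(
        "http://example.org/graph/ontology",
        "file:///data/ontology.ttl",
    )
    print(update)
    # To apply it, call post_update with your own server's update endpoint,
    # e.g. post_update("http://localhost:3030/ds/update", update)  # assumed URL
```

Only the ontology graph is named in the request, so the instance data in other graphs is left alone.
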
> > >
> > > We manage the ontology using Protege and I have to say I find
> > > modelling things in Protege saves me from wasting huge amounts of
> > > time as it forces me to model things up front before I start
> > > fiddling about with the data. I find the OntoGraf plugin
> > > particularly helpful when I need to visualise relationships and
> > > when discussing requirements with users. Protege also allows you to
> > > save the ontology as an RDF file which you can load straight into
> > > your triplestore (Jena TDB in our case).
> > >
> > > We also keep a number of named individuals in the ontology itself.
> > > These are things that are entities but which I think of (coming
> > > from a Java background) as statics. They are the entities which are
> > > very unlikely to change, and if they do then I am happy to edit
> > > them within the ontology.
> > >
> > > Hope that helps in some way.
> > >
> > > Rob
> > >
> > > Rob Walpole
> > > Email [email protected]
> > > Tel. +44 (0)7969 869881
> > > Skype: RobertWalpole
> > > http://www.linkedin.com/in/robwalpole
> > >
> > >
> > > On Thu, Dec 11, 2014 at 12:30 PM, Nate Marks <[email protected]>
> > > wrote:
> > >
> > > > I'm trying to get my arms around an ontology management workflow.
> > > > I've been reading the docs on the Apache Jena site and a couple
> > > > of books. I was hoping to test my understanding of the technology
> > > > by sharing my current plan and gathering some feedback.
> > > >
> > > > Thanks in advance if you have the time to comment!
> > > >
> > > > I intend to tightly manage a pretty broad ontology. Let's say it
> > > > includes assets, locations, people and workflows.
> > > >
> > > > I think I want to have a single "schema" file that describes the
> > > > asset class hierarchy and the rules for validating assets based
> > > > on properties, disjointness etc.
> > > >
> > > > Then I might have a bunch of other "data" files that enumerate
> > > > all the assets using that first "schema" file.
> > > >
> > > > I'd repeat this structure using a schema file each for locations,
> > > > people, workflows.
> > > >
> > > > Having created these files, I think I can use an assembler file
> > > > to pull them into a single model.
> > > >
> > > > Ultimately, I expect to query the data using Fuseki and this is
> > > > where I get a little hazy. I think the assembler can pull the
> > > > files into a single memory model, then I can write it to a TDB.
> > > >
> > > > Is that necessary, though? It's a simple bit of Java, but I have
> > > > the nagging feeling that there's a shorter path to automatically
> > > > load/validate those files for Fuseki
> > > >
> > > > Is this approach to organizing the files sound?
