Dear Florent,

On 26.07.2011 at 16:46, florent andré wrote:

>> The dependency on Hibernate is mostly for the triple store, not for CMS
>> capabilities. And this is something I don't see how to avoid in the near
>> future because we need to store additional information about triples for
>> reasoning and versioning.
>>
>> Versioning is also of triples, not of content. As such it is probably also
>> interesting to the Stanbol community.
>
> I'm interested in a little explanation of the way you store version /
> history of triples.

We actually use a purely relational approach:
- a table "KIWINODE" stores RDF nodes (a unified table for literals, blank
  nodes and resources)
- a table "TRIPLES" stores triples with id, subject, predicate, object,
  context, marker for deleted, marker for inferred, timestamp, creator
  (subject, predicate, object, context, creator are references to KIWINODE)
- a table "VERSION" stores version ID, timestamp, creator
- join tables "VERSION_ADDEDNODES", "VERSION_REMOVEDNODES",
  "VERSION_ADDEDTRIPLES", "VERSION_REMOVEDTRIPLES" store references to added
  and removed nodes and to added and removed triples; for deleted triples and
  nodes, the boolean marker is set to true, for added ones it is false

Versioning is thus a simple database operation. "Active" (undeleted) triples
can be filtered easily using the boolean marker. Undoing simply means
reversing the operations (add and remove) on triples and nodes.

> I begin to think about that (but just think for now :) ), and the possible
> help of big tables (e.g. hbase) for this...
>
> Hbase is a (kind of) 3 dimensional database :
> - 1 is column
> - 1 is row
> - 1 is timestamp

I really don't see the point. A relational database is already n-dimensional ;-)

> So, for my 100 feet idea :
> - each triple is a row
> - ?s, ?p, ?o each a column (or a column family)
>
> And so, history of each triple is stored on the 3rd dimension : timestamps.
>
> This can bring to a really clean and easy design... if not strong
> technical/integration restrictions comes...

I am not really convinced, but maybe you can offer some more details and
convince me. ;-) I am not familiar with these kinds of databases. My thought
is that relational databases are well suited to the task because this is what
they were designed for (triples are purely relational data), with one (minor)
exception: expensive join operations happen frequently when querying RDF, and
there is almost no chance to materialize them in advance.
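
To make the table layout described above more concrete, here is a minimal sketch in Python with SQLite. It is only an illustration under assumptions: the column names, column types, the index, and the helper functions (node, commit_version, undo_version, active_triples) are invented for this example and are not the actual Hibernate mapping. The two join tables for nodes are left out to keep it short.

```python
import sqlite3

# Assumed column names and types; only the table names come from the text above.
SCHEMA = """
CREATE TABLE KIWINODE (
    id      INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL,              -- 'uri', 'bnode' or 'literal'
    content TEXT NOT NULL,
    deleted INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE TRIPLES (
    id        INTEGER PRIMARY KEY,
    subject   INTEGER NOT NULL REFERENCES KIWINODE(id),
    predicate INTEGER NOT NULL REFERENCES KIWINODE(id),
    object    INTEGER NOT NULL REFERENCES KIWINODE(id),
    context   INTEGER REFERENCES KIWINODE(id),
    creator   INTEGER REFERENCES KIWINODE(id),
    deleted   INTEGER NOT NULL DEFAULT 0,
    inferred  INTEGER NOT NULL DEFAULT 0,
    created   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE VERSION (
    id      INTEGER PRIMARY KEY,
    created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    creator INTEGER REFERENCES KIWINODE(id)
);
CREATE TABLE VERSION_ADDEDTRIPLES (
    version_id INTEGER NOT NULL REFERENCES VERSION(id),
    triple_id  INTEGER NOT NULL REFERENCES TRIPLES(id)
);
CREATE TABLE VERSION_REMOVEDTRIPLES (
    version_id INTEGER NOT NULL REFERENCES VERSION(id),
    triple_id  INTEGER NOT NULL REFERENCES TRIPLES(id)
);
-- Hypothetical index to support lookups by triple pattern.
CREATE INDEX idx_triples_spo ON TRIPLES (subject, predicate, object);
"""


def node(db, kind, content):
    """Insert one RDF node and return its id."""
    cur = db.execute(
        "INSERT INTO KIWINODE (kind, content) VALUES (?, ?)", (kind, content)
    )
    return cur.lastrowid


def commit_version(db, creator, added=(), removed=()):
    """Record one version: insert added triples, flag removed ones as deleted."""
    vid = db.execute(
        "INSERT INTO VERSION (creator) VALUES (?)", (creator,)
    ).lastrowid
    for s, p, o in added:
        tid = db.execute(
            "INSERT INTO TRIPLES (subject, predicate, object, creator) "
            "VALUES (?, ?, ?, ?)",
            (s, p, o, creator),
        ).lastrowid
        db.execute("INSERT INTO VERSION_ADDEDTRIPLES VALUES (?, ?)", (vid, tid))
    for tid in removed:
        db.execute("UPDATE TRIPLES SET deleted = 1 WHERE id = ?", (tid,))
        db.execute("INSERT INTO VERSION_REMOVEDTRIPLES VALUES (?, ?)", (vid, tid))
    return vid


def undo_version(db, vid):
    """Reverse a version: restore what it removed, mark what it added as deleted."""
    db.execute(
        "UPDATE TRIPLES SET deleted = 0 WHERE id IN "
        "(SELECT triple_id FROM VERSION_REMOVEDTRIPLES WHERE version_id = ?)",
        (vid,),
    )
    db.execute(
        "UPDATE TRIPLES SET deleted = 1 WHERE id IN "
        "(SELECT triple_id FROM VERSION_ADDEDTRIPLES WHERE version_id = ?)",
        (vid,),
    )


def active_triples(db):
    """Return all undeleted triples as (id, subject, predicate, object) rows."""
    rows = db.execute(
        "SELECT id, subject, predicate, object FROM TRIPLES "
        "WHERE deleted = 0 ORDER BY id"
    )
    return rows.fetchall()
```

In this sketch a removed triple is never physically deleted, so undoing a version only flips the boolean markers. Resolving a triple pattern back to node contents still requires joining TRIPLES against KIWINODE once per position (subject, predicate, object), which is where the join cost mentioned above comes from.
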
The join cost can be compensated a bit by proper indexing and configuration
of the database, however.

Greetings,

Sebastian
-- 
| Dr. Sebastian Schaffert          [email protected]
| Salzburg Research Forschungsgesellschaft  http://www.salzburgresearch.at
| Head of Knowledge and Media Technologies Group          +43 662 2288 423
| Jakob-Haringer Strasse 5/II
| A-5020 Salzburg
