Sorry, this was meant for Rupert so I wrote in German ... but nothing secret ;-)
It says that I would also like to achieve this and that I would like to get
rid of Hibernate if there is a clean way ...

Btw, I am on vacation starting this evening. In case of questions, my
colleague Thomas can surely answer competently in my place. I will check mail
from time to time though ;-)

Greetings,

Sebastian

On 26.07.2011 at 21:18, Sebastian Schaffert wrote (original in German):

> Thanks for the support, that is where I would like to get to as well. ;-)
>
> But many of the suggestions are already very good; I would really like to
> move away from Hibernate if there is a clean way to do it ...
>
> Regards,
> Sebastian
>
> On 26.07.2011 at 18:59, Rupert Westenthaler wrote:
>
>> Hi
>>
>> I think we should investigate whether it would make sense to implement the
>> Clerezza APIs on top of the "Kiwi" Triple store. This would allow any
>> Clerezza based Application - including Stanbol - to use this Triple
>> store implementation.
>>
>> WDYT
>> Rupert
>>
>> On Tue, Jul 26, 2011 at 5:54 PM, Sebastian Schaffert
>> <[email protected]> wrote:
>>> Dear Florent,
>>>
>>> On 26.07.2011 at 16:46, florent andré wrote:
>>>>
>>>>>
>>>>> The dependency on Hibernate is mostly for the triple store, not for CMS
>>>>> capabilities. And this is something I don't see how to avoid in the near
>>>>> future because we need to store additional information about triples for
>>>>> reasoning and versioning.
>>>>>
>>>>> Versioning is also of triples, not of content. As such it is probably
>>>>> also interesting to the Stanbol community.
>>>>
>>>> I'm interested in a little explanation of the way you store version /
>>>> history of triples.
>>>
>>> We use a purely relational approach actually:
>>> - a table "KIWINODE" stores RDF nodes (unified table for literals, blank
>>> nodes and resources)
>>> - a table "TRIPLES" stores triples with id, subject, predicate, object,
>>> context, marker for deleted, marker for inferred, timestamp, creator
>>> (subject, predicate, object, context, creator are references to KIWINODE)
>>> - a table "VERSION" stores version ID, timestamp, creator
>>> - join tables "VERSION_ADDEDNODES", "VERSION_REMOVEDNODES",
>>> "VERSION_ADDEDTRIPLES", "VERSION_REMOVEDTRIPLES" store references to added
>>> and removed nodes and to added and removed triples; for deleted triples and
>>> nodes, the boolean marker will be set to true, for added nodes it will be
>>> false
>>>
>>> Versioning is thus a simple database operation. "Active" (undeleted)
>>> triples can be easily filtered using the boolean marker. Undoing simply
>>> means reversing the operations (add and remove) on triples and nodes.
>>>
>>>
>>>>
>>>> I am beginning to think about that (but just thinking for now :) ), and
>>>> the possible help of big tables (e.g. HBase) for this...
>>>>
>>>> HBase is a (kind of) 3-dimensional database:
>>>> - 1 is column
>>>> - 1 is row
>>>> - 1 is timestamp
>>>
>> I think there is currently a lot of work on how to handle graph
>> structures in this kind of data store. I am definitely interested in
>> this topic but currently I do not have the time to investigate it in
>> more detail.
>>
>>> I really don't see the point. A relational database is already
>>> n-dimensional ;-)
>>>
>>
>> As long as you can handle the amount of triples on a single machine, it
>> is for sure more efficient and easier to implement with a relational
>> database.
>> I think there is also a new TripleStore implementation around that
>> uses Solr/Lucene to store triples. Someone mentioned it in Paris,
>> but I have forgotten the name of the project.
>>
>>>
>>>>
>>>> So, for my 100 feet idea:
>>>> - each triple is a row
>>>> - ?s, ?p, ?o each a column (or a column family)
>>>>
>>>> And so, the history of each triple is stored in the 3rd dimension:
>>>> timestamps.
>>>>
>>>> This can lead to a really clean and easy design... if no strong
>>>> technical/integration restrictions come up...
>>>
>>> I am not really convinced, but maybe you can offer some more details and
>>> convince me. ;-) I am not familiar with these kinds of databases.
>>>
>>> My thought is that relational databases are really well suited for the task
>>> because this is what they have been designed for (triples are really purely
>>> relational data), with one (minor) exception: expensive join operations
>>> happen frequently when querying RDF, and there is almost no chance to
>>> materialize them in advance. This can be compensated a bit by proper
>>> indexing and configuration of the database, however.
>>>
>>
>> Yago2 uses a special n-triple model that includes subject, predicate,
>> object, temporal, spatial and full text. For spatial and full text
>> they use the corresponding extensions of the relational databases. That
>> way they can greatly reduce the number of joins for requests for
>> event-like data.
>>
>> Again this discussion is very related to the work of Fabian on the Factstore!
>>
>> best
>> Rupert
>>
>> --
>> | Rupert Westenthaler [email protected]
>> | Bodenlehenstraße 11 ++43-699-11108907
>> | A-5500 Bischofshofen
>
> Sebastian
> --
> | Dr. Sebastian Schaffert [email protected]
> | Salzburg Research Forschungsgesellschaft http://www.salzburgresearch.at
> | Head of Knowledge and Media Technologies Group +43 662 2288 423
> | Jakob-Haringer Strasse 5/II
> | A-5020 Salzburg
>

Sebastian
--
| Dr. Sebastian Schaffert [email protected]
| Salzburg Research Forschungsgesellschaft http://www.salzburgresearch.at
| Head of Knowledge and Media Technologies Group +43 662 2288 423
| Jakob-Haringer Strasse 5/II
| A-5020 Salzburg
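
PS: For anyone following the thread, the relational versioning scheme described in the quoted mail could look roughly like the following. This is only a minimal sketch, assuming SQLite through Python's sqlite3 module rather than the actual Hibernate mapping. The column names, the lowercase table names and the helper functions (add_node, commit_version, undo, active_triples) are made up for illustration, and the node-level join tables are left out to keep it short.

```python
import sqlite3

# Hypothetical, simplified schema: names and types are assumptions,
# not the real KiWi tables.
SCHEMA = """
CREATE TABLE kiwinode (
    id    INTEGER PRIMARY KEY,
    kind  TEXT NOT NULL,          -- 'uri', 'bnode' or 'literal'
    value TEXT NOT NULL
);
CREATE TABLE triples (
    id        INTEGER PRIMARY KEY,
    subject   INTEGER NOT NULL REFERENCES kiwinode(id),
    predicate INTEGER NOT NULL REFERENCES kiwinode(id),
    object    INTEGER NOT NULL REFERENCES kiwinode(id),
    context   INTEGER REFERENCES kiwinode(id),
    creator   INTEGER REFERENCES kiwinode(id),
    deleted   INTEGER NOT NULL DEFAULT 0,   -- boolean marker
    inferred  INTEGER NOT NULL DEFAULT 0,   -- boolean marker
    created   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE version (
    id      INTEGER PRIMARY KEY,
    created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    creator INTEGER REFERENCES kiwinode(id)
);
CREATE TABLE version_addedtriples   (version_id INTEGER, triple_id INTEGER);
CREATE TABLE version_removedtriples (version_id INTEGER, triple_id INTEGER);
"""


def add_node(conn, kind, value):
    """Insert an RDF node and return its id."""
    cur = conn.execute(
        "INSERT INTO kiwinode (kind, value) VALUES (?, ?)", (kind, value))
    return cur.lastrowid


def commit_version(conn, added=(), removed=()):
    """Record one version: insert new triples, flag removed ones as deleted."""
    version_id = conn.execute("INSERT INTO version DEFAULT VALUES").lastrowid
    for s, p, o in added:
        cur = conn.execute(
            "INSERT INTO triples (subject, predicate, object) VALUES (?, ?, ?)",
            (s, p, o))
        conn.execute("INSERT INTO version_addedtriples VALUES (?, ?)",
                     (version_id, cur.lastrowid))
    for triple_id in removed:
        conn.execute("UPDATE triples SET deleted = 1 WHERE id = ?",
                     (triple_id,))
        conn.execute("INSERT INTO version_removedtriples VALUES (?, ?)",
                     (version_id, triple_id))
    return version_id


def undo(conn, version_id):
    """Reverse a version: hide what it added, restore what it removed."""
    conn.execute(
        "UPDATE triples SET deleted = 1 WHERE id IN "
        "(SELECT triple_id FROM version_addedtriples WHERE version_id = ?)",
        (version_id,))
    conn.execute(
        "UPDATE triples SET deleted = 0 WHERE id IN "
        "(SELECT triple_id FROM version_removedtriples WHERE version_id = ?)",
        (version_id,))


def active_triples(conn):
    """Return the undeleted triples as (id, subject, predicate, object)."""
    return conn.execute(
        "SELECT id, subject, predicate, object FROM triples "
        "WHERE deleted = 0 ORDER BY id").fetchall()
```

With an in-memory database (sqlite3.connect(":memory:") followed by conn.executescript(SCHEMA)), replacing a triple is one commit_version call that adds the new triple and removes the old one, and undo on that version makes the old triple active again. Nothing is ever physically deleted, which is what makes the undo a plain UPDATE.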
