I'd love to see RDF* and SPARQL* support in Jena but that might be too much to ask.

On Sat, Jan 2, 2016 at 3:09 PM, Andy Seaborne <[email protected]> wrote:

> On 02/01/16 19:36, Paul Houle wrote:
>
>> :s [] [] is a lot like a relational entity, but I think the really
>> interesting thing about the RDF model is the ability to create
>> "post-relational" structures, even if it does involve blank nodes. The
>> future is more like JSON-LD or the nested columnar model.
>>
>> In that context an entity can be a little bit more than just :s [] [] but
>> could involve a hierarchical structure or ordered lists. In the case of
>> Freebase, for instance, you have the "mediator" or "CVT" nodes which form
>> a bipartite graph with respect to entity nodes so it is a straightforward
>> operation to cut out an entity and the CVTs around it.
>>
>> Lately I've been working on a framework which is a bit like the "boxes and
>> lines" products like Alteryx, KNIME, Actian -- those products are a dime a
>> dozen but they are all based on a tabular data model and this one is
>> passing small RDF graphs around, so it supports the nested columnar model,
>> logic, etc. Pipelines like that rapidly become unintuitive and
>> structurally unstable when joins get involved, particularly when they
>> involve "parts" of something that is a clear conceptual entity.
>>
>> Obviously this thing is configured by an RDF graph, because the point is
>> not that you draw a data processing pipeline but that one of these data
>> processing pipelines consumes schema information and a theory library to
>> build a graph that describes what will be done to the instances.
>>
>> So there is a MetaFactory that picks apart the graph into subgraphs, feeds
>> the subgraphs into the processing modules and then hooks them up in a
>> communications fabric.
>>
>> I don't yet have a single strategy for doing the "document extraction" but
>> I have two or three methods that between them seem to cover the cases that
>> actually come up.
>>
>> Following this line, it would be nice to be able to lock a whole structure
>> that looks like
>>
>> [
>>   a :Paper ;
>>   :authors ("Alpher" "Bethe" "Gamow") ;
>>   :publication [ :journal :PhysicalReview ; :year 1948 ]
>> ]
>>
>> I don't know how implementable such a thing is, but the problem of drawing
>> a line around a complex entity would be part of it.
>>
>
> I have always thought that we need a type of property that expresses
> "contains", or is part of an entity description, as well as datatype
> properties for relationships between top-level entities. They are a sort
> of generalization of object properties.
>
> Or maybe a richer set of literals to include maps and proper lists. c.f.
> Property graphs.
>
>         Andy
>
>> On Sat, Jan 2, 2016 at 1:08 PM, Andy Seaborne <[email protected]> wrote:
>>
>>> An SQL database row is an entity in the application data model. If you
>>> model a person, you have one row, but in RDF you have several triples.
>>> Triple level locking is analogous to cell level locking in SQL databases.
>>>
>>>         Andy
>>>
>>> On 02/01/16 17:01, Paul Houle wrote:
>>>
>>>> I think it is a worthwhile idea. Given that you are still having to get a
>>>> global lock to get a triple lock, isn't there still a scaling limit on the
>>>> global lock?
>>>>
>>>> I think a lot about the things that made the relational database approach
>>>> so successful and certainly one thing is that row-level locking
>>>> corresponds well to real-life access patterns.
>>>>
>>>> On Sat, Jan 2, 2016 at 9:18 AM, Claude Warren <[email protected]> wrote:
>>>>
>>>>> Currently most Jena implementations use a multiple read one write
>>>>> solution. However, I think that it is possible (with minimal work) to
>>>>> provide a solution that would allow for multiple writers by using lower
>>>>> level locks.
>>>>>
>>>>> I take inspiration from the Privileges code. That code allows privileges
>>>>> to be determined down to the triple level.
>>>>> Basically it does the following
>>>>> {noformat}
>>>>> start
>>>>> |
>>>>> v
>>>>> may user perform operation on graph? → (no) (restrict)
>>>>> |
>>>>> v
>>>>> (yes)
>>>>> may user perform operation on any triple in graph → (yes) (allow)
>>>>> |
>>>>> v
>>>>> (no)
>>>>> may user perform operation on the specific triple in graph → (yes) (allow)
>>>>> |
>>>>> v
>>>>> (no) (restrict)
>>>>> {noformat}
>>>>>
>>>>> My thought is that the locking may work much the same way. Once one
>>>>> thread has the objects locked, no other thread may lock those objects.
>>>>> The process would be something like:
>>>>>
>>>>> Graph locking would require exclusive lock or non-exclusive lock. If the
>>>>> entire graph were to be locked for writing (as in the current system)
>>>>> then the request would be for an exclusive write-lock on the graph. Once
>>>>> an exclusive write lock has been established no other write lock may be
>>>>> applied to the graph or any of its triples by any other thread.
>>>>>
>>>>> If a thread only wanted to lock part of the graph, for example all
>>>>> triples matching <u:foo ANY ANY>, the thread would first acquire a
>>>>> non-exclusive write lock on the graph. It would then acquire an exclusive
>>>>> write lock on all triples matching <u:foo ANY ANY>. Once that triple
>>>>> match lock was acquired no other thread would be able to lock any triple
>>>>> whose subject was u:foo.
>>>>>
>>>>> The lock request would need to contain the graph name and (in the case
>>>>> of a partial graph lock) a set of triple patterns to lock.
>>>>> The flow for the lock would be something like:
>>>>>
>>>>> {noformat}
>>>>> start
>>>>> |
>>>>> v
>>>>> does the thread hold an exclusive graph lock → (yes) (success)
>>>>> |
>>>>> v
>>>>> (no)
>>>>> does the thread want an exclusive graph lock → (yes) (go to ex graph lock)
>>>>> |
>>>>> v
>>>>> (no)
>>>>> does the thread hold a non-exclusive graph lock → (no) (go to nonex graph lock)
>>>>> |
>>>>> v
>>>>> (yes) (lbl: lock acquired)
>>>>> can the thread acquire all the triple locks → (yes) (success)
>>>>> |
>>>>> v
>>>>> (no) (failure)
>>>>>
>>>>>
>>>>> (lbl: nonex graph lock)
>>>>> does any thread hold an exclusive graph lock → (yes) (failure)
>>>>> |
>>>>> v
>>>>> (no)
>>>>> acquire non-exclusive graph lock
>>>>> (goto lock acquired)
>>>>>
>>>>>
>>>>> (lbl: ex graph lock)
>>>>> does any thread hold an exclusive graph lock → (yes) (failure)
>>>>> |
>>>>> v
>>>>> (no)
>>>>> does any thread hold a non-exclusive graph lock → (yes) (failure)
>>>>> |
>>>>> v
>>>>> (no)
>>>>> acquire exclusive graph lock
>>>>> (success)
>>>>>
>>>>> {noformat}
>>>>>
>>>>> The permissions system uses an abstract engine to determine if the user
>>>>> has access to the triples. For the locking mechanism the system needs to
>>>>> track graph locks and triple patterns locked. If a new request for a
>>>>> triple pattern matches any existing (already locked) pattern the lock
>>>>> request fails.
>>>>>
>>>>> The simple releaseLock() will release all locks the thread holds.
>>>>>
>>>>> Note that the locking system does not check the graph being locked to see
>>>>> if the items exist in the graph; it is simply tracking patterns of locks
>>>>> and determining if there are any conflicts between the patterns.
>>>>>
>>>>> Because this process can duplicate the current locking strategy it can be
>>>>> used as a drop-in replacement in the current code.
>>>>> So current code would continue to operate as it does currently but future
>>>>> development could be more sensitive to locking named graphs, and partial
>>>>> updates to provide multi-thread updates.
>>>>>
>>>>> Thoughts?
>>>>> Claude
>>>>>
>>>>> --
>>>>> I like: Like Like - The likeliest place on the web
>>>>> <http://like-like.xenei.com>
>>>>> LinkedIn: http://www.linkedin.com/in/claudewarren

--
Paul Houle
*Applying Schemas for Natural Language Processing, Distributed Systems, Classification and Text Mining and Data Lakes*
(607) 539 6254    paul.houle on Skype    [email protected]

:BaseKB -- Query Freebase Data With SPARQL
http://basekb.com/gold/

Legal Entity Identifier Lookup
https://legalentityidentifier.info/lei/lookup/ <http://legalentityidentifier.info/lei/lookup/>

Join our Data Lakes group on LinkedIn
https://www.linkedin.com/grp/home?gid=8267275
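
For anyone following the quoted proposal, here is a rough Python sketch of the bookkeeping it seems to describe: one exclusive graph lock, plus per-owner sets of triple patterns held under a non-exclusive graph lock. Nothing here is Jena API. `GraphLockTable`, `patterns_overlap`, `ANY` and the explicit `owner` argument (standing in for the calling thread) are all hypothetical names, and treating only *other* owners as blockers for the exclusive lock is an assumption on my part.

```python
import threading
from typing import Dict, Hashable, Iterable, Optional, Set, Tuple

ANY = None  # wildcard slot, as in <u:foo ANY ANY>
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def patterns_overlap(a: Pattern, b: Pattern) -> bool:
    """True if some triple could match both patterns.

    Two patterns conflict unless at least one slot holds two different
    concrete terms. The graph contents are never consulted.
    """
    return all(x is ANY or y is ANY or x == y for x, y in zip(a, b))


class GraphLockTable:
    """Tracks lock requests for a single graph (hypothetical sketch)."""

    def __init__(self) -> None:
        self._mutex = threading.Lock()  # guards the bookkeeping only
        self._exclusive_owner: Optional[Hashable] = None
        # owner -> patterns held under a non-exclusive graph lock
        self._patterns: Dict[Hashable, Set[Pattern]] = {}

    def lock_graph(self, owner: Hashable) -> bool:
        """Request the exclusive graph lock; returns False on conflict."""
        with self._mutex:
            if self._exclusive_owner == owner:
                return True
            if self._exclusive_owner is not None:
                return False
            # Assumption: only other owners' non-exclusive locks block us.
            if any(o != owner for o in self._patterns):
                return False
            self._exclusive_owner = owner
            return True

    def lock_patterns(self, owner: Hashable, patterns: Iterable[Pattern]) -> bool:
        """Request a non-exclusive graph lock plus the given triple patterns."""
        wanted = set(patterns)
        with self._mutex:
            if self._exclusive_owner == owner:
                return True  # already covers every triple
            if self._exclusive_owner is not None:
                return False
            for other, held in self._patterns.items():
                if other == owner:
                    continue
                if any(patterns_overlap(w, h) for w in wanted for h in held):
                    return False  # all-or-nothing: record nothing on failure
            self._patterns.setdefault(owner, set()).update(wanted)
            return True

    def release(self, owner: Hashable) -> None:
        """Release every lock the owner holds, like releaseLock() above."""
        with self._mutex:
            if self._exclusive_owner == owner:
                self._exclusive_owner = None
            self._patterns.pop(owner, None)
```

Requests fail immediately instead of blocking, which matches the (failure) exits in the flow. Note that the single internal mutex is still a global serialization point for acquiring locks, which is the scaling question raised earlier in the thread.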
