Re: Possible Jena 3 Module Structure

Claude Warren Sun, 20 Apr 2014 11:05:23 -0700

The security system does use the current implementations of Triple and Node
but it does have to convert them to SecTriple and SecNode.


I have several other projects in which I have to convert between the base
Node and Triple implementations and something else.  Mostly it is to do
serialization or compression.  The RMI implementation and compressed graph
implementations are two that come to mind.

Doing RMI required converting from Node to a serialized Node and then back
again.  I suppose changing Node from a class to an interface would mean
that the node factory implementation would have to do the conversion from
interface to concrete class where required.  But I wonder how much of the
current code would have to change to work with interfaces.  The various
places that create nodes could continue to do so as they do now since the
nodes and triples they create would be implementations of the Node/Triple
interface.
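
To make that concrete, here is a minimal sketch of what I have in mind.
Everything in it is hypothetical: the Node interface is cut down to a single
getLabel() method, and BaseNode, SerializableNode and NodeConverter are
illustrative names, not existing Jena classes.  It only shows the shape of
the idea: existing code keeps creating the usual implementation, and a
converter maps any Node implementation back to the concrete class where one
is required.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

// Hypothetical, cut-down Node interface (not the Jena API).
interface Node {
    String getLabel();
}

// The "usual" implementation that existing code would keep creating.
final class BaseNode implements Node {
    private final String label;

    BaseNode(String label) {
        this.label = Objects.requireNonNull(label);
    }

    @Override
    public String getLabel() {
        return label;
    }
}

// An alternative implementation used only for transport (e.g. RMI).
final class SerializableNode implements Node, Serializable {
    private static final long serialVersionUID = 1L;
    private final String label;

    SerializableNode(Node node) {
        this.label = node.getLabel();
    }

    @Override
    public String getLabel() {
        return label;
    }
}

// Converts between implementations where a concrete class is required.
final class NodeConverter {
    private NodeConverter() {
    }

    // Returns the node itself if it is already a BaseNode, else a copy.
    static BaseNode toBase(Node node) {
        return (node instanceof BaseNode) ? (BaseNode) node : new BaseNode(node.getLabel());
    }

    // Serializes the node to bytes and reads it back as a BaseNode.
    static BaseNode roundTrip(Node node) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new SerializableNode(node));
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return toBase((Node) in.readObject());
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Node round trip failed", e);
        }
    }
}
```

Code that only reads nodes would be written against Node and would not care
which implementation it is handed; only the few places that need the
concrete class would call something like NodeConverter.toBase().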

Perhaps I misunderstand what is meant by an "interface only" module.  What I
would like to see is the addition of Triple and Node interfaces.

Am I missing something?

Claude




On Sun, Apr 20, 2014 at 2:55 PM, Andy Seaborne <[email protected]> wrote:

> Having tried to think through how an interface only module would work, I'm
> not sure it's a good idea to depend on it for Jena3.  It's a major change
> and other things would need to wait on it.  We can proceed in steps to
> avoid a dependency here.
>
> Firstly - the "interfaces only" part is presumably Graph and DatasetGraph,
> not Triple, Quad, Node.
>
> The security model still uses Triple and Node as normal, doesn't it, Claude?
>  In Jena, Triple, Quad, Node objects can be created at will - the
> inference engine, the parsers and the query subsystem all create Nodes when
> needed.  To parametrize that, we'd need factories passed about, and in ARQ,
> a Node isn't "in" a graph, it can be in many or none and there are mixed
> datasets of different graph implementations.  That begins to feel like a
> lot of work so there had better be a lot of benefit.
>
> The other version of interface only is real Triple, Quad, Node (java
> interfaces with usual implementations in same module), and storage
> interfaces-only, Graph and DatasetGraph.  The size of the memory graph and
> DatasetGraph implementations isn't that large.
>
> There is quite a large hierarchy of DatasetGraphs to support different
> implementation styles - not sure if they'd be in the storage-interface only
> module or not.
>
> Maybe what we need to do is work to a "jena3-core" [*] and then as an
> experiment see how much work it is, and articulate the pros and cons of the
> approach in more detail.
>
> This takes it off the critical path for other jena3 items for now.
>
> [*] I'll try to write "jena3-" to distinguish the modules from the
> same-named jena2 modules.  No presumption that "jena3-*" is the final
> module name.
>
> More inline:
>
>
> On 10/04/14 23:48, Rob Vesse wrote:
>
>> Ok having seen your thoughts I am now leaning towards +1 on having an
>> interface only module
>>
>> I had thought that it might be necessary to split up the security
>> framework into multiple modules so thanks for confirming my suspicions.
>>
>> I’m assuming all the query engine stuff goes in the SPARQL module (Andy?),
>>
>
> Yes - that's my assumption for the moment.  Maybe an engine/API split.
>
>
>> I suppose in principle you could have separate interface/abstract class
>> and implementation modules for SPARQL but it becomes a question of quite
>> how many modules you want to have.
>>
>
> Yes - you can have too many modules!
>
> We can do a coarse grained split and see how it goes, with further
> splitting as experience and time suggest.
>
>
>> Though I suppose since we are likely to keep the apache-jena-libs modules
>> and just change which modules that pulls in, having a proliferation of
>> small modules is not necessarily too taxing on users.
>>
>
> The apache-jena-libs POM module should be the normal way to get the
> libraries.
>
> More ...
>
>
>> Rob
>>
>> On 10/04/2014 06:48, "Claude Warren" <[email protected]> wrote:
>>
>>  Comments Inline
>>>
>>>
>>> On Wed, Apr 9, 2014 at 10:49 PM, Rob Vesse <[email protected]> wrote:
>>>
>>>  Comments inline:
>>>>
>>>> On 08/04/2014 08:10, "Andy Seaborne" <[email protected]> wrote:
>>>>
>>>>  On 08/04/14 14:25, Rob Vesse wrote:
>>>>>
>>>>>>
>>>>>> In terms of specific collaboration opportunities I've heard a few
>>>>>> different ideas.  I spent a bunch of time talking with Lewis McGibbney
>>>>>> who's involved in Any23 about how Jena might make it easier for
>>>>>> projects to share common functionality like RDF parsers.  The current
>>>>>> module structures are something of a barrier in this regard since we
>>>>>> have multiple versions of some readers and they are quite closely
>>>>>> coupled into some aspects of our APIs.  Improving modularisation in
>>>>>> the future (as I think we hope to do in Jena 3x eventually) would make
>>>>>> things like this easier for people.
>>>>>>
>>>>>
>>>>> Agreed : we need something like
>>>>>
>>>>> IRI
>>>>> Non-RDF related common library (Atlas in ARQ currently)
>>>>> new core (graph API, datasetgraph)
>>>>> RIOT
>>>>> API
>>>>> SPARQL
>>>>> TDB (split into base, file, b+tree, main)
>>>>>
>>>>
>>>> This looks like the most sensible and concrete modularisation we’ve yet
>>>> come up with.  To clarify, are you thinking that the new core would be
>>>> the Node, Triple, Quad, Graph, Dataset APIs etc and then API would be
>>>> the Model and Ontology APIs?
>>>>
>>>> In which case +1000, that makes much more sense.
>>>>
>>>> Particularly then putting RIOT between the new core and the API so that
>>>> we don’t have two sets of readers and writers!
>>>>
>>>> I’d be interested to hear from Claude as to how the jena-security module
>>>> would fit in this, I guess it may need to be split into multiple modules
>>>> that build on each other and the other modules as appropriate.
>>>>
>>>> <claude>
>>>>
>>> The security module (which should probably be renamed permissions, but
>>> in any case) just needs interfaces to work most efficiently.  Currently
>>> it wraps Graph and Model and provides a custom query engine that places
>>> access checks in the middle of the SPARQL calls.
>>>
>>> Given that, under the new structure (as I understand it) security would
>>> probably have to be broken into 3 sub modules: security-core (graph
>>> stuff), security-api (model stuff) and security-sparql (query stuff).  Or
>>> perhaps I don't have the model/sparql separation correct in my head.
>>> Where does the query engine that is associated with a model go in the
>>> new structure?
>>> </claude>
>>>
>>
> I don't think we should be too prescriptive as to structure.  Firstly,
> because, given the work needed, there may be better/more important things
> to do, and also because I hope we get to a release sooner rather than later.
>
> That said, I have done an experimental split of TDB into
>
> -- core system and interfaces.
> jena-tdb-base
>
> -- The 2 abstractions of files as array of blocks
> -- and log of variable length byte blobs
> jena-tdb-file
>
> -- different implementations of index structures.
> jena-tdb-btree, jena-tdb-exthash
>
> -- TDB query engine and client API.
> jena-tdb-tdb
>
>
>
>
>>>  ...
>>>>>
>>>>> A "maybe" is a module that is just the interfaces for graph, dataset
>>>>> etc. and have modules build from that but it looks to me like the
>>>>> difference between new-core and interface+core+mem is quite small.
>>>>> Having the in-memory implementations around is necessary for internal
>>>>> working.
>>>>>
>>>>
>>>> I’m -0 on this
>>>>
>>>> <claude>Having interfaces in one place makes security easier.
>>>>
>>>
>>> It is also easier to use the contract testing framework to verify
>>> implementations of the interface without cluttering the test package
>>> with concrete test implementations.  This is a "make it easier for the
>>> integrator/developer" issue.
>>>
>>> +1 on this.
>>>
>>> </claude>
>>>
>>>> While it’s easy to do and relatively cheap in Java (as compared to .Net
>>>> where it is a PITA), and certainly the Sesame folks already take this
>>>> approach, I don’t see that there is much value provided.  How many
>>>> people actually run completely custom Node/Triple/Graph/etc stacks?
>>>>
>>>>
>>>>> And? Java8 so we can sort out iterators.
>>>>>
>>>>> Let's more actively discuss this.
>>>>>
>>>>
>>>> Sure though I think Java8 is maybe a whole other discussion.
>>>>
>>>> Another related discussion is moving to Git before we get started down
>>>> this route because doing this scale of refactoring would be horrible
>>>> within SVN.
>>>>
>>>
> A separate thread to plan the git repo layout.
>
>
>
>>>> Rob
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>> Claude
>>> --
>>> I like: Like Like - The likeliest place on the
>>> web<http://like-like.xenei.com>
>>> LinkedIn: http://www.linkedin.com/in/claudewarren
>>>
>>
>>
>>
>>
>>
>


-- 
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
