Re: Using Graph from another JVM -- A solution.

Claude Warren Wed, 01 Jan 2014 09:33:55 -0800

A quick review of netty + thrift would lead me to believe that converting
the RMI to netty+thrift would not be too difficult.



On Wed, Jan 1, 2014 at 12:22 PM, Andy Seaborne <[email protected]> wrote:

> On 01/01/14 07:43, Claude Warren wrote:
>
>> I thought some more about my requirements and I think that making Node
>> (and
>> Triple) Serializable would solve my problem.  I'll take a look at what it
>> would take to implement that.
>>
>
> If it's an interface (Jena3), then your alternative implementation
> approach *should* work; would even be a good test of the architecture.
> Won't it take a minimum of serialization methods in the top of the
> implementing class hierarchy?
>
> I don't know if Java8 default methods work with those serialization
> methods - I'd guess "no" as serialization is treated specially; the
> methods must be an exact signature and include "private" (IIRC+checking the
> javadoc).  RMI is one of those early Java technologies that hasn't changed
> much in ages.
>
> (If there is only one implementation, the JIT will treat all methods as final
> which helps optimization.)
>
> The RMI graph is a special case of a more general design where the graph
> (dataset) can be distributed across more than one machine.  I considered
> RMI for Lizard but rejected it because it's RPC, not streams.  RPC+small
> objects do not make for efficiency; RPC has latency issues for tightly
> coupled systems and any call is one copy in, one copy out at an absolute
> minimum. [All a blast from the past for me!]  I may use it for
> control but if the system is already using netty+thrift (current plan),
> adding RMI is duplication.
>
> New Year's Resolution - start Jena3!
>
>
>         Andy
>
>  On Mon, Dec 30, 2013 at 8:21 PM, Andy Seaborne <[email protected]> wrote:
>>
>>  On 30/12/13 19:39, Claude Warren wrote:
>>>
>>>  With a Node interface I can implement a serializable node that handles
>>>> all
>>>> the core node types.  It means I only have to convert the node once.
>>>>    Without an interface I have to convert to a serializable format and
>>>> then
>>>> convert back to "native" form.
>>>>
>>>>
>>> So it is a single class? Lucky it's only the concrete types! - there are
>>> subclasses of variable in at least two places.  Quite a bit of instanceof
>>> Node_RuleVariable in the reasoner code.
>>>
>>> I took a look at the code bases looking for other instanceof tests. I
> >>> found OWLDLProfile, OWLProfile and OWLLiteProfile that do instanceof tests
>>> when they should be doing .isXXX tests.  Changed.
>>>
>>> Other than that, there does not seem to be much code that makes use of
>>> the
>>> class hierarchy although my looking was not systematic.  Of course, other
>>> extension code may be doing so.
>>>
>>>
>>>          Andy
>>>
>>>   On Mon, Dec 30, 2013 at 7:24 PM, Andy Seaborne <[email protected]>
>>> wrote:
>>>
>>>>
>>>>   On 30/12/13 18:58, Claude Warren wrote:
>>>>
>>>>>
>>>>>   I did a quick Node (Interface) and NodeImpl implementation while
>>>>> working
>>>>>
>>>>>> on
>>>>>> the RMI code.  (It made some things easier) there was not much change
>>>>>> to
>>>>>> the code to put in an interface that has the current methods of Node.
>>>>>>  I
>>>>>> would like to move this into the current code base, but if we decide
>>>>>> not
>>>>>> to
>>>>>> do that I can work around it.
>>>>>>
>>>>>>
>>>>>>  This would be better done on a branch for discussion.  I'm -1 to just
>>>>> putting it into trunk.
>>>>>
>>>>> "not much change" needs a migration strategy because this is going to
>>>>> affect all modules, and it's not just the project's code either.
>>>>>
>>>>> What does it make easier?
>>>>>
>>>>>           Andy
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>  On Mon, Dec 30, 2013 at 6:23 PM, Andy Seaborne <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>    PS
>>>>>>
>>>>>>  http://mail-archives.apache.org/mod_mbox/jena-dev/201207.
>>>>>>> mbox/%[email protected]%3E
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 30/12/13 18:21, Andy Seaborne wrote:
>>>>>>>
>>>>>>>    On 30/12/13 16:28, Claude Warren wrote:
>>>>>>>
>>>>>>>
>>>>>>>>    For RMI I am only implementing a Graph.
>>>>>>>>
>>>>>>>>
>>>>>>>>> It may make sense to wrap model and dataset in order to achieve
>>>>>>>>> better
>>>>>>>>> performance (e.g. wrapping a TDB model/dataset may provide better
>>>>>>>>> performance than creating a model against multiple graphs on the
>>>>>>>>> client
>>>>>>>>> side), but for now it is just a Graph.
>>>>>>>>>
>>>>>>>>>
> >>>>>>>>>   Will it be more performant in any measurable way?
>>>>>>>>>
>>>>>>>>
>>>>>>>>     I did have to create a model wrapper for the Security code, but
>>>>>>>> that
>>>>>>>> is
>>>>>>>>
>>>>>>>>   another kettle of fish.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> My plan is to complete the RMI implementation - 90% or so complete
>>>>>>>>> now, and
>>>>>>>>> add security to it (so you can restrict RMI access to specific
>>>>>>>>> graphs
>>>>>>>>> etc).
>>>>>>>>>
>>>>>>>>> Is there any issue with turning on the UUID inside the
>>>>>>>>> NodeFactory? I
>>>>>>>>> see
>>>>>>>>> that there is code for this.
>>>>>>>>>
>>>>>>>>> I would also like to see Node changed to an interface -- but that
>>>>>>>>> is
>>>>>>>>> another discussion -- I think it will keep the core cleaner as
>>>>>>>>> things
>>>>>>>>> like
>>>>>>>>> Node_Null won't pollute it.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
> >>>>>>>>>  I agree about interfaces (that's why NodeFactory has gone in) but
>>>>>>>> the
>>>>>>>> detail of the exact contract needs to be clear.
>>>>>>>>
>>>>>>>> One argument for them is holding per-storage info in a Node impl but
> >>>>>>>> that is limited in a system like Jena where Node.equals is global
>>>>>>>> and
>>>>>>>> determined by RDF semantics.
>>>>>>>>
>>>>>>>> I'm looking to simplify Graph/Triple/Node, so get rid of AnonIds (a
> >>>>>>>> nuisance - they show up in the RDF API). And TripleMatch.  Some
> >>>>>>>> renaming
> >>>>>>>> to sane length method names.  Extension for graphs as nodes (nested
> >>>>>>>> graphs) and module-specific Nodes to reuse the storage (they never
>>>>>>>> leave a model - they help reuse things like "Triple" and "Graph" - I
>>>>>>>> found them useful in ARQ/TDB etc for example "this pattern slot is
>>>>>>>> defined").
>>>>>>>>
>>>>>>>> There is lots of potential flexibility that is not used and I think
>>>>>>>> we
>>>>>>>> know now that some of that is not of any use and it just confuses.
>>>>>>>>
>>>>>>>> By the way, abstract interface classes (i.e. all methods abstract)
>>>>>>>> are
>>>>>>>> reported as a bit faster than interfaces.
>>>>>>>>
>>>>>>>> The most important factor to me is that we do realistic steps so we
>>>>>>>> do
>>>>>>>> not get caught with an unresourceable transition from Jena2 to
>>>>>>>> Jena3.
>>>>>>>>   I
>>>>>>>> think we should only consider things that people will resource.
>>>>>>>>
>>>>>>>> Node_NULL is not used anywhere - @deprecate and delete!
>>>>>>>>
>>>>>>>> (Looks like it is left over from RDB days.)
>>>>>>>>
>>>>>>>>         Andy
>>>>>>>>
>>>>>>>> JENA-189
>>>>>>>>
>>>>>>>>
>>>>>>>>    Claude
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Dec 30, 2013 at 3:27 PM, Andy Seaborne <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>     On 29/12/13 20:40, Claude Warren wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>      The RMI simply exposes an existing graph implementation on a
>>>>>>>>>> remote
>>>>>>>>>>
>>>>>>>>>>   system.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The normal disclaimers apply but given the standard Jena
>>>>>>>>>>> configuration:
>>>>>>>>>>>
>>>>>>>>>>> NodeFactory.createAnon() uses UID to create an id that would be
>>>>>>>>>>> passed to
>>>>>>>>>>> the graph on the remote server where the anon would be recreated.
>>>>>>>>>>>
>>>>>>>>>>> The result is that both the client and the server have the same
>>>>>>>>>>> anon
>>>>>>>>>>> id
>>>>>>>>>>> for
>>>>>>>>>>> the blank node.
>>>>>>>>>>>
>>>>>>>>>>> Am I missing something?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    Only that UID are, strictly, only unique for the machine they
>>>>>>>>>>> are
>>>>>>>>>>>
>>>>>>>>>>>  allocated on.  RMI etc can pass them around but they only safely
>>>>>>>>>> identify
>>>>>>>>>> things on the same machine as their origin (they aren't long
>>>>>>>>>> enough
>>>>>>>>>> for
> >>>>>>>>>> wider uniqueness).  It's the UID user's responsibility not to
> >>>>>>>>>> present
> >>>>>>>>>> them
> >>>>>>>>>> on a non-origin machine.
>>>>>>>>>>
>>>>>>>>>> Ideally, Jena3, I'd like to use UUIDs, and then store only two
>>>>>>>>>> longs,
>>>>>>>>>> for
> >>>>>>>>>> blank nodes.  Then they are globally safe as well as being
>>>>>>>>>> smaller.
>>>>>>>>>>
>>>>>>>>>> Out of curiosity - why do you need to extend to Model?  Is there a
>>>>>>>>>> client-side implementation of graph and then it's just a case of
>>>>>>>>>> wrapping a
> >>>>>>>>>> Graph just like any other graph?  Or am I missing something?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Another issue in parsing is keeping label->bnode mapping.  Labels
>>>>>>>>>> must be
>>>>>>>>>> matched to any previous use in the parser run.
>>>>>>>>>>
>>>>>>>>>> The RIOT parsers do not use jena-core UID generation for bnode
>>>>>>>>>> ids.
>>>>>>>>>> If
>>>>>>>>>> it's a map of label to node allocated, there is a growing data
>>>>>>>>>> structure.
>>>>>>>>>>      Something that we occasionally get reports of being a
>>>>>>>>>> problem as
>>>>>>>>>> the map
>>>>>>>>>> grows for very large parser runs.
>>>>>>>>>>
>>>>>>>>>> Instead, RIOT allocates a large number (122 bits of random) and
>>>>>>>>>> xors
>>>>>>>>>> it
>>>>>>>>>> with the label.  So the internal id is calculated from the label
>>>>>>>>>> and
>>>>>>>>>> is
>>>>>>>>>> unique yet there is no growing data structure.
>>>>>>>>>>
>>>>>>>>>>             Andy
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     Claude
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Dec 29, 2013 at 7:43 PM, Andy Seaborne <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>      On 29/12/13 16:58, Claude Warren wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>       Greetings,
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>   I have an initial implementation of an RMI based Graph that
>>>>>>>>>>>>
>>>>>>>>>>>>> allows
>>>>>>>>>>>>> one
>>>>>>>>>>>>> JVM
>>>>>>>>>>>>> to access a graph in a different JVM.  I hope to extend this to
>>>>>>>>>>>>> the
>>>>>>>>>>>>> Model
>>>>>>>>>>>>> level in the near future.   I just wanted to know if anyone was
>>>>>>>>>>>>> interested
>>>>>>>>>>>>> in this project.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Claude
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>      The perennial question ...
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>   How do you treat blank nodes?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>              Andy
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
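
For the serialization point raised in the thread (a Node interface with a serializable implementation, and hooks that must be private with an exact signature), here is a minimal sketch. The names Node, SerNode and roundTrip are hypothetical stand-ins for illustration only; they are not Jena's API and not the RMI code discussed above. The sketch only shows where the hooks would live: on the implementing class, since an interface default method in Java 8 cannot be private.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializableNodeSketch {

    /** Hypothetical stand-in for a Node interface; not Jena's real API. */
    interface Node {
        String label();
        boolean isURI();
    }

    /** Hypothetical implementation that writes its own wire form. */
    static final class SerNode implements Node, Serializable {
        private static final long serialVersionUID = 1L;

        // transient: the private hooks below decide what goes on the wire
        private transient String label;
        private transient boolean uri;

        SerNode(String label, boolean uri) {
            this.label = label;
            this.uri = uri;
        }

        @Override public String label() { return label; }
        @Override public boolean isURI() { return uri; }

        // Java serialization looks for these exact private signatures.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeBoolean(uri);
            out.writeUTF(label);
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            uri = in.readBoolean();
            label = in.readUTF();
        }
    }

    /** Serialize to bytes and back, much as a remote call would marshal an argument. */
    static Node roundTrip(Node node) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(node);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape the node is converted once: the object that arrives on the other side already implements the interface, so there is no separate "convert back to native form" step.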

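The blank node id scheme described in the thread (a large random number per parser run, xor'ed with the label, so no label->bnode map is needed) could be sketched roughly as below. This is not the RIOT code. The class name LabelToBlankNodeId is hypothetical, and hashing the label to 128 bits with MD5 before the xor is an assumption made here so that labels of any length fit into two longs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

/** Hypothetical sketch: derive a blank node id from a label without a map. */
final class LabelToBlankNodeId {
    // Per-run seed, held as two longs (a random UUID carries 122 random bits).
    private final long seedHi;
    private final long seedLo;

    LabelToBlankNodeId() {
        this(UUID.randomUUID());
    }

    LabelToBlankNodeId(UUID seed) {
        this.seedHi = seed.getMostSignificantBits();
        this.seedLo = seed.getLeastSignificantBits();
    }

    /** Same label and same seed always give the same id; nothing is stored. */
    UUID idFor(String label) {
        byte[] digest = md5(label);              // assumption: hash label to 128 bits
        long hi = toLong(digest, 0) ^ seedHi;    // xor with the per-run seed
        long lo = toLong(digest, 8) ^ seedLo;
        return new UUID(hi, lo);                 // the id is just two longs
    }

    // Read 8 bytes, big-endian, starting at the given offset.
    private static long toLong(byte[] bytes, int offset) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (bytes[offset + i] & 0xFF);
        }
        return value;
    }

    private static byte[] md5(String text) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

A new instance per parser run gives a fresh seed, so the same label in two different runs maps to different ids, while repeated uses of a label within one run match without any growing data structure.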

-- 
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
