[jira] Commented: (CLEREZZA-395) bnodes mapping in JenaGraphAdaptor should not keep growing with every parsing of rdf files

Rupert Westenthaler (JIRA) Mon, 17 Jan 2011 06:38:13 -0800

    [ 
https://issues.apache.org/jira/browse/CLEREZZA-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982653#action_12982653
 ]


Rupert Westenthaler commented on CLEREZZA-395:
----------------------------------------------

Hi Reto, all

What do you mean by "store" and "transfer"?
 (1) persistent storage (e.g. Jena TDB) and export (e.g. RDF/XML serialization), 
or also
 (2) storage and CRUD operations while working with several MGraphs on API 
level.
I completely agree with (1), but I am unsure about (2): I understand the 
potential danger, but I have also relied on this kind of thing a lot over the 
past years (e.g. by using 
http://www.openrdf.org/doc/sesame2/api/index.html?org/openrdf/repository/util/RDFInserter.html
 which preserves BNode IDs).

Let me point out that the operations described in (2) are possible with the 
current implementation.
Here is a small example of what I am referring to (written here in the text 
editor, so no guarantee that it would compile):

MGraph graph1 = new SimpleMGraph();
MGraph graph2 = new SimpleMGraph();
//I think it should even work with a Jena Graph because of the BidiMap 
//providing mappings for BNodes

//Because a BNode can be created without a Graph, there is no such thing as 
//the context of a BNode
BNode rupertInfo = new BNode();
BNode retoInfo = new BNode();
UriRef name = new UriRef(FOAF+"name");
UriRef knows = new UriRef(FOAF+"knows");

//add operations do not create new instances of BNode ... so there is still no 
//context
graph1.add(new TripleImpl(rupertInfo, name, new PlainLiteralImpl("Rupert Westenthaler")));
graph1.add(new TripleImpl(rupertInfo, knows, retoInfo));

graph2.add(new TripleImpl(retoInfo, name, new PlainLiteralImpl("Reto Bachmann-Gmur")));
//"rupertInfo" is now in two graphs (2 contexts?)
graph2.add(new TripleImpl(retoInfo, knows, rupertInfo));

//So now let's have some fun with the BNodes
//search for all knows in graph1 -> OK (because within the same context)
Iterator<Triple> rupertsFriends = graph1.filter(rupertInfo, knows, null);
//query for all information of the results (BNodes) in graph2 -> NOT OK?!
while(rupertsFriends.hasNext()){
  //filter(..) expects a NonLiteral as subject, hence the cast
  NonLiteral friendBNode = (NonLiteral) rupertsFriends.next().getObject();
  Iterator<Triple> friendInfos = graph2.filter(friendBNode, null, null);
  //add them to the BNode in graph1 -> NOT OK?!
  while(friendInfos.hasNext()){
    //OK this would not work because it changes graph1 within the iteration 
    //over rupertsFriends, but it shows the principle
    graph1.add(friendInfos.next());
  }
}

This works because
 - BNode does not override equals, and the equals implementation of 
java.lang.Object compares references
 - one instance of a BNode is shared between the two Graphs
 - the performAdd method (at least the one in SimpleTripleCollection) does not 
create new instances for added BNodes
So if the goal is to completely avoid sharing BNodes between Graph instances, 
one would need to change the current implementation.
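
The three points above can be reproduced without Clerezza. The following is 
only a minimal sketch using stand-in classes: BNode, Triple and the Set-based 
"graphs" here are hypothetical simplifications, not the real Clerezza types. It 
shows that a lookup in a second graph succeeds only when the very same BNode 
instance is reused.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class BNodeSharingSketch {

    // Stand-in blank node: no equals/hashCode override, so identity is by reference.
    static final class BNode { }

    // Stand-in triple with value-based equality over its three parts.
    static final class Triple {
        final Object subject;
        final String predicate;
        final Object object;

        Triple(Object subject, String predicate, Object object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Triple)) {
                return false;
            }
            Triple t = (Triple) other;
            return subject.equals(t.subject)
                    && predicate.equals(t.predicate)
                    && object.equals(t.object);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, predicate, object);
        }
    }

    // Returns all triples of the given graph whose subject equals the given node.
    static Set<Triple> filterBySubject(Set<Triple> graph, Object subject) {
        Set<Triple> result = new HashSet<>();
        for (Triple t : graph) {
            if (t.subject.equals(subject)) {
                result.add(t);
            }
        }
        return result;
    }

    // The same BNode instance is added to both graphs, so the lookup in graph2 matches.
    public static int sharedInstanceMatches() {
        Set<Triple> graph1 = new HashSet<>();
        Set<Triple> graph2 = new HashSet<>();
        BNode rupertInfo = new BNode();
        BNode retoInfo = new BNode();
        graph1.add(new Triple(rupertInfo, "knows", retoInfo));
        graph2.add(new Triple(retoInfo, "name", "Reto Bachmann-Gmur"));
        Object friend = graph1.iterator().next().object;
        return filterBySubject(graph2, friend).size();
    }

    // A fresh BNode instance never equals the one stored in the graph.
    public static int freshInstanceMatches() {
        Set<Triple> graph2 = new HashSet<>();
        graph2.add(new Triple(new BNode(), "name", "Reto Bachmann-Gmur"));
        return filterBySubject(graph2, new BNode()).size();
    }

    public static void main(String[] args) {
        System.out.println("shared instance matches: " + sharedInstanceMatches());
        System.out.println("fresh instance matches: " + freshInstanceMatches());
    }
}
```

If add operations copied BNodes instead of storing the passed instance, the 
first lookup would behave like the second one and find nothing.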

In conclusion I would like to point out that adding an ID to BNode - as 
suggested in my first comment - would not change anything from a technical 
perspective. However, I clearly understand that adding a constructor like 
BNode(String bNodeID) to the public API would encourage wrong usage of BNodes 
by users, which might cause a lot of trouble if they are not aware of the 
consequences.
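
To illustrate the kind of wrong usage meant here, the following sketch assumes 
a hypothetical LabelledBNode whose equals and hashCode are based on the ID 
passed to the constructor. Neither the class nor this equality contract is part 
of Clerezza; it is only one way such a constructor could end up being used. 
Two nodes created independently with the same label would then count as the 
same node.

```java
import java.util.Objects;

public class LabelledBNodeSketch {

    // Hypothetical blank node that carries a label and compares by it.
    static final class LabelledBNode {
        final String id;

        LabelledBNode(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof LabelledBNode
                    && id.equals(((LabelledBNode) other).id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    // Two unrelated sources that both happen to use the label "b0".
    public static boolean labelsCollide() {
        LabelledBNode fromFirstFile = new LabelledBNode("b0");
        LabelledBNode fromSecondFile = new LabelledBNode("b0");
        return fromFirstFile.equals(fromSecondFile);
    }

    public static void main(String[] args) {
        System.out.println("same label treated as same node: " + labelsCollide());
    }
}
```

With reference-based equality, as in the current BNode, the two nodes above 
would stay distinct.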

best
Rupert Westenthaler 

> bnodes mapping in JenaGraphAdaptor should not keep growing with every parsing 
> of rdf files
> ------------------------------------------------------------------------------------------
>
>                 Key: CLEREZZA-395
>                 URL: https://issues.apache.org/jira/browse/CLEREZZA-395
>             Project: Clerezza
>          Issue Type: Improvement
>            Reporter: Hasan
>            Assignee: Hasan
>
> With every parsing of rdf files free memory is getting less.
> The problem seems to lie in the JenaGraphAdaptor class
> It has a member:
> final BidiMap<BNode, Node> tria2JenaBNodes = new BidiMapImpl<BNode, Node>();
> which grows each time a serialized graph get parsed.
> My experiments with my test data show
> At the end of the 1st parsing: Size of tria2JenaBNodes = 87200
> At the end of the 2nd parsing: Size of tria2JenaBNodes = 130800
> At the end of the 3rd parsing: Size of tria2JenaBNodes = 174400

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
