Hello,
My name is Bruno Paiva Lima da Silva, and I am a PhD student at LIRMM in
Montpellier, France, investigating graph-based conjunctive query
answering. A very quick description of my thesis can be found at
[ http://bplsilva.com/en/ ]; for more details, please see my
presentations at [ http://bplsilva.com/en/research/talks/ ].
The reason I am writing to you is that I need further information
about your storage system. In my work I aim to compare several
different storage systems for conjunctive query answering. To this end I
have implemented a common, abstract interface in Java that takes as
input the "logical" representation (as defined in First-Order Logic) of
a factual piece of knowledge. The formula is then sent to different
storage systems (e.g. DEX, HyperGraph, MySQL, Neo4J, OrientDB, Sqlite,
and others), one of them being your system, and the storage and querying
times are logged for further analysis and comparison.
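For illustration only, the setup described above could be sketched roughly as follows. The names AtomStore, InMemoryStore, StoreBenchmark and timeInsertion are hypothetical and are not the project's actual interface; the in-memory backend only stands in for a real storage system so that the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical common interface; the real abstract interface may look different.
interface AtomStore {
    void addAtom(String predicate, String subject, String object);
    int countAtoms();
}

// Trivial in-memory backend, used only to make the sketch runnable.
class InMemoryStore implements AtomStore {
    private final List<String[]> atoms = new ArrayList<String[]>();

    public void addAtom(String predicate, String subject, String object) {
        atoms.add(new String[] {predicate, subject, object});
    }

    public int countAtoms() {
        return atoms.size();
    }
}

class StoreBenchmark {
    // Inserts every (subject, predicate, object) triple into the store
    // and returns the elapsed wall-clock time in nanoseconds.
    static long timeInsertion(AtomStore store, List<String[]> triples) {
        long start = System.nanoTime();
        for (String[] t : triples) {
            store.addAtom(t[1], t[0], t[2]);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Synthetic triples standing in for the contents of an RDF file.
        List<String[]> triples = new ArrayList<String[]>();
        for (int i = 0; i < 10000; i++) {
            triples.add(new String[] {"s" + i, "knows", "o" + i});
        }
        AtomStore store = new InMemoryStore();
        long nanos = timeInsertion(store, triples);
        System.out.println(store.countAtoms() + " atoms stored in "
                + (nanos / 1000000L) + " ms");
    }
}
```

Each real storage system would get its own implementation of the interface, so that the same timing loop can be reused unchanged across systems.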
I am testing this architecture by means of RDF files of increasing
size, from 10k triples up to 5M triples (and more).
The main reason for this e-mail, besides letting you know that I am
using your system in a research context, is to check whether I am using
it correctly for our purpose, so that the results I obtain when I run my
tests are valid.
I am particularly interested in knowing whether I am using the best
solution with Neo4J in the following functions of ours. For each one,
you will find below the way we perform it as of today.
- (1) Creating a new graph
    public Neo4jGraph(String s) throws Exception {
        super(s);
        directory = "alaska-data/neo4j/" + s;
        graph = new EmbeddedGraphDatabase(directory);
    }
- (2) Adding a new node to the graph
    public long addTerm(Object label) {
        Map<String,Object> properties = new HashMap<String,Object>();
        properties.put("label", label.toString());
        long newNode = inserter.createNode(properties);
        batchIndex.add(newNode, properties);
        batchIndex.flush();
        return newNode;
    }
- (3) Adding a new edge to the graph
    public void addAtom(Object predicateLabel, ArrayList<Object> termObjects) throws Exception {
        Long n1 = getNodeByLabel(termObjects.get(0));
        Long n2 = getNodeByLabel(termObjects.get(1));
        if (n1 == null) { n1 = addTerm(termObjects.get(0)); }
        if (n2 == null) { n2 = addTerm(termObjects.get(1)); }
        // Compare the ids by value: != on boxed Long compares references.
        if (!n1.equals(n2)) {
            inserter.createRelationship(n1, n2,
                DynamicRelationshipType.withName(predicateLabel.toString()), null);
        }
    }
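Since the node ids in (3) are held as boxed Long values, it may be worth spelling out how such values compare. The following is a minimal, self-contained sketch; the class BoxedIdComparison and the helper sameNode are hypothetical names used only for illustration.

```java
class BoxedIdComparison {
    // Compares two possibly-null boxed ids by value rather than by reference.
    static boolean sameNode(Long a, Long b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        Long n1 = Long.valueOf(1000L);
        Long n2 = Long.valueOf(1000L);
        // == and != on Long objects compare references. Only small values
        // (-128 to 127) are guaranteed to be cached, so for larger ids this
        // result depends on the JVM and should not be relied upon.
        System.out.println("reference comparison says different: " + (n1 != n2));
        // equals compares the numeric values.
        System.out.println("value comparison says same: " + sameNode(n1, n2));
    }
}
```

With value comparison, a self-loop check such as the one in (3) behaves the same way for small and large node ids.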
- (4) Retrieving all the nodes of the graph
    public ArrayList<ITerm> getTerms() throws Exception {
        ArrayList<ITerm> terms = new ArrayList<ITerm>();
        for (Node n : graph.getAllNodes()) {
            if (n.getId() != 0) {
                Term newTerm = new Term(n.getProperty("label"));
                terms.add(newTerm);
            }
        }
        return terms;
    }
- (5) Retrieving all the edges of the graph
    public ArrayList<IAtom> getAtoms() throws Exception {
        ArrayList<IAtom> atomsToReturn = new ArrayList<IAtom>();
        for (Node n : graph.getAllNodes()) {
            Iterable<Relationship> rel = n.getRelationships(Direction.OUTGOING);
            for (Relationship r : rel) {
                // Keep only the text between '[' and the final character.
                String predFullName = r.getType().toString();
                int predStart = predFullName.indexOf("[") + 1;
                String predName = predFullName.substring(predStart, predFullName.length() - 1);
                Predicate pred = new Predicate(predName, 2);
                ArrayList<ITerm> atomTerms = new ArrayList<ITerm>();
                ITerm nt1 = new Term(r.getStartNode().getProperty("label"));
                ITerm nt2 = new Term(r.getEndNode().getProperty("label"));
                atomTerms.add(nt1);
                atomTerms.add(nt2);
                IAtom newAtom = new Atom(pred, atomTerms);
                atomsToReturn.add(newAtom);
            }
        }
        return atomsToReturn;
    }
- (6) Retrieving a node by its label
    public Long getNodeByLabel(Object label) {
        IndexHits<Long> hits = batchIndex.get("label", label.toString());
        if (hits.size() == 0) { return null; }
        else { return hits.getSingle(); }
    }
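As a point of comparison for (6), one possible alternative during insertion is to keep the label-to-id mapping in an in-memory map instead of querying the index for every term. This is only a sketch under assumptions: LabelCache and getOrAddTerm are hypothetical names, the counter stands in for the id the storage system would return, and whether the map fits in memory for the largest files has not been checked.

```java
import java.util.HashMap;
import java.util.Map;

class LabelCache {
    private final Map<String, Long> idsByLabel = new HashMap<String, Long>();
    // Stand-in for the id that the storage system would hand back.
    private long nextId = 1;

    // Returns the id recorded for the label, or null if it was never added.
    Long getNodeByLabel(Object label) {
        return idsByLabel.get(label.toString());
    }

    // Returns the existing id for the label, or records a new one.
    long getOrAddTerm(Object label) {
        String key = label.toString();
        Long id = idsByLabel.get(key);
        if (id == null) {
            // In the real code this is where the node would be created.
            id = Long.valueOf(nextId++);
            idsByLabel.put(key, id);
        }
        return id.longValue();
    }

    public static void main(String[] args) {
        LabelCache cache = new LabelCache();
        long first = cache.getOrAddTerm("alice");
        long again = cache.getOrAddTerm("alice");
        System.out.println("same id on second lookup: " + (first == again));
    }
}
```

The trade-off is memory: every distinct label is held on the heap for the duration of the load, in exchange for lookups that do not touch the index.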
- (7) Identifying whether there is an edge between two nodes or not
    public boolean areConnected(ITerm t1, ITerm t2, Predicate p, int pos) throws Exception {
        Direction dir;
        if (pos == 0) { dir = Direction.OUTGOING; }
        else { dir = Direction.INCOMING; }
        Long l = getNodeByLabel(t1.getLabel().toString());
        // An unknown label cannot be connected (also avoids unboxing null).
        if (l == null) { return false; }
        Node n = graph.getNodeById(l);
        Iterable<Relationship> rel = n.getRelationships(dir);
        for (Relationship r : rel) {
            String predFullName = r.getType().toString();
            int predStart = predFullName.indexOf("[") + 1;
            String predName = predFullName.substring(predStart, predFullName.length() - 1);
            if (predName.equals(p.getLabelToString())) {
                if (dir == Direction.OUTGOING) {
                    String otherTerm = r.getEndNode().getProperty("label").toString();
                    if (otherTerm.equals(t2.getLabel().toString())) { return true; }
                }
                else {
                    String otherTerm = r.getStartNode().getProperty("label").toString();
                    if (otherTerm.equals(t2.getLabel().toString())) { return true; }
                }
            }
        }
        return false;
    }
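Functions (5) and (7) both recover the predicate name by cutting it out of the string form of the relationship type. The sketch below isolates that step so its behaviour can be checked on its own; PredicateNames and extract are hypothetical names, and the sample input "DynamicRelationshipType[knows]" is an assumption about what the string form looks like.

```java
class PredicateNames {
    // Returns the text between the first '[' and the final character,
    // mirroring the substring logic used in (5) and (7).
    static String extract(String predFullName) {
        int predStart = predFullName.indexOf("[") + 1;
        return predFullName.substring(predStart, predFullName.length() - 1);
    }

    public static void main(String[] args) {
        // Assumed string form of a relationship type named "knows".
        System.out.println(extract("DynamicRelationshipType[knows]"));
        // Without brackets, indexOf returns -1, so only the last
        // character is dropped.
        System.out.println(extract("knows"));
    }
}
```

The second call shows that the extraction silently truncates the name if the string form ever lacks the brackets, which is worth keeping in mind when the result is compared against a predicate label.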
With this in mind, I would be grateful if you could read the small
pieces of code included in this e-mail and tell me whether there is a
way to improve them, mainly in terms of the number of operations,
execution time and memory usage.
Also, do not hesitate to contact me if you are interested in the results
obtained.
Thank you,
Bruno Paiva Lima da Silva
PhD Student
GraphIK Research Team
LIRMM - Montpellier, France
_______________________________________________
Neo4j mailing list
https://lists.neo4j.org/mailman/listinfo/user