Re: [Neo4j] Function to check whether two nodes are connected?
Easy: just one.

For now, I've written this, but I'm still not sure it is the simplest way to write it:

    public boolean areConnected(Node n1, Node n2, Relationship rel, Direction dir) throws Exception {
        Iterable<Relationship> relationships = n1.getRelationships(dir);
        for (Relationship r : relationships) {
            // I am only working with dynamic relationships
            if (r.getType().equals(rel.getType())) {
                if (dir == Direction.OUTGOING) {
                    if (r.getEndNode().equals(n2)) {
                        return true;
                    }
                } else {
                    if (r.getStartNode().equals(n2)) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

Bruno

On 27/10/2011 18:31, Peter Neubauer wrote:
> Bruno,
> There is no such function at the low level, but you can use a shortest-path algo to
> check this. What is the maximum length for a path between the nodes?

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
[Neo4j] Function to check whether two nodes are connected?
Hello there!

First of all, thanks for the help with all my previous questions; the answers have helped me use Neo4j successfully.

I have a very simple question, but I haven't found the answer yet.

I'd like to have a function whose signature would be more or less like this:

    public boolean areTheyConnected(Node n1, Node n2, Relationship rel, Direction dir)

which returns true iff there is an edge of type *rel* between *n1* and *n2* in the *dir* direction (the direction is relative to n1).

Example:

In my graph, I have: "Bob knows Tom, Tom knows Peter, Jack knows Tom"

    areTheyConnected(nodeBob, nodeTom, relKnows, Direction.OUTGOING) returns true (Bob knows Tom)
    areTheyConnected(nodeTom, nodeJack, relKnows, Direction.INCOMING) also returns true (Jack knows Tom)
    areTheyConnected(nodeBob, nodeTom, relKnows, Direction.INCOMING) returns false (Tom doesn't know Bob)

Is there an easy method (constant time, or close) for that?

Thank you very much,
Bruno
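To make the requested semantics concrete, here is a minimal sketch in plain Java. It is not the Neo4j API: `EdgeIndex`, `Dir` and the string node names are hypothetical stand-ins, and the sketch assumes every edge can be mirrored in an in-memory hash set keyed by (source, type, target). Under that assumption the check is a single hash lookup, which is the "constant time, or close" behaviour asked for, at the price of extra memory.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory edge index; it only illustrates the semantics
// of the question and does not use any Neo4j class.
final class EdgeIndex {
    enum Dir { OUTGOING, INCOMING }

    // Each edge is stored once as the triple [source, type, target].
    private final Set<List<String>> edges = new HashSet<>();

    void addEdge(String from, String type, String to) {
        edges.add(Arrays.asList(from, type, to));
    }

    // The direction is read relative to n1, as in the question.
    boolean areTheyConnected(String n1, String n2, String type, Dir dir) {
        List<String> wanted = (dir == Dir.OUTGOING)
                ? Arrays.asList(n1, type, n2)
                : Arrays.asList(n2, type, n1);
        return edges.contains(wanted);
    }

    public static void main(String[] args) {
        EdgeIndex g = new EdgeIndex();
        g.addEdge("Bob", "knows", "Tom");
        g.addEdge("Tom", "knows", "Peter");
        g.addEdge("Jack", "knows", "Tom");

        System.out.println(g.areTheyConnected("Bob", "Tom", "knows", Dir.OUTGOING));  // true
        System.out.println(g.areTheyConnected("Tom", "Jack", "knows", Dir.INCOMING)); // true
        System.out.println(g.areTheyConnected("Bob", "Tom", "knows", Dir.INCOMING));  // false
    }
}
```

The three calls in `main` reproduce the three examples from the message.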
Re: [Neo4j] Loading RDF data questions
Thanks for the quick answer, Peter.

I don't know if you remember my talk in Hannover, but for my PhD thesis project, my research team and I translate all the RDF data we have as input into first-order logic (basically to maintain semantic equivalence with the Datalog and Conceptual Graphs families).

That said, we don't try to insert an RDF file directly into Neo4j, but rather Java objects representing the RDF files. (By the way, we also use Sail in order to compare the efficiency and effectiveness of GDBs against RDBs and TSs for our problem.)

These objects aren't very complicated, though. For now, we just encapsulate Strings containing the subject, predicate and object names.

That's why I asked the question this morning: after parsing the RDF with Jena, I obtain a big list of atoms (in FOL, an atom represents an edge in a graph), which I try to store using the method I described earlier.

I see people on the mailing list working with very big datasets, and I wonder what is going wrong, since we haven't got further than 200k triples (which is not big at all) using our methods.

Bruno

On 06/10/2011 12:17, Peter Neubauer wrote:
> Bruno,
>
> RDF support is provided via Josh Shinavier's SAIL implementation on
> top of Neo4j already.
>
> Look at the SPARQL-plugin-in-the-making,
> https://github.com/peterneubauer/sparql-plugin/blob/master/src/test/java/org/neo4j/server/plugin/sparql/BerlinDatasetTest.java
> for how to load a file into Neo4j as an RDF store, and how to query
> it. This is using a subset of the Berlin RDF dataset and queries,
> http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/ExploreUseCase/index.html,
> for instance.
>
> Does that help? I hope to get this into shape very soon, so you can
> use the Neo4j Server with the SPARQL plugin in order to load and query
> RDF and essentially turn the Neo4j Server into a Triple Store.
>
> Cheers,
>
> /peter neubauer
[Neo4j] Loading RDF data questions
Hello,

I'm writing to ask whether I am using Neo4j correctly for loading and storing RDF datasets. So far my performance results have been quite bad. However, it seems to me that I haven't properly understood how to use the BatchInserter for what I want to do.

I have RDF datasets that range from 1K to 20M triples, and I want to store them in an empty Neo4j graph.

The method I use for the insertion is the following:

- For each triple of my RDF data:
-- Check if there is a subject node in the graph. If yes, find it; if not, create it.
-- Check if there is an object node in the graph. If yes, find it; if not, create it.
-- Create an edge with a label "predicate" between subject and object.

This method is quite simple and generic, but it also carries quite a big problem: it spends more time reading and searching than inserting.

Having profiled its execution, I found that it spends almost 90% of the time checking whether a given node exists.

So far, I have tried using Neo4j with simple transactions, then switched to BatchInserter + LuceneIndex, but I still think there is room to improve my program.

That said, my questions are:
- Can anyone tell me, knowing how Neo4j works, how to improve my insertion process, or whether there is a better solution?
- Are there any big errors in my code? It's not yet very well documented, but it is available here:
  https://bitbucket.org/bplsilva/alaska-project/src/e7fdf2e9341b/src/fr/lirmm/graphik/alaska/impl/graph/neo4j/Neo4jFact.java

Thank you very much,

--
*PAIVA LIMA DA SILVA Bruno*
PhD Student in Informatics @ Univ. Montpellier 2
[ GraphIK Research Team: LIRMM, Montpellier (France) ]
Website: http://bplsilva.com
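Since the profile in the message above points at the existence checks, one option worth trying is to answer them from an in-memory map instead of searching the store for every triple. The following is only a sketch of that "find it or create it" loop, assuming the set of distinct labels fits in the JVM heap. `TripleLoader` and its lists are hypothetical stand-ins for a real inserter and are not Neo4j classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical loader: the lists below stand in for a real node and
// relationship store; only the lookup strategy is the point here.
final class TripleLoader {
    // label -> node id, consulted before anything is created
    private final Map<String, Long> idByLabel = new HashMap<>();
    final List<String> nodes = new ArrayList<>();     // node id = position in this list
    final List<long[]> edges = new ArrayList<>();     // {subjectId, objectId}
    final List<String> edgeTypes = new ArrayList<>(); // predicate of the edge at the same position

    // Returns the id for a label, creating the node only the first time it is seen.
    long getOrCreate(String label) {
        Long id = idByLabel.get(label);
        if (id == null) {
            id = (long) nodes.size();
            nodes.add(label);
            idByLabel.put(label, id);
        }
        return id;
    }

    // One triple becomes at most two new nodes and exactly one edge.
    void addTriple(String subject, String predicate, String object) {
        long s = getOrCreate(subject);
        long o = getOrCreate(object);
        edges.add(new long[] { s, o });
        edgeTypes.add(predicate);
    }

    public static void main(String[] args) {
        TripleLoader loader = new TripleLoader();
        loader.addTriple("Bob", "knows", "Tom");
        loader.addTriple("Tom", "knows", "Peter");
        loader.addTriple("Jack", "knows", "Tom");
        System.out.println(loader.nodes.size()); // 4
        System.out.println(loader.edges.size()); // 3
    }
}
```

With this shape, each existence check costs one hash lookup, and the persistent index only needs to be written to, not queried, during the load. Whether that is enough for 20M triples depends on how many distinct labels there are and how much heap is available.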
[Neo4j] Graph homomorphism vs. Graph traversal?
Hello all,

I was thinking about the differences between graph homomorphism and graph traversal operations. Can anyone help me with that?

I know that graph homomorphism takes a data graph and a request graph as input. It is also easy to transform a graph traversal query into a graph and then compute homomorphisms.

On the other hand, the request graph in the homomorphism operation may not be a connected graph.

Can we say that the result of a homomorphism with a disconnected request graph as input is the same as the union of the results obtained by applying a graph traversal query to each of the connected components of the request?

Thanks in advance,
PAIVA LIMA DA SILVA Bruno
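One way to probe the last question is to count homomorphisms by brute force on a toy graph. Under the usual definition (a map from request vertices to data vertices that sends every request edge onto a data edge), vertices in different connected components are assigned independently, so each answer to the whole request is a combination of one answer per component. The result therefore looks like a Cartesian product of the per-component results rather than their union. The sketch below only checks the counts; `HomCount` and the integer vertex encoding are assumptions made for the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Brute-force homomorphism counter for small directed graphs.
final class HomCount {
    // Counts maps from request vertices 0..k-1 to data vertices 0..n-1
    // that send every request edge onto some data edge.
    static long count(int k, int[][] requestEdges, int n, int[][] dataEdges) {
        Set<List<Integer>> data = new HashSet<>();
        for (int[] e : dataEdges) {
            data.add(Arrays.asList(e[0], e[1]));
        }
        return extend(0, new int[k], requestEdges, n, data);
    }

    // Tries every data vertex as the image of request vertex `var`.
    private static long extend(int var, int[] image, int[][] requestEdges,
                               int n, Set<List<Integer>> data) {
        if (var == image.length) {
            for (int[] e : requestEdges) {
                if (!data.contains(Arrays.asList(image[e[0]], image[e[1]]))) {
                    return 0;
                }
            }
            return 1;
        }
        long total = 0;
        for (int v = 0; v < n; v++) {
            image[var] = v;
            total += extend(var + 1, image, requestEdges, n, data);
        }
        return total;
    }

    public static void main(String[] args) {
        // Data graph: 0 -> 1, 1 -> 2, 3 -> 1
        int[][] data = { {0, 1}, {1, 2}, {3, 1} };
        int[][] edge = { {0, 1} };                  // component A: x -> y
        int[][] path = { {0, 1}, {1, 2} };          // component B: x -> y -> z
        int[][] both = { {0, 1}, {2, 3}, {3, 4} };  // disjoint union of A and B

        System.out.println(count(2, edge, 4, data)); // 3
        System.out.println(count(3, path, 4, data)); // 2
        System.out.println(count(5, both, 4, data)); // 6
    }
}
```

The disconnected request has 3 × 2 = 6 answers, one per pair of component answers, whereas a union of the two result sets would contain 5 tuples of different arities.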
[Neo4j] Further information about Neo4J
Hello,

My name is Bruno Paiva Lima da Silva, and I am a PhD student at LIRMM in Montpellier, France, investigating graph-based conjunctive query answering. A very quick description of my thesis can be found at [ http://bplsilva.com/en/ ]; for more details, please see my presentations at [ http://bplsilva.com/en/research/talks/ ].

The reason I am writing to you is that I need further information regarding your storage system. In my work I aim to compare several different storage systems for conjunctive querying. To this end I have implemented a common, abstract interface in Java that uses the "logical" representation (as defined in first-order logic) of a factual piece of knowledge. The formula is then sent to different storage systems (e.g. DEX, HyperGraph, MySQL, Neo4j, OrientDB, SQLite, and others), one of them being your system, and the storage and querying times are then logged for further analysis and comparison.

I am testing this architecture by means of an incrementally larger RDF file, which can go from 10k triples up to 5M triples (and more).

The main reason for this e-mail, besides letting you know that I am using your system in a research context, is to understand whether I am using it correctly for our purpose, to ensure the validity of the results I obtain when I run my tests. I am particularly interested in knowing whether I am using the best solution with Neo4j in 6 of our functions. In red you will find the way we perform each of them as of today.

- (1) Creating a new graph

    public Neo4jGraph(String s) throws Exception {
        super(s);
        directory = "alaska-data/neo4j/" + s;
        graph = new EmbeddedGraphDatabase(directory);
    }

- (2) Adding a new node to the graph

    public long addTerm(Object label) {
        Map<String, Object> properties = new HashMap<String, Object>();
        properties.put("label", label.toString());
        long newNode = inserter.createNode(properties);
        batchIndex.add(newNode, properties);
        // Flushed after every insert so the new node is visible to the next lookup
        batchIndex.flush();
        return newNode;
    }

- (3) Adding a new edge to the graph

    public void addAtom(Object predicateLabel, ArrayList termObjects) throws Exception {
        Long n1 = getNodeByLabel(termObjects.get(0));
        Long n2 = getNodeByLabel(termObjects.get(1));
        if (n1 == null) {
            n1 = addTerm(termObjects.get(0));
        }
        if (n2 == null) {
            n2 = addTerm(termObjects.get(1));
        }
        // Compare the ids by value: != on boxed Longs compares references
        if (!n1.equals(n2)) {
            inserter.createRelationship(n1, n2, DynamicRelationshipType.withName(predicateLabel.toString()), null);
        }
    }

- (4) Retrieving all the nodes of the graph

    public ArrayList getTerms() throws Exception {
        ArrayList terms = new ArrayList();
        for (Node n : graph.getAllNodes()) {
            // Skip the node with id 0
            if (n.getId() != 0) {
                Term newTerm = new Term(n.getProperty("label"));
                terms.add(newTerm);
            }
        }
        return terms;
    }

- (5) Retrieving all the edges of the graph

    public ArrayList getAtoms() throws Exception {
        ArrayList atomsToReturn = new ArrayList();
        for (Node n : graph.getAllNodes()) {
            // Outgoing only, so that each edge is visited once
            Iterable<Relationship> rel = n.getRelationships(Direction.OUTGOING);
            for (Relationship r : rel) {
                // Keep only the text between '[' and the last character of the type's string form
                String predFullName = r.getType().toString();
                int predStart = predFullName.indexOf("[") + 1;
                String predName = predFullName.substring(predStart, predFullName.length() - 1);
                Predicate pred = new Predicate(predName, 2);
                ArrayList atomTerms = new ArrayList();
                ITerm nt1 = new Term(r.getStartNode().getProperty("label"));
                ITerm nt2 = new Term(r.getEndNode().getProperty("label"));
                atomTerms.add(nt1);
                atomTerms.add(nt2);
                IAtom newAtom = new Atom(pred, atomTerms);
                atomsToReturn.add(newAtom);
            }
        }
        return atomsToReturn;
    }

- (6) Retrieving a node by its label

    public Long getNodeByLabel(Object label) {
        IndexHits<Long> hits = batchIndex.get("label", label.toString());
        if (hits.size() == 0) {
            return null;
        } else {
            return hits.getSingle();
        }
    }

- (7) Identifying whether there is an edge between two nodes or not

    public boolean areConnected(ITerm t1,ITerm t2,Predicate p,int pos)