Re: [Neo4j] Function to check whether two nodes are connected?

2011-10-27 Thread Bruno Paiva Lima da Silva
Easy: just one.

For now, I've written this, but I'm still not sure it is the simplest 
way to write it

 public boolean areConnected(Node n1, Node n2, Relationship rel,
         Direction dir) throws Exception {
     Iterable<Relationship> relationships = n1.getRelationships(dir);

     for (Relationship r : relationships) {
         // I am only working with dynamic relationship types;
         // isType() compares relationship types by name
         if (r.isType(rel.getType())) {
             if (dir == Direction.OUTGOING) {
                 if (r.getEndNode().equals(n2)) { return true; }
             } else {
                 if (r.getStartNode().equals(n2)) { return true; }
             }
         }
     }
     return false;
 }
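
For readers without a Neo4j instance at hand, the same kind of check can be sketched against tiny stand-in classes. `SimpleNode`, `SimpleRel` and `Dir` below are hypothetical simplifications, not the Neo4j API; the sketch only illustrates scanning the relationships of n1 and looking at the node on the other end, and it also accepts a BOTH direction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for Node, Relationship and Direction; not the Neo4j API.
enum Dir { OUTGOING, INCOMING, BOTH }

final class SimpleNode {
    final String name;
    final List<SimpleRel> rels = new ArrayList<SimpleRel>(); // every relationship touching this node
    SimpleNode(String name) { this.name = name; }
}

final class SimpleRel {
    final SimpleNode start, end;
    final String type;
    SimpleRel(SimpleNode start, SimpleNode end, String type) {
        this.start = start; this.end = end; this.type = type;
        start.rels.add(this);
        if (end != start) { end.rels.add(this); }
    }
}

final class ConnectionScan {
    // True iff n1 has a relationship of the given type to n2 in direction dir (seen from n1).
    static boolean areConnected(SimpleNode n1, SimpleNode n2, String type, Dir dir) {
        for (SimpleRel r : n1.rels) {
            if (!r.type.equals(type)) { continue; }
            boolean outgoing = r.start == n1 && r.end == n2;
            boolean incoming = r.end == n1 && r.start == n2;
            if ((dir != Dir.INCOMING && outgoing) || (dir != Dir.OUTGOING && incoming)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SimpleNode bob = new SimpleNode("Bob"), tom = new SimpleNode("Tom"), jack = new SimpleNode("Jack");
        new SimpleRel(bob, tom, "KNOWS");
        new SimpleRel(jack, tom, "KNOWS");
        System.out.println(areConnected(bob, tom, "KNOWS", Dir.OUTGOING));  // Bob knows Tom
        System.out.println(areConnected(tom, jack, "KNOWS", Dir.INCOMING)); // Jack knows Tom
        System.out.println(areConnected(bob, tom, "KNOWS", Dir.INCOMING));  // Tom does not know Bob
    }
}
```

The cost of such a scan grows with the number of relationships attached to n1, so it is cheap for sparse nodes but not constant time.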

Bruno

On 27/10/2011 18:31, Peter Neubauer wrote:
> Bruno,
> There is no such low-level function, but you can use a shortest-path algo to
> check this. What is the maximum length for a path between the nodes?
> On Oct 27, 2011 6:14 PM, "Bruno Paiva Lima da Silva"
> wrote:
>
>> Hello there!
>> First of all, thanks for the help in all my previous questions, all the
>> answers have been helping me to use Neo4j with success.
>>
>> I have a very simple question, but I haven't found the answer yet...
>>
>> I'd like to have a function whose signature would be more or less like
>> this:
>>
>> public boolean areTheyConnected(Node n1, Node n2, Relationship rel,
>> Direction dir)
>>
>> which returns true iff there is an edge of type *rel*, between *n1* and
>> *n2*, in the *dir* direction (the direction has n1 as reference).
>>
>> Example:
>>
>> In my graph, I have: "Bob knows Tom, Tom knows Peter, Jack knows Tom"
>>
>> areTheyConnected(nodeBob,nodeTom,relKnows,Direction.OUTGOING) returns
>> true; (Bob knows Tom)
>> areTheyConnected(nodeTom,nodeJack,relKnows,Direction.INCOMING) also
>> returns true; (Jack knows Tom)
>>
>> areTheyConnected(nodeBob,nodeTom,relKnows,Direction.INCOMING) returns
>> false; (Tom doesn't know Bob)
>>
>> Is there an easy method (constant time, or close) for that?
>>
>> Thank you very much,
>> Bruno

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Function to check whether two nodes are connected?

2011-10-27 Thread Bruno Paiva Lima da Silva
Hello there!
First of all, thanks for the help in all my previous questions, all the 
answers have been helping me to use Neo4j with success.

I have a very simple question, but I haven't found the answer yet...

I'd like to have a function whose signature would be more or less like 
this:

public boolean areTheyConnected(Node n1, Node n2, Relationship rel,
Direction dir)

which returns true iff there is an edge of type *rel*, between *n1* and 
*n2*, in the *dir* direction (the direction has n1 as reference).

Example:

In my graph, I have: "Bob knows Tom, Tom knows Peter, Jack knows Tom"

areTheyConnected(nodeBob,nodeTom,relKnows,Direction.OUTGOING) returns 
true; (Bob knows Tom)
areTheyConnected(nodeTom,nodeJack,relKnows,Direction.INCOMING) also 
returns true; (Jack knows Tom)

areTheyConnected(nodeBob,nodeTom,relKnows,Direction.INCOMING) returns 
false; (Tom doesn't know Bob)

Is there an easy method (constant time, or close) for that?
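
If near-constant time matters more than memory, one option outside the database is to keep a hash set of (start, type, end) keys next to the graph. The `EdgeIndex` class and its string key format below are assumptions made for illustration, not something Neo4j provides; node ids are plain longs here and the set has to be updated whenever a relationship is created.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: remembers every directed edge as a "start|type|end" key.
final class EdgeIndex {
    private final Set<String> edges = new HashSet<String>();

    private static String key(long start, String type, long end) {
        return start + "|" + type + "|" + end;
    }

    // Call this whenever a relationship start -[type]-> end is created.
    void add(long start, String type, long end) {
        edges.add(key(start, type, end));
    }

    // outgoing == true checks n1 -> n2, otherwise n2 -> n1 (direction seen from n1).
    boolean areConnected(long n1, long n2, String type, boolean outgoing) {
        return outgoing ? edges.contains(key(n1, type, n2))
                        : edges.contains(key(n2, type, n1));
    }
}
```

Each lookup is a single hash probe, at the price of one extra string per relationship held in memory.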

Thank you very much,
Bruno


Re: [Neo4j] Loading RDF data questions

2011-10-06 Thread Bruno Paiva Lima da Silva
Thanks for the quick answer Peter.

I don't know if you remember my talk in Hannover, but for my PhD thesis 
project, my research team and I translate all the RDF data we receive as 
input into first-order logic (basically to maintain semantic equivalence 
with the Datalog and Conceptual Graphs families).

That said, we don't insert an RDF file directly into Neo4J, but rather 
Java objects representing the RDF data. (By the way, we also use Sail in 
order to compare the efficiency and effectiveness of graph databases 
against relational databases and triple stores for our problem.)

But, these objects aren't very complicated. For now, we just encapsulate 
Strings containing subject, predicate and object names.

That's why I asked the question this morning:

After parsing the RDF with Jena, I obtain a big list of atoms (in FOL, 
an atom represents an edge in a graph) which I try to store, using the 
method I have written before.
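
As a rough illustration of that representation, each triple (subject, predicate, object) becomes a binary atom predicate(subject, object). The `TripleAtom` class below is a hypothetical simplification, not the classes of the Alaska project, and the Jena parsing step is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of an atom: predicate(subject, object).
final class TripleAtom {
    final String predicate, subject, object;

    TripleAtom(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    // Turns rows of {subject, predicate, object} strings into atoms, one per triple.
    static List<TripleAtom> fromTriples(String[][] triples) {
        List<TripleAtom> atoms = new ArrayList<TripleAtom>();
        for (String[] t : triples) {
            atoms.add(new TripleAtom(t[0], t[1], t[2]));
        }
        return atoms;
    }

    @Override
    public String toString() {
        return predicate + "(" + subject + ", " + object + ")";
    }
}
```

Read as a graph, the subject and object of each atom are nodes and the predicate is the label of the edge between them.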

I see people on the mailing list working with very big datasets, and I 
wonder what is going wrong on our side, since we haven't got further than 
200k triples (which is not big at all) using our methods.

Bruno

On 06/10/2011 12:17, Peter Neubauer wrote:
> Bruno,
>
> RDF support is provided via Josh Shinavier's SAIL implementation on
> top of Neo4j already.
>
> Look at the SPARQL-plugin-in-the-making,
> https://github.com/peterneubauer/sparql-plugin/blob/master/src/test/java/org/neo4j/server/plugin/sparql/BerlinDatasetTest.java
> for how to load a file into Neo4j as an RDF store, and how to query
> it. This is using a subset of the Berlin RDF dataset and queries,
> http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/ExploreUseCase/index.html,
> for instance.
>
> Does that help? I hope to get this into shape very soon, so you can
> use the Neo4j Server with the SPARQL plugin in order to load and query
> RDF and essentially turn the Neo4j Server into a Triple Store.
>
> Cheers,
>
> /peter neubauer
>
> On Thu, Oct 6, 2011 at 7:50 AM, Bruno Paiva Lima da Silva
>   wrote:
>> Hello,
>>
>> I'm writing to ask whether I am using Neo4J correctly for loading and
>> storing RDF datasets.
>> For now my performance results have been quite bad. However, it seems
>> to me that I haven't fully understood how to use the BatchInserter for
>> what I want to do.
>>
>> So, I have RDF datasets that can go from 1K to 20M triples, and I want
>> to store them into an empty Neo4J graph.
>>
>> The method I use for the insertion is the following:
>>
>> - For each triple of my RDF data:
>> -- Check if there is a subject node in the graph. If yes, find it, if
>> not, create it.
>> -- Check if there is an object node in the graph. If yes, find it, if
>> not, create it.
>> -- Create an edge with a label "predicate" between subject and object.
>>
>> This method is quite simple and generic, but it also carries a fairly
>> big problem:
>> It spends more time reading and searching than inserting.
>>
>> Having profiled its execution, it spends almost 90% of the time
>> searching if a given node exists.
>>
>> For now, I have tried to use Neo4J with simple transactions, then I
>> switched to BatchInserter + LuceneIndex, but I still think there is
>> room to improve my program.
>>
>> That said, my questions are:
>> - Can anyone tell me, knowing how Neo4J works, how to improve my
>> insertion process, or whether there is a better solution?
>> - Are there any big errors in my code? It's not yet very well
>> documented, but it is available here:
>> https://bitbucket.org/bplsilva/alaska-project/src/e7fdf2e9341b/src/fr/lirmm/graphik/alaska/impl/graph/neo4j/Neo4jFact.java
>>
>> Thank you very much,
>>
>> --
>> *PAIVA LIMA DA SILVA Bruno*
>> PhD Student in Informatics @ Univ. Montpellier 2
>> [ GraphIK Research Team: LIRMM, Montpellier (France) ]
>> Website: http://bplsilva.com



[Neo4j] Loading RDF data questions

2011-10-05 Thread Bruno Paiva Lima da Silva
Hello,

I'm writing to ask whether I am using Neo4J correctly for loading and 
storing RDF datasets.
For now my performance results have been quite bad. However, it seems 
to me that I haven't fully understood how to use the BatchInserter for 
what I want to do.

So, I have RDF datasets that can go from 1K to 20M triples, and I want 
to store them into an empty Neo4J graph.

The method I use for the insertion is the following:

- For each triple of my RDF data:
-- Check if there is a subject node in the graph. If yes, find it, if 
not, create it.
-- Check if there is an object node in the graph. If yes, find it, if 
not, create it.
-- Create an edge with a label "predicate" between subject and object.

This method is quite simple and generic, but it also carries a fairly 
big problem:
It spends more time reading and searching than inserting.

Having profiled its execution, it spends almost 90% of the time 
searching if a given node exists.
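
One way to reduce that lookup cost, sketched under assumptions: keep an in-memory map from label to node id during the import, so that each label triggers a node creation at most once and repeated labels never reach the index. `CachedLoader` and `NodeFactory` below are hypothetical names; `NodeFactory` stands in for whatever creates a node (for instance the BatchInserter), and the map has to fit in memory.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the component that creates a node and returns its id.
interface NodeFactory {
    long createNode(String label);
}

final class CachedLoader {
    private final Map<String, Long> idByLabel = new HashMap<String, Long>();
    private final NodeFactory factory;
    int created = 0; // number of nodes actually created, for inspection

    CachedLoader(NodeFactory factory) {
        this.factory = factory;
    }

    // Get-or-create: the factory is called only the first time a label is seen.
    long nodeFor(String label) {
        Long id = idByLabel.get(label);
        if (id == null) {
            id = factory.createNode(label);
            idByLabel.put(label, id);
            created++;
        }
        return id;
    }
}
```

For each triple, the loader would be asked for the subject and object ids and a relationship created between them; whether this beats an index lookup for a given dataset would have to be measured.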

For now, I have tried to use Neo4J with simple transactions, then I 
switched to BatchInserter + LuceneIndex, but I still think there is 
room to improve my program.

That said, my questions are:
- Can anyone tell me, knowing how Neo4J works, how to improve my 
insertion process, or whether there is a better solution?
- Are there any big errors in my code? It's not yet very well 
documented, but it is available here: 
https://bitbucket.org/bplsilva/alaska-project/src/e7fdf2e9341b/src/fr/lirmm/graphik/alaska/impl/graph/neo4j/Neo4jFact.java

Thank you very much,

-- 
*PAIVA LIMA DA SILVA Bruno*
PhD Student in Informatics @ Univ. Montpellier 2
[ GraphIK Research Team: LIRMM, Montpellier (France) ]
Website: http://bplsilva.com 


[Neo4j] Graph homomorphism vs. Graph traversal?

2011-10-03 Thread Bruno Paiva Lima da Silva
Hello all,

I was thinking about the differences between graph homomorphism and 
graph traversal operations.
Can anyone help me with that?

I know that graph homomorphism takes a data graph and a request graph as 
input.
It is also easy to transform a graph traversal query into a request 
graph and then compute homomorphisms.

On the other hand, the request graph in the homomorphism operation may 
not be a connected graph.

Can we say that the result of a homomorphism computed on a disconnected 
request graph is the same as the union of the results obtained by 
applying a graph traversal query to each of the connected components of 
the request?
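
To make the question concrete, here is a brute-force sketch that counts homomorphisms on tiny graphs. It is an illustration with a hypothetical class name, not an efficient algorithm: it simply tries every mapping of request vertices to data vertices. Because vertices in different components of the request graph are mapped independently, every homomorphism of a disconnected request is a combination of one homomorphism per component.

```java
import java.util.HashSet;
import java.util.Set;

final class HomCount {
    // Counts mappings h from query vertices 0..nq-1 to data vertices 0..ng-1
    // such that every query edge (u, v) is sent to a data edge (h(u), h(v)).
    static long count(int nq, int[][] queryEdges, int ng, int[][] dataEdges) {
        Set<Long> data = new HashSet<Long>();
        for (int[] e : dataEdges) {
            data.add((long) e[0] * ng + e[1]); // encode a directed edge as one number
        }
        return extend(new int[nq], 0, queryEdges, ng, data);
    }

    // Assigns query vertex 'next' every possible image, then checks all edges at the end.
    private static long extend(int[] h, int next, int[][] queryEdges, int ng, Set<Long> data) {
        if (next == h.length) {
            for (int[] e : queryEdges) {
                if (!data.contains((long) h[e[0]] * ng + h[e[1]])) {
                    return 0;
                }
            }
            return 1;
        }
        long total = 0;
        for (int v = 0; v < ng; v++) {
            h[next] = v;
            total += extend(h, next + 1, queryEdges, ng, data);
        }
        return total;
    }
}
```

With a data graph of three edges (0 -> 1, 1 -> 2, 3 -> 1) on four vertices, a single-edge request has 3 homomorphisms, while a request made of two disjoint edges has 3 x 3 = 9, one per pair of component matches.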

Thanks in advance,
PAIVA LIMA DA SILVA Bruno


[Neo4j] Further information about Neo4J

2011-08-11 Thread Bruno Paiva Lima da Silva
Hello,

My name is Bruno Paiva Lima da Silva, and I am a PhD student at LIRMM in 
Montpellier, France investigating graph-based conjunctive query 
answering. A very quick description of my thesis can be found at: [ 
http://bplsilva.com/en/ ] , or, for more details, please check my 
presentations [ http://bplsilva.com/en/research/talks/ ].

The reason why I am writing to you is that I need further information 
regarding your storage system. In my work I aim to compare several 
different storage systems for conjunctive querying. To this end I have 
implemented a common, abstract interface in Java that uses the 
"logical" representation (as defined in First-Order Logic) of a factual 
piece of knowledge. The formula is then sent to different storage 
systems (e.g. DEX, HyperGraph, MySQL, Neo4J, OrientDB, Sqlite, and 
others), one of them being your system, and the storage and querying 
times are then logged for further analysis and comparison.
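
A minimal sketch of what such a setup could look like, under stated assumptions: `FactStore`, `InMemoryStore` and `StoreBenchmark` are hypothetical names, not the actual interface of the project, and the in-memory back end exists only to make the sketch runnable. Each real system (Neo4J, MySQL, ...) would get its own implementation of the interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical common interface that every storage back end would implement.
interface FactStore {
    void addAtom(String predicate, String subject, String object);
    int atomCount();
}

// Trivial in-memory back end, used here only to make the sketch runnable.
final class InMemoryStore implements FactStore {
    private final List<String[]> atoms = new ArrayList<String[]>();

    public void addAtom(String predicate, String subject, String object) {
        atoms.add(new String[] { predicate, subject, object });
    }

    public int atomCount() {
        return atoms.size();
    }
}

final class StoreBenchmark {
    // Stores all triples ({subject, predicate, object}) and returns the elapsed nanoseconds.
    static long timeInsert(FactStore store, String[][] triples) {
        long start = System.nanoTime();
        for (String[] t : triples) {
            store.addAtom(t[1], t[0], t[2]);
        }
        return System.nanoTime() - start;
    }
}
```

Keeping the timing code outside the back ends means every system is measured by the same loop, which is what makes the logged times comparable.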

I am testing this architecture by means of RDF files of increasing 
size, ranging from 10k triples up to 5M triples (and more).

The main reason for this e-mail, besides letting you know that I am using 
your system in a research context, is to understand whether I am using it 
correctly for our purpose, ensuring the validity of the results I obtain 
when I run my tests.

I am particularly interested in knowing whether I am using the best 
solution with Neo4J in the functions listed below. For each one, the code 
shows the way we perform it as of today.

- (1) Creating a new graph

public Neo4jGraph(String s) throws Exception {
    super(s);
    directory = "alaska-data/neo4j/" + s;
    graph = new EmbeddedGraphDatabase(directory);
}


- (2) Adding a new node to the graph

public long addTerm(Object label) {
    Map<String, Object> properties = new HashMap<String, Object>();
    properties.put("label", label.toString());
    long newNode = inserter.createNode(properties);
    batchIndex.add(newNode, properties);
    // make the new entry visible to later index lookups
    batchIndex.flush();
    return newNode;
}

- (3) Adding a new edge to the graph

public void addAtom(Object predicateLabel, ArrayList termObjects)
        throws Exception {
    Long n1 = getNodeByLabel(termObjects.get(0));
    Long n2 = getNodeByLabel(termObjects.get(1));

    if (n1 == null) { n1 = addTerm(termObjects.get(0)); }
    if (n2 == null) { n2 = addTerm(termObjects.get(1)); }

    // compare ids by value; != on Long objects compares references
    if (!n1.equals(n2)) {
        inserter.createRelationship(n1, n2,
                DynamicRelationshipType.withName(predicateLabel.toString()),
                null);
    }
}
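
A side note on comparing boxed ids such as n1 and n2 in addAtom: on `Long` objects, `==` and `!=` compare references rather than values, and `Long.valueOf` only guarantees a shared cached object for values from -128 to 127. The helper below is a hypothetical illustration of a value comparison that is also safe when the first id is null.

```java
final class IdCompare {
    // Compares two possibly-null boxed ids by value.
    static boolean sameId(Long a, Long b) {
        return a != null && a.equals(b);
    }
}
```

For two separately boxed ids holding 100000, `sameId` returns true, whereas a reference comparison is not guaranteed to.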



- (4) Retrieving all the nodes of the graph

public ArrayList getTerms() throws Exception {
    ArrayList terms = new ArrayList();
    for (Node n : graph.getAllNodes()) {
        if (n.getId() != 0) {
            Term newTerm = new Term(n.getProperty("label"));
            terms.add(newTerm);
        }
    }
    return terms;
}

- (5) Retrieving all the edges of the graph

public ArrayList getAtoms() throws Exception {
    ArrayList atomsToReturn = new ArrayList();

    for (Node n : graph.getAllNodes()) {
        Iterable<Relationship> rel = n.getRelationships(Direction.OUTGOING);

        for (Relationship r : rel) {
            // name() gives the type name directly, without parsing toString()
            String predName = r.getType().name();
            Predicate pred = new Predicate(predName, 2);

            ArrayList atomTerms = new ArrayList();
            ITerm nt1 = new Term(r.getStartNode().getProperty("label"));
            ITerm nt2 = new Term(r.getEndNode().getProperty("label"));
            atomTerms.add(nt1);
            atomTerms.add(nt2);

            IAtom newAtom = new Atom(pred, atomTerms);
            atomsToReturn.add(newAtom);
        }
    }
    return atomsToReturn;
}

- (6) Retrieving a node by its label

public Long getNodeByLabel(Object label) {
    IndexHits<Long> hits = batchIndex.get("label", label.toString());
    // getSingle() returns null when there is no hit and closes the result
    return hits.getSingle();
}

- (7) Identifying whether there is an edge between two nodes or not

public boolean areConnected(ITerm t1,ITerm t2,Predicate p,int pos)