[Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread st3ven
Hello neo4j-community,



I am creating a graph database for a social network.

To create the graph database I am using the Batch Inserter.

The Batch Inserter inserts data from 2 files into the graph database.



Files:

1. the first file contains the Nodes I want to create (about 3.5M Nodes)

The file looks like this:
Author 1
Author 2
Author 2 ...

2. the second file contains every Relationship between the Nodes (about 2.5
billion Relationships)


This file looks like this:
Author1; Author2; timestamp
Author2; Author3; timestamp
Author1; Author3; timestamp...

The specifications of my Computer look like this:



Intel Core i7 3.4 GHz

16 GB RAM

Geforce GT 420 1GB

2TB harddrive



My Code to create the graph database looks like this:



package wikiOSN;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Map;

import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.index.BatchInserterIndex;
import org.neo4j.graphdb.index.BatchInserterIndexProvider;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class CreateAndConnectNodes {

    public static void main(String[] args) throws IOException {
        BufferedReader bf = new BufferedReader(new FileReader(
                "/media/sdg1/Wikipedia/Reduced Files/autoren-der-wikiartikel"));
        BufferedReader bf2 = new BufferedReader(new FileReader(
                "/media/sdg1/Wikipedia/Reduced Files/wikipedia-output"));
        CreateAndConnectNodes cacn = new CreateAndConnectNodes();
        cacn.createGraphDatabase(bf, bf2);
    }

    private long relationCounter = 0;

    private void createGraphDatabase(BufferedReader bf, BufferedReader bf2)
            throws IOException {
        BatchInserter inserter = new BatchInserterImpl(
                "target/socialNetwork-batchinsert");
        BatchInserterIndexProvider indexProvider =
                new LuceneBatchInserterIndexProvider(inserter);
        BatchInserterIndex authors = indexProvider.nodeIndex("author",
                MapUtil.stringMap("type", "exact"));
        authors.setCacheCapacity("name", 10);

        String zeile;
        String zeile2;

        // Pass 1: create one node per author line and index it by name.
        while ((zeile = bf.readLine()) != null) {
            Map<String, Object> properties = MapUtil.map("name", zeile);
            long node = inserter.createNode(properties);
            authors.add(node, properties);
        }
        bf.close();
        System.out.println("Nodes created!");
        authors.flush();

        // Pass 2: read the relationship lines and resolve both authors via the index.
        String node = "";
        long node1 = 0;
        long node2 = 0;
        while ((zeile2 = bf2.readLine()) != null) {
            if (relationCounter++ % 1 == 0) {
                System.out.println("Edges already created: " + relationCounter);
            }
            String[] relation = zeile2.split("%;% ");
            // First line: look up the start node.
            if (node.isEmpty()) {
                node = relation[0];
                if (authors.get("name", relation[0]).getSingle() != null) {
                    node1 = authors.get("name", relation[0]).getSingle();
                } else {
                    System.out.println("Autor 1: " + relation[0]);
                    break;
                }
            }
            // Start author changed: look up the new start node.
            if (!node.equals(relation[0])) {
                node = relation[0];
                if (authors.get("name", relation[0]).getSingle() != null) {
                    node1 = authors.get("name", relation[0]).getSingle();
                } else {
                    System.out.println("Autor 1: " + relation[0]);
                    break;
                }
            }
            if (authors.get("name", relation[1]).getSingle() != null) {
                node2 = authors.get("name", relation[1]).getSingle();
            } else {
                System.out.println("Autor 2: " + relation[1]);
                break;
            }

            Map<String, Object> properties = MapUtil.map("timestamp",
                    Long.parseLong(relation[2].trim()));

Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread Peter Neubauer
Steven,
the most performant way to insert data with the BatchInserter is to
first insert only the nodes from your node file (that should be fast).
After that (or at the same time), find a way to generate the
relationship file with Neo4j IDs, so that you are not forced to look
the nodes up in indexes during relationship insertion. Those lookups
take the bulk of the time. If you can write your node IDs back to a
file, then massage the relationship text file to include the FROM and
TO node IDs (e.g. using Perl, Bash or Ruby) and import that file,
referring to the IDs directly, the import should be much faster.
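
A minimal sketch of the ID-rewriting step described above, in plain Java rather than Perl or Bash. It assumes the author names fit in an in-memory HashMap and that relationship lines have the form "from; to; timestamp" as in the original post. The class name, the method names, the starting ID of 1 and the sample timestamps are all hypothetical; in a real import the ID would be whatever createNode returned for that author.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: resolve author names to node IDs once, before the relationship import. */
class RelationshipIdRewriter {

    /** Assigns one ID per distinct author name, in file order (IDs start at 1 here by assumption). */
    static Map<String, Long> assignIds(List<String> authorLines) {
        Map<String, Long> ids = new HashMap<String, Long>();
        long next = 1;
        for (String line : authorLines) {
            String name = line.trim();
            if (!name.isEmpty() && !ids.containsKey(name)) {
                ids.put(name, next++);
            }
        }
        return ids;
    }

    /** Rewrites "from; to; timestamp" lines into "fromId;toId;timestamp". */
    static List<String> rewrite(List<String> relationLines, Map<String, Long> ids) {
        List<String> out = new ArrayList<String>();
        for (String line : relationLines) {
            String[] parts = line.split(";");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            Long from = ids.get(parts[0].trim());
            Long to = ids.get(parts[1].trim());
            if (from == null || to == null) {
                throw new IllegalArgumentException("Unknown author in: " + line);
            }
            out.add(from + ";" + to + ";" + parts[2].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> ids = assignIds(Arrays.asList("Author1", "Author2", "Author3"));
        List<String> rewritten = rewrite(Arrays.asList(
                "Author1; Author2; 1316500000",
                "Author2; Author3; 1316500001"), ids);
        for (String line : rewritten) {
            System.out.println(line);
        }
    }
}
```

With a file rewritten this way, the relationship pass only has to parse two numbers per line and never touches the Lucene index.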

HTH

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.




Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread Linan Wang
hi Stephan,
have you set -Xms, -XX:+UseNUMA, and -XX:+UseConcMarkSweepGC? they
could speed up the process significantly.
also, if you like, JRockit is fast and free now. give it a try.
btw, which file system are you using? have you turned off atime?

On Tue, Sep 20, 2011 at 12:00 PM, st3ven st3...@web.de wrote:
 Peter,

 the import of the data into the graph database is not the main problem for
 me.
 The lookup of nodes from the index is fast enough for me.
 To create the database it took me nearly half a day.

 My main problem here is getting the node degree of every node.
 As I already said I am using this code to get the node degree of every node:

 for (Node node : db.getAllNodes()) {
     counter = 0;

     if (node.getId() > 0) {
         for (Relationship rel : node.getRelationships()) {
             counter++;
         }
         System.out.println(node.getProperty("name").toString() + ": "
                 + counter);
     }
 }

 After 3 days I only got the node degree of 8 nodes, and I want to
 optimize my traversal because it is very slow.
 What can I do to make this faster, or do I have to change my code for getting
 the node degree?
 I only posted my import code because I thought there might be something
 there I could optimize for this traversal.

 Thank you very much for your help!

 Cheers,
 Stephan


 --
 View this message in context: 
 http://neo4j-community-discussions.438527.n3.nabble.com/Creating-a-graph-database-with-BatchInserter-and-getting-the-node-degree-of-every-node-tp3351599p3351664.html
 Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user




-- 
Best regards

Linan Wang


Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread st3ven
Hi,

I already tried these Java parameters, but they didn't really speed up the
process, and I have already turned atime off.
The Java parameters I am using right now are -d64 -server -Xms7G -Xmx14G
-XX:+UseParallelGC -XX:+UseNUMA
What I've also noticed is that reading from the database is really slow on
my hard disk.
It reads at just 1 MB/s and sometimes 8 MB/s, which is really slow. My hard
disk can normally read and copy files much faster.
Also very strange is that the hard disk utilization is around 99% while
reading at 1 MB/s.

My OS is Ubuntu Linux x64 and my file system is ext4.

On the neo4j Wiki I found some performance guides, but these didn't really
help.
Do you know what else I can do?


Performance guides:
http://wiki.neo4j.org/content/Linux_Performance_Guide
http://wiki.neo4j.org/content/Configuration_Settings

I also added a configuration file, but it seems that my Java program doesn't
use all of the RAM.

Thanks for your help!

Cheers,
Stephan



--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Creating-a-graph-database-with-BatchInserter-and-getting-the-node-degree-of-every-node-tp3351599p3351881.html


Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

2011-09-20 Thread Niels Hoogeveen

Hi Bryce,
Sorry for the late response.
I understand it's difficult to come up with a really good use-case for making 
NodeCollection more general in the context of IndexedRelationships, but I like 
to think of that interface as something we can eventually use for all sorts of 
collections, not just the ones derived from SortedTree. 
There is of course the issue that relationships can not attach to 
relationships, so collections of relationships will need to be addressed by Id. 
This is not necessarily a bad thing, because it decouples the container and the 
elements. In other words the container knows what elements it contains, but the 
elements don't know in what containers they are placed. 
Another option would be to create shadow nodes for contained relationships. 
Instead of adding a relationship to the collection, its shadow node is added, 
and both the shadow node and the relationship contain pointers (properties with 
Id values) towards each other.
I think it would be best if we do indeed create a GraphCollection interface 
parameterized by <T extends PropertyContainer>, even if that type parameter for 
now is always a Node. It doesn't add much complexity to do it now. If we leave 
it out, we may regret that later, and by then it will be harder to change 
because there is an installed base.
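
To make the proposal concrete, here is a rough sketch of what such an interface could look like. It is not the actual graph-collections code: PropertyContainer below is a local stand-in for org.neo4j.graphdb.PropertyContainer so the example compiles on its own, and SketchNode, ListBackedCollection, GraphCollectionDemo and the add/remove method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Local stand-in for org.neo4j.graphdb.PropertyContainer (assumption for this sketch). */
interface PropertyContainer {
    Object getProperty(String key);
}

/** Hypothetical general collection interface, parameterized by element type. */
interface GraphCollection<T extends PropertyContainer> extends Iterable<T> {
    void add(T item);
    boolean remove(T item);
}

/** Stand-in element type; in Neo4j this would be a Node (or later a Relationship). */
final class SketchNode implements PropertyContainer {
    private final String name;

    SketchNode(String name) {
        this.name = name;
    }

    @Override
    public Object getProperty(String key) {
        return "name".equals(key) ? name : null;
    }
}

/** In-memory implementation, only to show that carrying the type parameter costs little. */
final class ListBackedCollection<T extends PropertyContainer> implements GraphCollection<T> {
    private final List<T> items = new ArrayList<T>();

    @Override
    public void add(T item) {
        items.add(item);
    }

    @Override
    public boolean remove(T item) {
        return items.remove(item);
    }

    @Override
    public Iterator<T> iterator() {
        return items.iterator();
    }
}

class GraphCollectionDemo {
    /** Works for any element type, which is the point of the type parameter. */
    static int size(GraphCollection<? extends PropertyContainer> collection) {
        int n = 0;
        for (PropertyContainer ignored : collection) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        GraphCollection<SketchNode> nodes = new ListBackedCollection<SketchNode>();
        nodes.add(new SketchNode("A"));
        nodes.add(new SketchNode("B"));
        System.out.println(size(nodes));
    }
}
```

Code written against GraphCollection<Node> today would keep compiling if a relationship-backed implementation were added later.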
Niels

 Date: Sat, 17 Sep 2011 14:19:04 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Neo4j graph collections introduction of NodeCollection   
 interface
 
 Hi Niels,
 
 I had wondered about having a collection interface that covered both nodes
 and relationships.  There were a couple of reasons I didn't go with that
 right now, though well worthwhile discussing it and going with a
 GraphCollection super interface if it fits properly.
 
 Firstly I wanted to get something out there so people could have a look,
 and having something that matched what IndexedRelationship currently
 required was easiest first step.  Biggest thing specific in there to that
 functionality is the addNode method returning a relationship.
 
 The other issue was more wondering how a relationship collection would work.
  Say I have a relationship collection, and I have a relationship R1 between
 node A and B, how am I going to represent that relationship within some
 graph based data structure that makes sense.  There could be a node X that
 is part of the relationship collection data structure (e.g. tree) and that
 node could have an attribute that has the relationship id on it, but that
 doesn't seem like it would be very performant.  There could be a
 relationship between X and A that also gave the relationship type of R1, so
 you could find the relationship based on that, but there isn't
 any guarantee of the relationship type being unique.  What it would need to
 properly model it is the ability to have a relationship between X and R1,
 i.e. a relationship from a node to a relationship.
 
 If instead of being able to add any given relationship to the relationship
 collection you instead restrict it to being relationships matching a certain
 criteria from a given node then it is practically the same thing as a
 relationship expander.
 
 Or if you instead have a way through the relationship collection to create
 relationships from a given node to a set of other arbitrary nodes, with the
 relationship collection having a fixed relationship type and direction, then
 that is practically the current IndexedRelationship.
 
 I guess a way it could work is similar to IndexedRelationship, basically
 more general case of that class, where you have a method on the relationship
 collection createRelationship(startNode, endNode, relationshipType,
 direction) that was then stored in an internal data structure to create a
 pseudo relationship between the start and end, and then being able to
 iterate over this set of relationships.  Not sure exactly what the use case
 of that would be.  Maybe of more interest could be the same situation where
 the relationship type and direction are fixed, then you may have a "friend
 of" set of relationships that you create between arbitrary nodes and then
 iterate over all of those.
 
 I can't personally think of a good way of adding a set of arbitrary
 relationships into a collection stored in a graph data structure.
 
 Thoughts?
 
 Cheers
 Bryce
 
 P.S. Peter, I had thought to remove the passing in of the graph database and
 instead just getting it from the node, or only passing in the graph database
 and creating the node internally.
 
 On Sat, Sep 17, 2011 at 2:19 AM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
 
  Hi Bryce,
  I really like what you are trying to achieve here.
  One question:
  Instead of having NodeCollection, why not have GraphCollection<T extends
  PropertyContainer>? That way we can have collections of both Relationships
  and Nodes.
  Niels
 
   Date: Fri, 16 Sep 2011 17:37:29 +1200
   From: bryc...@gmail.com
   To: user@lists.neo4j.org
   Subject: [Neo4j] Neo4j graph 

Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread Peter Neubauer
Steven,
in this scenario, you are reading the entire db, and basically have
it cold. Neo4j itself is not optimized for full graph scans. I
see a few solutions for you:

- Store the number of relationships as a property on each node and read
only that. This works if the updates to your graph are not too
frequent.

- Store the relationship count as a property in an index (e.g. Lucene) and
ask the index for all entries. That way you are using an index for what it
is good at: global operations over all documents.

- Use HA or just a file copy to replicate the graph on several
instances, and send a sharded query to all of them (e.g. count 100K
node degrees on each instance and return the result). This
query is very easy to do in a map/reduce fashion.
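
As an illustration of the first option, the degree of every node can be counted in one streaming pass over the relationship input at import time, so it never has to be derived by scanning the store. This is only a sketch under assumptions: it expects lines of the form "fromId;toId;timestamp" with dense numeric node IDs in the range 0 to nodeCount - 1, and DegreeCounter and countDegrees are hypothetical names. For 3.5M nodes the long[] takes roughly 28 MB.

```java
import java.util.Arrays;

/** Sketch: count node degrees while streaming over the relationship lines. */
class DegreeCounter {

    /**
     * Each line is "fromId;toId;timestamp". Both endpoints get one increment,
     * so a self-loop adds 2 to the degree of its node.
     */
    static long[] countDegrees(Iterable<String> idLines, int nodeCount) {
        long[] degree = new long[nodeCount];
        for (String line : idLines) {
            String[] parts = line.split(";");
            int from = Integer.parseInt(parts[0].trim());
            int to = Integer.parseInt(parts[1].trim());
            degree[from]++;
            degree[to]++;
        }
        return degree;
    }

    public static void main(String[] args) {
        long[] degree = countDegrees(Arrays.asList(
                "0;1;100", "1;2;101", "0;2;102", "0;1;103"), 3);
        for (int id = 0; id < degree.length; id++) {
            System.out.println(id + ": " + degree[id]);
        }
    }
}
```

The resulting array could then be written to each node as a property, or printed directly if the degrees are all that is needed.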

Is that feasible?

Cheers,

/peter neubauer



Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread Linan Wang
hi stephan
i'm wondering whether it makes any difference if you specify the relationship
type when counting degrees:
RelationshipType knows = DynamicRelationshipType.withName("KNOWS");

Iterable<Relationship> rels = node.getRelationships(knows);
int count = com.google.common.collect.Iterables.size(rels);

besides, do you know where the bottleneck is: the node iteration
or the relationship retrieval?



[Neo4j] Representing relationship strength

2011-09-20 Thread editor
I'm looking into a persistent representation of a naive Bayesian classifier
using a graph database. I have three basic object types: users, words and
topics. The relationships between these nodes would represent the
strength of their connection -- a probability between zero and one.

To query the graph I would traverse relationships from user to topic, using
the strength of connections to represent connectedness. Querying could
potentially take a more neural net-like form.

I'm still quite naive myself when it comes to graph databases, but a
Bayesian classifier seems to be a good fit for a graph model like Neo4j.
That said, in my background research I haven't seen a way to represent the
strength of connections, just the binary relationship of whether two objects
are connected or not. 

Can anyone comment on the feasibility of a Neo4j implementation of a
Bayesian classifier? Are there ways I might be able to represent
relationship strength using Neo4j primitives?

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Representing-relationship-strength-tp3352296p3352296.html


[Neo4j] how to get the User who has been B Followed who has Followed Back.

2011-09-20 Thread iamyuanlong
hi all,
 I have some relations like this:
http://neo4j-community-discussions.438527.n3.nabble.com/file/n3352328/follow.jpg

What should I do to get the users whom B follows and who also follow B
back?
In the image, the result should be (A).

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-how-to-get-the-User-who-has-been-B-Followed-who-has-Followed-Back-tp3352328p3352328.html


Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread st3ven
Hello again,

the bottleneck is the iteration.
I ran some tests to check whether the iteration or the relationship
retrieval is too slow.

My test results look like this:

Retrieval:1ms; Counting:158ms; number of edges:116407
Retrieval:0ms; Counting:2ms; number of edges:1804
Retrieval:0ms; Counting:0ms; number of edges:22
Retrieval:0ms; Counting:0ms; number of edges:31
Retrieval:0ms; Counting:0ms; number of edges:39
Retrieval:0ms; Counting:2ms; number of edges:1213
Retrieval:0ms; Counting:0ms; number of edges:57
Retrieval:0ms; Counting:36ms; number of edges:59420
Retrieval:0ms; Counting:335ms; number of edges:175156
Retrieval:1ms; Counting:168ms; number of edges:146697
Retrieval:0ms; Counting:354ms; number of edges:285051
Retrieval:0ms; Counting:0ms; number of edges:50
Retrieval:0ms; Counting:11ms; number of edges:20960
Retrieval:0ms; Counting:0ms; number of edges:43
Retrieval:0ms; Counting:0ms; number of edges:51
Retrieval:0ms; Counting:1ms; number of edges:647
Retrieval:0ms; Counting:5ms; number of edges:10216
Retrieval:0ms; Counting:2ms; number of edges:3444
Retrieval:0ms; Counting:0ms; number of edges:1128
Retrieval:1ms; Counting:312ms; number of edges:319127
Retrieval:1ms; Counting:0ms; number of edges:5
Retrieval:0ms; Counting:760ms; number of edges:104741
Retrieval:0ms; Counting:11ms; number of edges:9210
Retrieval:0ms; Counting:0ms; number of edges:31
Retrieval:1ms; Counting:3ms; number of edges:3116
Retrieval:0ms; Counting:37ms; number of edges:70835
Retrieval:0ms; Counting:383ms; number of edges:296445
Retrieval:1ms; Counting:0ms; number of edges:120
Retrieval:0ms; Counting:2ms; number of edges:1526
Retrieval:0ms; Counting:0ms; number of edges:71
Retrieval:0ms; Counting:42ms; number of edges:35960
Retrieval:0ms; Counting:90ms; number of edges:9644
Retrieval:0ms; Counting:186ms; number of edges:129981
Retrieval:0ms; Counting:1ms; number of edges:1213
Retrieval:1ms; Counting:143ms; number of edges:124495
Retrieval:0ms; Counting:0ms; number of edges:58
Retrieval:0ms; Counting:75ms; number of edges:56195
Retrieval:0ms; Counting:99ms; number of edges:92574
Retrieval:0ms; Counting:0ms; number of edges:13
Retrieval:0ms; Counting:50ms; number of edges:26350
Retrieval:0ms; Counting:2ms; number of edges:1856
Retrieval:1ms; Counting:376ms; number of edges:114166
Retrieval:0ms; Counting:9528ms; number of edges:11956
Retrieval:0ms; Counting:50047ms; number of edges:12645
Retrieval:1ms; Counting:43687ms; number of edges:15025

The first results came back very fast, apparently because they were cached;
I had run this test quite often.
As you can see, the last few results weren't cached, and iterating over
their relationships took a huge amount of time.

I checked that with the following code:

for (Node node : db.getAllNodes()) {
    if (node.getId() > 0) {
        long test = System.currentTimeMillis();
        Iterable<Relationship> rels = node.getRelationships(knows);
        System.out.print("Retrieval:"
                + (System.currentTimeMillis() - test));

        test = System.currentTimeMillis();
        int count = com.google.common.collect.Iterables.size(rels);
        System.out.print("ms; Counting:"
                + (System.currentTimeMillis() - test));
        System.out.println("ms; number of edges:" + count);
    }
}

Is there a way to cache more relationships, or do you have any
idea how to speed up the iteration?

Thanks for your help again!

Cheers,
Stephan

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Creating-a-graph-database-with-BatchInserter-and-getting-the-node-degree-of-every-node-tp3351599p3352415.html


Re: [Neo4j] how to get the User who has been B Followed who has Followed Back.

2011-09-20 Thread Peter Neubauer
In Cypher (http://docs.neo4j.org/chunked/snapshot/cypher-query-lang.html)

START b=(node_auto_index,'name:B') MATCH a-[:FOLLOW]->b,
b-[:FOLLOW]->a RETURN a

see 
https://github.com/neo4j/community/blob/d413404a88db989fd289581ecee6e68faec00ace/embedded-examples/src/test/java/org/neo4j/examples/ShortDocumentationExamplesTest.java#L250

In Gremlin (http://docs.neo4j.org/chunked/snapshot/gremlin-plugin.html)

Marko will provide, have no time to test it to be exact :)

HTH

Cheers,

/peter neubauer



Re: [Neo4j] how to get the User who has been B Followed who has Followed Back.

2011-09-20 Thread Adriano Henrique de Almeida
With cypher you can do:

start a=(10) match a-[:FOLLOW]->b-[:FOLLOW]->a return a

where 10 is your node ID; alternatively, you can look the node up in an index.
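
For readers who want to see the mutual-follow pattern outside of Cypher, here is a small plain-Java sketch of the same logic over an in-memory adjacency map. It is not Neo4j API code. The class and method names are hypothetical, and the sample edges (B follows A and C, A follows B) are an assumption, since the image from the original question is not reproduced here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: find the users that a given user follows and who follow that user back. */
class MutualFollows {

    /** follows.get(x) is the set of users that x follows. */
    static Set<String> followedBack(Map<String, Set<String>> follows, String user) {
        Set<String> result = new TreeSet<String>();
        Set<String> followedByUser = follows.containsKey(user)
                ? follows.get(user) : Collections.<String>emptySet();
        for (String candidate : followedByUser) {
            Set<String> followedByCandidate = follows.get(candidate);
            if (followedByCandidate != null && followedByCandidate.contains(user)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> follows = new HashMap<String, Set<String>>();
        follows.put("A", new HashSet<String>(Arrays.asList("B")));
        follows.put("B", new HashSet<String>(Arrays.asList("A", "C")));
        follows.put("C", new HashSet<String>());
        System.out.println(followedBack(follows, "B")); // prints [A]
    }
}
```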

Cheers.





-- 
Adriano Almeida
Caelum | Ensino e Inovação
www.caelum.com.br


Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread Michael Hunger
The retrieval is only virtual, as it is lazy.
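
The effect can be reproduced without Neo4j. The sketch below is an illustration only, not Neo4j's implementation: obtaining the Iterable does no work, and the "loading" (a counter here, standing in for a store read) happens one record at a time during iteration. That would explain why the measured retrieval time is about 0 ms while counting takes all the time. All names are hypothetical.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Sketch: a lazy Iterable that does its work only when it is iterated. */
class LazyIterableDemo {

    /** Counts simulated record loads; stands in for reads from the store. */
    static int recordsLoaded = 0;

    /** Returns immediately; nothing is loaded until next() is called. */
    static Iterable<Integer> relationships(final int total) {
        return new Iterable<Integer>() {
            @Override
            public Iterator<Integer> iterator() {
                return new Iterator<Integer>() {
                    private int next = 0;

                    @Override
                    public boolean hasNext() {
                        return next < total;
                    }

                    @Override
                    public Integer next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        recordsLoaded++; // one simulated load per element
                        return next++;
                    }

                    @Override
                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    static int count(Iterable<?> items) {
        int n = 0;
        for (Object ignored : items) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Iterable<Integer> rels = relationships(5);
        System.out.println("loaded after retrieval: " + recordsLoaded);
        int edges = count(rels);
        System.out.println("edges: " + edges + ", loaded after counting: " + recordsLoaded);
    }
}
```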

When I get back to my machine on Thursday, I'm going to run your tests and get back 
to you. I have made some modifications to the relationship loading and want to 
see how they affect this.

There are issues with loading lots of relationships with cold caches in a one-by-one 
use case, because the larger segment caching only kicks in after a certain 
number of misses in the memory-mapped file loading.

Using an SSD would also speed up your use-case.

Configuring Neo4j to use more memory for memory mapping would also help.

Cheers

Michael
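
The difference between "retrieval" and "counting" can be illustrated with a minimal sketch in plain Java. It does not use Neo4j at all; `LazyRelationships` and `recordsLoaded` are hypothetical names invented for this illustration. Obtaining the Iterable does no work, and the simulated record loads happen only while iterating, which is the pattern the measurements quoted below would be consistent with.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a lazily loaded relationship chain; this is not Neo4j code.
class LazyRelationships implements Iterable<Integer> {
    private final int degree;   // number of simulated relationship records
    private int recordsLoaded;  // counts simulated record loads (the expensive part)

    LazyRelationships(int degree) {
        this.degree = degree;
    }

    int recordsLoaded() {
        return recordsLoaded;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < degree;
            }

            @Override
            public Integer next() {
                if (next >= degree) {
                    throw new NoSuchElementException();
                }
                recordsLoaded++; // a real store would touch disk or cache here
                return next++;
            }
        };
    }

    // Counting forces the whole chain to be walked, one record at a time.
    static int count(Iterable<?> items) {
        int n = 0;
        for (Object ignored : items) {
            n++;
        }
        return n;
    }
}
```

Creating a `LazyRelationships` loads nothing; only `count(...)` drives `recordsLoaded` up to the degree, so timing the first call alone says little about the real cost.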

On 20.09.2011 at 17:37, st3ven wrote:

 Hello again,
 
 the bottleneck is in the iteration.
 I ran some tests to check whether the iteration or the relationship
 retrieval is too slow.
 
 My test results look like this:
 
 Retrieval:1ms; Counting:158ms; number of edges:116407
 Retrieval:0ms; Counting:2ms; number of edges:1804
 Retrieval:0ms; Counting:0ms; number of edges:22
 Retrieval:0ms; Counting:0ms; number of edges:31
 Retrieval:0ms; Counting:0ms; number of edges:39
 Retrieval:0ms; Counting:2ms; number of edges:1213
 Retrieval:0ms; Counting:0ms; number of edges:57
 Retrieval:0ms; Counting:36ms; number of edges:59420
 Retrieval:0ms; Counting:335ms; number of edges:175156
 Retrieval:1ms; Counting:168ms; number of edges:146697
 Retrieval:0ms; Counting:354ms; number of edges:285051
 Retrieval:0ms; Counting:0ms; number of edges:50
 Retrieval:0ms; Counting:11ms; number of edges:20960
 Retrieval:0ms; Counting:0ms; number of edges:43
 Retrieval:0ms; Counting:0ms; number of edges:51
 Retrieval:0ms; Counting:1ms; number of edges:647
 Retrieval:0ms; Counting:5ms; number of edges:10216
 Retrieval:0ms; Counting:2ms; number of edges:3444
 Retrieval:0ms; Counting:0ms; number of edges:1128
 Retrieval:1ms; Counting:312ms; number of edges:319127
 Retrieval:1ms; Counting:0ms; number of edges:5
 Retrieval:0ms; Counting:760ms; number of edges:104741
 Retrieval:0ms; Counting:11ms; number of edges:9210
 Retrieval:0ms; Counting:0ms; number of edges:31
 Retrieval:1ms; Counting:3ms; number of edges:3116
 Retrieval:0ms; Counting:37ms; number of edges:70835
 Retrieval:0ms; Counting:383ms; number of edges:296445
 Retrieval:1ms; Counting:0ms; number of edges:120
 Retrieval:0ms; Counting:2ms; number of edges:1526
 Retrieval:0ms; Counting:0ms; number of edges:71
 Retrieval:0ms; Counting:42ms; number of edges:35960
 Retrieval:0ms; Counting:90ms; number of edges:9644
 Retrieval:0ms; Counting:186ms; number of edges:129981
 Retrieval:0ms; Counting:1ms; number of edges:1213
 Retrieval:1ms; Counting:143ms; number of edges:124495
 Retrieval:0ms; Counting:0ms; number of edges:58
 Retrieval:0ms; Counting:75ms; number of edges:56195
 Retrieval:0ms; Counting:99ms; number of edges:92574
 Retrieval:0ms; Counting:0ms; number of edges:13
 Retrieval:0ms; Counting:50ms; number of edges:26350
 Retrieval:0ms; Counting:2ms; number of edges:1856
 Retrieval:1ms; Counting:376ms; number of edges:114166
 Retrieval:0ms; Counting:9528ms; number of edges:11956
 Retrieval:0ms; Counting:50047ms; number of edges:12645
 Retrieval:1ms; Counting:43687ms; number of edges:15025
 
 The first results came up very fast, apparently because they had been cached:
 I had run that part quite often.
 As you can see, the last 4 results weren't cached, and it took a huge amount
 of time to iterate over the relationships.
 
 I checked that with the following code:
 
 for (Node node : db.getAllNodes()) {
     if (node.getId() > 0) {
         // Time how long it takes just to obtain the (lazy) Iterable.
         long test = System.currentTimeMillis();
         Iterable<Relationship> rels = node.getRelationships(knows);
         System.out.print("Retrieval:" + (System.currentTimeMillis() - test));
 
         // Time the actual iteration over all relationships of the node.
         test = System.currentTimeMillis();
         int count = com.google.common.collect.Iterables.size(rels);
         System.out.print("ms; Counting:" + (System.currentTimeMillis() - test));
         System.out.println("ms; number of edges:" + count);
     }
 }
 Is there maybe a way to cache more relationships, or do you have any
 idea how to speed up the iteration?
 
 Thanks for your help again!
 
 Cheers,
 Stephan
 

Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread Linan Wang
hi stephan,
my theory is that most of the time is spent on retrieving incoming
relationships. could you try again, but this time retrieve only
outgoing relationships?

for (Node node : db.getAllNodes()) {
    if (node.getId() > 0) {
        long test = System.currentTimeMillis();
        // Restrict the lookup to outgoing relationships only.
        Iterable<Relationship> rels =
                node.getRelationships(knows, Direction.OUTGOING);
        System.out.print("Retrieval:" + (System.currentTimeMillis() - test));

        test = System.currentTimeMillis();
        int count = com.google.common.collect.Iterables.size(rels);
        System.out.print("ms; Counting:" + (System.currentTimeMillis() - test));
        System.out.println("ms; number of edges:" + count);
    }
}


Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread st3ven
Hello Peter,

it's a pity that Neo4j doesn't support full graph scans.

Is there maybe a way to cache more relationships to speed things up
a little bit?
I noticed that only the iteration over the relationships is taking hours.
The time to get all relationships of one node is quite fast.

I think I could try your second solution:
- Store the relationships as a property in an Index (e.g. Lucene) and
as the index for all entries. Thus, you are using an index for what it
is good at - global operations over all documents. 

But I didn't understand it correctly. Do you mean an index which stores the
ID of a relationship, and creating such an index for every node?
Could you maybe give me a code example for that?
That would be very kind of you.

The first solution is not really feasible, because I don't know the number
of relationships of every node.
I would have to count the relationships before the insertion, and that would
make my database useless for the node degree query.

Thank you very much for your help!

Cheers,
Stephan

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Creating-a-graph-database-with-BatchInserter-and-getting-the-node-degree-of-every-node-tp3351599p3352509.html


Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread Peter Neubauer
Steven,
the index is built into the DB, so you can use something like
http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-index.html
to index all your nodes into Lucene (in one index, with the node as key and
the number of relationships as a numeric value, written when creating them).
When reading, you would simply request all keys from the index and iterate
over them. I am not sure how fast it is, but given that
you are just loading up documents, Lucene should be reasonably fast.

Let us know if that works out!
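
One possible way to obtain the counts without a separate pass, sketched under assumptions: keep a running degree per author while the relationship file is being read for insertion, and write each count to the index (or to a node property) afterwards. `DegreeCounter` and its methods are hypothetical names, the line format `Author1; Author2; timestamp` is the one described at the start of this thread, and the actual Lucene write is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: accumulates node degrees while relationship lines are read.
class DegreeCounter {
    private final Map<String, Integer> degree = new HashMap<>();

    // Parses one line of the form "Author1; Author2; timestamp"
    // and counts the relationship for both endpoints.
    void addRelationshipLine(String line) {
        String[] parts = line.split(";");
        if (parts.length < 2) {
            return; // skip malformed lines
        }
        degree.merge(parts[0].trim(), 1, Integer::sum);
        degree.merge(parts[1].trim(), 1, Integer::sum);
    }

    // Total degree (incoming plus outgoing); 0 for authors never seen.
    int degreeOf(String author) {
        return degree.getOrDefault(author, 0);
    }
}
```

Calling `addRelationshipLine` once per line in the same loop that creates the relationships would leave `degreeOf` ready to be stored for every node when the import finishes. Whether a map with one entry per author fits comfortably in the available heap is something to check on the real data.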

Cheers,

/peter neubauer



[Neo4j] REST API Base URI

2011-09-20 Thread Nuo Yan
I access my neo4j server through the REST API. For security purposes, I put
the neo4j server behind an nginx lb. I'm wondering whether there is a config
entry somewhere that makes the neo4j server return a customized base URI,
which I could set to my LB's URI.

For example, currently creating a node by POSTing to the lb (say
https://10.0.0.1/db/data) returns

{
  "outgoing_relationships" : "http://neo4j/db/data/node/160/relationships/out",
  "data" : {
  },
  "traverse" : "http://neo4j/db/data/node/160/traverse/{returnType}",
  "all_typed_relationships" : "http://neo4j/db/data/node/160/relationships/all/{-list|&|types}",
  "property" : "http://neo4j/db/data/node/160/properties/{key}",
  "self" : "http://neo4j/db/data/node/160",
  "properties" : "http://neo4j/db/data/node/160/properties",
  "outgoing_typed_relationships" : "http://neo4j/db/data/node/160/relationships/out/{-list|&|types}",
  "incoming_relationships" : "http://neo4j/db/data/node/160/relationships/in",
  "extensions" : {
  },
  "create_relationship" : "http://neo4j/db/data/node/160/relationships",
  "paged_traverse" : "http://neo4j/db/data/node/160/paged/traverse/{returnType}{?pageSize,leaseTime}",
  "all_relationships" : "http://neo4j/db/data/node/160/relationships/all",
  "incoming_typed_relationships" : "http://neo4j/db/data/node/160/relationships/in/{-list|&|types}"
}


Is there a config on the neo4j server that I can set to make it either
return the lb URI "https://10.0.0.1" as the base URI or return relative paths
in the result?


[Neo4j] Design help for G+ like app

2011-09-20 Thread ant-1
Hi,

I'm our company's software architect, and I'm new to graph DBs. But as we're
building a Google+-like application, we realized the need for something like
Neo4j. And as this community seems the best, we settled on you guys :)

Anyway. On to the design. Call us fools, but we're trying to redo Google+
(except for kids). I need help with the design, for starters.

Here's the Domain:
- Users
- Users have friends
- Users can place friends in one or more group (circle for G+), groups being
only visible to the user creating them.
- Users can create posts, which are visible either to all of their friends or
only to one or more groups.

I realize the hardest part is to retrieve feeds. For example, I want the
posts feed for user X for his group G. 

Here's what I envision:
- Users are nodes
- Users have FRIEND_WITH relationships (direction being from the initial
requester to the other)
- Groups are nodes.
- Group has a CREATED_BY relationship to user
- Group has BELONGS_TO relationships to multiple users
- Posts are nodes
- Post has CREATED_BY relationship to the user
- Post has VISIBLE_TO relationship to one or more groups
- PostingEvent is a node with a timestamp property
- PostingEvent has a RELATED_TO relationship to the user and the post

And we would have a timeline index (Lucene or B-tree, I have no idea) for
feeds retrieval.
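
As a rough illustration of what such a timeline index has to do, here is a minimal in-memory sketch in plain Java. It is not Neo4j or Lucene code; `GroupFeed`, `publish` and `latest` are hypothetical names, and a sorted map keyed by timestamp stands in for the index. It only shows the access pattern: write a post once per group it is VISIBLE_TO, read newest-first per group.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a per-group timeline index.
class GroupFeed {
    // group id -> (timestamp -> ids of posts published at that timestamp)
    private final Map<String, TreeMap<Long, List<String>>> timelineByGroup = new HashMap<>();

    // Registers a post in the timeline of every group it is visible to.
    void publish(String postId, long timestamp, String... groups) {
        for (String group : groups) {
            timelineByGroup
                    .computeIfAbsent(group, g -> new TreeMap<>())
                    .computeIfAbsent(timestamp, t -> new ArrayList<>())
                    .add(postId);
        }
    }

    // Returns up to 'limit' post ids for the group, newest first.
    List<String> latest(String group, int limit) {
        List<String> feed = new ArrayList<>();
        TreeMap<Long, List<String>> timeline = timelineByGroup.get(group);
        if (timeline == null) {
            return feed;
        }
        for (List<String> posts : timeline.descendingMap().values()) {
            for (String post : posts) {
                if (feed.size() == limit) {
                    return feed;
                }
                feed.add(post);
            }
        }
        return feed;
    }
}
```

In this sketch an "All friends" audience would simply be one more group key, which is one way to look at question 2 below; whether that is the right trade-off in the graph itself is a separate design decision.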

1. Do you see issues with my design?
2. What to do with postings to "All my friends": do I create an "All friends"
group? In that case, do I still need the user-to-user relationships?
3. I have never worked with timeline indexes and such, so I could use some
reading on the subject, even theoretical, even dead-tree books. Please
don't hesitate to make recommendations.

Thanks !

Antoine

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Design-help-for-G-like-app-tp3353185p3353185.html


Re: [Neo4j] how to get the User who has been B Followed who has Followed Back.

2011-09-20 Thread Marko Rodriguez
Hi,
  I have some relation like this:
 http://neo4j-community-discussions.438527.n3.nabble.com/file/n3352328/follow.jpg
 
 what should I do to get the users who has been B Followed and has Followed
 back to B.
 In the image the result should be (A).


 
 In Gremlin (http://docs.neo4j.org/chunked/snapshot/gremlin-plugin.html)
 Marko will provide, have no time to test it to be exact :)

If I understand your query correctly, then it's:

g.v(1).out.filter{it.out[[id:1]].hasNext()}

Start from vertex 1(B), go to its outgoing adjacent neighbors. For each of 
those neighbors, make sure there is at least one link back to vertex 1.
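
The same filter can be spelled out as a minimal sketch in plain Java, outside of Gremlin and Neo4j. The adjacency map and the name `MutualFollow` are hypothetical; the map simply records, for each user, the set of users they follow.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper working on an in-memory "who follows whom" map.
class MutualFollow {
    // follows.get(x) is the set of users that x follows.
    // Returns the users that 'user' follows and that follow 'user' back.
    static Set<String> followedBack(Map<String, Set<String>> follows, String user) {
        Set<String> result = new TreeSet<>();
        Set<String> outgoing = follows.get(user);
        if (outgoing == null) {
            return result;
        }
        for (String candidate : outgoing) {
            Set<String> back = follows.get(candidate);
            // Keep the neighbor only if at least one link leads back to 'user'.
            if (back != null && back.contains(user)) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

The loop over `outgoing` corresponds to the `out` step, and the `back.contains(user)` check plays the role of the `filter{...hasNext()}` closure.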

HTH,
Marko.

http://markorodriguez.com


Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

2011-09-20 Thread Bryce
Hi Niels,

That is probably a good idea.  I will try to get something done around that
soon; I am flat out with work issues/features at present (including a nice
concurrency bug, argh).

Cheers
Bryce

On Wed, Sep 21, 2011 at 2:01 AM, Niels Hoogeveen
pd_aficion...@hotmail.com wrote:


 Hi Bryce,
 Sorry for the late response.
 I understand it's difficult to come up with a really good use-case for
 making NodeCollection more general in the context of IndexedRelationships,
 but I like to think of that interface as something we can eventually use for
 all sorts of collections, not just the ones derived from SortedTree.
 There is of course the issue that relationships can not attach to
 relationships, so collections of relationships will need to be addressed by
 Id. This is not necessarily a bad thing, because it decouples the container
 and the elements. In other words the container knows what elements it
 contains, but the elements don't know in what containers they are placed.
 Another option would be to create shadow nodes for contained relationships.
 Instead of adding a relationships to the collection, its shadow node is
 added and both the shadow node and the relationship contain pointers
 (properties with Id values) towards each other.
 I think it would be best if we do indeed create a GraphCollection interface
 parameterized by <T extends PropertyContainer>, even if that type parameter
 for now is always a Node. It doesn't add much complexity to do it now;
 if we leave it out, we may regret it later on, and by then it becomes harder
 to do because there is an installed base.
 Niels

  Date: Sat, 17 Sep 2011 14:19:04 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface
 
  Hi Niels,
 
  I had wondered about having a collection interface that covered both nodes
  and relationships.  There were a couple of reasons I didn't go with that
  right now, though it is well worthwhile discussing it and going with a
  GraphCollection super interface if it fits properly.
 
  Firstly, I wanted to get something out there so people could have a look,
  and having something that matched what IndexedRelationship currently
  required was the easiest first step.  The biggest thing in there that is
  specific to that functionality is the addNode method returning a
  relationship.
 
  The other issue was more wondering how a relationship collection would
  work.  Say I have a relationship collection, and I have a relationship R1
  between nodes A and B: how am I going to represent that relationship within
  some graph-based data structure in a way that makes sense?  There could be a
  node X that is part of the relationship collection data structure (e.g. a
  tree), and that node could have an attribute that has the relationship id
  on it, but that doesn't seem like it would be very performant.  There could
  be a relationship between X and A that also gave the relationship type of
  R1, so you could find the relationship based on that, but there isn't any
  guarantee of the relationship type being unique.  What it would need to
  properly model it is the ability to have a relationship between X and R1,
  i.e. a relationship from a node to a relationship.
 
  If, instead of being able to add any given relationship to the relationship
  collection, you restrict it to relationships matching a certain criteria
  from a given node, then it is practically the same thing as a relationship
  expander.
 
  Or if you instead have a way through the relationship collection to create
  relationships from a given node to a set of other arbitrary nodes, with the
  relationship collection having a fixed relationship type and direction,
  then that is practically the current IndexedRelationship.
 
  I guess a way it could work is similar to IndexedRelationship, basically a
  more general case of that class, where you have a method on the
  relationship collection, createRelationship(startNode, endNode,
  relationshipType, direction), that is then stored in an internal data
  structure to create a pseudo relationship between the start and end, and
  then being able to iterate over this set of relationships.  Not sure
  exactly what the use case of that would be.  Maybe of more interest could
  be the same situation where the relationship type and direction are fixed;
  then you may have a "friend of" set of relationships that you create
  between arbitrary nodes and then iterate over all of those.
 
  I can't personally think of a good way of adding a set of arbitrary
  relationships into a collection stored in a graph data structure.
 
  Thoughts?
 
  Cheers
  Bryce
 
  P.S. Peter, I had thought to remove the passing in of the graph database
  and instead just get it from the node, or to only pass in the graph
  database and create the node internally.
 
  On Sat, Sep 17, 2011 at 2:19 AM, Niels Hoogeveen
  pd_aficion...@hotmail.com wrote:
 
  
   Hi Bryce,
   I really like what 

Re: [Neo4j] Representing relationship strength

2011-09-20 Thread Tatham Oddie
Relationships can carry a data payload. You could introduce a weight property 
there.

-- Tatham
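
To make the idea concrete, here is a minimal sketch in plain Java rather than the Neo4j API: each edge carries a weight in [0, 1], standing in for a "weight" property on a relationship. The class `WeightedGraph` and the scoring rule (summing the products of weights over two-hop paths such as user to word to topic) are assumptions made for illustration only; they are not taken from the original question and are not a full Bayesian classifier.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory graph whose edges carry a strength between 0 and 1.
class WeightedGraph {
    // from -> (to -> weight); the weight stands in for a relationship property
    private final Map<String, Map<String, Double>> outgoing = new HashMap<>();

    void connect(String from, String to, double weight) {
        if (!(weight >= 0.0 && weight <= 1.0)) {
            throw new IllegalArgumentException("weight must be in [0, 1]: " + weight);
        }
        outgoing.computeIfAbsent(from, k -> new HashMap<>()).put(to, weight);
    }

    // Sums weight products over all two-hop paths from -> middle -> to.
    double twoHopStrength(String from, String to) {
        Map<String, Double> firstHop = outgoing.get(from);
        if (firstHop == null) {
            return 0.0;
        }
        double total = 0.0;
        for (Map.Entry<String, Double> edge : firstHop.entrySet()) {
            Map<String, Double> secondHop = outgoing.get(edge.getKey());
            if (secondHop != null && secondHop.containsKey(to)) {
                total += edge.getValue() * secondHop.get(to);
            }
        }
        return total;
    }
}
```

In an actual Neo4j database the weight would live on the relationship itself, and a traversal would read it while walking from user to topic.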


-Original Message-
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On 
Behalf Of editor
Sent: Wednesday, 21 September 2011 12:52 AM
To: user@lists.neo4j.org
Subject: [Neo4j] Representing relationship strength

I'm looking into a persistent representation of a naive Bayesian classifier 
using a graph database. I have three basic object types: users, words and 
topics. The relationships between these nodes would represent the strength of 
their connection -- a probability between zero and one.

To query the graph I would traverse relationships from user to topic, using the 
strength of connections to represent connectedness. Querying could potentially 
take a more neural net-like form.

I'm still quite naive myself when it comes to graph databases, but a Bayesian 
classifier seems to be a good fit for a graph model like Neo4j.
That said, in my background research I haven't seen a way to represent the 
strength of connections, just the binary relationship of whether two objects 
are connected or not. 

Can anyone comment on the feasibility of a Neo4j implementation of a Bayesian 
classifier? Are there ways I might be able to represent relationship strength 
using Neo4j primitives?

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Representing-relationship-strength-tp3352296p3352296.html


Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node

2011-09-20 Thread Linan Wang
Stephan,
what's the size of your db? if it's under 10G, how about just copying the
full directory into a ramfs? leave 1G to the jvm and it'll do the heavy io
on the ramfs. i think it's a simple solution and could yield
interesting results. please let me know the result if you try it. thanks




-- 
Best regards

Linan Wang


Re: [Neo4j] how to get the User who has been B Followed who has Followed Back.

2011-09-20 Thread iamyuanlong
hi Peter,
   This gets the result. But what if I want to include B's friends too? Should
I use this?
http://neo4j-community-discussions.438527.n3.nabble.com/file/n3354221/follow%26friend.jpg
 
Use:
START b=(node_auto_index,'name:B') MATCH a-[:FOLLOW]->b-[:FOLLOW]->a or
a-[:FRIEND]->b-[:FRIEND]->a  RETURN a 

OR:
START b=(node_auto_index,'name:B') MATCH a-[r]->b-[r]->a  RETURN  where
r.TYPE=FOLLOW or r.TYPE=FRIEND a 

Will both of them return the result, or not?

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-how-to-get-the-User-who-has-been-B-Followed-who-has-Followed-Back-tp3352328p3354221.html


[Neo4j] Querying multivalued properties

2011-09-20 Thread Alexandre de Assis Bento Lima
Hi,

I need multivalued properties in my application. However, I don't know how to
make queries based on them using indexes. I need to search for nodes that have
a certain value inside their multivalued properties (arrays). Does anybody
know how I can do that? I couldn't find anything in the documentation.

Thanks in advance!

Alexandre.