[Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Hello neo4j-community,

I am creating a graph database for a social network. To create the graph database I am using the Batch Inserter, which inserts data from two files.

Files:
1. The first file contains the nodes I want to create (about 3.5M nodes). It looks like this:
     Author 1
     Author 2
     Author 2
     ...
2. The second file contains every relationship between the nodes (about 2.5 billion relationships). It looks like this:
     Author1; Author2; timestamp
     Author2; Author3; timestamp
     Author1; Author3; timestamp
     ...

The specifications of my computer:
  Intel Core i7 3.4 GHz
  16 GB RAM
  GeForce GT 420 1 GB
  2 TB hard drive

My code to create the graph database looks like this:

    package wikiOSN;

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Map;

    import org.neo4j.graphdb.DynamicRelationshipType;
    import org.neo4j.graphdb.index.BatchInserterIndex;
    import org.neo4j.graphdb.index.BatchInserterIndexProvider;
    import org.neo4j.helpers.collection.MapUtil;
    import org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider;
    import org.neo4j.kernel.impl.batchinsert.BatchInserter;
    import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

    public class CreateAndConnectNodes {

        private long relationCounter = 0;

        public static void main(String[] args) throws IOException {
            BufferedReader bf = new BufferedReader(new FileReader(
                    "/media/sdg1/Wikipedia/Reduced Files/autoren-der-wikiartikel"));
            BufferedReader bf2 = new BufferedReader(new FileReader(
                    "/media/sdg1/Wikipedia/Reduced Files/wikipedia-output"));
            CreateAndConnectNodes cacn = new CreateAndConnectNodes();
            cacn.createGraphDatabase(bf, bf2);
        }

        private void createGraphDatabase(BufferedReader bf, BufferedReader bf2)
                throws IOException {
            BatchInserter inserter = new BatchInserterImpl("target/socialNetwork-batchinsert");
            BatchInserterIndexProvider indexProvider =
                    new LuceneBatchInserterIndexProvider(inserter);
            BatchInserterIndex authors =
                    indexProvider.nodeIndex("author", MapUtil.stringMap("type", "exact"));
            authors.setCacheCapacity("name", 10);

            // Pass 1: create one node per line of the author file and index it by name.
            String zeile;
            String zeile2;
            while ((zeile = bf.readLine()) != null) {
                Map<String, Object> properties = MapUtil.map("name", zeile);
                long node = inserter.createNode(properties);
                authors.add(node, properties);
            }
            bf.close();
            System.out.println("Nodes created!");
            authors.flush();

            // Pass 2: read the relationship file and look both endpoints up in the index.
            // The start node is only looked up again when the first column changes.
            String node = "";
            long node1 = 0;
            long node2 = 0;
            while ((zeile2 = bf2.readLine()) != null) {
                if (relationCounter++ % 1 == 0) {
                    System.out.println("Edges already created: " + relationCounter);
                }
                String[] relation = zeile2.split("%;% ");
                if (node.isEmpty()) {
                    node = relation[0];
                    if (authors.get("name", relation[0]).getSingle() != null) {
                        node1 = authors.get("name", relation[0]).getSingle();
                    } else {
                        System.out.println("Autor 1: " + relation[0]);
                        break;
                    }
                }
                if (!node.equals(relation[0])) {
                    node = relation[0];
                    if (authors.get("name", relation[0]).getSingle() != null) {
                        node1 = authors.get("name", relation[0]).getSingle();
                    } else {
                        System.out.println("Autor 1: " + relation[0]);
                        break;
                    }
                }
                if (authors.get("name", relation[1]).getSingle() != null) {
                    node2 = authors.get("name", relation[1]).getSingle();
                } else {
                    System.out.println("Autor 2: " + relation[1]);
                    break;
                }
                Map<String, Object> properties =
                        MapUtil.map("timestamp", Long.parseLong(relation[2].trim()));
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Steven,

the most performant way to insert data with the BatchInserter is to first insert only the nodes from your node file (that should be fast). After that (or at the same time), find a way to generate the relationship file with Neo4j IDs, rather than being forced to look the nodes up in indexes during relationship insertion. The index lookups take the bulk of the time, so if you write your node IDs back to a file, then massage the relationship text file to include the FROM and TO node IDs (e.g. using Perl, Bash or Ruby) and import that file referring to the IDs directly, that should be much faster.

HTH

Cheers,
/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
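The preprocessing step suggested above (write node IDs out, then rewrite the relationship file to carry IDs instead of names) could be sketched in Java roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, the columns are assumed to be separated by ";", and a running counter stands in for the ID that the batch inserter would return when creating each node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical helper, not part of Neo4j: resolves author names to node IDs
// once, up front, so the relationship import needs no index lookups.
final class RelationshipIdRewriter {

    // Assigns an ID to every distinct author name (one name per line).
    // Assumption: in a real import the ID would be the value returned when
    // the node is created; a running counter stands in for it here.
    static Map<String, Long> assignIds(Iterable<String> nodeLines) {
        Map<String, Long> ids = new HashMap<>();
        long next = 1;
        for (String line : nodeLines) {
            String name = line.trim();
            if (!name.isEmpty() && !ids.containsKey(name)) {
                ids.put(name, next++);
            }
        }
        return ids;
    }

    // Rewrites "from; to; timestamp" lines as "fromId;toId;timestamp" and
    // hands each rewritten line to the sink. Lines that are malformed or
    // mention an unknown author are skipped. Returns the number written.
    static long rewrite(Iterable<String> relLines, Map<String, Long> ids,
                        Consumer<String> sink) {
        long written = 0;
        for (String line : relLines) {
            String[] parts = line.split(";");
            if (parts.length != 3) {
                continue;
            }
            Long from = ids.get(parts[0].trim());
            Long to = ids.get(parts[1].trim());
            if (from == null || to == null) {
                continue;
            }
            sink.accept(from + ";" + to + ";" + parts[2].trim());
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        Map<String, Long> ids = assignIds(Arrays.asList("Alice", "Bob", "Bob", "Carol"));
        List<String> out = new ArrayList<>();
        rewrite(Arrays.asList("Alice; Bob; 100", "Bob; Carol; 200"), ids, out::add);
        out.forEach(System.out::println); // prints 1;2;100 then 2;3;200
    }
}
```

For the real files the two Iterables would be streamed from disk line by line rather than held in memory; only the name-to-ID map (about 3.5M entries in this thread) has to fit in RAM.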
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Hi Stephan,

have you set -Xms, -XX:+UseNUMA, and -XX:+UseConcMarkSweepGC? They could speed up the process significantly. Also, if you like, JRockit is fast and free now; give it a try. By the way, which file system are you using? Have you turned off atime?

On Tue, Sep 20, 2011 at 12:00 PM, st3ven st3...@web.de wrote:
> Peter,
>
> the import of the data into the graph database is not the main problem for me. The lookup of nodes from the index is fast enough for me. Creating the database took me nearly half a day.
>
> My main problem is getting the node degree of every node. As I already said, I am using this code to get the node degree of every node:
>
>     for (Node node : db.getAllNodes()) {
>         counter = 0;
>         if (node.getId() > 0) {
>             for (Relationship rel : node.getRelationships()) {
>                 counter++;
>             }
>             System.out.println(node.getProperty("name").toString() + ": " + counter);
>         }
>     }
>
> After 3 days I only got the node degree of 8 nodes, and I want to optimize my traversal, because this is very slow. What can I do to make this faster, or do I have to change my code for getting the node degree? I only posted my import code because I thought I could maybe optimize something there for this traversal.
>
> Thank you very much for your help!
>
> Cheers,
> Stephan
>
> --
> View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Creating-a-graph-database-with-BatchInserter-and-getting-the-node-degree-of-every-node-tp3351599p3351664.html
> Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.

--
Best regards
Linan Wang
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Hi,

I already tried these Java parameters, but that didn't really speed up the process, and I had already turned atime off. The Java parameters I am using right now are:

    -d64 -server -Xms7G -Xmx14G -XX:+UseParallelGC -XX:+UseNUMA

What I've also noticed is that reading from the database is really slow on my hard disk. It reads at just 1 MB/s, sometimes 8 MB/s. My hard disk can normally read and copy files much faster. Also very strange: the disk utilization is around 99% while it is reading at 1 MB/s.

My OS is Ubuntu Linux x64 and my file system is ext4. On the Neo4j wiki I found some performance guides, but these didn't really help. Do you know what else I can do?

Performance guides:
http://wiki.neo4j.org/content/Linux_Performance_Guide
http://wiki.neo4j.org/content/Configuration_Settings

I also added a configuration file, but it seems that my Java program doesn't use all of the RAM.

Thanks for your help!

Cheers,
Stephan
Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface
Hi Bryce,

Sorry for the late response.

I understand it's difficult to come up with a really good use case for making NodeCollection more general in the context of IndexedRelationships, but I like to think of that interface as something we can eventually use for all sorts of collections, not just the ones derived from SortedTree.

There is of course the issue that relationships cannot attach to relationships, so collections of relationships will need to be addressed by Id. This is not necessarily a bad thing, because it decouples the container and the elements. In other words, the container knows what elements it contains, but the elements don't know in what containers they are placed.

Another option would be to create shadow nodes for contained relationships. Instead of adding a relationship to the collection, its shadow node is added, and both the shadow node and the relationship contain pointers (properties with Id values) towards each other.

I think it would be best if we do indeed create a GraphCollection interface parameterized by T extends PropertyContainer, even if that type parameter is for now always a Node. It doesn't add much complexity to do it now; if we leave it out we may regret it later, and by then it becomes harder to do because there is an installed base.

Niels

Date: Sat, 17 Sep 2011 14:19:04 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

Hi Niels,

I had wondered about having a collection interface that covered both nodes and relationships. There were a couple of reasons I didn't go with that right now, though it is well worthwhile discussing it and going with a GraphCollection super interface if it fits properly.

Firstly, I wanted to get something out there so people could have a look, and having something that matched what IndexedRelationship currently required was the easiest first step. The biggest thing in there that is specific to that functionality is the addNode method returning a relationship.

The other issue was more wondering how a relationship collection would work. Say I have a relationship collection, and I have a relationship R1 between nodes A and B: how am I going to represent that relationship within some graph-based data structure in a way that makes sense?

- There could be a node X that is part of the relationship collection data structure (e.g. a tree), and that node could have an attribute holding the relationship id, but that doesn't seem like it would be very performant.
- There could be a relationship between X and A that also gave the relationship type of R1, so you could find the relationship based on that, but there isn't any guarantee of the relationship type being unique.
- What it would need in order to model this properly is the ability to have a relationship between X and R1, i.e. a relationship from a node to a relationship.

If, instead of being able to add any given relationship to the relationship collection, you restrict it to relationships matching certain criteria from a given node, then it is practically the same thing as a relationship expander. Or if you instead have a way, through the relationship collection, to create relationships from a given node to a set of other arbitrary nodes, with the relationship collection having a fixed relationship type and direction, then that is practically the current IndexedRelationship.

I guess a way it could work is similar to IndexedRelationship, basically a more general case of that class: you have a method on the relationship collection, createRelationship(startNode, endNode, relationshipType, direction), whose result is stored in an internal data structure to create a pseudo relationship between the start and end, and you can then iterate over this set of relationships. Not sure exactly what the use case of that would be. Maybe of more interest could be the same situation where the relationship type and direction are fixed: then you may have a "friend of" set of relationships that you create between arbitrary nodes and then iterate over all of those.

I can't personally think of a good way of adding a set of arbitrary relationships into a collection stored in a graph data structure.

Thoughts?

Cheers
Bryce

P.S. Peter, I had thought to remove the passing in of the graph database and instead just get it from the node, or only pass in the graph database and create the node internally.

On Sat, Sep 17, 2011 at 2:19 AM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Hi Bryce,

I really like what you are trying to achieve here. One question: instead of having NodeCollection, why not have GraphCollection<T extends PropertyContainer>? That way we can have collections of both Relationships and Nodes.

Niels

Date: Fri, 16 Sep 2011 17:37:29 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: [Neo4j] Neo4j graph
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Steven,

in this scenario you are reading the entire db, and it is basically cold. Neo4j is not in itself optimized for full graph scans. I see a few solutions for you:

- Store the number of relationships as a property on each node and read only that. This works if the updates to your graph are not too frequent.
- Store the relationship count as a property in an index (e.g. Lucene) and ask the index for all entries. That way you are using an index for what it is good at: global operations over all documents.
- Use HA or just a file copy to replicate the graph on several instances, and send a sharded query to all of them (e.g. count 100K node degrees on each of the instances and return the result). This query is very easy to do in a map/reduce fashion.

Is that feasible?

Cheers,
/peter neubauer
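The first option in Peter's list (keep each node's degree as a stored value instead of walking its relationships) could be prepared at import time. The sketch below is only an illustration and makes several assumptions that are not in the thread: the relationship file has already been rewritten to "fromId;toId;timestamp" lines, node IDs are dense integers starting at 0, and the class name is hypothetical. The resulting array would then be written back as a node property in a separate pass.

```java
import java.util.Arrays;

// Hypothetical helper: counts node degrees in one sequential pass over the
// relationship lines, without touching the graph store at all.
final class DegreeCounter {

    // degree[id] = number of relationship lines that mention node id as
    // either endpoint. A self-loop is counted twice by this scheme.
    static long[] countDegrees(Iterable<String> relLines, int nodeCount) {
        long[] degree = new long[nodeCount];
        for (String line : relLines) {
            String[] parts = line.split(";");
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            int from = Integer.parseInt(parts[0].trim());
            int to = Integer.parseInt(parts[1].trim());
            degree[from]++;
            degree[to]++;
        }
        return degree;
    }

    public static void main(String[] args) {
        long[] degree = countDegrees(
                Arrays.asList("0;1;100", "0;2;200", "1;2;300"), 4);
        System.out.println(Arrays.toString(degree)); // prints [2, 2, 2, 0]
    }
}
```

With about 3.5M nodes the array is small (tens of megabytes), and the pass reads the relationship file strictly sequentially, which avoids the random disk access that a cold full-graph scan incurs.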
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
hi stephan,

I'm wondering whether it makes any difference if you specify the relationship type when counting degrees:

    RelationshipType knows = DynamicRelationshipType.withName("KNOWS");
    Iterable<Relationship> rels = node.getRelationships(knows);
    int count = com.google.common.collect.Iterables.size(rels);

Besides, do you know where the bottleneck is: the node iteration or the relationship retrieval?

--
Best regards
Linan Wang
[Neo4j] Representing relationship strength
I'm looking into a persistent representation of a naive Bayesian classifier using a graph database. I have three basic object types: users, words and topics. The relationships between these nodes would represent the strength of their connection, a probability between zero and one. To query the graph I would traverse relationships from user to topic, using the strength of connections to represent connectedness. Querying could potentially take a more neural-net-like form.

I'm still quite naive myself when it comes to graph databases, but a Bayesian classifier seems to be a good fit for a graph model like Neo4j. That said, in my background research I haven't seen a way to represent the strength of connections, just the binary relationship of whether two objects are connected or not.

Can anyone comment on the feasibility of a Neo4j implementation of a Bayesian classifier? Are there ways I might be able to represent relationship strength using Neo4j primitives?
[Neo4j] how to get the User who has been B Followed who has Followed Back.
hi all,

I have some relationships like this:
http://neo4j-community-discussions.438527.n3.nabble.com/file/n3352328/follow.jpg

How can I get the users whom B follows and who have followed B back? In the image the result should be (A).
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Hello again,

the bottleneck is the iteration. I did some tests to check whether the iteration or the relationship retrieval is too slow. My test results look like this:

    Retrieval:1ms; Counting:158ms; number of edges:116407
    Retrieval:0ms; Counting:2ms; number of edges:1804
    Retrieval:0ms; Counting:0ms; number of edges:22
    Retrieval:0ms; Counting:0ms; number of edges:31
    Retrieval:0ms; Counting:0ms; number of edges:39
    Retrieval:0ms; Counting:2ms; number of edges:1213
    Retrieval:0ms; Counting:0ms; number of edges:57
    Retrieval:0ms; Counting:36ms; number of edges:59420
    Retrieval:0ms; Counting:335ms; number of edges:175156
    Retrieval:1ms; Counting:168ms; number of edges:146697
    Retrieval:0ms; Counting:354ms; number of edges:285051
    Retrieval:0ms; Counting:0ms; number of edges:50
    Retrieval:0ms; Counting:11ms; number of edges:20960
    Retrieval:0ms; Counting:0ms; number of edges:43
    Retrieval:0ms; Counting:0ms; number of edges:51
    Retrieval:0ms; Counting:1ms; number of edges:647
    Retrieval:0ms; Counting:5ms; number of edges:10216
    Retrieval:0ms; Counting:2ms; number of edges:3444
    Retrieval:0ms; Counting:0ms; number of edges:1128
    Retrieval:1ms; Counting:312ms; number of edges:319127
    Retrieval:1ms; Counting:0ms; number of edges:5
    Retrieval:0ms; Counting:760ms; number of edges:104741
    Retrieval:0ms; Counting:11ms; number of edges:9210
    Retrieval:0ms; Counting:0ms; number of edges:31
    Retrieval:1ms; Counting:3ms; number of edges:3116
    Retrieval:0ms; Counting:37ms; number of edges:70835
    Retrieval:0ms; Counting:383ms; number of edges:296445
    Retrieval:1ms; Counting:0ms; number of edges:120
    Retrieval:0ms; Counting:2ms; number of edges:1526
    Retrieval:0ms; Counting:0ms; number of edges:71
    Retrieval:0ms; Counting:42ms; number of edges:35960
    Retrieval:0ms; Counting:90ms; number of edges:9644
    Retrieval:0ms; Counting:186ms; number of edges:129981
    Retrieval:0ms; Counting:1ms; number of edges:1213
    Retrieval:1ms; Counting:143ms; number of edges:124495
    Retrieval:0ms; Counting:0ms; number of edges:58
    Retrieval:0ms; Counting:75ms; number of edges:56195
    Retrieval:0ms; Counting:99ms; number of edges:92574
    Retrieval:0ms; Counting:0ms; number of edges:13
    Retrieval:0ms; Counting:50ms; number of edges:26350
    Retrieval:0ms; Counting:2ms; number of edges:1856
    Retrieval:1ms; Counting:376ms; number of edges:114166
    Retrieval:0ms; Counting:9528ms; number of edges:11956
    Retrieval:0ms; Counting:50047ms; number of edges:12645
    Retrieval:1ms; Counting:43687ms; number of edges:15025

The first results came up very fast; they seem to have been cached, because I ran this quite often. As you can see, the last three results weren't cached, and it took a huge amount of time to iterate over the relationships.

I checked that with the following code:

    for (Node node : db.getAllNodes()) {
        if (node.getId() > 0) {
            long test = System.currentTimeMillis();
            Iterable<Relationship> rels = node.getRelationships(knows);
            System.out.print("Retrieval:" + (System.currentTimeMillis() - test));
            test = System.currentTimeMillis();
            int count = com.google.common.collect.Iterables.size(rels);
            System.out.print("ms; Counting:" + (System.currentTimeMillis() - test));
            System.out.println("ms; number of edges:" + count);
        }
    }

Is there maybe a possibility to cache more relationships, or do you have any idea how to speed up the iteration process?

Thanks for your help again!

Cheers,
Stephan
Re: [Neo4j] how to get the User who has been B Followed who has Followed Back.
In Cypher (http://docs.neo4j.org/chunked/snapshot/cypher-query-lang.html):

    START b=(node_auto_index,'name:B')
    MATCH a-[:FOLLOW]->b, b-[:FOLLOW]->a
    RETURN a

see https://github.com/neo4j/community/blob/d413404a88db989fd289581ecee6e68faec00ace/embedded-examples/src/test/java/org/neo4j/examples/ShortDocumentationExamplesTest.java#L250

In Gremlin (http://docs.neo4j.org/chunked/snapshot/gremlin-plugin.html) Marko will provide; I have no time to test it to be exact :)

HTH

Cheers,
/peter neubauer
Re: [Neo4j] how to get the User who has been B Followed who has Followed Back.
With Cypher you can do:

    start a=(10) match a-[:FOLLOW]->b-[:FOLLOW]->a return b

where 10 is the node Id of your start node (B in your picture); alternatively you can look the start node up in an index.

Cheers.

--
Adriano Almeida
Caelum | Ensino e Inovação
www.caelum.com.br
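To make the intent of the pattern concrete outside of Cypher: a user counts as "followed back" when there is a FOLLOW edge in both directions between that user and B. Below is a minimal plain-Java sketch of that check over an in-memory adjacency map. The class name and the sample data are hypothetical (the linked image is not reproduced here), and nothing in it uses the Neo4j API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of the two-directional FOLLOW match.
final class MutualFollowers {

    // follows.get(x) is the set of users that x follows.
    // Returns every user that 'user' follows and that follows 'user' back.
    static Set<String> followedBack(Map<String, Set<String>> follows, String user) {
        Set<String> result = new TreeSet<>();
        Set<String> followed = follows.getOrDefault(user, Collections.<String>emptySet());
        for (String candidate : followed) {
            Set<String> back = follows.getOrDefault(candidate, Collections.<String>emptySet());
            if (back.contains(user)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: B follows A and C, but only A follows B back.
        Map<String, Set<String>> follows = new HashMap<>();
        follows.put("B", new HashSet<>(Arrays.asList("A", "C")));
        follows.put("A", new HashSet<>(Arrays.asList("B")));
        follows.put("C", new HashSet<>(Arrays.asList("D")));
        System.out.println(followedBack(follows, "B")); // prints [A]
    }
}
```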
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
The retrieval is only virtual, as it is lazy.

When I get back to my machine on Thursday, I am going to run your tests and get back to you. I have made some modifications to the relationship loading and want to see how they affect this.

There are issues with loading lots of relationships with cold caches in a one-by-one use case, because the larger segment caching only kicks in after a certain number of misses in the memory-mapped file loading.

Using an SSD would also speed up your use case. Configuring Neo4j to use more memory for memory mapping would also help.

Cheers
Michael
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
hi stephan,

my theory is that most of the time is spent on retrieving incoming relationships. could you try again, but this time retrieve only outgoing relationships?

    for (Node node : db.getAllNodes()) {
        if (node.getId() > 0) {
            long test = System.currentTimeMillis();
            // Same test as before, restricted to outgoing relationships.
            Iterable<Relationship> rels = node.getRelationships(knows, Direction.OUTGOING);
            System.out.print("Retrieval:" + (System.currentTimeMillis() - test));
            test = System.currentTimeMillis();
            int count = com.google.common.collect.Iterables.size(rels);
            System.out.print("ms; Counting:" + (System.currentTimeMillis() - test));
            System.out.println("ms; number of edges:" + count);
        }
    }
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Hello Peter,

it's a pity that Neo4j doesn't support full graph scans. Is there maybe a possibility to cache more relationships to speed things up a little bit? I noticed that only the iteration over the relationships is taking hours. The time to get all relationships of one node is quite fast.

I think I could try your second solution:

- Store the relationships as a property in an Index (e.g. Lucene) and as the index for all entries. Thus, you are using an index for what it is good at - global operations over all documents.

But I did not understand it correctly. Do you mean an index which stores the ID of a relationship, and creating such an index for every node? Could you maybe give me a code example for that? That would be very kind of you.

The first solution is not really feasible, because I don't know the number of relationships of every node. I would have to count the relationships before the insertion, and that would make my database useless for the node degree query.

Thank you very much for your help!

Cheers,
Stephan

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Creating-a-graph-database-with-BatchInserter-and-getting-the-node-degree-of-every-node-tp3351599p3352509.html
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Steven,

the index is built into the DB, so you can use something like http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-index.html to index all your nodes into Lucene (in one index, with the node as key and the number of relationships as a numeric value, written when creating them). When reading, you would simply request all keys from the index and iterate over them. I am not terribly sure how fast it is, but given that you are just loading up documents, Lucene should be reasonably fast.

Let us know if that works out!

Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
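Peter's suggestion amounts to tallying each node's relationships while they are being created, instead of iterating over them afterwards. Below is a minimal sketch of that counting step only, assuming the relationship file format described at the start of the thread (`Author1; Author2; timestamp`). The class `DegreeCounter` and the method `countDegrees` are hypothetical names, and the sketch deliberately does not call any Neo4j or Lucene API; how the resulting numbers are written to an index or a node property is left open.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Neo4j: tallies node degrees while the
// relationship lines are read, so no relationship chain has to be walked later.
final class DegreeCounter {

    // Assumption: each line looks like "Author1; Author2; timestamp".
    // Every relationship adds one to the degree of both of its end nodes.
    static Map<String, Long> countDegrees(Iterable<String> lines) {
        Map<String, Long> degrees = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split(";");
            if (parts.length < 2) {
                continue; // skip blank or malformed lines
            }
            degrees.merge(parts[0].trim(), 1L, Long::sum);
            degrees.merge(parts[1].trim(), 1L, Long::sum);
        }
        return degrees;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "Author1; Author2; 1316500000",
                "Author2; Author3; 1316500001",
                "Author1; Author3; 1316500002");
        // Prints the degree of every author seen in the sample lines.
        System.out.println(countDegrees(sample));
    }
}
```

The same loop that feeds the batch inserter could update such a map, so the degrees would be known as soon as the import finishes; whether a map for roughly 3.5M authors fits comfortably in the available heap would need to be checked.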
[Neo4j] REST API Base URI
I access my Neo4j server through the REST API. For security purposes, I put the Neo4j server behind an nginx load balancer (LB). I'm wondering if there is a config entry somewhere that makes the Neo4j server return a customized base URI, which I can set to my LB's URI.

For example, currently creating a node by POSTing to the LB (say https://10.0.0.1/db/data) returns

    {
      "outgoing_relationships" : "http://neo4j/db/data/node/160/relationships/out",
      "data" : { },
      "traverse" : "http://neo4j/db/data/node/160/traverse/{returnType}",
      "all_typed_relationships" : "http://neo4j/db/data/node/160/relationships/all/{-list|&|types}",
      "property" : "http://neo4j/db/data/node/160/properties/{key}",
      "self" : "http://neo4j/db/data/node/160",
      "properties" : "http://neo4j/db/data/node/160/properties",
      "outgoing_typed_relationships" : "http://neo4j/db/data/node/160/relationships/out/{-list|&|types}",
      "incoming_relationships" : "http://neo4j/db/data/node/160/relationships/in",
      "extensions" : { },
      "create_relationship" : "http://neo4j/db/data/node/160/relationships",
      "paged_traverse" : "http://neo4j/db/data/node/160/paged/traverse/{returnType}{?pageSize,leaseTime}",
      "all_relationships" : "http://neo4j/db/data/node/160/relationships/all",
      "incoming_typed_relationships" : "http://neo4j/db/data/node/160/relationships/in/{-list|&|types}"
    }

Is there a config on the Neo4j server that I can set to make it either return the LB URI https://10.0.0.1 as the base URI or return relative paths in the result?
[Neo4j] Design help for G+ like app
Hi,

I'm our company's software architect, and I'm new to graph DBs. But as we're building a Google+-like app, we realized the need for something like Neo4j. And as this community seems the best, we settled on you guys :)

Anyway, onto the design. Call us fools, but we're trying to redo Google+ (except for kids). I need help with the design, for starters.

Here's the domain:

- Users
- Users have friends
- Users can place friends in one or more groups (circles in G+), groups being visible only to the user who created them.
- Users can create posts, which are visible either to all of their friends or only to one or more groups.

I realize the hardest part is retrieving feeds. For example, I want the posts feed for user X for his group G.

Here's what I envision:

- Users are nodes
- Users have FRIEND_WITH relationships (the direction going from the initial requester to the other user)
- Groups are nodes
- A Group has a CREATED_BY relationship to a user
- A Group has BELONGS_TO relationships to multiple users
- Posts are nodes
- A Post has a CREATED_BY relationship to the user
- A Post has VISIBLE_TO relationships to one or more groups
- A PostingEvent is a node with a timestamp property
- A PostingEvent has a RELATED_TO relationship to the user and the post

And we would have a timeline index (Lucene or B-tree, I have no idea) for feed retrieval.

1. Do you see issues with my design?
2. What should I do with postings to "All my friends"? Do I create an "All friends" group? In that case, do I still need the user-to-user relationships?
3. I have never worked with timeline indexes and such, so I could use some reading on the subject, even theoretical, even dead-tree books.

Please don't hesitate to make recommendations.

Thanks!
Antoine

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Design-help-for-G-like-app-tp3353185p3353185.html
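To make the feed question concrete, here is a small in-memory sketch of the proposed model, with no Neo4j calls at all. It assumes one possible reading of "the posts feed for user X for his group G": posts whose author is a member of G and that are visible to at least one group containing X. The class `FeedSketch`, its methods and the group ids such as `x:school` are hypothetical, and the linear scan plus sort stands in for whatever timeline index ends up being used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of users, groups and posts.
final class FeedSketch {

    static final class Post {
        final String author;               // CREATED_BY in the proposed model
        final String text;
        final long timestamp;              // the PostingEvent timestamp
        final Set<String> visibleToGroups; // VISIBLE_TO in the proposed model

        Post(String author, String text, long timestamp, Set<String> visibleToGroups) {
            this.author = author;
            this.text = text;
            this.timestamp = timestamp;
            this.visibleToGroups = visibleToGroups;
        }
    }

    // group id -> members (BELONGS_TO in the proposed model)
    private final Map<String, Set<String>> groupMembers = new HashMap<>();
    private final List<Post> posts = new ArrayList<>();

    void addToGroup(String groupId, String user) {
        groupMembers.computeIfAbsent(groupId, k -> new HashSet<>()).add(user);
    }

    void post(String author, String text, long timestamp, String... groupIds) {
        posts.add(new Post(author, text, timestamp, new HashSet<>(Arrays.asList(groupIds))));
    }

    private boolean isMember(String groupId, String user) {
        Set<String> members = groupMembers.getOrDefault(groupId, Collections.emptySet());
        return members.contains(user);
    }

    // Posts written by members of readerGroup that the reader may see, newest first.
    List<Post> feedFor(String reader, String readerGroup) {
        List<Post> feed = new ArrayList<>();
        for (Post post : posts) {
            if (!isMember(readerGroup, post.author)) {
                continue; // author is not in the group the reader is looking at
            }
            for (String groupId : post.visibleToGroups) {
                if (isMember(groupId, reader)) {
                    feed.add(post);
                    break; // one matching group is enough
                }
            }
        }
        feed.sort(Comparator.comparingLong((Post p) -> p.timestamp).reversed());
        return feed;
    }
}
```

Under this reading, an "All friends" group is just another group id whose member set is the full friend list, which is one way to look at question 2.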
Re: [Neo4j] how to get the User who has been B Followed who has Followed Back.
Hi, I have some relation like this: http://neo4j-community-discussions.438527.n3.nabble.com/file/n3352328/follow.jpg What should I do to get the users who have been followed by B and have followed B back? In the image the result should be (A).

In Gremlin (http://docs.neo4j.org/chunked/snapshot/gremlin-plugin.html) Marko will provide, have no time to test it to be exact :)

If I understand your query correctly, then it's:

    g.v(1).out.filter{it.out[[id:1]].hasNext()}

Start from vertex 1 (B) and go to its outgoing adjacent neighbors. For each of those neighbors, make sure there is at least one link back to vertex 1.

HTH,
Marko.

http://markorodriguez.com
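The logic of Marko's traversal can also be spelled out in plain Java over an in-memory adjacency map. This is only a sketch of the idea, not a translation that runs against Neo4j or Gremlin; the class `MutualFollow` and its methods are hypothetical, and the sample edges in `main` are assumptions, since the linked image is not reproduced here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory follow graph.
final class MutualFollow {
    // follower -> users that follower follows
    private final Map<String, Set<String>> follows = new HashMap<>();

    void follow(String from, String to) {
        follows.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    private Set<String> followedBy(String user) {
        return follows.getOrDefault(user, Collections.emptySet());
    }

    // Users that `user` follows and that follow `user` back.
    Set<String> followedBack(String user) {
        Set<String> result = new TreeSet<>();
        for (String candidate : followedBy(user)) {     // step 1: outgoing neighbors
            if (followedBy(candidate).contains(user)) { // step 2: is there a link back?
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        MutualFollow graph = new MutualFollow();
        // Assumed sample edges, chosen so that only A follows B back.
        graph.follow("B", "A");
        graph.follow("A", "B");
        graph.follow("B", "C");
        graph.follow("D", "B");
        System.out.println(graph.followedBack("B"));
    }
}
```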
Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface
Hi Niels,

Probably is a good idea. I will try to get something done around that soon; flat out with work issues/features at present (including a nice concurrency bug, argh).

Cheers
Bryce

On Wed, Sep 21, 2011 at 2:01 AM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Hi Bryce,

Sorry for the late response.

I understand it's difficult to come up with a really good use case for making NodeCollection more general in the context of IndexedRelationships, but I like to think of that interface as something we can eventually use for all sorts of collections, not just the ones derived from SortedTree.

There is of course the issue that relationships cannot attach to relationships, so collections of relationships will need to be addressed by Id. This is not necessarily a bad thing, because it decouples the container and the elements. In other words, the container knows what elements it contains, but the elements don't know in what containers they are placed.

Another option would be to create shadow nodes for contained relationships. Instead of adding a relationship to the collection, its shadow node is added, and both the shadow node and the relationship contain pointers (properties with Id values) towards each other.

I think it would be best if we do indeed create a GraphCollection interface parameterized by T extends PropertyContainer, even if that type parameter for now is always a Node. It doesn't add much complexity to do it now, and later on we may regret not having done it, and by then it becomes harder to do because there is an installed base.

Niels

Date: Sat, 17 Sep 2011 14:19:04 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

Hi Niels,

I had wondered about having a collection interface that covered both nodes and relationships. There were a couple of reasons I didn't go with that right now, though it is well worth discussing and going with a GraphCollection super interface if it fits properly.

Firstly, I wanted to get something out there so people could have a look, and having something that matched what IndexedRelationship currently required was the easiest first step. The biggest thing in there that is specific to that functionality is the addNode method returning a relationship.

The other issue was more wondering how a relationship collection would work. Say I have a relationship collection, and I have a relationship R1 between node A and B. How am I going to represent that relationship within some graph-based data structure in a way that makes sense?

There could be a node X that is part of the relationship collection data structure (e.g. tree), and that node could have an attribute that has the relationship id on it, but that doesn't seem like it would be very performant.

There could be a relationship between X and A that also gave the relationship type of R1, so you could find the relationship based on that, but there isn't any guarantee of the relationship type being unique.

What it would need to properly model it is the ability to have a relationship between X and R1, i.e. a relationship from a node to a relationship.

If, instead of being able to add any given relationship to the relationship collection, you restrict it to relationships matching a certain criteria from a given node, then it is practically the same thing as a relationship expander.

Or if you instead have a way through the relationship collection to create relationships from a given node to a set of other arbitrary nodes, with the relationship collection having a fixed relationship type and direction, then that is practically the current IndexedRelationship.

I guess a way it could work is similar to IndexedRelationship, basically a more general case of that class, where you have a method on the relationship collection, createRelationship(startNode, endNode, relationshipType, direction), that is then stored in an internal data structure to create a pseudo relationship between the start and end, and then being able to iterate over this set of relationships. Not sure exactly what the use case of that would be.

Maybe of more interest could be the same situation where the relationship type and direction are fixed; then you may have a "friend of" set of relationships that you create between arbitrary nodes and then iterate over all of those.

I can't personally think of a good way of adding a set of arbitrary relationships into a collection stored in a graph data structure. Thoughts?

Cheers
Bryce

P.S. Peter, I had thought to remove the passing in of the graph database and instead just get it from the node, or only pass in the graph database and create the node internally.

On Sat, Sep 17, 2011 at 2:19 AM, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Hi Bryce, I really like what
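
The parameterized interface Niels describes, together with the "addressed by Id" idea, could look roughly like the sketch below. Everything in it is an assumption made for illustration: `PropertyContainer`, `Node` and `Relationship` are tiny stand-ins for the Neo4j interfaces of the same names so that the example compiles on its own (in Neo4j itself `getId()` is declared on `Node` and `Relationship`, not on the shared interface), and the method set of `GraphCollection` as well as the class `IdBackedCollection` are hypothetical, not the API of the graph collections project.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-ins for the Neo4j interfaces of the same names, reduced to what the
// sketch needs. Assumption: getId() sits on the shared interface only to keep
// the example short.
interface PropertyContainer {
    long getId();
}

interface Node extends PropertyContainer { }

interface Relationship extends PropertyContainer { }

// Hypothetical general interface, parameterized as proposed in the thread.
interface GraphCollection<T extends PropertyContainer> extends Iterable<T> {
    // Returns true if the item was not yet part of the collection.
    boolean add(T item);

    boolean remove(T item);

    int size();
}

// Hypothetical node-only specialisation.
interface NodeCollection extends GraphCollection<Node> { }

// In-memory implementation that addresses its elements by id: the container
// knows its elements, the elements know nothing about the container.
final class IdBackedCollection<T extends PropertyContainer> implements GraphCollection<T> {
    private final Map<Long, T> itemsById = new LinkedHashMap<>();

    @Override
    public boolean add(T item) {
        return itemsById.putIfAbsent(item.getId(), item) == null;
    }

    @Override
    public boolean remove(T item) {
        return itemsById.remove(item.getId()) != null;
    }

    @Override
    public int size() {
        return itemsById.size();
    }

    @Override
    public Iterator<T> iterator() {
        return itemsById.values().iterator();
    }
}
```

Because the type parameter is bounded rather than fixed, the same implementation can hold nodes today and relationships later without changing the interface, which is the point about the installed base.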
Re: [Neo4j] Representing relationship strength
Relationships can carry a data payload. You could introduce a weight property there.

-- Tatham

-----Original Message-----
From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of editor
Sent: Wednesday, 21 September 2011 12:52 AM
To: user@lists.neo4j.org
Subject: [Neo4j] Representing relationship strength

I'm looking into a persistent representation of a naive Bayesian classifier using a graph database. I have three basic object types: users, words and topics. The relationships between these nodes would represent the strength of their connection -- a probability between zero and one. To query the graph I would traverse relationships from user to topic, using the strength of connections to represent connectedness. Querying could potentially take a more neural-net-like form.

I'm still quite naive myself when it comes to graph databases, but a Bayesian classifier seems to be a good fit for a graph model like Neo4j. That said, in my background research I haven't seen a way to represent the strength of connections, just the binary relationship of whether two objects are connected or not.

Can anyone comment on the feasibility of a Neo4j implementation of a Bayesian classifier? Are there ways I might be able to represent relationship strength using Neo4j primitives?

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Representing-relationship-strength-tp3352296p3352296.html
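Tatham's weight property can be illustrated without Neo4j by keeping the weight on each edge of an in-memory graph. The sketch below is hypothetical throughout (class `WeightedGraph`, methods `connect` and `score`), and the scoring rule, summing the products of weights along user -> word -> topic paths, is just one simple way to combine strengths; it is not a full naive Bayes classifier. In Neo4j the same number would presumably be stored on the relationship with `setProperty("weight", value)` and read back during traversal.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory graph whose edges carry a strength between 0 and 1.
final class WeightedGraph {
    // start node -> (end node -> weight of the relationship between them)
    private final Map<String, Map<String, Double>> weights = new HashMap<>();

    void connect(String from, String to, double weight) {
        if (weight < 0.0 || weight > 1.0) {
            throw new IllegalArgumentException("weight must be between 0 and 1");
        }
        weights.computeIfAbsent(from, k -> new HashMap<>()).put(to, weight);
    }

    private Map<String, Double> outgoing(String node) {
        return weights.getOrDefault(node, Collections.emptyMap());
    }

    // Sums weight(user, word) * weight(word, topic) over all words that link the two.
    double score(String user, String topic) {
        double total = 0.0;
        for (Map.Entry<String, Double> word : outgoing(user).entrySet()) {
            Double wordToTopic = outgoing(word.getKey()).get(topic);
            if (wordToTopic != null) {
                total += word.getValue() * wordToTopic;
            }
        }
        return total;
    }
}
```

For example, with user -> w1 at 0.5, user -> w2 at 0.2, w1 -> topic at 0.4 and w2 -> topic at 0.5, the score is 0.5 * 0.4 + 0.2 * 0.5, which is about 0.3.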
Re: [Neo4j] Creating a graph database with BatchInserter and getting the node degree of every node
Stephan,

what's the size of your db? if it's under 10G, how about just dumping the full directory into a ramfs? leave 1G to the jvm and it'll do its heavy io on the ramfs. i think it's a simple solution and could yield interesting results. please let me know the result if you try it. thanks

--
Best regards
Linan Wang
Re: [Neo4j] how to get the User who has been B Followed who has Followed Back.
hi Peter,

This gets the result. But what if I want to include B's friends too? Should I use this? http://neo4j-community-discussions.438527.n3.nabble.com/file/n3354221/follow%26friend.jpg

Use:

    START b=(node_auto_index,'name:B') MATCH a-[:FOLLOW]-b-[:FOLLOW]-a or a-[:FRIEND]-b-[:FRIEND]-a RETURN a

OR:

    START b=(node_auto_index,'name:B') MATCH a-[r]-b-[r]-a RETURN where r.TYPE=FOLLOW or r.TYPE=FRIEND a

Can both of them get the result or not?

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-how-to-get-the-User-who-has-been-B-Followed-who-has-Followed-Back-tp3352328p3354221.html
[Neo4j] Querying multivalued properties
Hi,

I need multivalued properties in my application. However, I don't know how to make queries based on them using indexes. I need to search for nodes that have a certain value inside their multivalued properties (arrays). Does anybody know how I can do that? I couldn't find anything in the documentation.

Thanks in advance!

Alexandre.