2010/7/23 Arijit Mukherjee <ariji...@gmail.com> > Thanx to both of you. Yes, I can just check whether the label exists > on the node or not. In my case checking for Integer.MIN_VALUE which is > what is assigned when the subscriber node is created. > To assign a temporary value (or a value representing the state "not assigned") seems unecessary. A better way would be to not set that property on creating a node and then use:
node.getProperty( "whatever key", Integer.MIN_VALUE ); when getting that property. > BTW - is it ever possible to label the components while creating the > graph? I can't think of any way of doing this - but I might be missing > something... > > Regards > Arijit > > On 22 July 2010 20:54, Vitor De Mario <vitordema...@gmail.com> wrote: > > As far as the algorithm goes, I see nothing wrong. Connected components > is a > > well known problem in graph theory, and you're doing just fine. > > > > I second the recommendations of Tobias, specially the second one, as you > > would get rid of the labelled collection completely, and that improves > you > > both in time and memory. > > > > []'s > > Vitor > > > > > > On Thu, Jul 22, 2010 at 11:35 AM, Tobias Ivarsson < > > tobias.ivars...@neotechnology.com> wrote: > > > >> The first obvious thing is that labelled.contains(currentNode.getId()) > is > >> going to take more time as your dataset grows, since it's a linear > search > >> for the element in an ArrayList. A HashSet would be a much more > appropriate > >> data structure for your application. > >> > >> The other thing that comes to mind is the memory overhead of the > >> labelled-collection. Eventually it is going to contain every node in the > >> graph, and be very large. This "steals" some of the memory that could > have > >> been used for caching the graph, forcing Neo4j to do more I/O than it > would > >> have if it could have used that memory for cache. > >> > >> Would it be possible for you to replace > >> the !labelled.contains(currentNode.getId())-check with > >> currentNode.getProperty("componentID",null) == null? Or are there > >> situations > >> where the node could have that property and not be considered labeled? > >> > >> Cheers, > >> Tobias > >> > >> On Thu, Jul 22, 2010 at 3:35 PM, Arijit Mukherjee <ariji...@gmail.com > >> >wrote: > >> > >> > Hi All > >> > > >> > I'm trying to label all connected components in a graph - i.e. all > >> > nodes that are connected will have a common "componentID" property > >> > set. I'm using the Traverser to do this. For each node in the graph > >> > (unless it is already labelled, which I track by inserting the node ID > >> > in a list), the traverser finds out all the neighbours using BFS, and > >> > then the node and all the neighbours are labelled with a certain > >> > value. The code is something like this - > >> > > >> > Iterable<Node> allNodes = graphDbService.getAllNodes(); > >> > ArrayList labelled = new ArrayList(); > >> > for (Node currentNode : allNodes) { > >> > if (currentNode.hasProperty("number") && > >> > !labelled.contains(currentNode.getId())) { > >> > Traverser traverser = > >> > currentNode.traverse(Order.BREADTH_FIRST, > >> > StopEvaluator.END_OF_GRAPH, > >> > ReturnableEvaluator.ALL_BUT_START_NODE, > >> > RelTypes.CALLS, Direction.BOTH); > >> > int currentID = initialID; > >> > initialID++; > >> > currentNode.setProperty("componentID", currentID); > >> > labelled.add(currentNode.getId()); > >> > for (Node friend : traverser) { > >> > friend.setProperty("componentID", currentID); > >> > // mark each node as labelled > >> > labelled.add(friend.getId()); > >> > } > >> > } > >> > } > >> > > >> > This works well for a small graph (2000 nodes). But for a graph of > >> > about 1 million nodes, this is taking about 45 minutes on a 64-bit > >> > Intel 2.3GHz CPU, 4GB RAM (Java 1.6 update 21 and Neo4J 1.0). > >> > > >> > Is this normal? Or is the code I'm using faulty? Is there any other > >> > way to label the connected components? > >> > > >> > Regards > >> > Arijit > >> > > >> > -- > >> > "And when the night is cloudy, > >> > There is still a light that shines on me, > >> > Shine on until tomorrow, let it be." > >> > _______________________________________________ > >> > Neo4j mailing list > >> > User@lists.neo4j.org > >> > https://lists.neo4j.org/mailman/listinfo/user > >> > > >> > >> > >> > >> -- > >> Tobias Ivarsson <tobias.ivars...@neotechnology.com> > >> Hacker, Neo Technology > >> www.neotechnology.com > >> Cellphone: +46 706 534857 > >> _______________________________________________ > >> Neo4j mailing list > >> User@lists.neo4j.org > >> https://lists.neo4j.org/mailman/listinfo/user > >> > > _______________________________________________ > > Neo4j mailing list > > User@lists.neo4j.org > > https://lists.neo4j.org/mailman/listinfo/user > > > > > > -- > "And when the night is cloudy, > There is still a light that shines on me, > Shine on until tomorrow, let it be." > _______________________________________________ > Neo4j mailing list > User@lists.neo4j.org > https://lists.neo4j.org/mailman/listinfo/user > -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com _______________________________________________ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user