2010/7/23 Arijit Mukherjee <ariji...@gmail.com>

> Thanx to both of you. Yes, I can just check whether the label exists
> on the node or not. In my case checking for Integer.MIN_VALUE which is
> what is assigned when the subscriber node is created.
>
To assign a temporary value (or a value representing the state "not
assigned") seems unecessary. A better way would be to not set that property
on creating a node and then use:

    node.getProperty( "whatever key", Integer.MIN_VALUE );

when getting that property.


> BTW - is it ever possible to label the components while creating the
> graph? I can't think of any way of doing this - but I might be missing
> something...
>
> Regards
> Arijit
>
> On 22 July 2010 20:54, Vitor De Mario <vitordema...@gmail.com> wrote:
> > As far as the algorithm goes, I see nothing wrong. Connected components
> is a
> > well known problem in graph theory, and you're doing just fine.
> >
> > I second the recommendations of Tobias, specially the second one, as you
> > would get rid of the labelled collection completely, and that improves
> you
> > both in time and memory.
> >
> > []'s
> > Vitor
> >
> >
> > On Thu, Jul 22, 2010 at 11:35 AM, Tobias Ivarsson <
> > tobias.ivars...@neotechnology.com> wrote:
> >
> >> The first obvious thing is that labelled.contains(currentNode.getId())
> is
> >> going to take more time as your dataset grows, since it's a linear
> search
> >> for the element in an ArrayList. A HashSet would be a much more
> appropriate
> >> data structure for your application.
> >>
> >> The other thing that comes to mind is the memory overhead of the
> >> labelled-collection. Eventually it is going to contain every node in the
> >> graph, and be very large. This "steals" some of the memory that could
> have
> >> been used for caching the graph, forcing Neo4j to do more I/O than it
> would
> >> have if it could have used that memory for cache.
> >>
> >> Would it be possible for you to replace
> >> the !labelled.contains(currentNode.getId())-check with
> >> currentNode.getProperty("componentID",null) == null? Or are there
> >> situations
> >> where the node could have that property and not be considered labeled?
> >>
> >> Cheers,
> >> Tobias
> >>
> >> On Thu, Jul 22, 2010 at 3:35 PM, Arijit Mukherjee <ariji...@gmail.com
> >> >wrote:
> >>
> >> > Hi All
> >> >
> >> > I'm trying to label all connected components in a graph - i.e. all
> >> > nodes that are connected will have a common "componentID" property
> >> > set. I'm using the Traverser to do this. For each node in the graph
> >> > (unless it is already labelled, which I track by inserting the node ID
> >> > in a list), the traverser finds out all the neighbours using BFS, and
> >> > then the node and all the neighbours are labelled with a certain
> >> > value. The code is something like this -
> >> >
> >> >        Iterable<Node> allNodes = graphDbService.getAllNodes();
> >> >        ArrayList labelled = new ArrayList();
> >> >        for (Node currentNode : allNodes) {
> >> >            if (currentNode.hasProperty("number") &&
> >> > !labelled.contains(currentNode.getId())) {
> >> >                Traverser traverser =
> >> > currentNode.traverse(Order.BREADTH_FIRST,
> >> >                        StopEvaluator.END_OF_GRAPH,
> >> >                        ReturnableEvaluator.ALL_BUT_START_NODE,
> >> >                        RelTypes.CALLS, Direction.BOTH);
> >> >                int currentID = initialID;
> >> >                initialID++;
> >> >                currentNode.setProperty("componentID", currentID);
> >> >                labelled.add(currentNode.getId());
> >> >                for (Node friend : traverser) {
> >> >                    friend.setProperty("componentID", currentID);
> >> >                    // mark each node as labelled
> >> >                    labelled.add(friend.getId());
> >> >                }
> >> >           }
> >> >     }
> >> >
> >> > This works well for a small graph (2000 nodes). But for a graph of
> >> > about 1 million nodes, this is taking about 45 minutes on a 64-bit
> >> > Intel 2.3GHz CPU, 4GB RAM (Java 1.6 update 21 and Neo4J 1.0).
> >> >
> >> > Is this normal? Or is the code I'm using faulty? Is there any other
> >> > way to label the connected components?
> >> >
> >> > Regards
> >> > Arijit
> >> >
> >> > --
> >> > "And when the night is cloudy,
> >> > There is still a light that shines on me,
> >> > Shine on until tomorrow, let it be."
> >> > _______________________________________________
> >> > Neo4j mailing list
> >> > User@lists.neo4j.org
> >> > https://lists.neo4j.org/mailman/listinfo/user
> >> >
> >>
> >>
> >>
> >> --
> >> Tobias Ivarsson <tobias.ivars...@neotechnology.com>
> >> Hacker, Neo Technology
> >> www.neotechnology.com
> >> Cellphone: +46 706 534857
> >> _______________________________________________
> >> Neo4j mailing list
> >> User@lists.neo4j.org
> >> https://lists.neo4j.org/mailman/listinfo/user
> >>
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
> >
>
>
>
> --
> "And when the night is cloudy,
> There is still a light that shines on me,
> Shine on until tomorrow, let it be."
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to