The union find algorithm would be appropriate for labeling connected
components as you build the graph.

Andrew Baine

On Thursday, July 22, 2010, Arijit Mukherjee <ariji...@gmail.com> wrote:
> Thanx to both of you. Yes, I can just check whether the label exists
> on the node or not. In my case checking for Integer.MIN_VALUE which is
> what is assigned when the subscriber node is created.
>
> BTW - is it ever possible to label the components while creating the
> graph? I can't think of any way of doing this - but I might be missing
> something...
>
> Regards
> Arijit
>
> On 22 July 2010 20:54, Vitor De Mario <vitordema...@gmail.com> wrote:
>> As far as the algorithm goes, I see nothing wrong. Connected components is a
>> well known problem in graph theory, and you're doing just fine.
>>
>> I second the recommendations of Tobias, specially the second one, as you
>> would get rid of the labelled collection completely, and that improves you
>> both in time and memory.
>>
>> []'s
>> Vitor
>>
>>
>> On Thu, Jul 22, 2010 at 11:35 AM, Tobias Ivarsson <
>> tobias.ivars...@neotechnology.com> wrote:
>>
>>> The first obvious thing is that labelled.contains(currentNode.getId()) is
>>> going to take more time as your dataset grows, since it's a linear search
>>> for the element in an ArrayList. A HashSet would be a much more appropriate
>>> data structure for your application.
>>>
>>> The other thing that comes to mind is the memory overhead of the
>>> labelled-collection. Eventually it is going to contain every node in the
>>> graph, and be very large. This "steals" some of the memory that could have
>>> been used for caching the graph, forcing Neo4j to do more I/O than it would
>>> have if it could have used that memory for cache.
>>>
>>> Would it be possible for you to replace
>>> the !labelled.contains(currentNode.getId())-check with
>>> currentNode.getProperty("componentID",null) == null? Or are there
>>> situations
>>> where the node could have that property and not be considered labeled?
>>>
>>> Cheers,
>>> Tobias
>>>
>>> On Thu, Jul 22, 2010 at 3:35 PM, Arijit Mukherjee <ariji...@gmail.com
>>> >wrote:
>>>
>>> > Hi All
>>> >
>>> > I'm trying to label all connected components in a graph - i.e. all
>>> > nodes that are connected will have a common "componentID" property
>>> > set. I'm using the Traverser to do this. For each node in the graph
>>> > (unless it is already labelled, which I track by inserting the node ID
>>> > in a list), the traverser finds out all the neighbours using BFS, and
>>> > then the node and all the neighbours are labelled with a certain
>>> > value. The code is something like this -
>>> >
>>> >        Iterable<Node> allNodes = graphDbService.getAllNodes();
>>> >        ArrayList labelled = new ArrayList();
>>> >        for (Node currentNode : allNodes) {
>>> >            if (currentNode.hasProperty("number") &&
>>> > !labelled.contains(currentNode.getId())) {
>>> >                Traverser traverser =
>>> > currentNode.traverse(Order.BREADTH_FIRST,
>>> >                        StopEvaluator.END_OF_GRAPH,
>>> >                        ReturnableEvaluator.ALL_BUT_START_NODE,
>>> >                        RelTypes.CALLS, Direction.BOTH);
>>> >                int currentID = initialID;
>>> >                initialID++;
>>> >                currentNode.setProperty("componentID", currentID);
>>> >                labelled.add(currentNode.getId());
>>> >                for (Node friend : traverser) {
>>> >                    friend.setProperty("componentID", currentID);
>>> >                    // mark each node as labelled
>>> >                    labelled.add(friend.getId());
>>> >                }
>>> >           }
>>> >     }
>>> >
>>> > --
> "And when the night is cloudy,
> There is still a light that shines on me,
> Shine on until tomorrow, let it be."
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>

-- 
Andrew Baine
cell: 202-494-8716
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to