Re: [sage-trac] #19014: Tarjan Strongly Connected Components Algorithm

sage-trac Fri, 14 Aug 2015 07:25:12 -0700

#19014: Tarjan Strongly Connected Components Algorithm
-------------------------------------+-------------------------------------
       Reporter:  borassi            |        Owner:
           Type:  enhancement        |       Status:  needs_review
       Priority:  major              |    Milestone:  sage-6.9
      Component:  graph theory       |   Resolution:
       Keywords:  Strongly           |    Merged in:
  connected components, Tarjan       |    Reviewers:
        Authors:  Michele Borassi    |  Work issues:
Report Upstream:  N/A                |       Commit:
         Branch:                     |  0f855c6676d6f10a6f68b64b4c2be693c08eba91
  u/borassi/tarjan_strongly_connected_components_algorithm|     Stopgaps:
   Dependencies:                     |
-------------------------------------+-------------------------------------


Comment (by dcoudert):

 > > I have some remarks:
 > > - Do we really need to have both
 `_tarjan_strongly_connected_components` and
 `tarjan_strongly_connected_components` ? Same for other methods. What's
 the motivation behind?
 >
 > Yes, I think I need both methods. The point is that I want to implement
 a fast algorithm for closeness centrality ![1], and the first step is
 computing SCCs. In that implementation, I will first convert the graph to
 a `static_sparse_graph`, and then I will have to run the SCC algorithm on
 this graph, using the C routine `_tarjan_strongly_connected_components`. I
 also need the other routine to access
 `_tarjan_strongly_connected_components` from Python code (for instance, in
 a .py file). Do you have better ideas?

 If you plan to call these methods directly from other cython files, it
 make sense to split the method in two pieces.
 With Nathann we recently started to name such methods
 `compute_cool_stuff_C` rather than `_compute_cool_stuff`. I don't know if
 there is a general rule for that.

 > > - Have you tried using an array of short instead of a bitset for
 `in_scc_stack` ? We don't have memory issue here so we can save some
 operations, plus you already do that for `visited`
 >
 > Hmmm, I should really understand better how bitset work. I used it only
 because I thought it was faster than standard arrays. In any case, I
 removed all bitsets from this code.

 Well, bitsets are really fast, and to be honest we save very little using
 an array instead. The code is shorter. This is already something.

 > > - I don't really see the benefit of using numpy (I don't know it). Is
 it just more convenient or also faster that other methods? You could use
 something like
 > > {{{
 > > d = {i:list() for i in range(nscc)}
 > > for u,i in scc.iteritems():
 > > d[i].append(u)
 > > output = [d[i] for i in range(nscc)]
 > > }}}
 >
 > Well, probably the problem is that I'm a theoretical guy, and if the
 algorithm is linear I want a linear implementation. This is why I used an
 array and not a dictionary (and, as far as I know, numpy is the only way
 to build an array of Python objects). However, if we use a dictionary, the
 running-time is almost the same, even when there are a lot of SCCs. I
 deleted numpy and now I use dictionaries.
 >
 > {{{
 > WITH NUMPY
 > sage: g = DiGraph(10000000)
 > sage: %timeit g.strongly_connected_components()
 > 1 loops, best of 3: 9.44 s per loop
 >
 > WITH DICTIONARIES
 > sage: g = DiGraph(10000000)
 > sage: %timeit g.strongly_connected_components()
 > 1 loops, best of 3: 9.87 s per loop
 > }}}

 another option (don't know if it is faster or not)
 {{{
 cdef int i
 cdef list output = list(list() for i in range(nscc))  # We cannot use []
 here
 for i,v in enumerate(G.vertex_iterator()):
     output[scc[i]].append(v)
 }}}

 Method `strongly_connected_components_digraph`:
 - is not currently called by
 `DiGraph().strongly_connected_components_digraph()` and will certainly
 never be
 - seems more complicated than what it should. Indeed, as soon as you have
 a mapping `vertex->scc` you can simply iterate over the edges `(u,v)` of
 the input digraph and add edges `(scc[u],scc[v])` to the output digraph. I
 don't see the need for the `_strongly_connected_components_digraph` method
 and it is rather complicated.



 General comments:
 - `if visited[u] == 0:` -> `if not visited[u]`
 - `return [output, {v:scc[i] for i,v in enumerate(G.vertices())}]` ->
 `return output, {v:scc[i] for i,v in enumerate(G.vertices())}`  no need
 for explicitely creating a list
 - you have broken doctests caused by this ticket. I have not tracked the
 dependencies.
 {{{
 sage -t --long src/sage/monoids/automatic_semigroup.py
 **********************************************************************
 File "src/sage/monoids/automatic_semigroup.py", line 135, in
 sage.monoids.automatic_semigroup.AutomaticSemigroup
 Failed example:
     map(sorted, M.j_classes())
 Expected:
     [[[], [2]], [[1, 1], [1]]]
 Got:
     [[[1], [1, 1]], [[], [2]]]
 **********************************************************************
 File "src/sage/monoids/automatic_semigroup.py", line 137, in
 sage.monoids.automatic_semigroup.AutomaticSemigroup
 Failed example:
     M.j_classes_of_idempotents()
 Expected:
     [[[]], [[1, 1]]]
 Got:
     [[[1, 1]], [[]]]
 **********************************************************************
 File "src/sage/monoids/automatic_semigroup.py", line 139, in
 sage.monoids.automatic_semigroup.AutomaticSemigroup
 Failed example:
     M.j_transversal_of_idempotents()
 Expected:
     [[], [1, 1]]
 Got:
     [[1, 1], []]
 **********************************************************************
 1 item had failures:
    3 of  80 in sage.monoids.automatic_semigroup.AutomaticSemigroup
     [264 tests, 3 failures, 1.94 s]
 sage -t --long src/sage/categories/finite_semigroups.py
 **********************************************************************
 File "src/sage/categories/finite_semigroups.py", line 119, in
 
sage.categories.finite_semigroups.FiniteSemigroups.ParentMethods.j_transversal_of_idempotents
 Failed example:
     sorted(S.j_transversal_of_idempotents())
 Expected:
     ['a', 'ab', 'ac', 'acb', 'b', 'c', 'cb']
 Got:
     ['a', 'ab', 'abc', 'ac', 'b', 'bc', 'c']
 }}}

--
Ticket URL: <http://trac.sagemath.org/ticket/19014#comment:13>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.

Re: [sage-trac] #19014: Tarjan Strongly Connected Components Algorithm

Reply via email to