#8372: split up incidence_matrix() over graph.py and digraph.py
-------------------------------+--------------------------------------------
   Reporter:  mvngu            |       Owner:  rlm       
       Type:  enhancement      |      Status:  needs_work
   Priority:  major            |   Milestone:  sage-4.3.4
  Component:  graph theory     |    Keywords:            
     Author:  Minh Van Nguyen  |    Upstream:  N/A       
   Reviewer:                   |      Merged:            
Work_issues:                   |  
-------------------------------+--------------------------------------------
Changes (by mvngu):

  * status:  needs_info => needs_work


Comment:

 Replying to [comment:2 rbeezer]:
 > Columns of the matrix get sorted (as column vectors) once they are all
 built.  I recognize this is the current behavior, and that this makes a
 "pretty" matrix, highlighting structure.  But now it is much harder to
 connect columns back to the edges they correspond to.

 To me, it doesn't make any difference whether or not we have the
 statements
 {{{
 #!python
 cols.sort()
 return matrix(cols, sparse=sparse).transpose()
 }}}
 to sort the columns before feeding them to the matrix constructor. I don't
 see any benefits either way mainly because to me incidence matrices don't
 care about edge labels. For undirected graphs, the incidence matrix with
 sorted columns
 {{{
 [0 0 1]
 [0 1 0]
 [1 0 0]
 [1 1 1]
 }}}
 and the incidence matrix without sorted columns
 {{{
 [1 0 0]
 [0 1 0]
 [0 0 1]
 [1 1 1]
 }}}
 result in isomorphic graphs. I have commented out the line that sorts the
 columns. If in the future anyone wants to sort the columns as per the
 above two lines, they could add a keyword, say, `sort=[True|False]` which
 defaults to `False`. For the moment, I leave this enhancement out of the
 patch as it is beyond the scope of what this ticket is about.
 [[BR]][[BR]]


 > Second question.  I'm uncertain about putting a "2" into the incidence
 matrix for a loop.  I'd prefer to think of a loop in a digraph (one
 without multiple edges) as a directed edge back to the same vertex, so I'd
 sum +1 and -1 at that vertex (to get zero).

 Here are some motivations for having "2" designating self-loops:

  * For an undirected graph `G` with unoriented incidence matrix `M`, a row
 sum corresponds to the degree of a vertex. This holds even if `G` has
 self-loops or `G` is a multi-undirected graph. Any column sum is equal to
 2 for the following cases: `G` is simple, `G` has self-loops, or `G` is a
 multi-undirected graph. If self-loops are ignored, i.e. assigned the value
 "0" to the relevant entry in a row, I see no easy way to reconstruct the
 original graph from the incidence matrix.
  * Let `G` be an undirected (or multi-undirected) graph with oriented
 incidence matrix `M`, and say we use "2" to designate self-loops. We can
 use the follow method to recover the vertex degree. For each row `r` of
 `M`, let `n` (meaning "negative") be the frequency of "-1" in `r`, let `p`
 (meaning "positive") be the frequency of "1" in `r`, and let `t` (meaning
 "two") be twice the frequency of "2" in `r` because a self-loop
 contributes two to the total degree of a vertex. Then the degree of the
 vertex corresponding to `r` is given by `n + p + t`. Using "0" for self-
 loops, I don't see any easy way to recover the degree of a vertex. Now in
 general, each column sum of `M` is not always zero. But this loss is
 compensated for by the knowledge that: the degree of any vertex is
 recoverable; if the sum of any column `c` is 2, then `c` corresponds to a
 self-loop. Using "0" for self-loops would result in the loss of two vital
 pieces of information: the exact degree of each vertex; and whether or not
 an edge is a self-loop. Another loss is that we cannot easily (accurately)
 recover `G` from `M`.
  * The argument for the case of digraphs (or multidigraphs) is similar to
 that for the oriented incidence matrix of an undirected (or multi-
 undirected) graph. That is, using "0" for self-loops would result in the
 loss of such information as: the exact degree of each vertex; whether a
 column represents a self-loop; and no easy way to accurately recover the
 original (multi)digraph from its incidence matrix.
 [[BR]][[BR]]


 > You can construct a {{{DiGraph}}} from an incidence matrix, and this
 form of construction explicitly checks for a single +1 and a single -1 in
 each column.  So the (hopefully) inverse processes below don't work.  My
 suggestion of a zero breaks also.  Maybe throw an error for a
 {{{DiGraph}}} with loops?  Using a zero preserves the property that the
 sum of the rows of the matrix is the all-zeros vector, which I think is as
 important some of the column-sum properties.

 The more I delve into the graph theory code to enhance the incidence
 matrix implementation, the greater is the urge to first update NetworkX to
 version 1.0.1. With the definition of incidence matrix as documented in
 the patch, one could rely on that definition to modify the graph and
 digraph constructors to take account of the case where "2" represents
 self-loops. I have thought about implementing this. But half way through
 my modification of the relevant constructors, I get the feeling that any
 more patches to the graph theory module of Sage would result in more bit
 rotting of the patches at ticket #7608 for upgrading to NetworkX 1.0.1,
 and more work to get NetworkX upgraded.
 [[BR]][[BR]]


 > Haven't thought about how all this affects getting the Laplacian as the
 product of the incidence matrix with its transpose (which might make a
 nice doctest).

 I believe the modification I proposed won't affect the computation of
 Laplacian matrices from incidence matrices of undirected loopless graphs.
 [[BR]][[BR]]


 {{{
 sage: D=DiGraph({0:[0], 1:[2]})
 sage: D.incidence_matrix()
 [ 0  2]
 [-1  0]
 [ 1  0]
 sage: E=DiGraph(D.incidence_matrix())
 <BOOM>
 ValueError: Non-symmetric or non-square matrix assumed to be an incidence
 matrix: There must be two nonzero entries (-1 & 1) per column.
 }}}

 This can be fixed by modifying the constructors of graph and digraph.
 [[BR]][[BR]]

 As much as I want to get the incidence matrix enhancement patch into Sage
 as soon as possible, I think the upgrading to NetworkX 1.0.1 now takes
 high priority on my todo list. Producing a new patch for the incidence
 matrix enhancement is easy, but the more difficult task is to ensure that
 patches at #7608 don't bit rot. After taking a day off thinking about this
 matter, I'll postpone the current ticket and instead work on #7608. I have
 uploaded a half-finished patch that improves on the previous version. I
 want to put it here so that I could resume work on it later on. Sorry for
 having troubled you to review the code so far.

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/8372#comment:4>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to