#8372: split up incidence_matrix() over graph.py and digraph.py
-------------------------------+--------------------------------------------
Reporter: mvngu | Owner: rlm
Type: enhancement | Status: needs_work
Priority: major | Milestone: sage-4.3.4
Component: graph theory | Keywords:
Author: Minh Van Nguyen | Upstream: N/A
Reviewer: | Merged:
Work_issues: |
-------------------------------+--------------------------------------------
Changes (by mvngu):
* status: needs_info => needs_work
Comment:
Replying to [comment:2 rbeezer]:
> Columns of the matrix get sorted (as column vectors) once they are all
built. I recognize this is the current behavior, and that this makes a
"pretty" matrix, highlighting structure. But now it is much harder to
connect columns back to the edges they correspond to.
To me, it doesn't make any difference whether or not we have the
statements
{{{
#!python
cols.sort()
return matrix(cols, sparse=sparse).transpose()
}}}
to sort the columns before feeding them to the matrix constructor. I don't
see any benefits either way mainly because to me incidence matrices don't
care about edge labels. For undirected graphs, the incidence matrix with
sorted columns
{{{
[0 0 1]
[0 1 0]
[1 0 0]
[1 1 1]
}}}
and the incidence matrix without sorted columns
{{{
[1 0 0]
[0 1 0]
[0 0 1]
[1 1 1]
}}}
result in isomorphic graphs. I have commented out the line that sorts the
columns. If in the future anyone wants to sort the columns as per the
above two lines, they could add a keyword, say, `sort=[True|False]` which
defaults to `False`. For the moment, I leave this enhancement out of the
patch as it is beyond the scope of what this ticket is about.
[[BR]][[BR]]
> Second question. I'm uncertain about putting a "2" into the incidence
matrix for a loop. I'd prefer to think of a loop in a digraph (one
without multiple edges) as a directed edge back to the same vertex, so I'd
sum +1 and -1 at that vertex (to get zero).
Here are some motivations for having "2" designating self-loops:
* For an undirected graph `G` with unoriented incidence matrix `M`, a row
sum corresponds to the degree of a vertex. This holds even if `G` has
self-loops or `G` is a multi-undirected graph. Any column sum is equal to
2 for the following cases: `G` is simple, `G` has self-loops, or `G` is a
multi-undirected graph. If self-loops are ignored, i.e. assigned the value
"0" to the relevant entry in a row, I see no easy way to reconstruct the
original graph from the incidence matrix.
* Let `G` be an undirected (or multi-undirected) graph with oriented
incidence matrix `M`, and say we use "2" to designate self-loops. We can
use the follow method to recover the vertex degree. For each row `r` of
`M`, let `n` (meaning "negative") be the frequency of "-1" in `r`, let `p`
(meaning "positive") be the frequency of "1" in `r`, and let `t` (meaning
"two") be twice the frequency of "2" in `r` because a self-loop
contributes two to the total degree of a vertex. Then the degree of the
vertex corresponding to `r` is given by `n + p + t`. Using "0" for self-
loops, I don't see any easy way to recover the degree of a vertex. Now in
general, each column sum of `M` is not always zero. But this loss is
compensated for by the knowledge that: the degree of any vertex is
recoverable; if the sum of any column `c` is 2, then `c` corresponds to a
self-loop. Using "0" for self-loops would result in the loss of two vital
pieces of information: the exact degree of each vertex; and whether or not
an edge is a self-loop. Another loss is that we cannot easily (accurately)
recover `G` from `M`.
* The argument for the case of digraphs (or multidigraphs) is similar to
that for the oriented incidence matrix of an undirected (or multi-
undirected) graph. That is, using "0" for self-loops would result in the
loss of such information as: the exact degree of each vertex; whether a
column represents a self-loop; and no easy way to accurately recover the
original (multi)digraph from its incidence matrix.
[[BR]][[BR]]
> You can construct a {{{DiGraph}}} from an incidence matrix, and this
form of construction explicitly checks for a single +1 and a single -1 in
each column. So the (hopefully) inverse processes below don't work. My
suggestion of a zero breaks also. Maybe throw an error for a
{{{DiGraph}}} with loops? Using a zero preserves the property that the
sum of the rows of the matrix is the all-zeros vector, which I think is as
important some of the column-sum properties.
The more I delve into the graph theory code to enhance the incidence
matrix implementation, the greater is the urge to first update NetworkX to
version 1.0.1. With the definition of incidence matrix as documented in
the patch, one could rely on that definition to modify the graph and
digraph constructors to take account of the case where "2" represents
self-loops. I have thought about implementing this. But half way through
my modification of the relevant constructors, I get the feeling that any
more patches to the graph theory module of Sage would result in more bit
rotting of the patches at ticket #7608 for upgrading to NetworkX 1.0.1,
and more work to get NetworkX upgraded.
[[BR]][[BR]]
> Haven't thought about how all this affects getting the Laplacian as the
product of the incidence matrix with its transpose (which might make a
nice doctest).
I believe the modification I proposed won't affect the computation of
Laplacian matrices from incidence matrices of undirected loopless graphs.
[[BR]][[BR]]
{{{
sage: D=DiGraph({0:[0], 1:[2]})
sage: D.incidence_matrix()
[ 0 2]
[-1 0]
[ 1 0]
sage: E=DiGraph(D.incidence_matrix())
<BOOM>
ValueError: Non-symmetric or non-square matrix assumed to be an incidence
matrix: There must be two nonzero entries (-1 & 1) per column.
}}}
This can be fixed by modifying the constructors of graph and digraph.
[[BR]][[BR]]
As much as I want to get the incidence matrix enhancement patch into Sage
as soon as possible, I think the upgrading to NetworkX 1.0.1 now takes
high priority on my todo list. Producing a new patch for the incidence
matrix enhancement is easy, but the more difficult task is to ensure that
patches at #7608 don't bit rot. After taking a day off thinking about this
matter, I'll postpone the current ticket and instead work on #7608. I have
uploaded a half-finished patch that improves on the previous version. I
want to put it here so that I could resume work on it later on. Sorry for
having troubled you to review the code so far.
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/8372#comment:4>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sage-trac?hl=en.