On Wednesday, 20 June 2012 00:06:11 UTC+3, Nathann Cohen wrote: > > Hellooooooo everybody !! > > As in the best times, I tried 5 times to answer all who answered this > thread, got angry, erased the message, and began again. But Robert Miller > has a nice saying when we get to such situations : "shut the fuck up and > show me the code". So that's what I will try to do. Give examples instead > of wondering aloud whether the world has gone mad. > > Be warned, for it is a long mail (I --love-- long emails). It also gives > some idea of what is happening in Sage's graph library these days, if > anybody (but me) cares :-) > > 1) Some history : the BipartiteGraph class. > ----------------------------------------------------------- > The purpose of this class was to *at long last* have Graph classes that > represent graph properties. There are two possibilities when writing such a > thing : > a) make it inherit all the Graph functions > b) make it a new class > The problem when picking b) is that this class gets totally useless as it > has none of the interesting methods of Graph. So a) was picked, and of > course most if not all methods of BipartiteGraph break when called on a > BipartiteGraph object. > The reason why is easy : some of the most fundamental graph operations are > invalid. Like add_edge. >
IMHO it's a problem of the Graph class itself. There should be no add_edge or remove_edge there at all. All this mutability stuff must be done elsewhere. (In a subclass DurtyMutableGraph, if you wish) As, indeed, these operations will break most of specific graph properties. > I assure you that not being able to add an edge in a graph is *very* > problematic. > The code ? > > Try the following : > sage: BipartiteGraph(graphs.CompleteBipartiteGraph(3,3)).complement() > > Of course, the complement of a bipartite graph needs not be bipartite, > right ? Yeah, but the graph methods do not know that, nor should they mind > most of the time. Try that : > > sage: BipartiteGraph(graphs.CompleteBipartiteGraph(3,3)).independent_set() > > Basically all graphs functions would need to be rewritten to work with > BipartiteGraph. The only sane design choice would be for BipartiteGraph > *not* to inherit Graph methods, but of course it would be pretty empty and > all methods would need to be copied anyway. > > There is another problem with Bipartite Graph. The add_edge function needs > to be wrapped so that no edge is added to the graph that creates an odd > cycle. This is a costly operation. *VERY* costly operation (compared to > add_edge, of course). And so it was not done this way, for efficiency. > Hence instead of checking whether the graph "plus the new edge" stays > bipartite, the vertices are bicolored from the start, and an exception is > raised when add_edge tries to add an edge that does not fit *the former > bicoloring*. > That's mostly incorrect, does not correspond to the user's intent, but it > is....faster >_< > > sage: g = BipartiteGraph(5) > sage: g.add_edge(0,1) > > This could could be fixed by checking that the graph remains bipartite > each time an edge is added, but of course that comes with a cost, and a > cost that it is not healthy to pay when you *just* want to add an edge. The > most natural explanation behind is that you usually know what you are doing > when you add an edge to a graph, and that you should know for yourself > whether you graph remains bipartite or not. If not, please check it with > is_bipartite when you need it, but *do not make everybody's code 10000 > times slower out of sheer laziness*. > > At some point somebody proposed to override the complement() function so > that the complement of a graph stays bipartite : this can be achieved by > switching edges and nonedges *between the two sets* and nowhere else. > Though this operation makes perfect sense in graph theory, it would of > course be very dangereous to replace the .complement() function by doing > that, for all algorithms that take a BipartiteGraph as an instance and > compute graph complements would then return wrong result *and not know it*. > > Sooooooo, these are the problems created by a property we want to keep on > a graph. It is not *yet* about immutable graphs, because NONE of these > methods would be available at all. This would also prevent you from running > any method that think they can use the other ones, even when they do not > actually change the graph itself (like independent_set()). > > I said it several times, but for me this class should be removed. It is > not just bad from the computational point of view, but it also creates > problems that new users have no idea how to cope with, because what the > methods do is not what they have in mind. > > 2) Immutable graphs : storing the graph's properties > ----------------------------------------------------------------------- > > Recently in Sage was merged a patch which is an interface with the graph > class database ISGCI [1,2]. There are many "graph classes" (let us > understand this as 'graph property' for the length of this discussion), and > this database can tell us to some extend which properties are implied by > others. When I first wrote this patch, what I had in mind is precisely what > Dima mentionned : some algorithms work for general graphs and have a bad > running time, some other algorithms have a better running time because they > are only meant to be run on graphs with a specific property [3]. > This is tempting, because the graph theory litterature is full of this > kind of algorithms "which work only if my graph has this and this and > that". Hence the goal was to run on each graph the implementation most > adapted to it. > > ~~~~~~ > By the way Dima, there are "very sane ways" to write > perfect-graph-specific algorithms in Graph.py, hence that can be accessed > for all graphs. What would you think of : > Graph.independent_set(algorithm = "perfect_graph") ? > But it would return rubbish on a non-perfect graph, won't it? This (non-generic) implementation of independent_set() for perfect graphs does not belong to this class. It must be in a subclass implementing perfect graphs, which should overload independent_set() with the more efficient version. It would not be the default algorithm, it would not be run by mistake, and > because perfect independent set are not only interesting for perfect > graphs, this is actually what would make the most sense. No new class, no > overhead over the most important graph functions... > ~~~~~~ > > Of course, we cannot attempt, each time independent_set is run, to check > all graphs properties for which we have a specific algorithm. Unless the > current implementation (very bad, but -correct-) is changed, it is very > costly to check whether a graph is perfect. Much more than to compute an > independent set, actually. > > Hence it would be nice to have a class of immutable graphs, to which could > be associated (cached) some properties that would not have to be computed > over and over afterwards. The most natural question, now : what about the > a) and b) options from the BipartiteGraph class ? Should they inherit all > methods, or none ? > My personal advice is clear --> absolutely none. Unless somebody can think > of a way to avoid all the problems we had with BipartiteGraph. Hence > immutable graphs would only be used for "storage purposes", and the > combinat guys already (repeatedly) asked for such a thing, as they > sometimes need their graphs to be hashable (and to store them in sets, or > as dictionary keys). It also happened to me recently, and I had to store > spar6_string instead. Not so bad, but not a good answer either. > I also wrote some kind of static graph structure myself -- low level ones > -- because you can perform some optimizations when you know from the start > that the graph will not be modified. Having a static graph structure thus > is a way to have a faster data structure, which we use for instance to > compute distances between all pairs of vertices [4]. > In particular, the purpose of these speedups is to save the time it takes > to call has_edge, or to iterate over the neighbors of a graph. > In particular, it is totally unthinkable that graphs you actually want to > compute things on could be immutable. We already lose time compared to > networkX for stupid things like add/remove edges, there is no way on earth > that add_edge would copy the whole thing. > > Dima, I have been working with Thomas on groups recently, with GAP and > Magma, which was pretty new to me. Of course their objects are usually > immutable, of course you create them once and never touch them, but the way > graph theory works is totally different. We spend our lifetimes adding and > removing edges, and vertices, and building a complete graph on n vertices > edge by edge *cannot* take n^4 time, which is what happens if you copy the > graph each time you want to add an edge Try working with GRAPE in GAP. Surely you can add and remove edges there. I am perfectly aware of this. GRAPE is not object-oriented in any way, that's why you can do whatever you want with your graph. Honestly. All algorithms are built like that. Any greedy algorithm works > like that, and graph theory has a lot of them. If you want to compute > matchings, if you want to compute decompositions, god, everything is being > done step by step, corrected 10 times, and you just cannot afford to copy > your whole data each time. > > 3) Splitting the files > --------------------------- > > We have something like "three files" for graph methods (data structures > excluded). generic_graph, graph, and digraph. The graph one is pretty > thick, and should be even thicker considering that some methods stored in > generic_graph should be moved to graph.py (old mistakes, most probably > mine). I do not see among the methods any clear way to split them (I > already had to list all graphs methods thematically twice, each time I > reached the same conclusion -- that it is totally arbitrary and unnatural), > so I will refrain from doing it myself. Of course anybody else can write a > patch doing that, which somebody else will review, Sage is a free software > so my personal advice does not matter much. But I do not see how it could > be done smartly. > Some recent additions to Sage's graph library were done in modules. These > functions are not necessarily made available through functions from the > Graph class, but can be imported. > I think that it is the best way to procede now that the algorithms we add > get more and more specific, somehow "research-level", at least in the view > I have of graphs ("topological things", like connectivity, distances, > sparse graphs, that stuff). > > Having a large amount of graph function available in modules only is also > a way to get the users used to query the documentation when they wonder > what Sage can do, and not only the automatic completion of "g.<tab>". Many > graph methods are not that fundamental anyway, and the same way that it > would not make sense to import them systematically to Graph, there is no > sense in creating a "GraphThatYouWantToColor" class whose methods would be > the coloring-related stuff (currently stored in graph_coloring.py). > > Hence, as this thread originally suggested, I think the best is to work > with modules, and to lessen the load of Graph.py by "importing" the useful > functions there, which just requires one line (thank you Florent for your > trick !!!). > I am sorry to only think of the matrix0, matrix1, matrix2 and matrix3 > files as a joke. The only reason I see to keep them this way is that "we > always kept them this way", and that people are usually scared of backward > compatibility. Of course, I do not know this code at all, so I will not > come and dirty it with my hands. > Jason suggested that it originally came from a technical problem in > Cython, and that makes sense. Honestly this is just a terrible design, and > unless you are forced to do it, you just don't. That's common sense. You do > not use object-oriented programming to create files matrix0...matrix3. > That's confusing, that's... > Come on, that's just a mistake. > It shows. > > I mean.... The thing is that for me importing stuff from module to classes > just solves everything at no cost. Of course I may have missed things, in > which case somebody will probably tell me what eventually, but why look > anywhere else if we have an answer ? Is there a bad side to this way of > doing ? I work with group theory these days, so I am used to missing > obvious things :-) > > ------------- > > Well. By the academic world's standard this mail is long only forgot > unimportant things, so I better resume filling boxes as I will be leaving > Belgium the day after tomorrow. Gosh, my appartment is a mess ^^; > I am writing this using a network connection via a borrowed 3G usb stick, and it only works outdoors here :-) So I better continue later next week. Dima > > Oh, and by the way I expect to be very close to totally unreachable for a > long period, probably two months and a half, from early july to end of > september. If anybody around happens to be in southeast Asia during that > time (Thailand/Vietnam/Cambodge/Laos), I will be backpacking there, let's > meet somewhere !!! > > Haaaaaaaaaaaaaaaaaaaaaaaaaaave > fuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuun ! :-) > > Nathann > > [1] http://www.graphclasses.org/ > [2] http://combinat.sagemath.org/doc/reference/sage/graphs/isgci.html > [3] Of course Dima I only talked about methods *implemented* in Sage > [4] > http://www.sagemath.org/doc/reference/sage/graphs/distances_all_pairs.html > [5] > http://www.sagemath.org/doc/reference/sage/graphs/base/static_sparse_graph.html > -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org