On Aug 4, 1:16 pm, Robert Miller <r...@rlmiller.org> wrote: > So you want Sage Sets to implement "a < b" to mean "a is a subset of > b"? I'll admit that that is reasonable, and it is a fact that it > follows Python convention.
My initial reaction for this is affirmative: If python makes a choice, then sage must follow because it is a python library. There are other areas of mathematics where "<" gets used for proper inclusion. A group theorist is going to be very surprised if for two groups H,G, the expression "H < G" is valid but does not mean "H is a subgroup of G". > But I think that the Python convention is > bizarre, especially given how they implement sorting lists. Indeed, but documented http://docs.python.org/library/stdtypes.html#set-types-set-frozenset : "Since sets only define partial ordering (subset relationships), the output of the list.sort() method is undefined for lists of sets." > I have to say I disagree > with the implicit statement I am reading in your message which is that > "<" can only be "decidedly mathematical" if we define it to mean > subset. In the category of sets (finite sets for that matter), I cannot think of a meaning for "<". If you have a total ordering on the elements then "lex sorted" ordering makes sense of course, but now you are working in the category of "finite subsets of a totally ordered set". > Groebner bases are also decidedly mathematical, and define all > sorts of arbitrary total orderings... Indeed, and computer algebra systems (the good ones anyway) take great care in making the choice of monomial ordering explicit. > For things like the adjacency matrix, having a > consistent if arbitrary total ordering of the vertices is important. But then an adjacency matrix is something that is defined for a graph *PLUS* an ordering of its vertices. A graph only has associated to it an "adjacency class", which is the conjugacy class of one of its adjacency matrices under permutation matrices, i.e.: { P.transpose()* adjacency_matrix * P : P in permutation matrices of the right size } I agree that's not something you want to work with in a computer algebra system, but then perhaps a graph should be given by a *sequence* of vertices rather than a set. Just a "the 3 dimensional vector space over R" also comes with an ordered basis whenever you do computations with it. > On the other hand, if the __lt__ etc. methods implement orderings that > aren't total orderings, sorting gets very messy. In particular, users > will be complaining about all sorts of Sage objects falling into the > same trap where sorted(L) returns a list which isn't consistent. (Or > maybe I'm the only person here who cares...) Python 3 is clear on the matter: If there is not a natural ordering, then __lt__() is going to return a type error. In addition, for sets "<" is commandeered for a partial ordering, so routines that assume that __lt__() is consistent with a total ordering are going to return undefined results, and this is documented. I for one would not expect "sorted" to work on arbitrary iterables, so I'd be happy with an error and disappointed with an undefined result, until I read the relevant documentation. > As a thought exercise, let's persue the partial ordering idea, fixing > Python's frozensets as our vertices. [...] > We want to define a > consistent ordering on the vertices, so we use their hashes. You could use id() instead if you ensure uniqueness of keys... This is guaranteed to be unique and unchanged for the lifetime of the object, so you won't get collisions. In CPython it is implemented as the address where the object is stored, but a garbage collected implementation of Python would probably have to do something else. [OK, so that one doesn't work] An alternative approach would be to realise that vertices need to be stored in a sequence that has quick membership testing, for instance a dictionary {'v1': 0, 'v2':1} etc. Depending on the type of access, perhaps you would also need to keep the thing as a sequence ['v1','v2']. In magma, this data structure is an "indexed set". Including an element adds it at the end, if it doesn't already exist in the indexed set. Probably, deletion should follow list semantics. I don't know how easy it is to make such a data structure efficient and what the memory penalty is for having both hashing and ordering. > On another angle, Python has clearly dealt with this problem > internally, but they don't seem willing to share this with their > users. I haven't checked the implementation, but I suspect that for membership test and lookup, they use hash-tables. > I love this kind of stuff [sorted([frozenset,....]) examples] this is documented to be undefined. > So, given that we want to have graphs whose vertex sets are sets, or > subspaces, what is to be done? Don't rely on sorting, but use other methods for getting consistent orderings and efficient membership tests. > Also, consider the fact that many Sage functions use "return > sorted(output)" to guarantee a consistent ordering of the output. What > you're advocating means that this wouldn't work in many cases... Yes. Personally, I would be in favour of actively randomizing the order in which results are returned whenever the results are not supposed to be returned in any particular order. So many subtle bugs creep into code due to people assuming results are returned in some particular order and then in some future edition of the code, this order is changed and the code unexpectedly breaks. If you need one return value out of many with a particular property, explicitly select on that property. Otherwise, deal with the fact that you are working with *some* return value. I understand that consistent string representation of output is necessary for doctesting, but then you can just do a sort in the doctest, in the spirit of: sage: f = PolynomialRing(ComplexNumbers(100),"x").random_polynomial() sage: L = f.roots() sage: sorted(L, key = lambda v: [v[0].abs(),v[0].arg()]) or test on internal consistency sage: sum([v[1]: v in L]) == f.degree() and all( f(r[0]).close_to_zero() for r in L ) In short: - I don't believe sorting is essential for programming many things efficiently - I don't think routines should be going out of their way to sort their output - The reality is that Python will not support total ordering in many cases anymore. -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org