On Aug 4, 1:16 pm, Robert Miller <r...@rlmiller.org> wrote:
> So you want Sage Sets to implement "a < b" to mean "a is a subset of
> b"? I'll admit that that is reasonable, and it is a fact that it
> follows Python convention.

My initial reaction for this is affirmative: If python makes a choice,
then sage must follow because it is a python library.

There are other areas of mathematics where "<" gets used for proper
inclusion. A group theorist is going to be very surprised if for two
groups H,G, the expression "H < G" is valid but does not mean "H is a
subgroup of G".

> But I think that the Python convention is
> bizarre, especially given how they implement sorting lists.

Indeed, but documented 
http://docs.python.org/library/stdtypes.html#set-types-set-frozenset
: "Since sets only define partial ordering (subset relationships), the
output of the list.sort()  method is undefined for lists of sets."

> I have to say I disagree
> with the implicit statement I am reading in your message which is that
> "<" can only be "decidedly mathematical" if we define it to mean
> subset.
In the category of sets (finite sets for that matter), I cannot think
of a meaning for "<". If you have a total ordering on the elements
then "lex sorted" ordering makes sense of course, but now you are
working in the category of "finite subsets of a totally ordered set".

> Groebner bases are also decidedly mathematical, and define all
> sorts of arbitrary total orderings...

Indeed, and computer algebra systems (the good ones anyway) take great
care in making the choice of monomial ordering explicit.

> For things like the adjacency matrix, having a
> consistent if arbitrary total ordering of the vertices is important.

But then an adjacency matrix is something that is defined for a graph
*PLUS* an ordering of its vertices. A graph only has associated to it
an "adjacency class", which is the conjugacy class of one of its
adjacency matrices under permutation matrices, i.e.:

{ P.transpose()* adjacency_matrix * P : P in permutation matrices of
the right size }

I agree that's not something you want to work with in a computer
algebra system, but then perhaps a graph should be given by a
*sequence* of vertices rather than a set. Just a "the 3 dimensional
vector space over R" also comes with an ordered basis whenever you do
computations with it.

> On the other hand, if the __lt__ etc. methods implement orderings that
> aren't total orderings, sorting gets very messy. In particular, users
> will be complaining about all sorts of Sage objects falling into the
> same trap where sorted(L) returns a list which isn't consistent. (Or
> maybe I'm the only person here who cares...)

Python 3 is clear on the matter: If there is not a natural ordering,
then __lt__() is going to return a type error. In addition, for sets
"<" is commandeered for a partial ordering, so routines that assume
that __lt__() is consistent with a total ordering are going to return
undefined results, and this is documented. I for one would not expect
"sorted" to work on arbitrary iterables, so I'd be happy with an error
and disappointed with an undefined result, until I read the relevant
documentation.

> As a thought exercise, let's persue the partial ordering idea, fixing
> Python's frozensets as our vertices.
[...]
> We want to define a
> consistent ordering on the vertices, so we use their hashes.

You could use id() instead if you ensure uniqueness of keys... This is
guaranteed to be unique and unchanged for the lifetime of the object,
so you won't get collisions. In CPython it is implemented as the
address where the object is stored, but a garbage collected
implementation of Python would probably have to do something else.
[OK, so that one doesn't work]

An alternative approach would be to realise that vertices need to be
stored in a sequence that has quick membership testing, for instance a
dictionary {'v1': 0, 'v2':1} etc. Depending on the type of access,
perhaps you would also need to keep the thing as a sequence
['v1','v2']. In magma, this data structure is an "indexed set".
Including an element adds it at the end, if it doesn't already exist
in the indexed set. Probably, deletion should follow list semantics.

I don't know how easy it is to make such a data structure efficient
and what the memory penalty is for having both hashing and ordering.

> On another angle, Python has clearly dealt with this problem
> internally, but they don't seem willing to share this with their
> users.

I haven't checked the implementation, but I suspect that for
membership test and lookup, they use hash-tables.

> I love this kind of stuff
[sorted([frozenset,....]) examples]
this is documented to be undefined.

> So, given that we want to have graphs whose vertex sets are sets, or
> subspaces, what is to be done?

Don't rely on sorting, but use other methods for getting consistent
orderings and efficient membership tests.

> Also, consider the fact that many Sage functions use "return
> sorted(output)" to guarantee a consistent ordering of the output. What
> you're advocating means that this wouldn't work in many cases...

Yes. Personally, I would be in favour of actively randomizing the
order in which results are returned whenever the results are not
supposed to be returned in any particular order. So many subtle bugs
creep into code due to people assuming results are returned in some
particular order and then in some future edition of the code, this
order is changed and the code unexpectedly breaks.

If you need one return value out of many with a particular property,
explicitly select on that property. Otherwise, deal with the fact that
you are working with *some* return value.

I understand that consistent string representation of output is
necessary for doctesting, but then you can just do a sort in the
doctest, in the spirit of:

sage: f = PolynomialRing(ComplexNumbers(100),"x").random_polynomial()
sage: L = f.roots()
sage: sorted(L, key = lambda v: [v[0].abs(),v[0].arg()])

or test on internal consistency

sage: sum([v[1]: v in L]) == f.degree() and
all( f(r[0]).close_to_zero() for r in L )

In short:
 - I don't believe sorting is essential for programming many things
efficiently
 - I don't think routines should be going out of their way to sort
their output
 - The reality is that Python will not support total ordering in many
cases anymore.

-- 
To post to this group, send an email to sage-devel@googlegroups.com
To unsubscribe from this group, send an email to 
sage-devel+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/sage-devel
URL: http://www.sagemath.org

Reply via email to