[Metamath] Philosophy and goals for set.mm

Norman Megill Thu, 23 Jan 2020 20:58:25 -0800

Tierry, Alexander, and Benoit have asked for clarification of the goals of 
set.mm.  Here are some of my opinions.  I am moving the discussion in 
https://github.com/metamath/set.mm/issues/1431 to here for wider readership.

*1.* There is no "goal" per se for set.mm. People work on what they are
interested in. Typically work starts off in mathboxes, where people pretty
much have freedom to do whatever they want. Two situations that typically
will result in the work being moved to the main part of set.mm are (1) more
than one person wants to use or develop the work and (2) the work is an
mm100 theorem. There are other factors, though, that I'll discuss. And in
any case it is a judgment call by me and others that may not be perfect.
Just because we choose not to import from a mathbox right away doesn't mean
the value of the work is less, it just means that in our possibly flawed
judgment it either doesn't quite fit in with the set.mm philosophy or it
hasn't found its place there yet.

So what I will primarily discuss is what I see as the "philosophy" that has
been generally used so far in set.mm. Hopefully that can guide the goals
that people set for themselves as they contribute to set.mm.

*2.* A very important thing to me is the philosophy of striving for
simplicity wherever possible. This particularly impacts what definitions
we choose.

*"Rules" for introducing new definitions*

Definitions are at the heart of achieving simplicity. While developing
set.mm, two rules of thumb that evolved in my own work were (1) there
shouldn't be two definitions that mean almost the same thing and (2)
definitions shouldn't be introduced unless there is an actual need in the
work at hand. Maybe I should add (3), don't formally define something that
doesn't help proofs and can be stated simply in English in a theorem's
comment.

Concerning the 1st rule, very early on, when I was starting to work with
ordinals, I noticed that many books would redefine "e." (epsilon) as "<"
and "C_" (subset) as "<_" (less than or equal) since those were the
ordering roles they played in von Neumann ordinal theory. I figured since
books did it, it must be important, so I also did that in the early
set.mm. (I may have even used those symbols because reals were a distant
vision at that point.) Books casually interchanged them in theorems and
proofs, but formalization meant I might have 2 or more versions of many
theorems. It was a constant nuisance to have to convert back and forth
inside of proofs, making proofs longer. I eventually decided to use just
"e." and "C_", and it made formalizations much more pleasant with fewer
theorems and shorter proofs, sometimes providing more general theorems that
could be used outside of ordinal theory. I never missed not having the
ordering symbols to remind me of their roles. I liked the feeling of being
closer to the underlying set theory rather than being an abstraction layer
away from it. If the "less than" role of "e." was important for human
understanding in a particular case, I could simply use the English language
"less than" the comment (3rd rule) while continuing to use "e." in the $p
statement.

As for the 2nd rule, there are often different choices that can be made in
the details of how something is defined. It is sometimes hard or
impossible to anticipate what choice is optimal until we actually start
using the definition for serious work. There is also the question of
whether the definition is even necessary. More than once I've found that a
definition I thought might be needed (for example was used in the
literature proof) actually wasn't. Sometimes, like the "e." vs. "<" case
above, the definition was even a hindrance and made proofs longer. Other
times an intermediate definition of my own invention that isn't in the
literature turned out to be advantageous for shorter proofs. It can be
best to let the work drive the need for the definition and its precise
details.

Another example of a literature definition we purposely don't use and
illustrates the 3rd rule is Takeuti/Zaring's Russell class with symbol
"Ru", which they formally define as "{ x | x e/ x }". The only theorem
that uses it (in set.mm and in T/K) is ~ ru. For the purposes of set.mm,
it is wasteful and pointless since we can just define the Russell class
verbally in the comment of ~ ru.

A problem that has arisen in the past is where a person has added a set of
definitions from the literature, proved some simple property theorems, then
is disappointed that the work isn't imported into the main set.mm.
Sometimes we do and sometimes we don't, but the main issue is whether the
2nd rule above is being followed. Without knowing exactly how the
definition is going to be applied, very often it won't be optimal and will
have to be adjusted, and sometimes it is more efficient just to start from
scratch with a definition in the form that is needed. I prefer to see some
"serious" theorems proved from a definition before importing it.

As a somewhat crude example of why a definition in isolation may not be
optimal, suppose someone defined sine as "$a |- sin A = ..." (not a straw
man; this kind of thing has been attempted in the past) i.e. a 1-place
function-like "sin A" rather than a stand-alone object "sin". While this
can be used to prove a few things, it is very limiting since the symbol
"sin" in isolation has no meaning, and we can't prove e.g. "sin : CC -->
CC". It is an easy mistake to make if you don't understand set theoretical
picture but are blindly copying a textbook definition. Here it is pretty
obvious, but in general such issues can be subtle and may depend on the
application. Also, there are often multiple ways to define something, some
of which are easy to start from and others which would require a lot of
background development. The best choice can't always be anticipated
without knowing what background will be available.

*3.* Let me comment on the recent proposal to import df-mgm, df-asslaw,
df-sgrp. These are my opinions, and correct me if I am wrong.

df-mgm: A magma is nothing but an empty shell with a binary operation.
There are no significant theorems that follow from it. All the properties
of binary operations that we have needed have already been proven. If
additional properties are needed in the future it seems it would be better
to state them directly because they will be more useful generally. Just
because it appears in Bourbaki or other texts is not to me a
justification: it also needs to be useful and be driven by an actual need
for it. There are many definitions in textbooks that have turned out to be
unnecessary.

df-asslaw: All this is doing is encapsulating the class of associative
operations. The only thing that can be derived from it AFAIK is the
associative law. I don't see how it would benefit proofs since additional
work is needed to convert to and from it. I don't see how it helps human
understanding because it is easier for a human to see "show the associative
law" in the description of a theorem than be asked to learn yet another
definition (an example of the 3rd rule).

df-sgrp: This is basically df-asslaw converted to an extensible
structure. So in essence it duplicates df-asslaw (or vice versa depending
on your point of view), violating the first rule (no redundancy). More
importantly, AFAIK there is nothing that can be derived from it except the
associative law, unlike say df-grp where deep and non-obvious theorems are
possible. Yes, it can be used as a "layer" on top of which we can slightly
simplify df-mnd and df-rng, but it is really worth having an entire,
otherwise useless definitional layer just to avoid having to state
"(x+y)+z=x+(y+z)"? The fact that it simplifies two definitions is a slight
positive, but I'm still torn about it. It doesn't quite pay for itself in
terms of reducing set.mm size, and it increases the burden on the reader
who has to learn another definition and drill down another layer to
understand df-mnd and df-rng. Personally I'd prefer to see
"(x+y)+z=x+(y+z)" directly because it is immediately obvious without having
to look at a deeper layer. In the case of the description in df-rng0, it
would be more direct and less obscure to state "associative operation"
instead of "semigroup operation", since most people (at the level of
studying that definition) would know what "associative" means, but fewer
would would know what "semigroup" means.

I can't say that these will never be imported. Possibly something will
drive a need for them, maybe something related to category theory where
even a magma might be a useful object. But I would want to see the
definitions be driven by that need when it arises; we don't even know yet
whether the present extensible structure form can be adapted for that
purpose. It is fine to store them in a mathbox indefinitely, but I'd like
to see a true need driving their use before importing.

*4.* As for adding an exhaustive list of all possible definitions one can
find in Bourbaki or whatever, as someone suggested, I don't think something
like that belongs in set.mm, for all the reasons above. There are other
resources that already list these (Wikipedia, Planetmath, nLab). Their
precise formalization will depend on the context in which they are needed
in set.mm, which is unknown until they are used. In isolation (perhaps
with a couple of property theorems that basically check syntax) there is no
guarantee that they are correct, even in set.mm. To restate, I think the
philosophy should be that definitions should be added by need, not as
busywork. Adding a definition when needed is one of the smallest parts of
building a series of significant proofs. It doesn't have to be done in
advance, and I don't think it is generally productive or useful to do so.

*5.* Finally, let me touch on the issue of > (greater than).

There are many non-symmetrical relations that use reversed symbols to swap
arguments: "B contains A" with reversed epsilon, "B includes A" with
reversed subset symbol, "Q if P" with reversed arrow, etc. I've seen all
of these in the literature. If we really feel the reader will encounter
them and expect set.mm to explain their meaning (which is likely explained
in their textbook anyway), we could mention informally the reversed symbol
usage when we introduce the forward symbol. But we don't add $a's for them
because we will never use them. That is because either we would have to
add a huge number of theorems containing all possible forward and reversed
combinations of the symbols, or we would have to constantly convert between
them inside of proofs. Both of those are contrary to a philosophy of
simplicity.

IMO the same should be done with >, mentioning what it means in the
description for <. Introducing a formal $a statement for > that will never
be used is unnecessary and wasteful of resources. If we want to be
excessively pedantic, we could mention in the description for < that the
the formal definition would be "|- > = `' >" , although that seems less
intuitive than simply saying that "in theorem descriptions, we occasionally
use the terminology 'A is greater than B' to mean 'B is less than A'." A
grade school student can (and does) easily understand that.

Basically, ">" violates all 3 "rules" for new definitions I proposed above.

Norm

--
You received this message because you are subscribed to the Google Groups
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/metamath/6e9144b6-5b8f-488d-bf19-83e5a2a250e0%40googlegroups.com.

[Metamath] Philosophy and goals for set.mm

Reply via email to