[Metamath] Re: Philosophy and goals for set.mm

'Alexander van der Vekens' via Metamath Tue, 28 Jan 2020 10:53:20 -0800

I want to come back to the first post of this topic, and I want to 
concentrate on the "definition" parts (2 and 4):


On Friday, January 24, 2020 at 5:58:14 AM UTC+1, Norman Megill wrote:
>
> ...
> *2.* A very important thing to me is the philosophy of striving for 
> simplicity wherever possible.  This particularly impacts what definitions 
> we choose.
>
> *"Rules" for introducing new definitions*
>
> Definitions are at the heart of achieving simplicity.  While developing 
> set.mm, two rules of thumb that evolved in my own work were (1) there 
> shouldn't be two definitions that mean almost the same thing and (2) 
> definitions shouldn't be introduced unless there is an actual need in the 
> work at hand.  Maybe I should add (3), don't formally define something that 
> doesn't help proofs and can be stated simply in English in a theorem's 
> comment.
>
I fully agree with this in principle, but what does "almost the same", 
"actual need" and "help proofs" exactly mean? There are major discrepancies 
in how to interpret these terms.

>
> Concerning the 1st rule, very early on, when I was starting to work with 
> ordinals, I noticed that many books would redefine "e."  (epsilon) as "<" 
> and "C_" (subset) as "<_" (less than or equal) since those were the 
> ordering roles they played in von Neumann ordinal theory.  I figured since 
> books did it, it must be important, so I also did that in the early set.mm.  
> (I may have even used those symbols because reals were a distant vision at 
> that point.)  Books casually interchanged them in theorems and proofs, but 
> formalization meant I might have 2 or more versions of many theorems.  It 
> was a constant nuisance to have to convert back and forth inside of proofs, 
> making proofs longer.  I eventually decided to use just "e." and "C_", and 
> it made formalizations much more pleasant with fewer theorems and shorter 
> proofs, sometimes providing more general theorems that could be used 
> outside of ordinal theory.  I never missed not having the ordering symbols 
> to remind me of their roles.  I liked the feeling of being closer to the 
> underlying set theory rather than being an abstraction layer away from it.  
> If the "less than" role of "e." was important for human understanding in a 
> particular case, I could simply use the English language "less than" the 
> comment (3rd rule) while continuing to use "e." in the $p statement.
>
Yes, that's absolutely right. These would be exactly identical definitions 
for the same concept, only using different symbols, so here it is clear 
what "almost the same" means (namely "exactly the same"). In this case, the 
symbols "<" and "C_" are used to denote different things anyway. But do you 
think the definition of ">" would be "almost the same" as the available 
definition of "<"? 

>
> As for the 2nd rule, there are often different choices that can be made in 
> the details of how something is defined.  It is sometimes hard or 
> impossible to anticipate what choice is optimal until we actually start 
> using the definition for serious work.  There is also the question of 
> whether the definition is even necessary.  More than once I've found that a 
> definition I thought might be needed (for example was used in the 
> literature proof) actually wasn't.  Sometimes, like the "e." vs.  "<" case 
> above, the definition was even a hindrance and made proofs longer.  Other 
> times an intermediate definition of my own invention that isn't in the 
> literature turned out to be advantageous for shorter proofs.  It can be 
> best to let the work drive the need for the definition and its precise 
> details.
>
> Another example of a literature definition we purposely don't use and 
> illustrates the 3rd rule is Takeuti/Zaring's Russell class with symbol 
> "Ru", which they formally define as "{ x | x e/ x }".  The only theorem 
> that uses it (in set.mm and in T/Z) is ~ ru.  For the purposes of set.mm, 
> it is wasteful and pointless since we can just define the Russell class 
> verbally in the comment of ~ ru.
>
> A problem that has arisen in the past is where a person has added a set of 
> definitions from the literature, proved some simple property theorems, then 
> is disappointed that the work isn't imported into the main set.mm.  
> Sometimes we do and sometimes we don't, but the main issue is whether the 
> 2nd rule above is being followed.  Without knowing exactly how the 
> definition is going to be applied, very often it won't be optimal and will 
> have to be adjusted, and sometimes it is more efficient just to start from 
> scratch with a definition in the form that is needed.  I prefer to see some 
> "serious" theorems proved from a definition before importing it.
>
Here you use again criteria which could be interpreted differently: what is 
"serious"? Is it the number of theorems using (directly or indirectly) a 
definition, or must it be a theorem from the 100 theorem list (or the list 
of theorems in Wikipedia) using the definition? Or is it sufficient that 
the theorem using the definition is mentioned in a book, maybe labelled as 
theorem (or proposition etc.)?

>
> As a somewhat crude example of why a definition in isolation may not be 
> optimal, suppose someone defined sine as "$a |- sin A = ..."  (not a straw 
> man; this kind of thing has been attempted in the past) i.e. a 1-place 
> function-like "sin A" rather than a stand-alone object "sin".  While this 
> can be used to prove a few things, it is very limiting since the symbol 
> "sin" in isolation has no meaning, and we can't prove e.g.  "sin :  CC --> 
> CC".  It is an easy mistake to make if you don't understand set theoretical 
> picture but are blindly copying a textbook definition.  Here it is pretty 
> obvious, but in general such issues can be subtle and may depend on the 
> application.  Also, there are often multiple ways to define something, some 
> of which are easy to start from and others which would require a lot of 
> background development.  The best choice can't always be anticipated 
> without knowing what background will be available.
>
Absolutely correct, but here we have rules how (new) definitions should 
look like: they usually should be binary relations or functions/operations, 
with exceptions only in very specially justified cases - I think these are 
not documented yet (in section "Conventions"), but these conventions for 
definitions came up as I started with Graph Theory, in a discussion (in 
this Google group) mainly with Mario. By such a conventioned, such crude 
definitions as descibed can be avoided.

>
>
> *3.* Let me comment on the recent proposal to import df-mgm, df-asslaw, 
> df-sgrp.  These are my opinions, and correct me if I am wrong.
>
> df-mgm:  A magma is nothing but an empty shell with a binary operation.  
> There are no significant theorems that follow from it.  All the properties 
> of binary operations that we have needed have already been proven.  If 
> additional properties are needed in the future it seems it would be better 
> to state them directly because they will be more useful generally.  Just 
> because it appears in Bourbaki or other texts is not to me a 
> justification:  it also needs to be useful and be driven by an actual need 
> for it.  There are many definitions in textbooks that have turned out to be 
> unnecessary.
>
We have already several theorems valid for magmas, which are proven for 
monoids, or at least in the context of monoids (because we have nothing 
else). Proving them for monoids, however, could confuse a reader, because 
he does not see if an identity is required for the proof or not. 
Additionally, ther term "magma" is used 20 times in main set.mm, so it has 
some significance. Furthermore, the definition of a monoid could be based 
on the definition of a magma (or semigroup, see below) as it is done for 
the definition of groups based on the definition of monoids, e.g.
MndALT1 = { g e. Mgm | [. ( Base ` g ) / b ]. [. ( +g ` g ) / p ]. ( A. x 
e. b A. y e. b A. z e. b ( ( x p y ) p z ) = ( x p ( y p z ) ) /\ E. e e. b 
A. x e. b ( ( e p x ) = x /\ ( x p e ) = x ) ) }
or using the definition of a semigroup
MndALT2 = { g e. SGrp | [. ( Base ` g ) / b ]. [. ( +g ` g ) / p ]. E. e e. 
b A. x e. b ( ( e p x ) = x /\ ( x p e ) = x ) }

For me, these are already serious reasons to have these definitions. To be 
concrete, however, would it be sufficient to prove the statement in ~ 
xrs1mnd that RR*s is neither a monoid nor a semigroup (it is "only" a 
magma) to justify the definition of a magma?


> df-asslaw:  All this is doing is encapsulating the class of associative 
> operations.  The only thing that can be derived from it AFAIK is the 
> associative law.  I don't see how it would benefit proofs since additional 
> work is needed to convert to and from it.  I don't see how it helps human 
> understanding because it is easier for a human to see "show the associative 
> law" in the description of a theorem than be asked to learn yet another 
> definition (an example of the 3rd rule).
>
The main reason why I (still want to) introduce these definitions, is that 
they can be used as building blocks for very different structures. If you 
look at Wikipedia (e.g. the article "Magma"), you can find a table 
classifying all "Group-like structures" - the "law"-theorems correspond to 
the columns of this table. By these, you could characterize a monoid by 
three of these laws. Instead of defining a monoid as it is done today by
df-mnd $a |- Mnd = { g | [. ( Base ` g ) / b ]. [. ( +g ` g ) / p ]. ( A. x 
e. b A. y e. b A. z e. b ( ( x p y ) e. b
            /\ ( ( x p y ) p z ) = ( x p ( y p z ) ) ) /\ E. e e. b A. x e. 
b ( ( e p x ) = x /\ ( x p e ) = x ) ) }
you do not see immediately that it contains closure, associativity and 
having an identity! This would be different by a definition like
df-mndalt $a |- MndALT = { g | [. ( Base ` g ) / b ]. [. ( +g ` g ) / p ]. 
( p clLaw b /\ p assLaw b /\ p idLaw b ) }

Concerning the human understanding, we have (and we keep) corresponding 
theorems, e.g. 
mndass $p |- ( ( G e. Mnd /\ ( X e. B /\ Y e. B /\ Z e. B ) ) -> ( ( X .+ Y 
) .+ Z ) = ( X .+ ( Y .+ Z ) ) ) $=
But in parallel, we could have theorems in the short form, e.g.:
mndasslaw $p |- ( ( G e. MndALT -> .+ assLaw B ) $=


df-sgrp:  This is basically df-asslaw converted to an extensible 
> structure.  So in essence it duplicates df-asslaw (or vice versa depending 
> on your point of view), violating the first rule (no redundancy).  More 
> importantly, AFAIK there is nothing that can be derived from it except the 
> associative law, unlike say df-grp where deep and non-obvious theorems are 
> possible.  Yes, it can be used as a "layer" on top of which we can slightly 
> simplify df-mnd and df-rng, but it is really worth having an entire, 
> otherwise useless definitional layer just to avoid having to state 
> "(x+y)+z=x+(y+z)"?  The fact that it simplifies two definitions is a slight 
> positive, but I'm still torn about it.  It doesn't quite pay for itself in 
> terms of reducing set.mm size, and it increases the burden on the reader 
> who has to learn another definition and drill down another layer to 
> understand df-mnd and df-rng.  Personally I'd prefer to see 
> "(x+y)+z=x+(y+z)" directly because it is immediately obvious without having 
> to look at a deeper layer.  In the case of the description in df-rng0, it 
> would be more direct and less obscure to state "associative operation" 
> instead of "semigroup operation", since most people (at the level of 
> studying that definition) would know what "associative" means, but fewer 
> would would know what "semigroup" means.
>
In my opinion, this is not a duplication: df-asslaw would be part of 
df-sgrp, characterizing the operation of the extensible structure. 
Concerning the reader: yes, for some it is a burden, for others it is a 
pleasure to have a stringent hierarchy of definitions. Finally, I think 
there will be a lot of serious theorems which could be proven (there is a 
complete "Semigroup Theory"). To wait until such theorems are formally 
proven within set.mm is not necessary, in this case, because basic theorems 
can already be provided - and they cannot be wrong, because they are also 
valid for monoids in its currently defined manner.

>
> I can't say that these will never be imported.  Possibly something will 
> drive a need for them, maybe something related to category theory where 
> even a magma might be a useful object.  But I would want to see the 
> definitions be driven by that need when it arises; we don't even know yet 
> whether the present extensible structure form can be adapted for that 
> purpose.  It is fine to store them in a mathbox indefinitely, but I'd like 
> to see a true need driving their use before importing.
>
>
> *4.* As for adding an exhaustive list of all possible definitions one can 
> find in Bourbaki or whatever, as someone suggested, I don't think something 
> like that belongs in set.mm 
> <http://www.google.com/url?q=http%3A%2F%2Fset.mm&sa=D&sntz=1&usg=AFQjCNGyXl3zrxstB4U2tIe0Dll1hr7fxA>,
>  
> for all the reasons above.  There are other resources that already list 
> these (Wikipedia, Planetmath, nLab).  Their precise formalization will 
> depend on the context in which they are needed in set.mm, which is 
> unknown until they are used.  In isolation (perhaps with a couple of 
> property theorems that basically check syntax) there is no guarantee that 
> they are correct, even in set.mm.  To restate, I think the philosophy 
> should be that definitions should be added based on need, not as busywork.  
> Adding a definition when needed is one of the smallest parts of building a 
> series of significant proofs.  It doesn't have to be done in advance, and I 
> don't think it is generally productive or useful to do so.
>
This is also something we can agree on immediately: Defnitions without 
usage (i.e. without theorems or only with theorems for syntax checks) are 
not useful (within set.mm) and should not be contained in the main body of 
set.mm.

In summary, we agree in most cases, and differences are only in the 
interpretation of the rules. Regarding the "law" definitions and 
definitions for magmas, semigroups and (non-unital) rings, I still would 
like to have them in the main body of set.mm, making it
more concise and consistent (especially if you look at the theorems with 
labels containing "rng" which are valid for unital rings only, which might 
be confusing). From my point of view, the benefits for a user are greater 
than the disadvantages.

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/7ef80605-c3d1-49b7-9a75-92822eeffd75%40googlegroups.com.

[Metamath] Re: Philosophy and goals for set.mm

Reply via email to