[Metamath] Re: Philosophy and goals for set.mm

'Alexander van der Vekens' via Metamath Thu, 30 Jan 2020 23:01:18 -0800

For the last week I have been "actively working" on providing 
"applications" of the definitions and related theorems for magmas and 
semigroups (I've spent almost my whole spare time for it). Although the 
results may be trivial, they are "serious", at least in my opinion.


I've provided the results in my latest two PRs (they can be found in my 
mathbox). Here they are (in brief):

   - There is a small magma (with a base set containing two elements) which 
   is not a semigroup: ~mgm2nsgrp
   - There is a small semigroup (with a base set containing two elements) 
   which is not a monoid: ~sgrp2nmnd
   - The previous results prove the proper inclusion chain `Mnd C. SGrp C. 
   Mgm`: ~mndsssgrp and  ~sgrpssmgm
   - The (additive group of the) extended reals is a magma, but not a 
   semigroup. Formalization of the statements in the comments of ~xrsmcmn and 
   ~xrs1mnd: ~xrsmgmdifsgrp
   - A stronger version of ~ gsumpropd if at least one of the involved 
   structures is a magma, see ~gsumpropd2: ~gsummgmpropd 
   
Especially the proof of ~sgrp2nmnd was time-consuming, although not 
difficult/complex: 32 "sums" had to be"calculated" to show the 
associativity (see ~sgrp2nmndlem4).

Now I want to ask if these results are sufficient to move the definitions 
and basic theorems for magmas and semigroups (and the results themselves, 
of course) to the main body of set.mm? From my point of view, they should, 
at least in they plain variants (~df-mgmALT, ~df-sgrp), without using the 
"law" definitions (although I like them very much, there is still no 
"serious" use of them, and they can be introduced easily later when they 
are really required).  And afterwards, many theorems proven for/in the 
context of monoids could be tidied up accordingly.

For rings (non-unital vs. unital) I plan to provide similar results, maybe 
also a result in category theory ("the category of rngs has a zero object 
while the category of rings does not" as proposed by Benoit) - but 
unfortunately, even the category of rings is not available in set.mm yet... 

On Thursday, January 30, 2020 at 5:07:40 AM UTC+1, Norman Megill wrote:
>
> On Tuesday, January 28, 2020 at 1:53:14 PM UTC-5, Alexander van der Vekens 
> wrote:
>>
>> I want to come back to the first post of this topic, and I want to 
>> concentrate on the "definition" parts (2 and 4):
>>
>
> Thanks for your comments!
>  
>
>>
>> On Friday, January 24, 2020 at 5:58:14 AM UTC+1, Norman Megill wrote:
>>>
>>> ...
>>> *2.* A very important thing to me is the philosophy of striving for 
>>> simplicity wherever possible.  This particularly impacts what definitions 
>>> we choose.
>>>
>>> *"Rules" for introducing new definitions*
>>>
>>> Definitions are at the heart of achieving simplicity.  While developing 
>>> set.mm, two rules of thumb that evolved in my own work were (1) there 
>>> shouldn't be two definitions that mean almost the same thing and (2) 
>>> definitions shouldn't be introduced unless there is an actual need in the 
>>> work at hand.  Maybe I should add (3), don't formally define something that 
>>> doesn't help proofs and can be stated simply in English in a theorem's 
>>> comment.
>>>
>> I fully agree with this in principle, but what does "almost the same", 
>> "actual need" and "help proofs" exactly mean? There are major discrepancies 
>> in how to interpret these terms.
>>
>
> I agree they are vague at this point, partly to initiate discussion, and 
> partly because there will always be borderline cases where a subjective 
> decision is required.  Also, perhaps I should have called them "guidelines" 
> rather than "rules" (they certainly aren't official rules at this point).
>
> "Almost the same" is somewhat subjective, but "B > A" the very clearly the 
> same as "A < B" except for argument order.  Similarly, we don't introduce 
> backwards epsilon even though we verbally sometimes say "B contains A" to 
> mean "A e. B".  We don't try to introduce a new notation for every possible 
> phrase that occurs in a comment.  Instead, we explain it in the comment 
> when necessary or when the notation is introduced.  See e.g. the comment of 
> ~wel where we describe 4 different English phrases for "e." but don't 
> introduce 4 notations.
>
> "Actual need" - If we want a strict, well-defined rule, we could require 
> that the definition reduces the file size of set.mm (or will reduce it 
> eventually with its expected usage).  That's probably too strict though.  A 
> better meaning of "actual need" would be a definition whose need is driven 
> by the work in progress in order to simplify that work.  Sometimes it will 
> be a definition that might not even be in the literature, and sometimes 
> literature definitions will not be needed.
>
> "Helps proofs" means the definition should ideally lead to shorter 
> proofs.  For example, defining A C_ B (subset) clearly makes proofs simpler 
> by not having to work with A. x ( x e. A -> x e. B ) all the time.  If we 
> replaced A C_ B by its definiens everywhere, many proofs would be 
> significantly larger.
>
>
>>> Concerning the 1st rule, very early on, when I was starting to work with 
>>> ordinals, I noticed that many books would redefine "e."  (epsilon) as "<" 
>>> and "C_" (subset) as "<_" (less than or equal) since those were the 
>>> ordering roles they played in von Neumann ordinal theory.  I figured since 
>>> books did it, it must be important, so I also did that in the early 
>>> set.mm.  (I may have even used those symbols because reals were a 
>>> distant vision at that point.)  Books casually interchanged them in 
>>> theorems and proofs, but formalization meant I might have 2 or more 
>>> versions of many theorems.  It was a constant nuisance to have to convert 
>>> back and forth inside of proofs, making proofs longer.  I eventually 
>>> decided to use just "e." and "C_", and it made formalizations much more 
>>> pleasant with fewer theorems and shorter proofs, sometimes providing more 
>>> general theorems that could be used outside of ordinal theory.  I never 
>>> missed not having the ordering symbols to remind me of their roles.  I 
>>> liked the feeling of being closer to the underlying set theory rather than 
>>> being an abstraction layer away from it.  If the "less than" role of "e." 
>>> was important for human understanding in a particular case, I could simply 
>>> use the English language "less than" the comment (3rd rule) while 
>>> continuing to use "e." in the $p statement.
>>>
>> Yes, that's absolutely right. These would be exactly identical 
>> definitions for the same concept, only using different symbols, so here it 
>> is clear what "almost the same" means (namely "exactly the same"). In this 
>> case, the symbols "<" and "C_" are used to denote different things anyway. 
>> But do you think the definition of ">" would be "almost the same" as the 
>> available definition of "<"?
>>
>
> Absolutely.  It is exactly the same except for the order of the 
> arguments.  We could, and perhaps should, add to the comment of df-ltxr 
> something like, ""In theorem descriptions, we occasionally use the 
> terminology 'A is greater than B' to mean 'B is less than A'."  We haven't 
> done so because literally no one has expressed confusion being able to 
> figure out what 'A is greater than B' means in the comment when the theorem 
> itself says 'B < A'.  Elementary school children are expected to understand 
> that; their books have word problems that interchange terminology like that 
> all the time.
>  
>
>>
>>
>>> As for the 2nd rule, there are often different choices that can be made 
>>> in the details of how something is defined.  It is sometimes hard or 
>>> impossible to anticipate what choice is optimal until we actually start 
>>> using the definition for serious work.  There is also the question of 
>>> whether the definition is even necessary.  More than once I've found that a 
>>> definition I thought might be needed (for example was used in the 
>>> literature proof) actually wasn't.  Sometimes, like the "e." vs.  "<" case 
>>> above, the definition was even a hindrance and made proofs longer.  Other 
>>> times an intermediate definition of my own invention that isn't in the 
>>> literature turned out to be advantageous for shorter proofs.  It can be 
>>> best to let the work drive the need for the definition and its precise 
>>> details.
>>>
>>> Another example of a literature definition we purposely don't use and 
>>> illustrates the 3rd rule is Takeuti/Zaring's Russell class with symbol 
>>> "Ru", which they formally define as "{ x | x e/ x }".  The only theorem 
>>> that uses it (in set.mm and in T/Z) is ~ ru.  For the purposes of set.mm, 
>>> it is wasteful and pointless since we can just define the Russell class 
>>> verbally in the comment of ~ ru.
>>>
>>> A problem that has arisen in the past is where a person has added a set 
>>> of definitions from the literature, proved some simple property theorems, 
>>> then is disappointed that the work isn't imported into the main set.mm.  
>>> Sometimes we do and sometimes we don't, but the main issue is whether the 
>>> 2nd rule above is being followed.  Without knowing exactly how the 
>>> definition is going to be applied, very often it won't be optimal and will 
>>> have to be adjusted, and sometimes it is more efficient just to start from 
>>> scratch with a definition in the form that is needed.  I prefer to see some 
>>> "serious" theorems proved from a definition before importing it.
>>>
>> Here you use again criteria which could be interpreted differently: what 
>> is "serious"?
>>
>
> This is also subjective, but I would say "serious" means that the 
> definition leads to something other than just restatements of its 
> properties i.e. something other than what I would call "shallow" theorems.  
> For example, assLaw can't prove anything deeper than the associative law 
> AFAIK, which is an immediately obvious consequence.  Something like 
> uniqueness of the identity element in Grp might be an example that is 
> slightly beyond "shallow", since it isn't immediately obvious to everyone.
>  
>
>> Is it the number of theorems using (directly or indirectly) a definition,
>>
>
> Not in itself.  We could add a theorem for ">" reproducing every theorem 
> now using "<", and many more with all possible combinations of the two, but 
> that wouldn't justify introducing a definition for it.
>  
>
>> or must it be a theorem from the 100 theorem list (or the list of 
>> theorems in Wikipedia) using the definition?
>>
>  
> No, it doesn't have to.
>  
>
>> Or is it sufficient that the theorem using the definition is mentioned in 
>> a book, maybe labelled as theorem (or proposition etc.)?
>>
>
> No, that isn't sufficient (or necessary).
>
> I would say the most important thing is that it involves material that is 
> actively being worked on, and its introduction is driven by that work and 
> practically essential for its continued practical development.  Sometimes 
> this may even mean that we introduce definitions that aren't in any books.
>  
>
>>
>>> As a somewhat crude example of why a definition in isolation may not be 
>>> optimal, suppose someone defined sine as "$a |- sin A = ..."  (not a straw 
>>> man; this kind of thing has been attempted in the past) i.e. a 1-place 
>>> function-like "sin A" rather than a stand-alone object "sin".  While this 
>>> can be used to prove a few things, it is very limiting since the symbol 
>>> "sin" in isolation has no meaning, and we can't prove e.g.  "sin :  CC --> 
>>> CC".  It is an easy mistake to make if you don't understand set theoretical 
>>> picture but are blindly copying a textbook definition.  Here it is pretty 
>>> obvious, but in general such issues can be subtle and may depend on the 
>>> application.  Also, there are often multiple ways to define something, some 
>>> of which are easy to start from and others which would require a lot of 
>>> background development.  The best choice can't always be anticipated 
>>> without knowing what background will be available.
>>>
>> Absolutely correct, but here we have rules how (new) definitions should 
>> look like: they usually should be binary relations or functions/operations, 
>> with exceptions only in very specially justified cases - I think these are 
>> not documented yet (in section "Conventions"), but these conventions for 
>> definitions came up as I started with Graph Theory, in a discussion (in 
>> this Google group) mainly with Mario. By such a conventioned, such crude 
>> definitions as descibed can be avoided.
>>
>>>
>>>
>>> *3.* Let me comment on the recent proposal to import df-mgm, df-asslaw, 
>>> df-sgrp.  These are my opinions, and correct me if I am wrong.
>>>
>>> df-mgm:  A magma is nothing but an empty shell with a binary operation.  
>>> There are no significant theorems that follow from it.  All the properties 
>>> of binary operations that we have needed have already been proven.  If 
>>> additional properties are needed in the future it seems it would be better 
>>> to state them directly because they will be more useful generally.  Just 
>>> because it appears in Bourbaki or other texts is not to me a 
>>> justification:  it also needs to be useful and be driven by an actual need 
>>> for it.  There are many definitions in textbooks that have turned out to be 
>>> unnecessary.
>>>
>> We have already several theorems valid for magmas, which are proven for 
>> monoids, or at least in the context of monoids (because we have nothing 
>> else). Proving them for monoids, however, could confuse a reader, because 
>> he does not see if an identity is required for the proof or not. 
>> Additionally, ther term "magma" is used 20 times in main set.mm, so it 
>> has some significance. Furthermore, the definition of a monoid could be 
>> based on the definition of a magma (or semigroup, see below) as it is done 
>> for the definition of groups based on the definition of monoids, e.g.
>> MndALT1 = { g e. Mgm | [. ( Base ` g ) / b ]. [. ( +g ` g ) / p ]. ( A. x 
>> e. b A. y e. b A. z e. b ( ( x p y ) p z ) = ( x p ( y p z ) ) /\ E. e e. b 
>> A. x e. b ( ( e p x ) = x /\ ( x p e ) = x ) ) }
>> or using the definition of a semigroup
>> MndALT2 = { g e. SGrp | [. ( Base ` g ) / b ]. [. ( +g ` g ) / p ]. E. e 
>> e. b A. x e. b ( ( e p x ) = x /\ ( x p e ) = x ) }
>>
>
> Unless they are used for many things other than monoid, I don't see the 
> advantage of either of these over just df-mnd.  Neither magma nor semigroup 
> have any "serious" (non-shallow) consequences AFAIK.  The overhead of 
> introducing them just to define 2 layers under monoid is far more than the 
> overhead of just df-mnd by itself.
>
> Finally, df-mnd is self-contained allowing the reader to grasp it 
> immediately, and the comment in df-mnd explains it very clearly, so I think 
> df-mnd is much less of a burden on the reader, especially one who has never 
> heard of magma and semigroup and has no reason to care about them.
>
>
>> For me, these are already serious reasons to have these definitions. To 
>> be concrete, however, would it be sufficient to prove the statement in ~ 
>> xrs1mnd that RR*s is neither a monoid nor a semigroup (it is "only" a 
>> magma) to justify the definition of a magma?
>>
>>
>>> df-asslaw:  All this is doing is encapsulating the class of associative 
>>> operations.  The only thing that can be derived from it AFAIK is the 
>>> associative law.  I don't see how it would benefit proofs since additional 
>>> work is needed to convert to and from it.  I don't see how it helps human 
>>> understanding because it is easier for a human to see "show the associative 
>>> law" in the description of a theorem than be asked to learn yet another 
>>> definition (an example of the 3rd rule).
>>>
>> The main reason why I (still want to) introduce these definitions, is 
>> that they can be used as building blocks for very different structures.
>>
>
> Then I think we should wait until we have a need for those structures.  We 
> don't right now.  If and when it makes sense to change df-mnd to 
> incorporate them, we can do it then.  This was done with df-grp to 
> incorporate monoid.  It's a relatively trivial retrofit.
>  
>
>> If you look at Wikipedia (e.g. the article "Magma"), you can find a table 
>> classifying all "Group-like structures" - the "law"-theorems correspond to 
>> the columns of this table.
>>
>
> The article also mentions about 25 variations of magmas obtained by adding 
> various properties.  Many of them don't even link to Wikipedia pages, 
> suggesting they are probably not widely used if used at all.  I don't think 
> they should be added to set.mm until a need for them arises, driven by 
> work in progress.
>
>  
>
>> By these, you could characterize a monoid by three of these laws. Instead 
>> of defining a monoid as it is done today by
>> df-mnd $a |- Mnd = { g | [. ( Base ` g ) / b ]. [. ( +g ` g ) / p ]. ( A. 
>> x e. b A. y e. b A. z e. b ( ( x p y ) e. b
>>             /\ ( ( x p y ) p z ) = ( x p ( y p z ) ) ) /\ E. e e. b A. x 
>> e. b ( ( e p x ) = x /\ ( x p e ) = x ) ) }
>> you do not see immediately that it contains closure, associativity and 
>> having an identity! This would be different by a definition like
>>
>
> The user knows from the description and from the property theorems that 
> are referenced.  But even without that, I do see all of these properties 
> pretty immediately - they're right there in the definition: ( x p y ) e. b 
> for closure,  ( ( x p y ) p z ) = ( x p ( y p z ) ) for associativity, and 
> E. e ... ( ( e p x ) = x /\ ( x p e ) = x ) for left and right identity.
>  
>
>> df-mndalt $a |- MndALT = { g | [. ( Base ` g ) / b ]. [. ( +g ` g ) / p 
>> ]. ( p clLaw b /\ p assLaw b /\ p idLaw b ) }
>>
>
> The user would not know what this means without looking up all the 
> auxiliary definitions.  It isn't obvious that idLaw means both left and 
> right identity.  If avoiding the quantifiers that clutter df-mnd is the 
> goal, they are going to encounter those anyway when drilling down to learn 
> the extra definitions.
>
> To me, these kinds of definitions are better explained in the English 
> description.  They are more things to burden the reader with learning.  The 
> description of df-mnd very clearly states these properties.
>  
>
>>
>> Concerning the human understanding, we have (and we keep) corresponding 
>> theorems, e.g. 
>> mndass $p |- ( ( G e. Mnd /\ ( X e. B /\ Y e. B /\ Z e. B ) ) -> ( ( X .+ 
>> Y ) .+ Z ) = ( X .+ ( Y .+ Z ) ) ) $=
>> But in parallel, we could have theorems in the short form, e.g.:
>> mndasslaw $p |- ( ( G e. MndALT -> .+ assLaw B ) $=
>>
>>
>> df-sgrp:  This is basically df-asslaw converted to an extensible 
>>> structure.  So in essence it duplicates df-asslaw (or vice versa depending 
>>> on your point of view), violating the first rule (no redundancy).  More 
>>> importantly, AFAIK there is nothing that can be derived from it except the 
>>> associative law, unlike say df-grp where deep and non-obvious theorems are 
>>> possible.  Yes, it can be used as a "layer" on top of which we can slightly 
>>> simplify df-mnd and df-rng, but it is really worth having an entire, 
>>> otherwise useless definitional layer just to avoid having to state 
>>> "(x+y)+z=x+(y+z)"?  The fact that it simplifies two definitions is a slight 
>>> positive, but I'm still torn about it.  It doesn't quite pay for itself in 
>>> terms of reducing set.mm size, and it increases the burden on the 
>>> reader who has to learn another definition and drill down another layer to 
>>> understand df-mnd and df-rng.  Personally I'd prefer to see 
>>> "(x+y)+z=x+(y+z)" directly because it is immediately obvious without having 
>>> to look at a deeper layer.  In the case of the description in df-rng0, it 
>>> would be more direct and less obscure to state "associative operation" 
>>> instead of "semigroup operation", since most people (at the level of 
>>> studying that definition) would know what "associative" means, but fewer 
>>> would would know what "semigroup" means.
>>>
>> In my opinion, this is not a duplication: df-asslaw would be part of 
>> df-sgrp, characterizing the operation of the extensible structure. 
>> Concerning the reader: yes, for some it is a burden, for others it is a 
>> pleasure to have a stringent hierarchy of definitions.
>>
>
> This is a good point.  How do we choose which reader to satisfy?  
> Mathematicians who already know the stuff probably would find the hierarchy 
> more pleasing, and readers who are trying to learn unfamiliar material 
> would find more definitions to learn a burden.
>
>  
>
>> Finally, I think there will be a lot of serious theorems which could be 
>> proven (there is a complete "Semigroup Theory"). To wait until such 
>> theorems are formally proven within set.mm is not necessary, in this 
>> case, because basic theorems can already be provided - and they cannot be 
>> wrong, because they are also valid for monoids in its currently defined 
>> manner.
>>
>
>>> I can't say that these will never be imported.  Possibly something will 
>>> drive a need for them, maybe something related to category theory where 
>>> even a magma might be a useful object.  But I would want to see the 
>>> definitions be driven by that need when it arises; we don't even know yet 
>>> whether the present extensible structure form can be adapted for that 
>>> purpose.  It is fine to store them in a mathbox indefinitely, but I'd like 
>>> to see a true need driving their use before importing.
>>>
>>>
>>> *4.* As for adding an exhaustive list of all possible definitions one 
>>> can find in Bourbaki or whatever, as someone suggested, I don't think 
>>> something like that belongs in set.mm 
>>> <http://www.google.com/url?q=http%3A%2F%2Fset.mm&sa=D&sntz=1&usg=AFQjCNGyXl3zrxstB4U2tIe0Dll1hr7fxA>,
>>>  
>>> for all the reasons above.  There are other resources that already list 
>>> these (Wikipedia, Planetmath, nLab).  Their precise formalization will 
>>> depend on the context in which they are needed in set.mm, which is 
>>> unknown until they are used.  In isolation (perhaps with a couple of 
>>> property theorems that basically check syntax) there is no guarantee that 
>>> they are correct, even in set.mm.  To restate, I think the philosophy 
>>> should be that definitions should be added based on need, not as busywork.  
>>> Adding a definition when needed is one of the smallest parts of building a 
>>> series of significant proofs.  It doesn't have to be done in advance, and I 
>>> don't think it is generally productive or useful to do so.
>>>
>> This is also something we can agree on immediately: Defnitions without 
>> usage (i.e. without theorems or only with theorems for syntax checks) are 
>> not useful (within set.mm) and should not be contained in the main body 
>> of set.mm.
>>
>> In summary, we agree in most cases, and differences are only in the 
>> interpretation of the rules. Regarding the "law" definitions and 
>> definitions for magmas, semigroups and (non-unital) rings, I still would 
>> like to have them in the main body of set.mm, making it
>> more concise and consistent (especially if you look at the theorems with 
>> labels containing "rng" which are valid for unital rings only, which might 
>> be confusing). From my point of view, the benefits for a user are greater 
>> than the disadvantages.
>>
>
> "rng" was chosen for the label abbreviation because I wasn't aware that 
> non-unital rings had their own symbol.  It is a simple substitution to 
> change the names.  We still have no application driving the introduction of 
> Rng.
>
> Norm 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/e4307560-fb2e-4316-bbdf-b211902c225c%40googlegroups.com.

[Metamath] Re: Philosophy and goals for set.mm

Reply via email to