Florian Haftmann wrote:
envisaged working plan
----------------------

a) general
* continue discussion about having set or not, end with a summary (for
the mailing list archive)


Among all the technical details about the HOW, the WHY part of this
discussion seems to have come to a halt. Here is my attempt at a
summary of what we currently know.

1. In favour of an explicit set type:

1.1. Type discipline (for users):

  The notational conventions seem to be a matter of taste (some users
  like to explicitly distinguish between sets and functions, others
  prefer to write just P instead of {x. P x} in a set
  context). However, our library has many results about both sets and
  predicates, but almost none about arbitrary mixtures of the
  two. Thus, mixing them is usually a bad idea. Inspection of theories
  in the distribution and AFP has shown that such a mixture usually
  requires unfolding mem_def later on to finish the proof.

  Example:
    http://isabelle.in.tum.de/repos/isabelle/rev/510ac30f44c0
    (in Multivariate_Analysis, mem_def and Collect_def disappear from
    proofs when mixture is eliminated)
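
  Roughly the kind of thing I mean, as a tiny (untested) sketch in the
  current setting where 'a set merely abbreviates 'a => bool; the
  theory name is made up, and closing the gap between the two views of
  A needs mem_def:

    theory Mixture_Sketch  (* hypothetical scratch theory *)
      imports Main
    begin

    (* A is declared as a set, but, sets being mere predicates, it can
       also be applied like a function; relating the two readings
       requires unfolding mem_def *)
    lemma fixes A :: "'a set"
      shows "(x : A) = A x"
      by (simp add: mem_def)

    end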


1.2. Type discipline (for tools): Even when all specifications and lemmas
  respect the set/pred separation, it is occasionally lost by applying
  seemingly harmless lemmas such as subset_antisym ([intro!] by
  default) or set_eqI, whose assumptions contain set notation while
  the conclusion is generic. Thus, users are forced into awkward
  situations, from which the only way to recover is to unfold mem_def
  etc. The only way to avoid this is to disable the automatic use of
  such lemmas (with dramatic losses) or to introduce the type
  discipline.

  In other instances, some simplification rules cannot be enabled by
  default, since (even though they are perfectly sensible) they would
  make the confusion pervasive. The simplifier's eta-expansive
  behaviour can make this worse (turning "A" into "%x. A x" where A is
  a set), but the problem exists independently of this.

  Example:

http://isabelle.in.tum.de/repos/isabelle/file/fd520fa2fb09/src/HOL/Complete_Lattice.thy#l823
    (Here, mem_def is required, since the initial simp call does not
    preserve the type discipline)

http://isabelle.in.tum.de/repos/isabelle/file/fd520fa2fb09/src/HOL/Equiv_Relations.thy#l367
    (similar situation)
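
  A small (untested) sketch of how the discipline gets lost, in the
  current setting (the theory wrapper is made up): the goal below
  mentions only predicates, yet set_eqI applies, because its
  conclusion is generic, and the resulting subgoal is phrased in set
  notation, which can only be discharged via mem_def:

    theory Discipline_Sketch  (* hypothetical scratch theory *)
      imports Main
    begin

    lemma fixes P Q :: "'a => bool"
      assumes "!!x. P x = Q x"
      shows "P = Q"
      apply (rule set_eqI)
        (* the new subgoal reads "!!x. (x : P) = (x : Q)": set notation
           has crept into a statement that was purely about predicates *)
      apply (simp add: mem_def assms)
      done

    end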


1.3. Higher-Order Unification

  Higher-Order Unification sometimes fails on type "'a => bool" where
  it works for explicit "'a set", which is first-order.  The reasons
  for this effect are unclear and were not investigated in 2008, since
  HOU and its implementation in Isabelle are rather intricate.

  Examples (from the 2008 set elimination):
  http://isabelle.in.tum.de/repos/isabelle/rev/d334a6d69598
  http://isabelle.in.tum.de/repos/isabelle/rev/6a4d5ca6d2e5
  http://isabelle.in.tum.de/repos/isabelle/rev/c0fa62fa0e5b


1.4. Coercions

  With a separate type constructor, sets can be declared as covariant
  (in their element type) for the sake of coercion inference, which
  matches the subtyping intuition; in 'a => bool, the element type
  occurs in contravariant position.
  (A similar issue exists for Nitpick's monotonicity inference, but
  there it is solved independently already.)
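
  A possible sketch of what such a declaration could look like,
  borrowing the syntax of the coercion inference package and assuming
  'a set is again a proper type (untested; the theory name is made
  up): image serves as the covariant map that lifts a coercion on the
  element type to sets.

    theory Coercion_Sketch  (* hypothetical scratch theory *)
      imports Main
    begin

    declare [[coercion_enabled]]
    declare [[coercion "int :: nat => int"]]
    declare [[coercion_map image]]

    (* the nat set below should be coerced to an int set via "image int" *)
    term "(set [1::nat, 2, 3]) :: int set"

    end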


1.5. Class instances (+,*)

  Sets can get their own class instances, differing from those for
  functions. The only known case where this is practically useful is
  HOL/Library/Sets_and_Functions.thy, where one can then define A + B
  = { x + y | x y. x : A & y : B }. So this is rather minor.
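
  For the record, an (untested) sketch of the instantiation this would
  enable, assuming 'a set is again a proper type (the theory name is
  made up; roughly the shape the definition in Sets_and_Functions.thy
  could then take):

    theory Set_Plus_Sketch  (* hypothetical scratch theory *)
      imports Main
    begin

    instantiation set :: (plus) plus
    begin

    definition "A + B = {x + y | x y. x : A & y : B}"

    instance ..

    end

    end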


1.6. Code generation

  Code generation for sets is simplified, since there is no longer any
  need for an intermediate type with an explicit constructor (as in
  Library/Cset.thy). However, for other types (maps), the data
  refinement problem persists.
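
  A rough (untested) sketch of the simplification, assuming 'a set is
  again a proper type, so that "set :: 'a list => 'a set" can serve
  directly as a pseudo-constructor (theory and lemma names are made
  up, and a real setup would need more care, e.g. avoiding duplicate
  list elements):

    theory Set_Code_Sketch  (* hypothetical scratch theory *)
      imports Main
    begin

    code_datatype set

    lemma empty_set_sketch [code]:
      "{} = set []"
      by simp

    lemma insert_set_sketch [code]:
      "insert x (set xs) = set (x # xs)"
      by simp

    end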



2. Contra an explicit set type:

2.1. Quite a lot of work for users out there: Cleaning up set/pred
  confusion in our own theories and the AFP is already taking
  significant time. Some newer AFP entries, in which the confusion
  also occurs in definitions and lemmas (in particular
  DataRefinementIBP and GraphMarkingIBP), require significant
  reengineering. (The original author, Viorel Preoteasa, has kindly
  agreed to help us here.) But even for these theories it seems that
  the changes improve overall readability and automation.

2.2. "(Fairly) small loss" in sledgehammer's performance, according to
  Jasmin's measurements.



(summary ends here, my own opinion follows:)

So, it seems that we can conclude that instead of the nice unified
development that we hoped for in 2008, we got quite a bit of confusion
(points 1.1 and 1.2), in addition to the drawbacks that were already
known (1.3, 1.5-1.6). If we had to choose between the two versions now
(with no status quo to consider), the case would be pretty clear. So
the question is whether our users out there will tolerate having to
fix quite a number of theories.

I think I am pro 'a set in the end, assuming we can fix the remaining
technical issues.

Please remind me of any important points I have missed.

Alex
