Hello again
Me:
What if I don't trust your kernel?
[..]
What's the downloadable object that can be machine-checked to satisfy my paranoid insurance company?
Oleg:
I hope that you can forgive me if I reply by quoting the above two paragraphs back to you, with the substitution s/kernel/compiler/.
What if I don't trust your compiler?
I have heard a similar question asked of J. Strother Moore and J. Harrison. J. Strother Moore said that most of ACL2 is built by bootstrapping, from lemmas and strategies that ACL2 itself has proven. However, the core of ACL2 just has to be trusted.
It's not an issue of forgiveness. I actively encourage such questions, of myself and other people. I'd also recommend this paper by Randy Pollack:
`How to believe a machine-checked proof' http://www.dcs.ed.ac.uk/home/rap/export/believing.ps.gz
But the one-line summary is `by rechecking it independently'. Of course, we'll never achieve _certainty_ in any absolute sense, but we can do a lot better than `I can't show you the evidence for security reasons, but there's no doubt that...'. In order to do that, you need more than a machine which says `yes'. You need a language of evidence which is simple enough that people can write their own checker in whatever way they choose. Martin-Loef's type theory provides a source of good candidates for such a language of evidence. Its core proof-checking/type-checking algorithm is very small. This is what comes out the other end of the Epigram elaborator.
[..]
Hongwei Xi's code contains the evidence I'm asking for. The verification conditions are added by hand in the program as annotations, just as yours are annotations outside the program. His are checked by Presburger arithmetic, just as yours could be.
Actually, as far as the PLDI'98 paper is concerned, they specifically say they do not use full Presburger arithmetic. Rather, they solve the constraints by Fourier variable elimination. Anyway, even if the various elaboration and decision rules are proven to be sound and complete, what is the evidence that their implementation in the extended SML compiler is also sound and complete?
Another very good question. Every time you appeal to a proof-search oracle or to an axiomatization of a library, or anything which hides the evidence, the amount of stuff you `just have to trust' gets bigger, and independent rechecking gets harder.
But what if some bright spark (volunteers welcome) were to implement Presburger Arithmetic in one of the following ways?
(1) External to the system, but outputting checkable proof terms, instead of just `yes' or `no'.
(2) Within the system, with a checkable proof that its yea be yea and its nay be nay. [Traditional implementation and correctness proof.]
(3) Within the system, with a type which actually guarantees what the algorithm proves in the first place. [The `two-level' approach.]
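To make option (1) concrete, here is a toy Haskell sketch (my own invention, nothing to do with the Epigram or DML code bases, and deciding only <= on unary naturals rather than Presburger arithmetic): the solver returns a proof term rather than a bare yes/no, and a separate, tiny checker reverifies it, so a sceptic can rewrite the checker independently.

data Nat = Z | S Nat

-- evidence that m <= n, small enough to recheck by hand
data Le = LeZero        -- Z <= n, for any n
        | LeSucc Le     -- from m <= n conclude S m <= S n

-- the "oracle": instead of a bare Bool it produces the evidence
prove :: Nat -> Nat -> Maybe Le
prove Z     _     = Just LeZero
prove (S m) (S n) = fmap LeSucc (prove m n)
prove (S _) Z     = Nothing

-- an independent checker: anyone can reimplement this however they choose
check :: Le -> Nat -> Nat -> Bool
check LeZero     Z     _     = True
check (LeSucc p) (S m) (S n) = check p m n
check _          _     _     = False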
The conclusion specifically states that the algorithm is currently incomplete.
Clearly, soundness is more important. But it is nice to have a good idea which problems the machine will get and which it won't.
Me:
...this proposition is false. The bounds function returns bounds which are outside the range of the array when the array is empty. You'll notice that Hongwei Xi's program correctly handles this case.
Don't get me wrong: I think your branded arrays and indices are a very good idea. You could clearly fix this problem by Maybe-ing up bbounds or (better?) by refusing to brand empty arrays in the first place.
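For what it's worth, a minimal Haskell sketch of the second fix (all names invented; a real library would also keep the BIndex constructor abstract and supply index arithmetic): the brander simply refuses to issue a brand for an empty array.

{-# LANGUAGE RankNTypes #-}
import Data.Array

newtype BIndex s   = BIndex Int             -- an index carrying brand s
newtype BArray s e = BArray (Array Int e)   -- an array carrying brand s

-- the only way to obtain a brand; empty arrays are turned away at the door
brandArray :: Array Int e
           -> (forall s. BArray s e -> BIndex s -> BIndex s -> r)
           -> Maybe r
brandArray a k
  | lo > hi   = Nothing                     -- empty: refuse to brand
  | otherwise = Just (k (BArray a) (BIndex lo) (BIndex hi))
  where (lo, hi) = bounds a

-- indexing with a branded index needs no further run-time check
(!.) :: BArray s e -> BIndex s -> e
(BArray a) !. (BIndex i) = a ! i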
Oleg:
I have noticed that the branding trick would work very well with number-parameterized types. The latter provide the missing guarantees, for example, they statically outlaw empty arrays. Hongwei Xi's code has another neat example: a dot product of two arrays where one array is statically known to be no longer than the other. Number-Parameterized types can statically express that inequality constraint too.
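For illustration only, a bare-bones Haskell sketch of that dot-product constraint (my own toy encoding, not the paper's: type-level Peano numbers rather than decimal digits, and plain lists standing in for genuine arrays; all names invented):

{-# LANGUAGE EmptyDataDecls, MultiParamTypeClasses, FlexibleInstances #-}
data Z        -- type-level zero
data S n      -- type-level successor

newtype Vec n a = Vec [a]    -- invariant: the list has length n

nil :: Vec Z a
nil = Vec []

cons :: a -> Vec n a -> Vec (S n) a
cons x (Vec xs) = Vec (x : xs)

-- type-level <=, playing the role of the size-inequality constraint
class Le m n
instance           Le Z n
instance Le m n => Le (S m) (S n)

-- dot product where the first vector is statically no longer than the second
dotProd :: (Num a, Le m n) => Vec m a -> Vec n a -> a
dotProd (Vec xs) (Vec ys) = sum (zipWith (*) xs ys)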
Yes, of course. And any dependently typed solution to these problems would naturally parametrize the data-structures by their size, represented as ordinary numbers. Statically expressing the constraints is easy.
The abstract `brand' is just a type-level proxy for the bounding interval, and the library of operations provides interval-respecting operations on indices. This is a very neat solution in Haskell, but it goes round an extra corner which isn't necessary with dependent types, where you can just talk about the interval directly. The library-writer would develop and verify the same convenient operations for working with intervals and indices; the proofs would be independently recheckable terms in type theory.
The Number-Parameterized types paper considers a more difficult example -- and indeed the typechecker forced me to give it a term that is verifiably a proof of the property (inequality on the sizes) stated in the term's inferred type. The typechecker truly demanded a proof; shouting didn't help.
Rightly so. What tools would be useful in such a situation?
Incidentally, the paper is being considered for JFP, I guess. I don't know if the text could be made available. I still can post the link to the code: http://pobox.com/~oleg/ftp/Haskell/number-param-vector-code.tar.gz
I should emphasize that all proper examples use genuine Haskell arrays rather than nested tuples.
[..]
There's no reason why a dependently typed language should not have genuine arrays.
Yet the type of the array includes its size, conveniently expressed in decimal notation. One can specify arithmetic equality and inequality constraints on the sizes of the arrays, in the types of the corresponding functions. The constraints can be inferred. One example specifically deals with the case when the sizes of those arrays are not known until run-time -- moreover, they change in the course of a computation that involves general recursion. It seems that the branding trick nicely complements number-parameterized arrays and makes `dynamic' cases easier to handle.
Again, all of this sounds like a good thing to do, but less like hard work if you have a language designed to support it.
I don't know why you emphasize general recursion. It's no big deal in programs which we execute only at run-time. Sure, I prefer structural recursion to general recursion, because I like to have termination for free, and also to be able to use programs safely in types. But I am not a bottom-pincher: you can have general recursion at run-time and still decide typechecking. Here's a cheap and nasty trick that does it (and we're kicking around some less nasty tricks at the moment). Add a hypothesis
general :: forall p . (p -> p) -> p
This is all you need to write general recursive programs. In Epigram, your program would invoke `general' once, and then make whatever recursive calls you like. None of these programs can loop during typechecking, because `general' has no computational behaviour. We can tell which programs/proofs to trust by checking which of them use `general'. Only when we output run-time code do we make
general f = f (general f)
Of course, if you really want to run this program in types---at your own risk---we plan to let you. There are three security levels:
paradise: a complete absence of generals
purgatory: there are generals, but their orders are not obeyed
pandemonium: what happens when you listen to generals
A compiler-switch would do, but annotations on individual functions would be better.
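By way of illustration, here is a Haskell rendering of the trick (Epigram is the intended setting; the worker function is my own made-up example). The point is that the programmer appeals to `general' exactly once and writes nothing else recursive:

general :: (p -> p) -> p
general f = f (general f)    -- this unfolding exists only in run-time code

-- a single appeal to `general'; all "recursive calls" go through `self'
collatzSteps :: Integer -> Integer
collatzSteps = general (\ self n ->
  if n <= 1 then 0
            else 1 + self (if even n then n `div` 2 else 3 * n + 1))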
So please, everybody, no more `What about general recursion? What about decidable typechecking?'. It's no big deal. Why do some people presume that Cayenne's particular problems are intrinsic to any dependently typed language?
You'll notice that Hongwei Xi's program correctly handles this case.
But what if I specified the dependent type with a mistake that overlooks the empty array case? Would the dependent ML compiler catch me red-handed? In all possible cases? Where is the formal proof of that?
I can't speak for dependent ML, but you wouldn't get away with it in Epigram. Epigram rules out _provably_ impossible cases and provides a language in which to establish this guarantee. In the mundane majority of cases, this proof amounts to `constructors of datatype indices are disjoint', which gets handled automatically.
I have failed to emphasize the parallels between Hongwei Xi's annotations and the corresponding Haskell code.
It wasn't lost on me...
[..]
Yes, there is a trusted kernel involved -- just as there is a Dependent SML system to be implicitly trusted.
I agree: the appeal to the constraint oracle is a weakness. As is the appeal to informal proofs, or axiomatizations of the library.
However, in the given example the trusted kernel is compact Haskell code plus the GHC system. The latter is complex -- but it has been used by thousands of people over an extended period of time -- and so merits higher confidence than experimental extensions (unless the latter have been formally proven -- I mean the code itself -- by a trusted system such as ACL2 or Coq).
Confidence in what? Remember that Haskell is logically unsound. I can fake a branded index by taking the head of an empty list. Fortunately, this fake will always be discovered at run-time before unsafeAt goes haywire (that ain't true for the empty array bug). However, it somewhat weakens the static guarantee!
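In case the point is unclear, a self-contained sketch (the Index name is invented, standing for whatever branded index type the library issues):

newtype Index s = Index Int   -- meant to be issued only by a trusted brander

fakeIndex :: Index s          -- typechecks at every brand s
fakeIndex = head []           -- the fake is bottom: forcing it fails at run-time,
                              -- before any unsafe indexing can go wrong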
I do not wish to sound as if I were against dependent type systems and implementations. I merely wish to point out poor man's approaches.
Don't get me wrong either: I agree with you. I'm in favour of cranking as much static confidence out of Haskell's type system as possible. And some pretty good stuff is possible (modulo bottom). However, it's very easy to pick up such programs, implemented by hook or by crook (I'm a crook) in Haskell and say: `See? Who needs dependent types?'. The truth of the matter is that these programs are rather like the dependently typed programs you might write, except that you need to use type-level proxies for values and type-level Prolog trickery, instead of ordinary values and ordinary programs. The dependently typed versions of these programs are simpler than the Haskell versions. In the short term, Haskell is the here and now, so do what you can; in the longer term, choosing type class unicycling over dependent types is a false economy.
[..]
- Try to make `safety' checks algorithmic: use newtypes to `lift' the safety checks such as range checks and see if they can be merged with the tests the algorithm has to do anyway.
Yes, this is what often happens with dependently typed programs. The key thing to ensure is that the tests have types which show their meaning statically. A Bool is a bit uninformative.
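A tiny sketch of what `informative' means here (my own invented names, not from either library): the test returns a value whose type records what was checked, rather than a Bool.

newtype InRange = InRange Int   -- invariant: checked to lie in [0, bound)

checkIndex :: Int -> Int -> Maybe InRange
checkIndex bound i
  | 0 <= i && i < bound = Just (InRange i)
  | otherwise           = Nothing
-- a phantom brand, as in the array sketch above, would additionally record
-- *which* bound the index was checked against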
- Use explicitly-quantified type variables to associate `unforgeable' types with values, so that the following property holds: the equality of such types entails the equality of the corresponding values -- even if we don't know what the values are until run-time.
Exactly: type-level proxies for values. It's the right thing to do if you can't have values in types. I'm in favour of the `datakind' idea: Haskell's type-level proxies for values should look as much like values as possible!
- Convenient (decimal) type arithmetic at compile time -- and even `semi-compile' time.
There's nothing which forces numbers in dependent type systems to be unary. That's just the path of least resistance in prototypes. Also, it reflects the importance of the inductive structure of numbers when they are being used as indices. I wonder how much this really buys you when you're developing indexed operations, as opposed to when you want to type actual values. I guess I'll read the paper.
I'm interested in just how far those principles may take me.
That's a good thing to do. I'm trying to deliver a language which makes exactly that kind of program much easier to write. I hope that sounds like a good thing too.
Thank you very much for your message!
And thank you!
All the best
Conor