Crispin Cowan wrote: >> For years, I have been trying to get formal specification advocates to explain the difference between high level programming languages and specification languages. From my point of view, "formal specifications" can be divided into two categories:
* Those that can be mechanically translated into code, otherwise known as "programs" * Those that cannot be mechanically translated, otherwise known as "documentation" :) << Specifications that can readily be mechanically translated to code are often called "executable specifications". Specifications that cannot be readily translated to code fall into two groups: 1. They may either be sufficient to completely describe the required behaviour but expressed in such a way that the machine cannot turn them into a program. For example, given a sequence s we might specify: s'.ranb = s.ranb & s'.isOrdered Which says that, viewed as an unordered collection, the sequence contains the same elements as before, but they are now ordered. A sufficiently capable code generator might realise that a good way to achieve this is to call a method called "sort"; but to a less capable code generator, this would be non-executable. 2. They may describe a part of the expected behaviour rather than the complete behaviour, the intention being that they are expected properties of the complete specification (or program). For example, if the sequence in the above example is known to be non-empty, we could specify that the minimum element in the sequence is the first element (if isOrdered means ascending). These are the sorts of things that you can do to a very limited extent using assertions in C (and now Java). I think it is a mistake to regard them as documentation - they are far more than that. >> ... I agree with the other posters: the above could describe a formal specification, but could also describe a declarative programming language. << Executable specifications can indeed be related to declarative programming. >> However, I think I do see a gap between these extremes. You could have a formal specification that can be mechanically transformed into a *checker* program that verifies that a solution is correct, but cannot actually generate a correct solution. The assert() statements that David Crocker mentioned are an incomplete form of this; incomplete because the do not *completely* verify the program's behavior to be correct (because they are haphazardly placed by hand). So there's another midpoint in the spectrum: a formal spec that can only verify correctness but is complete, effectively is a program for non-deterministic machines (cf: NP completeness theory). A spec that is incomplete (does not specify all outputs) is more of an approximation. << What you are talking about is "annotated development". Languages/products that support this include SPARK and ESC/Java. Unfortunately, if you want to use automated proof, you need to either substantially restrict the language (SPARK) or accept that the "proofs" are not always sound (ESC/Java). BTW I think both of those are excellent products in spite of their limitations. >> All of which begs the question: are these formal specs that are somehow not programs any easier to verify than actual programs? Probably somewhat easier (they are necessarily simpler) but some would argue, not enough simpler to be worth the bother. E.g. suppose 100,000 lines of code reduces to 10,000 lines of formal specification in some logical notation. A hard problem, but solvable, is a mechanical proof that the 10,000 line spec and the 100,000 lines of code actually conform. An unsolved problem is "does the 10,000 line spec mean what the human *thinks* it means?" << If the code is handwritten in a conventional programming language (excluding the SPARK subset of Ada) then it is very difficult to get a high success rate using automated proof. Generating code from specifications (with manual refinement in a carefully-designed notation where necessary) is not only a faster way to produce code, it also makes the proofs easier. As to whether the 10,000 line spec means what the human thinks it means: if the human is an experienced developer, then it is much more likely to mean what he/she thinks it means that the 100,000 line program is. Specifications can be mechanically checked for inconsistency and some forms of incompleteness, reducing the possibility of error. However, whether the specification means what the non-technical end-user wants is another matter entirely. David Crocker, Escher Technologies Ltd. Consultancy, contracting and tools for dependable software development www.eschertech.com