I think I have conceived a design for a programming environment that will appeal strongly even to nonprogrammers and put an unprecedented amount of computational power and flexibility at their disposal, while remaining unusually productive even for expert programmers. (Foolishly, I am willing to claim this despite not having spent much time teaching nonprogrammers to program.)
This design brings together elements of spreadsheets, array-processing languages, pure functional languages, prototype-based object-oriented languages, outliners, rapid-feedback test-first programming, Visual Basic, traditional IDEs, Python, and Simonyi's "Intentional Programming" (which I still frankly think is bullshit, although it's hard to tell now that all the original papers have been pulled off the Web).

I don't think it requires solving any unsolved problems to make it work; it's a simple matter of programming. I'll probably have to spend some time hacking on this to see whether the idea is really as good as I think it is.

What on earth is going on with spreadsheets?
--------------------------------------------

VisiCalc was a programming language for numerical computation, and it saw the most rapid adoption of any programming language the world has ever seen, among people who didn't think of themselves as programmers and still don't, and its progeny is a major market segment to this day. This is an astonishing and little-remarked fact.

How did this happen? Well, one important element was that VisiCalc (and spreadsheets in general) gave instant feedback --- as soon as you defined your formula, you saw the result, and you could usually tell if it was wrong as soon as you wrote it.

Another related important element was visibility: nothing was hidden, so the user did not have to spend their precious brain cycles trying to visualize what was happening inside the program.

A third important element was array processing power, like in APL; this reduced the need for abstraction, which helps a lot in providing visibility.

A fourth element was the untyped nature of the language: you didn't need to specify types for cells, you just put the right values there, and you didn't even have to fight with type conflicts --- like Perl or Tcl, objects were automatically coerced to the right type.

Activation records and objects
------------------------------

According to Gelernter's _Machine Beauty_, the unique innovation of Simula was to give activation records indefinite extent, turning them into objects, and Scheme was similarly founded on the idea that any lexical scope was potentially an actor, an object.

Self turned this on its head: in Self, an activation record is an object, with inheritance and everything. Self is prototype-based: objects have concrete prototypes, and when name lookup fails in an object, it continues in the object's parent objects. The local variables in the activation records are attributes of the activation-record object; the object inherits (prototype-wise) from activation records of lexically enclosing blocks.

My programming environment design
---------------------------------

An expression calculator for pure functional expressions can re-evaluate each expression every time it's modified, displaying the result next to the expression, as the 'dynamic calculation' hack I recently posted to kragen-hacks illustrates.

Suppose we have a pure functional programming language in which all function parameters have default values. A programmer can see values propagate through their function, just like in VisiCalc --- as they define each new variable, they can see its value with the default values.

Now activation records of the function are just objects that inherit from the function definition, but have different parameter values. The values defined in the function are attributes of the object. You might write a "function call" providing the value 4 for the parameter x as "f(x=4)"; another way of understanding this expression is that you are instantiating an object that inherits from f, and you are overriding the inherited value of x with 4. You might get the y value from that function as "f(x=4).y".
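
A minimal Python sketch of the "calling is instantiating with overrides" idea follows. Everything in it is an assumption made for illustration: the class name Proto, the choice to store formulas as one-argument callables, and the formula y = 2*x + 1 are all hypothetical, and formulas are simply re-evaluated on every access rather than tracked for dependencies.

```python
class Proto:
    """A 'function' whose activation records are objects inheriting from it.

    Hypothetical illustration only: slots hold either constants or
    formulas (callables that receive the object being evaluated).
    """

    def __init__(self, parent=None, /, **slots):
        # Write through __dict__ so these two names never reach __getattr__.
        self.__dict__["_parent"] = parent
        self.__dict__["_slots"] = slots

    def _lookup(self, name):
        # Prototype-style lookup: try this object, then its parents.
        obj = self
        while obj is not None:
            if name in obj._slots:
                return obj._slots[name]
            obj = obj._parent
        raise AttributeError(name)

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        slot = self._lookup(name)
        # Formulas are evaluated against the receiver, so overrides
        # made in a child affect formulas inherited from the parent.
        return slot(self) if callable(slot) else slot

    def __call__(self, **overrides):
        # "f(x=4)" creates a child of f with x overridden.
        return Proto(self, **overrides)


# f has a default x of 3 and a formula for y (both made up for the example).
f = Proto(x=3, y=lambda self: self.x * 2 + 1)

print(f.y)         # 7, computed from the default x
print(f(x=4).y)    # 9, computed from the overridden x
```

Because the formula for y lives only in f, editing it there would change the result of every "call" derived from f, which is the behaviour the design asks for.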

If the user gets instant feedback (a la my recent "dynamic calculation" hack) when they write "f(x=4)" in the form of seeing a version of "f" pop up with the value "4" for "x" and the other resulting values for other attributes, then the user should be able to (a) determine whether that was what they really meant and (b) figure out which attribute of "f" they really want. Furthermore, they should be able to fix bugs in "f" by editing the formulas inside it, right there, and have them reflected everywhere "f" is used.

If the description of "f" is displayed and edited in an outliner, then only its inputs and outputs might be displayed by default.

Aggregate data can be put in substructures, which can be referred to with dotted.word.notation, as in Java, Python, Pascal, or C. Moving things around in the outliner should result in all references to their names in your program getting changed.

Formulas whose computation involves side effects will be displayed with an "evaluate" button (labeled "Do it!") and won't be evaluated unless the button is clicked. When these formulas are hidden, their "Do it!" buttons will migrate upwards to the nearest displayed heading --- so functions whose computation involves, or may involve, side effects will inherit the button-ness of their constituent formulas. Dependency tracking can still work through the side-effecting formulas. (Of course, once there are side effects in the language, you need to be strict.)

Dragging and dropping these variables onto user-interface forms and report forms should be a cinch to learn to use. It'd be nice to have a constraints-based layout manager that still had visual editing, of course.

For programming in the large, separate compilation, and library writing, some kind of module system would be nice, especially when you want the kind of automatic dynamic renaming described above.
If you're working on a library used by people whose computers you don't have access to, you can't change top-level names at will --- your programming environment may rename all *your* references to the top-level name, but it can't rename all the references on the other computers.

Sugar and spice
---------------

If more traditional function syntax is desired (sin(theta) rather than sin(theta=theta).value), you could provide it by defining sin(theta) = sin_internal(theta=theta).value, just as you might define x = 3.

When editing a function, you might want to specify which values could be overridden by a function call, which values must be overridden by a function call, and which values can't be read from outside the function, or you could just use the above method for defining such functions.

Speculative: List comprehensions would be a natural way of defining new aggregate operations.

Frontier UserTalk is an imperative programming language that is edited in an outliner. But its outline represents a hierarchy of control structures, not a hierarchy of data structures or function calls. This might be a decent way of including conditionals in the language:

    V absval =
        x = 3                          (3)
      V result =
          if x < 0 then
            value = -x                 (not evaluated)
        V else
            value = x                  (3)
        assert(result.value >= 0)
        # the following three lines are tests
      > absval(x=2)
      > absval(x=0)
      > absval(x=-1)
    abs(x) = absval(x=x).result.value

This is one of several cases where you'd want to have nameless values that got computed anyway.

Case analyses are one of several places where you'd want to have a blank added for a value automatically --- if a variable is added in one branch of the case, it should be added in the other branch of the case as well.
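
As a rough illustration of what the environment might display for the absval outline, here is a small Python sketch. It is only a guess at the behaviour: the dictionary layout, the NOT_EVALUATED marker, and the name abs_ (chosen to avoid shadowing Python's built-in abs) are hypothetical. Each branch is a thunk, so the branch not taken really is never evaluated.

```python
NOT_EVALUATED = "(not evaluated)"


def absval(x=3):
    """Evaluate the absval outline for one x and return every displayed row."""
    # Each branch is delayed so that only the selected one is computed.
    branches = {"then": lambda: -x, "else": lambda: x}
    taken = "then" if x < 0 else "else"
    shown = {
        name: (thunk() if name == taken else NOT_EVALUATED)
        for name, thunk in branches.items()
    }
    value = shown[taken]
    # The assertion from the outline; in the proposed environment a failure
    # here would light up the indicator rather than abort the program.
    assert value >= 0
    return {"x": x, **shown, "result.value": value}


def abs_(x):
    # Sugar: abs(x) = absval(x=x).result.value
    return absval(x=x)["result.value"]


# The default instantiation plus the three test instantiations from the outline.
for rows in (absval(), absval(x=2), absval(x=0), absval(x=-1)):
    print(rows)
```

With the default x of 3 the "then" row shows as not evaluated and the "else" row shows 3, matching the annotations in the outline.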

Assertions would be a useful side-effect-ful operation to evaluate without requiring manual activation --- you should always be able to see a little LED indicator indicating whether any assertions in your program were currently failing, clicking on which would tell you what they were.

It should be possible to override the side-effect-ful status of a function, in general --- either forcing the system to treat a slow-to-compute side-effect-free function as imperative, or forcing it to automatically perform side effects every time a variable was evaluated.

It would be good to be able to click on any variable and immediately be taken to its definition; or, likewise, to ask where a function is instantiated/inherited/called from.

It would be nice to be able to declare pseudo-aggregate items in a manner similar to Python's __getattr__, such that the rvalue a.b turns into something like a.__getattr__("b").value, and a(b=3) turns into a.__withattr__("b", 3).

Speculative: If this is an array-oriented language (like APL, Matlab, Octave, IDL, PDL, Numpy, and J), then numeric operations, at least, ought to transparently work elementwise on arrays; another way of saying this is that array indexing distributes over numeric operations, e.g. (a + b)[0] == a[0] + b[0]. Perhaps array indexing also ought to "distribute over" attribute access (a.foo), such that a.foo[0] is the same as a[0].foo, at least where a.foo is actually an array. (Presumably if a has any attributes that are not arrays, a[0] should raise an exception.)

That's one way to handle loops.

Problem: For the above-defined abs function to transparently work elementwise on arrays, conditionals must also somehow distribute over arrays. This would be easy to do in a lazy environment, but not in a strict one.

Unhandled errors can be handled by propagating them up to the lowest visible header for display, just like "Do it!" buttons.
(And, of course, propagating the absence of their return value through formulae that depend on them.) Naturally, there should be a condition-case facility for programmatically handling exceptions as well. This could perhaps be the general case of which assertion failure is a special case.

It would be good to be able to click on the unhandled-error message to dig all the way down to where it came from instead of having to do it one step at a time.

Generally, as in my "dynamic calculation" hack, it's a good idea for debugging to retain the last valid value produced by an erroneous or side-effectful computation so you can debug things that depend on it.

Question: how to handle loops in general? Implicit "map" will probably be sufficient for many applications, but it seems that more advanced loops (e.g. "reduce", "filter") will need to be defined in some other way. Recursion would probably work OK. Also, I've never seen a language that had implicit "map" and also handled lists nested to varying depths.

-- 
<[EMAIL PROTECTED]>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
To forget the evil in oneself is to turn one's own good -- now untethered from modesty and rendered tyrannical -- into a magnified power for evil. -- Steve Talbot, NETFUTURE #129, via SMART Letter