functional programming for amateurs in an outliner

Kragen Sitaker Mon, 27 May 2002 00:21:06 -0700

I think I have conceived a design for a programming environment that
will appeal strongly even to nonprogrammers and put an unprecedented
amount of computational power and flexibility at their disposal, while
remaining unusually productive even for expert programmers.
(Foolishly, I am willing to claim this despite not having spent much
time teaching nonprogrammers to program.)


This design brings together elements of spreadsheets, array-processing
languages, pure functional languages, prototype-based object-oriented
languages, outliners, rapid-feedback test-first programming, Visual
Basic, traditional IDEs, Python, and Simonyi's "Intentional
Programming" (which I still frankly think is bullshit, although it's
hard to tell now that all the original papers have been pulled off the
Web).  I don't think it requires solving any unsolved problems to make
it work; it's a simple matter of programming.

I'll probably have to spend some time hacking on this to see whether
the idea is really as good as I think it is.

What on earth is going on with spreadsheets?
--------------------------------------------

VisiCalc was a programming language for numerical computation, and it
saw the most rapid adoption of any programming language the world has
ever seen, among people who didn't think of themselves as programmers
and still don't, and its progeny is a major market segment to this
day.  This is an astonishing and little-remarked fact.  How did this
happen?

Well, one important element was that VisiCalc (and spreadsheets in
general) gave instant feedback --- as soon as you defined your
formula, you saw the result, and you could usually tell if it was
wrong as soon as you wrote it.

Another related important element was visibility: nothing was hidden,
so the user did not have to spend their precious brain cycles trying
to visualize what was happening inside the program.

A third important element was array processing power, like in APL;
this reduced the need for abstraction, which helps a lot in providing
visibility.

A fourth element was the untyped nature of the language: you didn't
need to specify types for cells, you just put the right values there,
and you didn't even have to fight with type conflicts --- like Perl or
Tcl, objects were automatically coerced to the right type.

Activation records and objects
------------------------------

According to Gelernter's _Machine Beauty_, the unique innovation of
Simula was to give activation records indefinite extent, turning them
into objects, and Scheme was similarly founded on the idea that any
lexical scope was potentially an actor, an object.  Self turned this
on its head: in Self, an activation record is an object, with
inheritance and everything.  Self is prototype-based: objects have
concrete prototypes, and when name lookup fails in an object, it
continues in the object's parent objects.  The local variables in the
activation records are attributes of the activation-record object; the
object inherits (prototype-wise) from activation records of lexically
enclosing blocks.

My programming environment design
---------------------------------

An expression calculator for pure functional expressions can
re-evaluate each expression every time it's modified, displaying the
result next to the expression, as the 'dynamic calculation' hack I
recently posted to kragen-hacks illustrates.
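
As a rough illustration only (this is not the kragen-hacks code), here
is a minimal Python sketch of such a calculator.  The function name
recalc and the "name = expression" line format are assumptions made
for the example; it re-evaluates every line from scratch and shows
each result in parentheses next to its expression.

```python
def recalc(lines):
    """Re-evaluate every 'name = expression' line, top to bottom."""
    env = {}       # values of the names defined so far
    shown = []
    for line in lines:
        name, expr = (part.strip() for part in line.split("=", 1))
        try:
            # eval is fine for a sketch; never use it on untrusted input.
            env[name] = eval(expr, {"__builtins__": {}}, env)
            result = repr(env[name])
        except Exception as exc:
            result = "error: " + type(exc).__name__
        shown.append(f"{line}   ({result})")
    return shown


for row in recalc(["x = 3", "y = x * 2", "z = q + 1"]):
    print(row)
```

Calling recalc again after every edit gives the instant feedback
described here; a real environment would re-evaluate only the
formulas that depend on the edited one.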

Suppose we have a pure functional programming language in which all
function parameters have default values.  A programmer can see values
propagate through their function, just like in VisiCalc --- as they
define each new variable, they can see its value with the default
values.  Now activation records of the function are just objects that
inherit from the function definition, but have different parameter
values.

The values defined in the function are attributes of the object.  You
might write a "function call" providing the value 4 for the parameter
x as "f(x=4)"; another way of understanding this expression is that
you are instantiating an object that inherits from f, and you are
overriding the inherited value of x with 4. You might get the y value
from that function as "f(x=4).y".
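
A minimal Python sketch of this reading of "f(x=4).y", assuming a
hypothetical class Proto invented for the example: formulas are stored
as functions of the object, a call creates a child object that
overrides some values, and name lookup falls back to the parent.

```python
class Proto:
    """Hypothetical prototype object: a function and its activation
    records are the same kind of thing."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        # Each slot is a constant or a formula (a function of the object).
        self.slots = slots

    def __call__(self, **overrides):
        # A "call" instantiates a child that overrides inherited values.
        return Proto(parent=self, **overrides)

    def __getattr__(self, name):
        # Reached only when normal lookup fails, i.e. for slot names.
        obj = self
        while obj is not None:
            if name in obj.slots:
                value = obj.slots[name]
                if callable(value) and not isinstance(value, Proto):
                    # Formulas run against the receiver, so they see
                    # the overridden parameters.
                    return value(self)
                return value
            obj = obj.parent
        raise AttributeError(name)


f = Proto(x=3, y=lambda self: self.x * 2 + 1)
print(f.y)        # 7, computed from the default x = 3
print(f(x=4).y)   # 9, with x overridden
```

Because formulas are looked up through the parent at access time,
editing a formula in f would be reflected in every object derived
from it.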

If the user gets instant feedback (a la my recent "dynamic
calculation" hack) when they write "f(x=4)" in the form of seeing a
version of "f" pop up with the value "4" for "x" and the other
resulting values for other attributes, then the user should be able to
(a) determine whether that was what they really meant and (b) figure
out which attribute of "f" they really want.  Furthermore, they should
be able to fix bugs in "f" by editing the formulas inside it, right
there, and have them reflected everywhere "f" is used.

If the description of "f" is displayed and edited in an outliner, then
only its inputs and outputs might be displayed by default.  Aggregate
data can be put in substructures, which can be referred to with
dotted.word.notation, as in Java, Python, Pascal, or C.  Moving things
around in the outliner should result in all references to their names
in your program getting changed.

Formulas whose computation involves side effects will be displayed
with an "evaluate" button (labeled "Do it!") and won't be evaluated
unless the button is clicked.  When these formulas are hidden, their
"Do it!" buttons will migrate upwards to the nearest displayed heading
--- so functions whose computation involves, or may involve, side
effects will inherit the button-ness of their constituent formulas.
Dependency tracking can still work through the side-effecting
formulas.
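
One way the button migration could be computed, sketched in Python.
The Node class and its field names are assumptions made for this
example: a heading shows a button if it is the nearest displayed
ancestor of a side-effecting formula.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    effectful: bool = False   # does this formula itself have side effects?
    collapsed: bool = False   # True when the children are hidden
    children: list = field(default_factory=list)


def has_effects(node):
    """True if the node or anything beneath it has side effects."""
    return node.effectful or any(has_effects(c) for c in node.children)


def buttons(node):
    """Names of the displayed headings that should show 'Do it!'."""
    if node.collapsed or not node.children:
        # Hidden descendants pass their button up to this heading.
        return [node.name] if has_effects(node) else []
    shown = [node.name] if node.effectful else []
    for child in node.children:
        shown += buttons(child)
    return shown


email = Node("email", effectful=True)
send = Node("send", children=[email], collapsed=True)
report = Node("report", children=[Node("total"), send])
print(buttons(report))   # ['send']
```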

(Of course, once there are side effects in the language, evaluation
needs to be strict rather than lazy.)

Dragging and dropping these variables onto user-interface forms and
report forms should be a cinch to learn to use.  It'd be nice to have
a constraints-based layout manager that still had visual editing, of
course.

For programming in the large, separate compilation, and library
writing, some kind of module system would be nice, especially when you
want the kind of automatic dynamic renaming described above.  If
you're working on a library used by people whose computers you don't
have access to, you can't change top-level names at will --- your
programming environment may rename all *your* references to the
top-level name, but it can't rename all the references on the other
computers.

Sugar and spice
---------------

If more traditional function syntax is desired (sin(theta) rather than
sin(theta=theta).value), you could provide it by defining sin(theta) =
sin_internal(theta=theta).value, just as you might define x = 3.

When editing a function, you might want to specify which values could
be overridden by a function call, which values must be overridden by a
function call, and which values can't be read from outside the
function, or you could just use the above method for defining such
functions.

Speculative: List comprehensions would be a natural way of defining
new aggregate operations.

Frontier UserTalk is an imperative programming language that uses an
outliner to edit it.  But its outline represents a hierarchy of
control structures, not a hierarchy of data structures or function
calls.  This might be a decent way of including conditionals into the
language:

V absval = 
  x = 3   (3)
  V result = if x < 0 then
    value = -x  (not evaluated)
  V else
    value = x   (3)
  assert(result.value >= 0)

# the following three lines are tests
> absval(x=2)
> absval(x=0)
> absval(x=-1)

abs(x) = absval(x=x).result.value

This is one of several cases where you'd want to have nameless values
that got computed anyway.
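
For comparison, a rough Python transliteration of the absval outline.
It uses types.SimpleNamespace to stand in for outline nodes with named
values; the name abs_ is chosen only to avoid shadowing Python's
built-in abs.

```python
from types import SimpleNamespace


def absval(x=3):
    # Only the branch that is taken gets evaluated.
    if x < 0:
        result = SimpleNamespace(value=-x)
    else:
        result = SimpleNamespace(value=x)
    assert result.value >= 0
    # Return every named value, like the attributes of the outline node.
    return SimpleNamespace(x=x, result=result)


def abs_(x):
    return absval(x=x).result.value


# The three test lines from the outline.
for x in (2, 0, -1):
    print(x, abs_(x))
```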

Case analyses are one of several places where you'd want to have a
blank added for a value automatically --- if a variable is added in
one branch of the case, it should be added in the other branch of the
case as well.

Assertions would be a useful side-effect-ful operation to evaluate
without requiring manual activation --- you should always be able to
see a little LED indicator indicating whether any assertions in your
program were currently failing, clicking on which would tell you what
they were.  It should be possible to override the side-effect-ful
status of a function, in general --- either forcing the system to
treat a slow-to-compute side-effect-free function as imperative, or
forcing it to automatically perform side effects every time a variable
was evaluated.

It would be good to be able to click on any variable and immediately
be taken to its definition; or, likewise, to ask where a function is
instantiated/inherited/called from.

It would be nice to be able to declare pseudo-aggregate items in a
manner similar to Python's __getattr__, such that the rvalue a.b turns
into something like a.__getattr__("b").value, and a(b=3) turns into
a.__withattr__("b", 3).
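
A small Python sketch of such a pseudo-aggregate.  __getattr__ is a
real Python hook; __withattr__ is only the name proposed here, not a
Python special method, so __call__ delegates to it explicitly.  The
WordLengths class is invented for the example, and the ".value" step
is left out for brevity.

```python
class WordLengths:
    """Pseudo-aggregate: a.anything is computed, not stored."""

    def __init__(self, overrides=None):
        self._overrides = dict(overrides or {})

    def __getattr__(self, name):
        # Called only for names that normal lookup does not find.
        if name.startswith("_"):
            raise AttributeError(name)
        return self._overrides.get(name, len(name))

    def __withattr__(self, name, value):
        # Hypothetical hook: a copy with one attribute overridden.
        return WordLengths({**self._overrides, name: value})

    def __call__(self, **kwargs):
        # a(b=3) turns into a.__withattr__("b", 3).
        obj = self
        for name, value in kwargs.items():
            obj = obj.__withattr__(name, value)
        return obj


a = WordLengths()
print(a.hello)             # 5
print(a(hello=3).hello)    # 3
```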

Speculative: If this is an array-oriented language (like APL, Matlab,
Octave, IDL, PDL, Numpy, and J), then numeric operations, at least,
ought to transparently work elementwise on arrays; another way of
saying this is that array indexing distributes over numeric
operations, e.g. (a + b)[0] == a[0] + b[0].  Perhaps array indexing
also ought to "distribute over" attribute access (a.foo), such that
a.foo[0] is the same as a[0].foo, at least where a.foo is
actually an array.  (Presumably if a has any attributes that are not
arrays, a[0] should raise an exception.)  That's one way to handle
loops.
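
Both identities can be checked in a small Python sketch.  The classes
Arr and Record are hypothetical stand-ins written for this example,
not any real array library.

```python
class Arr:
    """A tiny array whose + works elementwise."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        if len(self.items) != len(other.items):
            raise ValueError("length mismatch")
        return Arr(x + y for x, y in zip(self.items, other.items))

    def __getitem__(self, i):
        return self.items[i]


class Record:
    """An aggregate whose attributes are arrays."""

    def __init__(self, **fields):
        self.fields = fields

    def __getattr__(self, name):
        if name == "fields":
            raise AttributeError(name)
        try:
            return self.fields[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, i):
        # Indexing distributes over attributes; non-arrays are an error.
        bad = [n for n, v in self.fields.items() if not isinstance(v, Arr)]
        if bad:
            raise TypeError(f"non-array attributes: {bad}")
        return Record(**{n: v[i] for n, v in self.fields.items()})


a, b = Arr([1, 2, 3]), Arr([10, 20, 30])
print((a + b)[0] == a[0] + b[0])    # True
r = Record(foo=Arr([1, 2]), bar=Arr([3, 4]))
print(r.foo[0] == r[0].foo)         # True
```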

Problem: For the above-defined abs function to transparently work
elementwise on arrays, conditionals must also somehow distribute over
arrays.  This would be easy to do in a lazy environment, but not in a
strict one.
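
In a strict language, one workaround is to pass each branch as a
function so that it runs only for the elements that select it.  A
Python sketch under that assumption (if_each and the other names are
invented for the example):

```python
def if_each(test, then_branch, else_branch, xs):
    """Elementwise conditional over a list.  Each branch is a function
    and is called only for the elements that select it."""
    return [then_branch(x) if test(x) else else_branch(x) for x in xs]


def abs_each(xs):
    return if_each(lambda x: x < 0, lambda x: -x, lambda x: x, xs)


def reciprocal_each(xs):
    # Computing 1 / x strictly for every element would fail on 0.
    return if_each(lambda x: x == 0, lambda x: 0.0, lambda x: 1 / x, xs)


print(abs_each([-1, 0, 2]))       # [1, 0, 2]
print(reciprocal_each([0, 4]))    # [0.0, 0.25]
```

The reciprocal case shows why this matters: evaluating both branches
over the whole array first and selecting afterwards would raise an
error for elements that never needed that branch.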

Unhandled errors can be handled by propagating them up to the lowest
visible header for display, just like "Do it!" buttons.  (And, of
course, propagating the absence of their return value through formulae
that depend on them.)  Naturally, there should be a condition-case
facility for programmatically handling exceptions as well.  This could
perhaps be the general case of which assertion failure is a special
case.

It would be good to be able to click on the unhandled-error message to
dig all the way down to where it came from instead of having to do it
one step at a time.
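
A Python sketch of how the absence of a value might propagate while
remembering where the error started.  The Failed class and the
evaluate function are assumptions for this example; formulas are
given in dependency order.

```python
class Failed:
    """Marks a formula whose value is absent because of an error."""

    def __init__(self, error, origin):
        self.error = error
        self.origin = origin   # name of the formula that first failed


def evaluate(formulas):
    """formulas maps name -> (function, [dependency names]) and is
    listed in dependency order."""
    values = {}
    for name, (fn, deps) in formulas.items():
        args = [values[d] for d in deps]
        failed = next((a for a in args if isinstance(a, Failed)), None)
        if failed is not None:
            # Propagate the absence, keeping the original culprit.
            values[name] = failed
            continue
        try:
            values[name] = fn(*args)
        except Exception as exc:
            values[name] = Failed(exc, origin=name)
    return values


values = evaluate({
    "a": (lambda: 1, []),
    "b": (lambda a: a / 0, ["a"]),
    "c": (lambda b: b + 1, ["b"]),
    "d": (lambda a: a + 1, ["a"]),
})
print(values["c"].origin)   # b
print(values["d"])          # 2
```

Since every dependent value carries the origin, a click on the error
shown for c could jump straight to b.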

Generally, as in my "dynamic calculation" hack, it's a good idea for
debugging to retain the last valid value produced by an erroneous or
side-effectful computation so you can debug things that depend on it.
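
A minimal Python sketch of retaining the last valid value; the Cell
class and its attribute names are hypothetical.

```python
class Cell:
    """A formula that remembers its last successfully computed value."""

    def __init__(self, formula):
        self.formula = formula
        self.last_good = None   # last valid value, if any
        self.error = None       # most recent error, if any

    def refresh(self, *args):
        try:
            self.last_good = self.formula(*args)
            self.error = None
        except Exception as exc:
            # Keep last_good untouched so dependents stay debuggable.
            self.error = exc
        return self.last_good


cell = Cell(lambda x: 10 / x)
print(cell.refresh(2))   # 5.0
print(cell.refresh(0))   # 5.0 again; the error is in cell.error
```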

Question: how to handle loops in general?  Implicit "map" will
probably be sufficient for many applications, but it seems that more
advanced loops (e.g. "reduce", "filter") will need to be defined in
some other way.  Recursion would probably work OK.  Also, I've never
seen a language that had implicit "map" and also handled lists nested
to varying depths.
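
For reference, "reduce" and "filter" written with plain recursion in
Python, plus one possible reading of "map" over lists nested to
varying depths.  The function names are invented for the example, and
deep_map is only an explicit sketch, not an answer to how an implicit
version should behave.

```python
def reduce_rec(f, xs, init):
    """Left fold by recursion: f(f(f(init, x0), x1), ...)."""
    if not xs:
        return init
    return reduce_rec(f, xs[1:], f(init, xs[0]))


def filter_rec(pred, xs):
    """Keep the elements for which pred is true."""
    if not xs:
        return []
    rest = filter_rec(pred, xs[1:])
    return [xs[0]] + rest if pred(xs[0]) else rest


def deep_map(f, x):
    """Apply f to every non-list leaf, preserving the nesting."""
    if isinstance(x, list):
        return [deep_map(f, e) for e in x]
    return f(x)


print(reduce_rec(lambda a, b: a + b, [1, 2, 3], 0))     # 6
print(filter_rec(lambda x: x % 2 == 0, [1, 2, 3, 4]))   # [2, 4]
print(deep_map(abs, [-1, [2, [-3]]]))                   # [1, [2, [3]]]
```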


-- 
<[EMAIL PROTECTED]>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
To forget the evil in oneself is to turn one's own good -- now
untethered from modesty and rendered tyrannical -- into a magnified
power for evil.  -- Steve Talbot, NETFUTURE #129, via SMART Letter
