> this is possible, but it assumes, essentially, that one doesn't run into
> such a limit.
> if one gets to a point where every "fundamental" concept is only ever
> expressed once, and everything is built from preceding fundamental concepts,
> then this is a limit, short of dropping fundamental concepts.

Yes, but I don't think any theoretical framework can tell us a priori
how close we are to that limit. The fact that we run out of ideas
doesn't mean there are no more new ideas waiting to be discovered.
Maybe if we change our choice of fundamental concepts, we can further
simplify our systems.

For instance, it was assumed that the holy grail of Lisp would be to
get to the essence of lambda calculus, and then John Shutt did away
with lambda as a fundamental concept, he derived it from vau, doing
away with macros and special forms in the process. I don't know
whether Kernel will live up to its promise, but in any case it was an
innovative line of inquiry.

> theoretically, about the only way to really do much better would be using a
> static schema (say, where the sender and receiver have a predefined set of
> message symbols, predefined layout templates, ...). personally though, I
> really don't like these sorts of compressors (they are very brittle,
> inflexible, and prone to version issues).
> this is essentially what "write a tic-tac-toe player in Scheme" implies:
> both the sender and receiver of the message need to have a common notion of
> both "tic-tac-toe player" and "Scheme". otherwise, the message can't be
> decoded.

But nothing prevents you from reaching this common notion via previous
messages. So, I don't see why this protocol would have to be any more
brittle than a more verbous one.

> a more general strategy is basically to build a model "from the ground up",
> where the sender and reciever have only basic knowledge of basic concepts
> (the basic compression format), and most everything else is built on the fly
> based on the data which has been seen thus far (old data is used to build
> new data, ...).

Yes, but, as I said, old that are used to build new data, but there's
no need to repeat old data over and over again. When two people
communicate with each other, they don't introduce themselves and share
their personal details again and again at the beginning of each

> and, of course, such a system would likely be, itself, absurdly complex...

The system wouldn't have to be complex. Instead, it would *represent*
complexity through first-class data structures. The aim would be to
make the implicit complexity explicit, so that this simple system can
reason about it. More concretely, the implicit complexity is the
actual use of competing, redundant standards, and the explicit
complexity is an ontology describing those standards, so that a
reasoner can transform, translate and find duplicities with
dramatically less human attention. Developing such an ontology is by
no means trivial, it's hard work, but in the end I think it would be
very much worth the trouble.

> and this is also partly why making everything smaller (while keeping its
> features intact) would likely end up looking a fair amount like data
> compression (it is compression code and semantic space).

Maybe, but I prefer to think of it in terms of machine translation.
There are many different human languages, some of them more expressive
than others (for instance, with a larger lexicon, or a more
fine-grained tense system). If you want to develop an interlingua for
machine translation, you have to take a superset of all "features" of
the supported languages, and a convenient grammar to encode it (in GF
it would be an "abstract syntax"). Of course, it may be tricky to
support translation from any language to any other, because you may
need neologisms or long clarifications to express some ideas in the
least expressive languages, but let's leave that aside for the moment.
My point is that, once you do that, you can feed a reasoner with
literature in any language, and the reasoner doesn't have to
understand them all; it only has to understand the interlingua, which
may well be easier to parse than any of the target languages. You
didn't eliminate the complexity of human languages, but now it's
tidily packaged in an ontology, where it doesn't get in the reasoner's

> some of this is also what makes my VM sub-project as complex as it is: it
> deals with a variety of problem cases, and each adds a little complexity,
> and all this adds up. likewise, some things, such as interfacing (directly)
> with C code and data, add more complexity than others (simpler and cleaner
> FFI makes the VM itself much more complex).

Maybe that's because you are trying to support everything "by hand",
with all this knowledge and complexity embedded in your code. On the
other hand, it seems that the VPRI team is trying to develop new,
powerful standards with all the combined "features" of the existing
ones while actually supporting a very small subset of those legacy
standards. This naturally leads to a small, powerful and beautiful
system, but then you miss a lot of knowledge which is trapped in
legacy code. What I would like to see is a way to rescue this
knowledge and treat it like data, so that the mechanism to interface
to those other systems doesn't have to be built into yours, and so
that you don't have to interface with them that often to begin with.
fonc mailing list

Reply via email to