On Friday, 26 July 2013 at 23:19:45 UTC, H. S. Teoh wrote:
On Fri, Jul 26, 2013 at 03:02:32PM +0200, JS wrote:
I think the next step in languages is multi-level abstraction. Right now we have the base level of core programming and the preprocessing/template/generic level above that. There is no reason why languages can't/shouldn't keep going. The ability to control and help the compiler do its job better is the next frontier. Analogous to how C++ allowed for abstraction of data, templates allow for abstraction of functionality; we then need to abstract "templates" (or rather, metaprogramming).
There is much value to be had in working with the minimum possible subset of features that can achieve what you want with a minimum of hassle. The problem with going too far with abstraction is that you start living in an imaginary, idealistic dreamworld that has nothing to do with how the hardware actually implements the stuff. You start writing some idealistic code and then wonder why it doesn't work, or why performance is so poor. As Knuth once said:

By understanding a machine-oriented language, the programmer will tend to use a much more efficient method; it is much closer to reality. -- D. Knuth

People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth
If I ever had the chance to teach programming, my first course would be assembly language programming, followed by C, then by other languages, starting with a functional language (Lisp, Haskell, or Concurrent Clean), then other imperative languages, like Java. (Then finally I'll teach them D and tell them to forget about the others. :-P)

Will I expect my students to write large applications in assembly language or C? Nope. But will I require them to pass the final exam in assembly language? YOU BETCHA. I had the benefit of learning assembly while I was still a teenager, and I can't tell you how much that experience has shaped my programming skills. Even though I haven't written a single line of assembly for at least 10 years, understanding what the machine ultimately runs gave me deep insights into why certain things are done in a certain way, and how to take advantage of that. It helps you build *useful* abstractions that map well to the underlying machine code, which therefore gives you good performance and overall behaviour while providing ease of use.
I used to program in assembly and loved it. The problem was that one could not write large programs. Not because it is impossible in assembly, but because there were few (or no) ways to abstract. Large programs MUST be able to be broken into manageable pieces. OOP is what allows the large programs of our times... whether they are written in assembly or not is irrelevant.

E.g., it is not impossible to think of an assembly-like (low-level) language that has many high-level concepts (classes, templates, etc.) and a compiler that has many safety features (type checking, code analysis, bounds checking, etc.)... But what you end up with then is probably something similar to D with every function's body written in asm.
By contrast, my encounters with people who grew up with Java or Pascal consistently showed that most of them haven't the slightest clue how the machine even works, and as a result, just like Knuth said, they tend to have some pretty weird ideas about how to write their programs. They tend to build messy abstractions or idealistic abstractions that don't map well to the underlying hardware, and as a result, their programs are often bloated, needlessly complex, and run poorly.
[...]
For example, why are there built-in types?

You need to learn assembly language to understand the answer to that one. ;-)
I spent several years in assembly when I was young... But you need to go a step further. Electronics deals only with 1's and 0's... not nibbles, bytes, words, dwords, qwords, etc. These groupings only help people, not computers.

Surely we generally get optimal performance by using types that are multiples of the bus size, but that is irrelevant. Sure, many CPUs have some idea of the basic types, but this is only because it is baked into the hardware and they can't predict the types you want to create. For the most part it is irrelevant, because all complex types are built from fundamental types.

BUT we are talking about compilers, not CPUs. Compilers are software and can be written to "know" future types (by modifying/informing the compiler).

Everything that can be done in any HLL can be done with 1's and 0's in a hex editor... in fact, it must be so, or you end up with a program that can't run. So this alone proves that an HLL is only for abstraction, to make life easier (essentially to mimic human thinking as best it can).
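
To make that concrete, here is a minimal D sketch (assuming nothing beyond the standard built-in types; Fixed16_16 is a made-up example type, not anything from the discussion): the fundamental types are fixed, hardware-sized units, and a user-defined type is simply laid out in terms of them.

// Minimal sketch: built-in integral types are fixed, hardware-sized units,
// and a user-defined type is ultimately laid out in terms of them.
void main()
{
    import std.stdio : writefln;

    static assert(byte.sizeof  == 1);   // 8 bits
    static assert(short.sizeof == 2);   // 16 bits
    static assert(int.sizeof   == 4);   // 32 bits
    static assert(long.sizeof  == 8);   // 64 bits

    // Fixed16_16 is a made-up fixed-point type; it occupies exactly one int.
    struct Fixed16_16 { int raw; }
    static assert(Fixed16_16.sizeof == int.sizeof);

    writefln("int: %s bytes, Fixed16_16: %s bytes",
             int.sizeof, Fixed16_16.sizeof);
}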
The problem that I've always run across is that all compilers are rather limited in some way that makes life harder, not easier. Sometimes this is bugs, other times it is the lack of a simple feature that would make things much easier to deal with.

D goes a long way in the feature set but seems to have a lot more bugs than normal and has a few downsides. D is what got me back into programming. I went into C# for a while and really liked the language (I find it very cohesive and well thought out) but unfortunately do not want to be restricted to .NET (a great library and well put together too).
There is no inherent reason this is so, except that it allows compilers to achieve certain performance results...

Nah... performance isn't the *only* reason. But like I said, you need to understand the foundations (i.e., assembly language) before you can understand why, to use a physical analogy, you can't just freely move load-bearing walls around.
Again, at the lowest level CPUs work on bits, nothing else. Even most CPUs are abstracted for performance reasons, but that doesn't change the fact. By bits I do not mean a 1-bit computer, but simply a computer that works on a bit stream with no fixed-size "word". Think of a Turing machine.
but having a higher level of abstraction of meta programming should allow us to bridge the internals of the compiler more effectively.

Andrei mentions several times in TDPL that we programmers don't like artificial distinctions between built-in types and user-defined types, and I agree with that sentiment. Fortunately, things like alias this and opCast allow us to define user-defined types that, for all practical purposes, behave as though they were built-in types. This is a good thing, and we should push it to the logical conclusion: to allow user-defined types to be optimized in analogous ways to built-in types. That is something I've always found lacking in the languages I know, and something I'd love to explore, but given that we're trying to stabilize D2 right now, it isn't gonna happen in the near future. Maybe if we ever get to D3...
I think those are cool features, and it's the kind of stuff that draws me to the language. Stuff that makes life easier rather than harder. But all these concepts are simply "configuring" the compiler to do things that traditionally it didn't do.

Alias this is telling the compiler to treat a class as a specific type or to replace its usage with a function. Where did this come from? C/C++ doesn't have it! Why? Because it either wasn't thought of or wasn't thought of as useful. Those kinds of things hold compilers back. If someone will use it, then it is useful. I understand that in reality there are limitations, but when someone makes the decision "no one will need this", then everyone ultimately suffers. It's a very egocentric decision that assumes the person knows everything. Luckily Walter seems to have had the foresight to avoid making such decisions.
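
As a minimal sketch of what alias this (and opCast) buy you, take a made-up Meters type; none of this is from the discussion above, it's just an illustration of a user-defined type acting like a built-in:

// Minimal sketch: the made-up Meters type behaves like a built-in double
// wherever a double is expected, thanks to alias this.
struct Meters
{
    double value;
    alias value this;          // Meters implicitly converts to double

    // opCast covers explicit conversions too, e.g. cast(int) someMeters.
    int opCast(T : int)() const { return cast(int) value; }
}

double twice(double x) { return 2 * x; }

void main()
{
    import std.stdio : writeln;

    auto m = Meters(1.5);
    writeln(twice(m));         // m is passed where a double is expected
    writeln(m + 0.5);          // arithmetic works as if m were a double
    writeln(cast(int) m);      // explicit conversion via opCast
}

The type is user-defined, but the call sites read as if it were built in, which is exactly the kind of "configuring the compiler" I mean.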
Nevertheless, having said all that, if you truly want to make the machine dance, you gotta sing to its tune. In the old days, the saying was that premature optimization is the root of all evils. These days, I'd like to say, premature *generalization* is the root of all evils.
Sure. Ultimately someone designed the CPU in a certain way, and for you to take advantage of all its potential you have to work within the limitations/constraints they set up (which, sometimes, are not fully known). Also, as a practical matter it is useless to create a program in a language that can never be run, though not theoretically useless.

It ultimately depends on the goals... I imagine when someone wants to create the best they can, sometimes it's very easy to go overboard... sometimes the tools are simply not available to reach the goal. But such attitudes are what push the boundaries and generally pay off in the long run... without them we would, at best, still be using punch cards.
I've seen software that suffered from premature generalization... It was a system that was essentially intended to be a nice interface to a database, with some periodic background monitoring functions. The person(s) who designed it decided to build this awesome generic framework with all sorts of fancy features. For example, users don't have to understand what SQL stands for, yet they can formulate complex queries by means of nicely-abstracted OO interfaces. Hey, OO is all the rage these days, so what can be better than to wrap SQL in OO in such a way that the user wouldn't even know it's SQL underneath? I mean, what if we wanted to switch to, oh, Berkeley DB one of these days?! But abstracting a database isn't good enough. There's also this incredible generic framework that handles timers and events, such that you don't have to understand what an event loop is and you can write event-driven code, just like that. Oh, and to run all of these complicated fancy features, we have to put it inside its own standalone daemon, so that if it crashes, we can use another super-powerful generic framework to handle crashes and automatically restart, so that the user doesn't even have to know the database engine is crashing underneath him; the daemon will pick up the query and continue running it after it restarts! Isn't that cool? But of course, since it runs as a separate daemon, we have to use IPC to interface it with user code. It all makes total sense!

...
After about 3 years worth of this, the system has become a giant behemoth, awesome (and awful) to behold, slumbering onwards in unyielding persistence, soaking up all RAM everywhere it can find any, and peaking at 99% CPU when you're not looking (gotta keep those savvy customers who know how to use 'top' happy, y'know?). The old OO abstraction layers for the database are mostly no longer used, nowadays we're just writing straight SQL anyway, but some core code still uses them, so we daren't delete them just yet. The resource acquisition code has mutated under the CPU's electromagnetic radiation, and has acquired 5 or 6 different ways of acquiring mutex locks, each written by different people who couldn't figure out how to use the previous person's code. None of these locks could be used simultaneously with each other, for they interact in mysterious, and often disastrous, ways. Adding more features to the daemon is a road through a minefield filled with the remains of less savvy C++ veterans.
Then one day, I was called upon to implement something that required making an IPC call to this dying but stubbornly still-surviving daemon. Problem #1: the calling code was part of a C library that, due to the bloatedness of the superdooper generic framework, is completely isolated from it. Problem #2: as a result, I was not allowed to link the C++ IPC wrapper library to it, because that would pull in 8000+ IPC wrapper functions from that horrific auto-generated header file, which in turn requires linking in all the C++-based framework libraries, which in turn pulls in yet more subsidiary supporting libraries, which if you add it all up, adds about 600MB to the C library size. Which is Not Acceptable(tm).

So what to do? Well, first write a separate library to handle interfacing with the 1 or 2 IPC calls that I can't do without, to keep the nasty ugly hacks in one place. Next, in this library, since we can't talk to the C++ part directly, write out function arguments using fwrite() into a temporary file, then fork() and exec() a C++ wrapper executable that *can* link with the C++ IPC code. This wrapper executable then reads the temporary file and unpacks the function arguments, then hands them over to the IPC code that repacks them in the different format understood by the daemon, then sends it off. Inside the daemon, write more code to recognize this special request, unpack its arguments once again, then do some setup work (y'know, acquire those nasty mutexes, create some OO abstraction objects, the works), then actually call the real function that does the work. But we're not done; that function must return some results, so after carefully cleaning up after ourselves (making sure that the "RAII" objects are destructed in the right order to prevent nasty problems like deadlocks or double-free()'s), we repackage the function's return value and send it back over the IPC link. On the other end, the IPC library decodes that and returns it to the wrapper executable, which now must fwrite() it into another temporary file, and then exit with a specific exit code so that the C library that fork-and-exec'd it will know to look for the function results in the temporary file, so that it can read them back in, unpack them, then return to the original caller. This nasty piece of work was done EVERY SINGLE TIME AN IPC FUNCTION WAS CALLED.
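
Just to spell out how convoluted that calling convention is, here is roughly what the caller's side amounts to, sketched in D rather than the original C, with made-up file names and a hypothetical ipc_wrapper executable (none of this is the actual system's code):

// Hedged sketch of the caller side of the hack described above: pack the
// arguments into a temp file, spawn the wrapper executable that *can* link
// against the C++ IPC code, and read the result back from a second temp file.
import std.exception : enforce;
import std.file : exists, readText, remove, write;
import std.process : execute;

string callThroughWrapper(string functionName, string packedArgs)
{
    enum argFile    = "/tmp/ipc_args.tmp";      // hypothetical temp files
    enum resultFile = "/tmp/ipc_result.tmp";

    write(argFile, packedArgs);                 // the "fwrite the args" step
    scope(exit)
    {
        if (exists(argFile))    remove(argFile);
        if (exists(resultFile)) remove(resultFile);
    }

    // fork-and-exec the C++ wrapper; its exit code signals success.
    auto r = execute(["./ipc_wrapper", functionName, argFile, resultFile]);
    enforce(r.status == 0, "IPC wrapper failed: " ~ r.output);

    return readText(resultFile);                // unpacking is left to the caller
}

And that is before counting the repacking, mutex setup, and teardown that happen on the daemon's side of the link.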
What's that you say? Performance is poor? Well, that's just because you need to upgrade to our new, latest-release, shiny hardware! We'll double the amount of RAM and the speed of the CPU -- we'll throw in an extra core or two, too -- and you'll be up and running in no time! Meanwhile, back in the R&D department (nicely insulated from customer support), I say to myself, gee I wonder why performance is so bad...
After years of continual horrendous problems, nasty deadlock bugs, hair-pulling sessions, and bugfixes that introduced yet more bugs because the whole thing had become a tower of cards, the PTBs were finally convinced that we needed to do something about it. Long story short, we trashed the ENTIRE C++ generic framework and went back to using straight int's and char's and good ole single-threaded C code, with no IPCs or mutex RAII objects or 5-layer DB abstractions -- the result was a system at most 20% of the size of the original that ran 5 times faster and handled DB queries more flexibly than the previous system ever could.
These days, whenever I hear the phrase "generic framework", esp. if it has "OO" in it, I roll my eyes and go home and happily work on my D code that deals directly with int's and char's. :)
That's not to say that abstractions are worthless. On the contrary, having the *right* abstractions can be extremely powerful -- things like D ranges, for example, literally revolutionized the way I write iterative code. The *wrong* abstractions, OTOH... let's just say it's on the path toward that proverbial minefield littered with the remains of less-than-savvy programmer-wannabes. What constitutes a *good* abstraction, though, while easy to define in general terms, is rather elusive in practice. It takes a lot of skill and experience to be able to come up with useful abstractions. Unfortunately, it's all too easy to come up with idealistic abstractions that actually detract, rather than add -- and people reinvent them all the time. The good thing is that usually other people will fail to see any value in them ('cos there is none) so they get quickly forgotten, like they should be. The bad thing is that they keep coming back through people who don't know better.
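
Ranges are a good example of an abstraction that earns its keep. A minimal sketch of the kind of range-based iteration meant there, using only standard Phobos building blocks (nothing specific to this discussion):

// Minimal sketch: composable, lazy range building blocks that still
// boil down to a plain loop over ints.
void main()
{
    import std.algorithm : filter, map;
    import std.range : iota;
    import std.stdio : writeln;

    // Sum of the squares of the even numbers below 10, with no explicit
    // index bookkeeping and no intermediate array allocation.
    auto evensSquared = iota(10)                 // 0, 1, 2, ..., 9
                        .filter!(n => n % 2 == 0)
                        .map!(n => n * n);

    int sum = 0;
    foreach (n; evensSquared)                    // ranges plug straight into foreach
        sum += n;

    writeln(sum);                                // 120
}

The pipeline reads declaratively, yet it is lazy and maps down to little more than a loop over ints.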
I agree, it is very hard, and I think that is why the compiler must make such things easier. I think the problem tends to be more that compilers get in the way or make the abstractions difficult to implement, and end up causing problems down the road due to hacks and workarounds, rather than the other way around. While it is true that it is difficult to abstract things and take unforeseen events into account, a properly abstracted system should be general enough to deal with them... when it's not, I'm sure it is much more difficult to rectify than a concrete implementation.

Abstraction is difficult, requires a good memory, and takes the intelligence to deal with the complexity involved... but the rewards are well worth it. We, as a civilization, can't get better at it without working at it. You can't expect everyone to get it right all the time.
Most people are idiots, simple as that. You can't expect most people to comprehend complex systems, or even to have the desire to do so if they are capable. Most people want to blindly apply a familiar pattern that has worked before, because they don't know any better. This is not necessarily bad, except when the pattern isn't the right one.

Even the really intelligent people who are capable of dealing with the complexity can only do so for so long. A human brain, no matter how good, can't deal with exponential factors... at some point it becomes too much to handle.

I'm one of those believers that at some point you have to scrap the broken way and start afresh, learning from your mistakes to make something better. This is what you guys did when implementing the simpler system... what you learned was that simpler was better.

One of the great things, though, is that breaking complexity into simpler parts always arrives at a set of pieces simple enough to be dealt with. The problem is that someone still has to understand all the complexity, more or less.
Again I come back to Knuth's insightful quote -- to truly build a useful abstraction, you have to understand what it translates to. You have to understand how the machine works, and how all the lower layers of abstraction work, before you can build something new *and* useful. That's why I said that in order to make the machine dance, you must sing its tune. You can't just pull an abstraction out of thin air and expect that it will all somehow work out in the end. Before we master the new abstractions introduced by D -- like ranges -- we're not really in a position to discover better abstractions that improve upon them.
I think we agree; that's basically what I was getting at above.
I don't see anything like this happening, so, depending on your scale, I don't think we are getting better, but just chasing our tails... how many more languages do we need that just change the syntax of C++? Why do people think syntax matters? Semantics is what is important, but there seems to be little focus on it. Of course, we must express semantics through syntax, so for practical purposes it matters to some degree... but not nearly as much as the number of programming languages suggests.
Actually, I would say that in terms of semantics, it all ultimately maps to Turing machines anyway, so ultimately it's all the same. You can write anything in assembly language. (Or, to quote Larry Wall's tongue-in-cheek way of putting it: you can write assembly code in any language. :-P) That's already been known and done.
Yes! So what is different is only what the language itself has to offer to make abstraction easier... if we didn't want abstraction we would just write in 0's and 1's... or, if we had the memory and intelligence, that would be the easiest way (instead of waiting for multiple alias this's ;))
What matters is, what kind of abstractions can we build on top of it that allow us maximum expressivity and usability? The seeming simplicity of Turing machines (or assembly language) belies the astounding computational power hidden behind it. The goal of programming language design is to discover ways of spanning this range of computational power in a way that's easy to understand, easy to use, and efficient to run in hardware. Syntax is part of the "easy to use" equation, a small part, but no less important (ever tried writing code in lambda calculus?).
The harder part is the balancing act between expressiveness and implementability (i.e., efficiency, or more generally, how to make your computation run in a reasonable amount of time with a reasonable amount of space -- a program that can solve all your problems is useless if it will take 10 billion years to produce the answer; so is a program that requires more memory than you can afford to buy). That's where the abstractions come in -- what kind of abstractions will you have, and how well do they map to the underlying machine? It's all too easy to think in terms of the former and neglect the latter -- you end up with something that works perfectly in theory but requires unreasonable amounts of time/memory in practice, or is just a plain mess when mapped to actual hardware.
One of the most useful aspects of programming, and what makes it so powerful, is the ability to leverage what others have done. Unfortunately, what happens is that people write the same old stuff other people have already written. This problem has gotten better over the last few years, with the internet making it easier, but it is still an issue.

Just think of all the man-hours wasted on people writing the same code, debugging code because of bugs, or giving up because they didn't have the code they needed (even though it existed). If things had been optimal from the get-go, think how much further along we'd be.
I don't mind people making mistakes in the theoretical vs. practical tug of war... because I think the theoretical is what pushes boundaries and the practical is what strengthens them. Much of recent mathematics is purely theoretical, which in turn has fueled practical things... mathematics started out purely practical. Maybe this ebb and flow is a natural thing in life.

But I believe that just because something is practical doesn't mean it is best. For example, suppose you are able to design a programming language that somehow is provably better than all other languages combined. The problem is, it requires a new CPU design that is difficult and expensive to create. Should the language not be used because it is not practical? Of course not. Luckily, all of humanity does not have to stop while such things are being built.

We need people pushing the boundaries to keep us from spinning our wheels, and we need people keeping the wheels turning.