On 1/7/2012 1:28 PM, Andrei Alexandrescu wrote:
>> Having a pluggable interface so the implementation can be changed is all
>> right, as long as the binary API does not change.
>> If the binary API changes, then of course, two different libraries
>> cannot be linked together. I strongly oppose any changes which would
>> lead to a balkanization of D libraries.

> In my opinion this statement is thoroughly wrong and backwards. I also think it
> reflects a misunderstanding of what my stance is. Allow me to clarify how I see
> the situation.
>
> Currently built-in hash table use generates special-cased calls to non-template
> functions implemented surreptitiously in druntime. The underlying theory, also
> sustained by the statement quoted above, is that we are interested in supporting
> linking together object files and libraries BUILT WITH DISTINCT MAJOR RELEASES
> OF DRUNTIME.
>
> There is zero interest for that. ZERO. No language even attempts to do so.
> Runtimes that are not compatible with their previous versions are common,
> frequent, and well understood as an issue.

We've agreed on this before. Perhaps I misstated it here, but I am not talking about changing druntime. I'm talking about someone providing their own hash table implementation that has a different binary API than the one in druntime, such that code from their library cannot be linked with any other code that uses the regular hashtable.

A different implementation of hashtable would be fine, as long as it is binary compatible. We did this when we switched from a binary tree collision resolution to a linear one, and the switchover went without a hitch because it did not require even a recompile of existing binaries.


> In an ideal world, built-in hash tables should work in a very simple manner. The
> compiler lowers all special hashtable syntax - in a manner that's MINIMAL,
> SIMPLE, and CLEAR - into D code that resolves to use of object.di (not some
> random user-defined library!). From then on, druntime code takes over. It could
> choose to use templates, dynamic type info, whatever. It's NOT the concern of
> the compiler. The compiler has NO BUSINESS taking library code and hardwiring it
> in for no good reason.

That was already true of the hashtables - it's just that the interface to them was through a set of fixed function calls, rather than a template interface. To the compiler, the hashtables were a completely opaque void*. The compiler had zero knowledge of how they actually were implemented inside the runtime.
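
To make the two interface styles concrete, here is a minimal sketch. Everything in it is hypothetical: the names opaqueCreate, opaqueGetOrAdd, and Table are invented for illustration and are not druntime's actual entry points, and a linear search stands in for real hashing.

```d
// Sketch only: hypothetical names, linear search instead of real hashing.

// Style 1: fixed, non-template entry points over an opaque handle.
// Element sizes are passed at run time; callers never see Impl.
private struct Impl
{
    size_t keySize, valueSize;
    ubyte[] keys, values;          // parallel arrays of raw bytes
}

void* opaqueCreate(size_t keySize, size_t valueSize)
{
    auto t = new Impl;
    t.keySize = keySize;
    t.valueSize = valueSize;
    return t;                      // Impl* converts implicitly to void*
}

// Returns a pointer to the value slot for *key, adding a zeroed slot if the
// key is absent. The pointer is only valid until the next insertion.
void* opaqueGetOrAdd(void* handle, const(void)* key)
{
    auto t = cast(Impl*) handle;
    auto k = (cast(const(ubyte)*) key)[0 .. t.keySize];
    foreach (i; 0 .. t.keys.length / t.keySize)
    {
        if (t.keys[i * t.keySize .. (i + 1) * t.keySize] == k)
            return &t.values[i * t.valueSize];
    }
    t.keys ~= k;
    t.values.length = t.values.length + t.valueSize;
    return &t.values[$ - t.valueSize];
}

// Style 2: a template. Key and value types are part of the type itself.
struct Table(K, V)
{
    private K[] keys;
    private V[] values;

    ref V getOrAdd(K key)
    {
        foreach (i, k; keys)
            if (k == key)
                return values[i];
        keys ~= key;
        values ~= V.init;
        return values[$ - 1];
    }
}

unittest
{
    // Opaque style: the caller must cast, and nothing checks the types.
    void* h = opaqueCreate(int.sizeof, long.sizeof);
    int key = 42;
    *cast(long*) opaqueGetOrAdd(h, &key) = 7;
    assert(*cast(long*) opaqueGetOrAdd(h, &key) == 7);

    // Template style: types are checked at compile time.
    Table!(int, long) t;
    t.getOrAdd(42) = 7;
    assert(t.getOrAdd(42) == 7);
}
```

In the first style the layout of Impl can change freely as long as the two function signatures stay the same. In the second, the layout is compiled into every caller, which is where the binary compatibility concern comes from.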

Changing it to a template implementation enables a more efficient interface, as inlining, etc., can be done instead of the slow opApply() interface. The downside of that is it becomes a bit perilous, as the binary API is not so flexible anymore.
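
As a small illustration of the opApply cost mentioned above, the sketch below iterates the same data two ways. Bag is a hypothetical type made up for this example; it is not how the built-in hashtables are implemented.

```d
struct Bag
{
    int[] items;

    // `foreach (x; bag)` is rewritten into a call to opApply, with the loop
    // body passed as a delegate: one indirect call per element.
    int opApply(scope int delegate(ref int) dg)
    {
        foreach (ref x; items)
            if (auto result = dg(x))
                return result;     // nonzero means the loop body exited early
        return 0;
    }

    // Exposing a slice lets the loop run directly over the array, which is
    // typically easier for the optimizer to inline.
    int[] opSlice() { return items; }
}

unittest
{
    auto bag = Bag([1, 2, 3]);
    int viaDelegate, viaSlice;
    foreach (x; bag)   viaDelegate += x;   // goes through opApply
    foreach (x; bag[]) viaSlice += x;      // plain array iteration
    assert(viaDelegate == 6 && viaSlice == 6);
}
```

Both loops compute the same sum; the difference is only in how much of the iteration the compiler can see at the call site.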


>> (Consider the disaster C++ has had forever with everyone inventing their
>> own string type. That ensured zero interoperability between C++
>> libraries, a situation that persists even 10 years after C++ finally
>> acquired a standard string library.)

> It is exactly this kind of canned statement and prejudice that we must avoid. It
> unfairly singles out C++ when there also exist incompatible libraries in C,
> Java, Python, you name it.

Of course, but strings are a fundamental data type, and so it was worse with C++. I don't agree that my opinion on it is prejudicial or unfair, because many times I was stuck having to glue together disparate code that used differing string classes. Often it was the only incompatibility, but it permeated the library interfaces.

> Also, the last time the claim that everywhere invented their own string type
> could have been credibly aired was around 2004.

Sure, people rarely (never?) do their own C++ string classes anymore, but that old code and those old libraries are still around, and are actively maintained.

http://msdn.microsoft.com/en-us/library/ms174288.aspx

Notice that's for Visual Studio C++ 2010.

The string problem was a mistake I was determined not to make with D.

I have agreed with you and still agree with the notion of using lowering instead of custom code. Also, keep in mind that the hashtable design was done long before D even had templates. It was "lowered" to what D had at the time - function calls and opApply.
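
For readers unfamiliar with the term, lowering can be pictured with the operator overloading D already applies to user-defined types. The sketch below is an analogy under stated assumptions, not the built-in hashtable lowering itself: LibTable is a hypothetical name, and a linear search stands in for hashing.

```d
// Hypothetical library-side table; linear search stands in for real hashing.
struct LibTable(K, V)
{
    private K[] keys;
    private V[] values;

    // `k in t` is rewritten to t.opBinaryRight!"in"(k)
    V* opBinaryRight(string op)(K key) if (op == "in")
    {
        foreach (i, k; keys)
            if (k == key)
                return &values[i];
        return null;
    }

    // `t[k]` is rewritten to t.opIndex(k)
    V opIndex(K key)
    {
        auto p = key in this;
        assert(p !is null, "key not found");
        return *p;
    }

    // `t[k] = v` is rewritten to t.opIndexAssign(v, k)
    void opIndexAssign(V value, K key)
    {
        if (auto p = key in this)
            *p = value;
        else
        {
            keys ~= key;
            values ~= value;
        }
    }
}

unittest
{
    LibTable!(string, int) t;
    t["one"] = 1;                 // operator syntax
    t.opIndexAssign(2, "two");    // the call that syntax is rewritten into
    assert(t["one"] == 1 && t["two"] == 2);
    assert(("three" in t) is null);
    assert(t.opBinaryRight!"in"("two") !is null);
}
```

The compiler only needs to know the rewrite rules; what the functions do is entirely up to the library.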



> What's built inside the compiler is like axioms in math, and what's library is
> like theorems supported by the axioms. A good language, just like a good
> mathematical system, has few axioms and many theorems. That means the system is
> coherent and expressive. Hardwiring stuff in the language definition is almost
> always a failure of the expressive power of the language.

True.

> Sometimes it's fine to
> just admit it and hardwire inside the compiler e.g. the prior knowledge that "+"
> on int does modulo addition.

Right, I understand that the abstraction abilities of D are not good enough to produce a credible 'int' type, or 'float', etc., hence they are wired in.
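
A tiny sketch of the wired-in behavior being referred to: in D, "+" on int wraps around modulo 2^32 rather than trapping on overflow. The helper name add is made up for the example.

```d
// `+` on int is built into the compiler and wraps around modulo 2^32.
int add(int a, int b)
{
    return a + b;
}

unittest
{
    assert(add(int.max, 1) == int.min);   // wraps instead of trapping
    assert(add(-1, 1) == 0);
}
```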

> But most always it's NOT, and definitely not in the
> context of a complex data structure like a hash table. I also think that adding
> a hecatomb of built-in types and functions has smells, though to a good extent I
> concede to the necessity of it.

I want to reiterate that I don't think there is a way with the current compiler technology to make a library SIMD type that will perform as well as a builtin one, and those who use SIMD tend to be extremely demanding of performance.

(One could make a semantic equivalent, but not a performance equivalent.)


> We should start from what the user wants to accomplish. Then figure how to
> express that within the language. And only lastly, when needed, change the
> language to mandate lowering constructs to the MINIMUM EXTENT POSSIBLE into
> constructs that can be handled within the existing language. This approach has
> been immensely successful virtually whenever we applied it: foreach for ranges
> (though there's work left to do there), operator overloading, and too little
> with hashes. Lately I see a sort of getting lazy and skipping the second pass
> entirely. Need something? Yeah, what the hell, we'll put it in the language.

I don't think that is entirely fair with regard to the SIMD stuff. It reminds me of the time after I had spent a couple of years at Caltech, where every class was essentially a math class. My sister asked me for help with her high school trig homework, and I just glanced at it and wrote down all the answers. She said she was supposed to show the steps involved, but I was so used to doing it that to me there was only one step.

So while it may seem I'm skipping steps with the SIMD, I have been thinking about it for years off and on, and I have a fair experience with what needs to be done to generate good code.




> I am a bit worried about the increasing radicalization of the discussion here,
> but recent statements come in frontal collision with my core principles, which I
> think stand on solid evidential ground. I am appealing for building consensus
> and staying principled instead of reaching for the cheap solution. If we do the
> latter, it's quite likely we'll regret it later.
>
> Andrei
