On 26/05/2013, at 8:52 AM, srean wrote:

> Calling them bad compilers is, I think, unfair.
No it's not. People who know little theory hack stuff together without research and thought. I actually suggested on the clang mailing list that the slower optimisations turn themselves off when they're having trouble. "Oh no, we can't have unpredictable results". The same guy said that it was a bug if any optimisation was slower than linear (i.e. O(N)). Well guess what: the most basic analysis of SSA form, which is the core of Clang, is O(N log N). Clang is so good I've had -O2 run slower than -O1. If you look in Felix, clang now defaults to -O1. It's not worth the time bothering with -O2. Anyone who thinks the results of optimisation processes are predictable doesn't know ANYTHING about optimisation.

> They optimize for the cases that they expect to find, that's as reasonable
> and smart as it gets.

It's as smart as the developers can make it without thinking. If you try "make bootstrap" in Felix you can watch the Ocaml compiler vs the C and C++ compilers. Ocaml is over 10 times faster at compiling, and it's compiling a much more sophisticated, better typed language. It generates code which runs at about 50% the speed of C most of the time, and is faster if you need to do complex memory management.

> The other thing that holds them back is many optimizations techniques are
> patented, so free software just cannot use them.

I'm not aware of any serious patented optimisations. MS holds a few related to vtable optimisation, of little relevance to compiler back ends. Most optimisations are public domain and new ones are published regularly in the research literature. In most cases the optimisations themselves are of no interest: they're well understood. The problem is when to apply them. The multi-pass approach used in clang is easy to manage but it doesn't work. Felix uses recursion, which works much better but is really hard to manage because all the optimisations are tied together.

I worked on advanced bug detection software (which is very similar to optimisation in compilers). We had some slow precise routines and some fast less precise ones, and the smart researcher suggested the way to go was to use the fast guesses to decide when it was worth applying the slower, more precise ones.

Costing is fundamental to all optimisation processes and is NOT well understood. In fact the one most relevant to YOU (srean) is the costing of parallelisation. You want to split a job up into parallel parts. Which parts do you split up? How many pieces? What shape are the pieces? This job is done MANUALLY by programmers for a single platform, e.g. a Cray, for a single problem. No one knows how to automate it. The fundamental theory is to build a cost model, and then try to minimise the cost (there's a tiny sketch of that idea below).

Note that we're NOT talking here about how to optimise your code. We're talking about how to DECIDE how to optimise your code! The point is you NEVER apply serious optimisations uniformly. It's too expensive. You only apply them where it's worthwhile. And my point is clang is so dumb it cannot tell the difference. It cannot take a big function and break it into separate pieces, even if you wrote it as a series of calls to separate functions which then got inlined. I'm not saying partitioning a function is easy: I am saying it is essential. Oh, and SSA form, which Clang uses, is designed to do just that: break a function up into basic blocks and gotos.
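To make "basic blocks and gotos" concrete, here's a rough hand-written illustration in plain C++. This is only the shape of the thing, not LLVM IR and not SSA renaming:

    // The same loop twice: written normally, and then split into explicit
    // basic blocks joined by gotos, roughly the form an SSA-based compiler
    // analyses. Illustration only, not what Clang actually emits.
    int sum_normal(int const *a, int n) {
        int s = 0;
        for (int i = 0; i < n; ++i) s += a[i];
        return s;
    }

    int sum_blocks(int const *a, int n) {
        int s, i;
        // entry block
        s = 0; i = 0; goto test;
    test:
        if (i < n) goto body; else goto done;
    body:
        s = s + a[i]; i = i + 1; goto test;
    done:
        return s;
    }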
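And here is the kind of toy cost model I mean for the parallelisation question above: guess the per-piece overhead and the per-element work, then pick the number of pieces that minimises the estimate. All the names and numbers are made up; this only shows the shape of the decision, not a real scheduler:

    #include <cstdio>

    // Toy cost estimate for splitting n_units of work into k pieces on p cores:
    // every piece pays a fixed spawn/scheduling overhead, and pieces run at
    // most p at a time, so they execute in ceil(k/p) "waves".
    double cost(double n_units, double unit_cost, double overhead, int k, int p) {
        double per_piece = (n_units / k) * unit_cost;   // compute time of one piece
        double waves     = (k + p - 1) / p;             // integer ceil(k/p)
        return k * overhead + waves * per_piece;
    }

    int main() {
        const double n = 1e6, unit = 1e-8, ovh = 1e-4;  // made-up numbers
        const int cores = 8;
        int best = 1;
        for (int k = 1; k <= 1024; ++k)
            if (cost(n, unit, ovh, k, cores) < cost(n, unit, ovh, best, cores))
                best = k;
        std::printf("split the job into %d pieces\n", best);
        return 0;
    }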
> But back to the point, how does GO deal with error codes and exceptions? Can
> any inspiration be drawn from there?

I don't know, but I doubt it. I did read an article about it recently, and the consensus was that Go's method was crap, and not significantly different from C++ underneath. Go was developed by Pike, who has no idea about programming language design. RTTI instead of unions. Uggh. Stupidity. No polymorphism; you have to be kidding! Go's advantage is that it's a pure compiler: it targets assembly, so it can bypass the archaic C model. But it only works on x86 machines AFAIK. Its main claim to fame is goroutines. Which Felix has had for 15 years.

> This is neat. Can any of these continuations preserve the function call
> frames too?

They do. You can return a pointer to a local variable of a function in Felix. It's safe:

  fun f (x:int) { var y = x; return &y; } // safe

The frame is preserved as long as the returned pointer is reachable, so the frame itself is reachable.

The problem is that function "stack frames" DO NOT include the current code position of the caller, that is, the return address: it goes on the machine stack. Procedure frames DO include the current code position. Actually, a procedure frame contains a pointer to its caller frame, and each frame has its own program counter. Generally the PC is an int and a switch is used to jump to a case tag. With gcc we actually use a machine address and a computed goto like

  goto *pc;

which is probably faster.

> I do not know why C++ does not allow that, too expensive?

C++ does allow it. The technique is well known: you make an object of a class with an operator()(args..) method. C++ people misname this a functor; its correct name is an applicative object. You can pass one of these things with a () method to a template, and the template can't tell the difference from a real function. You can do what you like with these applicative objects (sketched below).

Guess how Felix implements what I described? Actually it uses an "apply(arguments)" method, NOT operator(). And for procedures it uses call(arguments) to set the arguments into the object, and repeated calls to a resume() method to execute the procedure. The constructor is used to set pointers to the display environment (enclosing functions) and global storage. The object left after the constructor has finished is called a closure: it's ready to be "call()"ed or "apply()"ed with arguments, but has already been bound to its scope.

The difference between C++ and Felix is that you just write

  proc f (a:int) ...
  fun g (b:int) ...

and Felix does all that for you, AND optimises it down to ordinary functions, or even inlines them where possible. 90% of functions are inlined away. That's why Felix is so fast (the output, I mean). The compiler is slow, because I don't have the resources to rewrite it to be faster.
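For the record, the applicative object trick looks roughly like this in C++ (made-up names, just an illustration; a modern C++ lambda generates exactly this kind of object for you):

    #include <cstdio>

    struct AddK {                     // the "applicative object" (misnamed functor)
        int k;
        int operator()(int x) const { return x + k; }
    };

    template<class F>                 // the template can't tell this from a real function
    int apply_twice(F f, int x) { return f(f(x)); }

    int main() {
        AddK add3{3};                 // bound to k = 3 when constructed
        std::printf("%d\n", apply_twice(add3, 10));   // prints 16
        return 0;
    }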
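And a hand-written sketch of the call()/resume() model I just described. This is NOT the code Felix actually generates, all the names here are invented; it only shows the idea of a frame carrying its own program counter and being re-entered through a switch:

    #include <cstdio>

    struct con_t {                      // a procedure frame ("continuation")
        int pc = 0;                     // the frame's own program counter
        con_t *caller = nullptr;        // pointer to the caller's frame
        virtual con_t *resume() = 0;    // run up to the next suspension point
        virtual ~con_t() {}
    };

    // A procedure that prints 0 .. n-1, doing one step per resume().
    struct counter : con_t {
        int *global;                    // environment pointer set by the constructor
        int n = 0, i = 0;
        explicit counter(int *g) : global(g) {}   // after this it's a closure

        con_t *call(con_t *ret, int arg) {        // bind the argument, note the caller
            caller = ret; n = arg; pc = 1; return this;
        }
        con_t *resume() override {
            switch (pc) {               // jump to the saved code position
            case 1:
                i = 0;
                // fall through
            case 2:
                if (i >= n) return caller;        // finished: resume the caller
                std::printf("step %d (env=%d)\n", i, *global);
                ++i;
                pc = 2;                 // suspend; next resume() re-enters at case 2
                return this;
            }
            return caller;
        }
        // With gcc one can store a label address instead of an int and use a
        // computed goto (goto *pc;), the faster variant mentioned earlier.
    };

    int main() {
        int env = 42;
        counter p(&env);                // constructing the closure binds its environment
        for (con_t *k = p.call(nullptr, 3); k != nullptr; k = k->resume()) {}
        return 0;
    }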
> One could have had throw_with_full_frame( ), and throw_without_full_frame( ).
> But with better names of course.

You can't, because "auto" storage in C++ is on the machine stack, so you cannot throw it as an object, at least not without copying it. And then you have to be careful with copies, because stuff on stacks contains pointers, and a copy isn't at the same machine address as the original, so you'd have to fiddle the pointers.

Pointer fiddling is possible in a disciplined language. Such as Ocaml. It does do this: it compacts local storage by copying the live objects from one storage bank to another, adjusting pointers on the way. That's what a copying collector does. But it relies on a model that emphasises fast heap allocation and doesn't use the stack much. Ocaml heap allocation is as fast as incrementing the stack pointer, because it literally works by incrementing a heap pointer, exactly the same. When you run out of space it compacts the heap and off you go again. Sometimes stuff is moved out of the local heap into a bigger, slower heap. It's called a generational collector; that's how they manage it.

--
john skaller
skal...@users.sourceforge.net
http://felix-lang.org