Re: Software Assurance Reference Dataset

2014-07-23 Thread Dmitry Olshansky via Digitalmars-d

21-Jul-2014 02:29, Walter Bright writes:

On 7/20/2014 3:10 PM, Dmitry Olshansky wrote:

The computed goto is faster for two reasons, according to the article:

1.The switch does a bit more per iteration because of bounds checking.


Now let's consider a proper implementation of a threaded-code interpreter,
where the code pointer points to an array of addresses. We've been through
this before and it turns out switch is slower because of an extra load.

a) Switch does 1 load for the opcode, 1 load from the jump table, and 1
indirect jump to advance
(not even counting bounds checking of the switch)

b) Threaded code via (say) computed goto does 1 load of the opcode and 1
indirect jump, because the opcode is an address already (so there is no
separate jump table).


True, but I'd like to find a way that this can be done as an optimization.

I found a way but that relies on tail-call optimization, otherwise it 
would overflow stack. I would rather find some way that works without -O 
flag.


In fact it brings up another, unrelated problem with Phobos: template-heavy
libraries have amazingly awful speed without inlining and optimization
enabled _by the client_. It should be the same with C++, though.



I'm certain that forced tail call would work just fine instead of computed
goto for this scenario. In fact I've measured this with LDC and the results
are promising, but only work with -O2/-O3 (where the tail call is optimized).
I'd gladly dig them up if you are interested.


I'm pretty reluctant to add language features that can be done as
optimizations.


The point is: software that only works in a release build is kind of hard
to develop, even more so with libraries. Thus I'm opposed to labeling such
things as optimizations when they, in fact, change semantics.


--
Dmitry Olshansky


Re: Software Assurance Reference Dataset

2014-07-23 Thread Walter Bright via Digitalmars-d

On 7/23/2014 1:06 PM, Dmitry Olshansky wrote:

The point is: software that only works in a release build is kind of hard to
develop, even more so with libraries. Thus I'm opposed to labeling such
things as optimizations when they, in fact, change semantics.


-release, -inline, -O and -g are all separate flags for good reasons. You can 
mix and match as convenient for what you're trying to accomplish.




Re: Software Assurance Reference Dataset

2014-07-23 Thread Dmitry Olshansky via Digitalmars-d

24-Jul-2014 00:14, Walter Bright writes:

On 7/23/2014 1:06 PM, Dmitry Olshansky wrote:

The point is: software that only works in a release build is kind of
hard to develop, even more so with libraries. Thus I'm opposed to
labeling such things as optimizations when they, in fact, change semantics.


-release, -inline, -O and -g are all separate flags for good reasons.
You can mix and match as convenient for what you're trying to accomplish.



Unless you are a library writer - then it's the user who mixes and matches
flags, and you are foobared unless it works with them all :)


--
Dmitry Olshansky


Re: Software Assurance Reference Dataset

2014-07-23 Thread Walter Bright via Digitalmars-d

On 7/23/2014 1:20 PM, Dmitry Olshansky wrote:

Unless you are a library writer - then it's the user who mixes and matches
flags, and you are foobared unless it works with them all :)


I think it's a generally unworkable rabbit hole if we add language features to 
turn on/off various optimizations.




Re: Software Assurance Reference Dataset

2014-07-23 Thread Dmitry Olshansky via Digitalmars-d

24-Jul-2014 00:43, Walter Bright writes:

On 7/23/2014 1:20 PM, Dmitry Olshansky wrote:

Unless you are a library writer - then it's the user who mixes and
matches flags, and you are foobared unless it works with them all :)


I think it's a generally unworkable rabbit hole if we add language
features to turn on/off various optimizations.

That's true. I'm thinking of how to solve things with a minimal extension
to the language that is generally useful (unlike computed goto, which is
very niche and dangerous).


Obviously optimizations won't solve this problem; that said, it would be
awesome to have your 2 final switch enhancements.


--
Dmitry Olshansky


Re: Software Assurance Reference Dataset

2014-07-23 Thread bearophile via Digitalmars-d

Dmitry Olshansky:


(unlike computed goto, which is very niche and dangerous)


It's a niche that systems languages are among the few fit for. Even 
CPython (and other interpreters) uses that feature.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-23 Thread Dmitry Olshansky via Digitalmars-d

24-Jul-2014 01:05, bearophile writes:

Dmitry Olshansky:


(unlike computed goto, which is very niche and dangerous)


It's a niche that systems languages are among the few fit for. Even CPython
(and other interpreters) uses that feature.



All of those are fast portable interpreters, i.e. not JITs.
It's still a niche, and a feature should either have uses beyond that or be 
better tailored for this use case.


Forced tail call was my best shot - it's useful for functional 
programming and it works for fast interpreters.


--
Dmitry Olshansky


Re: Software Assurance Reference Dataset

2014-07-23 Thread Andrei Alexandrescu via Digitalmars-d

On 7/23/14, 1:14 PM, Walter Bright wrote:

On 7/23/2014 1:06 PM, Dmitry Olshansky wrote:

The point is: software that only works in a release build is kind of
hard to develop, even more so with libraries. Thus I'm opposed to
labeling such things as optimizations when they, in fact, change semantics.


-release, -inline, -O and -g are all separate flags for good reasons.
You can mix and match as convenient for what you're trying to accomplish.


... and -noboundscheck. -- Andrei



Re: Software Assurance Reference Dataset

2014-07-23 Thread Walter Bright via Digitalmars-d

On 7/23/2014 1:53 PM, Dmitry Olshansky wrote:

Obviously optimizations won't solve this problem; that said, it would be
awesome to have your 2 final switch enhancements.


Yes. Feel free to ping me if I forget about them.

Also, I particularly like the first of them because it's not available in C/C++.



Re: Software Assurance Reference Dataset

2014-07-23 Thread Andrei Alexandrescu via Digitalmars-d

On 7/23/14, 2:41 PM, Walter Bright wrote:

On 7/23/2014 1:53 PM, Dmitry Olshansky wrote:

Obviously optimizations won't solve this problem; that said, it would be
awesome to have your 2 final switch enhancements.


Yes. Feel free to ping me if I forget about them.


bugzilla -- Andrei



Re: Software Assurance Reference Dataset

2014-07-23 Thread Walter Bright via Digitalmars-d

On 7/23/2014 2:42 PM, Andrei Alexandrescu wrote:

On 7/23/14, 2:41 PM, Walter Bright wrote:

On 7/23/2014 1:53 PM, Dmitry Olshansky wrote:

Obviously optimizations won't solve this problem; that said, it would be
awesome to have your 2 final switch enhancements.


Yes. Feel free to ping me if I forget about them.


bugzilla -- Andrei



They're already there:

https://issues.dlang.org/show_bug.cgi?id=13169
https://issues.dlang.org/show_bug.cgi?id=13170

But there's so much in bugzilla I often overlook things.


Re: Software Assurance Reference Dataset

2014-07-22 Thread Timon Gehr via Digitalmars-d

On 07/22/2014 05:02 AM, Walter Bright wrote:

On 7/21/2014 3:25 PM, Timon Gehr wrote:

The example just uses the ST Monad which is quite similar to weakly pure
statements in D.


D doesn't have weakly pure statements - it has weakly pure functions.


The type checker distinguishes between statements that are allowed in 
'pure'-qualified functions and statements that are not.


Re: Software Assurance Reference Dataset

2014-07-22 Thread Andrew Godfrey via Digitalmars-d

On Monday, 21 July 2014 at 22:10:06 UTC, bearophile wrote:

Tobias Müller:

Wouldn't it be more useful to have a modified/annotated return 
statement for that?


I don't know.


I see what you're saying, and it works for the quick sort 
example. I'm not sure it covers all tailrec cases, but it seems 
useful.


Re: Software Assurance Reference Dataset

2014-07-22 Thread Andrew Godfrey via Digitalmars-d

My understanding is that it can be done
but only with annotations or whole program analysis.


I think that's true but you don't necessarily have to annotate 
every function.


a) possibly there's something interesting to do at the module 
level. I think more than one thing. E.g. A module that doesn't 
have any callbacks in its interface is 'interesting'. E.g. 
'Layering' of modules.


b) Some situations are particularly dangerous and so a function 
annotation could be encouraged for those. E.g. If you have a 
recursive function without tail recursion, and the possible 
recursion depth is substantial, then while it is deep in its 
recursion, it should limit what other functions it calls. Someone 
could come along later and add a logging statement to it, which 
usually isn't dangerous but here it could be.



Quick sort is an instructive example because it has the security 
weakness that, although you expect the stack depth to typically 
be O(log n), an attacker in control of the input can force it to 
be O(n). Of course with tail recursion that doesn't threaten 
stack overflow, but it suggests that there are recursion cases we 
think are safe, and typically don't fall over, but are actually 
vulnerable. Which means if we don't feel like annotating them in 
defense, we're being irresponsible in a way.





Re: Software Assurance Reference Dataset

2014-07-22 Thread Andrei Alexandrescu via Digitalmars-d

On 7/21/14, 3:22 PM, Timon Gehr wrote:

On 07/21/2014 06:40 AM, Andrei Alexandrescu wrote:

On 7/20/14, 8:15 PM, Timon Gehr wrote:

On 07/20/2014 10:50 PM, Walter Bright wrote:

On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:

Functional programming is full of simple recursion and it would be
nice not to
stack overflow in debug builds.


Traditional FP languages don't have loops, and so must do recursion.


Uh...


D has loops, even in pure functions,


So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
   r <- newSTRef 1
   forM_ [1..n] $ \i ->
     modifySTRef r (*i)
   readSTRef r

main = print $ factorial 5 -- 120


And that doesn't look awkward at all :o). -- Andrei




Indeed.


Code sucks. It's its own destruction. -- Andrei



Re: Software Assurance Reference Dataset

2014-07-22 Thread Timon Gehr via Digitalmars-d

On 07/22/2014 05:49 PM, Andrei Alexandrescu wrote:

On 7/21/14, 3:22 PM, Timon Gehr wrote:

On 07/21/2014 06:40 AM, Andrei Alexandrescu wrote:

On 7/20/14, 8:15 PM, Timon Gehr wrote:

On 07/20/2014 10:50 PM, Walter Bright wrote:

...
D has loops, even in pure functions,


So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
   r <- newSTRef 1
   forM_ [1..n] $ \i ->
     modifySTRef r (*i)
   readSTRef r

main = print $ factorial 5 -- 120


And that doesn't look awkward at all :o). -- Andrei



Indeed.


Code sucks. It's its own destruction. -- Andrei



It is easy to understand and works correctly, but of course this is not 
a canonical implementation of factorial.


Re: Software Assurance Reference Dataset

2014-07-22 Thread bearophile via Digitalmars-d

Timon Gehr:


It is easy to understand and


It's easy to understand because you are very intelligent Timon :-)

Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-22 Thread Walter Bright via Digitalmars-d

On 7/20/2014 8:15 PM, Timon Gehr wrote:

So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
   r <- newSTRef 1
   forM_ [1..n] $ \i ->
     modifySTRef r (*i)
   readSTRef r


Would you rather write that or:

  int factorial(int i) pure {
auto r = 1;
foreach (i; 1..n + 1)
r *= i;
return r;
  }

And I'd love to see what native code is generated for the Haskell example.


Re: Software Assurance Reference Dataset

2014-07-22 Thread bearophile via Digitalmars-d

Walter Bright:


Would you rather write that or:


How much experience do you have in writing Haskell programs 
longer than 50 lines of code? I have a large respect for your 
experience and your intelligence, but be careful with comments 
like that, to not show just your ignorance...


If you are a Haskell programmer you usually don't write code like 
that, so in practice it's not a problem. And even if sometimes 
you have to write awkward code, Haskell programmers can assure 
you that on average the advantages given by Haskell's purity far 
outweigh the disadvantages.


Overall I prefer D over Haskell, but if you want to criticize 
Haskell you need much, much better arguments and way bigger 
cannons :-)


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-22 Thread H. S. Teoh via Digitalmars-d
On Tue, Jul 22, 2014 at 11:34:20AM -0700, Walter Bright via Digitalmars-d wrote:
 On 7/20/2014 8:15 PM, Timon Gehr wrote:
 So does Haskell.
 
 import Control.Monad
 import Control.Monad.ST
 import Data.STRef
 
 factorial :: Integer -> Integer
 factorial n = runST $ do
    r <- newSTRef 1
    forM_ [1..n] $ \i ->
      modifySTRef r (*i)
    readSTRef r
 
 Would you rather write that or:
 
   int factorial(int i) pure {
 auto r = 1;
 foreach (i; 1..n + 1)
   r *= i;
 return r;
   }
 
 And I'd love to see what native code is generated for the Haskell
 example.

I'd rather write:

int factorial(int n) pure {
    return reduce!((a, b) => a * b)(1, iota(2, n + 1));
}

;-)

With gdc -O3, this produces almost the same code as your looped version.
But I can barely read the disassembly, because it appears that gdc -O3
triggers aggressive vectorization, so there are tons of instructions
manipulating xmm registers and the loop is unrolled by 8 iterations.

I have this lurking suspicion that this may actually perform more poorly
than the naïve version for small values of n. :-) DMD, with -release -O
-inline, was unable to inline the call to reduce, but was at least able
to optimize reduce+iota into the equivalent of a simple foreach loop. So
it looks quite promising.

We should push for more aggressive optimizations of functional-style
constructs in D; I think there's a lot of potential here.


T

-- 
My program has no bugs! Only unintentional features...


Re: Software Assurance Reference Dataset

2014-07-22 Thread Walter Bright via Digitalmars-d

On 7/22/2014 11:44 AM, bearophile wrote:

Walter Bright:


Would you rather write that or:


How much experience do you have in writing Haskell programs longer than 50 lines
of code?


Zero.



I have a large respect for your experience and your intelligence, but
be careful with comments like that, to not show just your ignorance...

If you are a Haskell programmer you usually don't write code like that, so in
practice it's not a problem.


Does that imply you agree with my earlier point, then, that D doesn't need tail 
recursion because it can write loops instead?




And even if sometimes you have to write awkward
code, Haskell programmers can assure you that on average the advantages
given by Haskell's purity far outweigh the disadvantages.

Overall I prefer D over Haskell, but if you want to criticize Haskell you
need much, much better arguments and way bigger cannons :-)


The point was not at all to criticize Haskell. The point was that D does not 
need tail recursion because D supports writing loop constructs. Showing how 
loops could be written using awkward monads in Haskell that Haskell programmers 
wouldn't write does not change that point.




Re: Software Assurance Reference Dataset

2014-07-22 Thread deadalnix via Digitalmars-d

On Tuesday, 22 July 2014 at 19:12:31 UTC, H. S. Teoh via
Digitalmars-d wrote:

I'd rather write:

int factorial(int n) pure {
    return reduce!((a, b) => a * b)(1, iota(2, n + 1));
}

;-)


This is how we fucked UFCS.


Re: Software Assurance Reference Dataset

2014-07-22 Thread Timon Gehr via Digitalmars-d

On 07/22/2014 08:34 PM, Walter Bright wrote:

On 7/20/2014 8:15 PM, Timon Gehr wrote:

So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
   r <- newSTRef 1
   forM_ [1..n] $ \i ->
     modifySTRef r (*i)
   readSTRef r


Would you rather write that or:

   int factorial(int i) pure {
 auto r = 1;
 foreach (i; 1..n + 1)
 r *= i;
 return r;
   }
...


If I fix the typo, this computes the same numbers for inputs int.min up 
to and including 12.



And I'd love to see what native code is generated for the Haskell example.


Well, the _actually_ corresponding D code using BigInt and DMD is 
significantly slower than what GHC generates here.


In any case, I don't think that this kind of mutation-heavy code is a 
core focus of GHC, so the assembly will probably not be nearly as well 
optimized as it could be.


But if you'd like to compare the offerings of D and Haskell a little 
further, consider the following cute code, which I wrote in 5 minutes 
and which computes the 100th natural number that is a product of 
powers of 2, 3 and 5 in well below a second on my machine:


merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] xs = xs
merge xxs@(x:xs) yys@(y:ys) =
  case compare x y of
    LT -> x : merge xs yys
    EQ -> x : merge xs ys
    GT -> y : merge xxs ys

hamming :: [Integer]
hamming = 1 : merge (map (*2) hamming)
                    (merge (map (*3) hamming)
                           (map (*5) hamming))

main = print $ last $ take 100 hamming

51931278044838873608958984375000

What do you think would be the corresponding idiomatic/fast D code? Does 
it outperform the Haskell code?


Re: Software Assurance Reference Dataset

2014-07-22 Thread Timon Gehr via Digitalmars-d

On 07/22/2014 09:42 PM, Walter Bright wrote:


Overall I prefer D over Haskell, but if you want to criticize
Haskell you
need much, much better arguments and way bigger cannons :-)


The point was not at all to criticize Haskell. The point was that D does
not need tail recursion because D supports writing loop constructs.


Tail call support is still useful. Looping is not the main reason for 
supporting tail calls. E.g. a tail call might be indirect, only sometimes 
recursing on the same function and sometimes calling another function.



Showing how loops could be written using awkward


The point is not to criticize? :-)


monads  in Haskell that
Haskell programmers wouldn't write does not change that point.


Sure, that part of the discussion grew out of the claim that there are 
no loops there.


Re: Software Assurance Reference Dataset

2014-07-22 Thread Timon Gehr via Digitalmars-d

On 07/22/2014 11:23 PM, Timon Gehr wrote:

Looping is not the main reason for supporting tail calls.


(In D!)


Re: Software Assurance Reference Dataset

2014-07-22 Thread bearophile via Digitalmars-d

Timon Gehr:

What do you think would be the corresponding idiomatic/fast D 
code? Does it outperform the Haskell code?


Look in RosettaCode for various implementations in D:
http://rosettacode.org/wiki/Hamming_numbers#D

A good part of the performance difference is caused by the D GC 
and the fact that Haskell's GHC uses the GNU multiprecision 
library (GMP) for its bigints. In the meantime we have to 
maintain, optimize and debug our own bigint implementation, 
wasting time better spent elsewhere.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-22 Thread Walter Bright via Digitalmars-d

On 7/22/2014 2:23 PM, Timon Gehr wrote:

On 07/22/2014 09:42 PM, Walter Bright wrote:

Showing how loops could be written using awkward


The point is not to criticize? :-)


Yes. If you want to infer criticism of Haskell from that, go ahead, but none was 
intended. You can write awkward code in any language.




monads  in Haskell that
Haskell programmers wouldn't write does not change that point.

Sure, that part of the discussion grew out of the claim that there are no
loops there.


Bending a language over backwards to be pedantic is pointless for this 
discussion. Nobody claims Obfuscated C Contest entries are best (or even 
suggested) practices.


Re: Software Assurance Reference Dataset

2014-07-22 Thread Walter Bright via Digitalmars-d

On 7/22/2014 2:29 PM, bearophile wrote:

A good part of the performance difference is caused by the D GC and the fact
that Haskell's GHC uses the GNU multiprecision library (GMP) for its bigints.
In the meantime we have to maintain, optimize and debug our own bigint
implementation, wasting time better spent elsewhere.



I.e. the perf difference has nothing to do with Haskell or tail recursion.



Re: Software Assurance Reference Dataset

2014-07-22 Thread Walter Bright via Digitalmars-d

On 7/22/2014 2:08 PM, Timon Gehr wrote:

And I'd love to see what native code is generated for the Haskell example.

Well, the _actually_ corresponding D code using BigInt and DMD is significantly
slower than what GHC generates here.


Irrelevant to the looping issue - comparing the code gen of D's BigInt to 
Haskell's BigInt is not the issue here.




Re: Software Assurance Reference Dataset

2014-07-22 Thread Timon Gehr via Digitalmars-d

On 07/23/2014 12:14 AM, Walter Bright wrote:



Sure, that part of the discussion grew out of the claim that there are
no
loops there.


Bending a language over backwards


Not what happened.


to be pedantic


Being pedantic is important enough.


is pointless for this discussion.


If one doesn't want pointless pedantry my suggestion would be to not 
make wrong claims about apparently irrelevant things, or to ignore the 
rebuttal to communicate silent agreement.



Nobody claims Obfuscated C Contest entries are best (or
even suggested) practices.


Irrelevant. I'll leave it here.


Re: Software Assurance Reference Dataset

2014-07-22 Thread Brian Rogoff via Digitalmars-d

On Tuesday, 22 July 2014 at 21:23:33 UTC, Timon Gehr wrote:

On 07/22/2014 09:42 PM, Walter Bright wrote:
The point was not at all to criticize Haskell. The point was 
that D does
not need tail recursion because D supports writing loop 
constructs.


Tail call support is still useful. Looping is not the main 
reason for supporting tail calls. E.g. a tail call might be 
indirect, only sometimes recursing on the same function and 
sometimes calling another function.


Indeed, I recall Matthias Felleisen wrote that OOP makes no sense 
without TCO because methods will indirectly call each other. See


https://blogs.oracle.com/jrose/entry/tail_calls_in_the_vm#comment-1259314984000
http://www.eighty-twenty.org/index.cgi/tech/oo-tail-calls-20111001.html

While I don't care much for OOP, D is supposed to support that 
style.


Re: Software Assurance Reference Dataset

2014-07-21 Thread Andrew Godfrey via Digitalmars-d

On Sunday, 20 July 2014 at 06:06:16 UTC, bearophile wrote:

Andrew Godfrey:

2) Annotations about when a function does not expect 
re-entrancy

to be possible based on call-graph analysis.


I don't understand. Assuming I know this 
(http://en.wikipedia.org/wiki/Reentrancy_%28computing%29 ) can 
you show an example?


Sorry, by re-entrancy I just meant a statement
of whether it is possible for the function to be re-entered
(not whether it would be correct to do so).
Thanks for the reminder, so I can exclude the complication
of ISR's from my point.

In the codebase I work on - application code - there
are no ISR's, but a common cause of unexpected re-entrancy we 
have is:

functions that call a Windows API which pumps messages.
This produces unexpectedly deep call stacks and threatens stack 
overflow.


But that's a complicated case that may not be feasible
for the compiler to help with. So, a simpler example...


The general idea here is that you've convinced yourself you
need to do something which stresses usage of stack space,
and you believe it is safe to do because of *something*.
Ideally you can express that something so that
if anyone tries to change it later, the compiler catches that.


So, for example you've allocated a buffer on the stack,
and your function 'A' builds up data into it, then calls
another function 'X' to 'send' that data in some way.
Function 'A' definitely doesn't expect to be re-entered.
(It may or may not be re-entrant, in the definition of the wiki 
page. Doesn't matter for this.)

Now the first step probably doesn't need language help:
call it assertIAmNotReentered(). I think any solution probably
still needs this as a backup. I say assert (i.e. runtime check
that you'd remove in a release build) because I'm assuming
that if you use it a lot you wouldn't want to pay for all those
checks in a release build.


Now: Suppose 'X' is a Phobos function, and 'A' doesn't
call anything else. Then it seems feasible (but a lot of work)
to provide an annotation (or combination of annotations)
that mean:
Compiler, I am not called by an ISR. You don't have to verify
that. But given that, I claim that I am not re-entered.
Please verify.

This is a lot of work - and potentially error-prone - because
when Phobos calls the C runtime or OS API's, humans
would need to make decisions about whether those functions
'may callback'. For this reason I don't think we'd want
to eliminate assertIAmNotReentered().

By this process, Phobos functions which may call a 'callback' 
back into user code (whether directly or indirectly) would be 
distinguished

from ones which cannot (let's call it NoCallback, since
I can't think of a non-terrible name right now).
And then the annotation above
can be enforced at least in those terms (i.e. the compiler
ensures that 'X' only calls NoCallback-annotated functions).


I think this is valuable, and not just to avoid stack overflow.
The message-pumping case often violates higher-level correctness.


Re: Software Assurance Reference Dataset

2014-07-21 Thread Walter Bright via Digitalmars-d

On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:

Functional programming is full of simple recursion and it would be nice not to
stack overflow in debug builds.


Traditional FP languages don't have loops, and so must do recursion. D has 
loops, even in pure functions, there's no reason not to use them.




Another use case is the so-called threaded-code interpreter, which can be done
with either computed gotos (and a bunch of labels) or forced tail calls (and a
bunch of functions). In both cases the computed jump or tail call is indirect.


http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables/

The computed goto is faster for two reasons, according to the article:

1.The switch does a bit more per iteration because of bounds checking.

2.The effects of hardware branch prediction.


(1) is eliminated with final switch. I know this optimization is not done with 
dmd, but D doesn't need a new language feature because of that.


(2) with a bit more work, this could also be implemented as a compiler 
optimization.


Both of these are worthy of bugzilla enhancement requests. I'd much rather 
implement improvements as better code generation rather than more language features.


Re: Software Assurance Reference Dataset

2014-07-21 Thread Walter Bright via Digitalmars-d

On 7/20/2014 1:50 PM, Walter Bright wrote:

Both of these are worthy of bugzilla enhancement requests.


Amazingly, these suddenly appeared:

https://issues.dlang.org/show_bug.cgi?id=13169
https://issues.dlang.org/show_bug.cgi?id=13170


Re: Software Assurance Reference Dataset

2014-07-21 Thread bearophile via Digitalmars-d

Walter Bright:


https://issues.dlang.org/show_bug.cgi?id=13169
https://issues.dlang.org/show_bug.cgi?id=13170


Are such optimizations portable and guaranteed on all D 
compilers? If the answer is negative, then they can't replace a 
_standard_ D syntax for computed gotos.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-21 Thread Walter Bright via Digitalmars-d

On 7/20/2014 3:10 PM, Dmitry Olshansky wrote:

The computed goto is faster for two reasons, according to the article:

1.The switch does a bit more per iteration because of bounds checking.


Now let's consider a proper implementation of a threaded-code interpreter,
where the code pointer points to an array of addresses. We've been through this
before and it turns out switch is slower because of an extra load.

a) Switch does 1 load for opcode, 1 load for the jump table, 1 indirect jump to
advance
(not even counting bounds checking of the switch)

b) Threaded-code via (say) computed goto does 1 load of opcode and 1 indirect
jump, because opcode is an address already (so there is no separate jump table).


True, but I'd like to find a way that this can be done as an optimization.



I'm certain that forced tail call would work just fine instead of computed goto
for this scenario. In fact I've measured this with LDC and the results are
promising but only work with -O2/-O3 (where tail call is optimized). I'd gladly
dig them up if you are interested.


I'm pretty reluctant to add language features that can be done as optimizations.



Re: Software Assurance Reference Dataset

2014-07-21 Thread Dmitry Olshansky via Digitalmars-d

21-Jul-2014 00:50, Walter Bright writes:

On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:

Functional programming is full of simple recursion and it would be
nice not to
stack overflow in debug builds.


Traditional FP languages don't have loops, and so must do recursion. D
has loops, even in pure functions, there's no reason not to use them.




Another use case is the so-called threaded-code interpreter, which can be
done with
either computed gotos (and a bunch of labels) or forced tail calls (and a
bunch of
functions). In both cases the computed jump or tail call is indirect.


http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables/


Actually that blog entry is wrong about it. Computed goto won't help 
when used as a direct replacement for switch, and you are correct that 
the said code could easily be optimized with final switch.


What would help a lot is a threaded-code interpreter, that is, bytecode 
that contains direct addresses used in computed goto.



The computed goto is faster for two reasons, according to the article:

1.The switch does a bit more per iteration because of bounds checking.


Now let's consider a proper implementation of a threaded-code interpreter,
where the code pointer points to an array of addresses. We've been through 
this before and it turns out switch is slower because of an extra load.


a) Switch does 1 load for opcode, 1 load for the jump table, 1 indirect 
jump to advance

(not even counting bounds checking of the switch)

b) Threaded-code via (say) computed goto does 1 load of opcode and 1 
indirect jump, because opcode is an address already (so there is no 
separate jump table).


I'm certain that forced tail call would work just fine instead of 
computed goto for this scenario. In fact I've measured this with LDC and 
the results are promising but only work with -O2/-O3 (where tail call is 
optimized). I'd gladly dig them up if you are interested.



--
Dmitry Olshansky


Re: Software Assurance Reference Dataset

2014-07-21 Thread Walter Bright via Digitalmars-d

On 7/20/2014 2:38 PM, bearophile wrote:

Are such optimizations portable and guaranteed on all D compilers? If the answer
is negative, then they can't replace a _standard_ D syntax for computed gotos.


C'mon, bearophile. Optimizations are always implementation dependent. People 
rely on them for fast code. This isn't any different.




Re: Software Assurance Reference Dataset

2014-07-21 Thread Timon Gehr via Digitalmars-d

On 07/20/2014 10:50 PM, Walter Bright wrote:

On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:

Functional programming is full of simple recursion and it would be
nice not to
stack overflow in debug builds.


Traditional FP languages don't have loops, and so must do recursion.


Uh...


D has loops, even in pure functions,


So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
  r <- newSTRef 1
  forM_ [1..n] $ \i ->
    modifySTRef r (*i)
  readSTRef r

main = print $ factorial 5 -- 120



there's no reason not to use them.
...


But of course there are reasons to use tail calls.


Re: Software Assurance Reference Dataset

2014-07-21 Thread Walter Bright via Digitalmars-d

On 7/20/2014 8:15 PM, Timon Gehr wrote:

On 07/20/2014 10:50 PM, Walter Bright wrote:

On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:

Functional programming is full of simple recursion and it would be
nice not to
stack overflow in debug builds.


Traditional FP languages don't have loops, and so must do recursion.


Uh...


D has loops, even in pure functions,


So does Haskell.

import Control.Monad


Uh-oh! Monad! :-)



Re: Software Assurance Reference Dataset

2014-07-21 Thread Andrei Alexandrescu via Digitalmars-d

On 7/20/14, 8:15 PM, Timon Gehr wrote:

On 07/20/2014 10:50 PM, Walter Bright wrote:

On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:

Functional programming is full of simple recursion and it would be
nice not to
stack overflow in debug builds.


Traditional FP languages don't have loops, and so must do recursion.


Uh...


D has loops, even in pure functions,


So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
  r <- newSTRef 1
  forM_ [1..n] $ \i ->
    modifySTRef r (*i)
  readSTRef r

main = print $ factorial 5 -- 120


And that doesn't look awkward at all :o). -- Andrei




Re: Software Assurance Reference Dataset

2014-07-21 Thread deadalnix via Digitalmars-d

On Sunday, 20 July 2014 at 07:27:36 UTC, Walter Bright wrote:

What about the @continuation
(http://en.wikipedia.org/wiki/Continuation-passing_style )?


I doubt they'll want to use that attribute, either.



Especially if that can be done with AST macros :D


Re: Software Assurance Reference Dataset

2014-07-21 Thread Tobias Müller via Digitalmars-d
Andrew Godfrey x...@y.com wrote:
 1) A function annotation that means I will call myself
 recursively, and when I do, I expect the tail recursion
 optimization. I have seen code which allocates something big on
 the stack and depends on the optimization. So this intent should
 be expressible.

Wouldn't it be more useful to have a modified/annotated return statement
for that?

Tail recursion is an implementation detail; for the user of the
function it's not really interesting, except for the fact that it gives
bounded stack usage, which is a useful property by itself and not only for
tail-recursive functions.

Tobi


Re: Software Assurance Reference Dataset

2014-07-21 Thread bearophile via Digitalmars-d

Tobias Müller:

Wouldn't it be more useful to have a modified/annotated return 
statement for that?


I don't know.


Tail-recursiveness is an implementation detail, for the user of 
the function it's not really interesting.


Yes, in theory a @tailrec annotation doesn't change the mangling 
of the function. But D has exceptions and stack traces. A stack 
trace for an annotated function is different because there are 
fewer stack frames...



Except for the fact that it has bounded stack size which is a 
useful property by itself and not only for tailrecursive 
functions.


Yes, that's why I have said that a @continuation is a more 
general solution than @tailrec.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-21 Thread Timon Gehr via Digitalmars-d

On 07/21/2014 06:40 AM, Andrei Alexandrescu wrote:

On 7/20/14, 8:15 PM, Timon Gehr wrote:

On 07/20/2014 10:50 PM, Walter Bright wrote:

On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:

Functional programming is full of simple recursion and it would be
nice not to
stack overflow in debug builds.


Traditional FP languages don't have loops, and so must do recursion.


Uh...


D has loops, even in pure functions,


So does Haskell.

import Control.Monad
import Control.Monad.ST
import Data.STRef

factorial :: Integer -> Integer
factorial n = runST $ do
  r <- newSTRef 1
  forM_ [1..n] $ \i ->
    modifySTRef r (*i)
  readSTRef r

main = print $ factorial 5 -- 120


And that doesn't look awkward at all :o). -- Andrei




Indeed.

This just creates a variable, modifies it 'n' times in a loop and then 
reads the result.


Re: Software Assurance Reference Dataset

2014-07-21 Thread Timon Gehr via Digitalmars-d

On 07/21/2014 06:56 AM, Walter Bright wrote:

On 7/20/2014 8:15 PM, Timon Gehr wrote:

On 07/20/2014 10:50 PM, Walter Bright wrote:

On 7/20/2014 3:27 AM, Dmitry Olshansky wrote:

Functional programming is full of simple recursion and it would be
nice not to
stack overflow in debug builds.


Traditional FP languages don't have loops, and so must do recursion.


Uh...


D has loops, even in pure functions,


So does Haskell.

import Control.Monad


Uh-oh! Monad! :-)



The example just uses the ST Monad which is quite similar to weakly pure 
statements in D.


Re: Software Assurance Reference Dataset

2014-07-21 Thread bearophile via Digitalmars-d

Andrei Alexandrescu:


And that doesn't look awkward at all :o). -- Andrei


A related thread:
http://stackoverflow.com/questions/6622524/why-is-haskell-sometimes-referred-to-as-best-imperative-language

Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-21 Thread Timon Gehr via Digitalmars-d

On 07/22/2014 12:10 AM, bearophile wrote:




Except for the fact that it has bounded stack size which is a useful
property by itself and not only for tailrecursive functions.


Yes, that's why I have said that a @continuation is a more general
solution than @tailrec.

Bye,
bearophile


(Continuations and tail calls are not the same kind of thing.)

http://en.wikipedia.org/wiki/Continuation#First-class_continuations
http://en.wikipedia.org/wiki/Delimited_continuation

http://en.wikipedia.org/wiki/Tail_call


Re: Software Assurance Reference Dataset

2014-07-21 Thread Walter Bright via Digitalmars-d

On 7/21/2014 3:25 PM, Timon Gehr wrote:

The example just uses the ST Monad which is quite similar to weakly pure
statements in D.


D doesn't have weakly pure statements - it has weakly pure functions.


Re: Software Assurance Reference Dataset

2014-07-20 Thread bearophile via Digitalmars-d

Andrew Godfrey:


1) A function annotation that means I will call myself
recursively, and when I do, I expect the tail recursion
optimization. I have seen code which allocates something big on
the stack and depends on the optimization. So this intent should
be expressible.


A @tailrec annotation seems good, and will improve the functional 
usage of D. It should stop compilation with an error if the 
function doesn't call itself or if the compiler is not able to 
remove the recursive call. This means it has to be fully enforced.




2) Annotations about when a function does not expect re-entrancy
to be possible based on call-graph analysis.


I don't understand. Assuming I know this 
(http://en.wikipedia.org/wiki/Reentrancy_%28computing%29 ) can 
you show an example?


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-20 Thread bearophile via Digitalmars-d

Andrew Godfrey:


1) A function annotation that means I will call myself
recursively, and when I do, I expect the tail recursion
optimization. I have seen code which allocates something big on
the stack and depends on the optimization. So this intent should
be expressible.


A @tailrec annotation seems good, and will improve the 
functional usage of D. It should stop compilation with an error 
if the function doesn't call itself or if the compiler is not 
able to remove the recursive call. This means it has to be 
fully enforced.


Perhaps a @cps (or @continuation) annotation is better and more 
general.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-20 Thread Walter Bright via Digitalmars-d

On 7/19/2014 11:06 PM, bearophile wrote:

A @tailrec annotation seems good, and will improve the functional usage of D.
It should stop compilation with an error if the function doesn't call itself or
if the compiler is not able to remove the recursive call. This means it has to
be fully enforced.


If you want to guarantee replacement of a recursive call with a loop, just write 
a loop.




Re: Software Assurance Reference Dataset

2014-07-20 Thread bearophile via Digitalmars-d

Walter Bright:

If you want to guarantee replacement of a recursive call with a 
loop, just write a loop.


There are cases where a recursive algorithm is nicer. And people 
that want to use D functionally, may also be used to writing code 
recursively.


What about the @continuation 
(http://en.wikipedia.org/wiki/Continuation-passing_style )?


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-20 Thread Walter Bright via Digitalmars-d

On 7/19/2014 11:55 PM, bearophile wrote:

Walter Bright:


If you want to guarantee replacement of a recursive call with a loop, just
write a loop.


There are cases where a recursive algorithm is nicer. And people that want to
use D functionally, may also be used to writing code recursively.


I doubt they'll want to use a @tailrec attribute.



What about the @continuation
(http://en.wikipedia.org/wiki/Continuation-passing_style )?


I doubt they'll want to use that attribute, either.


In any case, D supports more styles of programming than any other language I can 
think of. I doubt adding even more will be that helpful.


Re: Software Assurance Reference Dataset

2014-07-20 Thread Dmitry Olshansky via Digitalmars-d

20-Jul-2014 10:45, Walter Bright пишет:

On 7/19/2014 11:06 PM, bearophile wrote:

A @tailrec annotation seems good, and will improve the functional
usage of D.
It should stop compilation with an error if the function doesn't call
itself or
if the compiler is not able to remove the recursive call. This means
it has to
be fully enforced.


If you want to guarantee replacement of a recursive call with a loop,
just write a loop.



Functional programming is full of simple recursion and it would be nice 
not to stack overflow in debug builds.


Another use case is the so-called threaded-code interpreter, which can be 
done with either computed gotos (and a bunch of labels) or forced tail 
calls (and a bunch of functions). In both cases the computed jump or tail 
call is indirect.


--
Dmitry Olshansky


Re: Software Assurance Reference Dataset

2014-07-20 Thread bearophile via Digitalmars-d

Walter Bright:


I doubt they'll want to use a @tailrec attribute.


In Scala there is @tailrec:
http://www.scala-lang.org/api/current/scala/annotation/tailrec.html

In both F# and OCaml there is the rec keyword:
http://msdn.microsoft.com/en-us/library/dd233232.aspx

http://caml.inria.fr/pub/docs/manual-ocaml-400/manual003.html#toc4

In Clojure there is recur (that is not an annotation):
http://clojure.org/special_forms?responseToken=08ea4841337f67bb8f07663aa70b03aca#recur

I think functional programmers are willing to use a @tailrec 
attribute if it's well designed and does what it says on the 
tin.




What about the @continuation
(http://en.wikipedia.org/wiki/Continuation-passing_style )?


I doubt they'll want to use that attribute, either.


I don't know. It's more general than @tailrec, but probably 
many C, C++, and Java programmers don't even know what it is. 
But it allows a programming style that in some cases is 
interesting (far cleaner than computed gotos).



In any case, D supports more styles of programming than any 
other language I can think of. I doubt adding even more will be 
that helpful.


I think a basic form of pattern matching implemented with the 
switch construct is a good idea for D.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-19 Thread Andrew Godfrey via Digitalmars-d

Returning to the earlier question, of helping to avoid stack
overflows:
I can easily think of two things the language could do.
(Forgive me if D already has them - I have read a fair amount but
not all the minutiae.)

1) A function annotation that means I will call myself
recursively, and when I do, I expect the tail recursion
optimization. I have seen code which allocates something big on
the stack and depends on the optimization. So this intent should
be expressible.

2) Annotations about when a function does not expect re-entrancy
to be possible based on call-graph analysis. This would obviously
have to restrict what it can do in order to be feasible, but
wouldn't it still be useful? Besides - the cases where it is hard
for the compiler to detect the possibility of re-entrancy are
often cases where humans have trouble noticing it too.


Re: Software Assurance Reference Dataset

2014-07-14 Thread via Digitalmars-d

On Sunday, 13 July 2014 at 23:35:46 UTC, Walter Bright wrote:

On 7/13/2014 4:04 AM, Marc Schütz schue...@gmx.net wrote:

On Sunday, 13 July 2014 at 03:25:08 UTC, Walter Bright wrote:

On 7/11/2014 10:28 AM, deadalnix wrote:

The compiler can ensure that you hit at least every 4k or so.


And it already does.


Doesn't look so:




int bar(int a) {
int[8000] b = void;
b[$-1] = a;
return b[$-1];
}


On Win32:

_D4foo53barFiZi comdat
assume  CS:_D4foo53barFiZi
pushEBP
mov EBP,ESP
mov EDX,7
L8: sub ESP,01000h
test[ESP],ESP
dec EDX
jne L8
sub ESP,0D04h
lea ECX,-8[EBP]
mov [ECX],EAX
mov EAX,-8[EBP]
leave
ret

It doesn't do it on Linux because gcc doesn't do it. But the 
capability is in the back end, and it does it for alloca(), too.


Hmm... but this is using DMD, not GDC. Or do you mean that DMD 
doesn't do it because GCC doesn't do it either? If so, what is 
the reason for this? Why shouldn't this feature be enabled on 
every platform?


Re: Software Assurance Reference Dataset

2014-07-14 Thread Walter Bright via Digitalmars-d

On 7/14/2014 3:07 AM, Marc Schütz schue...@gmx.net wrote:

Hmm... but this is using DMD, not GDC. Or do you mean that DMD doesn't do it,
because GCC doesn't do it either?


Right.



If so, what is the reason for this? Why
shouldn't this feature be enabled on every platform?


There are often undocumented reasons why gcc generates code in certain ways, and 
I figured that being compatible was the best strategy. Every time I deviate, it 
comes back eventually as a bug report.


Re: Software Assurance Reference Dataset

2014-07-14 Thread Iain Buclaw via Digitalmars-d
On 14 July 2014 11:07, via Digitalmars-d digitalmars-d@puremagic.com wrote:
 On Sunday, 13 July 2014 at 23:35:46 UTC, Walter Bright wrote:

 On 7/13/2014 4:04 AM, Marc Schütz schue...@gmx.net wrote:

 On Sunday, 13 July 2014 at 03:25:08 UTC, Walter Bright wrote:

 On 7/11/2014 10:28 AM, deadalnix wrote:

 The compiler can ensure that you hit at least every 4k or so.


 And it already does.


 Doesn't look so:


 int bar(int a) {
 int[8000] b = void;
 b[$-1] = a;
 return b[$-1];
 }


 On Win32:

 _D4foo53barFiZi comdat
 assume  CS:_D4foo53barFiZi
 pushEBP
 mov EBP,ESP
 mov EDX,7
 L8: sub ESP,01000h
 test[ESP],ESP
 dec EDX
 jne L8
 sub ESP,0D04h
 lea ECX,-8[EBP]
 mov [ECX],EAX
 mov EAX,-8[EBP]
 leave
 ret

 It doesn't do it on Linux because gcc doesn't do it. But the capability is
 in the back end, and it does it for alloca(), too.


 Hmm... but this is using DMD, not GDC. Or do you mean that DMD doesn't do it,
 because GCC doesn't do it either? If so, what is the reason for this? Why
 shouldn't this feature be enabled on every platform?


For GDC, there is -fstack-protector (which is turned on by default in
distributed binaries from Ubuntu).  However IIRC only functions that
use alloca or have static arrays of type char are actually checked.
Declaring an int[100] doesn't invoke alloca, so you won't see it.

The bounds checking in D alone is enough to catch the most common
stack-overflow bugs.

Regards
Iain



Re: Software Assurance Reference Dataset

2014-07-13 Thread via Digitalmars-d

On Sunday, 13 July 2014 at 03:25:08 UTC, Walter Bright wrote:

On 7/11/2014 10:28 AM, deadalnix wrote:

The compiler can ensure that you hit at least every 4k or so.


And it already does.


Doesn't look so:

int foo(int a) {
int[100] b = void;
b[$-1] = a;
return b[$-1];
}

int bar(int a) {
int[8000] b = void;
b[$-1] = a;
return b[$-1];
}

# objdump -d bigstack.o | ddemangle | less

Disassembly of section .text.int bigstack.foo(int):

 int bigstack.foo(int):
   0:   55  push   %rbp
   1:   48 8b ecmov%rsp,%rbp
   4:   48 81 ec a0 01 00 00sub$0x1a0,%rsp
   b:   48 8d 45 f4 lea-0xc(%rbp),%rax
   f:   89 38   mov%edi,(%rax)
  11:   8b 45 f4mov-0xc(%rbp),%eax
  14:   c9  leaveq
  15:   c3  retq
  16:   66 90   xchg   %ax,%ax

Disassembly of section .text.int bigstack.bar(int):

 int bigstack.bar(int):
   0:   55  push   %rbp
   1:   48 8b ecmov%rsp,%rbp
   4:   48 81 ec 10 7d 00 00sub$0x7d10,%rsp
   b:   48 8d 45 f4 lea-0xc(%rbp),%rax
   f:   89 38   mov%edi,(%rax)
  11:   8b 45 f4mov-0xc(%rbp),%eax
  14:   c9  leaveq
  15:   c3  retq
  16:   66 90   xchg   %ax,%ax

(This is with DMD master on Linux x86_64. Same with -m32.)


Re: Software Assurance Reference Dataset

2014-07-13 Thread Walter Bright via Digitalmars-d

On 7/13/2014 4:04 AM, Marc Schütz schue...@gmx.net wrote:

On Sunday, 13 July 2014 at 03:25:08 UTC, Walter Bright wrote:

On 7/11/2014 10:28 AM, deadalnix wrote:

The compiler can ensure that you hit at least every 4k or so.


And it already does.


Doesn't look so:




 int bar(int a) {
 int[8000] b = void;
 b[$-1] = a;
 return b[$-1];
 }


On Win32:

_D4foo53barFiZi comdat
assume  CS:_D4foo53barFiZi
pushEBP
mov EBP,ESP
mov EDX,7
L8: sub ESP,01000h
test[ESP],ESP
dec EDX
jne L8
sub ESP,0D04h
lea ECX,-8[EBP]
mov [ECX],EAX
mov EAX,-8[EBP]
leave
ret

It doesn't do it on Linux because gcc doesn't do it. But the capability is in 
the back end, and it does it for alloca(), too.


Re: Software Assurance Reference Dataset

2014-07-12 Thread via Digitalmars-d

On Friday, 11 July 2014 at 17:28:39 UTC, deadalnix wrote:

The compiler can ensure that you hit at least every 4k or so. It
doesn't look like a very hard constraint to have a volatile load
per untouched 4k of stack (which should be very rare).


Sure, it should be possible to work it into the backend at the 
code-gen level.


For fibers without recursion, I think the most powerful approach 
would be to do whole-program analysis to establish a minimum N 
for the typical case (without unbounded recursion). Of course, 
alloca then also has to extend the stack and do bookkeeping of 
the additional storage taken by alloca(), or simply use a 
separate alloca stack.


Re: Software Assurance Reference Dataset

2014-07-12 Thread Walter Bright via Digitalmars-d

On 7/11/2014 10:28 AM, deadalnix wrote:

The compiler can ensure that you hit at least every 4k or so.


And it already does.


Re: Software Assurance Reference Dataset

2014-07-11 Thread deadalnix via Digitalmars-d

On Monday, 30 June 2014 at 08:00:37 UTC, Ola Fosheim Grøstad
wrote:

On Thursday, 26 June 2014 at 09:35:20 UTC, Walter Bright wrote:
Stack overflows are not safety problems when a guard page is 
used past the end of the stack. Then, overflow checking is 
done in hardware. Guard pages aren't currently used for 
fibers, so overflows are a real danger there.


But a page is only 2K? So what happens if you skip more than 2K 
and never touch the guard page? Does D prove that the stack 
pointer is never moved more than 2K-1 without a read or write 
in that range?




The compiler can ensure that you hit at least every 4k or so. It
doesn't look like a very hard constraint to have a volatile load
per untouched 4k of stack (which should be very rare).


Re: Software Assurance Reference Dataset

2014-07-10 Thread bearophile via Digitalmars-d

(Sorry for the very late answer.)

Walter Bright:

Stack overflows are not safety problems when a guard page is 
used past the end

of the stack.
It's not a safety problem in Erlang/Rust, because those 
languages are designed to manage such failures in a good way.


Please explain.


The idea comes from the Erlang language (and perhaps Erlang has 
copied it from something else), and then Rust copied it (indeed, 
if you look at the "Influenced by" list here, you see Erlang, and 
Rust has copied only the Erlang feature I am discussing here: 
http://en.wikipedia.org/wiki/Rust_language ).


Erlang systems must be extremely reliable; they run 
telecommunication systems that must just always work, with only 
seconds or minutes of downtime every year. But programs contain 
errors and bugs, so to face this problem Erlang (and now Rust) 
has chosen a curious strategy.


For the description, see "4.3 Error handling philosophy" at pages 
104-109 of this PDF:

http://www.erlang.org/download/armstrong_thesis_2003.pdf

It's also explained a bit here, in the "3. What is 
fault-tolerance" section:

http://stackoverflow.com/questions/3172542/are-erlang-otp-messages-reliable-can-messages-be-duplicated/3176864#3176864

Some more technical info:
http://www.erlang.org/doc/design_principles/sup_princ.html

Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-10 Thread bearophile via Digitalmars-d

(Sorry for the very delayed answers.)



Walter Bright:

Oh well, there goes about 90% of D programs and pretty much all 
use of the D runtime library!


Right. On the other hand, the need for such strict safety 
requirements that forbid stack overflows is not present in 
desktop software; it's for uncommon systems with limited 
resources that need to work reliably, like some car/plane 
electronics.


So when you have such needs you are willing to throw away 90% of 
D libs/Phobos, and probably not to use druntime either :-) And in 
such cases it's good to have as many compile-time assurances as 
possible.




And I suspect the subset of D programs
that don't have indirection or recursion is so small as to not 
be worth the bother.


I do know there is a class of applications, for example in 
critical embedded
systems, where recursion and indirection, and even allocation, 
are not allowed.
Using D in such cases would require eschewing using Phobos, and 
some other care

taken,


Right. And people in such fields are looking for something safer 
and nicer than C + a ton of static tests (and many of them don't 
seem to love Ada).




Andrei Alexandrescu:

My understanding is that it can be done but only with 
annotations or whole program analysis.


I agree. But people in (example) automotive industry could be 
(and probably are) willing to do both.


This is a specialized niche, so adding such annotations to the D 
language seems too much. So I've suggested adding enough static 
introspection (and some way to attach compile-time semantics to 
UDAs) to allow specialized programmers to add those annotations 
to D (and leave other, more complex tests to external tools).




H. S. Teoh:

I think the compiler should be able to tell, at least for the 
simplest

cases, whether something will definitely stop recursing.


The Whiley language (http://whiley.org/ ) is able to do this, with 
the loop variants you write.




deadalnix:


2) By alloca();


it is @system


I'd like to use something like alloca (or, much better, something 
similar to the variable-length stack-allocated 
partially-library-defined arrays of C++14/C++17) even in @safe 
code (but not in the stack-constrained code we were discussing here).




We should have a page reserved at the end of the stack so we can
throw when reaching it. The compiler can ensure we don't bypass
it in case 1.


OK.



Ola Fosheim Grøstad:

A compromise would be to inject runtime checks to see if there 
is

sufficient stack space whenever you move the stack pointer


Run-time tests in debug mode seem useful.

Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-07-10 Thread Walter Bright via Digitalmars-d

On 7/10/2014 5:08 AM, bearophile wrote:

(Sorry for the very late answer.)

Walter Bright:


Stack overflows are not safety problems when a guard page is used past the end
of the stack.

It's not a safety problem in Erlang/Rust, because those languages are
designed to manage such failures in a good way.


Please explain.


The idea comes from the Erlang language (and perhaps Erlang has copied it from
something else), and then Rust copied it (indeed, if you look at the
"Influenced by" list here, you see Erlang, and Rust has copied only the
Erlang feature I am discussing here: http://en.wikipedia.org/wiki/Rust_language 
).

Erlang systems must be extremely reliable; they run telecommunication systems
that must just always work, with only seconds or minutes of downtime every year.
But programs contain errors and bugs, so to face this problem Erlang (and now
Rust) has chosen a curious strategy.

For the description, see "4.3 Error handling philosophy" at pages 104-109, a PDF 
file:
http://www.erlang.org/download/armstrong_thesis_2003.pdf

It's also explained a bit here, in the "3. What is fault-tolerance" section:
http://stackoverflow.com/questions/3172542/are-erlang-otp-messages-reliable-can-messages-be-duplicated/3176864#3176864


Some more technical info:
http://www.erlang.org/doc/design_principles/sup_princ.html

Bye,
bearophile


Thanks for the links!


Re: Software Assurance Reference Dataset

2014-06-30 Thread via Digitalmars-d

On Thursday, 26 June 2014 at 09:35:20 UTC, Walter Bright wrote:
Stack overflows are not safety problems when a guard page is 
used past the end of the stack. Then, overflow checking is done 
in hardware. Guard pages aren't currently used for fibers, so 
overflows are a real danger there.


But a page is only 2K? So what happens if you skip more than 2K 
and never touch the guard page? Does D prove that the stack 
pointer is never moved more than 2K-1 without a read or write in 
that range?


Guard pages on a flat memory model are not as safe as on a segmented 
memory model.


A compromise would be to inject runtime checks to see if there is 
sufficient stack space whenever you move the stack pointer and 
remove them when you can prove that there is enough room. E.g. 
collapse the checks into larger spans of stack space by 
propagating them upwards in the call chain.


Anyway, minimizing stack space is very useful for fibers in 
scientific simulations or real time systems since you want to be 
able to run as many as you can fit into memory. Each actor/agent 
could be very simple so I would not rule out the ability to prove 
it in most cases for some domains.


Re: Software Assurance Reference Dataset

2014-06-29 Thread ponce via Digitalmars-d

On Thursday, 26 June 2014 at 21:01:40 UTC, Araq wrote:


Spark is a research language that does not work, as I've 
discovered and discussed with you before. The max stack usage 
cannot be determined at compile time; again, this is the 
halting problem.




What?! It's easily solvable: forbid recursion and indirect
function calls, and it's guaranteed that the program only requires
a fixed-size stack; you can compute an upper bound of the
required stack size at compile time. Which is BTW exactly what
OpenCL does, as GPUs tend to have no stacks.


Actually CUDA does have recursion and indirect function calls.


Re: Software Assurance Reference Dataset

2014-06-29 Thread Paulo Pinto via Digitalmars-d

Am 29.06.2014 11:07, schrieb ponce:

On Thursday, 26 June 2014 at 21:01:40 UTC, Araq wrote:


Spark is a research language that does not work, as I've discovered
and discussed with you before. The max stack usage cannot be
determined at compile time; again, this is the halting problem.



What?! It's easily solvable: Forbid recursion and indirect
function calls and it's guaranteed that the program only requires
a fixed size stack and you can compute an upper bound of the
required stack size at compile-time. Which is BTW exactly what
OpenCL does as GPUs tend to have no stacks.


Actually CUDA does have recursion and indirect function calls.


Not only that, CUDA offers a quite usable C++ subset, whereas OpenCL 
keeps us in primitive C land.


--
Paulo


Re: Software Assurance Reference Dataset

2014-06-29 Thread via Digitalmars-d

On Thursday, 26 June 2014 at 09:35:20 UTC, Walter Bright wrote:

On 6/26/2014 2:19 AM, bearophile wrote:

One kind of problem left is to avoid stack overflows.


In general, stack overflow checking at compile time is the 
halting problem. It needs a runtime check.


No, "is N bytes of stack sufficient" is decidable if you don't 
count the heap as an unlimited resource that preserves state. 
So it is not the halting problem. The halting problem would be 
"is K bytes of stack sufficient, for K in [0, infinity)".


You can also do a worst-case analysis that requires you to 
specify too much stack, but proves that it is sufficient. E.g. 
set the recursion depth to the max number of nodes in a binary 
tree, even if you know that it will never get that deep.


But since you allow calls into C code you need extra annotations 
and probably have more pressing issues to deal with… (like GC).


Re: Software Assurance Reference Dataset

2014-06-26 Thread bearophile via Digitalmars-d

Walter Bright:

It's an interesting list, and an opportunity for D. I once said 
that my job was to put Coverity out of business.


Even if D has wide success, I don't think D will delete all the C 
and C++ code out of existence, so I don't think D will ever put 
Coverity out of business :-)




The more of these issues D can automatically prevent
with @safe, the better.


I think D will need new/better guns for that, like a more principled 
(formalized, fully written down in specs, eventually even proved) 
management of uniqueness, etc.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-06-26 Thread bearophile via Digitalmars-d

Walter Bright:

It's an interesting list, and an opportunity for D. I once said 
that my job was to put Coverity out of business. The more of 
these issues D can automatically prevent with @safe, the better.


One kind of problem left is to avoid stack overflows. I have had 
several such cases in my D code (perhaps because I use 
fixed-sized arrays a lot).


I think they can be caused by:
1) Too many large allocations in stack frames;
2) alloca();
3) Recursion and co-recursion chains.

The first cause is probably sufficiently easy to solve 
mechanically, with a conservative call tree analysis of the code. 
This is a job for external tools.


The second cause can be solved with a formal proof of the upper 
bounds of the arguments of alloca, or even more conservatively 
disallowing alloca.


The third cause, a cycle in the call graph, can be found with an 
external tool, but in theory it's also easy to avoid with a 
@no_call_cycles annotation :-) Such an annotation looks useful only 
if you want to avoid stack overflows, so it's better to use a 
more general annotation that also forbids alloca(). Something 
like @fixed_stack or @constant_memory.


I think adding this annotation to D is a little overkill (even 
though SPARK and Ada analysers have been doing stack-size analysis 
for many years). So I think a better solution (besides creating an 
external tool) is to add to D a more general feature for attaching 
compile-time semantics to user-defined annotations. To do this 
you need a trait that returns a sequence of all the functions 
called by a function.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-06-26 Thread Paulo Pinto via Digitalmars-d

On Thursday, 26 June 2014 at 00:36:58 UTC, Walter Bright wrote:

http://samate.nist.gov/SRD/view.php?count=20first=0sort=asc

This is a list of security vulnerabilities in languages 
including C/C++. 88,737 of them (!).


It's an interesting list, and an opportunity for D. I once said 
that my job was to put Coverity out of business. The more of 
these issues D can automatically prevent with @safe, the better.


As expected, most of them would be automatically fixed by using a 
descendant of the Algol/Mesa/Cedar branch of languages.


I look forward to @safe providing a similar set of guarantees.

--
Paulo


Re: Software Assurance Reference Dataset

2014-06-26 Thread Walter Bright via Digitalmars-d

On 6/26/2014 2:19 AM, bearophile wrote:

One kind of problem left is to avoid stack overflows.


In general, stack overflow checking at compile time is the halting problem. It 
needs a runtime check.


Stack overflows are not safety problems when a guard page is used past the end 
of the stack. Then, overflow checking is done in hardware. Guard pages aren't 
currently used for fibers, so overflows are a real danger there.


In 64 bit code, however, one should be able to use guard pages for fibers.


Re: Software Assurance Reference Dataset

2014-06-26 Thread bearophile via Digitalmars-d

Walter Bright:

In general, stack overflow checking at compile time is the 
halting problem. It needs a runtime check.


There are several systems, including SPARK, that perform a 
conservative and apparently acceptable stack overflow check at 
compile time. If you don't agree with what I've written in my 
post, then please give a more detailed answer to the points I've 
written above.



Stack overflows are not safety problems when a guard page is 
used past the end of the stack.


It's not a safety problem in Erlang/Rust, because those languages 
are designed to manage such failures in a good way. In most other 
languages it's a safety problem, if your program execution has 
some importance.


Bye,
bearophile


Re: Software Assurance Reference Dataset

2014-06-26 Thread Timon Gehr via Digitalmars-d

On 06/26/2014 11:35 AM, Walter Bright wrote:

On 6/26/2014 2:19 AM, bearophile wrote:

One kind of problem left is to avoid stack overflows.


In general, stack overflow checking at compile time is the halting
problem.


That is irrelevant to his point because he is not suggesting to solve 
the general problem precisely. Analogously: In general, checking whether 
some variable in e.g. Python is ever assigned a string value is 
undecidable as well, but this does not imply we cannot have 'int' 
variables in D.


Re: Software Assurance Reference Dataset

2014-06-26 Thread Walter Bright via Digitalmars-d

On 6/26/2014 2:50 AM, bearophile wrote:

Walter Bright:


In general, stack overflow checking at compile time is the halting problem. It
needs a runtime check.


There are several systems, including SPARK, that perform a conservative and
apparently acceptable stack overflow check at compile time. If you don't agree
with what I've written in my post, then please give a more detailed answer to
the points I've written above.


Spark is a research language that does not work, as I've discovered and 
discussed with you before. The max stack usage cannot be determined at 
compile time; again, this is the halting problem.




Stack overflows are not safety problems when a guard page is used past the end
of the stack.

It's not a safety problem in Erlang/Rust, because those languages are designed
to manage such failures in a good way.


Please explain.



In most other languages it's a safety
problem, if your program execution has some importance.


I mean safety in the sense of being a security problem, which is the context 
of this thread.




Re: Software Assurance Reference Dataset

2014-06-26 Thread Walter Bright via Digitalmars-d

On 6/26/2014 2:52 AM, Timon Gehr wrote:

On 06/26/2014 11:35 AM, Walter Bright wrote:

On 6/26/2014 2:19 AM, bearophile wrote:

One kind of problem left is to avoid stack overflows.


In general, stack overflow checking at compile time is the halting
problem.


That is irrelevant to his point because he is not suggesting to solve the
general problem precisely. Analogously: In general, checking whether some
variable in e.g. Python is ever assigned a string value is undecidable as well,
but this does not imply we cannot have 'int' variables in D.


When you're dealing with security issues, which is what this thread is about, 
you'll need a guarantee about stack overflow. Adding annotations is not helpful 
here because they are not checkable.


Again, what WORKS is a runtime check.
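
For illustration, such a runtime check can be sketched in C (hypothetical helper names; assumes a downward-growing stack, as on common x86/ARM ABIs):

```c
#include <stddef.h>
#include <stdint.h>

/* Record a base address near the top of the stack, then, before each
 * recursive step, compare how far the stack has grown against a budget.
 * This is a sketch of the idea, not a production mechanism. */
static const char *stack_base;

static size_t stack_used(void) {
    char probe;  /* address of a local approximates the current stack pointer */
    return (size_t)((uintptr_t)stack_base - (uintptr_t)&probe);
}

/* Returns 0 on success, -1 if recursing further would exceed the budget. */
static int count_down(int n, size_t budget) {
    volatile int pad[32];  /* stand-in for a real frame's locals; also
                              keeps the compiler from collapsing frames */
    if (stack_used() > budget)
        return -1;         /* refuse to recurse past the budget */
    if (n == 0)
        return 0;
    pad[0] = count_down(n - 1, budget);
    return pad[0];
}
```

The check costs a comparison per call, but it turns unbounded recursion into a recoverable error rather than a crash or a silent overflow.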


Re: Software Assurance Reference Dataset

2014-06-26 Thread Araq via Digitalmars-d


Spark is a research language that does not work, as I've 
discovered and discussed with you before. The max stack usage 
cannot be determined at compile time; again, this is the halting 
problem.




What?! It's easily solvable: forbid recursion and indirect
function calls, and it's guaranteed that the program only
requires a fixed-size stack; you can compute an upper bound of
the required stack size at compile time. Which is, BTW, exactly
what OpenCL does, as GPUs tend to have no stacks.

In what way is Spark a research language that does not work?
And how many language design issues need to be discovered until
you admit that Safe-D is a research language that does not work?


Re: Software Assurance Reference Dataset

2014-06-26 Thread Walter Bright via Digitalmars-d

On 6/26/2014 2:01 PM, Araq wrote:


Spark is a research language that does not work, as I've discovered and
discussed with you before. The max stack usage cannot be determined at
compile time; again, this is the halting problem.



What?! It's easily solvable: Forbid recursion and indirect
function calls


Oh well, there goes about 90% of D programs and pretty much all use of the D 
runtime library!




and it's guaranteed that the program only requires
a fixed size stack and you can compute an upper bound of the
required stack size at compile-time. Which is BTW exactly what
OpenCL does as GPUs tend to have no stacks.

In what way is Spark a research language that does not work?


A while back, bearophile posted here some advocacy that Spark was using its 
contracts to prove things about the code. I experimented with it a bit and 
discovered that such proofs did not go beyond the trivial. As I recall, 
bearophile then agreed that it was a great idea that the implementation fell far 
short of.



 And how many language design issues need to be discovered until
 you admit that Safe-D is a research language that does not work?

I recommend that all such issues you discover be put on bugzilla, and marked 
with the 'safe' keyword, so they can be addressed.


I think there's a big difference between "only works for trivial cases with no 
idea how to handle the rest" and "does not work for all cases, but there are 
reasonable paths to handle them".


I admit that Safe D does not yet handle all the cases.


Re: Software Assurance Reference Dataset

2014-06-26 Thread Timon Gehr via Digitalmars-d

On 06/26/2014 10:29 PM, Walter Bright wrote:

On 6/26/2014 2:52 AM, Timon Gehr wrote:

On 06/26/2014 11:35 AM, Walter Bright wrote:

On 6/26/2014 2:19 AM, bearophile wrote:

One kind of problem left is to avoid stack overflows.


In general, stack overflow checking at compile time is the halting
problem.


That is irrelevant to his point because he is not suggesting to solve the
general problem precisely. Analogously: In general, checking whether some
variable in e.g. Python is ever assigned a string value is undecidable
as well,
but this does not imply we cannot have 'int' variables in D.


When you're dealing with security issues, which is what this thread is about,


This is about _avoiding stack overflows_. It's written down literally in 
the quoted passage.



you'll need a guarantee about stack overflow. Adding annotations is not
helpful with this because they are not checkable.
...


Not true. Basic facts:

- In practice, programs are constructed explicitly to fulfill a 
particular purpose and, in particular, if they never overflow the 
stack, they usually do so for a reason.


- Reasoning can be formalized and formal proofs can be written down in 
such a way that a machine can check whether the proof is valid.


- Annotations can include a formal proof.

We've had similar discussions before. Why do we _still_ have to argue 
about the _possibility_ of having a system that is helpful for proving 
stack overflows (or other bad behaviours) absent?


You can say "out of scope" or "not a priority" or "this should be 
realized in third-party tool support", but not "it is impossible".




Again, what WORKS is a runtime check.


A runtime check will not avoid the problem, which is exhaustion of stack 
space.


Re: Software Assurance Reference Dataset

2014-06-26 Thread Andrei Alexandrescu via Digitalmars-d

On 6/26/14, 2:01 PM, Araq wrote:


Spark is a research language that does not work, as I've discovered
and discussed with you before. The max stack usage cannot be
determined at compile time; again, this is the halting problem.



What?! It's easily solvable: Forbid recursion and indirect
function calls


I do agree that a useful subset of a language can be conservatively 
defined that doesn't require solving the halting problem. But that's not 
easy at all - it requires interprocedural analysis.


Andrei



Re: Software Assurance Reference Dataset

2014-06-26 Thread Andrei Alexandrescu via Digitalmars-d

On 6/26/14, 4:16 PM, Timon Gehr wrote:

- Annotations can include a formal proof.


If a function can be annotated with what other functions it calls
(non-transitively), I agree that it can be guaranteed with local
semantic checking that a program won't overflow. However requiring such
annotations would be onerous, and deducing them would require whole
program analysis.


We've had similar discussions before. Why do we _still_ have to argue
about the _possibility_ of having a system that is helpful for
proving stack overflows (or other bad behaviours) absent?

You can say "out of scope" or "not a priority" or "this should be
realized in third-party tool support", but not "it is impossible".


I also seem to reckon Walter is at the other extreme (he asserts it 
can't be done at all, period). My understanding is that it can be done, 
but only with annotations or whole-program analysis.



Andrei



Re: Software Assurance Reference Dataset

2014-06-26 Thread H. S. Teoh via Digitalmars-d
On Thu, Jun 26, 2014 at 04:43:33PM -0700, Andrei Alexandrescu via Digitalmars-d 
wrote:
 On 6/26/14, 2:01 PM, Araq wrote:
 
 Spark is a research language that does not work, as I've discovered
 and discussed with you before. The max stack usage cannot be
 determined at compile time; again, this is the halting problem.
 
 
 What?! It's easily solvable: Forbid recursion and indirect
 function calls
 
 I do agree that a useful subset of a language can be conservatively
 defined that doesn't require solving the halting problem. But that's
 not easy at all - it requires interprocedural analysis.
[...]

And it may not be feasible for a compiler to automatically prove.

One possible approach is to have the user supply the proof of eventual
termination, which can be mechanically verified by the compiler.
(Checking a supplied proof for correctness is more tractable than coming
up with the proof in the first place.) But to handle proofs of
non-trivial code beyond just toy examples, you might end up needing a
full-scale deductive proof subsystem in the compiler, which may or may
not be practical for D's needs!


T

-- 
That's not a bug; that's a feature!


Re: Software Assurance Reference Dataset

2014-06-26 Thread Timon Gehr via Digitalmars-d

On 06/27/2014 01:47 AM, Andrei Alexandrescu wrote:

On 6/26/14, 4:16 PM, Timon Gehr wrote:

...

You can say "out of scope" or "not a priority" or "this should be
realized in third-party tool support", but not "it is impossible".


I also seem to reckon Walter is in the other extreme (he asserts it
can't be done at all, period).


I don't think it makes sense to imply that I am defending an extreme 
position. I wasn't discussing design choices, truth is not a continuum, 
and I was just objecting to the line of reasoning that went like 
"undecidability of the halting problem implies that formal reasoning is 
pointless", which is clearly untrue.



My understanding is that it can be done
but only with annotations or whole program analysis.
...


Any way it is done, it doesn't come for free.


Re: Software Assurance Reference Dataset

2014-06-26 Thread H. S. Teoh via Digitalmars-d
On Thu, Jun 26, 2014 at 04:47:24PM -0700, Andrei Alexandrescu via Digitalmars-d 
wrote:
 On 6/26/14, 4:16 PM, Timon Gehr wrote:
 - Annotations can include a formal proof.
 
 If a function can be annotated with what other functions it calls
 (non-transitively), I agree that it can be guaranteed with local
 semantic checking that a program won't overflow. However requiring
 such annotations would be onerous, and deducing them would require
 whole program analysis.
 
 We've had similar discussions before. Why do we _still_ have to argue
 about the _possibility_ of having a system that is helpful for
 proving stack overflows (or other bad behaviours) absent?
 
 You can say "out of scope" or "not a priority" or "this should be
 realized in third-party tool support", but not "it is impossible".
 
 I also seem to reckon Walter is in the other extreme (he asserts it
 can't be done at all, period). My understanding is that it can be done
 but only with annotations or whole program analysis.
[...]

I think the compiler should be able to tell, at least for the simplest
cases, whether something will definitely stop recursing. Proving that
something will *not* stop recursing is unsolvable in the general case,
but even if we restrict it to a solvable subset, it's still far from
trivial for the compiler to invent such a proof (unless we restrict it
so much that it excludes too many useful algorithms).

One approach is to have the user supply a proof that the compiler can
then check -- in general, if a proof exists at all, it should be
checkable. Such checks furthermore can probably be done fast enough so
as to not adversely affect current compilation times (unless the proof
is ridiculously complex, which for practical real-world applications I
don't think will happen).

However, absence of proof is not proof of absence; just because neither
the compiler nor the user is able to come up with a proof that something
will stop recursing, doesn't mean that it definitely won't stop
recursing. So the compiler cannot definitively reject the code as
definitely overflowing the stack. But we *can* make it so that the user
tells the compiler "please stop compilation if neither of us can supply
a proof that this function will stop recursing". But it has to be
opt-in, because there will be many real-world applications that cannot
be proven to stop recursing, even though in practice they always will,
so we cannot make this a language default.

One particular example that comes to mind is the compiler itself: as it
parses the input program, there is in theory no guarantee that the input
won't have an arbitrarily deep nesting, such that the AST generated by
the parser will be infinitely deep, because you cannot statically prove
that the input will terminate. You don't know if the input file is
actually connected to the output pipe of a program that prints an
infinite series of ever-deeper nested structs, for example.  However, in
practice, such input never occurs, so we generally don't worry about
such contrived possibilities.  But this does mean that there can be no
proof of termination of recursion, even user-supplied ones -- because
there *are* cases where the parser *will* recurse forever, even if it
never happens in practice. Not accounting for that possibility
invalidates the proof, so any proof will always be rejected.  Therefore,
it is impossible to prove that recursion in the compiler is finite, even
though in practice it always is.


T

-- 
Life is all a great joke, but only the brave ever get the point. -- Kenneth 
Rexroth


Re: Software Assurance Reference Dataset

2014-06-26 Thread Walter Bright via Digitalmars-d

On 6/26/2014 4:16 PM, Timon Gehr wrote:

On 06/26/2014 10:29 PM, Walter Bright wrote:

When you're dealing with security issues, which is what this thread is about,

This is about _avoiding stack overflows_. It's written down literally in the
quoted passage.


Check the title of this thread, the linked issues, and bearophile's comment 
bringing up stack overflows.


It's about security, not niceness.



you'll need a guarantee about stack overflow. Adding annotations is not
helpful with this because they are not checkable.
...


Not true. Basic facts:

- In practice, programs are constructed explicitly to fulfill a particular
purpose and, in particular, if they never overflow the stack, they usually do
so for a reason.

- Reasoning can be formalized and formal proofs can be written down in such a
way that a machine can check whether the proof is valid.

- Annotations can include a formal proof.

We've had similar discussions before. Why do we _still_ have to argue about the
_possibility_ of having a system that is helpful for proving stack overflows (or
other bad behaviours) absent?

You can say "out of scope" or "not a priority" or "this should be realized in
third-party tool support", but not "it is impossible".


In the general case, it is impossible. And I suspect the subset of D programs 
that don't have indirection or recursion is so small as to not be worth the bother.


I do know there is a class of applications, for example in critical embedded 
systems, where recursion, indirection, and even allocation are not allowed. 
Using D in such cases would require eschewing Phobos, and some other care 
taken, but that isn't the issue here, which is security vulnerabilities.




Again, what WORKS is a runtime check.

A runtime check will not avoid the problem, which is exhaustion of stack space.


We disagree on the problem. The problem I initiated this thread on is security 
vulnerabilities. Terminating a program that overflows its stack is not a 
security vulnerability.


As for formal proofs, I'll let slip a guilty secret - I know so little about 
computer science proofs I wouldn't even recognize one if I saw one, let alone 
have the ability to craft one. So when you advocate formal proofs, directing it 
at me won't accomplish anything. To get formal proofs for D in the spec, in the 
code, in the compiler, I cannot deliver that. People like you are going to have 
to step forward and do them.


Re: Software Assurance Reference Dataset

2014-06-26 Thread Andrei Alexandrescu via Digitalmars-d

On 6/26/14, 5:29 PM, Timon Gehr wrote:

On 06/27/2014 01:47 AM, Andrei Alexandrescu wrote:

On 6/26/14, 4:16 PM, Timon Gehr wrote:

...

You can say "out of scope" or "not a priority" or "this should be
realized in third-party tool support", but not "it is impossible".


I also seem to reckon Walter is in the other extreme (he asserts it
can't be done at all, period).


I don't think it makes sense to imply that I am defending an extreme
position. I wasn't discussing design choices, truth is not a continuum,
and I was just objecting to the line of reasoning that went like
"undecidability of the halting problem implies that formal reasoning is
pointless", which is clearly untrue.


I agree.


My understanding is that it can be done
but only with annotations or whole program analysis.
...


Any way it is done, it doesn't come for free.


That's not a fair characterization. Interprocedural analysis is quite a 
different ball game from classic semantic checking.



Andrei



Re: Software Assurance Reference Dataset

2014-06-26 Thread deadalnix via Digitalmars-d

On Thursday, 26 June 2014 at 09:19:05 UTC, bearophile wrote:

Walter Bright:

It's an interesting list, and an opportunity for D. I once 
said that my job was to put Coverity out of business. The more 
of these issues D can automatically prevent with @safe, the 
better.


One kind of problem left is to avoid stack overflows. I have 
had several such cases in my D code (perhaps because I use 
fixed-sized arrays a lot).


I think they can be caused by:
1) Too many large allocations in stack frames;


If generated by the compiler, they can be made @safe.


2) By alloca();


it is @system


3) By recursion and co-recursion chains;


We should have a page reserved at the end of the stack so we can 
throw when reaching it. The compiler can ensure we don't bypass 
it in case 1.
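
The compiler-side guarantee for case 1 is usually "stack probing": a frame larger than a page is touched one page at a time, in order, so the frame cannot leap over the reserved guard page. A rough, hypothetical illustration in C (4 KiB pages assumed; a plain static buffer stands in for the would-be frame, and a marker byte is written so the effect is observable, where a real probe merely touches the page):

```c
#include <stddef.h>

enum { PAGE_SZ = 4096 };

/* A static buffer standing in for a large new stack frame. */
static char demo_frame[3 * PAGE_SZ + 100];

/* Touch every page of the frame in order. If a guard page lay inside
 * this range, the probe would fault there -- before any access could
 * land beyond the guard and corrupt unrelated memory. */
static void probe_frame(volatile char *frame, size_t size) {
    for (size_t off = 0; off < size; off += PAGE_SZ)
        frame[off] = 0x5A;       /* one touch per page */
    frame[size - 1] = 0x5A;      /* and the last byte of the frame */
}
```

With probing in place for oversized frames, the guard page check from the earlier messages becomes a real guarantee rather than a probabilistic one.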


Software Assurance Reference Dataset

2014-06-25 Thread Walter Bright via Digitalmars-d

http://samate.nist.gov/SRD/view.php?count=20&first=0&sort=asc

This is a list of security vulnerabilities in languages including C/C++. 88,737 
of them (!).


It's an interesting list, and an opportunity for D. I once said that my job was 
to put Coverity out of business. The more of these issues D can automatically 
prevent with @safe, the better.