Re: Safer Linux Kernel Modules Using the D Programming Language

2023-01-11 Thread Walter Bright via Digitalmars-d-announce

On 1/11/2023 8:15 PM, Tejas wrote:
Well, the companies don't get to single-handedly decide what features to add or 
deprecate, thanks to the C spec being written by ISO, which is why they have 
developed their own PLs.


Yes they can, as they add extensions to C all the time.


But also, adding dynamic arrays to C won't make the existing C code, the code 
they care about, any safer, because no one's gonna spend the money to update 
their C89/99/whatever code to C23/26. Even if they did, there's no guarantee 
others would as well.


You can incrementally fix code, as I regularly do with the dmd source code 
(originally in C).




Re: Safer Linux Kernel Modules Using the D Programming Language

2023-01-11 Thread Tejas via Digitalmars-d-announce
On Wednesday, 11 January 2023 at 19:27:15 UTC, Walter Bright 
wrote:

On 1/11/2023 3:26 AM, Paulo Pinto wrote:
It is kind of "solved", by turning all computers into C 
machines,


What an amazing amount of work just to avoid adding dynamic 
arrays to C.


Well, the companies don't get to single-handedly decide what 
features to add or deprecate, thanks to the C spec being written 
by ISO, which is why they have developed their own PLs.


But also, adding dynamic arrays to C won't make the existing C 
code, the code they care about, any safer, because no one's 
gonna spend the money to update their C89/99/whatever code to 
C23/26. Even if they did, there's no guarantee others would as 
well.


So when you can't change the world, what do you do?

You change yourself, and that's what they did, by making bounds 
checking and whatnot part of the _hardware semantics_ itself. Now 
the C programmers get to be happy that the program is still 2 
instructions long, while at the micro-architecture/microcode 
level the checks are still getting performed.


Re: Safer Linux Kernel Modules Using the D Programming Language

2023-01-11 Thread Walter Bright via Digitalmars-d-announce

On 1/11/2023 3:26 AM, Paulo Pinto wrote:

It is kind of "solved", by turning all computers into C machines,


What an amazing amount of work just to avoid adding dynamic arrays to C.
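
For readers unfamiliar with the idea, a dynamic array here can be read as a
pointer that carries its length, so that every index can be checked. A minimal
sketch in plain C, assuming a library-style emulation rather than the language
extension being argued for; `int_slice` and its helper functions are
hypothetical names invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": a pointer that carries its element count. */
typedef struct {
    int   *ptr;
    size_t len;
} int_slice;

/* Wrap an existing buffer without copying it. */
int_slice int_slice_from(int *buf, size_t len)
{
    assert(buf != NULL || len == 0);  /* an empty slice may have no storage */
    int_slice s = { buf, len };
    return s;
}

/* Checked read: reports failure instead of reading out of bounds. */
bool int_slice_get(int_slice s, size_t i, int *out)
{
    if (i >= s.len)
        return false;
    *out = s.ptr[i];
    return true;
}

/* Checked write, following the same rule. */
bool int_slice_set(int_slice s, size_t i, int value)
{
    if (i >= s.len)
        return false;
    s.ptr[i] = value;
    return true;
}
```

In this emulation every caller has to remember to go through the helpers; with
a language-level array type the compiler could insert the `i >= s.len` test
itself, which is presumably the point of asking for it in the language.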



Re: Safer Linux Kernel Modules Using the D Programming Language

2023-01-11 Thread Paulo Pinto via Digitalmars-d-announce
On Wednesday, 11 January 2023 at 09:52:23 UTC, Walter Bright 
wrote:
By the way, back in the 80's, I wrote my own pointer checker 
for my own use developing C code. It was immensely useful in 
flushing bugs out of my code. There are vestiges of it still in 
the dmd source code.


But it ran very slooowwly, and was not 
usable for shipped code.


A lot of very capable engineers have been working on this problem 
C has for many decades. If it were solvable, they would have 
solved it by now.


It is kind of "solved", by turning all computers into C machines,

Solaris under SPARC ADI,

https://docs.oracle.com/cd/E53394_01/html/E54815/gqajs.html

Android with MTE,

https://source.android.com/docs/security/test/memory-safety/arm-mte

iOS with PAC (pointer authentication),

https://developer.apple.com/documentation/security/preparing_your_app_to_work_with_pointer_authentication

FreeBSD with CHERI,

https://www.cheribsd.org/

Intel messed up their MPX design, but certainly won't want to be 
left behind.


This basically acknowledges that only bounds and pointer 
checking via hardware memory tagging will fix C-derived issues, 
and that all mitigations thus far have failed one way or another.


Re: Safer Linux Kernel Modules Using the D Programming Language

2023-01-11 Thread Walter Bright via Digitalmars-d-announce
By the way, back in the 80's, I wrote my own pointer checker for my own use 
developing C code. It was immensely useful in flushing bugs out of my code. 
There are vestiges of it still in the dmd source code.


But it ran very slooowwly, and was not usable for 
shipped code.


A lot of very capable engineers have been working on this problem C has for many 
decades. If it were solvable, they would have solved it by now.


Re: Safer Linux Kernel Modules Using the D Programming Language

2023-01-11 Thread Walter Bright via Digitalmars-d-announce

On 1/10/2023 10:49 PM, Siarhei Siamashka wrote:
It's impractical to have this in the ISO standard, but surely not impossible. 
Various C compilers from different vendors implement bounds checking. See:


   * https://bellard.org/tcc/tcc-doc.html#Bounds


This works by constructing a data structure of all the allocated memory, and 
then checking each pointer dereference against it to see whether it points to 
valid data. It sounds like what valgrind does. It's very slow, and wouldn't be 
used in a shipped executable, just as you wouldn't ship valgrind. The shipped 
app remains vulnerable to memory corruption when it receives inputs that were 
never tried while this checking was turned on.




   * https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html


Adds a bunch of runtime checks that you wouldn't want turned on in a shipped 
executable.



   * https://clang.llvm.org/docs/AddressSanitizer.html


Same problem: with a 2x slowdown, you won't use it in a shipped executable.

   * https://learn.microsoft.com/en-us/visualstudio/debugger/how-to-use-native-run-time-checks?view=vs-2022


Not really clear what this does.


So your statement that "C has no mechanism to prevent them" just ignores reality 
and the existing C compilers. If you are comparing the lowest common denominator 
ISO C spec with the vendor-specific DigitalMars D implementation, then this is 
not an honest apples-to-apples comparison.


They all seem to have the same problem - they are only useful when the program 
is under test. When the program is shipped, they're not there.
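
The allocation-tracking approach described above (a data structure of all
allocated memory, consulted on each dereference) can be sketched roughly as
follows. This is an illustration under stated assumptions, not how tcc or any
other tool actually implements it; `tracked_malloc`, `tracked_free`,
`access_is_valid` and the fixed-size table are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_REGIONS 1024  /* arbitrary cap chosen for this sketch */

/* One entry per live heap allocation. */
typedef struct {
    uintptr_t base;
    size_t    size;
    bool      live;
} region;

static region regions[MAX_REGIONS];

/* Allocate memory and record its extent in the table. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        if (!regions[i].live) {
            regions[i].base = (uintptr_t)p;
            regions[i].size = size;
            regions[i].live = true;
            return p;
        }
    }
    free(p);  /* table full: refuse rather than hand out untracked memory */
    return NULL;
}

/* Release memory and drop its table entry. */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        if (regions[i].live && regions[i].base == (uintptr_t)p) {
            regions[i].live = false;
            free(p);
            return;
        }
    }
    assert(!"tracked_free: pointer did not come from tracked_malloc");
}

/* True if the len bytes starting at p lie inside one live allocation. */
bool access_is_valid(const void *p, size_t len)
{
    uintptr_t a = (uintptr_t)p;
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        if (!regions[i].live || a < regions[i].base)
            continue;
        size_t offset = a - regions[i].base;
        if (offset <= regions[i].size && len <= regions[i].size - offset)
            return true;
    }
    return false;
}
```

Every checked access walks the table, which hints at why this style of checking
is slow and tends to stay in test builds.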


The Linux kernel uses the GNU C compiler and recently switched from `-std=gnu89` 
to `-std=gnu11`.


Bounds checking in the Linux kernel is done by 
https://docs.kernel.org/dev-tools/kfence.html or


Being sampling based, this is not good enough.



https://docs.kernel.org/dev-tools/kasan.html


Another test-only tool.

Please don't misunderstand me, these tools are good. But they really have 
nothing to do with the C language specification (which is completely unhelpful 
in resolving this issue), have too high an overhead to be useful in a shipped 
product, and have not stopped C buffer overflows from being the #1 bug in 
shipped software.


I stand by the idea that C's semantics make it impossible. These tools are all 
things layered on top of C, and they certainly help, and I would use them if I 
were developing in C, but they simply do not solve the problem.