On 6/1/2017 3:26 AM, Steven Schveighoffer wrote:
> On 5/31/17 9:05 PM, Walter Bright wrote:
>> On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
>>> Technically this is a programming error, and a bug. But memory hasn't
>>> actually been corrupted.
>>
>> Since you don't know where the bad index came from, such a conclusion
>> cannot be drawn.
>
> You could say that about any error. You could say that about malformed Unicode strings, malformed JSON data, file not found. In this mindset, everything should be an Error, and nothing should be recoverable.

What's missing here is looking carefully at a program and deciding what are input (and environmental) errors and what are program bugs. The former are recoverable, the latter are not.

For example, take malformed Unicode strings. Joel Spolsky wrote about this issue long ago, arguing that data in a program should be compartmentalized into untrusted and trusted data.

Untrusted data comes from the input, and stays untrusted until it is validated. Malformed untrusted data is recoverable. Once it is validated, it becomes trusted data. Any malformations in trusted data are programming bugs. It should be clear in a well-designed program what data is trusted and what data is untrusted. Spolsky suggests using different types for them so they are distinct.
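A minimal sketch of that separation in D, assuming a hypothetical `Month` wrapper type (the names `Month`, `parse`, and `monthName` are illustrative only, not from any library). Untrusted text is validated in exactly one place, and everything downstream accepts only the validated type:

```d
import std.conv : to;
import std.exception : enforce;

/// Hypothetical wrapper: a month number that has already been validated.
struct Month
{
    // Invariant: 1 <= value <= 12. Note that D's `private` is module-level,
    // so this type should live in its own module to keep the invariant tight.
    private int value;

    /// Validate untrusted text once. Bad input throws an Exception,
    /// which the caller can catch and recover from.
    static Month parse(string untrusted)
    {
        immutable n = untrusted.to!int;   // throws ConvException on non-numeric text
        enforce(n >= 1 && n <= 12, "month out of range: " ~ untrusted);
        return Month(n);
    }

    int get() const { return value; }
}

immutable string[12] monthNames = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
];

/// Accepts only trusted data. An out-of-range index here would be a
/// program bug, so the bounds check acts purely as bug detection.
string monthName(Month m)
{
    return monthNames[m.get - 1];
}
```

With this split, a failure inside `parse` is an input error, while a failure inside `monthName` can only mean the program itself is wrong.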

For your date case, the date was not validated, and was used to index an array, where the invalid value overflowed the array bounds. The program was relying on the array bounds checking to validate the data.

I'd argue this is a problematic program design because:

1. It's inefficient. Data should be validated once in a clear location in the program. Arrays appear all over the place, and tend to be in hot locations. Validating the same data over and over is highly inefficient.

2. Array bounds checking can be turned off by a compiler switch. Program data validation should not be silently disabled in such an unexpected manner.

3. Arrays are a ubiquitous data structure. They are used all over the place. There is no way to distinguish "this is a data validation use" and "this must be valid data".

4. Anyone familiar with D who looks at your code would be surprised to learn that an array access is doing data validation rather than bug checking.

5. Array accesses are sometimes optimized by removing the bounds check. This should not turn off data validation.

6. @safe code is intended to find programming bugs, not validate input data.

7. Just because code is marked @safe doesn't mean memory corruption is impossible. Even if @safe is perfect, programs have @trusted and @system code too, and those may have memory corrupting bugs.

8. It does not distinguish an array overflow caused by programming bugs or corruption from one caused by invalid program input.
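To illustrate the alternative design, here is a rough sketch in which the date is validated explicitly at the input boundary, so the array accesses further in carry no validation duty. All function names are hypothetical, and leap years are ignored to keep it short. Because the check is ordinary code, it keeps working even when bounds checking is disabled (for example with dmd's `-boundscheck=off`):

```d
import std.exception : enforce;

// Days per month for a non-leap year (leap years ignored in this sketch).
immutable int[12] daysInMonth = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

/// Input boundary: the one clear location where untrusted values are checked.
/// Throws Exception (recoverable) on bad input.
void validateDate(int month, int day)
{
    enforce(month >= 1 && month <= 12, "invalid month");
    enforce(day >= 1 && day <= daysInMonth[month - 1], "invalid day");
}

/// Internal code: assumes the date was already validated.
/// An out-of-range index here signals a program bug, not bad input.
int dayOfYear(int month, int day)
{
    int total = day;
    foreach (m; 0 .. month - 1)
        total += daysInMonth[m];
    return total;
}

/// Hypothetical entry point: returns -1 to signal rejected input.
int dayOfYearFromInput(int month, int day)
{
    try
    {
        validateDate(month, day);
    }
    catch (Exception e)
    {
        return -1;   // input error: report it and carry on
    }
    return dayOfYear(month, day);
}
```

Here a bad date never reaches the array at all, so an out-of-bounds access in `dayOfYear` can be treated as a bug without ambiguity.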
