On 15/07/17 17:17, Martok wrote:
For example, if I index an array, I know bad things may happen if I don't check
the index beforehand, so I must always do that.

No, you don't always have to do that. That is the whole point of a type system.

That if the compiler makes up the array access somewhere along the way sometimes
no check happens is not very predictable.

Array indexing is just a side-effect. The basic thing is this:

{$r+}
type
  tenum = (ea,eb,ec,ed,ef,eg);
  tsubenum = eb..ef;
  tsubenum2 = ec..ef;
var
  a: tsubenum;
  b: tsubenum2;
begin
  b:=tsubenum2(eg);
  a:=b;
end.

This will never generate a range check error, because the type information states that a tsubenum2 value is always a valid tsubenum value. Array indexing is a special case of this, as semantically the expression you use to index the array is first assigned to the range type of the array.
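To make the array case concrete, here is a minimal sketch that reuses the types from the example above. The names arr and idx are made up for illustration. The index holds a valid value here, so the behaviour is fully defined; the comment only restates the reasoning about why no check is needed.

```pascal
{$r+}
program ArrayIndexSketch;
type
  tenum = (ea,eb,ec,ed,ef,eg);
  tsubenum = eb..ef;
  tsubenum2 = ec..ef;
var
  arr: array[tsubenum] of longint;  { hypothetical array indexed by tsubenum }
  idx: tsubenum2;                   { hypothetical index variable }
begin
  idx:=ec;
  { Semantically, idx is first assigned to the index type tsubenum.
    Every tsubenum2 value is a valid tsubenum value according to the
    type information, so no range check is needed for this access,
    even with $r+. }
  arr[idx]:=1;
  writeln(arr[idx]);
end.
```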

I would assume that this is something that "someone with a solid knowledge of the language" would expect.

and in comparisons that get optimised away at compile time because they will
always have the same result at run time according to the type information.
I've shown that is not the case for the more obvious expressions in the forum
post linked above.
Several different ways of writing the (apparent) tautology "is EnumVar in
Low(EnumType)..High(EnumType)" all handle out-of-range-values (expressly, not as
a side effect of something else).

The in-expression may indeed handle this, but plain comparisons are removed at compile-time:

type
  tsubrange = 6..8;
var
  a: tsubrange;
begin
  a:=tsubrange(10);
  if a>8 then
    writeln('this statement is removed at compile-time, because a > 8 is impossible according to the type information');
end.

It seems we currently do this transformation only for integer subranges and not for enums, but that's a limitation of the implementation rather than something that is done by design. And the principle is the same.
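If the intent is to reject out-of-range input, one way to stay within defined behaviour is to do the comparison while the value still has an integer type, i.e. before it is ever stored in the subrange. The following is only a sketch under that assumption. The variable raw and the use of ParamCount (just to obtain a value that is not known at compile time) are illustrative choices.

```pascal
program CheckBeforeCastSketch;
type
  tsubrange = 6..8;
var
  raw: longint;   { hypothetical unvalidated input }
  a: tsubrange;
begin
  { ParamCount is only used so that the value is not a compile-time constant }
  raw:=10+ParamCount;
  { raw is a longint, so this comparison cannot be removed based on the
    type information of tsubrange }
  if (raw>=low(tsubrange)) and (raw<=high(tsubrange)) then
    begin
      a:=raw;
      writeln('accepted: ',a);
    end
  else
    writeln('rejected: ',raw);
end.
```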

Which is especially noteworthy because with
strict enums, we might as well drop the elseblock entirely and warn "unreachable
code" in these tests.

Indeed, just like the removal of the comparison above generates a warning.

However, FPC does not have the luxury of being the first to define and implement
a new language (well, except for $mode FPC and ObjFPC). There is precedent.

At least the precedent in ISO Pascal (http://www.standardpascal.org/iso7185rules.html) is that you cannot convert anything else to an enum, and hence an enum by design always contains a value that is valid for that type (unless you did not initialise it at all, in which case the result is obviously undefined as well).

And for subranges, it says "It is an error to assign a value outside of the corresponding range to a variable of that type". Using subrange values to calculate something else does promote it to the integer type, but we do that too.
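A small sketch of the promotion mentioned above, with made-up variable names: the addition is evaluated in an integer type, so an intermediate result outside 6..8 is fine as long as it is not assigned back to the subrange.

```pascal
program PromoteSketch;
type
  tsubrange = 6..8;
var
  a: tsubrange;
  n: longint;   { hypothetical integer destination }
begin
  a:=8;
  { the arithmetic is performed in an integer type, not in tsubrange,
    so 13 is a perfectly valid result here }
  n:=a+5;
  writeln(n);
end.
```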

The Extended Pascal standard (http://www.eah-jena.de/~kleine/history/languages/iso-iec-10206-1990-ExtendedPascal.pdf) says that enumeration and subrange types are "non-bindable". This means that they cannot be used with input/output (including files; this avoids the issue you mentioned with reading invalid values from disk). It does not really say much else about enumerated types specifically, but they are of course also ordinal types and for those it says in the section about Assignment-compatibility (6.4.6):

***
A value of type T2 shall be designated assignment-compatible with a type T1 if any of the following six statements is true:
...
d) T1 and T2 are compatible ordinal-types, and the value of type T2 is in the closed interval specified by the type T1.
...
At any place where the rule of assignment-compatibility is used
a) it shall be an error if T1 and T2 are compatible ordinal-types and the value of type T2 is not in the closed interval specified by the type
T1;
***

That seems pretty clear in terms of stating that having a value that is outside the range of a type is an error. And error is defined as:

***
A violation by a program of the requirements of this International Standard that a processor is permitted to leave undetected.
***

I.e., undefined behaviour.

It does say that the "range-type" of a subrange-type is the "host-type", but this range-type is only referenced in very specific contexts, like when defining assignment compatibility (in a non-quoted part of section 6.4.6 above), and when defining how for-loops must behave (which is a place where FPC is in fact in error: https://bugs.freepascal.org/view.php?id=24318 )

And
that precedent is Conclusion 1 of my post above: Enums are handled as a
redefinition of the base type with constants for the names. Some intrinsics
(pred/succ) and the use of the type itself (array[TEnumType], set of) use the
enum-ness for something, most don't. There is nothing undefined.
Do not confuse the additional treatment added by {$R+} with the basic defined
behaviour.

{$r+} can help with detecting when undefined behaviour would otherwise occur, like when assigning a value that is out-of-bounds to a subrange type or an enum. Explicit typecasting disables this aid. It does not remove the undefined behaviour.
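As an illustration of the detection aid, here is a sketch of an assignment without an explicit typecast. It assumes {$mode objfpc} and the sysutils unit, so that the run-time range check error is reported as an ERangeError exception. ParamCount is again only used to keep the value from being a compile-time constant.

```pascal
{$mode objfpc}
{$r+}
program RangeCheckSketch;
uses
  sysutils;
type
  tsubrange = 6..8;
var
  a: tsubrange;
  i: longint;   { hypothetical out-of-range source value }
begin
  i:=10+ParamCount;
  try
    { no explicit typecast, so the range check is performed here }
    a:=i;
    writeln('assigned ',a);
  except
    on ERangeError do
      writeln('range check error detected');
  end;
end.
```

Writing a:=tsubrange(i) instead would skip the check, and the program would then be in the undefined situation described above.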


Jonas
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
