Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread DaWorm
I store record data in files with a checksum (usually a CRC). I block read
them into an array buffer and verify the checksum.  If it passes, I assign
via typecast the array buffer to a variable of the record type.  If I'm the
only one reading and writing the files that is usually enough to handle
drive bit rot, or transfer errors.

If someone else's code can write the data, I validate everything either when
reading and assigning to the record type, or occasionally before use.  Sure,
it's slow, but it's the only safe thing to do. I wouldn't think of abdicating
that responsibility to the compiler.
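
A minimal sketch of that read-verify-convert pattern, under stated
assumptions: the record layout (TDiskRec), the additive checksum standing in
for a CRC, and all names are hypothetical, and the buffer is filled in memory
instead of via BlockRead to keep the example self-contained.

```pascal
program ChecksumRead;
{$mode objfpc}

type
  TKind = (kNone, kFirst, kSecond);

  // Hypothetical on-disk layout; packed so there is no padding.
  TDiskRec = packed record
    Kind: Byte;      // raw byte, converted to TKind only after validation
    Value: LongInt;
    Sum: Byte;       // stand-in checksum (real code would use a CRC)
  end;

// Additive checksum over the first Len bytes of Buf.
function SimpleSum(const Buf: array of Byte; Len: Integer): Byte;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to Len - 1 do
    Result := (Result + Buf[i]) and $FF;
end;

// Returns True only if size, checksum and enum range are all acceptable.
function LoadRec(const Buf: array of Byte; out Rec: TDiskRec;
  out Kind: TKind): Boolean;
begin
  Result := False;
  Kind := kNone;
  FillChar(Rec, SizeOf(Rec), 0);
  if Length(Buf) <> SizeOf(TDiskRec) then Exit;
  if SimpleSum(Buf, SizeOf(TDiskRec) - 1) <> Buf[High(Buf)] then Exit;
  Move(Buf[0], Rec, SizeOf(TDiskRec));
  // Validate the raw ordinal BEFORE it is stored in an enum variable.
  if Rec.Kind > Ord(High(TKind)) then Exit;
  Kind := TKind(Rec.Kind);
  Result := True;
end;

var
  Buf: array[0..SizeOf(TDiskRec) - 1] of Byte;
  Rec: TDiskRec;
  Kind: TKind;
begin
  FillChar(Buf, SizeOf(Buf), 0);
  Buf[0] := Ord(kSecond);
  Buf[High(Buf)] := SimpleSum(Buf, SizeOf(TDiskRec) - 1);
  Writeln('good block accepted: ', LoadRec(Buf, Rec, Kind));

  Buf[0] := 200;  // checksum is valid, but the enum ordinal is not
  Buf[High(Buf)] := SimpleSum(Buf, SizeOf(TDiskRec) - 1);
  Writeln('bad enum accepted:   ', LoadRec(Buf, Rec, Kind));
end.
```

The point of the design is that the checksum only guards against corruption;
a block written by other code can pass it and still carry an out-of-range
ordinal, so the range test is done on the raw byte.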

Jeff

On Jul 2, 2017 4:50 PM, "Marco van de Voort"  wrote:

> In our previous episode, Florian Klämpfl said:
> > On 02.07.2017 at 21:40, Martok wrote:
> > > Honestly, I still don't understand why we're even having this
> > > discussion.
> >
> > Because it is a fundamental question: whether any defined behavior is
> > possible if a variable contains an invalid value. I consider a value
> > outside of the declared range invalid; if it shall be valid, change
> > the declaration of the type.
>
> _AND_ remove types that can't have reasonably cheap range checks like
> sparse enums? :-)
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Marco van de Voort
In our previous episode, Florian Klämpfl said:
> On 02.07.2017 at 21:40, Martok wrote:
> > Honestly, I still don't understand why we're even having this discussion.
>
> Because it is a fundamental question: whether any defined behavior is
> possible if a variable contains an invalid value. I consider a value
> outside of the declared range invalid; if it shall be valid, change
> the declaration of the type.

_AND_ remove types that can't have reasonably cheap range checks like sparse
enums? :-)


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Tomas Hajny
On Sun, July 2, 2017 18:39, Michael Van Canneyt wrote:
> On Sun, 2 Jul 2017, Tomas Hajny wrote:
>
>>> By declaring it as a File of Enum, you are telling the compiler that it
>>> contains only valid enums.
>>
>> No one can ever ensure that a file doesn't get corrupted / tampered with
>> on a storage medium.
>
> No-one can ensure a memory location cannot get corrupted either.

I don't think this is true. The operating system should ensure that no
other process corrupts memory locations exclusively used by my program, and
I should make sure that my own program doesn't corrupt them itself.

A file is usually not protected for exclusive use by your own program,
unless it's created by the same program in a locked state and later read
again (still locked) during the same run of that program - let's say that
this pattern isn't a typical use of files, right?


>> Moreover, using the same Read for reading from a text file _does_
>> perform such checks (e.g. when using Read for reading an integer from
>> a text file, the value read is validated as to whether it conforms to the
>> given type, and potential failures are signalled either as an RTE or
>> a non-zero IOResult depending on the $I state).
>
> Text files by definition are not type safe. The compiler cannot know what
> they contain.

I'm not talking about the compiler here, but about the RTL.


> By using file of enum (or any data type), you are explicitly telling the
> compiler it is OK.

There isn't much difference between telling the compiler that all values
in a certain file are of a certain type and telling the compiler that the
next value read from that file (text in this case) will conform to a certain
type. Both are typed; both should provide means for ensuring type safety
while loading the value into the variable of the given type.

Note that I'm not talking about typecasting here, of course; that's something
completely different (and manual checking is absolutely appropriate
there).

Tomas




Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Florian Klämpfl
On 02.07.2017 at 21:40, Martok wrote:
> Honestly, I still don't understand why we're even having this discussion.

Because it is a fundamental question: whether any defined behavior is
possible if a variable contains an invalid value. I consider a value outside
of the declared range invalid; if it shall be valid, change the declaration
of the type.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Tomas Hajny
On Sun, July 2, 2017 19:15, Marco van de Voort wrote:
> In our previous episode, Tomas Hajny said:
>> > Worse, tying it to range check would then have heaps of redundant
>> > checking everywhere, not just enums.
>>
>> True. That's why I believe that Read from a (typed) file should perform
>> such validation - but it doesn't at the moment, as mentioned in my
>> e-mail in the other thread. :-(
>
> That slows down needlessly IMHO. Or do you mean only in $R+?

$R+ would be sufficient from my point of view, but I'm not sure if that is
possible, because $R+ is usually in effect at the place of declaration and
the checks would probably need to happen inside the Read implementation
(which is already compiled at that point in time). Unlike with $I, there's
probably no way for $R to provide feedback to the caller which may be used
for checks around the call.


> Most will blockread records anyway.

That's exactly the point. If someone uses BlockRead/BlockWrite for higher
performance and thus uses untyped access, he/she has to perform manual
checks as appropriate. If someone decides to use typed files, he/she
probably prefers type safety over performance, but doesn't get either at
the moment. :-(
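
A sketch of the manual check being described, under assumptions: the file
name and the TColor type are hypothetical, and the data is read as raw bytes
(not through a "file of TColor") so that validation happens before any value
reaches an enum variable.

```pascal
program ReadEnumChecked;
{$mode objfpc}

type
  TColor = (cRed, cGreen, cBlue);

// Converts a raw byte to TColor only if it is a declared ordinal.
function TryByteToColor(B: Byte; out C: TColor): Boolean;
begin
  Result := B <= Ord(High(TColor));
  if Result then
    C := TColor(B)
  else
    C := Low(TColor);
end;

var
  F: file of Byte;
  B: Byte;
  C: TColor;
begin
  // Write one valid and one invalid ordinal to a scratch file.
  Assign(F, 'colors.tmp');
  Rewrite(F);
  B := 1;   Write(F, B);
  B := 200; Write(F, B);
  Close(F);

  // Read the raw bytes back and validate each one before converting.
  Reset(F);
  while not Eof(F) do
  begin
    Read(F, B);
    if TryByteToColor(B, C) then
      Writeln('valid: ', Ord(C))
    else
      Writeln('rejected raw value ', B);
  end;
  Close(F);
  Erase(F);
end.
```

Checking the byte rather than the enum variable avoids relying on how the
compiler treats an enum that already holds an out-of-range value, which is
the very point under dispute in this thread.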

Tomas




Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
On 02.07.2017 at 19:47, Florian Klämpfl wrote:
> On 02.07.2017 at 19:29, Martok wrote:
>>type Percentile = 1..99;
>>var I: Percentile;
>>begin
>>  I:= 99;
>>  inc(I);   // I is now 100
> 
> Forgot to mention:
> Tried with $r+ :)?
That case is also documented. RTE in {$R+}, legal in {$R-}. That also means that
while you could make assumptions about the content in {$R+} (Delphi does not*),
you definitely cannot as soon as there is a single write in {$R-}. A C++
compiler could probably try tracing that using constness of variables and
parameters, but we cannot, and so must be defensive.

*) Even FPC makes no such assumptions in all other instances!

type
  TF = 1..25;
var
  t: TF;
begin
  t:= TF(200);
  if t in [1..50] then  // tautology!
    Writeln('a')
  else
    writeln('b');

What does that print?
Yeah. As documented.
Check the codegen in R+: the if is still fully generated.
Only tcgcasenode does something else.
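
One defensive pattern that sidesteps the question entirely, sketched here as
an assumption and not as anything proposed in the thread: dispatch on the
integer ordinal instead of on the enum-typed variable, so that the else
branch is an ordinary integer case. The names TState and Describe are
hypothetical.

```pascal
program CaseOnOrdinal;
{$mode objfpc}

type
  TState = (sIdle, sRunning, sDone);

// The selector is a plain LongInt, so every value outside the labels
// falls into the else branch by the usual integer case semantics.
function Describe(Raw: LongInt): string;
begin
  case Raw of
    Ord(sIdle):    Result := 'idle';
    Ord(sRunning): Result := 'running';
    Ord(sDone):    Result := 'done';
  else
    Result := 'invalid';
  end;
end;

begin
  Writeln(Describe(Ord(sRunning)));
  Writeln(Describe(200));
end.
```

This only helps where the untrusted value is still available as an integer,
for example directly after reading it from a file or a foreign library.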


Honestly, I still don't understand why we're even having this discussion.
We're not talking about adding a new check - only not leaving one out that is
already there 99% of the time.
We're not talking about standardising some new behaviour - Borland did that
decades ago.
The correct behaviour is already documented in every Pascal language reference
(partly including our own), and is also the intuitive one.

I just don't get it. Why would you sacrifice the runtime safety, or, if you
prefer, the code compatibility, of your compiler over an (arguably wrong in at
least 2 modes) specific technicality of the type system that is adhered to
nowhere else?


Taking a break for now. Grading a thesis starts to sound like good relaxation.

Kind regards,

Martok







Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
On 02.07.2017 at 20:29, Ondrej Pokorny wrote:
> On 02.07.2017 20:23, Florian Klämpfl wrote:
>> And the compiler writes no warning during compilation?
> 
> It does indeed.
But about something else.
Can we please stop derailing from the main issue here?


> If we get a convenient way to assign ordinal to enum with range checks, 
> everything will be fine :)
No, it will not; we still can no longer elegantly pass/receive enums to/from
libraries from other compilers.
But at least it would be defined then, so programmers would know this is an
incompatibility.




Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Ondrej Pokorny

On 02.07.2017 20:23, Florian Klämpfl wrote:
> And the compiler writes no warning during compilation?

It does indeed.

On 02.07.2017 20:18, Florian Klämpfl wrote:
> Yes, undefined behavior.

I think I got your point :) You are right, sorry for wasting your time.

If we get a convenient way to assign ordinal to enum with range checks, 
everything will be fine :)


Ondrej


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Florian Klämpfl
On 02.07.2017 at 19:55, Ondrej Pokorny wrote:
> On 02.07.2017 19:29, Martok wrote:
>>  - Case statements execute *precisely one* of their branches: the
>> statements of the matching case label, or the else block otherwise
> 
> To support your argument, the current Delphi documentation says the same:
> 
> http://docwiki.embarcadero.com/RADStudio/Tokyo/en/Declarations_and_Statements
> 
> /Whichever caseList has a value equal to that of selectorExpression
> determines the statement to be used. If none of the caseLists has the same
> value as selectorExpression, then the statements in the else clause (if
> there is one) are executed./
> 
> According to Delphi documentation, invalid values should point to the else
> clause.
> 
> Furthermore, it is OK to use invalid values in caseList as well:
> 
> program Project1;
> 
> {$APPTYPE CONSOLE}
> 
> type
>   TMyEnum = (one, two);
> 
> {$R+}
> var
>   E: TMyEnum;
> begin
>   E := TMyEnum(-1);
>   case E of
>     one, two: Writeln('valid');
>     TMyEnum(-1): Writeln('minus one');
>   else
>     Writeln('invalid');
>   end;
> end.
> 
> The program above writes 'minus one' in Delphi.

And the compiler writes no warning during compilation?



Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Florian Klämpfl
On 02.07.2017 at 20:12, Martok wrote:
>> They are:
>> http://docwiki.embarcadero.com/Libraries/XE5/en/System.Boolean
> That prototype is a recent invention, it wasn't there in older versions. Also

*sigh* This has been the case since Pascal was ISO-standardized.

> the text sounds quite different somewhere else:
> http://docwiki.embarcadero.com/RADStudio/XE5/en/Simple_Types#Boolean_Types
> 
>> Yes. What I wanted to point out: Delphi also does optimizations on enums
>> which fail if one feeds invalid values.
> Okay, if you want to believe that Booleans are enums:

I do not believe, I know.

> 
>   b:=boolean(42);
>   if not b then
>     writeln('falsy')
>   else
>     writeln('truthy');
> 
> Prints truthy. Doesn't crash.

Yes, undefined behavior.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Ondrej Pokorny

On 02.07.2017 19:39, Florian Klämpfl wrote:
> So this means:
>
> var
>    b : boolean;
>
> begin
>    b:=boolean(3);
>    if b then
>      writeln(true)
>    else if not(b) then
>      writeln(false)
>    else
>      writeln(ord(b));
> end.
>
> writes 3 in delphi?

IMO you picked up a Delphi compiler bug/undocumented feature (call it what 
you want). "if boolean(3) then A" executes A contrary to the 
documentation - the docs say something different than the compiler does. 
You should not use it as an argument but create an issue report on 
Embarcadero's Quality Central so that they either fix the documentation 
or fix the compiler.


Whereas:
case boolean(3) of
  True: A;
  False: B;
else
  C;
end;

is documented to execute C and the compiler executes C => good.

Ondrej


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
> They are:
> http://docwiki.embarcadero.com/Libraries/XE5/en/System.Boolean
That prototype is a recent invention, it wasn't there in older versions. Also
the text sounds quite different somewhere else:
http://docwiki.embarcadero.com/RADStudio/XE5/en/Simple_Types#Boolean_Types

> Yes. What I wanted to point out: Delphi also does optimizations on enums
> which fail if one feeds invalid values.
Okay, if you want to believe that Booleans are enums:

  b:=boolean(42);
  if not b then
    writeln('falsy')
  else
    writeln('truthy');

Prints truthy. Doesn't crash.





Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Florian Klämpfl
On 02.07.2017 at 19:51, Martok wrote:
> Booleans are not enums in Delphi (not even ordinals), 

They are:
http://docwiki.embarcadero.com/Libraries/XE5/en/System.Boolean

> but their own little
> thing. "if boolean_expr" is always a jz/jnz, no matter what. 

Yes. This is an optimization which is invalid as well if I follow your
argument. Boolean(3)<>true.

> They are defined as
> 0=FALSE and "everything else"=TRUE

No, see link above.

> 
> However:
> 
> var
>   b : boolean;
> begin
>   b:=boolean(3);
>   if b = True then
>     writeln(true)
>   else if b = False then
>     writeln(false)
>   else
>     writeln(ord(b));
> end.
> 
> That writes 3, 

Yes. What I wanted to point out: Delphi also does optimizations on enums
which fail if one feeds invalid values.

> which is why you should never compare on the boolean lexicals.
> Some Winapi functions returning longbool rely on that.

No, longbool is something different (even bytebool is).

> 
> Wait, that was a trick question, wasn't it?

In the sense of pointing out that Delphi also assumes enumeration variables
always contain valid values.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Ondrej Pokorny

On 02.07.2017 19:29, Martok wrote:
>  - Case statements execute *precisely one* of their branches: the statements
> of the matching case label, or the else block otherwise


To support your argument, the current Delphi documentation says the same:

http://docwiki.embarcadero.com/RADStudio/Tokyo/en/Declarations_and_Statements

/Whichever caseList has a value equal to that of selectorExpression 
determines the statement to be used. If none of the caseLists has the 
same value as selectorExpression, then the statements in the else clause 
(if there is one) are executed./


According to Delphi documentation, invalid values should point to the 
else clause.


Furthermore, it is OK to use invalid values in caseList as well:

program Project1;

{$APPTYPE CONSOLE}

type
  TMyEnum = (one, two);

{$R+}
var
  E: TMyEnum;
begin
  E := TMyEnum(-1);
  case E of
    one, two: Writeln('valid');
    TMyEnum(-1): Writeln('minus one');
  else
    Writeln('invalid');
  end;
end.

The program above writes 'minus one' in Delphi.

Ondrej



Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
Booleans are not enums in Delphi (not even ordinals), but their own little
thing. "if boolean_expr" is always a jz/jnz, no matter what. They are defined as
0=FALSE and "everything else"=TRUE

However:

var
  b : boolean;
begin
  b:=boolean(3);
  if b = True then
    writeln(true)
  else if b = False then
    writeln(false)
  else
    writeln(ord(b));
end.

That writes 3, which is why you should never compare on the boolean lexicals.
Some Winapi functions returning longbool rely on that.

Wait, that was a trick question, wasn't it?


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Florian Klämpfl
On 02.07.2017 at 19:29, Martok wrote:
>type Percentile = 1..99;
>var I: Percentile;
>begin
>  I:= 99;
>  inc(I);   // I is now 100

Forgot to mention:
Tried with $r+ :)?

>So if this is a legal statement, 

Well, it is a matter of definition whether a statement causing an RTE 201
when compiled with $r+ is a legal statement ...




Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Florian Klämpfl
On 02.07.2017 at 19:29, Martok wrote:
> Addendum to this:
> 
>> This was also always my intuition that the else block is also triggered for
>> invalid enum values (the docs even literally say that, "If none of the case
>> constants match the expression value") - and it *is* true in Delphi.
> There is a reason why this is true in Delphi: because this is the way it has
> been documented in Borland products for at least 25 years!
> 
> I have checked with the TP7 language reference (it pays to keep books around),
> which defines the following things:
>  - Enumeration element names are implicitly defined as typed constants of
>    their enum type
>  - The enum type is either Byte (<=256 elements) or Word.
>  - Subrange types are defined as the smallest type that can contain their
>    range
>  - Case statements execute the statements of the matching case label, or the
>    else block otherwise
> 
> Note that they actually defined enumerations as what I called 'fancy
> constants' before.
> 
> 
> The Delphi 4 language reference (also in book form, which is a bit more
> detailed than what is in the .hlp files) uses more precise language:
>  - Enumeration element names are implicitly defined as typed constants of
>    their enum type
>  - The enum type is either Byte, Word, or Longword, depending on $Z and
>    element count
>  - Subrange types are defined as the smallest type that can contain their
>    range
>  - it is legal to inc/dec outside of a subrange, example from the book:
>      type Percentile = 1..99;
>      var I: Percentile;
>      begin
>        I:= 99;
>        inc(I);   // I is now 100
>    So if this is a legal statement, subrange types can contain values
>    outside of their range. The description in the German version is "Die
>    Variable wird in ihren Basistyp umgewandelt", the variable becomes its
>    base type.
>  - Case statements execute *precisely one* of their branches: the statements
>    of the matching case label, or the else block otherwise
> 
> So, in D4, we have enums as fancy constants, subrange-types are not safe (so
> enums can also never be), and case statements cannot fail.
> 
> 
> FPC's language reference has no formal definition of what enums or subranges
> really are, and the same language as TP7 regarding case statements.
> 
> 
> So at least in modes TP and DELPHI, the optimisation in question is formally
> wrong.

So this means:

var
  b : boolean;

begin
  b:=boolean(3);
  if b then
    writeln(true)
  else if not(b) then
    writeln(false)
  else
    writeln(ord(b));
end.

writes 3 in delphi?



Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
Addendum to this:

> This was also always my intuition that the else block is also triggered for
> invalid enum values (the docs even literally say that, "If none of the case
> constants match the expression value") - and it *is* true in Delphi.
There is a reason why this is true in Delphi: because this is the way it has
been documented in Borland products for at least 25 years!

I have checked with the TP7 language reference (it pays to keep books around),
which defines the following things:
 - Enumeration element names are implicitly defined as typed constants of their
enum type
 - The enum type is either Byte (<=256 elements) or Word.
 - Subrange types are defined as the smallest type that can contain their range
 - Case statements execute the statements of the matching case label, or the
else block otherwise

Note that they actually defined enumerations as what I called 'fancy constants'
before.


The Delphi 4 language reference (also in book form, which is a bit more detailed
than what is in the .hlp files) uses more precise language:
 - Enumeration element names are implicitly defined as typed constants of their
enum type
 - The enum type is either Byte, Word, or Longword, depending on $Z and element
count
 - Subrange types are defined as the smallest type that can contain their range
 - it is legal to inc/dec outside of a subrange, example from the book:
   type Percentile = 1..99;
   var I: Percentile;
   begin
     I:= 99;
     inc(I);   // I is now 100
   So if this is a legal statement, subrange types can contain values outside of
their range. The description in the German version is "Die Variable wird in
ihren Basistyp umgewandelt", the variable becomes its base type.
 - Case statements execute *precisely one* of their branches: the statements of
the matching case label, or the else block otherwise

So, in D4, we have enums as fancy constants, subrange-types are not safe (so
enums can also never be), and case statements cannot fail.


FPC's language reference has no formal definition of what enums or subranges
really are, and the same language as TP7 regarding case statements.


So at least in modes TP and DELPHI, the optimisation in question is formally 
wrong.
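
The storage-size points in the lists above can be inspected directly. The
following is only a sketch for Free Pascal: the type names are hypothetical,
and it assumes the {$PACKENUM} directive (FPC's counterpart to Delphi's $Z)
behaves as documented, in which case TSmall should occupy 1 byte and TWide
4 bytes.

```pascal
program EnumSizes;
{$mode objfpc}

type
  {$PACKENUM 1}
  TSmall = (a, b, c);       // requested minimum storage: 1 byte
  {$PACKENUM 4}
  TWide = (x, y, z);        // requested minimum storage: 4 bytes
  TPercentile = 1..99;      // subrange; size is chosen by the compiler

begin
  Writeln('SizeOf(TSmall)      = ', SizeOf(TSmall));
  Writeln('SizeOf(TWide)       = ', SizeOf(TWide));
  Writeln('SizeOf(TPercentile) = ', SizeOf(TPercentile));
end.
```

Whatever the sizes turn out to be, the storage can hold more bit patterns
than the type declares, which is why out-of-range values are physically
possible in the first place.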



Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Marco van de Voort
In our previous episode, Tomas Hajny said:
> > Worse, tying it to range check would then have heaps of redundant checking
> > everywhere, not just enums.
> 
> True. That's why I believe that Read from a (typed) file should perform
> such validation - but it doesn't at the moment, as mentioned in my e-mail
> in the other thread. :-(

That slows things down needlessly IMHO. Or do you mean only in $R+?

Most will blockread records anyway.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Ondrej Pokorny

On 02.07.2017 18:49, Jonas Maebe wrote:
> No, there is no built-in checked conversion from integer to arbitrary 
> enumeration types. That's why I suggested in the bug report that 
> started this thread to file a feature request for such a conversion.

Very good :)

>> Are there any disadvantages of the enum-AS operator that prevent its 
>> introduction?
>
> Someone else could already have code that overloads the AS-operator in 
> this way, and such a change would break this (you cannot overload 
> operators with a built-in meaning). I would be in favour of a new 
> intrinsic.

If I am not mistaken, the AS operator cannot be overloaded - so no 
chance to break legacy code here.


Ondrej


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Jonas Maebe

On 02/07/17 18:43, Ondrej Pokorny wrote:
> Thanks, so there is no enumeration range checking from the compiler at 
> all :/

Yes, there is range checking for enums. No, there is no built-in checked 
conversion from integer to arbitrary enumeration types. That's why I 
suggested in the bug report that started this thread to file a feature 
request for such a conversion.

> Are there any disadvantages of the enum-AS operator that prevent its 
> introduction?

Someone else could already have code that overloads the AS-operator in 
this way, and such a change would break this (you cannot overload 
operators with a built-in meaning). I would be in favour of a new intrinsic.



Jonas


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Ondrej Pokorny

On 02.07.2017 18:28, Jonas Maebe wrote:
> On 02/07/17 18:26, Ondrej Pokorny wrote:
>> Allow me a stupid question: how to convert an integer to enum with 
>> range checking?
>
> The current possibilities and possible improvements have been 
> mentioned elsewhere in this thread already
>
> * http://lists.freepascal.org/pipermail/fpc-devel/2017-July/038013.html
> * http://lists.freepascal.org/pipermail/fpc-devel/2017-July/038014.html

Thanks, so there is no enumeration range checking from the compiler at 
all :/ Everything has to be done manually :/


1.) if (I>=Ord(Low(TMyEnum))) and (I<=Ord(High(TMyEnum))) then

It's long and ugly, and it is manual checking that the $RANGECHECKS 
directive has no effect on. (Yes, I use it in my code.)


2.) function ValueInEnumRange(TypeInfo : PTypeInfo; AValue : Integer) : 
boolean;


This still involves manual checking.

Another problem: RTTI is not generated for enums with explicit indexes, 
if I am not mistaken: TEnum = (two = 2, four = 4).


---

IMO FPC/Pascal lacks an assignment operator for enums with range 
checking. Something like:


EnumValue := IntegerValue as TEnum;

Are there any disadvantages of the enum-AS operator that prevent its 
introduction?
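
Until something like that exists, a checked conversion has to be written per
type. A minimal sketch, in which the helper names IntToMyEnum and IntToSparse
and the TSparse type are hypothetical: contiguous enums can use the Low/High
test from 1.) above, while enums with explicit values need their members
listed, since Low..High would also accept the gaps.

```pascal
program CheckedEnumCast;
{$mode objfpc}
uses
  SysUtils;

type
  TMyEnum = (one, two);
  TSparse = (sTwo = 2, sFour = 4);   // enum with explicit values

// Contiguous enum: a Low/High comparison on the integer is sufficient.
function IntToMyEnum(I: Integer): TMyEnum;
begin
  if (I < Ord(Low(TMyEnum))) or (I > Ord(High(TMyEnum))) then
    raise ERangeError.CreateFmt('%d is not a valid TMyEnum', [I]);
  Result := TMyEnum(I);
end;

// Sparse enum: list the declared members explicitly.
function IntToSparse(I: Integer): TSparse;
begin
  case I of
    Ord(sTwo), Ord(sFour): Result := TSparse(I);
  else
    raise ERangeError.CreateFmt('%d is not a valid TSparse', [I]);
  end;
end;

var
  E: TMyEnum;
  S: TSparse;
begin
  E := IntToMyEnum(1);
  Writeln('accepted TMyEnum ordinal ', Ord(E));
  try
    E := IntToMyEnum(2);
  except
    on Ex: ERangeError do
      Writeln('rejected: ', Ex.Message);
  end;

  S := IntToSparse(4);
  Writeln('accepted TSparse ordinal ', Ord(S));
  try
    S := IntToSparse(3);   // 3 lies between the declared values 2 and 4
  except
    on Ex: ERangeError do
      Writeln('rejected: ', Ex.Message);
  end;
end.
```

Both helpers test the integer before the typecast, so the checks work
regardless of the $R state and do not depend on RTTI being available.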


Ondrej


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Michael Van Canneyt



On Sun, 2 Jul 2017, Tomas Hajny wrote:

>> By declaring it as a File of Enum, you are telling the compiler that it
>> contains only valid enums.
>
> No one can ever ensure that a file doesn't get corrupted / tampered with
> on a storage medium.

No-one can ensure a memory location cannot get corrupted either.

> Moreover, using the same Read for reading from a text file _does_ perform
> such checks (e.g. when using Read for reading an integer from a text file,
> the value read is validated as to whether it conforms to the given type, and
> potential failures are signalled either as an RTE or a non-zero IOResult
> depending on the $I state).

Text files by definition are not type safe. The compiler cannot know what
they contain.

By using file of enum (or any data type), you are explicitly telling the 
compiler it is OK.
The only exception is reference counted types; the compiler will forbid you
to define

myrecord = record
  a : ansistring;
  b : integer;
end;

f = file of myrecord;
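
For contrast, a sketch of a record that can be used in a typed file: the
reference counted ansistring is swapped for a fixed-length short string
(string[20]), which is stored inline in the record. The file name and the
field values are hypothetical.

```pascal
program TypedFileRecord;
{$mode objfpc}

type
  TMyRecord = record
    a: string[20];   // fixed-length short string, stored inline
    b: Integer;
  end;
  TMyFile = file of TMyRecord;

var
  F: TMyFile;
  R: TMyRecord;
begin
  // Zero the record first so padding and unused string bytes are defined.
  FillChar(R, SizeOf(R), 0);
  R.a := 'hello';
  R.b := 42;

  Assign(F, 'records.tmp');
  Rewrite(F);
  Write(F, R);
  Close(F);

  // Read the record back through the same typed file.
  FillChar(R, SizeOf(R), 0);
  Reset(F);
  Read(F, R);
  Close(F);
  Erase(F);

  Writeln(R.a, ' ', R.b);
end.
```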

Michael.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Jonas Maebe

On 02/07/17 18:26, Ondrej Pokorny wrote:
> On 02.07.2017 18:20, Jonas Maebe wrote:
>> Range checking code is generated for operations involving enums if, 
>> according to the type system, the enum can be out of range. Just like 
>> with integer sub-range types.
>
> Allow me a stupid question: how to convert an integer to enum with range 
> checking?

The current possibilities and possible improvements have been mentioned 
elsewhere in this thread already

* http://lists.freepascal.org/pipermail/fpc-devel/2017-July/038013.html
* http://lists.freepascal.org/pipermail/fpc-devel/2017-July/038014.html


Jonas


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Ondrej Pokorny

On 02.07.2017 18:20, Jonas Maebe wrote:
> Range checking code is generated for operations involving enums if, 
> according to the type system, the enum can be out of range. Just like 
> with integer sub-range types.


Allow me a stupid question: how to convert an integer to enum with range 
checking? A cast does not generate range checking, if I am not mistaken:


program Project1;

type
  TEnum = (one, two);

{$R+}
var
  I: Integer;
  E: TEnum;
begin
  I := 2;
  E := TEnum(I); // <<< I want a range check error here
  Writeln(Ord(E));
end.

Ondrej


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Jonas Maebe

On 02/07/17 11:59, Yury Sidorov wrote:
> Indeed, I've done some tests and found out that when range checking is 
> enabled enums are not checked at all. Even array access with enum index 
> is not checked.
>
> According to docs enums should be range checked:
> https://www.freepascal.org/docs-html/prog/progsu65.html#x72-710001.2.65


Range checking code is generated for operations involving enums if, 
according to the type system, the enum can be out of range. Just like 
with integer sub-range types.


E.g., this generates a range check error:

{$r+}
type
  tenum = (ea,eb,ec,ed);
  tsubenum = eb..ec;
var
  arr: array[tsubenum] of byte;
  index: tenum;
begin
  index:=ed;
  writeln(arr[index]);
end.
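
A related sketch, under the assumption that the same rule applies to a plain
assignment from the full enum to its subrange: with {$R+} and the SysUtils
unit in use, the range check failure should surface as an ERangeError
exception that can be caught. The function name Narrow is hypothetical.

```pascal
program SubrangeCheck;
{$mode objfpc}
{$R+}
uses
  SysUtils;

type
  tenum = (ea, eb, ec, ed);
  tsubenum = eb..ec;

// tenum is wider than tsubenum, so this assignment is range checked.
function Narrow(E: tenum): tsubenum;
begin
  Result := E;
end;

begin
  Writeln('eb narrows to ordinal ', Ord(Narrow(eb)));
  try
    Writeln(Ord(Narrow(ed)));
  except
    on Ex: ERangeError do
      Writeln('range check error caught for ed');
  end;
end.
```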


Jonas


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Tomas Hajny
On Sun, July 2, 2017 17:05, Michael Van Canneyt wrote:
> On Sun, 2 Jul 2017, Tomas Hajny wrote:
>
>> On Sun, July 2, 2017 16:48, Marco van de Voort wrote:
>>> In our previous episode, Martok said:
>>>> It is really hard to write code that interacts with the outside world
>>>> without having a validation problem.
>>>
>>> Then you are arguing wrong. Then you don't need validation everywhere, but
>>> something you can call to simply confirm an enum has correct values
>>> after reading.
>>>
>>> It is not logical to have heaps of checks littered everywhere if the
>>> corruption happens in a defined place (on load).
>>>
>>> Worse, tying it to range check would then have heaps of redundant
>>> checking everywhere, not just enums.
>>
>> True. That's why I believe that Read from a (typed) file should perform
>> such validation - but it doesn't at the moment, as mentioned in my
>> e-mail in the other thread. :-(
>
> IMHO it should not.
>
> By declaring it as a File of Enum, you are telling the compiler that it
> contains only valid enums.

No one can ever ensure that a file doesn't get corrupted / tampered with
on a storage medium. In other words, you cannot check the assumption
mentioned above any earlier than while reading the file. By this logic, typed
files could never be used in any program, because no one could ever ensure
that these files conform to their stated type before their contents enter
a variable of the declared type (and they should be validated before that
point, because that's exactly the point at which the compiler and possibly
also your program start assuming type safety).

Moreover, using the same Read for reading from a text file _does_ perform
such checks (e.g. when using Read for reading an integer from a text file,
the value read is validated as to whether it conforms to the given type, and
potential failures are signalled either as an RTE or as a non-zero IOResult,
depending on the $I state).
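
A minimal sketch of the text-file behaviour described above, assuming a
scratch file named demo.txt in the current directory (the file name and
its content are made up for illustration). With I/O checking switched off,
a failed numeric read shows up as a non-zero IOResult instead of a
run-time error:

```pascal
program textreadcheck;
{$mode objfpc}

var
  F: Text;
  N: LongInt;
  Code: Word;
begin
  { Create a scratch text file whose content is not a number. }
  Assign(F, 'demo.txt');
  Rewrite(F);
  WriteLn(F, 'notanumber');
  Close(F);

  Reset(F);
  {$I-}                { I/O checking off: errors are reported via IOResult }
  ReadLn(F, N);
  Code := IOResult;    { must be fetched right after the operation }
  {$I+}                { I/O checking back on }
  Close(F);
  Erase(F);            { remove the scratch file again }

  if Code <> 0 then
    WriteLn('read failed, IOResult = ', Code)
  else
    WriteLn('read ', N);
end.
```

With {$I+} left on for the ReadLn, the same input would terminate the
program with a run-time error rather than setting IOResult.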

Tomas




Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Michael Van Canneyt



On Sun, 2 Jul 2017, Tomas Hajny wrote:

> On Sun, July 2, 2017 16:48, Marco van de Voort wrote:
>> In our previous episode, Martok said:
>>> It is really hard to write code that interacts with the outside world
>>> without having a validation problem.
>>
>> Then you are arguing wrong. Then you don't need validation everywhere, but
>> something you can call to simply confirm an enum has correct values after
>> reading.
>>
>> It is not logical to have heaps of checks littered everywhere if the
>> corruption happens in a defined place (on load).
>>
>> Worse, tying it to range check would then have heaps of redundant checking
>> everywhere, not just enums.
>
> True. That's why I believe that Read from a (typed) file should perform
> such validation - but it doesn't at the moment, as mentioned in my e-mail
> in the other thread. :-(

IMHO it should not.

By declaring it as a File of Enum, you are telling the compiler that it
contains only valid enums.


Michael.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Tomas Hajny
On Sun, July 2, 2017 16:48, Marco van de Voort wrote:
> In our previous episode, Martok said:
>> It is really hard to write code that interacts with the outside world
>> without
>> having a validation problem.
>
> Then you are arguing wrong. Then you don't need validation everywhere, but
> something you can call to simply confirm an enum has correct values after
> reading.
>
> It is not logical to have heaps of checks littered everywhere if the
> corruption happens in a defined place (on load).
>
> Worse, tying it to range check would then have heaps of redundant checking
> everywhere, not just enums.

True. That's why I believe that Read from a (typed) file should perform
such validation - but it doesn't at the moment, as mentioned in my e-mail
in the other thread. :-(

Tomas




Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Michael Van Canneyt



On Sun, 2 Jul 2017, Martok wrote:

> Hi all,
>
>> The only way to get data with an invalid value in an enum in Pascal is by
>> using an unchecked (aka explicit) typecast, by executing code without range
>> checking (assigning an enum from a larger parent enum type into a smaller
>> sub-enum type), or by having an uninitialised variable.
>
> Turns out this is really not true. There are also as "esoteric" things as using
> Read(File). Or TStream.Read. Or the socket implementation of your choice. Or by
> calling a library function. There are many ways to have an invalid value in an
> enum in any meaningful code. Pretty much everything that is not a direct
> assignment from a constant is a potential candidate.

These cases are without exception covered by the "unchecked (aka explicit)
typecast" part of Jonas's statement. Including Read(File).

If you use Read(File) you are implicitly telling the compiler that the file
only contains valid values for the enum. If you yourself are not sure of this,
you must use file of integer instead.

If you check their definitions, you will see that they all use untyped
buffers to do their work. So all 'type safety' bets are off as soon as
you use one of these mechanisms. This is true not only of enums, but of
every data type.

The correct Pascal way is to do

var
  I : integer;
  M : TMyEnum;

begin
  MyStream.ReadBuffer(I,SizeOf(I));
  if (I>=Ord(Low(TMyEnum))) and (I<=Ord(High(TMyEnum))) then
    M:=TMyEnum(I)
  else
    // error
end

Instead of

   MyStream.ReadBuffer(M,SizeOf(M));

Which is inherently not safe, as it uses an untyped buffer.

In essence: you are on your own as soon as you use external sources of
values for enums.
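
The validated read above can be sketched as a self-contained program.
TryReadEnum, the enum members and the in-memory stream are hypothetical
names chosen for illustration, and the sketch assumes the enum was stored
as a 32-bit integer:

```pascal
program safeenumread;
{$mode objfpc}{$H+}

uses
  Classes;

type
  TMyEnum = (meFirst, meSecond, meThird);

{ Hypothetical helper: reads a 32-bit integer from the stream and converts
  it to TMyEnum only if it lies within the declared range. }
function TryReadEnum(AStream: TStream; out AValue: TMyEnum): Boolean;
var
  I: LongInt;
begin
  AStream.ReadBuffer(I, SizeOf(I));
  Result := (I >= Ord(Low(TMyEnum))) and (I <= Ord(High(TMyEnum)));
  if Result then
    AValue := TMyEnum(I)
  else
    AValue := Low(TMyEnum);  { defined fallback, never the raw value }
end;

var
  S: TMemoryStream;
  Raw: LongInt;
  E: TMyEnum;
begin
  S := TMemoryStream.Create;
  try
    Raw := 7;  { deliberately outside meFirst..meThird }
    S.WriteBuffer(Raw, SizeOf(Raw));
    S.Position := 0;
    if TryReadEnum(S, E) then
      WriteLn('valid, ordinal ', Ord(E))
    else
      WriteLn('rejected invalid value');
  finally
    S.Free;
  end;
end.
```

The point of the design is that the untyped buffer only ever fills an
integer; the enum variable is assigned after the range test has passed.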


Michael.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Michael Van Canneyt



On Sun, 2 Jul 2017, Florian Klämpfl wrote:

>> So, we have a problem here: either the type system is broken because we can
>> put stuff in a type without being able to check if it actually belongs
>> there, or Tcgcasenode is broken because it (and _only_ it, as far as I can
>> see) wants to be clever by omitting an essentially free check for very
>> little benefit.
>> I know which interpretation I would choose: the one with the easier fix ;-)
>
> Yes, checking the data. I can easily create a similar problem as above with
> the "range checks" for the jump table by reading a negative value into the
> enum. Unfortunately, the checks are unsigned ...
>
> The correct solution is to provide a function which checks an integer based
> on rtti if it is valid for a certain enum. Everything else is curing only
> symptoms.

GetEnumName from typinfo will already do this for you.
We could add an additional function that just returns true or false.
Something like
function ValueInEnumRange(TypeInfo : PTypeInfo; AValue : Integer) : boolean;

If memory serves correctly, this will not work for enums that have explicitly
assigned numerical values, though.
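
A rough sketch of what such a function could look like. ValueInEnumRange
is only the name proposed above, not an existing RTL routine. The sketch
assumes the MinValue and MaxValue fields that the typinfo unit exposes for
enumerations through GetTypeData, and the parameter is called ATypeInfo
here so that it does not hide the TypeInfo intrinsic. TColor is a made-up
example type:

```pascal
program enumrangecheck;
{$mode objfpc}{$H+}

uses
  TypInfo;

type
  TColor = (clRed, clGreen, clBlue);

{ Hypothetical helper: returns True if AValue lies between the lowest and
  highest ordinal recorded in the RTTI of the given enumeration type. }
function ValueInEnumRange(ATypeInfo: PTypeInfo; AValue: Integer): Boolean;
var
  Data: PTypeData;
begin
  Result := False;
  if (ATypeInfo = nil) or (ATypeInfo^.Kind <> tkEnumeration) then
    Exit;
  Data := GetTypeData(ATypeInfo);
  Result := (AValue >= Data^.MinValue) and (AValue <= Data^.MaxValue);
end;

begin
  WriteLn(ValueInEnumRange(TypeInfo(TColor), 1));   { inside the range }
  WriteLn(ValueInEnumRange(TypeInfo(TColor), 3));   { one past clBlue }
  WriteLn(ValueInEnumRange(TypeInfo(TColor), -1));  { negative value }
end.
```

This only tests the bounds, so for an enum with holes it would still accept
values that fall between declared members, which matches the caveat above.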

Michael.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Mark Morgan Lloyd

On 01/07/17 22:45, Martok wrote:

> This is fine if (and only if) we can be absolutely sure that the
> EXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise %eax in
> the example above may be anywhere up to high(basetype)'th element of the
> jumptable, loading an address from anything that happens to be located after
> our jumptable and jumping there. This is, I cannot stress this enough,
> extremely dangerous! I expect not everyone follows recent security research
> topics, so just believe me when I say that: if there is any way at all to
> jump "anywhere", a competent attacker will find a way to make that
> "anywhere" be malicious code.


Is this made safe by always having an else/otherwise? If so, could the 
compiler at least raise a warning if an enumeration was sparse but there 
was no else/otherwise to catch unexpected cases?


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Florian Klämpfl
Am 02.07.2017 um 00:26 schrieb Martok:
> Depending on the number of case labels, tcgcasenode.pass_generate_code
> (ncgset.pas:1070) may choose one of 3 strategies for the "matching" task:
> jumptable, jmptree or linearlist. Jmptree and linearlist are basically "lots
> of if/else", while jumptable is a computed goto. The address of every possible
> label's code block is put into a table that is then indexed in a jmp.
> Example for x86, switching on a byte-sized #EXPRESSIONRESULT:
>    mov    #EXPRESSIONRESULT,%al
>    sub    #LOWESTLABEL,%al
>    and    $ff,%eax
>    jmp    *0x40c000(,%eax,4)
>
> What if EXPRESSIONRESULT is larger than the largest case label? To be safe
> from that, the compiler generates a range check, so the example above actually
> looks like this:
>    mov    #EXPRESSIONRESULT,%al
>    sub    #LOWESTLABEL,%al
>    cmp    (#HIGHESTLABEL-#LOWESTLABEL),%al
>    ja     $#ELSE-BLOCK
>    and    $ff,%eax
>    jmp    *0x40c000(,%eax,4)
>
> This is very fast because modern CPUs will correctly branch-predict the JA and
> start caching the jumptable so we effectively get the check for free, and
> still

If you read about branch prediction in a little more detail, you will learn
that branch prediction fails to work well as soon as there is more than one
conditional jump per 16 bytes; this basically applies to all CPUs.

> way faster than the equivalent series of "if/else" of the other strategies on
> old CPUs.
>
> So far, so good. This is where Delphi is done and just emits the code.
> FPC however attempts one more optimization (at LEVEL1, so at the same level
> that enables jumptables in the first place): if at all possible, the range
> check is omitted (which was probably a reasonable idea back when branches were
> always expensive). The only criterion for that is if the highest and lowest
> value of EXPRESSIONRESULT's type have case labels, ie. if the jumptable will
> be "full".
> This makes sense for simple basetypes like this:
>   case aByteVar of
>     0: DoSomething;
>     // ... many more cases
>     255: DoSomethingLast;
>   end;
>
> The most likely case where one might encounter this is, however, with
> enumerations. Here, the criterion becomes "are the first and last declared
> element present as case labels?", and we're no longer necessarily talking
> about highest and lowest value of the basetype.
>
> This is fine if (and only if) we can be absolutely sure that the
> EXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise %eax in
> the example above may be anywhere up to high(basetype)'th element of the
> jumptable, loading an address from anything that happens to be located after
> our jumptable and jumping there. This is, I cannot stress this enough,
> extremely dangerous! I expect not everyone follows recent security research
> topics, so just believe me when I say that: if there is any way at all to
> jump "anywhere", a competent attacker will find a way to make that
> "anywhere" be malicious code.

Indeed. The same problem as with any array on the stack. You have to ensure by
any means that the index of the array is within the declared range of the
array. If you have an array with an enum as index, you have to ensure that the
enum is within the declared range, else you get the same problem as with the
case.

>
> So, we have a problem here: either the type system is broken because we can
> put stuff in a type without being able to check if it actually belongs there,
> or Tcgcasenode is broken because it (and _only_ it, as far as I can see) wants
> to be clever by omitting an essentially free check for very little benefit.
> I know which interpretation I would choose: the one with the easier fix ;-)

Yes, checking the data. I can easily create a similar problem as above with
the "range checks" for the jump table by reading a negative value into the
enum. Unfortunately, the checks are unsigned ...

The correct solution is to provide a function which checks an integer based on
rtti if it is valid for a certain enum. Everything else is curing only
symptoms.