Re: [fpc-devel] Dangerous optimization in CASE..OF
I store record data in files with a checksum (usually a CRC). I block-read them into an array buffer and verify the checksum. If it passes, I assign the array buffer, via typecast, to a variable of the record type. If I'm the only one reading and writing the files, that is usually enough to handle drive bit rot or transfer errors.

If someone else's code can write the data, I validate everything, either when reading and assigning to the record type or, occasionally, before use. Sure, it's slow, but it's the only safe thing to do. I wouldn't think of abdicating that responsibility to the compiler.

Jeff

On Jul 2, 2017 4:50 PM, "Marco van de Voort" wrote:
> In our previous episode, Florian Klämpfl said:
> > Am 02.07.2017 um 21:40 schrieb Martok:
> > > Honestly, I still don't understand why we're even having this discussion.
> >
> > Because it is a fundamental question: if there is any defined behavior possible if a variable
> > contains an invalid value. I consider a value outside of the declared range as invalid, if it shall
> > be valid, change the declaration of the type.
>
> _AND_ remove types that can't have reasonably cheap range checks like sparse
> enums ? :-)

___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
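The checksum-then-typecast workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (TStoredRec), the enum (TColor) and the checksum routine (SimpleChecksum) are hypothetical, the checksum is a trivial stand-in for a real CRC, and the buffer is filled in memory instead of by BlockRead so that the program is self-contained.

```pascal
program ChecksumDemo;
{$mode objfpc}
{$R-}{$Q-}  // the stand-in checksum relies on wrap-around arithmetic

type
  TColor = (clRed, clGreen, clBlue);

  // Hypothetical on-disk layout: payload fields followed by a checksum.
  TStoredRec = packed record
    Id: LongInt;
    ColorOrd: Byte;      // stored as a plain byte, not as TColor
    Checksum: LongWord;
  end;

  TBuffer = array[0..SizeOf(TStoredRec) - 1] of Byte;

const
  // Number of bytes covered by the checksum (everything before it).
  PayloadSize = SizeOf(TStoredRec) - SizeOf(LongWord);

// Stand-in for a real CRC: a simple multiplicative sum over the payload.
function SimpleChecksum(const Buf: TBuffer; Len: Integer): LongWord;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to Len - 1 do
    Result := LongWord(Result * 31 + Buf[I]);
end;

var
  Rec, Loaded: TStoredRec;
  Buf: TBuffer;
  Color: TColor;
begin
  // Writer side (simulated): fill a record and append its checksum.
  Rec.Id := 42;
  Rec.ColorOrd := Ord(clGreen);
  Rec.Checksum := 0;
  Move(Rec, Buf, SizeOf(Rec));
  Rec.Checksum := SimpleChecksum(Buf, PayloadSize);
  Move(Rec, Buf, SizeOf(Rec));  // Buf now stands in for the BlockRead result

  // Reader side: verify the checksum first, then validate the enum field.
  Move(Buf, Loaded, SizeOf(Loaded));
  if SimpleChecksum(Buf, PayloadSize) <> Loaded.Checksum then
    Writeln('checksum mismatch, record rejected')
  else if Loaded.ColorOrd > Ord(High(TColor)) then
    Writeln('checksum ok, but colour value out of range')
  else
  begin
    Color := TColor(Loaded.ColorOrd);
    Writeln('record accepted, colour ordinal ', Ord(Color));
  end;
end.
```

Storing the enum as a Byte and converting only after the explicit bounds check keeps an unvalidated value from ever living in a TColor variable, which covers the "someone else's code can write the data" case as well.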
Re: [fpc-devel] Dangerous optimization in CASE..OF
In our previous episode, Florian Klämpfl said:
> Am 02.07.2017 um 21:40 schrieb Martok:
> > Honestly, I still don't understand why we're even having this discussion.
>
> Because it is a fundamental question: if there is any defined behavior possible if a variable
> contains an invalid value. I consider a value outside of the declared range as invalid, if it shall
> be valid, change the declaration of the type.

_AND_ remove types that can't have reasonably cheap range checks like sparse enums ? :-)
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, July 2, 2017 18:39, Michael Van Canneyt wrote:
> On Sun, 2 Jul 2017, Tomas Hajny wrote:
>
>>> By declaring it as a File of Enum, you are telling the compiler that it
>>> contains only valid enums.
>>
>> Noone can ever ensure, that a file doesn't get corrupted / tampered with
>> on a storage medium.
>
> No-one can ensure a memory location cannot get corrupted either.

I don't think this is true. The operating system should ensure that no other process corrupts a memory location exclusively used by my program, and I should make sure that my own program doesn't corrupt it itself. A file is usually not protected for exclusive use by your own program, unless it's created by the same program in a locked state and later read again (still locked) during the same run of that program - let's say that this pattern isn't a typical use of files, right?

>> Moreover, using the same Read for reading from a text file _does_
>> perform such checks (e.g. when using Read for reading an integer from
>> a text file, the value read is validated whether it conforms the given
>> type and potential failures are signalized either as an RTE, or
>> a non-zero IOResult depending on the $I state).
>
> Text files by definition are not type safe. The compiler cannot know what
> it contains.

I'm not talking about the compiler, but about the RTL here.

> By using file of enum (or any data type), you are explicitly telling the
> compiler it is OK.

There isn't much difference between telling the compiler that all values in a certain file are of a certain type and telling the compiler that the next value read from that file (text in this case) will conform to a certain type. Both are typed, and both should provide means for ensuring type safety while loading the value into a variable of the given type. Note that I'm not talking about typecasting here, of course; that's something completely different (and manual checking is absolutely appropriate there).

Tomas
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 21:40 schrieb Martok:
> Honestly, I still don't understand why we're even having this discussion.

Because it is a fundamental question: whether any defined behavior is possible when a variable contains an invalid value. I consider a value outside of the declared range invalid; if it shall be valid, change the declaration of the type.
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, July 2, 2017 19:15, Marco van de Voort wrote:
> In our previous episode, Tomas Hajny said:
>> > Worse, tying it to range check would then have heaps of redundant checking
>> > everywhere, not just enums.
>>
>> True. That's why I believe that Read from a (typed) file should perform
>> such validation - but it doesn't at the moment, as mentioned in my e-mail
>> in the other thread. :-(
>
> That slows down needlessly IMHO. Or do you mean only in $R+?

$R+ would be sufficient from my point of view, but I'm not sure if that is possible, because $R+ is usually in effect at the place of declaration and the checks would probably need to happen inside the Read implementation (which is already compiled at that point in time). Unlike $I, there's probably no way for $R to provide feedback to the caller which may be used for checks around the call.

> Most will blockread records anyway.

That's exactly the point. If someone uses BlockRead/BlockWrite for higher performance and thus uses untyped access, he/she has to perform manual checks as appropriate. If someone decides to use typed files, he/she probably prefers type safety over performance, but doesn't get either at the moment. :-(

Tomas
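A minimal sketch of the manual alternative that is available today: declare the file with an integer element type (as suggested elsewhere in this thread) and validate each value before converting it to the enum. The enum TMyEnum, the use of LongInt as a fixed-size element type and the temporary file are assumptions made only to keep the example self-contained.

```pascal
program TypedFileDemo;
{$mode objfpc}
uses
  SysUtils;

type
  TMyEnum = (one, two);

var
  F: file of LongInt;   // integer element type instead of "file of TMyEnum"
  FileName: string;
  Raw: LongInt;
  E: TMyEnum;
begin
  // Create a small test file holding one valid and one invalid ordinal.
  FileName := GetTempFileName;
  Assign(F, FileName);
  Rewrite(F);
  Raw := Ord(two);
  Write(F, Raw);
  Raw := 7;
  Write(F, Raw);
  Close(F);

  // Read the raw integers back and validate before converting to the enum.
  Reset(F);
  while not Eof(F) do
  begin
    Read(F, Raw);
    if (Raw >= Ord(Low(TMyEnum))) and (Raw <= Ord(High(TMyEnum))) then
    begin
      E := TMyEnum(Raw);
      Writeln('valid value, ordinal ', Ord(E));
    end
    else
      Writeln('rejected out-of-range value ', Raw);
  end;
  Close(F);
  DeleteFile(FileName);
end.
```

This keeps the convenience of Read on a typed file while moving the validation to the one place where untrusted data enters the program.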
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 19:47 schrieb Florian Klämpfl:
> Am 02.07.2017 um 19:29 schrieb Martok:
>>   type Percentile = 1..99;
>>   var I: Percentile;
>>   begin
>>     I:= 99;
>>     inc(I);  // I is now 100
>
> Forgot the mention:
> Tried with $r+ :)?

That case is also documented. RTE in {$R+}, legal in {$R-}.

That also means that while you could make assumptions about the content in {$R+} (Delphi does not*), you definitely cannot as soon as there is a single write in {$R-}. A C++ compiler could probably try tracing that using constness of variables and parameters, but we cannot, and so must be defensive.

*) Even FPC makes no such assumptions in all other instances!

  type
    TF = 1..25;
  var
    t: TF;
  begin
    t:= TF(200);
    if t in [1..50] then  // tautology!
      Writeln('a')
    else
      writeln('b');

What does that print? Yeah. As documented. Check the codegen in R+: the if is still fully generated. Only tcgcasenode does something else.

Honestly, I still don't understand why we're even having this discussion. We're not talking about adding a new check - only not leaving one out that is already there 99% of the time. We're not talking about standardising some new behaviour - Borland did that decades ago. The correct behaviour is already documented in every Pascal language reference (partly including our own), and is also the intuitive one.

I just don't get it. Why would you sacrifice the runtime safety, or, if you prefer, the code compatibility, of your compiler over an (arguably wrong in at least 2 modes) specific technicality of the type system that is adhered to nowhere else?

Taking a break for now. Grading a thesis starts to sound like good relaxation.

Kind regards,
Martok
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 20:29 schrieb Ondrej Pokorny:
> On 02.07.2017 20:23, Florian Klämpfl wrote:
>> And the compiler writes no warning during compilation?
>
> It does indeed.

But about something else. Can we please stop derailing from the main issue here?

> If we get a convenient way to assign ordinal to enum with range checks,
> everything will be fine :)

No, it will not: we still can no longer elegantly pass/receive enums to/from libraries from other compilers. But at least it would be defined then, so programmers would know this is an incompatibility.
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02.07.2017 20:23, Florian Klämpfl wrote:
> And the compiler writes no warning during compilation?

It does indeed.

On 02.07.2017 20:18, Florian Klämpfl wrote:
> Yes, undefined behavior.

I think I got your point :) You are right, sorry for wasting your time.

If we get a convenient way to assign ordinal to enum with range checks, everything will be fine :)

Ondrej
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 19:55 schrieb Ondrej Pokorny:
> On 02.07.2017 19:29, Martok wrote:
>> - Case statements execute *precisely one* of their branches: the statements of
>> the matching case label, or the else block otherwise
>
> To support your argument, the current Delphi documentation says the same:
>
> http://docwiki.embarcadero.com/RADStudio/Tokyo/en/Declarations_and_Statements
>
> /Whichever caseList has a value equal to that of selectorExpression determines the statement to be
> used. If none of the caseLists has the same value as selectorExpression, then the statements in the
> else clause (if there is one) are executed./
>
> According to Delphi documentation, invalid values should point to the else clause.
>
> Furthermore, it is OK to use invalid values in caseList as well:
>
> program Project1;
>
> {$APPTYPE CONSOLE}
>
> type
>   TMyEnum = (one, two);
>
> {$R+}
> var
>   E: TMyEnum;
> begin
>   E := TMyEnum(-1);
>   case E of
>     one, two: Writeln('valid');
>     TMyEnum(-1): Writeln('minus one');
>   else
>     Writeln('invalid');
>   end;
> end.
>
> The program above writes 'minus one' in Delphi.

And the compiler writes no warning during compilation?
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 20:12 schrieb Martok:
>> They are:
>> http://docwiki.embarcadero.com/Libraries/XE5/en/System.Boolean
> That prototype is a recent invention, it wasn't there in older versions. Also

*sigh* This has been the case since Pascal was ISO standardized.

> the text sounds quite different somewhere else:
> http://docwiki.embarcadero.com/RADStudio/XE5/en/Simple_Types#Boolean_Types
>
>> Yes. What I wanted to point out: also delphi does optimizations on enums which fails if one feeds
>> invalid values.
> Okay, if you want believe that Booleans are enums:

I do not believe, I know.

>
>   b:=boolean(42);
>   if not b then
>     writeln('falsy')
>   else
>     writeln('truthy');
>
> Prints truthy. Doesn't crash.

Yes, undefined behavior.
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02.07.2017 19:39, Florian Klämpfl wrote:
> So this means:
>
> var
>   b : boolean;
> begin
>   b:=boolean(3);
>   if b then
>     writeln(true)
>   else if not(b) then
>     writeln(false)
>   else
>     writeln(ord(b));
> end.
>
> writes 3 in delphi?

IMO you picked up a Delphi compiler bug/undocumented feature (call it what you want). "if boolean(3) then A" executes A contrary to the documentation - the docs say something different than what the compiler does. You should not use it as an argument but create an issue report on Embarcadero's Quality Central so that they either fix the documentation or fix the compiler.

Whereas:

  case boolean(3) of
    True: A;
    False: B;
  else
    C;
  end;

is documented to execute C and the compiler executes C => good.

Ondrej
Re: [fpc-devel] Dangerous optimization in CASE..OF
> They are:
> http://docwiki.embarcadero.com/Libraries/XE5/en/System.Boolean
That prototype is a recent invention, it wasn't there in older versions. Also the text sounds quite different somewhere else:
http://docwiki.embarcadero.com/RADStudio/XE5/en/Simple_Types#Boolean_Types

> Yes. What I wanted to point out: also delphi does optimizations on enums which fails if one feeds
> invalid values.
Okay, if you want to believe that Booleans are enums:

  b:=boolean(42);
  if not b then
    writeln('falsy')
  else
    writeln('truthy');

Prints truthy. Doesn't crash.
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 19:51 schrieb Martok:
> Booleans are not enums in Delphi (not even ordinals),

They are:
http://docwiki.embarcadero.com/Libraries/XE5/en/System.Boolean

> but their own little
> thing. "if boolean_expr" is always a jz/jnz, no matter what.

Yes. This is an optimization which is invalid as well if I follow your argumentation. Boolean(3)<>true.

> They are defined as
> 0=FALSE and "everything else"=TRUE

No, see link above.

>
> However:
>
> var
>   b : boolean;
> begin
>   b:=boolean(3);
>   if b = True then
>     writeln(true)
>   else if b = False then
>     writeln(false)
>   else
>     writeln(ord(b));
> end.
>
> That writes 3,

Yes. What I wanted to point out: Delphi also does optimizations on enums which fail if one feeds invalid values.

> which is why your should never compare on the boolean lexicals.
> Some Winapi functions returning longbool rely on that.

No, longbool is something different (even bytebool is).

>
> Wait, that was a trick question, wasn't it?

In the sense of pointing out that Delphi also assumes enumeration variables always contain valid values.
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02.07.2017 19:29, Martok wrote:
> - Case statements execute *precisely one* of their branches: the statements of
> the matching case label, or the else block otherwise

To support your argument, the current Delphi documentation says the same:

http://docwiki.embarcadero.com/RADStudio/Tokyo/en/Declarations_and_Statements

/Whichever caseList has a value equal to that of selectorExpression determines the statement to be used. If none of the caseLists has the same value as selectorExpression, then the statements in the else clause (if there is one) are executed./

According to Delphi documentation, invalid values should point to the else clause.

Furthermore, it is OK to use invalid values in caseList as well:

  program Project1;

  {$APPTYPE CONSOLE}

  type
    TMyEnum = (one, two);

  {$R+}
  var
    E: TMyEnum;
  begin
    E := TMyEnum(-1);
    case E of
      one, two: Writeln('valid');
      TMyEnum(-1): Writeln('minus one');
    else
      Writeln('invalid');
    end;
  end.

The program above writes 'minus one' in Delphi.

Ondrej
Re: [fpc-devel] Dangerous optimization in CASE..OF
Booleans are not enums in Delphi (not even ordinals), but their own little thing. "if boolean_expr" is always a jz/jnz, no matter what. They are defined as 0=FALSE and "everything else"=TRUE.

However:

  var
    b : boolean;
  begin
    b:=boolean(3);
    if b = True then
      writeln(true)
    else if b = False then
      writeln(false)
    else
      writeln(ord(b));
  end.

That writes 3, which is why you should never compare on the boolean lexicals. Some Winapi functions returning longbool rely on that.

Wait, that was a trick question, wasn't it?
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 19:29 schrieb Martok:
>   type Percentile = 1..99;
>   var I: Percentile;
>   begin
>     I:= 99;
>     inc(I);  // I is now 100

Forgot to mention:
Tried with $r+ :)?

> So if this is a legal statement,

Well, it is a matter of definition whether a statement causing an RTE 201 when compiled with $r+ is a legal statement ...
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 19:29 schrieb Martok:
> Addendum to this:
>
>> This was also always my intuition that the else block is also triggered for
>> invalid enum values (the docs even literally say that, "If none of the case
>> constants match the expression value") - and it *is* true in Delphi.
> There is a reason why this is true in Delphi: because this is the way it has
> been documented in Borland products for at least 25 years!
>
> I have checked with the TP7 language reference (it pays to keep books around),
> which defines the following things:
>  - Enumeration element names are implicitly defined as typed constants of their
>    enum type
>  - The enum type is either Byte (<=256 elements) or Word.
>  - Subrange types are defined as the smallest type that can contain their range
>  - Case statements execute the statements of the matching case label, or the
>    else block otherwise
>
> Note that they actually defined enumerations as what I called 'fancy constants'
> before.
>
>
> The Delphi 4 language reference (also in book form, which is a bit more detailed
> than what is in the .hlp files) uses more precise language:
>  - Enumeration element names are implicitly defined as typed constants of their
>    enum type
>  - The enum type is either Byte, Word, or Longword, depending on $Z and element
>    count
>  - Subrange types are defined as the smallest type that can contain their range
>  - it is legal to inc/dec outside of a subrange, example from the book:
>      type Percentile = 1..99;
>      var I: Percentile;
>      begin
>        I:= 99;
>        inc(I);  // I is now 100
>    So if this is a legal statement, subrange types can contain values outside of
>    their range. The description in the German version is "Die Variable wird in
>    ihren Basistyp umgewandelt", the variable becomes its base type.
>  - Case statements execute *precisely one* of their branches: the statements of
>    the matching case label, or the else block otherwise
>
> So, in D4, we have enums as fancy constants, subrange-types are not safe (so
> enums can also never be), and case statements cannot fail.
>
>
> FPC's language reference has no formal definition of what enums or subranges
> really are, and the same language as TP7 regarding case statements.
>
>
> So at least in modes TP and DELPHI, the optimisation in question is formally
> wrong.

So this means:

  var
    b : boolean;
  begin
    b:=boolean(3);
    if b then
      writeln(true)
    else if not(b) then
      writeln(false)
    else
      writeln(ord(b));
  end.

writes 3 in delphi?
Re: [fpc-devel] Dangerous optimization in CASE..OF
Addendum to this:

> This was also always my intuition that the else block is also triggered for
> invalid enum values (the docs even literally say that, "If none of the case
> constants match the expression value") - and it *is* true in Delphi.
There is a reason why this is true in Delphi: because this is the way it has been documented in Borland products for at least 25 years!

I have checked with the TP7 language reference (it pays to keep books around), which defines the following things:
 - Enumeration element names are implicitly defined as typed constants of their enum type
 - The enum type is either Byte (<=256 elements) or Word.
 - Subrange types are defined as the smallest type that can contain their range
 - Case statements execute the statements of the matching case label, or the else block otherwise

Note that they actually defined enumerations as what I called 'fancy constants' before.

The Delphi 4 language reference (also in book form, which is a bit more detailed than what is in the .hlp files) uses more precise language:
 - Enumeration element names are implicitly defined as typed constants of their enum type
 - The enum type is either Byte, Word, or Longword, depending on $Z and element count
 - Subrange types are defined as the smallest type that can contain their range
 - it is legal to inc/dec outside of a subrange, example from the book:
     type Percentile = 1..99;
     var I: Percentile;
     begin
       I:= 99;
       inc(I);  // I is now 100
   So if this is a legal statement, subrange types can contain values outside of their range. The description in the German version is "Die Variable wird in ihren Basistyp umgewandelt", the variable becomes its base type.
 - Case statements execute *precisely one* of their branches: the statements of the matching case label, or the else block otherwise

So, in D4, we have enums as fancy constants, subrange-types are not safe (so enums can also never be), and case statements cannot fail.

FPC's language reference has no formal definition of what enums or subranges really are, and the same language as TP7 regarding case statements.

So at least in modes TP and DELPHI, the optimisation in question is formally wrong.
Re: [fpc-devel] Dangerous optimization in CASE..OF
In our previous episode, Tomas Hajny said:
> > Worse, tying it to range check would then have heaps of redundant checking
> > everywhere, not just enums.
>
> True. That's why I believe that Read from a (typed) file should perform
> such validation - but it doesn't at the moment, as mentioned in my e-mail
> in the other thread. :-(

That slows down needlessly IMHO. Or do you mean only in $R+?

Most will blockread records anyway.
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02.07.2017 18:49, Jonas Maebe wrote:
> No, there is no built-in checked conversion from integer to arbitrary enumeration types. That's why I suggested in the bug report that started this thread to file a feature request for such a conversion.

Very good :)

>> Are there any disadvantages of the enum-AS operator that prevents its introduction?
>
> Someone else could already have code that overloads the AS-operator in this way, and such a change would break this (you cannot overload operators with a built-in meaning). I would be in favour of a new intrinsic.

If I am not mistaken, the AS operator cannot be overloaded - so no chance to break legacy code here.

Ondrej
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02/07/17 18:43, Ondrej Pokorny wrote:
> Thanks, so there is no enumeration range checking from the compiler at all :/

Yes, there is range checking for enums. No, there is no built-in checked conversion from integer to arbitrary enumeration types. That's why I suggested in the bug report that started this thread to file a feature request for such a conversion.

> Are there any disadvantages of the enum-AS operator that prevents its introduction?

Someone else could already have code that overloads the AS-operator in this way, and such a change would break this (you cannot overload operators with a built-in meaning). I would be in favour of a new intrinsic.

Jonas
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02.07.2017 18:28, Jonas Maebe wrote:
> On 02/07/17 18:26, Ondrej Pokorny wrote:
>> Allow me a stupid question: how to convert an integer to enum with range checking?
>
> The current possibilities and possibly improvements have been mentioned elsewhere in this thread already
> * http://lists.freepascal.org/pipermail/fpc-devel/2017-July/038013.html
> * http://lists.freepascal.org/pipermail/fpc-devel/2017-July/038014.html

Thanks, so there is no enumeration range checking from the compiler at all :/ Everything has to be done manually :/

1.) if (I>=Ord(Low(TMyEnum))) and (I<=Ord(High(TMyEnum))) then

It's long and ugly, and it is a manual check that the $RANGECHECKS directive has no effect on. (Yes, I use it in my code.)

2.) function ValueInEnumRange(TypeInfo : PTypeInfo; AValue : Integer) : boolean;

This still involves manual checking. Another problem: RTTI is not generated for enums with explicit indexes, if I am not mistaken: TEnum = (two = 2, four = 4).

---

IMO FPC/Pascal lacks an assignment operator for enums with range checking. Something like:

  EnumValue := IntegerValue as TEnum;

Are there any disadvantages of the enum-AS operator that prevent its introduction?

Ondrej
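Until such an operator or intrinsic exists, option 1.) can at least be wrapped once per enum type. The following is a minimal sketch only; TryIntToMyEnum is a hypothetical helper name, not something provided by the RTL, and it assumes an enum without explicit indexes.

```pascal
program CheckedEnumConv;
{$mode objfpc}

type
  TMyEnum = (one, two);

// Hypothetical helper, not part of the RTL: returns False instead of
// producing an out-of-range TMyEnum value.
function TryIntToMyEnum(AValue: LongInt; out AEnum: TMyEnum): Boolean;
begin
  Result := (AValue >= Ord(Low(TMyEnum))) and (AValue <= Ord(High(TMyEnum)));
  if Result then
    AEnum := TMyEnum(AValue)
  else
    AEnum := Low(TMyEnum);  // keep the out parameter in a defined state
end;

var
  E: TMyEnum;
begin
  if TryIntToMyEnum(1, E) then
    Writeln('1 converts to ordinal ', Ord(E));
  if not TryIntToMyEnum(2, E) then
    Writeln('2 is outside TMyEnum');
end.
```

Note the limitation for sparse enums such as TEnum = (two = 2, four = 4): a Low/High comparison would accept 3, so such types would need an explicit list of valid ordinals instead.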
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, 2 Jul 2017, Tomas Hajny wrote:

>> By declaring it as a File of Enum, you are telling the compiler that it contains only valid enums.
>
> Noone can ever ensure, that a file doesn't get corrupted / tampered with on a storage medium.

No-one can ensure a memory location cannot get corrupted either.

> Moreover, using the same Read for reading from a text file _does_ perform such checks (e.g. when using Read for reading an integer from a text file, the value read is validated whether it conforms the given type and potential failures are signalized either as an RTE, or a non-zero IOResult depending on the $I state).

Text files by definition are not type safe. The compiler cannot know what they contain.

By using file of enum (or any data type), you are explicitly telling the compiler it is OK. The only exception is reference counted types; the compiler will forbid you to define

  myrecord = record
    a : ansistring;
    b : integer;
  end;

  f = file of myrecord;

Michael.
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02/07/17 18:26, Ondrej Pokorny wrote:
> On 02.07.2017 18:20, Jonas Maebe wrote:
>> Range checking code is generated for operations involving enums if, according to the type system, the enum can be out of range. Just like with integer sub-range types.
>
> Allow me a stupid question: how to convert an integer to enum with range checking?

The current possibilities and possibly improvements have been mentioned elsewhere in this thread already
* http://lists.freepascal.org/pipermail/fpc-devel/2017-July/038013.html
* http://lists.freepascal.org/pipermail/fpc-devel/2017-July/038014.html

Jonas
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02.07.2017 18:20, Jonas Maebe wrote:
> Range checking code is generated for operations involving enums if, according to the type system, the enum can be out of range. Just like with integer sub-range types.

Allow me a stupid question: how to convert an integer to enum with range checking? A cast does not generate range checking, if I am not mistaken:

  program Project1;

  type
    TEnum = (one, two);

  {$R+}
  var
    I: Integer;
    E: TEnum;
  begin
    I := 2;
    E := TEnum(I); // <<< I want a range check error here
    Writeln(Ord(E));
  end.

Ondrej
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02/07/17 11:59, Yury Sidorov wrote:
> Indeed, I've done some tests and found out that when range checking is enabled enums are not checked at all. Even array access with enum index is not checked.
>
> According to docs enums should be range checked:
> https://www.freepascal.org/docs-html/prog/progsu65.html#x72-710001.2.65

Range checking code is generated for operations involving enums if, according to the type system, the enum can be out of range. Just like with integer sub-range types.

E.g., this generates a range check error:

  {$r+}
  type
    tenum = (ea,eb,ec,ed);
    tsubenum = eb..ec;
  var
    arr: array[tsubenum] of byte;
    index: tenum;
  begin
    index:=ed;
    writeln(arr[index]);
  end.

Jonas
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, July 2, 2017 17:05, Michael Van Canneyt wrote:
> On Sun, 2 Jul 2017, Tomas Hajny wrote:
>
>> On Sun, July 2, 2017 16:48, Marco van de Voort wrote:
>>> In our previous episode, Martok said:
>>>> It is really hard to write code that interacts with the outside world
>>>> without having a validation problem.
>>>
>>> Then you arguing wrong. Then you don't need validation everywhere, but
>>> something you can call to simply confirm an enum has correct values after
>>> reading.
>>>
>>> It is not logical to have heaps of checks littered everywhere if the
>>> corruption happens in a defined place (on load).
>>>
>>> Worse, tying it to range check would then have heaps of redundant checking
>>> everywhere, not just enums.
>>
>> True. That's why I believe that Read from a (typed) file should perform
>> such validation - but it doesn't at the moment, as mentioned in my e-mail
>> in the other thread. :-(
>
> IMHO it should not.
>
> By declaring it as a File of Enum, you are telling the compiler that it
> contains only valid enums.

No one can ever ensure that a file doesn't get corrupted / tampered with on a storage medium. In other words, you cannot check the assumption mentioned above any earlier than while reading the file. By this logic, typed files could never be used in any program, because no one could ever ensure that these files conform to their stated type before their contents enter a variable of the declared type (and they should be validated before that point, because that's exactly the point at which the compiler and possibly also your program start assuming type safety).

Moreover, using the same Read for reading from a text file _does_ perform such checks (e.g. when using Read for reading an integer from a text file, the value read is validated for conformance to the given type, and potential failures are signalled either as an RTE or as a non-zero IOResult, depending on the $I state).

Tomas
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, 2 Jul 2017, Tomas Hajny wrote:

> On Sun, July 2, 2017 16:48, Marco van de Voort wrote:
>> In our previous episode, Martok said:
>>> It is really hard to write code that interacts with the outside world without having a validation problem.
>>
>> Then you arguing wrong. Then you don't need validation everywhere, but something you can call to simply confirm an enum has correct values after reading.
>>
>> It is not logical to have heaps of checks littered everywhere if the corruption happens in a defined place (on load).
>>
>> Worse, tying it to range check would then have heaps of redundant checking everywhere, not just enums.
>
> True. That's why I believe that Read from a (typed) file should perform such validation - but it doesn't at the moment, as mentioned in my e-mail in the other thread. :-(

IMHO it should not.

By declaring it as a File of Enum, you are telling the compiler that it contains only valid enums.

Michael.
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, July 2, 2017 16:48, Marco van de Voort wrote:
> In our previous episode, Martok said:
>> It is really hard to write code that interacts with the outside world
>> without having a validation problem.
>
> Then you arguing wrong. Then you don't need validation everywhere, but
> something you can call to simply confirm an enum has correct values after
> reading.
>
> It is not logical to have heaps of checks littered everywhere if the
> corruption happens in a defined place (on load).
>
> Worse, tying it to range check would then have heaps of redundant checking
> everywhere, not just enums.

True. That's why I believe that Read from a (typed) file should perform such validation - but it doesn't at the moment, as mentioned in my e-mail in the other thread. :-(

Tomas
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, 2 Jul 2017, Martok wrote:

> Hi all,
>
>> The only way to get data with an invalid value in an enum in Pascal is by
>> using an unchecked (aka explicit) typecast, by executing code without
>> range checking (assigning an enum from a larger parent enum type into a
>> smaller sub-enum type), or by having an uninitialised variable.
>
> Turns out this is really not true. There are also as "esoteric" things as
> using Read(File). Or TStream.Read. Or the socket implementation of your
> choice. Or by calling a library function. There are many ways to have an
> invalid value in an enum in any meaningful code. Pretty much everything
> that is not a direct assignment from a constant is a potential candidate.

These cases are, without exception, covered by the "unchecked (aka explicit) typecast" part of Jonas's statement. Including Read(File).

If you use Read(File), you are implicitly telling the compiler that the file only contains valid values for the enum. If you yourself are not sure of this, you must use file of integer instead.

If you check their definitions, you will see that they all use untyped buffers to do their work. So all 'type safety' bets are off as soon as you use one of these mechanisms. This is true not only of enums, but of every data type.

The correct Pascal way is to do

  var
    I : integer;
    M : TMyEnum;
  begin
    MyStream.ReadBuffer(I,SizeOf(I));
    if (I>=Ord(Low(TMyEnum))) and (I<=Ord(High(TMyEnum))) then
      M:=TMyEnum(I)
    else
      // error
  end

instead of

  MyStream.ReadBuffer(M,SizeOf(M));

which is inherently not safe, as it uses an untyped buffer.

In essence: you are on your own as soon as you use external sources of values for enums.

Michael.
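For reference, the read-then-validate pattern above can be turned into a complete program along these lines. This is a sketch only: TMyEnum, its members, the helper name TryReadEnum and the in-memory stream standing in for an external source are all assumptions for illustration, and the enum must not have explicitly assigned values for the Low/High check to be sufficient.

```pascal
program StreamEnumCheck;
{$mode objfpc}{$H+}

uses
  Classes;

type
  // Hypothetical enum without explicitly assigned values.
  TMyEnum = (meFirst, meSecond, meThird);

// Reads a raw 32-bit ordinal and converts it only after the range test.
// TryReadEnum is an illustrative helper, not an RTL function.
function TryReadEnum(S: TStream; out Value: TMyEnum): Boolean;
var
  I: LongInt;
begin
  S.ReadBuffer(I, SizeOf(I));
  Result := (I >= Ord(Low(TMyEnum))) and (I <= Ord(High(TMyEnum)));
  if Result then
    Value := TMyEnum(I)
  else
    Value := Low(TMyEnum);   // keep the variable well defined on failure
end;

var
  MS: TMemoryStream;
  Raw: LongInt;
  E: TMyEnum;
  N: Integer;
begin
  MS := TMemoryStream.Create;
  try
    // Simulate external data: one valid ordinal and one invalid ordinal.
    Raw := 1;
    MS.WriteBuffer(Raw, SizeOf(Raw));
    Raw := 42;
    MS.WriteBuffer(Raw, SizeOf(Raw));
    MS.Position := 0;

    for N := 1 to 2 do
      if TryReadEnum(MS, E) then
        WriteLn('accepted, ordinal ', Ord(E))
      else
        WriteLn('rejected: value outside the declared range');
  finally
    MS.Free;
  end;
end.
```

The point of the design is that the untyped buffer is only ever filled into an integer, so the enum variable never holds a value that was not checked first.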
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, 2 Jul 2017, Florian Klämpfl wrote:

>> So, we have a problem here: either the type system is broken because we
>> can put stuff in a type without being able to check if it actually
>> belongs there, or Tcgcasenode is broken because it (and _only_ it, as far
>> as I can see) wants to be clever by omitting an essentially free check
>> for very little benefit.
>> I know which interpretation I would choose: the one with the easier fix ;-)
>
> Yes, checking the data. I can easily create a similar problem as above
> with the "range checks" for the jump table by reading a negative value
> into the enum. Unfortunately, the checks are unsigned ...
>
> The correct solution is to provide a function which checks an integer
> based on rtti if it is valid for a certain enum. Everything else is
> curing only symptoms.

GetEnumName from typinfo will already do this for you. We could add an additional function that just returns true or false. Something like

  function ValueInEnumRange(TypeInfo : PTypeInfo; AValue : Integer) : boolean;

If memory serves correctly, this will not work for enums that have explicitly assigned numerical values, though.

Michael.
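One possible shape for such a function, built on the MinValue and MaxValue fields that the typinfo unit exposes for enumerations, is sketched below. It is not an existing RTL routine: the body, the TColor type and the sample values are assumptions for illustration, and the parameter is named Info here only to avoid shadowing the TypeInfo intrinsic. It covers contiguous enums only, in line with the caveat about explicitly assigned values.

```pascal
program EnumRangeCheck;
{$mode objfpc}{$H+}

uses
  TypInfo;

type
  // Hypothetical contiguous enum used for the demonstration.
  TColor = (clRed, clGreen, clBlue);

// Sketch of the proposed helper: True when AValue lies within the
// ordinal range recorded in the enum's RTTI.
function ValueInEnumRange(Info: PTypeInfo; AValue: Integer): Boolean;
var
  Data: PTypeData;
begin
  Result := False;
  if (Info = nil) or (Info^.Kind <> tkEnumeration) then
    Exit;
  Data := GetTypeData(Info);
  Result := (AValue >= Data^.MinValue) and (AValue <= Data^.MaxValue);
end;

begin
  WriteLn('1 valid for TColor: ', ValueInEnumRange(TypeInfo(TColor), 1));
  WriteLn('7 valid for TColor: ', ValueInEnumRange(TypeInfo(TColor), 7));
  WriteLn('-1 valid for TColor: ', ValueInEnumRange(TypeInfo(TColor), -1));
end.
```

Because the comparison is done on a signed Integer, negative inputs are rejected as well as values above the highest element.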
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 01/07/17 22:45, Martok wrote:

> This is fine if (and only if) we can be absolutely sure that the
> EXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise
> %eax in the example above may be anywhere up to high(basetype)'th element
> of the jumptable, loading an address from anything that happens to be
> located after our jumptable and jumping there. This is, I cannot stress
> this enough, extremely dangerous! I expect not everyone follows recent
> security research topics, so just believe me when I say that: if there is
> any way at all to jump "anywhere", a competent attacker will find a way to
> make that "anywhere" be malicious code.

Is this made safe by always having an else/otherwise? If so, could the compiler at least raise a warning if an enumeration was sparse but there was no else/otherwise to catch unexpected cases?

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02.07.2017 at 00:26, Martok wrote:
> Depending on the number of case labels, tcgcasenode.pass_generate_code
> (ncgset.pas:1070) may choose one of 3 strategies for the "matching" task:
> jumptable, jmptree or linearlist. Jmptree and linearlist are basically
> "lots of if/else", while jumptable is a computed goto. The address of
> every possible label's code block is put into a table that is then indexed
> in a jmp.
> Example for x86, switching on a byte-sized #EXPRESSIONRESULT:
>    mov #EXPRESSIONRESULT,%al
>    sub #LOWESTLABEL,%al
>    and $ff,%eax
>    jmp *0x40c000(,%eax,4)
>
> What if EXPRESSIONRESULT is larger than the largest case label? To be safe
> from that, the compiler generates a range check, so the example above
> actually looks like this:
>    mov #EXPRESSIONRESULT,%al
>    sub #LOWESTLABEL,%al
>    cmp (#HIGHESTLABEL-#LOWESTLABEL),%al
>    ja  $#ELSE-BLOCK
>    and $ff,%eax
>    jmp *0x40c000(,%eax,4)
>
> This is very fast because modern CPUs will correctly branch-predict the JA
> and start caching the jumptable so we effectively get the check for free,
> and still

If you read about branch prediction in a little more detail, you will learn that branch prediction fails to work well as soon as there is more than one conditional jump per 16 bytes; this basically applies to all CPUs.

> way faster than the equivalent series of "if/else" of the other strategies
> on old CPUs.
>
> So far, so good. This is where Delphi is done and just emits the code.
> FPC however attempts one more optimization (at LEVEL1, so at the same
> level that enables jumptables in the first place): if at all possible, the
> range check is omitted (which was probably a reasonable idea back when
> branches were always expensive). The only criterion for that is if the
> highest and lowest value of EXPRESSIONRESULT's type have case labels, ie.
> if the jumptable will be "full".
> This makes sense for simple basetypes like this:
>    case aByteVar of
>      0: DoSomething;
>      // ... many more cases
>      255: DoSomethingLast;
>    end;
>
> The most likely case where one might encounter this is however is with
> enumerations. Here, the criterion becomes "are the first and last declared
> element present as case labels?", and we're no longer necessarily talking
> about highest and lowest value of the basetype.
>
> This is fine if (and only if) we can be absolutely sure that the
> EXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise
> %eax in the example above may be anywhere up to high(basetype)'th element
> of the jumptable, loading an address from anything that happens to be
> located after our jumptable and jumping there. This is, I cannot stress
> this enough, extremely dangerous! I expect not everyone follows recent
> security research topics, so just believe me when I say that: if there is
> any way at all to jump "anywhere", a competent attacker will find a way to
> make that "anywhere" be malicious code.

Indeed. It is the same problem as with any array on the stack: you have to ensure, by whatever means, that the index into the array is within the declared range of the array. If you have an array with an enum as index, you have to ensure that the enum is within the declared range, or you get the same problem as with the case statement.

>
> So, we have a problem here: either the type system is broken because we
> can put stuff in a type without being able to check if it actually belongs
> there, or Tcgcasenode is broken because it (and _only_ it, as far as I can
> see) wants to be clever by omitting an essentially free check for very
> little benefit.
> I know which interpretation I would choose: the one with the easier fix ;-)

Yes, checking the data. I can easily create a similar problem as above with the "range checks" for the jump table by reading a negative value into the enum. Unfortunately, the checks are unsigned ...

The correct solution is to provide a function which, based on RTTI, checks whether an integer is valid for a certain enum. Everything else is curing only symptoms.
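For readers who do not read x86 assembly, the range-checked jumptable strategy quoted above can be modelled in plain Pascal roughly as follows. This is only an illustrative model, not what tcgcasenode actually emits: the handler procedures, the label values 10..12 and the Dispatch routine are all made up for the example.

```pascal
program JumpTableSketch;
{$mode objfpc}{$H+}

type
  THandler = procedure;

// Hypothetical code blocks for three case labels and the else branch.
procedure HandleTen;    begin WriteLn('label 10'); end;
procedure HandleEleven; begin WriteLn('label 11'); end;
procedure HandleTwelve; begin WriteLn('label 12'); end;
procedure HandleElse;   begin WriteLn('else block'); end;

const
  LowestLabel  = 10;
  HighestLabel = 12;
  // One entry per label, indexed by (value - LowestLabel).
  JumpTable: array[0..HighestLabel - LowestLabel] of THandler =
    (@HandleTen, @HandleEleven, @HandleTwelve);

procedure Dispatch(Value: Byte);
var
  Index: Integer;
begin
  // Models "sub #LOWESTLABEL,%al": the byte-sized difference wraps,
  // so values below the lowest label turn into large indices.
  Index := (Integer(Value) - LowestLabel) and $FF;
  // Models "cmp / ja": a single comparison sends everything outside
  // the table to the else block.
  if Index > HighestLabel - LowestLabel then
    HandleElse
  else
    JumpTable[Index]();   // models the indexed "jmp"
end;

begin
  Dispatch(11);    // inside the table
  Dispatch(200);   // above the highest label
  Dispatch(3);     // below the lowest label
end.
```

Removing the comparison in Dispatch is the analogue of the omitted range check under discussion: the table lookup would then index past the end of JumpTable for any value that is not one of the labels.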