Re: [fpc-devel] Dangerous optimization in CASE..OF
On 02/07/17 11:59, Yury Sidorov wrote: Indeed, I've done some tests and found out that when range checking is enabled enums are not checked at all. Even array access with enum index is not checked. According to docs enums should be range checked: https://www.freepascal.org/docs-html/prog/progsu65.html#x72-710001.2.65 Range checking code is generated for operations involving enums if, according to the type system, the enum can be out of range. Just like with integer sub-range types. E.g., this generates a range check error: {$r+} type tenum = (ea,eb,ec,ed); tsubenum = eb..ec; var arr: array[tsubenum] of byte; index: tenum; begin index:=ed; writeln(arr[index]); end. Jonas ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, July 2, 2017 17:05, Michael Van Canneyt wrote: > On Sun, 2 Jul 2017, Tomas Hajny wrote: > >> On Sun, July 2, 2017 16:48, Marco van de Voort wrote: >>> In our previous episode, Martok said: It is really hard to write code that interacts with the outside world without having a validation problem. >>> >>> Then you arguing wrong. Then you don't need validation everywhere, but >>> something you can call to simply confirm an enum has correct values >>> after >>> reading. >>> >>> It is not logical to have heaps of checks littered everywhere if the >>> corruption happens in a defined place (on load). >>> >>> Worse, tying it to range check would then have heaps of redundant >>> checking >>> everywhere, not just enums. >> >> True. That's why I believe that Read from a (typed) file should perform >> such validation - but it doesn't at the moment, as mentioned in my >> e-mail >> in the other thread. :-( > > IMHO it should not. > > By declaring it as a File of Enum, you are telling the compiler that it > contains only valid enums. Noone can ever ensure, that a file doesn't get corrupted / tampered with on a storage medium. In other words, you cannot check your assumption mentioned above earlier than while reading the file. In this logic, typed files could never be used in any program, because noone could ever ensure that these files conform to their stated type before their contents enters a variable of the declared type (and it should be validated before that point, because that's exactly the point at which the compiler and possibly also your program start assuming type safety). Moreover, using the same Read for reading from a text file _does_ perform such checks (e.g. when using Read for reading an integer from a text file, the value read is validated whether it conforms the given type and potential failures are signalized either as an RTE, or a non-zero IOResult depending on the $I state). Tomas ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, 2 Jul 2017, Tomas Hajny wrote: On Sun, July 2, 2017 16:48, Marco van de Voort wrote: In our previous episode, Martok said: It is really hard to write code that interacts with the outside world without having a validation problem. Then you arguing wrong. Then you don't need validation everywhere, but something you can call to simply confirm an enum has correct values after reading. It is not logical to have heaps of checks littered everywhere if the corruption happens in a defined place (on load). Worse, tying it to range check would then have heaps of redundant checking everywhere, not just enums. True. That's why I believe that Read from a (typed) file should perform such validation - but it doesn't at the moment, as mentioned in my e-mail in the other thread. :-( IMHO it should not. By declaring it as a File of Enum, you are telling the compiler that it contains only valid enums. Michael. ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, July 2, 2017 16:48, Marco van de Voort wrote: > In our previous episode, Martok said: >> It is really hard to write code that interacts with the outside world >> without >> having a validation problem. > > Then you arguing wrong. Then you don't need validation everywhere, but > something you can call to simply confirm an enum has correct values after > reading. > > It is not logical to have heaps of checks littered everywhere if the > corruption happens in a defined place (on load). > > Worse, tying it to range check would then have heaps of redundant checking > everywhere, not just enums. True. That's why I believe that Read from a (typed) file should perform such validation - but it doesn't at the moment, as mentioned in my e-mail in the other thread. :-( Tomas ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, 2 Jul 2017, Martok wrote: Hi all, The only way to get data with an invalid value in an enum in Pascal is by using an unchecked (aka explicit) typecast, by executing code without range checking (assigning an enum from a larger parent enum type into a smaller sub-enum type), or by having an uninitialised variable. Turns out this is really not true. There are also as "esoteric" things as using Read(File). Or TStream.Read. Or the socket implementation of your choice. Or by calling a library function. There are many ways to have an invalid value in an enum in any meaningful code. Pretty much everything that is not a direct assignment from a constant is a potential candidate. These cases are without exception covered by the " unchecked (aka explicit) typecast," part of Jonas's statement. Including Read(File). If you use Read(File) you are implicitly telling the compiler that the file only contains valid values for the enum. If you yourself are not sure of this, you must use file of integer instead. If you check their definitions, you will see that they all use untyped buffers to do their work. So all 'type safety' bets are off as soon as you use one of these mechanisms. This is not only true of enums, but for every data type. The correct pascal way is to do var I : integer; M : MyEnum; begin MyStream.ReadBuffer(I,SizeOf(I)); if (I>=Ord(Low(TMyEnum))) and (I<=Ord(High(TMyEnum))) then M:=TMyEnum(I) else // error end Instead of MyStream.ReadBuffer(M,SizeOf(M)); Which is inherently not safe, as it uses an untyped buffer. In essence: you are on your own as soon as you use external sources of values for enums. Michael. ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Re: [fpc-devel] Dangerous optimization in CASE..OF
On Sun, 2 Jul 2017, Florian Klämpfl wrote: So, we have a problem here: either the type system is broken because we can put stuff in a type without being able to check if it actually belongs there, or Tcgcasenode is broken because it (and _only_ it, as far as I can see) wants to be clever by omitting an essentially free check for very little benefit. I know which interpretation I would choose: the one with the easier fix ;-) Yes, checking the data. I can easily create a similar problem as above with the "range checks" for the jump table by reading a negative value into the enum. Unfortunately, the checks are unsigned ... The correct solution is to provide a function which checks an integer based on rtti if it is valid for a certain enum. Everything else is curing only symptoms. GetEnumName from typinfo will already do this for you. We could add an additional function that just returns true or false. Something as function ValueInEnumRange(TypeInfo : PTypeInfo; AValue : Integer) : boolean; If memory serves correct, this will not work for enums that have explicitly assigned numerical values, though. Michael.___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Re: [fpc-devel] Dangerous optimization in CASE..OF
On 01/07/17 22:45, Martok wrote: This is fine if (and only if) we can be absolutely sure that theEXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise %eax inthe example above may be anywhere up to high(basetype)'th element of thejumptable, loading an address from anything that happens to be located after ourjumptable and jumping there. This is, I cannot stress this enough, extremelydangerous! I expect not everyone follows recent security research topics, sojust believe me when I say that: if there is any way at all to jump "anywhere",a competent attacker will find a way to make that "anywhere" be malicious code. Is this made safe by always having an else/otherwise? If so, could the compiler at least raise a warning if an enumeration was sparse but there was no else/otherwise to catch unexpected cases? -- Mark Morgan Lloyd markMLl .AT. telemetry.co .DOT. uk [Opinions above are the author's, not those of his employers or colleagues] ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Re: [fpc-devel] Dangerous optimization in CASE..OF
Am 02.07.2017 um 00:26 schrieb Martok: > Depending on the number of case labels, tcgcasenode.pass_generate_code > (ncgset.pas:1070) may choose one of 3 strategies for the "matching" task: > jumptable, jmptree or linearlist. Jmptree and linearlist are basically "lots > of > if/else", while jumptable is a computed goto. The address of every possible > label's code block is put into a table that is then indexed in a jmp. > Example for x86, switching on a byte-sized #EXPRESSIONRESULT: >mov#EXPRESSIONRESULT,%al >sub#LOWESTLABEL,%al >and$ff,%eax >jmp*0x40c000(,%eax,4) > > What if EXPRESSIONRESULT is larger than the largest case label? To be safe > from > that, the compiler generates a range check, so the example above actually > looks > like this: >mov#EXPRESSIONRESULT,%al >sub#LOWESTLABEL,%al >cmp(#HIGHESTLABEL-#LOWESTLABEL),%al >ja $#ELSE-BLOCK >and$ff,%eax >jmp*0x40c000(,%eax,4) > > This is very fast because modern CPUs will correctly branch-predict the JA and > start caching the jumptable so we effectively get the check for free, and > still If you read about branch-prediction a little bit more in detail, then you will learn that branch prediction fails to work well as soon as their is more than one cond. jump per 16 bytes, this basically applies to all CPUs. > way faster than the equivalent series of "if/else" of the other strategies on > old CPUs. > > So far, so good. This is where Delphi is done and just emits the code. > FPC however attempts one more optimization (at LEVEL1, so at the same level > that > enables jumptables in the first place): if at all possible, the range check is > omitted (which was probably a reasonable idea back when branches were always > expensive). The only criterion for that is if the highest and lowest value of > EXPRESSIONRESULT's type have case labels, ie. if the jumptable will be "full". > This makes sense for simple basetypes like this: > case aByteVar of > 0: DoSomething; > // ... many more cases > 255: DoSomethingLast; > end; > > The most likely case where one might encounter this is however is with > enumerations. Here, the criterion becomes "are the first and last declared > element present as case labels?", and we're no longer necessarily talking > about > highest and lowest value of the basetype. > > This is fine if (and only if) we can be absolutely sure that the > EXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise %eax in > the example above may be anywhere up to high(basetype)'th element of the > jumptable, loading an address from anything that happens to be located after > our > jumptable and jumping there. This is, I cannot stress this enough, extremely > dangerous! I expect not everyone follows recent security research topics, so > just believe me when I say that: if there is any way at all to jump > "anywhere", > a competent attacker will find a way to make that "anywhere" be malicious > code. Indeed. The same problem as with any array on the stack. You have to ensure by any means, that the index of the array is within the declared range of the array. If you have an array with an enum as index, you have to ensure that the enum is within the declared range, else you get the same problem as with the case. > > So, we have a problem here: either the type system is broken because we can > put > stuff in a type without being able to check if it actually belongs there, or > Tcgcasenode is broken because it (and _only_ it, as far as I can see) wants to > be clever by omitting an essentially free check for very little benefit. > I know which interpretation I would choose: the one with the easier fix ;-) Yes, checking the data. I can easily create a similar problem as above with the "range checks" for the jump table by reading a negative value into the enum. Unfortunately, the checks are unsigned ... The correct solution is to provide a function which checks an integer based on rtti if it is valid for a certain enum. Everything else is curing only symptoms. ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel