Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Jonas Maebe

On 02/07/17 11:59, Yury Sidorov wrote:
Indeed, I've done some tests and found out that when range checking is 
enabled enums are not checked at all. Even array access with enum index 
is not checked.

According to docs enums should be range checked:
https://www.freepascal.org/docs-html/prog/progsu65.html#x72-710001.2.65


Range checking code is generated for operations involving enums if, 
according to the type system, the enum can be out of range. Just like 
with integer sub-range types.


E.g., this generates a range check error:

{$r+}
type
  tenum = (ea,eb,ec,ed);
  tsubenum = eb..ec;
var
  arr: array[tsubenum] of byte;
  index: tenum;
begin
  index:=ed;
  writeln(arr[index]);
end.


Jonas
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Tomas Hajny
On Sun, July 2, 2017 17:05, Michael Van Canneyt wrote:
> On Sun, 2 Jul 2017, Tomas Hajny wrote:
>
>> On Sun, July 2, 2017 16:48, Marco van de Voort wrote:
>>> In our previous episode, Martok said:
 It is really hard to write code that interacts with the outside world
 without
 having a validation problem.
>>>
>>> Then you arguing wrong. Then you don't need validation everywhere, but
>>> something you can call to simply confirm an enum has correct values
>>> after
>>> reading.
>>>
>>> It is not logical to have heaps of checks littered everywhere if the
>>> corruption happens in a defined place (on load).
>>>
>>> Worse, tying it to range check would then have heaps of redundant
>>> checking
>>> everywhere, not just enums.
>>
>> True. That's why I believe that Read from a (typed) file should perform
>> such validation - but it doesn't at the moment, as mentioned in my
>> e-mail
>> in the other thread. :-(
>
> IMHO it should not.
>
> By declaring it as a File of Enum, you are telling the compiler that it
> contains only valid enums.

Noone can ever ensure, that a file doesn't get corrupted / tampered with
on a storage medium. In other words, you cannot check your assumption
mentioned above earlier than while reading the file. In this logic, typed
files could never be used in any program, because noone could ever ensure
that these files conform to their stated type before their contents enters
a variable of the declared type (and it should be validated before that
point, because that's exactly the point at which the compiler and possibly
also your program start assuming type safety).

Moreover, using the same Read for reading from a text file _does_ perform
such checks (e.g. when using Read for reading an integer from a text file,
the value read is validated whether it conforms the given type and
potential failures are signalized either as an RTE, or a non-zero IOResult
depending on the $I state).

Tomas


___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Michael Van Canneyt



On Sun, 2 Jul 2017, Tomas Hajny wrote:


On Sun, July 2, 2017 16:48, Marco van de Voort wrote:

In our previous episode, Martok said:

It is really hard to write code that interacts with the outside world
without
having a validation problem.


Then you arguing wrong. Then you don't need validation everywhere, but
something you can call to simply confirm an enum has correct values after
reading.

It is not logical to have heaps of checks littered everywhere if the
corruption happens in a defined place (on load).

Worse, tying it to range check would then have heaps of redundant checking
everywhere, not just enums.


True. That's why I believe that Read from a (typed) file should perform
such validation - but it doesn't at the moment, as mentioned in my e-mail
in the other thread. :-(


IMHO it should not.

By declaring it as a File of Enum, you are telling the compiler that it 
contains only valid enums.


Michael.
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Tomas Hajny
On Sun, July 2, 2017 16:48, Marco van de Voort wrote:
> In our previous episode, Martok said:
>> It is really hard to write code that interacts with the outside world
>> without
>> having a validation problem.
>
> Then you arguing wrong. Then you don't need validation everywhere, but
> something you can call to simply confirm an enum has correct values after
> reading.
>
> It is not logical to have heaps of checks littered everywhere if the
> corruption happens in a defined place (on load).
>
> Worse, tying it to range check would then have heaps of redundant checking
> everywhere, not just enums.

True. That's why I believe that Read from a (typed) file should perform
such validation - but it doesn't at the moment, as mentioned in my e-mail
in the other thread. :-(

Tomas


___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Michael Van Canneyt



On Sun, 2 Jul 2017, Martok wrote:


Hi all,

The only way to get data with an invalid value in an enum in Pascal is by 
using an unchecked (aka explicit) typecast, by executing code without range 
checking (assigning an enum from a larger parent enum type into a smaller 
sub-enum type), or by having an uninitialised variable.


Turns out this is really not true. There are also as "esoteric" things as using
Read(File). Or TStream.Read. Or the socket implementation of your choice. Or by
calling a library function. There are many ways to have an invalid value in an
enum in any meaningful code. Pretty much everything that is not a direct
assignment from a constant is a potential candidate.


These cases are without exception covered by the " unchecked (aka explicit) 
typecast,"
part of Jonas's statement. Including Read(File).

If you use Read(File) you are implicitly telling the compiler that the file
only contains valid values for the enum. If you yourself are not sure of this, 
you must use file of integer instead.


If you check their definitions, you will see that they all use untyped 
buffers to do their work. So all 'type safety' bets are off as soon as 
you use one of these mechanisms. This is not only true of enums, but for

every data type.

The correct pascal way is to do

var
  I : integer;
  M : MyEnum;

begin
  MyStream.ReadBuffer(I,SizeOf(I));
  if (I>=Ord(Low(TMyEnum))) and (I<=Ord(High(TMyEnum))) then
M:=TMyEnum(I)
  else
// error
end

Instead of

   MyStream.ReadBuffer(M,SizeOf(M));

Which is inherently not safe, as it uses an untyped buffer.

In essence: 
you are on your own as soon as you use external sources of values for enums.


Michael.
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Michael Van Canneyt



On Sun, 2 Jul 2017, Florian Klämpfl wrote:


So, we have a problem here: either the type system is broken because we can put
stuff in a type without being able to check if it actually belongs there, or
Tcgcasenode is broken because it (and _only_ it, as far as I can see) wants to
be clever by omitting an essentially free check for very little benefit.
I know which interpretation I would choose: the one with the easier fix ;-)


Yes, checking the data. I can easily create a similar problem as above with the 
"range checks" for
the jump table by reading a negative value into the enum. Unfortunately, the 
checks are unsigned ...

The correct solution is to provide a function which checks an integer based on 
rtti if it is valid
for a certain enum. Everything else is curing only symptoms.


GetEnumName from typinfo will already do this for you.
We could add an additional function that just returns true or false.
Something as
function ValueInEnumRange(TypeInfo : PTypeInfo; AValue : Integer) : boolean;

If memory serves correct, this will not work for enums that have explicitly 
assigned
numerical values, though.

Michael.___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Mark Morgan Lloyd

On 01/07/17 22:45, Martok wrote:


This is fine if (and only if) we can be absolutely sure that theEXPRESSIONRESULT always is between 
[low(ENUM)..high(ENUM)] - otherwise %eax inthe example above may be anywhere up to 
high(basetype)'th element of thejumptable, loading an address from anything that happens to be 
located after ourjumptable and jumping there. This is, I cannot stress this enough, 
extremelydangerous! I expect not everyone follows recent security research topics, sojust believe 
me when I say that: if there is any way at all to jump "anywhere",a competent attacker 
will find a way to make that "anywhere" be malicious code.


Is this made safe by always having an else/otherwise? If so, could the 
compiler at least raise a warning if an enumeration was sparse but there 
was no else/otherwise to catch unexpected cases?


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Florian Klämpfl
Am 02.07.2017 um 00:26 schrieb Martok:
> Depending on the number of case labels, tcgcasenode.pass_generate_code
> (ncgset.pas:1070) may choose one of 3 strategies for the "matching" task:
> jumptable, jmptree or linearlist. Jmptree and linearlist are basically "lots 
> of
> if/else", while jumptable is a computed goto. The address of every possible
> label's code block is put into a table that is then indexed in a jmp.
> Example for x86, switching on a byte-sized #EXPRESSIONRESULT:
>mov#EXPRESSIONRESULT,%al
>sub#LOWESTLABEL,%al
>and$ff,%eax
>jmp*0x40c000(,%eax,4)
> 
> What if EXPRESSIONRESULT is larger than the largest case label? To be safe 
> from
> that, the compiler generates a range check, so the example above actually 
> looks
> like this:
>mov#EXPRESSIONRESULT,%al
>sub#LOWESTLABEL,%al
>cmp(#HIGHESTLABEL-#LOWESTLABEL),%al
>ja $#ELSE-BLOCK
>and$ff,%eax
>jmp*0x40c000(,%eax,4)
> 
> This is very fast because modern CPUs will correctly branch-predict the JA and
> start caching the jumptable so we effectively get the check for free, and 
> still

If you read about branch-prediction a little bit more in detail, then you will 
learn that branch
prediction fails to work well as soon as their is more than one cond. jump per 
16 bytes, this
basically applies to all CPUs.

> way faster than the equivalent series of "if/else" of the other strategies on
> old CPUs.
> 
> So far, so good. This is where Delphi is done and just emits the code.
> FPC however attempts one more optimization (at LEVEL1, so at the same level 
> that
> enables jumptables in the first place): if at all possible, the range check is
> omitted (which was probably a reasonable idea back when branches were always
> expensive). The only criterion for that is if the highest and lowest value of
> EXPRESSIONRESULT's type have case labels, ie. if the jumptable will be "full".
> This makes sense for simple basetypes like this:
>   case aByteVar of
> 0: DoSomething;
> // ... many more cases
> 255: DoSomethingLast;
>   end;
> 
> The most likely case where one might encounter this is however is with
> enumerations. Here, the criterion becomes "are the first and last declared
> element present as case labels?", and we're no longer necessarily talking 
> about
> highest and lowest value of the basetype.
> 
> This is fine if (and only if) we can be absolutely sure that the
> EXPRESSIONRESULT always is between [low(ENUM)..high(ENUM)] - otherwise %eax in
> the example above may be anywhere up to high(basetype)'th element of the
> jumptable, loading an address from anything that happens to be located after 
> our
> jumptable and jumping there. This is, I cannot stress this enough, extremely
> dangerous! I expect not everyone follows recent security research topics, so
> just believe me when I say that: if there is any way at all to jump 
> "anywhere",
> a competent attacker will find a way to make that "anywhere" be malicious 
> code.

Indeed. The same problem as with any array on the stack. You have to ensure by 
any means, that the
index of the array is within the declared range of the array. If you have an 
array with an enum as
index, you have to ensure that the enum is within the declared range, else you 
get the same problem
as with the case.

> 
> So, we have a problem here: either the type system is broken because we can 
> put
> stuff in a type without being able to check if it actually belongs there, or
> Tcgcasenode is broken because it (and _only_ it, as far as I can see) wants to
> be clever by omitting an essentially free check for very little benefit.
> I know which interpretation I would choose: the one with the easier fix ;-)

Yes, checking the data. I can easily create a similar problem as above with the 
"range checks" for
the jump table by reading a negative value into the enum. Unfortunately, the 
checks are unsigned ...

The correct solution is to provide a function which checks an integer based on 
rtti if it is valid
for a certain enum. Everything else is curing only symptoms.
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


<    1   2