On Mar 18, 2016 11:12 PM, "James K. Lowden" <jklowden at schemamania.org> wrote:
>
> On Fri, 18 Mar 2016 16:33:56 -0600
> Scott Robison <scott at casaderobison.com> wrote:
>
> > I'd rather have code that might use some "undefined behavior" and
> > generates the right answer than code that always conformed to defined
> > behavior yet was logically flawed.
>
> Code that falls under undefined behavior *is* logically flawed, by
> definition. Whether or not it works, it's not specified to. The
> compiler may have generated perfectly correct machine code, but another
> compiler or some future version of your present compiler may not.

Perhaps I should have said "undefined behavior as per the standard".
Code that does what is intended for the intended target environment,
utilizing a specific tool chain, is not logically flawed just because
the standard calls a construct "undefined behavior".

> You might share my beef with the compiler writers, though: lots of
> things that are left undefined shouldn't be.

Not just compiler writers but standards organizations. Lots of overlap
to be sure, but not 100%.

> Because hardware architecture
> varies, some practices that do work and have worked and are expected to
> work on a wide variety of machines are UB. A recent thread on using
> void* for a function pointer is an example: dlsym(2) returns a function
> pointer defined as void*, but the C standard says void* can only refer
> to data, not functions!

So casting between function pointers might arguably be safer if dlsym
returned a generic function pointer type like void(*)(void) instead of
void*, though even then there can be multiple forms of function pointer
(like near vs far pointers in x86 real mode). The usual workaround is
sketched below.
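For what it's worth, this is roughly the idiom everyone uses today.
Just a sketch: "libm.so.6" and "cos" are stand-in names, and with glibc
you'd link with -ldl. The cast is exactly the conversion ISO C leaves
undefined; POSIX, on the other hand, requires it to work.

    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        /* Any shared library and symbol would do here. */
        void *handle = dlopen("libm.so.6", RTLD_LAZY);
        if (handle == NULL) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        /* Converting void* (an object pointer) to a function pointer
         * is UB per ISO C, but defined and required by POSIX. */
        double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
        if (cosine != NULL)
            printf("cos(0.0) = %f\n", cosine(0.0));

        dlclose(handle);
        return 0;
    }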
> Machines exist for which the size of a function pointer is not
> sizeof(void*). Source code that assumes they are the same size is not
> portable to those architectures. Fine. But a particular compiler
> generates code for a particular architecture. On x86 hardware, all
> pointers have always been and will always be the same size. All
> Linux/Posix code relies on that, too, along with a host of other
> assumptions. If that ever changed, a boat load of code would have to be
> changed. Why does the compiler writer feel it's in his interest or
> mine to warn me about that not-happening eventuality? For the machine
> I'm compiling for, the code is *not* in error. For some future machine,
> maybe it will be; let's leave that until then.
>
> I was looking at John Regehr's blog the other day. I think it was
> there that I learned that the practice of dropping UB code on the floor
> has been going on longer than I'd realized; it's just that gcc has been
> more aggressive in recent years. I think it was there I saw this
> construction:
>
>     if (p + n < p)
>         error();
>
> where p is a pointer. On lots of architectures, for large n, p + n can
> be negative. The test works. Or did. The C standard says that's
> UB, though. It doesn't promise the pointer will go negative. It doesn't
> promise it won't. It doesn't promise not to tell your mother about
> it. And, in one recent version, it doesn't compile it. Warning? No.
> Error? No. Machine code? No! It's UB, so no code is generated (ergo,
> no error handling)! Even though the hardware instructions that would
> be -- that used to be -- generated work as implied by the code.

Yes, these are the sorts of things that are frustrating. Though a check
like that can usually be rephrased in terms the standard does define;
see below.
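For instance, when the buffer's bounds are known, the same test can be
written entirely in defined behavior. A sketch with hypothetical names,
assuming p points into a buffer whose one-past-the-end pointer is end:

    #include <stddef.h>

    /* Instead of computing p + n and hoping overflow wraps, compare n
     * against the space actually remaining.  end - p is well defined
     * whenever both pointers refer to the same buffer, so no UB is
     * involved. */
    int would_overflow(const char *p, const char *end, size_t n)
    {
        return n > (size_t)(end - p);
    }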
> Postel's Law is to be liberal in what you accept and conservative in
> what you emit. The compilers have been practicing the opposite,
> thwarting common longstanding practice just because they "can".
>
> Dan Bernstein is calling for a new C compiler that is 100%
> deterministic: no UB. All UB per the standard would be defined by the
> compiler. And maybe a few goodies, like zero-initialized automatic
> (stack) variables.

Neat idea, though undefined behavior isn't always bad. There's an
interesting article at
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
that goes into some of the considerations.

> Such a compiler would enjoy great popularity, even if it imposed, say,
> a 5% performance penalty, because C programmers would have greater
> confidence in their code working as expected. They'd have some
> assurance that the compiler wouldn't cut them off at the knees in its
> next release. As he says, there's no real choice between fast and
> correct.

Except that testing can verify something is correct for a given
environment.

> If the "always defined behavior" compiler got off the ground,
> maybe it would finally drive gcc & friends in the direction of working
> with their users for a change. Or make them irrelevant.

I think they'd continue to be popular with people looking to eke out as
much performance as possible. The signed-overflow rule is a classic
example of UB buying speed; see the sketch below.
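One stock illustration, adapted from the kind of example that LLVM
article discusses (the function and names here are mine): because
signed overflow is undefined, the compiler may assume a signed loop
counter never wraps, which licenses transformations a fully-defined
dialect would have to forgo.

    /* With a signed index, overflow of i would be UB, so the compiler
     * may assume the loop runs exactly n+1 times and, e.g., widen i to
     * a 64-bit register or vectorize.  If i and n were unsigned,
     * i <= n would hold forever when n == UINT_MAX, and the compiler
     * would have to preserve that wraparound case. */
    void scale(float *a, int n, float k)
    {
        for (int i = 0; i <= n; i++)
            a[i] *= k;
    }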