It's easy to lose those things (like no warnings). We had all of that
early on in MySQL, but with more pressure and more people it got lost.

And the key to avoiding that in Drizzle is to rely heavily on automatic
checking. We had that early on in MySQL too, but it got lost (like crash-me
and the simple single-threaded benchmarks that were run for every
release).

I especially disliked seeing crash-me go away, since it was a cool idea
that I think should be resurrected for Drizzle.

BTW: for those who missed it, crash-me was a Perl script that tested
limits, like sending a larger and larger query summing numbers with
parentheses, or trying longer and longer table/field names.

It was *VERY* efficient at finding limits and bugs in database kernels
and drivers. As I remember, only Oracle handled it OK on the first try.
We found bugs in every other DB with this little script.

I highly recommend resurrecting it, or possibly building a new one on top
of Philip Stoev's random query generator (which Stewart did some Drizzle
porting work on).

/David

On Fri, 2009-07-31 at 14:16 -0400, Jay Pipes wrote:
> Lest I be accused of merely bashing MySQL engineering, let me explain 
> why I was less-than-slightly amused at the below email thread and 
> forwarded it to this list...
> 
> Early on in the development of Drizzle, Brian, Monty T, Stewart and I 
> spent hundreds of hours fixing nasty, hard-to-diagnose bugs just like 
> these.  Seeing this email brought back some not-so-happy memories of 
> that refactoring work.
> 
> Why does Drizzle not suffer from these kinds of bugs (or at least very 
> few of them) compared to MySQL?  Very simple: our build/compile process.
> 
> Compilers emit warnings for a reason.  When you ignore those warnings 
> (such as one saying that a variable's data could be truncated by a cast), 
> you expose your code to these kinds of bugs.  By enabling -Werror and 
> compiling with VERY strict warnings, these kinds of bugs are caught 
> EARLY by the compiler, which complains that you are trying to do 
> something that may not be a good idea!
> 
> Anyway, I apologize for including private email addresses in the 
> forward, but the lessons of strict compilation's benefits should not be 
> forgotten.
> 
> -jay
> 
> Jay Pipes wrote:
> > LMAO.
> > 
> > -------- Original Message --------
> > Subject: Re: Exercise of the day: calling my_read() ...
> > Sinisa,
> > 
> >> 5.1 uses (correctly) size_t instead of int's ...
> > 
> > Nah, I disagree. It does use size_t but I would argue it doesn't do so
> > correctly.
> > 
> > Count is size_t:
> > 
> >> size_t my_read(File Filedes, uchar *Buffer, size_t Count, myf MyFlags)
> > 
> > It is passed as an argument to read(2) where a size_t is expected.
> > 
> >> if ((readbytes= read(Filedes, Buffer, (uint) Count)) != Count)
> > 
> > Oh wait, we have a size_t and need a size_t but cast it to uint just for
> > the fun of it? This means that reading Count > UINT_MAX bytes will not
> > work even if size_t is wider than uint (which it may well be on a 64-bit
> > system).
> > 
> > On the same line, the result of read(2) is stored in readbytes, which is
> > size_t. However, the result type of read(2) is ssize_t, which is signed,
> > while size_t is unsigned, so a -1 error return ends up stored as SIZE_MAX.
> > At least ssize_t and size_t normally have the same width, so no bits are
> > lost.
> > 
> > By the way, on the line above the read(2) call I find this:
> > 
> >> errno= 0; /* Linux doesn't reset this */
> > 
> > Of course Linux doesn't reset it; no standard-conforming function that
> > uses errno to return an error will reset it...
> > 
> > Anyway, later, readbytes is cast to int, in order to compare to -1
> > (which is an int literal in this case):
> > 
> >> if ((readbytes == 0 || (int) readbytes == -1) && errno == EINTR)
> > 
> > This means that any read of size S from an interrupted call where the low
> > bits of S are all ones (S mod 2^32 == 2^32 - 1 with a 32-bit int) will be
> > interpreted as a failure and the data is lost, so we can definitely not
> > trust it with Count > UINT_MAX even if we remove the blocking cast. Of
> > course this is unlikely to happen in reality because the syscall has to
> > be interrupted for this to trigger.
> > 
> > In another place it is cast to long for printing. Since that is
> > debug code it doesn't really matter:
> > 
> >> DBUG_PRINT("debug", ("my_read() was interrupted and returned %ld",
> >>                     (long) readbytes));
> > 
> > Although it also appears cast to int in debug messages, so we likely
> > don't know what we want anyway:
> > 
> >> DBUG_PRINT("warning",("Read only %d bytes off %lu from %d, errno: %d",
> >>                        (int) readbytes, (ulong) Count, Filedes,
> >>                        my_errno));
> > 
> > In the same debug statement, Count is also cast to ulong for printing
> > even though we cast it to uint for execution and only expect to read a
> > block of size <= INT_MAX as evidenced by the cast of readbytes to int in
> > the same message. This doesn't actually break reading > INT_MAX but it
> > is clear that we don't expect to read blocks that large. (There is a
> > typo there in the English text too.)
> > 
> > Elsewhere, it is not cast to int, but instead compared to (size_t) -1:
> > 
> >> if (readbytes == (size_t) -1)
> > 
> > This works: converting -1 to size_t always yields SIZE_MAX, which is also
> > what a -1 return from read(2) becomes once it is stored in the size_t
> > readbytes. It is still inconsistent with the (int) cast used for the same
> > check above.
> > 
> > ++
> > 
> > I would not trust this function with anything > INT_MAX, and I'm not
> > even sure I would trust it with anything > 50 or so.
> > 
> > 
> > Daniel
> > 
> 
> 
> _______________________________________________
> Mailing list: https://launchpad.net/~drizzle-discuss
> Post to     : [email protected]
> Unsubscribe : https://launchpad.net/~drizzle-discuss
> More help   : https://help.launchpad.net/ListHelp

