Ilja Booij <[EMAIL PROTECTED]> said:

> Hi,
>
> Aaron Stone wrote:
>
> > Funny you should mention the fgets() in read_line(), I just noticed that
> > this morning, too! This was my fix:
> >
> >   tmpline = fgets(tmpline, MAX_LINE_SIZE, instream);
> > + if (!tmpline)
> > +         continue;
> >
> > On the next turn around the while loop, feof(instream) stops the loop.
> > Although I'm also thinking that this might be inviting trouble in the
> > form of an infinite loop if there was some other reason for fgets() to
> > return NULL...
> >
> > what was your solution?
>
> tmpline = fgets(tmpline, MAX_LINE_SIZE, instream);
> if (!tmpline)
>         break;
>
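The two fixes quoted above differ only in what happens when fgets() returns NULL for a reason other than end-of-file. With "continue", the loop relies on feof() to stop it, so a read error (where feof() stays false) could spin forever; "break" leaves the loop in both cases. The following is a minimal sketch of that pattern, not the dbmail code: count_input_bytes() is a hypothetical helper, and the value given to MAX_LINE_SIZE here is an assumption.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE_SIZE 1024  /* assumed value; dbmail defines its own */

/* Hypothetical helper: reads instream line by line and adds up the
 * number of bytes seen. Returns 0 on a clean end-of-file and -1 if
 * the stream reported a read error. */
int count_input_bytes(FILE *instream, size_t *total)
{
    char line[MAX_LINE_SIZE];

    *total = 0;

    /* fgets() returns NULL on end-of-file and on error alike, so its
     * result is tested directly rather than looping on feof(). */
    while (fgets(line, sizeof line, instream) != NULL)
        *total += strlen(line);

    /* Once the loop has ended, ferror() tells the two cases apart. */
    return ferror(instream) ? -1 : 0;
}
```

A caller would pass any open stream, for example stdin, and treat a return value of -1 as a failed read rather than as the end of the message.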
Well now, that makes sense :-P

> >
> > The makemd5() function looks like it's on shaky ground to me. Lots of
> > magic numbers being used for things... but it's within the md5 functions
> > that the problem is occurring...
> >
> > Ok, try this: in makemd5(), change result[16] to be twice as large. Don't
> > change the size of the malloc(33), since we can use valgrind to try to
> > catch the memory error to confirm my theory. What I'm thinking is that
> > perhaps the gdm_md5_final() function is producing a larger result due to
> > being 64 bit, and smashing the stack by running off the end of result[16].
> Doesn't help I'm afraid.
>
> Strange thing though. Without -O2, the error seems to be in md5
> functions(), but with -O2, it's in a PQEscapeString (on PostgreSQL of
> course)

In my experience, when a problem moves around due to optimization, it's
either because you're triggering a different segfault somewhere else first,
and your original bug and a new bug have been uncovered... or because
there's an overflow somewhere way out in left field that's randomly
screwing things up just depending upon the layout of the functions in
memory :-\

It occurs to me that valgrind actually emulates a Pentium instruction set.
I might be on a wild goose chase with the 64-bit thing, but valgrind should
either be unable to monitor such a program or just freak out about the
64-bits and the additional x86-64 instructions that it doesn't know
about... or perhaps it just glosses over them like it does with SSE, etc?

>
> Ilja
> >
> > Aaron
> >
> >
> > Ilja Booij <[EMAIL PROTECTED]> said:
> >
> >
> >>Hi all,
> >>
> >>A few days ago Paul F De La Cruz told us that dbmail was crashing on his
> >>Dual Opteron system. He has given me the opportunity to use his system
> >>for debugging.
> >>
> >>I actually found two bugs. One was a bug that I could reproduce on my
> >>own systems.
> >>
> >>It's located in header.c, in the function read_header(). In line 72,
> >>fgets can return NULL. There was no check for this, so the strlen() on
> >>line 73 segfaulted.
> >>
> >>After fixing this bug, I happily, and wrongly ;), concluded that I had
> >>fixed it all! Paul told me that the thing was still segfaulting.. It
> >>turned out to be the makemd5() function from dbmd5.c that eventually
> >>(somewhere deep down in the md5 functions) overwrites some memory it
> >>should not overwrite. The md5 algorithm is pretty unclear to me, so I
> >>cannot find what is going wrong.
> >>
> >>running valgrind on x86 did not reveal any problems.
> >>
> >>Does anyone have an idea how to fix this?
> >>
> >>Ilja
> >>_______________________________________________
> >>Dbmail-dev mailing list
> >>Dbmail-dev@dbmail.org
> >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
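For reference, the magic numbers mentioned in the thread fit together as follows: an MD5 digest is 16 bytes, and its hex form needs 32 characters plus a terminating NUL, hence result[16] and malloc(33). The sketch below names those sizes instead of hard-coding them. It is an illustration only, not the dbmd5.c code: digest_to_hex() and md5_word are hypothetical names. The word-size check reflects one possible cause of 64-bit trouble in older MD5 code (a 32-bit word typedef'd as unsigned long); the thread does not confirm that this is what happens here.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MD5_DIGEST_LEN 16                        /* 128-bit digest */
#define MD5_HEX_LEN    (2 * MD5_DIGEST_LEN + 1)  /* 32 hex chars + NUL */

/* The MD5 rounds operate on 32-bit words. A fixed-width type keeps
 * that true on LP64 systems, where unsigned long is 64 bits wide. */
typedef uint32_t md5_word;

/* Compile-time check: the array size becomes -1, and compilation
 * fails, if md5_word is ever changed to a type that is not 32 bits. */
typedef char md5_word_must_be_32_bits[
    (sizeof(md5_word) * CHAR_BIT == 32) ? 1 : -1];

/* Hypothetical helper: renders a 16-byte digest as a heap-allocated,
 * NUL-terminated lowercase hex string. The caller frees the result.
 * Returns NULL if the allocation fails. */
char *digest_to_hex(const unsigned char digest[MD5_DIGEST_LEN])
{
    char *hex = malloc(MD5_HEX_LEN);
    size_t i;

    if (hex == NULL)
        return NULL;

    /* Each byte becomes two hex digits; snprintf() also writes a NUL
     * after them, which the next iteration overwrites. */
    for (i = 0; i < MD5_DIGEST_LEN; i++)
        snprintf(hex + 2 * i, 3, "%02x", digest[i]);

    return hex;
}
```

With the sizes derived from one constant, enlarging the digest buffer for an experiment cannot silently fall out of step with the length of the hex string.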