--On June 19, 2009 4:12:40 PM -0400 Michael Bacon <[email protected]> wrote:

(Dropping info-cyrus on the followup)

--On June 19, 2009 3:43:43 PM -0400 Michael Bacon <[email protected]>
wrote:

--On June 19, 2009 9:57:03 AM +1000 Bron Gondwana <[email protected]>
wrote:

On Thu, Jun 18, 2009 at 05:44:19PM -0400, Michael Bacon wrote:
Another one stomped here.  This time, it's a 32/64 bit issue.  myinit
in cyrusdb_skiplist.c assumes that time_t is 4 bytes long, and writes
out that many bytes from the current timestamp when creating
$confdir/db/skipstamp.

Actually, reading the code, that's not strictly true:

       a = htonl(global_recovery);
-       if (r != -1) r = write(fd, &a, 4);
+       if (r != -1) r = write(fd, &a, sizeof(time_t));

It writes "a", which is the result of calling htonl on global_recovery.

If htonl isn't returning a 32 bit value of the lower order bytes of the
value that it's given, then this bug is going to be causing a LOT more
problems than just this.  We assume this works in quite a few other
places in the code, including the timestamp value in the skiplist header
itself, and in places throughout the mailbox code too.

"htonl" => "host to net long" by my reading.  There's also htonll for 64
bit values.  Is your platform creating net longlongs?

Good question -- this may be a Solaris bug after all.  Solaris's man page
clearly states that htonl is supposed to return a uint32_t, but looking
at sys/byteorder.h, that's, um, not being enforced...

# if defined(_BIG_ENDIAN) && !defined(ntohl) && !defined(__lint)
/* big-endian */
# define ntohl(x)        (x)
# define ntohs(x)        (x)
# define htonl(x)        (x)
# define htons(x)        (x)

# elif !defined(ntohl) /* little-endian */

I think I may give our friends out in CA a call here...

I've put in a ticket with Sun on this, but in thinking about this, I'm
pretty sure this kind of definition is widespread (on our Linux 2.6.9
login cluster it's the same story in netinet/in.h), so while I can point
it out to Sun, expecting strong typing to come out of the byteorder
functions is probably a general mistake.  Since the functions explicitly
want a uint32_t or a uint16_t as the argument, the 100% proper thing to
do would seem to me to do an explicit typecast in the argument to these
functions.  If it's just a null macro, that solves the problem, and if
it's a real function, it's good form anyway.

Okay, so here's a patch against current CVS plus Bron's last patch.  It
converts everything over to uint32_t and does explicit typecasting on
the arguments to all byteorder calls.  This passes my basic,
non-production tests, but it may not be the way folks want to proceed,
so I'll float it out there for feedback.

I realize this requires C99 spec compliance, but is that still problematic in 2009?

-Michael



Attachment: skiplist_uint32.patch
Description: Binary data
