On 5/15/2017 9:55 PM, Ben Peart wrote:


On 5/15/2017 8:34 PM, Jeff King wrote:
On Tue, May 16, 2017 at 12:22:14AM +0000, brian m. carlson wrote:

On Mon, May 15, 2017 at 03:13:44PM -0400, Ben Peart wrote:
+    istate->last_update = (time_t)ntohll(*(uint64_t *)index);
+    index += sizeof(uint64_t);
+
+    ewah_size = ntohl(*(uint32_t *)index);
+    index += sizeof(uint32_t);

To answer the question you asked in your cover letter, you cannot write
this unless you can guarantee (((uintptr_t)index & 7) == 0) is true.
Otherwise, this will produce a SIGBUS on SPARC, Alpha, MIPS, and some
ARM systems, and it will perform poorly on PowerPC and other ARM
systems[0].

If you got that pointer from malloc and have only indexed multiples of 8
on it, you're good.  But if you're not sure, you probably want to use
memcpy.  If the compiler can determine that it's not necessary, it will
omit the copy and perform a direct load.

I think get_be32() does exactly what we want for the ewah_size read. For
the last_update one, we don't have a get_be64() yet, but it should be
easy to make based on the 16/32 versions.

Thanks for the pointers.  I'll update this to use the existing get_be32
and have created a get_be64 and will use that for the last_update.


OK, now I'm confused as to the best path for adding a get_be64. This one is trivial:

#define get_be64(p)     ntohll(*(uint64_t *)(p))

but should the unaligned version be:

#define get_be64(p)     ( \
        (*((unsigned char *)(p) + 0) << 56) | \
        (*((unsigned char *)(p) + 1) << 48) | \
        (*((unsigned char *)(p) + 2) << 40) | \
        (*((unsigned char *)(p) + 3) << 32) | \
        (*((unsigned char *)(p) + 4) << 24) | \
        (*((unsigned char *)(p) + 5) << 16) | \
        (*((unsigned char *)(p) + 6) <<  8) | \
        (*((unsigned char *)(p) + 7) <<  0) )

or would it be better to do it like this:

#define get_be64(p)     ( \
        ((uint64_t)get_be32((unsigned char *)(p) + 0) << 32) | \
        ((uint64_t)get_be32((unsigned char *)(p) + 4) <<  0)

or with a static inline function like git_bswap64:

or something else entirely?

I'm not sure why the different styles in this one file and which I should be emulating.



(I note also that time_t is not necessarily 64-bits in the first place,
but David said something about this not really being a time_t).


The in memory representation is a time_t as that is the return value of
time(NULL) but it is converted to/from a 64 bit value when written/read
to the index extension so that the index format is the same no matter
the native size of time_t.

-Peff

Reply via email to