I notice that the cmem unit does not align memory in the same way as the
default unit - removing the cmem unit makes a factor of two difference
in the speed of some double precision matrix code. (My system is i386).
Inspecting the cmem unit indicates the issue is the extra bytes
allocated for
On 03 Apr 2010, at 13:00, C Western wrote:
I notice that the cmem unit does not align memory in the same way as the
default unit - removing the cmem unit makes a factor of two difference in the
speed of some double precision matrix code. (My system is i386). Inspecting
the cmem unit
C Western wrote:
Inspecting the cmem unit indicates the issue is the extra bytes
allocated for the count - is this really needed? Or do we have to
allocate more bytes for blocks that are a multiple of 8?
Do C memory managers guarantee any alignment anyway? Not for SSE (16
bytes) I'm sure,
In our previous episode, Jonas Maebe said:
Or do we have to allocate more bytes for blocks that are a multiple of 8?
FPC's default memory manager even guarantees 16 byte alignment (for vectors).
So a possible solution is to allocate 16-sizeof(ptruint) bytes more?
for 32-bit that would mean:
On 03 Apr 2010, at 14:09, Micha Nelissen wrote:
Do C memory managers guarantee any alignment anyway? Not for SSE (16 bytes)
I'm sure, but 8 bytes I don't know.
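One way to settle the 8-byte question empirically on a given system is to measure what malloc() actually hands back. The following is only a rough sketch in C, not anything taken from cmem; the helper names pointer_alignment and min_malloc_alignment and the sample count of 64 are made up for illustration. An observed alignment is a property of that particular libc and block size, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SAMPLE_COUNT 64   /* arbitrary number of live blocks to sample */
#define ALIGN_CAP    4096 /* stop probing beyond page-sized alignment  */

/* Largest power of two (capped at ALIGN_CAP) that divides the address. */
static size_t pointer_alignment(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    size_t align = 1;
    while (align < ALIGN_CAP && (addr & align) == 0)
        align <<= 1;
    return align;
}

/* Smallest alignment seen over SAMPLE_COUNT simultaneous malloc(size)
   blocks. The blocks are kept alive until the end so that malloc cannot
   simply return the same address every time. */
static size_t min_malloc_alignment(size_t size)
{
    void *blocks[SAMPLE_COUNT];
    size_t min_align = ALIGN_CAP;
    int n = 0;

    assert(size > 0); /* malloc(0) is implementation-defined */

    for (int i = 0; i < SAMPLE_COUNT; i++) {
        blocks[n] = malloc(size);
        if (blocks[n] == NULL)
            break;
        size_t a = pointer_alignment(blocks[n]);
        if (a < min_align)
            min_align = a;
        n++;
    }
    for (int i = 0; i < n; i++)
        free(blocks[i]);
    return min_align;
}
```

Calling min_malloc_alignment(24) and comparing the result with 8 and 16 shows what the local malloc provides for blocks of that size. If a size field is then stored in front of the user data, the pointer handed to the caller is offset by the size of that field, which on a 32-bit target is 4 bytes.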
From the Linux malloc man page:
For calloc() and malloc(), the value returned is a pointer to the
allocated memory, which is
Marco van de Voort wrote:
In our previous episode, Jonas Maebe said:
Or do we have to allocate more bytes for blocks that are a multiple of 8?
FPC's default memory manager even guarantees 16 byte alignment (for vectors).
So a possible solution is to allocate 16-sizeof(ptruint) bytes more?
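To make the "16-sizeof(ptruint) bytes more" idea concrete, here is a minimal sketch in C of a size-prefixed allocator whose header is padded to 16 bytes. It is not the cmem implementation: the names counted_alloc, counted_size, counted_free and header_padding are hypothetical, and size_t stands in for ptruint. It also swaps in C11 aligned_alloc() for plain malloc(), because padding the header only preserves 16-byte alignment if the underlying block is 16-byte aligned to begin with, which malloc() may not provide on a 32-bit target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_ALIGN 16          /* alignment wanted for SSE vectors       */
#define HEADER_SIZE BLOCK_ALIGN /* size field plus padding, 16 bytes total */

/* Extra bytes on top of the size field: 12 on 32-bit, 8 on 64-bit. */
static size_t header_padding(void)
{
    return HEADER_SIZE - sizeof(size_t);
}

/* Allocate `size` user bytes with the size stored in a padded header.
   Returns a pointer aligned to BLOCK_ALIGN, or NULL on failure. */
static void *counted_alloc(size_t size)
{
    if (size > SIZE_MAX - HEADER_SIZE - BLOCK_ALIGN)
        return NULL; /* total would overflow */

    /* aligned_alloc expects the size to be a multiple of the alignment. */
    size_t total = HEADER_SIZE + size;
    size_t rounded = (total + BLOCK_ALIGN - 1) / BLOCK_ALIGN * BLOCK_ALIGN;

    unsigned char *raw = aligned_alloc(BLOCK_ALIGN, rounded);
    if (raw == NULL)
        return NULL;

    *(size_t *)raw = size;    /* the count lives at the start of the header */
    return raw + HEADER_SIZE; /* user data starts one full header later     */
}

/* Read back the size stored in front of a block from counted_alloc. */
static size_t counted_size(const void *p)
{
    return *(const size_t *)((const unsigned char *)p - HEADER_SIZE);
}

/* Release a block from counted_alloc; NULL is ignored like free(). */
static void counted_free(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - HEADER_SIZE);
}
```

The cost of this layout is the padding returned by header_padding() on every block, which matters most for programs that make many small allocations.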