Hi everyone,

Partly from reading around the Internet and partly from seeing what GCC
occasionally does, I've realised that there's a bit of a problem with false
dependencies on CPU registers. It arises when, say, a 32-bit register is
allocated, used and deallocated, and then a smaller part of it, say 16 bits,
is used later.

e.g.

// allocate %eax
xorl %eax,%eax
...
movl %eax,(mem1)
// deallocate %eax
// allocate %ax
movw $2000,%ax
...

The issue here is that the "movw" instruction incurs an expensive partial-write
penalty: the upper 16 bits of %eax have to be preserved (and may be non-zero),
so the write depends on the previous value of %eax even though that value is
no longer needed. The way to mitigate this is to expand the "movw" instruction
to a 32-bit "movl" instruction, or to use "movzwl" when the source is another
register or a memory read. That way, the entire register is overwritten and
the CPU's register renaming scheme can be fully utilised.
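
To make the rewrite rule above concrete, here is a rough sketch of it as a toy
Python function working on AT&T-syntax strings. It is only an illustration and
not how the compiler represents instructions: the name widen_movw and the
WIDE table are made up for this example, and the rewrite is only valid when
the upper 16 bits of the destination are known to be dead.

```python
import re

# 16-bit register name -> enclosing 32-bit register (AT&T syntax).
# Hypothetical helper table for this sketch only.
WIDE = {"%ax": "%eax", "%bx": "%ebx", "%cx": "%ecx", "%dx": "%edx",
        "%si": "%esi", "%di": "%edi", "%bp": "%ebp", "%sp": "%esp"}

def widen_movw(instr: str) -> str:
    """Return a 32-bit equivalent of 'movw src,dst' when dst is a register.

    Assumes the caller has already established that the upper 16 bits of
    the destination register are not needed afterwards.
    """
    m = re.fullmatch(r"\s*movw\s+(\S+)\s*,\s*(%\w+)\s*", instr)
    if m is None or m.group(2) not in WIDE:
        return instr  # not a 16-bit register write; leave it untouched
    src, dst = m.group(1), WIDE[m.group(2)]
    if src.startswith("$"):
        return f"movl {src},{dst}"   # immediate source: widen the move
    return f"movzwl {src},{dst}"     # register or memory source: zero-extend
```

With this, "movw $2000,%ax" becomes "movl $2000,%eax", while a store such as
"movw %ax,(mem1)" is returned unchanged because it does not write a register.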

The thing is, there is no straightforward way to do this other than to force
all writes to be at least 32-bit, which is horribly wasteful. One way around
it is to watch the tai_regalloc entries more carefully to see exactly which
sub-register is in use, but there are a few instances where register
allocation isn't done properly, or where the full 32-bit register is
allocated. I'm thinking up ideas for how to approach this problem.
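
As a very rough sketch of the "watch the allocation entries" idea, here is a
toy model in Python. The ("alloc", reg) and ("dealloc", reg) tuples merely
stand in for tai_regalloc entries, and widen_safe_moves is a hypothetical
name; none of this is the compiler's actual data structure or API. It widens
an immediate "movw" only when the 16-bit sub-register is allocated and the
enclosing 32-bit register is not, and it conservatively leaves everything
else alone.

```python
# Hypothetical mapping from 16-bit registers to their 32-bit parents.
WIDE = {"%ax": "%eax", "%bx": "%ebx", "%cx": "%ecx", "%dx": "%edx"}

def widen_safe_moves(stream):
    """Widen 'movw $imm,%r16' where only the 16-bit sub-register is live.

    'stream' mixes instruction strings with ("alloc", reg) and
    ("dealloc", reg) marker tuples (an assumed stand-in for tai_regalloc).
    """
    allocated = set()  # register names currently marked as allocated
    out = []
    for item in stream:
        if isinstance(item, tuple):
            kind, reg = item
            if kind == "alloc":
                allocated.add(reg)
            else:
                allocated.discard(reg)
            out.append(item)
            continue
        op, _, operands = item.partition(" ")
        src, _, dst = operands.rpartition(",")
        full = WIDE.get(dst)
        # Rewrite only if the sub-register is allocated and the full
        # 32-bit register is not; otherwise the upper bits may matter.
        if (op == "movw" and src.startswith("$") and full is not None
                and dst in allocated and full not in allocated):
            item = f"movl {src},{full}"
        out.append(item)
    return out

# The sequence from the example earlier in this mail.
example = [("alloc", "%eax"), "xorl %eax,%eax", "movl %eax,(mem1)",
           ("dealloc", "%eax"), ("alloc", "%ax"), "movw $2000,%ax"]
rewritten = widen_safe_moves(example)
```

Here the last entry of "rewritten" is "movl $2000,%eax". If the full %eax
were still marked as allocated, the "movw" would be left as it is, which is
exactly the case that makes this approach less effective in practice.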

Gareth aka Kit
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
