FWIW, memcpy() vs a for() loop has different semantics with respect to
address alignment. I don't know how much it will matter, but last time I
was reading assembly output, copying int[] via for() loop didn't produce a
codepath for 16-byte aligned addresses (allowing for SSE streaming) while
memcpy
On Don, 2014-02-20 at 10:21 -0800, Tom Stellard wrote:
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 54270cd..9b04e6b 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -233
---
src/gallium/drivers/radeonsi/si_descriptors.c | 2 +-
src/gallium/drivers/radeonsi/si_shader.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
b/src/gallium/drivers/radeonsi/si_descriptors.c
index d45bc80..9b10d01 10064