Great Job Tony!
According to the spec, only the messages below are supported on the 
constant_cache (the hardware path for __constant memory reads).
Message Type
0000: OWord Block Read
0001: Unaligned OWord Block Read
0010: OWord Dual Block Read
0011: DWord Scattered Read
All other encodings are reserved.

For a normal varying load (different work-items access different buffer 
addresses), we would use the DWord Scattered Read (that is, dword_gather in 
gen_insn_selection.cpp). Unfortunately, this message does not support 2/3/4 
DWORD reads. It only supports one simd8/simd16 DWORD read per message. So we 
have to split constant memory load instructions.
There is, however, an opportunity to merge uniform loads from constant memory 
(loads where all work-items read the same address). I think it is on our TODO 
list.
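
To make the message counts concrete, here is a rough sketch in plain C. It is a toy model, not Beignet code: the function names are made up for illustration, and the merged path assumes a uniform load could be served by OWord Block Read messages of one OWord (16 bytes, i.e. 4 DWORDs) each, with suitable alignment.

```c
/* Toy model of send-message counts for one constant-memory load of
 * `dwords` consecutive DWORDs. Hypothetical helpers, not Beignet APIs. */

/* Varying addresses: DWord Scattered Read returns a single DWORD per
 * work-item per message, so an n-DWORD load is split into n sends. */
int sends_varying_constant(int dwords)
{
    return dwords;
}

/* Uniform address (assumption): every work-item reads the same data, so
 * an OWord Block Read could fetch it. One OWord holds 4 DWORDs, and this
 * model uses one OWord per message, so n DWORDs need ceil(n / 4) sends. */
int sends_uniform_merged(int dwords)
{
    const int dwords_per_oword = 4;
    return (dwords + dwords_per_oword - 1) / dwords_per_oword;
}
```

Under this model a 4-DWORD varying load from __constant space costs 4 sends, while the same load with a uniform address could in principle be done with 1. That would be consistent with the higher send count Tony sees with __constant, though the real numbers depend on the kernel.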


From: Beignet [mailto:[email protected]] On Behalf Of Tony 
Moore
Sent: Wednesday, November 26, 2014 5:11 AM
To: [email protected]
Subject: [Beignet] Combine Loads from __constant space

Hello,
I notice that reads are not being combined when I use __constant on a read-only 
kernel buffer. Is this something that can be improved?

In my kernel there are many loads from a read-only data structure. When I use 
the __global specifier for the memory space I see a total of 33 send 
instructions and a runtime of 81ms. When I use the __constant specifier, I see 
43 send instructions and a runtime of 40ms. I'm hoping that combining the loads 
could improve performance further.

thanks!
tony
_______________________________________________
Beignet mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/beignet
