Great job, Tony!

According to the spec, the constant cache (the hardware path for __constant memory reads) supports only the following message types:

  0000: OWord Block Read
  0001: Unaligned OWord Block Read
  0010: OWord Dual Block Read
  0011: DWord Scattered Read

All other encodings are reserved.
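
As a rough illustration of the encodings listed above (this is not Beignet code; the enum and function names are made up for this sketch), they could be written as a small C lookup:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Message-type encodings listed above for the constant cache.
 * The enum and function names are hypothetical, not Beignet's. */
enum cc_msg_type {
    CC_OWORD_BLOCK_READ           = 0x0, /* 0000 */
    CC_UNALIGNED_OWORD_BLOCK_READ = 0x1, /* 0001 */
    CC_OWORD_DUAL_BLOCK_READ      = 0x2, /* 0010 */
    CC_DWORD_SCATTERED_READ       = 0x3  /* 0011 */
};

/* Returns the message name, or NULL for a reserved encoding. */
static const char *cc_msg_name(uint8_t type)
{
    assert(type <= 0xF); /* the encodings above are written as 4-bit values */
    switch (type) {
    case CC_OWORD_BLOCK_READ:           return "OWord Block Read";
    case CC_UNALIGNED_OWORD_BLOCK_READ: return "Unaligned OWord Block Read";
    case CC_OWORD_DUAL_BLOCK_READ:      return "OWord Dual Block Read";
    case CC_DWORD_SCATTERED_READ:       return "DWord Scattered Read";
    default:                            return NULL; /* reserved */
    }
}

/* True only for the four encodings the list above marks as supported. */
static bool cc_msg_supported(uint8_t type)
{
    return cc_msg_name(type) != NULL;
}
```

With this sketch, cc_msg_supported(0x3) is true for the DWord Scattered Read, while any encoding from 0x4 upward is treated as reserved.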

For a normal varying load (different work-items access different buffer addresses), we use the DWord Scattered Read (that is dword_gather in gen_insn_selection.cpp). Unfortunately, this message does not support 2/3/4-DWORD reads; it supports only a single SIMD8/SIMD16 DWORD read. So we have to split constant memory load instructions.

There is, however, an opportunity to merge uniform loads from constant memory. I think that is on our TODO list.

From: Beignet [mailto:[email protected]] On Behalf Of Tony Moore
Sent: Wednesday, November 26, 2014 5:11 AM
To: [email protected]
Subject: [Beignet] Combine Loads from __constant space

Hello,

I notice that reads are not being combined when I use __constant on a read-only kernel buffer. Is this something that can be improved?

In my kernel there are many loads from a read-only data structure. When I use the __global specifier for the memory space, I see a total of 33 send instructions and a runtime of 81ms. When I use the __constant specifier, I see 43 send instructions and a runtime of 40ms. I'm hoping that combining the loads could improve performance further.

thanks!
tony
_______________________________________________
Beignet mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/beignet
