Hello, I notice that reads are not being combined when I use __constant on a read-only kernel buffer. Is this something that can be improved?
In my kernel there are many loads from a read-only data structure. When I use the __global specifier for the memory space I see a total of 33 send instructions and a runtime of 81ms. When I use the __constant specifier, I see 43 send instructions and a runtime of 40ms. I'm hoping that combining the loads could improve performance further. thanks! tony
_______________________________________________ Beignet mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/beignet
