I carefully read the implementation, and it looks good.
But according to Zhigang's experiment, some cases would suffer a performance
regression.
He thought the main problem still lies in the inaccurate latency model.
Also, the pre-schedule may prevent the post-schedule from re-scheduling
because
From: Luo Xionghu
Define a macro to hold the value.
v2: use the same macro in cl_extensions.h; add an include guard to
cl_extensions.h.
Signed-off-by: Luo Xionghu
---
src/cl_device_id.h | 5 -
src/cl_extensions.c | 2 +-
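The two ideas in this commit message, a macro that holds a shared value and an include guard on the header, can be sketched as follows. This is only an illustration: the file name, the macro name EXAMPLE_EXTENSION_COUNT, its value, and the helper extensionCount() are all hypothetical, since the patch body is not shown here.

```cpp
// example_extensions.h (hypothetical file name, for illustration only)
#ifndef EXAMPLE_EXTENSIONS_H   // include guard: the body is skipped on repeated inclusion
#define EXAMPLE_EXTENSIONS_H

// Hypothetical macro; the real name and value are not given in the excerpt.
#define EXAMPLE_EXTENSION_COUNT 4

#endif // EXAMPLE_EXTENSIONS_H

// Any file that includes the header can use the macro instead of
// repeating the literal value.
constexpr int extensionCount() { return EXAMPLE_EXTENSION_COUNT; }
```

With the guard in place, including the header twice in one translation unit does not redefine the macro, and every user sees the same value.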
From: Luo Xionghu
LLVM 3.7 may generate cast instructions such as "%13 = uitofp i1 %12 to float".
When the dst type is float or double, the corresponding
newXXXimmediate function should be called.
Signed-off-by: Luo Xionghu
---
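A minimal sketch of the dispatch described in this commit message, assuming the cast source is a constant i1. The types DstType and Immediate and the helpers newFloatImmediate, newDoubleImmediate and castBoolUitofp are hypothetical stand-ins, not Beignet's actual API.

```cpp
// Hypothetical stand-ins for the backend's immediate types (assumption).
enum class DstType { Float, Double };

struct Immediate {
  DstType type;
  double value; // stored widened to double to keep the sketch small
};

constexpr Immediate newFloatImmediate(float v) { return Immediate{DstType::Float, v}; }
constexpr Immediate newDoubleImmediate(double v) { return Immediate{DstType::Double, v}; }

// uitofp treats the i1 source as unsigned, so true becomes 1.0
// (sitofp would treat it as signed and give -1.0).
// The immediate constructor is chosen from the destination type.
constexpr Immediate castBoolUitofp(bool src, DstType dst) {
  return dst == DstType::Float ? newFloatImmediate(src ? 1.0f : 0.0f)
                               : newDoubleImmediate(src ? 1.0 : 0.0);
}
```

For example, castBoolUitofp(true, DstType::Float) yields a float immediate holding 1.0, while the same call with DstType::Double yields a double immediate.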
The patchset LGTM, thanks.
> -Original Message-
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
> xionghu@intel.com
> Sent: Wednesday, November 11, 2015 15:35
> To: beignet@lists.freedesktop.org
> Cc: Luo, Xionghu
> Subject: [Beignet] [Patch v2 2/2] gbe:
Use the wait function to add a debug function:
void debugwait(void)
This function hangs the GPU until the GPU is reset
or the host sends something to release it.
EXTREMELY DANGEROUS for machines with hangcheck turned off.
v2:
Fix some bugs, add predicate and execwidth settings,
also modify some inst
On 10/11/15 02:37, Zou, Nanhai wrote:
Looks like something related to DRM driver bo management.
Does
export bo_reuse=0
help?
No; if anything, it makes it fill the memory _faster_.
(All of this is in 1.1.1; I haven't tried it in master, at least not
recently.)
Thanks
Zou Nanhai
Using thread_local makes the Android build hard to port. Actually, it just passes
the printf pass's information to GenWrite; you could use unit for it.
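The suggestion above could look roughly like this: the producing pass stores its result in an object that both stages already share, rather than in a thread_local global. The Unit struct, its printfFormats field, and both functions are hypothetical names for illustration and do not reflect Beignet's real classes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the compilation unit shared by all passes.
struct Unit {
  std::vector<std::string> printfFormats; // data produced by the printf pass
};

// The producing pass records its result in the unit
// instead of writing to a thread_local global.
inline void runPrintfPass(Unit &unit, const std::string &fmt) {
  unit.printfFormats.push_back(fmt);
}

// The consuming stage reads from the same unit, so no
// thread-local storage is required.
inline std::size_t countPrintfFormats(const Unit &unit) {
  return unit.printfFormats.size();
}
```

Because the data travels with the unit, the code does not depend on toolchain support for thread_local, which is the portability concern raised for the Android build.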
> -Original Message-
> From: Pan, Xiuli
> Sent: Tuesday, November 10, 2015 13:30
> To: Pan, Xiuli; Song, Ruiling;