Pekka,

Thank you for the explanation.

At clEnqueueNDRangeKernel() time, all work-group details are available.
If I have a local work-group size of (2, 1, 1), am I right that the
kernel is replicated/unrolled 2 * 1 * 1 = 2 times, regardless of the
global dimensions?

Thank you

Chuck

On 8/21/2013 11:49 AM, Pekka Jääskeläinen wrote:
> Hi Chuck,
>
> On 08/21/2013 09:28 PM, Chuck Zhao wrote:
>> - when the kernel is vectorized, how many times does the kernel get
>> replicated ? (through either WorkitemReplicate or WorkitemUnroll).
>> - Who/How is this number decided?
> WorkitemReplicate and WorkitemLoops produce work-group functions for
> known local sizes. This means that they can be generated only
> when the local size is known (at enqueue).
>
> WorkitemReplicate always replicates the work-items fully (i.e., you
> get as many work-item copies as there are work-items in the local
> size). WorkitemLoops unrolls the wiloop only if instructed. See
>
> http://pocl.sourceforge.net/docs/html/using.html#tuning-pocl-behavior
> -->POCL_WORK_GROUP_METHOD
>
> If you want to vectorize work groups, WorkitemReplicate can be used in
> combination with WIVectorize of pocl (branched BBVectorize). This is
> "deprecated" as I do not think it's a good idea to maintain the hackish
> WIVectorize out of LLVM tree (we implemented this for a research
> core and it was never really polished for wider use).
> This style of WG-vectorization is enabled with
> export POCL_WORK_GROUP_METHOD=repl
> export POCL_VECTORIZE_WORK_GROUPS=1
>
> In the future, it is better to use and improve WorkitemLoops+LoopVectorize
> (from upstream LLVM) instead. The wiloop unrolling decision is then
> left to LLVM's LoopVectorizer. Enable this with
>
> export POCL_WORK_GROUP_METHOD=loopvec
>
> If you want to debug why your kernel does not vectorize horizontally,
> you can get some hints by enabling -debug in the opt call of
> pocl-workgroup and grepping for the vectorizer debug printouts.
>



_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel
