On Wed, May 20, 2015 at 02:01:44PM +0200, Bernd Schmidt wrote:
> To implement OpenACC vector-single mode, we need to ensure that only one
> thread out of the group representing a worker executes. The others skip
> computations but follow along the CFG, so the results of conditional branch
> decisions must be broadcast to them.
> 
> The patch below adds a new builtin and nvptx pattern to implement that
> broadcast functionality.

So, is the goal here that threads in the warp other than the 0th
do nothing except in vectorized regions, where all the threads
in the warp participate in the vectorization?
If so, for OpenMP, should the whole warp be treated as a single thread
(so that omp_get_thread_num () would be tid.x >> 5)?
And in that case, is the GCC vectorizer going to be taught about this?
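
To make the question concrete, here is a minimal host-side sketch in plain C
(not nvptx code and not the builtin from the patch). It assumes a warp of 32
lanes, which is what the ">> 5" above implies. The names WARP_SIZE,
logical_thread_num, broadcast_from_lane0 and simulate_vector_single are
hypothetical and exist only for illustration. Lane 0 evaluates the branch
condition, the result is copied to every lane so that all lanes follow the
same CFG path, and only lane 0 does the actual computation.

```c
#include <stdbool.h>

/* Assumption: 32 lanes per warp, matching the ">> 5" in the text. */
#define WARP_SIZE 32u

/* Hypothetical helper: the thread number a program would see if a
   whole warp were treated as one OpenMP thread.  */
unsigned logical_thread_num(unsigned tid_x)
{
    return tid_x >> 5;  /* same as tid_x / WARP_SIZE */
}

/* Copy lane 0's branch decision into every lane's slot.  This stands
   in for the broadcast; it is a simulation, not the real builtin.  */
void broadcast_from_lane0(bool lane_pred[WARP_SIZE], bool lane0_value)
{
    for (unsigned lane = 0; lane < WARP_SIZE; lane++)
        lane_pred[lane] = lane0_value;
}

struct warp_trace {
    unsigned lanes_taking_branch;  /* lanes that followed the taken edge */
    unsigned lanes_computing;      /* lanes that did the guarded work */
};

/* Simulate one warp in vector-single mode for the branch "if (x > 0)".  */
struct warp_trace simulate_vector_single(int x)
{
    bool taken[WARP_SIZE] = { false };
    struct warp_trace trace = { 0, 0 };

    /* Only lane 0 evaluates the condition.  */
    bool lane0_decision = (x > 0);
    broadcast_from_lane0(taken, lane0_decision);

    for (unsigned lane = 0; lane < WARP_SIZE; lane++) {
        if (taken[lane]) {
            trace.lanes_taking_branch++;  /* every lane follows the CFG */
            if (lane == 0)
                trace.lanes_computing++;  /* but only lane 0 computes */
        }
    }
    return trace;
}
```

Under these assumptions, tid.x values 0..31 all map to logical thread 0 and
32..63 map to logical thread 1. For a taken branch, all 32 lanes follow the
same path while exactly one lane performs the computation.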

        Jakub
