Re: [Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.
On Wed, Apr 20, 2016 at 1:29 AM, Bas Nieuwenhuizen wrote:
> I retract patch 1 and 2. Large scratch buffers are nice, but the
> hardware only supports a 32-bit offset into it.

You can still allocate a smaller scratch buffer. This should limit the
number of waves in hw. TMPRING_SIZE.WAVES should be adjusted
accordingly.

We can also decrease the size of scratch based on the max number of
waves with the given register and LDS usage. si_shader_dump_stats
calculates the max number of waves. You just need to:
- set TMPRING_SIZE.WAVES = MIN2(32, max_simd_waves * 4)
- allocate scratch for TMPRING_SIZE.WAVES waves per CU (instead of 32)
- si_context::scratch_waves can be moved to si_shader in some form
  (e.g. max_scratch_bytes_per_cu)

Marek
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
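The sizing steps Marek lists can be sketched in plain C as follows. This is only an illustration under stated assumptions, not driver code: the function names and the scratch_bytes_per_wave parameter are hypothetical, and MIN2 is redefined locally so the snippet stands alone. Only the cap of 32, the factor of 4, and the name max_simd_waves come from the message above.

```c
#include <stdint.h>

/* Local stand-in for the MIN2 macro used in the message above. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper: number of waves per CU to size scratch for,
 * following "MIN2(32, max_simd_waves * 4)" from the message. */
static unsigned scratch_waves_per_cu(unsigned max_simd_waves)
{
	return MIN2(32u, max_simd_waves * 4u);
}

/* Hypothetical helper: scratch bytes needed per CU, computed in 64 bits
 * so the multiplication itself cannot wrap around. */
static uint64_t scratch_bytes_per_cu(unsigned max_simd_waves,
				     unsigned scratch_bytes_per_wave)
{
	return (uint64_t)scratch_waves_per_cu(max_simd_waves) *
	       scratch_bytes_per_wave;
}
```

With these helpers, a shader limited to 5 waves per SIMD would size scratch for 20 waves per CU instead of 32, which is the saving the message describes.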
Re: [Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.
On 2016-04-20 09:29, Bas Nieuwenhuizen wrote:
> I retract patch 1 and 2. Large scratch buffers are nice, but the
> hardware only supports a 32-bit offset into it.
>
> - Bas
>
> On Wed, Apr 20, 2016 at 12:50 AM, Bas Nieuwenhuizen wrote:
>> Use the CE suballocator instead of the normal one as the usage
>> is most similar to the CE, i.e. only read and written on GPU
>> and not mapped to CPU.
>>
>> Signed-off-by: Bas Nieuwenhuizen
>> ---
>>  src/gallium/drivers/radeonsi/si_cp_dma.c | 27 ++++++++++-----------------
>>  1 file changed, 10 insertions(+), 17 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c b/src/gallium/drivers/radeonsi/si_cp_dma.c
>> index 38e0ee6..264789d 100644
>> --- a/src/gallium/drivers/radeonsi/si_cp_dma.c
>> +++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
>> @@ -222,31 +222,24 @@ static void si_clear_buffer(struct pipe_context *ctx, struct pipe_resource *dst,
>>   */
>>  static void si_cp_dma_realign_engine(struct si_context *sctx, unsigned size)
>>  {
>> +

trivial spurious '\n'
Re: [Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.
On Wed, Apr 20, 2016 at 2:13 AM, Nicolai Hähnle wrote:
> On 19.04.2016 18:29, Bas Nieuwenhuizen wrote:
>>
>> I retract patch 1 and 2. Large scratch buffers are nice, but the
>> hardware only supports a 32-bit offset into it.
>
>
> Do you mean patch 2 and 3? Do you plan alternative patches to error out when
> there is an integer overflow? That's still kind of important...
>
> Cheers,
> Nicolai

Really, patch 1 and 2. I did patch 1 only so I did not need to make the
whole cp_dma work with pb_buffer. Although I guess patch 3 can best be
merged with the yet-to-be-written patch that checks that the resulting
size fits in 32 bits.

- Bas
Re: [Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.
On 19.04.2016 17:50, Bas Nieuwenhuizen wrote:
> Use the CE suballocator instead of the normal one as the usage
> is most similar to the CE, i.e. only read and written on GPU
> and not mapped to CPU.

The scratch buffer is also only read and written by the GPU...
Re: [Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.
On 19.04.2016 18:29, Bas Nieuwenhuizen wrote:
> I retract patch 1 and 2. Large scratch buffers are nice, but the
> hardware only supports a 32-bit offset into it.

Do you mean patch 2 and 3? Do you plan alternative patches to error out
when there is an integer overflow? That's still kind of important...

Cheers,
Nicolai
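The kind of check asked about here (erroring out when the scratch size no longer fits the 32-bit offset) could look roughly like the sketch below. It is an assumption-laden illustration: the helper name and its parameters are hypothetical, and the thread does not show how the driver actually derives the total size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: multiply the per-wave scratch size by the number
 * of waves and report whether the result still fits in 32 bits.
 * On success the total is stored in *out_size; on overflow it returns
 * false and leaves *out_size untouched. */
static bool scratch_size_fits_32bit(uint64_t bytes_per_wave,
				    uint64_t num_waves,
				    uint32_t *out_size)
{
	/* Divide instead of multiplying so the test itself cannot overflow. */
	if (num_waves != 0 && bytes_per_wave > UINT32_MAX / num_waves)
		return false;

	*out_size = (uint32_t)(bytes_per_wave * num_waves);
	return true;
}
```

A caller would refuse to create or bind the shader (or fall back to fewer waves) when the function returns false, rather than silently truncating the size.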
Re: [Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.
I retract patch 1 and 2. Large scratch buffers are nice, but the
hardware only supports a 32-bit offset into it.

- Bas
[Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.
Use the CE suballocator instead of the normal one as the usage
is most similar to the CE, i.e. only read and written on GPU
and not mapped to CPU.

Signed-off-by: Bas Nieuwenhuizen
---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 38e0ee6..264789d 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -222,31 +222,24 @@ static void si_clear_buffer(struct pipe_context *ctx, struct pipe_resource *dst,
  */
 static void si_cp_dma_realign_engine(struct si_context *sctx, unsigned size)
 {
+
 	uint64_t va;
 	unsigned dma_flags = 0;
 	unsigned scratch_size = CP_DMA_ALIGNMENT * 2;
+	unsigned offset;
+	struct r600_resource *tmp_buf;
 
 	assert(size < CP_DMA_ALIGNMENT);
 
-	/* Use the scratch buffer as the dummy buffer. The 3D engine should be
-	 * idle at this point.
-	 */
-	if (!sctx->scratch_buffer ||
-	    sctx->scratch_buffer->b.b.width0 < scratch_size) {
-		r600_resource_reference(&sctx->scratch_buffer, NULL);
-		sctx->scratch_buffer =
-			si_resource_create_custom(&sctx->screen->b.b,
-						  PIPE_USAGE_DEFAULT,
-						  scratch_size);
-		if (!sctx->scratch_buffer)
-			return;
-		sctx->emit_scratch_reloc = true;
-	}
+	u_suballocator_alloc(sctx->ce_suballocator, scratch_size, &offset,
+			     (struct pipe_resource**)&tmp_buf);
+	if (!tmp_buf)
+		return;
 
-	si_cp_dma_prepare(sctx, &sctx->scratch_buffer->b.b,
-			  &sctx->scratch_buffer->b.b, size, size, &dma_flags);
+	si_cp_dma_prepare(sctx, &tmp_buf->b.b,
+			  &tmp_buf->b.b, size, size, &dma_flags);
 
-	va = sctx->scratch_buffer->gpu_address;
+	va = tmp_buf->gpu_address + offset;
 	si_emit_cp_dma_copy_buffer(sctx, va, va + CP_DMA_ALIGNMENT, size,
 				   dma_flags);
 }
-- 
2.8.0
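For readers skimming the last hunk: a suballocation is described by a parent buffer plus a byte offset, so the address used for the dummy copy becomes parent address plus offset, and the two copy addresses sit CP_DMA_ALIGNMENT bytes apart inside the 2 * CP_DMA_ALIGNMENT region. The standalone sketch below only models that arithmetic. The struct, the function name, and the value 32 for CP_DMA_ALIGNMENT are assumptions made for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Assumed value, for illustration only. */
#define CP_DMA_ALIGNMENT 32

/* Hypothetical record of the two addresses passed to the dummy copy. */
struct realign_copy {
	uint64_t first_va;  /* va in the patch */
	uint64_t second_va; /* va + CP_DMA_ALIGNMENT in the patch */
	unsigned size;      /* bytes copied, expected < CP_DMA_ALIGNMENT */
};

/* Model of "va = tmp_buf->gpu_address + offset" followed by the copy
 * between the two halves of the suballocated region. */
static struct realign_copy realign_copy_addrs(uint64_t parent_gpu_address,
					       unsigned offset, unsigned size)
{
	uint64_t va = parent_gpu_address + offset;
	struct realign_copy c = { va, va + CP_DMA_ALIGNMENT, size };
	return c;
}
```

Because size is asserted to be smaller than CP_DMA_ALIGNMENT, the two ranges never overlap and both stay inside the allocation.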