On 13.02.2017 18:01, Marek Olšák wrote:
From: Marek Olšák <marek.ol...@amd.com>

So that we can disable u_vbuf for GL core profiles.

This is a v2 of the previous VI-only patch.
It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.
---
 src/gallium/drivers/radeonsi/si_descriptors.c |  1 -
 src/gallium/drivers/radeonsi/si_pipe.c        | 12 +++++++++---
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 3c98176..8abbf10 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -1014,21 +1014,20 @@ bool si_upload_vertex_buffer_descriptors(struct 
si_context *sctx)
                         * up so that the hardware sees four components as
                         * being inside the buffer if and only if the first
                         * three components are in the buffer.
                         *
                         * Since the offset and stride are guaranteed to be
                         * 4-byte aligned, this alignment will never cross the
                         * winsys buffer boundary.
                         */
                        size3 = (fix_size3 >> (2 * i)) & 3;
                        if (vb->stride && size3) {
-                               assert(offset % 4 == 0 && vb->stride % 4 == 0);
                                assert(size3 <= 2);
                                desc[2] = align(desc[2], size3 * 2);

I think there's still a bug with this. Consider the following setup:

3-element GL_UNSIGNED_BYTE vertex attribute, stride = 3, pointing at an offset e.g. 12 bytes before the end of a buffer for a total of 4 vertices.

We use unaligned 8_8_8_8 to read this vertex attribute, but when the last vertex is read, its last byte (of the extended 8_8_8_8) is out-of-bounds of the buffer. The hardware does bounds checks all-or-nothing (i.e., not per-channel) and drops the entire load as if the vertex data were out-of-bounds.

This is admittedly a bit of a ridiculous case that should not prevent us from dropping u_vbuf. It'll probably need a shader work-around of loading each channel individually, like with your recent GL_DOUBLE patch. At least I can't think of anything else right now.

In any case, the comment above that code snippet is no longer really correct.

Cheers,
Nicolai


                        }
                }

                desc[3] = velems->rsrc_word3[i];

                if (first_vb_use_mask & (1 << i)) {
                        radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
                                              (struct r600_resource*)vb->buffer,
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 8806027..ec324b8 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -353,23 +353,20 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
        case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
        case PIPE_CAP_SM3:
        case PIPE_CAP_SEAMLESS_CUBE_MAP:
        case PIPE_CAP_PRIMITIVE_RESTART:
        case PIPE_CAP_CONDITIONAL_RENDER:
        case PIPE_CAP_TEXTURE_BARRIER:
        case PIPE_CAP_INDEP_BLEND_ENABLE:
        case PIPE_CAP_INDEP_BLEND_FUNC:
        case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
        case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
-       case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
-       case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
-       case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
        case PIPE_CAP_USER_INDEX_BUFFERS:
        case PIPE_CAP_USER_CONSTANT_BUFFERS:
        case PIPE_CAP_START_INSTANCE:
        case PIPE_CAP_NPOT_TEXTURES:
        case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES:
        case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
        case PIPE_CAP_VERTEX_COLOR_CLAMPED:
        case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
         case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
        case PIPE_CAP_TGSI_INSTANCEID:
@@ -455,20 +452,29 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)

        case PIPE_CAP_GLSL_FEATURE_LEVEL:
                if (si_have_tgsi_compute(sscreen))
                        return 450;
                return HAVE_LLVM >= 0x0309 ? 420 :
                       HAVE_LLVM >= 0x0307 ? 410 : 330;

        case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
                return MIN2(sscreen->b.info.max_alloc_size, INT_MAX);

+       case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
+       case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
+       case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
+               /* SI doesn't support unaligned loads.
+                * CIK needs DRM 2.50.0 on radeon. */
+               return sscreen->b.chip_class == SI ||
+                      (sscreen->b.info.drm_major == 2 &&
+                       sscreen->b.info.drm_minor < 50);
+
        /* Unsupported features. */
        case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
        case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
        case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
        case PIPE_CAP_USER_VERTEX_BUFFERS:
        case PIPE_CAP_FAKE_SW_MSAA:
        case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
        case PIPE_CAP_VERTEXID_NOBASE:
        case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
        case PIPE_CAP_TGSI_VOTE:


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to