[Mesa-dev] [PATCH] nvc0: drop image binding from BGR10A2 format
Fixes a bunch of new CTS pbo tests that use that as an output format,
which the state tracker converts into buffer image writes. No part of
the driver is ready for BGR10A2. It could probably be enabled on
Maxwell+, but seems unnecessary.

Signed-off-by: Ilia Mirkin
---
 src/gallium/drivers/nouveau/nv50/nv50_formats.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_formats.c b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
index 0ead8ac2e1e..9f8faf768dd 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_formats.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
@@ -154,7 +154,7 @@ const struct nv50_format nv50_format_table[PIPE_FORMAT_COUNT] =
    C4(A, R10G10B10A2_UNORM, RGB10_A2_UNORM, R, G, B, A, UNORM, A2B10G10R10, TD),
    F3(A, R10G10B10X2_UNORM, RGB10_A2_UNORM, R, G, B, xx, UNORM, A2B10G10R10, T),
-   C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, IB),
+   C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, TB),
    F3(A, B10G10R10X2_UNORM, BGR10_A2_UNORM, B, G, R, xx, UNORM, A2B10G10R10, T),
    C4(A, R10G10B10A2_SNORM, NONE, R, G, B, A, SNORM, A2B10G10R10, T),
    C4(A, B10G10R10A2_SNORM, NONE, B, G, R, A, SNORM, A2B10G10R10, T),
--
2.16.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 104665] r600: computer shaders break Bioshock on barts (bisected)
https://bugs.freedesktop.org/show_bug.cgi?id=104665

--- Comment #3 from Dave Airlie ---
OK, reproduced it on Cayman; it needed a 32-bit build. It dies in a compute shader invocation, all right.
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #13 from Amarildo ---
Usually I test all my games on max graphical settings, that includes X-Plane, Project Cars 2, GTA V, Far Cry 4, and so on, all on 1080p with no crashes and usually close to 60 FPS. VRAM surely runs high, so I tend to use a combination of High/Very High and Ultra settings to get constant 60 FPS on these games.

While testing F1 2017 on Linux, the game ran fine at "High" settings while driving by myself. I also tested it with all graphical options on the lowest possible and 720p, but it still crashes.

Maybe Vulkan handles VRAM differently? Surely the game shouldn't use 2 GB of VRAM while all graphics settings are on "Ultra Low" and 720p. Besides, the R9 270X ran the game fine while using the amdgpu-pro driver, as per Phoronix results.

So I personally don't see VRAM getting full, but I'll still try to find a way of monitoring VRAM usage and will try the game again.
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #12 from Dave Airlie ---
-2 looks like out of device memory; you might have the game settings up too high, or too high a resolution.
Re: [Mesa-dev] [PATCH 2/2 v2] nir: mako all the intrinsics
On Wed, Mar 28, 2018 at 12:24 PM, Mark Janeswrote: > Rob Herring writes: > >> On Wed, Mar 28, 2018 at 10:18 AM, Rob Clark wrote: >>> On Wed, Mar 28, 2018 at 10:43 AM, Rob Herring wrote: On Sun, Mar 25, 2018 at 1:10 PM, Rob Clark wrote: > I threatened to do this a long time ago.. I probably *should* have done > it a long time ago when there where many fewer intrinsics. But the > system of macro/#include magic for dealing with intrinsics is a bit > annoying, and python has the nice property of optional fxn params, > making it possible to define new intrinsics while ignoring parameters > that are not applicable (and naming optional params). And not having to > specify various array lengths explicitly is nice too. > > I think the end result makes it easier to add new intrinsics. > > v2: couple small fixes found with a test program to compare the old and > new tables > v3: misc comments, don't rely on capture=true for meson.build, get rid > of system_values table to avoid return value of intrinsic() and > *mostly* remove side-effects, add autotools build support > > Signed-off-by: Rob Clark > --- > So, new scheme is, I think, a reasonable compromise between keeping the > python "clean" and keeping the intrinsic declarations easy to follow. > It still has the side-effect that intrinsic() adds to the table, but > drops the separate system_values table so that intrinsic() doesn't > return a value. The alternative would require the helper for various > specialized intrinsic categories to be declared far from where they are > used, which is, I think, suboptimal. And it keeps intrinsic() and > various wrappers pretty straightforward, so I don't think this should > ever pose a problem for refactoring (and certainly less of a problem > than the previous solution using cpp macros, so regardless of what your > opinion about the py code, I guess anyone could agree that this is an > improvement over the current state ;-)) > > Also added autotools build support. 
Sorry scons and android. (Are we > ready to drop either of these in favor of nir?) You mean meson? For Android, no. I don't see that happening anytime soon. I looked into it some by having a prebuilt target in Android.mk that calls meson. The problem is getting all the Android build environment such as include paths out of Android build system and passed into meson. I don't know how to do that in a way that is not manual and fragile. It looks like you'd just need to do some copy-n-paste of rules for Android. And you know you can push an 'android/*' branch to trigger an Android build of mesa? >>> >>> no, I didn't realize that.. on the main git tree? >> >> Yep. No one uses it AFAICT. > > I was told that this mechanism was not useful because it builds with > -Werror. Is that still true? No. I believe I disabled that in master because mesa certainly can't build with -Werror. > Clayton implemented a buildtest for android within the i965 CI, so > anyone testing there will be notified when they break android. We are > waiting on some additional hardware before enabling it for developer > branches. I imagine you don't build all drivers or build for arm/arm64, so less useful for others. Rob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 104665] r600: computer shaders break Bioshock on barts (bisected)
https://bugs.freedesktop.org/show_bug.cgi?id=104665

--- Comment #2 from Dave Airlie ---
Do you have version overrides, and does it still happen? I got the game, and it doesn't seem to be killing my Redwood.
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #11 from Amarildo ---
Firejail isn't the issue. Ran Steam outside of it.

Then I tried compiling mesa with the above suggestion. It says "configure: error: --enable-llvm is required when building radv", and when I enable it, it says "configure: error: --enable-llvm selected but llvm-config is not found", and no such file exists.

This is so frustrating. Gaming on Linux with Pitcairn has never been easy, and AMD's support of my card has always been lacking, delayed, and problematic. Sadly, it's times like these that make me wanna go to Windows: everything "just works" there, drivers and games are actually well tested, and regular users never need to debug and compile programs themselves to test stuff.
[Mesa-dev] [PATCH] glapi: define GL_API to be GLAPI in glapi_dispatch.c
This fixes a Windows build warning where the prototypes for the ES
function in the header file don't match the prototypes in this file
because the GL_API and GLAPI macros are defined differently.
---
 src/mapi/glapi/glapi_dispatch.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/mapi/glapi/glapi_dispatch.c b/src/mapi/glapi/glapi_dispatch.c
index 3239523..f0a8c36 100644
--- a/src/mapi/glapi/glapi_dispatch.c
+++ b/src/mapi/glapi/glapi_dispatch.c
@@ -97,6 +97,11 @@
  */
 #include
+
+/* Use the GLAPI annotation from GL/gl.h, not GL_API from GLES/gl.h */
+#undef GL_API
+#define GL_API GLAPI
+
 GL_API void GL_APIENTRY glClearDepthf (GLclampf depth);
 GL_API void GL_APIENTRY glClipPlanef (GLenum plane, const GLfloat *equation);
 GL_API void GL_APIENTRY glFrustumf (GLfloat left, GLfloat right, GLfloat bottom, GLfloat top, GLfloat zNear, GLfloat zFar);
--
2.7.4
[Mesa-dev] [PATCH] gl.h: remove stale comment, trailing whitespace
---
 include/GL/gl.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/GL/gl.h b/include/GL/gl.h
index 5b28480..f5bac36 100644
--- a/include/GL/gl.h
+++ b/include/GL/gl.h
@@ -47,9 +47,9 @@
 #define GLAPI __declspec(dllimport)
 # else /* for use with static link lib build of Win32 edition only */
 #define GLAPI extern
-# endif /* _STATIC_MESA support */
+# endif
 # if defined(__MINGW32__) && defined(GL_NO_STDCALL) || defined(UNDER_CE) /* The generated DLLs by MingW with STDCALL are not compatible with the ones done by Microsoft's compilers */
-#define GLAPIENTRY
+#define GLAPIENTRY
 # else
 #define GLAPIENTRY __stdcall
 # endif
--
2.7.4
[Mesa-dev] [PATCH] intel/compiler: fix return statement warning in brw_regs_negative_equal()
Silence a gcc warning about missing return value in non-void function.
For some reason, gcc 5.4.0 (at least) can't deduce that all else/if
cases return a value.
---
 src/intel/compiler/brw_reg.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index 68158cc..0d2900a 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -302,6 +302,8 @@ brw_regs_negative_equal(const struct brw_reg *a, const struct brw_reg *b)
       return brw_regs_equal(, b);
    }
+
+   return false; /* silence compiler warning */
 }
 struct brw_indirect {
--
2.7.4
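A reduced illustration of the pattern the patch above deals with: an if / else-if chain in which every reachable branch returns, followed by a fallback return of the kind the patch adds. This is only a sketch; the enum, struct and function names are hypothetical and are not taken from brw_reg.h, and the comparison logic is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from brw_reg.h. */
enum reg_kind { REG_KIND_IMMEDIATE, REG_KIND_GENERAL };

struct reg {
   enum reg_kind kind;
   int value;
   bool negate;
};

/* Returns true when a is the negation of b. */
static bool
regs_negative_equal(const struct reg *a, const struct reg *b)
{
   assert(a != NULL && b != NULL);

   if (a->kind == REG_KIND_IMMEDIATE) {
      /* Immediates: compare the negated value directly. */
      return b->kind == REG_KIND_IMMEDIATE && a->value == -b->value;
   } else if (a->kind == REG_KIND_GENERAL) {
      /* Registers: same contents, opposite negate flag. */
      return b->kind == REG_KIND_GENERAL &&
             a->value == b->value &&
             a->negate != b->negate;
   }

   /* Both enum values are handled above, but the compiler cannot prove
    * that control never gets here, so give it an explicit return. */
   return false;
}
```

Without the final return, a compiler may report that control reaches the end of a non-void function even though every intended case is covered.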
Re: [Mesa-dev] [PATCH v4 12/18] i965/blorp: Update the fast clear color address.
On March 27, 2018 13:23:46 Rafael Antognolliwrote: On Tue, Mar 27, 2018 at 11:16:37AM -0700, Jason Ekstrand wrote: On Thu, Mar 8, 2018 at 8:49 AM, Rafael Antognolli wrote: On Gen10, whenever we do a fast clear, blorp will update the clear color state buffer for us, as long as we set the clear color address correctly. However, on a hiz clear, if the surface is already on the fast clear state we skip the actual fast clear operation and, before gen10, only updated the miptree. On gen10+ we need to update the clear value state buffer too, since blorp will not be doing a fast clear and updating it for us. v4: - do not use clear_value_size in the for loop - Get the address of the clear color from the aux buffer or the clear_color_bo, depending on which one is available. - let core blorp update the clear color, but also update it when we skip a fast clear depth. Signed-off-by: Rafael Antognolli --- src/mesa/drivers/dri/i965/brw_blorp.c | 11 +++ src/mesa/drivers/dri/i965/brw_clear.c | 22 ++ 2 files changed, 33 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/ i965/brw_blorp.c index ffd957fb866..914aeeace7a 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp.c +++ b/src/mesa/drivers/dri/i965/brw_blorp.c @@ -185,6 +185,17 @@ blorp_surf_for_miptree(struct brw_context *brw, surf->aux_addr.buffer = aux_buf->bo; surf->aux_addr.offset = aux_buf->offset; + + if (devinfo->gen >= 10) { + /* If we have a CCS surface and clear_color_bo set, use that bo as + * storage for the indirect clear color. Otherwise, use the extra + * space at the end of the aux_buffer. 
+ */ + surf->clear_color_addr = (struct blorp_address) { +.buffer = aux_buf->clear_color_bo, +.offset = aux_buf->clear_color_offset, + }; + } } else { surf->aux_addr = (struct blorp_address) { .buffer = NULL, diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/ i965/brw_clear.c index 8aa83722ee9..63c0b241898 100644 --- a/src/mesa/drivers/dri/i965/brw_clear.c +++ b/src/mesa/drivers/dri/i965/brw_clear.c @@ -108,6 +108,7 @@ brw_fast_clear_depth(struct gl_context *ctx) struct intel_mipmap_tree *mt = depth_irb->mt; struct gl_renderbuffer_attachment *depth_att = >Attachment [BUFFER_DEPTH]; const struct gen_device_info *devinfo = >screen->devinfo; + bool same_clear_value = true; if (devinfo->gen < 6) return false; @@ -213,6 +214,7 @@ brw_fast_clear_depth(struct gl_context *ctx) } intel_miptree_set_depth_clear_value(ctx, mt, clear_value); + same_clear_value = false; } bool need_clear = false; @@ -232,6 +234,26 @@ brw_fast_clear_depth(struct gl_context *ctx) * state then simply updating the miptree fast clear value is sufficient * to change their clear value. */ + if (devinfo->gen >= 10 && !same_clear_value) { + /* Before gen10, it was enough to just update the clear value in the + * miptree. But on gen10+, we let blorp update the clear value state + * buffer when doing a fast clear. Since we are skipping the fast + * clear here, we need to update the clear color ourselves. + */ + uint32_t clear_offset = mt->hiz_buf->clear_color_offset; + union isl_color_value clear_color = { .f32 = { clear_value, } }; + + /* We can't update the clear color while the hardware is still using + * the previous one for a resolve or sampling from it. So make sure + * that there's no pending commands at this point. + */ + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL); This is fun... First off, this can only happen in the case where you have two clears with no rendering in between so we shouldn't have any pending rendering or resolves. 
However, we may have pending texturing. In that case, I think we need a RT cache flush and CS stall before and state and sampler cache invalidates afterwards. We also need a test. :-) This is enough of a crazy edge case, that I'm happy to let the patches progress before the test has been written but we do need to write the test. I think I found a test that failed without those flushes, I can double check it and let you know... would that be enough of a test case? Maybe? I think what we need is something that's repeatedly clears and textures. If you find a case that does that, it's probably sufficient. + for (int i = 0; i < 4; i++) { +brw_store_data_imm32(brw, mt->hiz_buf->clear_color_bo, + clear_offset + i * 4, clear_color.u32 [i]); + } + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_ INVALIDATE); + } return true; } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
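The loop at the end of the quoted patch writes the clear color as four consecutive dwords starting at clear_color_offset. A minimal CPU-side sketch of that layout follows, assuming a plain byte buffer in place of the BO and a hypothetical store_dword() in place of brw_store_data_imm32(). The union is a reduced stand-in for isl_color_value, and none of the flushes or invalidates discussed in the thread are modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for isl_color_value: the same 16 bytes viewed
 * either as four floats or as four raw dwords. */
union color_value {
   float f32[4];
   uint32_t u32[4];
};

/* Hypothetical CPU-side replacement for a GPU store-immediate command. */
static void
store_dword(uint8_t *buf, uint32_t offset, uint32_t value)
{
   memcpy(buf + offset, &value, sizeof(value));
}

/* Write a depth clear value as a four-dword clear color at clear_offset.
 * Only the first channel carries the value; the rest are zero. */
static void
write_depth_clear_color(uint8_t *buf, uint32_t clear_offset, float clear_value)
{
   union color_value color = { .f32 = { clear_value } };

   for (int i = 0; i < 4; i++)
      store_dword(buf, clear_offset + i * 4, color.u32[i]);
}
```

The point of the sketch is only the addressing: channel i lands at clear_offset + i * 4, so a single depth value still occupies a full 16-byte slot.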
Re: [Mesa-dev] [PATCH 00/61] nir: Move to using instructions for derefs
On March 28, 2018 17:43:31 Rob Clarkwrote: On Wed, Mar 28, 2018 at 8:16 PM, Jason Ekstrand wrote: On March 28, 2018 16:54:33 Rob Clark wrote: I had noticed the code to remove dead deref's in a few of the passes (at least on your wip branch), and had wondered a bit about not just requiring all the deref related lowering to happen in ssa and possibly require dce after, although admittedly hadn't thought about it *too* much yet.. Yeah. Like I said below, it should be ready enough to just have a tiny clean-up pass instead of having to run full-on dce. Maybe just running dce is the right choice; I'm not sure. I kinda expected to use the dce clean things up once we are in a deref-instruction world.. re: validation passes, could we not just allow dead deref instructions to be ok. That seems like kind of a natural thing.. Making validation ignore them is easy. The trickier bit is that they can cause problems for any pass which works on all deref instructions as opposed to working on texture instructions or intrinsics and tracing the deref chain back. The later are ok because they'll never look at dead derefs. There former (which are likely to be more efficient if we've CSEd derefs) can run into trouble as it's not always obvious when a deref is dead. I'm sure I'll get a better feel for this whole mess as I continue to progress. Defn not trying to second guess you since you are deeper into it that I am.. But requiring dce (or a some sort of mini deref-dce) pass in various places seems reasonable.. I guess it would be nice (given the growing list of nir passes) to have some more formal way to require that some pass(es) is run prior to a whatever random pass driver wants to run would be nice. (Not sure if llvm's PassManager provides this.. if it doesn't, it should.) Yeah. We could theoretically use the metadata system for this but it seems like a bit of an abuse. I'll know more once I get done removing deref chains. 
That process is teaching me about all sorts of things I missed on the first pass. BR, -R --Jason BR, -R On Wed, Mar 28, 2018 at 7:43 PM, Jason Ekstrand wrote: One interesting and unexpected side effect of this series has been that dead code elimination is now required to clean up unused deref instructions. This can be a problem for passes which alter and/or delete the variable because they may leave invalid deref instructions lying around. This is a bit troublesome because it causes validation issues and can confused later passes if there are invalid deref instructions even if they are unused. I have added a helper that allows you to easily check if a deref instruction is in use and remove it and its ancestors if it is not. This seems to help a bit but means that you have to manually clean up derefs whenever you alter or remove a variable. Another option would be to write a simple dead deref elimination pass that other optimization passes can run. I'm not 100% sure what I think of that. On the balance, though, I think the amount that removing deref chains simplifies the IR still makes it worth it. On March 23, 2018 14:43:16 Jason Ekstrand wrote: This is something that Connor and I have been talking about for some time now. The basic idea is to replace the current singly linked nir_deref list with deref instructions. This is similar to what LLVM does and it offers quite a bit more freedom when we start getting more realistic pointers from compute applications. This series implements an almost complete conversion for both i965 and anv. The two remaining gaps are nir_lower_locals_to_regs and nir_lower_samplers. The former will have to wait for ir3 to be converted and the later will have to wait for radeonsi. I've got patches for nir_lower_samplers but not nir_lower_samplers_as_deref which is required by at least radeonsi. Once those are in place, we should be able to drop the lowering pass from the Intel back-end completely. 
The next step (which I will start on next week) will be removing legacy derefs from core NIR. This will also involve significant reworks in some passes such as vars_to_ssa which still uses legacy derefs internally even for things which use deref instructions. Clearly, we can't remove anything until all of the other drivers are converted. However, this series should be a good basis for anyone wanting to work on converting another driver since almost all of the core NIR passes now work with both types of derefs so you can convert in whatever way makes sense. This series can be found as a branch on gitlab: https://gitlab.freedesktop.org/jekstrand/mesa/commits/review/nir-deref-instrs-v1 Cc: Rob Clark Cc: Timothy Arceri Cc: Eric Anholt Cc: Connor Abbott Cc: Bas Nieuwenhuizen Cc: Karol Herbst
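The helper mentioned in the cover letter (check whether a deref instruction is in use, and remove it and its ancestors if it is not) can be pictured roughly as below. This is a sketch under stated assumptions, not the code from the series: struct deref, its use_count field and the function name are all hypothetical, and "removal" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal model of a deref instruction. */
struct deref {
   struct deref *parent;   /* NULL for the variable deref at the root */
   unsigned use_count;     /* how many instructions consume this deref */
   bool removed;
};

/* Remove d if nothing uses it, then walk up the chain and remove each
 * ancestor that becomes unused as a result. Returns true on progress. */
static bool
deref_remove_if_unused(struct deref *d)
{
   bool progress = false;

   while (d != NULL && !d->removed && d->use_count == 0) {
      struct deref *parent = d->parent;

      d->removed = true;
      if (parent != NULL) {
         /* d was one of its parent's uses; drop that use. */
         assert(parent->use_count > 0);
         parent->use_count--;
      }

      progress = true;
      d = parent;
   }

   return progress;
}
```

A pass that alters or deletes a variable would call something like this on each deref it orphans; a separate dead-deref elimination pass, the alternative raised in the thread, would instead apply the same check to every deref in the shader.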
[Mesa-dev] [PATCH] st/mesa: add missing GLSL_TYPE_[U]INT8 cases in st_glsl_type_dword_size()
Silences a compiler warning about unhandled enum switch cases.
---
 src/mesa/state_tracker/st_glsl_types.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_types.cpp b/src/mesa/state_tracker/st_glsl_types.cpp
index ef7b7fa..9ad76c9 100644
--- a/src/mesa/state_tracker/st_glsl_types.cpp
+++ b/src/mesa/state_tracker/st_glsl_types.cpp
@@ -124,6 +124,9 @@ st_glsl_type_dword_size(const struct glsl_type *type)
    case GLSL_TYPE_INT16:
    case GLSL_TYPE_FLOAT16:
       return DIV_ROUND_UP(type->components(), 2);
+   case GLSL_TYPE_UINT8:
+   case GLSL_TYPE_INT8:
+      return DIV_ROUND_UP(type->components(), 4);
    case GLSL_TYPE_DOUBLE:
    case GLSL_TYPE_UINT64:
    case GLSL_TYPE_INT64:
--
2.7.4
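The arithmetic in the new case follows the 16-bit case next to it: two 16-bit components fit in one dword, four 8-bit components fit in one dword, and a partial dword rounds up. A small sketch of that calculation is shown here; packed_dword_size() is a hypothetical helper, not a Mesa function, and the DIV_ROUND_UP definition is written out locally rather than taken from Mesa's headers.

```c
#include <assert.h>

/* Round-up integer division, written out locally for this example. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: number of 32-bit dwords needed to hold
 * `components` tightly packed values of `bit_size` bits each.
 * Only the 8-, 16- and 32-bit cases are handled in this sketch. */
static unsigned
packed_dword_size(unsigned components, unsigned bit_size)
{
   assert(bit_size == 8 || bit_size == 16 || bit_size == 32);

   /* 32 / bit_size values share one dword: 4, 2 or 1 respectively. */
   return DIV_ROUND_UP(components, 32 / bit_size);
}
```

For example, a three-component 8-bit vector needs one dword and a five-component one would need two, while a three-component 16-bit vector needs two.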
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #10 from Dave Airlie ---
I did a championship lap. There were no other cars on screen, as I'm no good at the game; they were in the lap somewhere.
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #9 from Amarildo ---
BTW, how did you do that lap? Because if you're alone, e.g. in a time-trial event, the game runs fine. It's when running e.g. a Race Weekend and coming out of the pits that the game crashes.
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #8 from Amarildo ---
Meanwhile, could you test with Firejail?

On Arch Linux:
    pacman -S firejail

On Debian/Ubuntu/Family:
    apt install firejail

Then edit /etc/firejail/steam.profile and comment out the following lines:
    #seccomp
    #private-dev

The game didn't start at first; I had to comment out the 'seccomp' line. I'm afraid it (firejail) has something to do with the crash, but I'm not sure.

I installed "mesa-vulkan-drivers-dbgsym" and ran Steam inside gdb and firejail (via "STEAM_DEBUGGER=gdb firejail --allow-debuggers steam") but it was like I wasn't debugging it at all.

So if you may, please install and run Steam within firejail to see if that causes F1 2017 to crash. To run Steam through firejail, after commenting out the necessary lines above in its firejail profile, do:
    firejail steam

Thanks
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #7 from Dave Airlie ---
Just FYI: Tahiti GPU, no crash here. I did one lap of Melbourne and entered the pits and exited again.
Re: [Mesa-dev] [PATCH 00/61] nir: Move to using instructions for derefs
On Wed, Mar 28, 2018 at 8:16 PM, Jason Ekstrandwrote: > On March 28, 2018 16:54:33 Rob Clark wrote: > > I had noticed the code to remove dead deref's in a few of the passes > (at least on your wip branch), and had wondered a bit about not just > requiring all the deref related lowering to happen in ssa and possibly > require dce after, although admittedly hadn't thought about it *too* > much yet.. > > Yeah. Like I said below, it should be ready enough to just have a tiny > clean-up pass instead of having to run full-on dce. Maybe just running dce > is the right choice; I'm not sure. > > > I kinda expected to use the dce clean things up once we are in a > deref-instruction world.. re: validation passes, could we not just > allow dead deref instructions to be ok. That seems like kind of a > natural thing.. > > Making validation ignore them is easy. The trickier bit is that they can > cause problems for any pass which works on all deref instructions as opposed > to working on texture instructions or intrinsics and tracing the deref chain > back. The later are ok because they'll never look at dead derefs. There > former (which are likely to be more efficient if we've CSEd derefs) can run > into trouble as it's not always obvious when a deref is dead. > > I'm sure I'll get a better feel for this whole mess as I continue to > progress. Defn not trying to second guess you since you are deeper into it that I am.. But requiring dce (or a some sort of mini deref-dce) pass in various places seems reasonable.. I guess it would be nice (given the growing list of nir passes) to have some more formal way to require that some pass(es) is run prior to a whatever random pass driver wants to run would be nice. (Not sure if llvm's PassManager provides this.. if it doesn't, it should.) 
BR, -R > --Jason > > > > BR, > -R > > > On Wed, Mar 28, 2018 at 7:43 PM, Jason Ekstrand > wrote: > One interesting and unexpected side effect of this series has been that dead > code elimination is now required to clean up unused deref instructions. > This can be a problem for passes which alter and/or delete the variable > because they may leave invalid deref instructions lying around. This is a > bit troublesome because it causes validation issues and can confused later > passes if there are invalid deref instructions even if they are unused. I > have added a helper that allows you to easily check if a deref instruction > is in use and remove it and its ancestors if it is not. This seems to help a > bit but means that you have to manually clean up derefs whenever you alter > or remove a variable. Another option would be to write a simple dead deref > elimination pass that other optimization passes can run. > > I'm not 100% sure what I think of that. On the balance, though, I think the > amount that removing deref chains simplifies the IR still makes it worth it. > > > On March 23, 2018 14:43:16 Jason Ekstrand wrote: > > This is something that Connor and I have been talking about for some time > now. The basic idea is to replace the current singly linked nir_deref > list > with deref instructions. This is similar to what LLVM does and it offers > quite a bit more freedom when we start getting more realistic pointers > from > compute applications. > > This series implements an almost complete conversion for both i965 and > anv. > The two remaining gaps are nir_lower_locals_to_regs and > nir_lower_samplers. > The former will have to wait for ir3 to be converted and the later will > have to wait for radeonsi. I've got patches for nir_lower_samplers but > not > nir_lower_samplers_as_deref which is required by at least radeonsi. Once > those are in place, we should be able to drop the lowering pass from the > Intel back-end completely. 
> > The next step (which I will start on next week) will be removing legacy > derefs from core NIR. This will also involve significant reworks in some > passes such as vars_to_ssa which still uses legacy derefs internally even > for things which use deref instructions. > > Clearly, we can't remove anything until all of the other drivers are > converted. However, this series should be a good basis for anyone wanting > to work on converting another driver since almost all of the core NIR > passes now work with both types of derefs so you can convert in whatever > way makes sense. > > This series can be found as a branch on gitlab: > > > https://gitlab.freedesktop.org/jekstrand/mesa/commits/review/nir-deref-instrs-v1 > > Cc: Rob Clark > Cc: Timothy Arceri > Cc: Eric Anholt > Cc: Connor Abbott > Cc: Bas Nieuwenhuizen > Cc: Karol Herbst > > Jason Ekstrand (61): > nir: Add src/dest num_components helpers > nir: Return a cursor from nir_instr_remove > nir/vars_to_ssa: Remove copies
Re: [Mesa-dev] [PATCH 1/2] compiler/nir: add a is_image_sample_dref flag to texture instructions
How is this different from is_shadow? On March 28, 2018 02:33:50 Iago Toral Quirogawrote: So we can recognize image sampling instructions that involve a depth comparison against a reference, such as SPIR-V's OpImageSample{Proj}Dref{Explicit,Implicit}Lod and we can acknowledge that they return a single scalar value instead of a vec4. --- src/compiler/nir/nir.h | 9 + src/compiler/nir/nir_clone.c | 1 + src/compiler/nir/nir_instr_set.c | 2 ++ src/compiler/nir/nir_lower_tex.c | 5 - src/compiler/nir/nir_serialize.c | 5 - 5 files changed, 20 insertions(+), 2 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index 0d207d0ea5..625092cd2b 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -1231,6 +1231,12 @@ typedef struct { */ bool is_new_style_shadow; + /** +* If is_image_sample_dref is true, this is an image sample with depth +* comparing. +*/ + bool is_image_sample_dref; + /* gather component selector */ unsigned component : 2; @@ -1316,6 +1322,9 @@ nir_tex_instr_dest_size(const nir_tex_instr *instr) if (instr->is_shadow && instr->is_new_style_shadow) return 1; + if (instr->is_image_sample_dref) + return 1; + return 4; } } diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c index bcfdaa7594..7d6cfd896f 100644 --- a/src/compiler/nir/nir_clone.c +++ b/src/compiler/nir/nir_clone.c @@ -415,6 +415,7 @@ clone_tex(clone_state *state, const nir_tex_instr *tex) ntex->is_array = tex->is_array; ntex->is_shadow = tex->is_shadow; ntex->is_new_style_shadow = tex->is_new_style_shadow; + ntex->is_image_sample_dref = tex->is_image_sample_dref; ntex->component = tex->component; ntex->texture_index = tex->texture_index; diff --git a/src/compiler/nir/nir_instr_set.c b/src/compiler/nir/nir_instr_set.c index 9cb9ed43e8..5563f6f095 100644 --- a/src/compiler/nir/nir_instr_set.c +++ b/src/compiler/nir/nir_instr_set.c @@ -155,6 +155,7 @@ hash_tex(uint32_t hash, const nir_tex_instr *instr) hash = HASH(hash, instr->is_array); hash = 
HASH(hash, instr->is_shadow); hash = HASH(hash, instr->is_new_style_shadow); + hash = HASH(hash, instr->is_image_sample_dref); unsigned component = instr->component; hash = HASH(hash, component); hash = HASH(hash, instr->texture_index); @@ -310,6 +311,7 @@ nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2) tex1->is_array != tex2->is_array || tex1->is_shadow != tex2->is_shadow || tex1->is_new_style_shadow != tex2->is_new_style_shadow || + tex1->is_image_sample_dref != tex2->is_image_sample_dref || tex1->component != tex2->component || tex1->texture_index != tex2->texture_index || tex1->texture_array_size != tex2->texture_array_size || diff --git a/src/compiler/nir/nir_lower_tex.c b/src/compiler/nir/nir_lower_tex.c index 1062afd97f..03e7555679 100644 --- a/src/compiler/nir/nir_lower_tex.c +++ b/src/compiler/nir/nir_lower_tex.c @@ -114,6 +114,7 @@ get_texture_size(nir_builder *b, nir_tex_instr *tex) txs->is_array = tex->is_array; txs->is_shadow = tex->is_shadow; txs->is_new_style_shadow = tex->is_new_style_shadow; + txs->is_image_sample_dref = tex->is_image_sample_dref; txs->texture_index = tex->texture_index; txs->texture = nir_deref_var_clone(tex->texture, txs); txs->sampler_index = tex->sampler_index; @@ -343,6 +344,7 @@ replace_gradient_with_lod(nir_builder *b, nir_ssa_def *lod, nir_tex_instr *tex) txl->is_array = tex->is_array; txl->is_shadow = tex->is_shadow; txl->is_new_style_shadow = tex->is_new_style_shadow; + txl->is_image_sample_dref = tex->is_image_sample_dref; txl->sampler_index = tex->sampler_index; txl->texture = nir_deref_var_clone(tex->texture, txl); txl->sampler = nir_deref_var_clone(tex->sampler, txl); @@ -794,7 +796,8 @@ nir_lower_tex_block(nir_block *block, nir_builder *b, if (((1 << tex->texture_index) & options->swizzle_result) && !nir_tex_instr_is_query(tex) && - !(tex->is_shadow && tex->is_new_style_shadow)) { + !(tex->is_shadow && tex->is_new_style_shadow) && + !tex->is_image_sample_dref) { swizzle_result(b, tex, 
options->swizzles[tex->texture_index]); progress = true; } diff --git a/src/compiler/nir/nir_serialize.c b/src/compiler/nir/nir_serialize.c index 00df49c2ef..dcbe1f0c13 100644 --- a/src/compiler/nir/nir_serialize.c +++ b/src/compiler/nir/nir_serialize.c @@ -583,10 +583,11 @@ union packed_tex_data { unsigned is_array:1; unsigned is_shadow:1; unsigned is_new_style_shadow:1; + unsigned is_image_sample_dref:1; unsigned component:2; unsigned has_texture_deref:1; unsigned has_sampler_deref:1; - unsigned unused:10; /* Mark unused for valgrind. */ + unsigned unused:9; /* Mark unused for valgrind. */ } u; }; @@
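The effect of the new flag on the destination size and on the packed serialization word can be sketched as follows. This is a minimal, self-contained illustration and not the NIR code itself: `toy_tex_instr`, `toy_tex_dest_size` and `toy_packed_tex_data` are hypothetical stand-ins that keep only the fields touched by the patch, the query and texture-op cases of the real `nir_tex_instr_dest_size` are left out, and the 15-bit `other` field is an assumption standing in for the bitfields that the hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for nir_tex_instr: only the flags the patch touches. */
struct toy_tex_instr {
   bool is_shadow;
   bool is_new_style_shadow;
   bool is_image_sample_dref;
};

/* Number of destination components for a plain sampling operation. */
static unsigned
toy_tex_dest_size(const struct toy_tex_instr *instr)
{
   if (instr->is_shadow && instr->is_new_style_shadow)
      return 1;

   /* New in the patch: a depth-compare image sample returns one scalar. */
   if (instr->is_image_sample_dref)
      return 1;

   return 4;
}

/* Hypothetical stand-in for the packed serialization word. Adding a 1-bit
 * flag means the padding shrinks from 10 to 9 bits so the total stays 32.
 */
union toy_packed_tex_data {
   uint32_t u32;
   struct {
      unsigned other:15; /* assumption: fields not shown in the hunk */
      unsigned is_array:1;
      unsigned is_shadow:1;
      unsigned is_new_style_shadow:1;
      unsigned is_image_sample_dref:1;
      unsigned component:2;
      unsigned has_texture_deref:1;
      unsigned has_sampler_deref:1;
      unsigned unused:9; /* padding, kept explicit as in the patch */
   } u;
};

_Static_assert(sizeof(union toy_packed_tex_data) == sizeof(uint32_t),
               "packed tex data must still fit in one 32-bit word");
```

The sketch also makes the question at the top of this message concrete: in this toy version the new flag only adds a second way to reach the single-component result that `is_shadow && is_new_style_shadow` already produces.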
Re: [Mesa-dev] [PATCH 00/61] nir: Move to using instructions for derefs
On March 28, 2018 16:54:33 Rob Clark wrote: I had noticed the code to remove dead derefs in a few of the passes (at least on your wip branch), and had wondered a bit about not just requiring all the deref related lowering to happen in ssa and possibly require dce after, although admittedly hadn't thought about it *too* much yet.. Yeah. Like I said below, it should be ready enough to just have a tiny clean-up pass instead of having to run full-on dce. Maybe just running dce is the right choice; I'm not sure. I kinda expected to use dce to clean things up once we are in a deref-instruction world.. re: validation passes, could we not just allow dead deref instructions to be ok. That seems like kind of a natural thing.. Making validation ignore them is easy. The trickier bit is that they can cause problems for any pass which works on all deref instructions as opposed to working on texture instructions or intrinsics and tracing the deref chain back. The latter are ok because they'll never look at dead derefs. The former (which are likely to be more efficient if we've CSEd derefs) can run into trouble as it's not always obvious when a deref is dead. I'm sure I'll get a better feel for this whole mess as I continue to progress. --Jason BR, -R On Wed, Mar 28, 2018 at 7:43 PM, Jason Ekstrand wrote: One interesting and unexpected side effect of this series has been that dead code elimination is now required to clean up unused deref instructions. This can be a problem for passes which alter and/or delete the variable because they may leave invalid deref instructions lying around. This is a bit troublesome because it causes validation issues and can confuse later passes if there are invalid deref instructions even if they are unused. I have added a helper that allows you to easily check if a deref instruction is in use and remove it and its ancestors if it is not.
This seems to help a bit but means that you have to manually clean up derefs whenever you alter or remove a variable. Another option would be to write a simple dead deref elimination pass that other optimization passes can run. I'm not 100% sure what I think of that. On balance, though, I think the amount that removing deref chains simplifies the IR still makes it worth it. On March 23, 2018 14:43:16 Jason Ekstrand wrote: This is something that Connor and I have been talking about for some time now. The basic idea is to replace the current singly linked nir_deref list with deref instructions. This is similar to what LLVM does and it offers quite a bit more freedom when we start getting more realistic pointers from compute applications. This series implements an almost complete conversion for both i965 and anv. The two remaining gaps are nir_lower_locals_to_regs and nir_lower_samplers. The former will have to wait for ir3 to be converted and the latter will have to wait for radeonsi. I've got patches for nir_lower_samplers but not nir_lower_samplers_as_deref which is required by at least radeonsi. Once those are in place, we should be able to drop the lowering pass from the Intel back-end completely. The next step (which I will start on next week) will be removing legacy derefs from core NIR. This will also involve significant reworks in some passes such as vars_to_ssa which still uses legacy derefs internally even for things which use deref instructions. Clearly, we can't remove anything until all of the other drivers are converted. However, this series should be a good basis for anyone wanting to work on converting another driver since almost all of the core NIR passes now work with both types of derefs so you can convert in whatever way makes sense.
This series can be found as a branch on gitlab: https://gitlab.freedesktop.org/jekstrand/mesa/commits/review/nir-deref-instrs-v1 Cc: Rob Clark Cc: Timothy Arceri Cc: Eric Anholt Cc: Connor Abbott Cc: Bas Nieuwenhuizen Cc: Karol Herbst Jason Ekstrand (61): nir: Add src/dest num_components helpers nir: Return a cursor from nir_instr_remove nir/vars_to_ssa: Remove copies from the correct set nir/lower_indirect_derefs: Support interp_var_at intrinsics intel/vec4: Set channel_sizes for MOV_INDIRECT sources nir/validator: Validate that all used variables exist nir: Add a deref instruction type nir/builder: Add deref building helpers nir: Add _deref versions of all of the _var intrinsics nir: Add deref sources to texture instructions nir: Add helpers for working with deref instructions anv,i965,radv,st,ir3: Call nir_lower_deref_instrs glsl/nir: Only claim to handle intrinsic functions glsl/nir: Use deref instructions instead of dref chains prog/nir: Simplify some load/store operations prog/nir: Use deref instructions for params nir/lower_atomics: Rework the main
Re: [Mesa-dev] [PATCH] vc4: Fix out-of-tree build
Aaron Watry writes: > Signed-off-by: Aaron Watry > Cc: Eric Anholt Some day we should probably just consistently prefix our includes so we don't need so many -I. For now I've reviewed and pushed your patch.
Re: [Mesa-dev] [PATCH] st: Don't try to finalize the texture in st_render_texture().
Brian Paul writes: > On 03/27/2018 10:14 PM, Eric Anholt wrote: >> We can't necessarily finalize the texture at this point if we're rendering >> to a texture image whose format is different from the baselevel's format. > > This is just a test suite scenario, right? It's not the sort of thing a > real app would do, I hope. Yeah, it seems pretty crazy to me.
Re: [Mesa-dev] [PATCH 00/61] nir: Move to using instructions for derefs
I had noticed the code to remove dead derefs in a few of the passes (at least on your wip branch), and had wondered a bit about not just requiring all the deref related lowering to happen in ssa and possibly require dce after, although admittedly hadn't thought about it *too* much yet.. I kinda expected to use dce to clean things up once we are in a deref-instruction world.. re: validation passes, could we not just allow dead deref instructions to be ok. That seems like kind of a natural thing.. BR, -R On Wed, Mar 28, 2018 at 7:43 PM, Jason Ekstrand wrote: > One interesting and unexpected side effect of this series has been that dead > code elimination is now required to clean up unused deref instructions. > This can be a problem for passes which alter and/or delete the variable > because they may leave invalid deref instructions lying around. This is a > bit troublesome because it causes validation issues and can confuse later > passes if there are invalid deref instructions even if they are unused. I > have added a helper that allows you to easily check if a deref instruction > is in use and remove it and its ancestors if it is not. This seems to help a > bit but means that you have to manually clean up derefs whenever you alter > or remove a variable. Another option would be to write a simple dead deref > elimination pass that other optimization passes can run. > > I'm not 100% sure what I think of that. On balance, though, I think the > amount that removing deref chains simplifies the IR still makes it worth it. > > > On March 23, 2018 14:43:16 Jason Ekstrand wrote: > >> This is something that Connor and I have been talking about for some time >> now. The basic idea is to replace the current singly linked nir_deref >> list >> with deref instructions. This is similar to what LLVM does and it offers >> quite a bit more freedom when we start getting more realistic pointers >> from >> compute applications.
>> >> This series implements an almost complete conversion for both i965 and >> anv. >> The two remaining gaps are nir_lower_locals_to_regs and >> nir_lower_samplers. >> The former will have to wait for ir3 to be converted and the latter will >> have to wait for radeonsi. I've got patches for nir_lower_samplers but >> not >> nir_lower_samplers_as_deref which is required by at least radeonsi. Once >> those are in place, we should be able to drop the lowering pass from the >> Intel back-end completely. >> >> The next step (which I will start on next week) will be removing legacy >> derefs from core NIR. This will also involve significant reworks in some >> passes such as vars_to_ssa which still uses legacy derefs internally even >> for things which use deref instructions. >> >> Clearly, we can't remove anything until all of the other drivers are >> converted. However, this series should be a good basis for anyone wanting >> to work on converting another driver since almost all of the core NIR >> passes now work with both types of derefs so you can convert in whatever >> way makes sense.
>> >> This series can be found as a branch on gitlab: >> >> >> https://gitlab.freedesktop.org/jekstrand/mesa/commits/review/nir-deref-instrs-v1 >> >> Cc: Rob Clark >> Cc: Timothy Arceri >> Cc: Eric Anholt >> Cc: Connor Abbott >> Cc: Bas Nieuwenhuizen >> Cc: Karol Herbst >> >> Jason Ekstrand (61): >> nir: Add src/dest num_components helpers >> nir: Return a cursor from nir_instr_remove >> nir/vars_to_ssa: Remove copies from the correct set >> nir/lower_indirect_derefs: Support interp_var_at intrinsics >> intel/vec4: Set channel_sizes for MOV_INDIRECT sources >> nir/validator: Validate that all used variables exist >> nir: Add a deref instruction type >> nir/builder: Add deref building helpers >> nir: Add _deref versions of all of the _var intrinsics >> nir: Add deref sources to texture instructions >> nir: Add helpers for working with deref instructions >> anv,i965,radv,st,ir3: Call nir_lower_deref_instrs >> glsl/nir: Only claim to handle intrinsic functions >> glsl/nir: Use deref instructions instead of dref chains >> prog/nir: Simplify some load/store operations >> prog/nir: Use deref instructions for params >> nir/lower_atomics: Rework the main walker loop a bit >> nir: Support deref instructions in remove_dead_variables >> nir: Add a pass for fixing deref modes >> nir: Support deref instructions in lower_global_vars_to_local >> nir: Support deref instructions in lower_io_to_temporaries >> nir: Add a deref path helper struct >> nir: Support deref instructions in lower_var_copies >> nir: Support deref instructions in split_var_copies >> nir: Support deref instructions in lower_vars_to_ssa >> nir: Support deref instructions in lower_indirect_derefs >> nir/deref: Add a deref cleanup function >> nir:
Re: [Mesa-dev] [PATCH 00/61] nir: Move to using instructions for derefs
One interesting and unexpected side effect of this series has been that dead code elimination is now required to clean up unused deref instructions. This can be a problem for passes which alter and/or delete the variable because they may leave invalid deref instructions lying around. This is a bit troublesome because it causes validation issues and can confuse later passes if there are invalid deref instructions even if they are unused. I have added a helper that allows you to easily check if a deref instruction is in use and remove it and its ancestors if it is not. This seems to help a bit but means that you have to manually clean up derefs whenever you alter or remove a variable. Another option would be to write a simple dead deref elimination pass that other optimization passes can run. I'm not 100% sure what I think of that. On balance, though, I think the amount that removing deref chains simplifies the IR still makes it worth it. On March 23, 2018 14:43:16 Jason Ekstrand wrote: This is something that Connor and I have been talking about for some time now. The basic idea is to replace the current singly linked nir_deref list with deref instructions. This is similar to what LLVM does and it offers quite a bit more freedom when we start getting more realistic pointers from compute applications. This series implements an almost complete conversion for both i965 and anv. The two remaining gaps are nir_lower_locals_to_regs and nir_lower_samplers. The former will have to wait for ir3 to be converted and the latter will have to wait for radeonsi. I've got patches for nir_lower_samplers but not nir_lower_samplers_as_deref which is required by at least radeonsi. Once those are in place, we should be able to drop the lowering pass from the Intel back-end completely. The next step (which I will start on next week) will be removing legacy derefs from core NIR.
This will also involve significant reworks in some passes such as vars_to_ssa which still uses legacy derefs internally even for things which use deref instructions. Clearly, we can't remove anything until all of the other drivers are converted. However, this series should be a good basis for anyone wanting to work on converting another driver since almost all of the core NIR passes now work with both types of derefs so you can convert in whatever way makes sense. This series can be found as a branch on gitlab: https://gitlab.freedesktop.org/jekstrand/mesa/commits/review/nir-deref-instrs-v1 Cc: Rob Clark Cc: Timothy Arceri Cc: Eric Anholt Cc: Connor Abbott Cc: Bas Nieuwenhuizen Cc: Karol Herbst Jason Ekstrand (61): nir: Add src/dest num_components helpers nir: Return a cursor from nir_instr_remove nir/vars_to_ssa: Remove copies from the correct set nir/lower_indirect_derefs: Support interp_var_at intrinsics intel/vec4: Set channel_sizes for MOV_INDIRECT sources nir/validator: Validate that all used variables exist nir: Add a deref instruction type nir/builder: Add deref building helpers nir: Add _deref versions of all of the _var intrinsics nir: Add deref sources to texture instructions nir: Add helpers for working with deref instructions anv,i965,radv,st,ir3: Call nir_lower_deref_instrs glsl/nir: Only claim to handle intrinsic functions glsl/nir: Use deref instructions instead of dref chains prog/nir: Simplify some load/store operations prog/nir: Use deref instructions for params nir/lower_atomics: Rework the main walker loop a bit nir: Support deref instructions in remove_dead_variables nir: Add a pass for fixing deref modes nir: Support deref instructions in lower_global_vars_to_local nir: Support deref instructions in lower_io_to_temporaries nir: Add a deref path helper struct nir: Support deref instructions in lower_var_copies nir: Support deref instructions in split_var_copies nir: Support deref instructions in lower_vars_to_ssa nir: Support deref 
instructions in lower_indirect_derefs nir/deref: Add a deref cleanup function nir: Support deref instructions in lower_system_values nir: Support deref instructions in lower_clip_cull nir: Support deref instructions in propagate_invariant nir: Support deref instructions in gather_info nir: Support deref instructions in lower_io nir: Support deref instructions in lower_atomics nir: Support deref instructions in lower_wpos_ytransform nir: Support deref instructions in lower_pos_center nir: Support deref instructions in remove_unused_varyings intel,ir3: Disable nir_opt_copy_prop_vars intel/nir: Fixup deref modes after lowering patch vertices i965: Move nir_lower_deref_instrs to right before locals_to_regs st/nir: Move lower_deref_instrs later spirv: Use deref instructions for most variables nir: Add a concept of per-member structs and a lowering pass
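The helper described in this message (check whether a deref instruction is in use and, if it is not, remove it and its ancestors) can be sketched roughly as follows. This is only a toy model under stated assumptions, not the code from the series: `toy_deref`, its `use_count` and `removed` fields, and `toy_deref_remove_if_unused` are hypothetical names, and a real pass would unlink instructions from the IR rather than set a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a deref instruction in a chain. */
struct toy_deref {
   struct toy_deref *parent; /* NULL for the variable deref at the root */
   unsigned use_count;       /* how many instructions consume this deref */
   bool removed;             /* stands in for unlinking from the IR */
};

/* Remove `d` if nothing uses it, then walk up the chain: dropping a deref
 * releases one use of its parent, which may make the parent dead too.
 * Returns true if at least one deref was removed.
 */
static bool
toy_deref_remove_if_unused(struct toy_deref *d)
{
   bool progress = false;

   while (d != NULL && !d->removed && d->use_count == 0) {
      struct toy_deref *parent = d->parent;

      d->removed = true;
      if (parent != NULL)
         parent->use_count--; /* assumes the child was counted as a use */

      d = parent;
      progress = true;
   }

   return progress;
}
```

A pass that deletes a variable access would call this on the deref it just orphaned. The walk stops at the first ancestor that still has another user, which is why shared (CSE'd) derefs survive.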
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775 --- Comment #6 from Amarildo --- Thanks Bas. It seems "mesa-vulkan-drivers" has a debug-symbols (dbgsym) package available[1]. I'll download it, then run the game and attach any log files here. If that's not enough (e.g. if radv is not present in mesa-vulkan-drivers) I'll rebuild all the packages and then redo the process. - [1] [code]amarildo@amarildo:~$ apt-cache search mesa | grep dbg libglw1-mesa-dbgsym - Debug symbols for libglw1-mesa libegl-mesa0-dbgsym - debug symbols for libegl-mesa0 libgl1-mesa-dri-dbgsym - debug symbols for libgl1-mesa-dri libglapi-mesa-dbgsym - debug symbols for libglapi-mesa libglx-mesa0-dbgsym - debug symbols for libglx-mesa0 libosmesa6-dbgsym - debug symbols for libosmesa6 libwayland-egl1-mesa-dbgsym - debug symbols for libwayland-egl1-mesa mesa-opencl-icd-dbgsym - debug symbols for mesa-opencl-icd mesa-va-drivers-dbgsym - debug symbols for mesa-va-drivers mesa-vdpau-drivers-dbgsym - debug symbols for mesa-vdpau-drivers mesa-vulkan-drivers-dbgsym - debug symbols for mesa-vulkan-drivers mesa-utils-dbgsym - debug symbols for mesa-utils mesa-utils-extra-dbgsym - debug symbols for mesa-utils-extra [/code]
Re: [Mesa-dev] [PATCH 42/61] nir: Add a concept of per-member structs and a lowering pass
On March 28, 2018 15:25:59 Timothy Arceri wrote: On 29/03/18 08:23, Timothy Arceri wrote: On 29/03/18 05:34, Jason Ekstrand wrote: On March 27, 2018 21:16:25 Timothy Arceri wrote: So I've been thinking about structs and I'm pretty sure we should be able to write some passes to completely lower them away. Vertex shader inputs, buffer blocks and shader interface blocks cannot contain structs so it seems to me the only blocker is the way we assign uniform and varying locations. Currently for a struct like this: struct S2 { int b; int c; }; struct S1 { int a; S2 s2[3]; int d; }; uniform S1 s[2][2]; We store things like so: s[0][0].a = location 0 s[0][0].s2[0].b = location 1 s[0][0].s2[0].c = location 2 s[0][0].s2[1].b = location 3 s[0][0].s2[1].c = location 4 s[0][0].s2[2].b = location 5 s[0][0].s2[2].c = location 6 ... If we had a GLSL IR pass that pushed the arrays down to the innermost member like so: struct S2 { int b[2][2][3]; int c[2][2][3]; }; struct S1 { int a[2][2]; S2 s2; int d[2][2]; }; uniform S1 s; We would instead store things like so: s[0][0].a = location 0 s[0][1].a = location 1 s[1][0].a = location 2 s[1][1].a = location 3 s[0][0].s2[0].b = location 4 s[0][0].s2[1].b = location 5 s[0][0].s2[2].b = location 6 ... This allows us to easily split the members out into independent arrays of arrays. To do this we might want to create the uniform (resource) name before pushing the arrays down so that we still match up the correct uniforms with the names passed to the API but that shouldn't be too difficult. With this in place we should be able to generate better shaders when structs are used and be able to delete a whole bunch of struct handling code (and avoid this new code/concept?). Does anyone see any holes in my analysis? Only that it doesn't help for SPIR-V which is the whole reason for this patch. How does it not help SPIR-V? As per above it would seem the only reason we need to handle structs in NIR is because of the way we assign storage in OpenGL.
I'm still not overly familiar with SPIR-V but is there any reason we can't do the struct splitting pass in NIR and be done with structs? Ok, never mind. I misread the spec: buffer blocks can indeed contain structs, which makes this not as useful. Doing struct splitting in NIR may be useful, especially for local variables with arrays of structs and indirections. In particular, it would let each register be an array of a single vector or scalar type. Unfortunately, I don't see a way that we can actually remove struct support from the IR.
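The two location schemes discussed in this thread can be written out as index arithmetic for the example `uniform S1 s[2][2]`. This is a sketch under assumptions rather than anything from the GLSL linker: the function names are hypothetical, and it assumes members are assigned locations in declaration order, which the message only shows for the first few entries.

```c
/* Slot counts for the example types: S2 = { b, c }, S1 = { a, s2[3], d }. */
enum {
   S2_SLOTS = 2,
   S1_SLOTS = 1 + 3 * S2_SLOTS + 1 /* a + s2[3] + d = 8 */
};

/* Current scheme: each s[i][j] occupies one contiguous run of S1_SLOTS. */
static unsigned loc_cur_a(unsigned i, unsigned j)
{
   return (i * 2 + j) * S1_SLOTS;
}

static unsigned loc_cur_s2_b(unsigned i, unsigned j, unsigned k)
{
   return (i * 2 + j) * S1_SLOTS + 1 + k * S2_SLOTS;
}

static unsigned loc_cur_s2_c(unsigned i, unsigned j, unsigned k)
{
   return loc_cur_s2_b(i, j, k) + 1;
}

static unsigned loc_cur_d(unsigned i, unsigned j)
{
   return (i * 2 + j) * S1_SLOTS + 1 + 3 * S2_SLOTS;
}

/* Pushed-down scheme: every member becomes its own array of arrays, so
 * a[2][2] takes 0..3, s2.b[2][2][3] takes 4..15, s2.c[2][2][3] takes
 * 16..27 and d[2][2] takes 28..31.
 */
static unsigned loc_split_a(unsigned i, unsigned j)
{
   return i * 2 + j;
}

static unsigned loc_split_s2_b(unsigned i, unsigned j, unsigned k)
{
   return 4 + (i * 2 + j) * 3 + k;
}

static unsigned loc_split_s2_c(unsigned i, unsigned j, unsigned k)
{
   return 16 + (i * 2 + j) * 3 + k;
}

static unsigned loc_split_d(unsigned i, unsigned j)
{
   return 28 + i * 2 + j;
}
```

Both schemes use the same 32 locations in total. The difference is that in the second one each member's locations are contiguous, which is what would let a pass treat `a`, `s2.b`, `s2.c` and `d` as independent arrays.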
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775 --- Comment #5 from Bas Nieuwenhuizen --- It might be easier to install packages with the debug symbols from your distro: https://wiki.debian.org/HowToGetABacktrace though I don't know offhand which Debian package contains radv. Otherwise try CFLAGS="-O0 -g" ./autogen.sh --with-gallium-drivers= --with-dri-drivers= --with-egl-platforms=x11,drm --enable-debug --with-vulkan-drivers=radeon and then run the game with "set launch options" set to VK_ICD_FILENAMES=${MESA_DIR}/src/amd/vulkan/dev_icd.json %command% with the ${MESA_DIR} replaced by the path to the git clone.
Re: [Mesa-dev] [PATCH] vbo: Use alloca for _vbo_draw_indirect.
Yes, it looks good. Marek On Wed, Mar 28, 2018 at 6:35 AM, Mathias Fröhlich wrote: > From: Mathias Fröhlich > > > Marek, > > you mean with the below patch as the 9th change in the series? > I would like to keep that change separate from #3 since patch #3 > just moves the already existing implementation to the driver_functions > level using the exactly identical implementation except calling into > struct driver_functions instead of the vbo module draw function. > > Also I do not want to call just blindly into alloca with possibly > large counts. So, the implementation uses an upper bound when to use > malloc instead of alloca. > > OK with that? > > best > > Mathias > > > > Avoid using malloc in the draw path of mesa. > Since the draw_count is a user api input, fall back to malloc if > the amount of consumed stack space may get too high. > > Signed-off-by: Mathias Fröhlich > --- > src/mesa/vbo/vbo_context.c | 70 ++ > +--- > 1 file changed, 47 insertions(+), 23 deletions(-) > > diff --git a/src/mesa/vbo/vbo_context.c b/src/mesa/vbo/vbo_context.c > index b8c28ceffb..06b8f820ee 100644 > --- a/src/mesa/vbo/vbo_context.c > +++ b/src/mesa/vbo/vbo_context.c > @@ -233,25 +233,17 @@ _vbo_DestroyContext(struct gl_context *ctx) > } > > > -void > -_vbo_draw_indirect(struct gl_context *ctx, GLuint mode, > -struct gl_buffer_object *indirect_data, > -GLsizeiptr indirect_offset, unsigned draw_count, > -unsigned stride, > -struct gl_buffer_object > *indirect_draw_count_buffer, > -GLsizeiptr indirect_draw_count_offset, > -const struct _mesa_index_buffer *ib) > +static void > +draw_indirect(struct gl_context *ctx, GLuint mode, > + struct gl_buffer_object *indirect_data, > + GLsizeiptr indirect_offset, unsigned draw_count, > + unsigned stride, > + struct gl_buffer_object *indirect_draw_count_buffer, > + GLsizeiptr indirect_draw_count_offset, > + const struct _mesa_index_buffer *ib, > + struct _mesa_prim *space) > { > - struct _mesa_prim *prim; > - > - prim = calloc(draw_count, sizeof(*prim)); > - if
(prim == NULL) { > - _mesa_error(ctx, GL_OUT_OF_MEMORY, "gl%sDraw%sIndirect%s", > - (draw_count > 1) ? "Multi" : "", > - ib ? "Elements" : "Arrays", > - indirect_data ? "CountARB" : ""); > - return; > - } > + struct _mesa_prim *prim = space; > > prim[0].begin = 1; > prim[draw_count - 1].end = 1; > @@ -266,10 +258,42 @@ _vbo_draw_indirect(struct gl_context *ctx, GLuint > mode, > /* This should always be true at this time */ > assert(indirect_data == ctx->DrawIndirectBuffer); > > - ctx->Driver.Draw(ctx, prim, draw_count, > - ib, false, 0, ~0, > - NULL, 0, > - indirect_data); > + ctx->Driver.Draw(ctx, prim, draw_count, ib, false, 0u, ~0u, > +NULL, 0, indirect_data); > +} > + > > - free(prim); > +void > +_vbo_draw_indirect(struct gl_context *ctx, GLuint mode, > + struct gl_buffer_object *indirect_data, > + GLsizeiptr indirect_offset, unsigned draw_count, > + unsigned stride, > + struct gl_buffer_object *indirect_draw_count_buffer, > + GLsizeiptr indirect_draw_count_offset, > + const struct _mesa_index_buffer *ib) > +{ > + /* Use alloca for the prim space if we are somehow in bounds. */ > + if (draw_count*sizeof(struct _mesa_prim) < 1024) { > + struct _mesa_prim *space = alloca(draw_count*sizeof(struct > _mesa_prim)); > + memset(space, 0, draw_count*sizeof(struct _mesa_prim)); > + > + draw_indirect(ctx, mode, indirect_data, indirect_offset, draw_count, > +stride, indirect_draw_count_buffer, > +indirect_draw_count_offset, ib, space); > + } else { > + struct _mesa_prim *space = calloc(draw_count, sizeof(struct > _mesa_prim)); > + if (space == NULL) { > + _mesa_error(ctx, GL_OUT_OF_MEMORY, "gl%sDraw%sIndirect%s", > + (draw_count > 1) ? "Multi" : "", > + ib ? "Elements" : "Arrays", > + indirect_data ? 
"CountARB" : ""); > + return; > + } > + > + draw_indirect(ctx, mode, indirect_data, indirect_offset, draw_count, > +stride, indirect_draw_count_buffer, > +indirect_draw_count_offset, ib, space); > + > + free(space); > + } > } > -- > 2.14.3 > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH 1/5] i965: Hard code scratch_ids_per_subslice for Cherryview
On Wed, 2018-03-28 at 14:55 -0700, Jordan Justen wrote: > On 2018-03-26 08:23:13, Juan A. Suarez Romero wrote: > > On Wed, 2018-03-07 at 00:16 -0800, Jordan Justen wrote: > > > Ken suggested that we might be underallocating scratch space on > > > HD > > > 400. Allocating scratch space as though there was actually 8 EUs > > > seems to help with a GPU hang seen on synmark CSDof. > > > > > > > FYI, in order to pick this commit for next 17.3 stable release, I > > need to pick > > also: > > > > commit f9d5a7add42af5a2e4410526d1480a08f41317ae > > Author: Jordan Justen > > Date: Tue Oct 31 00:34:32 2017 -0700 > > > > i965: Calculate thread_count in brw_alloc_stage_scratch > > I believe that this commit led to a regression with compute shaders, > which was fixed by: > > commit a16dc04ad51c32e5c7d136e4dd6273d983385d3f > Author: Kenneth Graunke > Date: Tue Oct 31 00:56:24 2017 -0700 > > i965: properly initialize brw->cs.base.stage to > MESA_SHADER_COMPUTE > > You should probably add Ken's a16dc04ad51c before f9d5a7add42a. > Thanks a lot! Fortunately, a16dc04ad51c was already nominated and included in 17.3.0. So it is in the stable branch. J.A.
> -Jordan > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636 > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290 > > > Cc: Kenneth Graunke > > > Cc: Eero Tamminen > > > Cc: > > > Signed-off-by: Jordan Justen > > > --- > > > src/mesa/drivers/dri/i965/brw_program.c | 44 > > > - > > > 1 file changed, 27 insertions(+), 17 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_program.c > > > b/src/mesa/drivers/dri/i965/brw_program.c > > > index 527f003977b..c121136c439 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_program.c > > > +++ b/src/mesa/drivers/dri/i965/brw_program.c > > > @@ -402,23 +402,33 @@ brw_alloc_stage_scratch(struct brw_context > > > *brw, > > >if (devinfo->gen >= 9) > > > subslices = 4 * brw->screen->devinfo.num_slices; > > > > > > - /* WaCSScratchSize:hsw > > > - * > > > - * Haswell's scratch space address calculation appears to > > > be sparse > > > - * rather than tightly packed. The Thread ID has bits > > > indicating > > > - * which subslice, EU within a subslice, and thread within > > > an EU > > > - * it is. There's a maximum of two slices and two > > > subslices, so these > > > - * can be stored with a single bit. Even though there are > > > only 10 EUs > > > - * per subslice, this is stored in 4 bits, so there's an > > > effective > > > - * maximum value of 16 EUs. Similarly, although there are > > > only 7 > > > - * threads per EU, this is stored in a 3 bit number, > > > giving an effective > > > - * maximum value of 8 threads per EU. > > > - * > > > - * This means that we need to use 16 * 8 instead of 10 * 7 > > > for the > > > - * number of threads per subslice. > > > - */ > > > - const unsigned scratch_ids_per_subslice = > > > - devinfo->is_haswell ? 
16 * 8 : devinfo->max_cs_threads; > > > + unsigned scratch_ids_per_subslice; > > > + if (devinfo->is_haswell) { > > > + /* WaCSScratchSize:hsw > > > + * > > > + * Haswell's scratch space address calculation appears > > > to be sparse > > > + * rather than tightly packed. The Thread ID has bits > > > indicating > > > + * which subslice, EU within a subslice, and thread > > > within an EU it > > > + * is. There's a maximum of two slices and two > > > subslices, so these > > > + * can be stored with a single bit. Even though there > > > are only 10 EUs > > > + * per subslice, this is stored in 4 bits, so there's > > > an effective > > > + * maximum value of 16 EUs. Similarly, although there > > > are only 7 > > > + * threads per EU, this is stored in a 3 bit number, > > > giving an > > > + * effective maximum value of 8 threads per EU. > > > + * > > > + * This means that we need to use 16 * 8 instead of 10 > > > * 7 for the > > > + * number of threads per subslice. > > > + */ > > > + scratch_ids_per_subslice = 16 * 8; > > > + } else if (devinfo->is_cherryview) { > > > + /* For Cherryview, it appears that the scratch > > > addresses for the 6 EU > > > + * devices may still generate compute scratch addresses > > > covering the > > > + * same range as 8 EU. > > > + */ > > > + scratch_ids_per_subslice = 8 * 7; > > > + } else { > > > + scratch_ids_per_subslice = devinfo->max_cs_threads; > > > + } > > > > > >thread_count = scratch_ids_per_subslice * subslices; > > >
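The selection logic in the quoted hunk can be restated as a small standalone function. This is a sketch for illustration only: `toy_devinfo` is a hypothetical stand-in for the driver's device-info struct, limited to the three fields the hunk reads, and the subslice count is passed in as a parameter because the hunk does not show how it is derived on every generation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device info fields used by the hunk. */
struct toy_devinfo {
   bool is_haswell;
   bool is_cherryview;
   unsigned max_cs_threads;
};

/* Scratch IDs to reserve per subslice for compute shaders. */
static unsigned
toy_scratch_ids_per_subslice(const struct toy_devinfo *devinfo)
{
   if (devinfo->is_haswell) {
      /* Sparse thread IDs: 4 bits of EU and 3 bits of thread, so
       * 16 * 8 slots instead of the physical 10 * 7.
       */
      return 16 * 8;
   }

   if (devinfo->is_cherryview) {
      /* Size the allocation as if there were 8 EUs with 7 threads each,
       * even on parts that have fewer EUs.
       */
      return 8 * 7;
   }

   return devinfo->max_cs_threads;
}

/* Total thread count used to size the compute scratch buffer. */
static unsigned
toy_cs_thread_count(const struct toy_devinfo *devinfo, unsigned subslices)
{
   return toy_scratch_ids_per_subslice(devinfo) * subslices;
}
```

With two subslices this gives 256 scratch slots on Haswell and 112 on Cherryview, independent of `max_cs_threads`.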
Re: [Mesa-dev] [PATCH 42/61] nir: Add a concept of per-member structs and a lowering pass
On 29/03/18 08:23, Timothy Arceri wrote: On 29/03/18 05:34, Jason Ekstrand wrote: On March 27, 2018 21:16:25 Timothy Arceri wrote: So I've been thinking about structs and I'm pretty sure we should be able to write some passes to completely lower them away. Vertex shader inputs, buffer blocks and shader interface blocks cannot contain structs so it seems to me the only blocker is the way we assign uniform and varying locations. Currently for a struct like this: struct S2 { int b; int c; }; struct S1 { int a; S2 s2[3]; int d; }; uniform S1 s[2][2]; We store things like so: s[0][0].a = location 0 s[0][0].s2[0].b = location 1 s[0][0].s2[0].c = location 2 s[0][0].s2[1].b = location 3 s[0][0].s2[1].c = location 4 s[0][0].s2[2].b = location 5 s[0][0].s2[2].c = location 6 ... If we had a GLSL IR pass that pushed the arrays down to the innermost member like so: struct S2 { int b[2][2][3]; int c[2][2][3]; }; struct S1 { int a[2][2]; S2 s2; int d[2][2]; }; uniform S1 s; We would instead store things like so: s[0][0].a = location 0 s[0][1].a = location 1 s[1][0].a = location 2 s[1][1].a = location 3 s[0][0].s2[0].b = location 4 s[0][0].s2[1].b = location 5 s[0][0].s2[2].b = location 6 ... This allows us to easily split the members out into independent arrays of arrays. To do this we might want to create the uniform (resource) name before pushing the arrays down so that we still match up the correct uniforms with the names passed to the API but that shouldn't be too difficult. With this in place we should be able to generate better shaders when structs are used and be able to delete a whole bunch of struct handling code (and avoid this new code/concept?). Does anyone see any holes in my analysis? Only that it doesn't help for SPIR-V which is the whole reason for this patch. How does it not help SPIR-V? As per above it would seem the only reason we need to handle structs in NIR is because of the way we assign storage in OpenGL.
I'm still not overly familiar with SPIR-V but is there any reason we can't do the struct splitting pass in NIR and be done with structs? Ok, never mind. I misread the spec: buffer blocks can indeed contain structs, which makes this not as useful.
[Mesa-dev] [PATCH 2/2] gallium/util: Android backtrace support
We can't use any of the existing implementations in u_debug_stack. Android technically has libunwind, but it's been modified to the point where it no longer compiles with the Mesa usage. The library is also not meant to be referenced by vendor libraries.

The officially sanctioned way of obtaining backtraces is through Android's own libbacktrace, a C++ library. Access it through a separate C++ source file on Android only.

Signed-off-by: Stefan Schake
---
 src/gallium/auxiliary/Android.mk                   |   3 +-
 .../auxiliary/util/u_debug_stack_android.cpp       | 111 +
 src/gallium/targets/dri/Android.mk                 |   1 +
 3 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 src/gallium/auxiliary/util/u_debug_stack_android.cpp

diff --git a/src/gallium/auxiliary/Android.mk b/src/gallium/auxiliary/Android.mk
index 2693838..acd243b 100644
--- a/src/gallium/auxiliary/Android.mk
+++ b/src/gallium/auxiliary/Android.mk
@@ -32,7 +32,8 @@ LOCAL_SRC_FILES := \
 	$(C_SOURCES) \
 	$(NIR_SOURCES) \
 	$(RENDERONLY_SOURCES) \
-	$(VL_STUB_SOURCES)
+	$(VL_STUB_SOURCES) \
+	util/u_debug_stack_android.cpp
 
 LOCAL_C_INCLUDES := \
 	$(GALLIUM_TOP)/auxiliary/util
diff --git a/src/gallium/auxiliary/util/u_debug_stack_android.cpp b/src/gallium/auxiliary/util/u_debug_stack_android.cpp
new file mode 100644
index 000..b3d56ae
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_debug_stack_android.cpp
@@ -0,0 +1,111 @@
+/*
+ * Copyright (C) 2018 Stefan Schake
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <backtrace/Backtrace.h>
+
+#include "u_debug.h"
+#include "u_debug_stack.h"
+#include "util/hash_table.h"
+#include "os/os_thread.h"
+
+static hash_table *backtrace_table;
+static mtx_t table_mutex = _MTX_INITIALIZER_NP;
+
+void
+debug_backtrace_capture(debug_stack_frame *mesa_backtrace,
+                        unsigned start_frame,
+                        unsigned nr_frames)
+{
+   hash_entry *backtrace_entry;
+   Backtrace *backtrace;
+   pid_t tid = gettid();
+
+   if (!nr_frames)
+      return;
+
+   /* We keep an Android Backtrace handler around for each thread */
+   mtx_lock(&table_mutex);
+   if (!backtrace_table)
+      backtrace_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+                                                _mesa_key_pointer_equal);
+
+   backtrace_entry = _mesa_hash_table_search(backtrace_table, (void*) tid);
+   if (!backtrace_entry) {
+      backtrace = Backtrace::Create(getpid(), tid);
+      _mesa_hash_table_insert(backtrace_table, (void*) tid, backtrace);
+   } else {
+      backtrace = (Backtrace *) backtrace_entry->data;
+   }
+   mtx_unlock(&table_mutex);
+
+   /* Add one to exclude this call. Unwind already ignores itself. */
+   backtrace->Unwind(start_frame + 1);
+
+   /* Store the Backtrace handler in the first mesa frame for reference.
+    * Unwind will generally return less frames than nr_frames specified
+    * but we have no good way of storing the real count otherwise.
+    * The Backtrace handler only stores the results until the next Unwind,
+    * but that is how u_debug_stack is used anyway.
+    */
+   mesa_backtrace->function = backtrace;
+}
+
+void
+debug_backtrace_dump(const debug_stack_frame *mesa_backtrace,
+                     unsigned nr_frames)
+{
+   Backtrace *backtrace = (Backtrace *) mesa_backtrace->function;
+   size_t i;
+
+   if (!nr_frames)
+      return;
+
+   if (nr_frames > backtrace->NumFrames())
+      nr_frames = backtrace->NumFrames();
+   for (i = 0; i < nr_frames; i++) {
+      /* There is no prescribed format and this isn't interpreted further,
+       * so we simply use the default Android format.
+       */
+      const std::string& frame_line = backtrace->FormatFrameData(i);
+      debug_printf("%s\n", frame_line.c_str());
+   }
+}
+
+void
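The capture function above keeps one Backtrace handler per thread in a table guarded by a mutex, creating the handler on first use. The same pattern can be sketched in plain C without any Mesa or Android dependencies. Everything here is a hypothetical stand-in: `struct unwinder` replaces the Backtrace object, a small fixed-size array replaces the Mesa hash table, and a pthread mutex replaces `mtx_t`.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-thread Backtrace handler. */
struct unwinder {
   long tid;
};

#define MAX_CACHED_THREADS 64

static struct {
   long tid;
   struct unwinder *handler;
} handler_cache[MAX_CACHED_THREADS];
static size_t handler_cache_len;
static pthread_mutex_t handler_cache_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Return the handler for the given thread id, creating it on first use.
 * Returns NULL if the table is full or the allocation fails. */
static struct unwinder *
unwinder_for_thread(long tid)
{
   struct unwinder *handler = NULL;
   size_t i;

   pthread_mutex_lock(&handler_cache_mutex);

   /* Look for an existing entry first. */
   for (i = 0; i < handler_cache_len; i++) {
      if (handler_cache[i].tid == tid) {
         handler = handler_cache[i].handler;
         break;
      }
   }

   /* No entry yet: create one and remember it for later calls. */
   if (!handler && handler_cache_len < MAX_CACHED_THREADS) {
      handler = calloc(1, sizeof(*handler));
      if (handler) {
         handler->tid = tid;
         handler_cache[handler_cache_len].tid = tid;
         handler_cache[handler_cache_len].handler = handler;
         handler_cache_len++;
      }
   }

   pthread_mutex_unlock(&handler_cache_mutex);
   return handler;
}
```

Only lookup and insertion happen under the lock. The handler is then used outside it, which is safe as long as each handler is only ever used by the thread it belongs to.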
[Mesa-dev] [PATCH 0/2] Android backtrace support
This series adds Android backtrace support, which is a prerequisite for using the refcount debugging tool in gallium. It also comes in handy for impromptu debug outputs.

Unfortunately, it wasn't possible to reuse the existing libunwind implementation. The only sanctioned way for obtaining backtraces on Android is through their own, C++ only libbacktrace.

Example output from Oreo:

#00 pc 0028c14b  /system/vendor/lib/dri/gallium_dri.so
#01 pc 0003cbaf  /system/vendor/lib/dri/gallium_dri.so
#02 pc 00042eb3  /system/vendor/lib/dri/gallium_dri.so
#03 pc 22f1  /system/vendor/lib/libgbm.so
#04 pc 148b  /system/vendor/lib/hw/gralloc.gbm.so (gralloc_gbm_bo_lock+322)
#05 pc 15e5  /system/vendor/lib/hw/gralloc.gbm.so
#06 pc 551f  /system/vendor/lib/hw/android.hardware.graphics.map...@2.0-impl.so (android::hardware::graphics::mapper::V2_0::implementation::Gralloc0Mapper::lockBuffer(native_handle const*, unsigned long long, android::hardware::graphics::mapper::V2_0::IMapper::Rect const&, int, void**)+122)
#07 pc 4c61  /system/vendor/lib/hw/android.hardware.graphics.map...@2.0-impl.so (android::hardware::graphics::mapper::V2_0::implementation::GrallocMapper::lock(void*, unsigned long long, android::hardware::graphics::mapper::V2_0::IMapper::Rect const&, android::hardware::hidl_handle const&, std::__1::function)+84)
#08 pc 00013dfd  /system/lib/android.hardware.graphics.map...@2.0.so
#09 pc 000107cf  /system/lib/libui.so (android::Gralloc2::Mapper::lock(native_handle const*, unsigned long long, android::hardware::graphics::mapper::V2_0::IMapper::Rect const&, int, void**) const+110)

Stefan Schake (2):
  gallium/util: Don't stub u_debug_stack on Android
  gallium/util: Android backtrace support

 src/gallium/auxiliary/Android.mk                   |   3 +-
 src/gallium/auxiliary/util/u_debug_stack.c         |   2 +-
 .../auxiliary/util/u_debug_stack_android.cpp       | 111 +
 src/gallium/targets/dri/Android.mk                 |   1 +
 4 files changed, 115 insertions(+), 2 deletions(-)
 create mode 100644 src/gallium/auxiliary/util/u_debug_stack_android.cpp

-- 
2.7.4
[Mesa-dev] [PATCH 1/2] gallium/util: Don't stub u_debug_stack on Android
The fallback path for no libunwind ends up being stubs for Android. Don't compile them in so we can provide our own implementation.

Signed-off-by: Stefan Schake
---
 src/gallium/auxiliary/util/u_debug_stack.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_debug_stack.c b/src/gallium/auxiliary/util/u_debug_stack.c
index 846f648..5cbb54f 100644
--- a/src/gallium/auxiliary/util/u_debug_stack.c
+++ b/src/gallium/auxiliary/util/u_debug_stack.c
@@ -194,7 +194,7 @@ debug_backtrace_print(FILE *f,
    }
 }
 
-#else /* ! HAVE_LIBUNWIND */
+#elif !defined(ANDROID) /* ! HAVE_LIBUNWIND */
 
 #if defined(PIPE_OS_WINDOWS)
 #include <windows.h>
-- 
2.7.4
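The effect of turning `#else` into `#elif !defined(ANDROID)` can be illustrated with a small self-contained sketch. The function name and the returned strings are hypothetical. In the real series the Android implementation lives in a separate C++ source file rather than in a final `#else` branch; the branch is used here only so that the example compiles on its own.

```c
#include <string.h>

/* HAVE_LIBUNWIND and ANDROID would normally come from the build system.
 * With neither defined, the generic fallback is selected. */

#if defined(HAVE_LIBUNWIND)
static const char *
backtrace_backend(void)
{
   return "libunwind";
}
#elif !defined(ANDROID)
/* Before the patch this was a plain #else, so Android builds without
 * libunwind ended up here as well and got the stub implementation. */
static const char *
backtrace_backend(void)
{
   return "generic fallback";
}
#else
/* Android: nothing above is compiled in, so another definition has to
 * provide the symbol (a separate source file in the actual series). */
static const char *
backtrace_backend(void)
{
   return "android libbacktrace";
}
#endif
```

Built without either macro, `backtrace_backend()` returns "generic fallback". With `-DANDROID`, the first two branches are skipped entirely, which is what leaves room for a separate implementation.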
Re: [Mesa-dev] [Mesa-stable] [PATCH 1/5] i965: Hard code scratch_ids_per_subslice for Cherryview
On 2018-03-26 08:23:13, Juan A. Suarez Romero wrote:
> On Wed, 2018-03-07 at 00:16 -0800, Jordan Justen wrote:
> > Ken suggested that we might be underallocating scratch space on HD
> > 400. Allocating scratch space as though there was actually 8 EUs
> > seems to help with a GPU hang seen on synmark CSDof.
> >
>
> FYI, in order to pick this commit for next 17.3 stable release, I need to pick
> also:
>
> commit f9d5a7add42af5a2e4410526d1480a08f41317ae
> Author: Jordan Justen
> Date:   Tue Oct 31 00:34:32 2017 -0700
>
>     i965: Calculate thread_count in brw_alloc_stage_scratch

I believe that this commit led to a regression with compute shaders, which was fixed by:

commit a16dc04ad51c32e5c7d136e4dd6273d983385d3f
Author: Kenneth Graunke
Date:   Tue Oct 31 00:56:24 2017 -0700

    i965: properly initialize brw->cs.base.stage to MESA_SHADER_COMPUTE

You should probably add Ken's a16dc04ad51c before f9d5a7add42a.

-Jordan

>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
> > Cc: Kenneth Graunke
> > Cc: Eero Tamminen
> > Cc:
> > Signed-off-by: Jordan Justen
> > ---
> >  src/mesa/drivers/dri/i965/brw_program.c | 44 -
> >  1 file changed, 27 insertions(+), 17 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
> > index 527f003977b..c121136c439 100644
> > --- a/src/mesa/drivers/dri/i965/brw_program.c
> > +++ b/src/mesa/drivers/dri/i965/brw_program.c
> > @@ -402,23 +402,33 @@ brw_alloc_stage_scratch(struct brw_context *brw,
> >           if (devinfo->gen >= 9)
> >              subslices = 4 * brw->screen->devinfo.num_slices;
> >
> > -         /* WaCSScratchSize:hsw
> > -          *
> > -          * Haswell's scratch space address calculation appears to be sparse
> > -          * rather than tightly packed. The Thread ID has bits indicating
> > -          * which subslice, EU within a subslice, and thread within an EU
> > -          * it is. There's a maximum of two slices and two subslices, so these
> > -          * can be stored with a single bit. Even though there are only 10 EUs
> > -          * per subslice, this is stored in 4 bits, so there's an effective
> > -          * maximum value of 16 EUs. Similarly, although there are only 7
> > -          * threads per EU, this is stored in a 3 bit number, giving an effective
> > -          * maximum value of 8 threads per EU.
> > -          *
> > -          * This means that we need to use 16 * 8 instead of 10 * 7 for the
> > -          * number of threads per subslice.
> > -          */
> > -         const unsigned scratch_ids_per_subslice =
> > -            devinfo->is_haswell ? 16 * 8 : devinfo->max_cs_threads;
> > +         unsigned scratch_ids_per_subslice;
> > +         if (devinfo->is_haswell) {
> > +            /* WaCSScratchSize:hsw
> > +             *
> > +             * Haswell's scratch space address calculation appears to be sparse
> > +             * rather than tightly packed. The Thread ID has bits indicating
> > +             * which subslice, EU within a subslice, and thread within an EU it
> > +             * is. There's a maximum of two slices and two subslices, so these
> > +             * can be stored with a single bit. Even though there are only 10 EUs
> > +             * per subslice, this is stored in 4 bits, so there's an effective
> > +             * maximum value of 16 EUs. Similarly, although there are only 7
> > +             * threads per EU, this is stored in a 3 bit number, giving an
> > +             * effective maximum value of 8 threads per EU.
> > +             *
> > +             * This means that we need to use 16 * 8 instead of 10 * 7 for the
> > +             * number of threads per subslice.
> > +             */
> > +            scratch_ids_per_subslice = 16 * 8;
> > +         } else if (devinfo->is_cherryview) {
> > +            /* For Cherryview, it appears that the scratch addresses for the 6 EU
> > +             * devices may still generate compute scratch addresses covering the
> > +             * same range as 8 EU.
> > +             */
> > +            scratch_ids_per_subslice = 8 * 7;
> > +         } else {
> > +            scratch_ids_per_subslice = devinfo->max_cs_threads;
> > +         }
> >
> >           thread_count = scratch_ids_per_subslice * subslices;
> >           break;
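The selection logic in the quoted hunk boils down to a small pure function. The sketch below restates it outside the driver. `struct devinfo_sketch` and both function names are made up for illustration and only carry the three fields the hunk reads; the constants 16 * 8 and 8 * 7 are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical cut-down device description; the real driver keeps many
 * more fields than these. */
struct devinfo_sketch {
   bool is_haswell;
   bool is_cherryview;
   unsigned max_cs_threads;
};

/* Number of scratch IDs the hardware may generate per subslice. */
static unsigned
scratch_ids_per_subslice(const struct devinfo_sketch *devinfo)
{
   if (devinfo->is_haswell) {
      /* EU ID is stored in 4 bits and thread ID in 3 bits, so size for
       * 16 EUs x 8 threads rather than the physical 10 x 7. */
      return 16 * 8;
   } else if (devinfo->is_cherryview) {
      /* 6 EU parts may still address scratch as if they had 8 EUs,
       * with 7 threads each. */
      return 8 * 7;
   }
   return devinfo->max_cs_threads;
}

/* Thread count used to size the compute scratch allocation. */
static unsigned
cs_scratch_thread_count(const struct devinfo_sketch *devinfo,
                        unsigned subslices)
{
   return scratch_ids_per_subslice(devinfo) * subslices;
}
```

For example, a Cherryview part with 2 subslices is sized for 8 * 7 * 2 = 112 threads under this logic, regardless of the `max_cs_threads` value reported for the device.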
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #4 from Amarildo ---
OK, thanks :)

I've looked into this short explanation [1], but I have no "Make-config" file after cloning mesa.

[1] https://www.mesa3d.org/debugging.html

Is there another file I can add "-DDEBUG" to?
[Mesa-dev] [Bug 105371] r600_shader_from_tgsi - GPR limit exceeded - shader requires 360 registers
https://bugs.freedesktop.org/show_bug.cgi?id=105371

--- Comment #1 from mirh ---
Can confirm it fixes shaders 2 and 5 of the GraphicsFuzz demo:
http://www.graphicsfuzz.com/benchmark/android-v1.html

Should I wait for this (or, I dunno, some day sw fp64) to land before reporting the other "gcm_sched_late_pass: unscheduled ops" errors?
Re: [Mesa-dev] [PATCH 42/61] nir: Add a concept of per-member structs and a lowering pass
On 29/03/18 05:34, Jason Ekstrand wrote:
> On March 27, 2018 21:16:25 Timothy Arceri wrote:
> > So I've been thinking about structs and I'm pretty sure we should be able to write some passes to completely lower them away. Vertex shader inputs, buffer blocks and shader interface blocks cannot contain structs, so it seems to me the only blocker is the way we assign uniform and varying locations.
> >
> > Currently for a struct like this:
> >
> > struct S2 {
> >    int b;
> >    int c;
> > };
> >
> > struct S1 {
> >    int a;
> >    S2 s2[3];
> >    int d;
> > };
> >
> > uniform S1 s[2][2];
> >
> > We store things like so:
> >
> > s[0][0].a = location 0
> > s[0][0].s2[0].b = location 1
> > s[0][0].s2[0].c = location 2
> > s[0][0].s2[1].b = location 3
> > s[0][0].s2[1].c = location 4
> > s[0][0].s2[2].b = location 5
> > s[0][0].s2[2].c = location 6
> > ...
> >
> > If we had a GLSL IR pass that pushed the arrays down to the innermost member like so:
> >
> > struct S2 {
> >    int b[2][2][3];
> >    int c[2][2][3];
> > };
> >
> > struct S1 {
> >    int a[2][2];
> >    S2 s2;
> >    int d[2][2];
> > };
> >
> > uniform S1 s;
> >
> > We would instead store things like so:
> >
> > s[0][0].a = location 0
> > s[0][1].a = location 1
> > s[1][0].a = location 2
> > s[1][1].a = location 3
> > s[0][0].s2[0].b = location 4
> > s[0][0].s2[1].b = location 5
> > s[0][0].s2[2].b = location 6
> > ...
> >
> > This allows us to easily split the members out into independent arrays of arrays. To do this we might want to create the uniform (resource) name before pushing the arrays down, so that we still match up the correct uniforms with the names passed to the API, but that shouldn't be too difficult.
> >
> > With this in place we should be able to generate better shaders when structs are used and be able to delete a whole bunch of struct handling code (and avoid this new code/concept?).
> >
> > Does anyone see any holes in my analysis?
>
> Only that it doesn't help for SPIR-V which is the whole reason for this patch.

How does it not help SPIR-V? As per above it would seem the only reason we need to handle structs in NIR is because of the way we assign storage in OpenGL.

I'm still not overly familiar with SPIR-V, but is there any reason we can't do the struct splitting pass in NIR and be done with structs?
Re: [Mesa-dev] [PATCH] glsl: remove unreachable assert()
Reviewed-by: Timothy Arceri

On 29/03/18 04:25, Emil Velikov wrote:
> From: Emil Velikov
>
> An earlier commit enforced that we'll bail out if the number of
> terminators is different than 2. With that in mind, the assert() will
> never trigger.
>
> Fixes: 56b867395de ("glsl: fix infinite loop caused by bug in loop unrolling pass")
> Cc: Timothy Arceri
> Signed-off-by: Emil Velikov
> ---
>  src/compiler/glsl/loop_unroll.cpp | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/compiler/glsl/loop_unroll.cpp b/src/compiler/glsl/loop_unroll.cpp
> index f6efe6475a..874f418568 100644
> --- a/src/compiler/glsl/loop_unroll.cpp
> +++ b/src/compiler/glsl/loop_unroll.cpp
> @@ -528,8 +528,6 @@ loop_unroll_visitor::visit_leave(ir_loop *ir)
>     unsigned term_count = 0;
>     bool first_term_then_continue = false;
>     foreach_in_list(loop_terminator, t, &ls->terminators) {
> -      assert(term_count < 2);
> -
>        ir_if *ir_if = t->ir->as_if();
>        assert(ir_if != NULL);
>
[Mesa-dev] [Bug 105755] Mesa freezes when the GLSL shader contains a `for` loop with an uninitialized `i` index/counter variable
https://bugs.freedesktop.org/show_bug.cgi?id=105755

--- Comment #18 from Swyter ---
I bet that integers are the most common type by a wide margin. I also bet that most of these loops are meant to be unrolled and vectorized.

Covering 100% of the cases is almost impossible, but initializing uninitialized variables inside of the condition block to some known/sane value like NULL/0.f is probably a 90% solution, because the value is going to be wrong and cause an infinite loop anyway. You can only win.
[Mesa-dev] [PATCH v4 4/4] nvc0: add conservative rasterization support
Subpixel precision bias, dilation and the post-snap mode are supported on GM200 and newer. The pre-snap mode is supported for triangle primitives on GP100. --- src/gallium/drivers/nouveau/nvc0/mme/com9097.mme | 30 ++ src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h | 21 +++ src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h | 5 src/gallium/drivers/nouveau/nvc0/nvc0_macros.h | 4 ++- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 19 +- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 14 ++ src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 2 +- 7 files changed, 87 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme index 7c5ec8f52b..ecf9960667 100644 --- a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme +++ b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme @@ -550,3 +550,33 @@ qbw_postclamp: qbw_done: exit send (extrinsrt 0x0 $r4 0x0 0x10 0x10) maddrsend 0x44 + +/* NVC0_3D_MACRO_CONSERVATIVE_RASTER_STATE: + * + * This sets basically all the conservative rasterization state. It sets + * CONSERVATIVE_RASTER to one while doing so. 
+ * + * arg = biasx | biasy<<4 | (dilation*4)<<8 | mode<<10 + */ +.section #mme9097_conservative_raster_state + /* Mode and dilation */ + maddr 0x1d00 /* SCRATCH[0] */ + send 0x0 /* unknown */ + send (extrinsrt 0x0 $r1 8 3 23) /* value */ + mov $r2 0x7 + send (extrinsrt 0x0 $r2 0 3 23) /* write mask */ + maddr 0x18c4 /* FIRMWARE[4] */ + mov $r2 0x831 + send (extrinsrt 0x0 $r2 0 12 11) /* sends 0x418800 */ + /* Subpixel precision */ + mov $r2 (extrinsrt 0x0 $r1 0 3 0) + mov $r2 (extrinsrt $r2 $r1 4 4 8) + maddr 0x8287 /* SUBPIXEL_PRECISION[0] (incrementing by 8 methods) */ + mov $r3 16 /* loop counter */ +crs_loop: + mov $r3 (add $r3 -1) + branz $r3 #crs_loop + send $r2 + /* Enable */ + exit maddr 0x1452 /* CONSERVATIVE_RASTER */ + send 0x1 diff --git a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h index 9618da6e28..3eacda9a27 100644 --- a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h +++ b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h @@ -373,3 +373,24 @@ uint32_t mme9097_query_buffer_write[] = { 0x840100c2, 0x00110071, }; + +uint32_t mme9097_conservative_raster_state[] = { + 0x07400021, + 0x0041, + 0xb8d04042, +/* 0x000c: crs_loop */ + 0x0001c211, + 0xb8c08042, + 0x06310021, + 0x020c4211, + 0x5b008042, + 0x00c04212, + 0x41085212, + 0x20a1c021, + 0x00040311, + 0xdb11, + 0xd817, + 0x1041, + 0x051480a1, + 0x4041, +}; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h index d7245fbcae..c5456e48b5 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h @@ -447,6 +447,10 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
#define NVC0_3D_VIEWPORT_TRANSLATE_Z__ESIZE0x0020 #define NVC0_3D_VIEWPORT_TRANSLATE_Z__LEN 0x0010 +#define NVC0_3D_SUBPIXEL_PRECISION(i0)(0x0a1c + 0x20*(i0)) +#define NVC0_3D_SUBPIXEL_PRECISION__ESIZE 0x0020 +#define NVC0_3D_SUBPIXEL_PRECISION__LEN 0x0010 + #define NVC0_3D_VIEWPORT_HORIZ(i0)(0x0c00 + 0x10*(i0)) #define NVC0_3D_VIEWPORT_HORIZ__ESIZE 0x0010 #define NVC0_3D_VIEWPORT_HORIZ__LEN0x0010 @@ -780,6 +784,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. #define NVC0_3D_UNK11400x1140 #define NVC0_3D_UNK11440x1144 +#define NVC0_3D_CONSERVATIVE_RASTER0x1148 #define NVC0_3D_VTX_ATTR_DEFINE0x114c #define NVC0_3D_VTX_ATTR_DEFINE_ATTR__MASK 0x00ff diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h b/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h index eeacc714f3..7aa0633795 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h @@ -35,6 +35,8 @@ #define NVC0_3D_MACRO_QUERY_BUFFER_WRITE 0x3858 -#define NVC0_CP_MACRO_LAUNCH_GRID_INDIRECT 0x3860 +#define NVC0_CP_MACRO_LAUNCH_GRID_INDIRECT 0x3860 + +#define NVC0_3D_MACRO_CONSERVATIVE_RASTER_STATE 0x3868 #endif /* __NVC0_MACROS_H__ */ diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c index ddbb3ec16d..75c360ab82
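The macro comment in the patch documents its argument as `arg = biasx | biasy<<4 | (dilation*4)<<8 | mode<<10`. As a rough illustration, the helper below packs such an argument on the CPU side. The function name is hypothetical and so are the field widths: 4 bits for each bias, 2 bits for the dilation in quarter steps and 1 bit for the mode are assumptions read off the shift amounts, not something the patch states.

```c
#include <stdint.h>

/* Hypothetical helper: builds the macro argument following the layout
 * given in the macro's comment,
 *    arg = biasx | biasy<<4 | (dilation*4)<<8 | mode<<10
 * The masks below are assumed field widths, see the note above. */
static uint32_t
pack_conservative_raster_arg(unsigned biasx, unsigned biasy,
                             float dilation, unsigned mode)
{
   /* Dilation is assumed to be a multiple of 0.25 in [0, 0.75]. */
   uint32_t dilate_quarters = (uint32_t)(dilation * 4.0f);

   return (uint32_t)(biasx & 0xf) |
          ((uint32_t)(biasy & 0xf) << 4) |
          ((dilate_quarters & 0x3) << 8) |
          ((uint32_t)(mode & 0x1) << 10);
}
```

With biasx = 1, biasy = 2, dilation = 0.5 and mode = 1 this yields 0x1 | 0x20 | 0x200 | 0x400 = 0x621.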
[Mesa-dev] [PATCH v4 2/4] gallium: add initial support for conservative rasterization
Reviewed-by: Brian Paul
---
 src/gallium/docs/source/cso/rasterizer.rst       | 23 +++
 src/gallium/docs/source/screen.rst               | 18 ++
 src/gallium/drivers/etnaviv/etnaviv_screen.c     | 10 ++
 src/gallium/drivers/freedreno/freedreno_screen.c | 10 ++
 src/gallium/drivers/i915/i915_screen.c           | 13 +
 src/gallium/drivers/llvmpipe/lp_screen.c         | 12
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 10 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 10 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 10 ++
 src/gallium/drivers/r300/r300_screen.c           | 10 ++
 src/gallium/drivers/r600/r600_pipe.c             |  6 ++
 src/gallium/drivers/r600/r600_pipe_common.c      |  4
 src/gallium/drivers/radeonsi/si_get.c            | 10 ++
 src/gallium/drivers/softpipe/sp_screen.c         | 12
 src/gallium/drivers/svga/svga_screen.c           | 13 +
 src/gallium/drivers/swr/swr_screen.cpp           | 10 ++
 src/gallium/drivers/vc4/vc4_screen.c             | 13 -
 src/gallium/drivers/vc5/vc5_screen.c             | 13 -
 src/gallium/drivers/virgl/virgl_screen.c         | 10 ++
 src/gallium/include/pipe/p_defines.h             | 20
 src/gallium/include/pipe/p_state.h               |  8
 21 files changed, 243 insertions(+), 2 deletions(-)

diff --git a/src/gallium/docs/source/cso/rasterizer.rst b/src/gallium/docs/source/cso/rasterizer.rst
index 616e4511a2..4dabcc032f 100644
--- a/src/gallium/docs/source/cso/rasterizer.rst
+++ b/src/gallium/docs/source/cso/rasterizer.rst
@@ -340,3 +340,26 @@ clip_plane_enable
     If any clip distance output is written, those half-spaces for which no
     clip distance is written count as disabled; i.e. user clip planes and
     shader clip distances cannot be mixed, and clip distances take precedence.
+
+conservative_raster_mode
+    The conservative rasterization mode. For PIPE_CONSERVATIVE_RASTER_OFF,
+    conservative rasterization is disabled. For PIPE_CONSERVATIVE_RASTER_POST_SNAP
+    or PIPE_CONSERVATIVE_RASTER_PRE_SNAP, conservative rasterization is enabled.
+    When conservative rasterization is enabled, the polygon smooth, line smooth,
+    point smooth and line stipple settings are ignored.
+    With the post-snap mode, unlike the pre-snap mode, fragments are never
+    generated for degenerate primitives. Degenerate primitives, when rasterized,
+    are considered back-facing and the vertex attributes and depth are that of
+    the provoking vertex.
+    If the post-snap mode is used with an unsupported primitive, the pre-snap
+    mode is used, if supported. Behavior is similar for the pre-snap mode.
+    If the pre-snap mode is used, fragments are generated with respect to the primitive
+    before vertex snapping.
+
+conservative_raster_dilate
+    The amount of dilation during conservative rasterization.
+
+subpixel_precision_x
+    A bias added to the horizontal subpixel precision during conservative rasterization.
+subpixel_precision_y
+    A bias added to the vertical subpixel precision during conservative rasterization.
diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 3837360fb4..5bc6ee99f0 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -420,6 +420,18 @@ The integer capabilities:
   by the driver, and the driver can throw assertion failures.
 * ``PIPE_CAP_PACKED_UNIFORMS``: True if the driver supports packed uniforms
   as opposed to padding to vec4s.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES``: Whether the
+  PIPE_CONSERVATIVE_RASTER_POST_SNAP mode is supported for triangles.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES``: Whether the
+  PIPE_CONSERVATIVE_RASTER_POST_SNAP mode is supported for points and lines.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES``: Whether the
+  PIPE_CONSERVATIVE_RASTER_PRE_SNAP mode is supported for triangles.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES``: Whether the
+  PIPE_CONSERVATIVE_RASTER_PRE_SNAP mode is supported for points and lines.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE``: Whether PIPE_CAP_POST_DEPTH_COVERAGE
+  works with conservative rasterization.
+* ``PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS``: The maximum
+  subpixel precision bias in bits during conservative rasterization.
 
 
 .. _pipe_capf:
@@ -437,6 +449,12 @@ The floating-point capabilities are:
   applied to anisotropically filtered textures.
 * ``PIPE_CAPF_MAX_TEXTURE_LOD_BIAS``: The maximum :term:`LOD` bias that may be
   applied to filtered textures.
+* ``PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE``: The minimum conservative rasterization
+  dilation.
+* ``PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE``: The maximum conservative
[Mesa-dev] [PATCH v4 1/4] mesa: add support for nvidia conservative rasterization extensions
Although the specs are written against compatibility GL 4.3 and allows core profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+. --- src/mapi/glapi/gen/gl_API.xml | 47 src/mapi/glapi/gen/gl_genexec.py| 1 + src/mesa/Makefile.sources | 2 + src/mesa/main/attrib.c | 60 --- src/mesa/main/conservativeraster.c | 128 src/mesa/main/conservativeraster.h | 48 src/mesa/main/context.c | 10 +++ src/mesa/main/dlist.c | 86 + src/mesa/main/enable.c | 14 src/mesa/main/extensions_table.h| 4 + src/mesa/main/get.c | 3 + src/mesa/main/get_hash_params.py| 13 src/mesa/main/mtypes.h | 28 ++- src/mesa/main/tests/dispatch_sanity.cpp | 27 +++ src/mesa/main/viewport.c| 57 ++ src/mesa/main/viewport.h| 6 ++ src/mesa/meson.build| 2 + 17 files changed, 525 insertions(+), 11 deletions(-) create mode 100644 src/mesa/main/conservativeraster.c create mode 100644 src/mesa/main/conservativeraster.h diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 38c1921047..db312370b1 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -12871,6 +12871,53 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + http://www.w3.org/2001/XInclude"/> diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py index aaff9f230b..be8013b62b 100644 --- a/src/mapi/glapi/gen/gl_genexec.py +++ b/src/mapi/glapi/gen/gl_genexec.py @@ -62,6 +62,7 @@ header = """/** #include "main/colortab.h" #include "main/compute.h" #include "main/condrender.h" +#include "main/conservativeraster.h" #include "main/context.h" #include "main/convolve.h" #include "main/copyimage.h" diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 0446078136..43ec55f580 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -49,6 +49,8 @@ MAIN_FILES = \ main/condrender.c \ main/condrender.h \ main/config.h \ + main/conservativeraster.c \ + main/conservativeraster.h \ main/context.c \ 
main/context.h \ main/convolve.c \ diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c index 9d3aa728a1..e785aa5549 100644 --- a/src/mesa/main/attrib.c +++ b/src/mesa/main/attrib.c @@ -138,6 +138,9 @@ struct gl_enable_attrib /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */ GLboolean sRGBEnabled; + + /* GL_NV_conservative_raster */ + GLboolean ConservativeRasterization; }; @@ -178,6 +181,13 @@ struct texture_state }; +struct viewport_state +{ + struct gl_viewport_attrib ViewportArray[MAX_VIEWPORTS]; + GLuint SubpixelPrecisionBias[2]; +}; + + /** An unused GL_*_BIT value */ #define DUMMY_BIT 0x1000 @@ -394,6 +404,9 @@ _mesa_PushAttrib(GLbitfield mask) /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */ attr->sRGBEnabled = ctx->Color.sRGBEnabled; + + /* GL_NV_conservative_raster */ + attr->ConservativeRasterization = ctx->ConservativeRasterization; } if (mask & GL_EVAL_BIT) { @@ -545,11 +558,23 @@ _mesa_PushAttrib(GLbitfield mask) } if (mask & GL_VIEWPORT_BIT) { - if (!push_attrib(ctx, , GL_VIEWPORT_BIT, - sizeof(struct gl_viewport_attrib) - * ctx->Const.MaxViewports, - (void*)>ViewportArray)) + struct viewport_state *viewstate = CALLOC_STRUCT(viewport_state); + if (!viewstate) { + _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_VIEWPORT_BIT)"); + goto end; + } + + if (!save_attrib_data(, GL_VIEWPORT_BIT, viewstate)) { + free(viewstate); + _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_VIEWPORT_BIT)"); goto end; + } + + memcpy(>ViewportArray, >ViewportArray, + sizeof(struct gl_viewport_attrib)*ctx->Const.MaxViewports); + + viewstate->SubpixelPrecisionBias[0] = ctx->SubpixelPrecisionBias[0]; + viewstate->SubpixelPrecisionBias[1] = ctx->SubpixelPrecisionBias[1]; } /* GL_ARB_multisample */ @@ -714,6 +739,13 @@ pop_enable_group(struct gl_context *ctx, const struct gl_enable_attrib *enable) TEST_AND_UPDATE(ctx->Color.sRGBEnabled, enable->sRGBEnabled, GL_FRAMEBUFFER_SRGB); + /* GL_NV_conservative_raster */ + if 
(ctx->Extensions.NV_conservative_raster) { +
[Mesa-dev] [PATCH v4 3/4] st/mesa: add support for nvidia conservative rasterization extensions
Reviewed-by: Brian Paul
---
 src/mesa/state_tracker/st_atom_rasterizer.c | 15 +
 src/mesa/state_tracker/st_context.c         |  2 ++
 src/mesa/state_tracker/st_extensions.c      | 34 +
 3 files changed, 51 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c b/src/mesa/state_tracker/st_atom_rasterizer.c
index 1be072e6e3..5b747a924e 100644
--- a/src/mesa/state_tracker/st_atom_rasterizer.c
+++ b/src/mesa/state_tracker/st_atom_rasterizer.c
@@ -298,5 +298,20 @@ st_update_rasterizer(struct st_context *st)
    raster->clip_plane_enable = ctx->Transform.ClipPlanesEnabled;
    raster->clip_halfz = (ctx->Transform.ClipDepthMode == GL_ZERO_TO_ONE);
 
+   /* ST_NEW_RASTERIZER */
+   if (ctx->ConservativeRasterization) {
+      if (ctx->ConservativeRasterMode == GL_CONSERVATIVE_RASTER_MODE_POST_SNAP_NV)
+         raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_POST_SNAP;
+      else
+         raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_PRE_SNAP;
+   } else {
+      raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_OFF;
+   }
+
+   raster->conservative_raster_dilate = ctx->ConservativeRasterDilate;
+
+   raster->subpixel_precision_x = ctx->NvSubpixelPrecisionBias[0];
+   raster->subpixel_precision_y = ctx->NvSubpixelPrecisionBias[1];
+
    cso_set_rasterizer(st->cso_context, raster);
 }
diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c
index 90b7f9359a..0709681e16 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -344,6 +344,8 @@ st_init_driver_flags(struct st_context *st)
    f->NewPolygonState = ST_NEW_RASTERIZER;
    f->NewPolygonStipple = ST_NEW_POLY_STIPPLE;
    f->NewViewport = ST_NEW_VIEWPORT;
+   f->NewNvConservativeRasterization = ST_NEW_RASTERIZER;
+   f->NewNvConservativeRasterizationParams = ST_NEW_RASTERIZER;
 }
 
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index bea61f21cb..539ba7e245 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -494,6 +494,16 @@ void st_init_limits(struct pipe_screen *screen,
    c->UseSTD430AsDefaultPacking =
       screen->get_param(screen, PIPE_CAP_LOAD_CONSTBUF);
 
+   c->MaxSubpixelPrecisionBiasBits =
+      screen->get_param(screen, PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS);
+
+   c->ConservativeRasterDilateRange[0] =
+      screen->get_paramf(screen, PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE);
+   c->ConservativeRasterDilateRange[1] =
+      screen->get_paramf(screen, PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE);
+   c->ConservativeRasterDilateGranularity =
+      screen->get_paramf(screen, PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY);
+
    /* limit the max combined shader output resources to a driver limit */
    temp = screen->get_param(screen, PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES);
    if (temp > 0 && c->MaxCombinedShaderOutputResources > temp)
@@ -1363,4 +1373,28 @@ void st_init_extensions(struct pipe_screen *screen,
                                  extensions->ARB_texture_cube_map_array &&
                                  extensions->ARB_texture_stencil8 &&
                                  extensions->ARB_texture_multisample;
+
+   if (screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES) &&
+       screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES) &&
+       screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE)) {
+      float max_dilate;
+      bool pre_snap_triangles, pre_snap_points_lines;
+
+      max_dilate = screen->get_paramf(screen, PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE);
+
+      pre_snap_triangles =
+         screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES);
+      pre_snap_points_lines =
+         screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES);
+
+      extensions->NV_conservative_raster =
+         screen->get_param(screen, PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS) > 1;
+
+      if (extensions->NV_conservative_raster) {
+         extensions->NV_conservative_raster_dilate = max_dilate >= 0.75;
+         extensions->NV_conservative_raster_pre_snap_triangles = pre_snap_triangles;
+         extensions->NV_conservative_raster_pre_snap =
+            pre_snap_triangles && pre_snap_points_lines;
+      }
+   }
 }
-- 
2.14.3
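The extension gating at the end of this patch reads as a small decision table. Below is a minimal Python sketch of that logic, not the st/mesa code itself. The dictionary keys are hypothetical shorthand for the PIPE_CAP_* and PIPE_CAPF_* queries in the diff, and the function name is made up for illustration.

```python
def conservative_raster_extensions(caps):
    """Mirror the gating added to st_init_extensions() in the patch.

    `caps` is a hypothetical dict standing in for screen->get_param() and
    screen->get_paramf(); the key names are shorthand, not Gallium API.
    """
    exts = {
        "NV_conservative_raster": False,
        "NV_conservative_raster_dilate": False,
        "NV_conservative_raster_pre_snap_triangles": False,
        "NV_conservative_raster_pre_snap": False,
    }

    # All three post-snap caps are required before anything is exposed.
    if not (caps["post_snap_triangles"]
            and caps["post_snap_points_lines"]
            and caps["post_depth_coverage"]):
        return exts

    # The base extension also needs a subpixel precision bias cap above 1.
    exts["NV_conservative_raster"] = caps["max_subpixel_precision_bias"] > 1

    # The dependent extensions are only considered once the base one is on.
    if exts["NV_conservative_raster"]:
        exts["NV_conservative_raster_dilate"] = caps["max_dilate"] >= 0.75
        exts["NV_conservative_raster_pre_snap_triangles"] = bool(
            caps["pre_snap_triangles"])
        exts["NV_conservative_raster_pre_snap"] = bool(
            caps["pre_snap_triangles"] and caps["pre_snap_points_lines"])
    return exts
```

Note that in this reading a driver reporting pre-snap triangles but not pre-snap points/lines would get NV_conservative_raster_pre_snap_triangles without NV_conservative_raster_pre_snap.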
[Mesa-dev] [PATCH v4 0/4] Implement Various Conservative Rasterization Extensions
This patch-set adds support for GL_NV_conservative_raster and
GL_NV_conservative_raster_dilate on GM2xx and newer. It also adds support for
GL_NV_conservative_raster_pre_snap_triangles on GP1xx.

In doing so, it implements various functions in mesa core, extends the Gallium
API, connects the new mesa core functions and the Gallium API through st/mesa
and implements support for the Gallium API in the Nouveau driver.

Changes in v4:
- many small stylistic changes
- small updates to reduce the size of the PGRAPH macro

Changes in v3:
- rename SubpixelPrecisionBias to NvSubpixelPrecisionBias
- move the subpixel precision bias into pipe_rasterizer_state
- set the conservative rasterization state using a PGRAPH macro

Changes in v2:
- fix indentation error in gl_API.xml
- fix code to handle earlier hardware

Rhys Perry (4):
  mesa: add support for nvidia conservative rasterization extensions
  gallium: add initial support for conservative rasterization
  st/mesa: add support for nvidia conservative rasterization extensions
  nvc0: add conservative rasterization support

 src/gallium/docs/source/cso/rasterizer.rst         |  23
 src/gallium/docs/source/screen.rst                 |  18 +++
 src/gallium/drivers/etnaviv/etnaviv_screen.c       |  10 ++
 src/gallium/drivers/freedreno/freedreno_screen.c   |  10 ++
 src/gallium/drivers/i915/i915_screen.c             |  13 +++
 src/gallium/drivers/llvmpipe/lp_screen.c           |  12 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c     |  10 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c     |  10 ++
 src/gallium/drivers/nouveau/nvc0/mme/com9097.mme   |  30 +
 src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h |  21
 src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h     |   5 +
 src/gallium/drivers/nouveau/nvc0/nvc0_macros.h     |   4 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c     |  17 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c      |  14 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h   |   2 +-
 src/gallium/drivers/r300/r300_screen.c             |  10 ++
 src/gallium/drivers/r600/r600_pipe.c               |   6 +
 src/gallium/drivers/r600/r600_pipe_common.c        |   4 +
 src/gallium/drivers/radeonsi/si_get.c              |  10 ++
 src/gallium/drivers/softpipe/sp_screen.c           |  12 ++
 src/gallium/drivers/svga/svga_screen.c             |  13 +++
 src/gallium/drivers/swr/swr_screen.cpp             |  10 ++
 src/gallium/drivers/vc4/vc4_screen.c               |  13 ++-
 src/gallium/drivers/vc5/vc5_screen.c               |  13 ++-
 src/gallium/drivers/virgl/virgl_screen.c           |  10 ++
 src/gallium/include/pipe/p_defines.h               |  20
 src/gallium/include/pipe/p_state.h                 |   8 ++
 src/mapi/glapi/gen/gl_API.xml                      |  47
 src/mapi/glapi/gen/gl_genexec.py                   |   1 +
 src/mesa/Makefile.sources                          |   2 +
 src/mesa/main/attrib.c                             |  60 --
 src/mesa/main/conservativeraster.c                 | 128 +
 src/mesa/main/conservativeraster.h                 |  48
 src/mesa/main/context.c                            |  10 ++
 src/mesa/main/dlist.c                              |  86 ++
 src/mesa/main/enable.c                             |  14 +++
 src/mesa/main/extensions_table.h                   |   4 +
 src/mesa/main/get.c                                |   3 +
 src/mesa/main/get_hash_params.py                   |  13 +++
 src/mesa/main/mtypes.h                             |  28 -
 src/mesa/main/tests/dispatch_sanity.cpp            |  27 +
 src/mesa/main/viewport.c                           |  57 +
 src/mesa/main/viewport.h                           |   6 +
 src/mesa/meson.build                               |   2 +
 src/mesa/state_tracker/st_atom_rasterizer.c        |  15 +++
 src/mesa/state_tracker/st_context.c                |   2 +
 src/mesa/state_tracker/st_extensions.c             |  34 ++
 47 files changed, 900 insertions(+), 15 deletions(-)
 create mode 100644 src/mesa/main/conservativeraster.c
 create mode 100644 src/mesa/main/conservativeraster.h

-- 
2.14.3
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #3 from Samuel Pitoiset ---
Well, the main problem is that I don't have any GCN 1.0 cards, and when I tried on Polaris it didn't crash...

You will need to clone mesa from https://cgit.freedesktop.org/mesa/mesa/ and build it. Let me know if you need help.
[Mesa-dev] [Bug 105755] Mesa freezes when the GLSL shader contains a `for` loop with an uninitialized `i` index/counter variable
https://bugs.freedesktop.org/show_bug.cgi?id=105755

--- Comment #17 from Ilia Mirkin ---
(In reply to iive from comment #16)
> With SSA and phi it is very easy to find when variable is used uninitialized
> and handle the case in deterministic way.

So initialize all those to MIN_INT? That'll work nicely. That's the thing - there is no single "correct" thing to do in such situations.

  for (float f; f > 0; f -= 1.0) { ... }

You really can't predict it. There's no real point in trying.
Re: [Mesa-dev] [PATCH v3 4/4] nvc0: add conservative rasterization support
On Wed, Mar 28, 2018 at 6:35 AM, Rhys Perrywrote: > Subpixel precision bias, dilation and the post-snap mode are supported on > GM200 and newer. The pre-snap mode is supported for triangle primitives on > GP100. > --- > src/gallium/drivers/nouveau/nvc0/mme/com9097.mme | 32 > ++ > src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h | 22 +++ > src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h | 5 > src/gallium/drivers/nouveau/nvc0/nvc0_macros.h | 4 ++- > src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 19 + > src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 14 ++ > src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 2 +- > 7 files changed, 90 insertions(+), 8 deletions(-) > > diff --git a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme > b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme > index 7c5ec8f52b..83032da9de 100644 > --- a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme > +++ b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme > @@ -550,3 +550,35 @@ qbw_postclamp: > qbw_done: > exit send (extrinsrt 0x0 $r4 0x0 0x10 0x10) > maddrsend 0x44 > + > +/* NVC0_3D_MACRO_CONSERVATIVE_RASTER_STATE: > + * > + * This sets basically all the conservative rasterization state. It sets > + * CONSERVATIVE_RASTER to one while doing so. > + * > + * arg = biasx | biasy<<4 | (dilation*4)<<8 | mode<<10 > + */ > +.section #mme9097_conservative_raster_state > + /* Mode and dilation */ > + maddr 0x1d00 /* SCRATCH[0] */ > + send 0x0 /* unknown */ > + send (extrinsrt 0x0 $r1 8 3 23) /* value */ > + mov $r2 0x7 > + send (extrinsrt 0x0 $r2 0 3 23) /* write mask */ > + maddr 0x18c4 /* FIRMWARE[4] */ > + mov $r2 0x831 > + send (extrinsrt 0x0 $r2 0 12 11) /* sends 0x418800 */ > + /* Subpixel precision */ > + mov $r2 0xf > + mov $r2 (and $r1 $r2) This can just be mov $r2 (extrinsrt 0x0 $r1 0 3 0) ... or something. (I'm going from memory on the extrinsrt args.) 
> + mov $r2 (extrinsrt $r2 $r1 4 4 8) > + maddr 0x8287 /* SUBPIXEL_PRECISION[0] (incrementing by 8 methods) */ > + mov $r3 16 /* loop counter */ > + mov $r4 1 /* loop decrement */ > +loop: > + mov $r3 (sub $r3 $r4) mov $r3 (add $r3 -1) > + branz $r3 #loop > + send $r2 > + /* Enable */ > + exit maddr 0x1452 /* CONSERVATIVE_RASTER */ > + send 0x1 > diff --git a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h > b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h > index 9618da6e28..b8b69eb544 100644 > --- a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h > +++ b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h > @@ -373,3 +373,25 @@ uint32_t mme9097_query_buffer_write[] = { > 0x840100c2, > 0x00110071, > }; > + > +uint32_t mme9097_conservative_raster_state[] = { > + 0x07400021, > + 0x0041, > + 0xb8d04042, > + 0x0001c211, > + 0xb8c08042, > + 0x06310021, > + 0x020c4211, > + 0x5b008042, > + 0x0003c211, > + 0x00148a10, > + 0x41085212, > + 0x20a1c021, > + 0x00040311, > + 0x4411, > + 0x00051b10, > + 0xd817, > + 0x1041, > + 0x051480a1, > + 0x4041, > +}; > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h > b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h > index d7245fbcae..c5456e48b5 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h > @@ -447,6 +447,10 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > SOFTWARE. > #define NVC0_3D_VIEWPORT_TRANSLATE_Z__ESIZE0x0020 > #define NVC0_3D_VIEWPORT_TRANSLATE_Z__LEN 0x0010 > > +#define NVC0_3D_SUBPIXEL_PRECISION(i0)(0x0a1c + > 0x20*(i0)) > +#define NVC0_3D_SUBPIXEL_PRECISION__ESIZE 0x0020 > +#define NVC0_3D_SUBPIXEL_PRECISION__LEN > 0x0010 > + > #define NVC0_3D_VIEWPORT_HORIZ(i0)(0x0c00 + > 0x10*(i0)) > #define NVC0_3D_VIEWPORT_HORIZ__ESIZE 0x0010 > #define NVC0_3D_VIEWPORT_HORIZ__LEN0x0010 > @@ -780,6 +784,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > SOFTWARE. 
> #define NVC0_3D_UNK11400x1140 > > #define NVC0_3D_UNK11440x1144 > +#define NVC0_3D_CONSERVATIVE_RASTER0x1148 > > #define NVC0_3D_VTX_ATTR_DEFINE0x114c > #define NVC0_3D_VTX_ATTR_DEFINE_ATTR__MASK 0x00ff > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h > b/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h > index eeacc714f3..7aa0633795 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h > +++
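The macro comment in the quoted patch documents its argument as `arg = biasx | biasy<<4 | (dilation*4)<<8 | mode<<10`. As a rough illustration only, that packing could be sketched in Python as below. The function names are hypothetical, and the field widths (4 bits per bias, 2 bits of dilation in quarter steps, 1 bit of mode) are inferred from the shift amounts and from the 3-bit extract at bit 8 in the macro, not taken from hardware documentation.

```python
def pack_conservative_raster_arg(bias_x, bias_y, dilation, mode):
    """Pack the macro argument as described in the patch comment.

    Field widths are assumptions inferred from the shifts in the comment.
    """
    if not (0 <= bias_x <= 0xF and 0 <= bias_y <= 0xF):
        raise ValueError("bias values are assumed to fit in 4 bits each")
    quarters = dilation * 4
    if quarters != int(quarters) or not 0 <= quarters <= 3:
        raise ValueError("dilation is assumed to be 0, 0.25, 0.5 or 0.75")
    if mode not in (0, 1):
        raise ValueError("mode is assumed to be a single bit")
    return bias_x | (bias_y << 4) | (int(quarters) << 8) | (mode << 10)


def unpack_conservative_raster_arg(arg):
    """Inverse of the packing above, under the same assumed field widths."""
    bias_x = arg & 0xF
    bias_y = (arg >> 4) & 0xF
    dilation = ((arg >> 8) & 0x3) / 4
    mode = (arg >> 10) & 0x1
    return bias_x, bias_y, dilation, mode
```

Under these assumptions the dilation and mode fields are adjacent, which is consistent with the macro forwarding them together as a single 3-bit value.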
Re: [Mesa-dev] [PATCH 42/61] nir: Add a concept of per-member structs and a lowering pass
On March 27, 2018 21:16:25 Timothy Arceri wrote:

> So I've been thinking about structs and I'm pretty sure we should be able to
> write some passes to completely lower them away.
>
> Vertex shader inputs, buffer blocks and shader interface blocks cannot
> contain structs, so it seems to me the only blocker is the way we assign
> uniform and varying locations. Currently for a struct like this:
>
>   struct S2 { int b; int c; };
>   struct S1 { int a; S2 s2[3]; int d; };
>   uniform S1 s[2][2];
>
> We store things like so:
>
>   s[0][0].a = location 0
>   s[0][0].s2[0].b = location 1
>   s[0][0].s2[0].c = location 2
>   s[0][0].s2[1].b = location 3
>   s[0][0].s2[1].c = location 4
>   s[0][0].s2[2].b = location 5
>   s[0][0].s2[2].c = location 6
>   ...
>
> If we had a GLSL IR pass that pushed the arrays down to the innermost
> member like so:
>
>   struct S2 { int b[2][2][3]; int c[2][2][3]; };
>   struct S1 { int a[2][2]; S2 s2; int d[2][2]; };
>   uniform S1 s;
>
> We would instead store things like so:
>
>   s[0][0].a = location 0
>   s[0][1].a = location 1
>   s[1][0].a = location 2
>   s[1][1].a = location 3
>   s[0][0].s2[0].b = location 4
>   s[0][0].s2[1].b = location 5
>   s[0][0].s2[2].b = location 6
>   ...
>
> This allows us to easily split the members out into independent arrays of
> arrays. To do this we might want to create the uniform (resource) name
> before pushing the arrays down so that we still match up the correct
> uniforms with the names passed to the API, but that shouldn't be too
> difficult.
>
> With this in place we should be able to generate better shaders when
> structs are used and be able to delete a whole bunch of struct handling
> code (and avoid this new code/concept?).
>
> Does anyone see any holes in my analysis?

Only that it doesn't help for SPIR-V, which is the whole reason for this patch.
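The two location layouts in the proposal above can be compared with a short sketch. This is a toy model in Python, not GLSL IR or NIR code: the tuple-based type description and all function names are hypothetical, and each int is assumed to take exactly one location.

```python
import re

# Hypothetical type description: ("int",), ("array", length, element_type)
# or ("struct", [(field_name, field_type), ...]).
S2 = ("struct", [("b", ("int",)), ("c", ("int",))])
S1 = ("struct", [("a", ("int",)), ("s2", ("array", 3, S2)), ("d", ("int",))])
S = ("array", 2, ("array", 2, S1))  # uniform S1 s[2][2];


def leaves(ty, name):
    """Yield fully indexed leaf names in declaration order."""
    kind = ty[0]
    if kind == "int":
        yield name
    elif kind == "array":
        for i in range(ty[1]):
            yield from leaves(ty[2], f"{name}[{i}]")
    else:  # struct
        for field, field_ty in ty[1]:
            yield from leaves(field_ty, f"{name}.{field}")


def current_locations(ty, name):
    """Current scheme: one location per leaf, in declaration order."""
    return {leaf: loc for loc, leaf in enumerate(leaves(ty, name))}


def pushed_down_locations(ty, name):
    """Proposed scheme: every member becomes one contiguous array of arrays."""
    member_order = {}
    keyed = []
    for leaf in leaves(ty, name):
        member = re.sub(r"\[\d+\]", "", leaf)  # "s[0][1].s2[2].b" -> "s.s2.b"
        member_order.setdefault(member, len(member_order))
        keyed.append((member_order[member], leaf))
    # Stable sort: groups by member, keeps index order within each member.
    keyed.sort(key=lambda item: item[0])
    return {leaf: loc for loc, (_, leaf) in enumerate(keyed)}
```

Both functions hand out the same number of locations; only the order differs. In the pushed-down layout each member (`a`, `b`, `c`, `d`) occupies one contiguous range, which is the property that would let the members be split into independent arrays.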
Re: [Mesa-dev] [PATCH v5 1/2] anv/cmd_buffer: consider multiview masks for tracking pending clear aspects
On March 27, 2018 23:14:09 Iago Toralwrote: On Tue, 2018-03-27 at 09:06 -0700, Jason Ekstrand wrote: I'm sorry I've been so incredibly out-to-lunch on reviewing this. :-( On Tue, Mar 27, 2018 at 12:53 AM, Iago Toral Quiroga wrote: When multiview is active a subpass clear may only clear a subset of the attachment layers. Other subpasses in the same render pass may also clear too and we want to honor those clears as well, however, we need to ensure that we only clear a layer once, on the first subpass that uses a particular layer (view) of a given attachment. This means that when we check if a subpass attachment needs to be cleared we need to check if all the layers used by that subpass (as indicated by its view_mask) have already been cleared in previous subpasses or not, in which case, we must clear any pending layers used by the subpass, and only those pending. v2: - track pending clear views in the attachment state (Jason) - rebased on top of fast-clear rework. v3: - rebased on top of subpass rework. v4: rebased. v5 (Caio): - Rebased. - Initialize pending clear views to only have bits set for layers that exist. - Reset pending clear views in one go rather one bit at a time. - Put "last subpass for this attachment" condition in a separate function to simplify the conditional that resets pending_clear_aspects. 
Fixes: dEQP-VK.multiview.readback_implicit_clear.* --- src/intel/vulkan/anv_private.h | 8 src/intel/vulkan/genX_cmd_buffer.c | 91 -- 2 files changed, 96 insertions(+), 3 deletions(-) diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index ee533581ab..0e209e1769 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -1724,6 +1724,14 @@ struct anv_attachment_state { VkClearValue clear_value; bool clear_color_is_zero_one; bool clear_color_is_zero; + + /* When multiview is active, attachments with a renderpass clear +* operation have their respective layers cleared on the first +* subpass that uses them, and only in that subpass. We keep track +* of this using a bitfield to indicate which layers of an attachment +* have not been cleared yet when multiview is active. +*/ + uint32_t pending_clear_views; Thinking about this some more and reading our previous discussion, I do think that we could have a view_mask in anv_render_pass_attachment and just clear all the used views up-front at once instead of when they're first used. It might make things slightly simpler and potentially more efficient in future since we could batch the BLORP ops up. However, I've also been really lazy about reviewing so I don't really want to make you go rework this yet again. I think this is correct and fairly reasonable so this version is Actually, this was explictly discussed in the CTS working group some time after we talked about it, and the conclusion was that drivers that do that would not be following the spec. I am not sure that there is a good reason for the spec to request this behavior though... I don't know how they would be able to tell so long as the driver was careful to only clear the views that actually get used. In any case, this is fine. Reviewed-by: Jason Ekstrand Thanks!, what about the other patch? 
:) Rb for that one too }; /** State tracking for particular pipeline bind point diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index b5741fb8dc..3c55cd964c 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -1250,6 +1250,9 @@ genX(cmd_buffer_setup_attachments)(struct anv_cmd_buffer *cmd_buffer, anv_assert(iview->vk_format == att->format); anv_assert(iview->n_planes == 1); + const uint32_t num_layers = iview->planes[0].isl.array_len; + state->attachments[i].pending_clear_views = (1 << num_layers) - 1; + union isl_color_value clear_color = { .u32 = { 0, } }; if (att_aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) { assert(att_aspects == VK_IMAGE_ASPECT_COLOR_BIT); @@ -3414,6 +3417,42 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer) cmd_buffer->state.hiz_enabled = info.hiz_usage == ISL_AUX_USAGE_HIZ; } +/** + * This ANDs the view mask of the current subpass with the pending clear + * views in the attachment to get the mask of views active in the subpass + * that still need to be cleared. + */ +static inline uint32_t +get_multiview_subpass_clear_mask(const struct anv_cmd_state *cmd_state, + const struct anv_attachment_state *att_state) +{ + return cmd_state->subpass->view_mask & att_state->pending_clear_views; +} + +static inline bool +do_first_layer_clear(const struct anv_cmd_state *cmd_state,
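The bookkeeping described in the commit message (clear each view once, on the first subpass that uses it) can be illustrated with plain bitmask arithmetic. The Python sketch below is not the anv implementation. The function names are hypothetical, and the point at which pending bits are dropped is an assumption, since the quoted hunk is cut off before that code.

```python
def initial_pending_views(num_layers):
    """One pending bit per existing layer, as in (1 << num_layers) - 1."""
    return (1 << num_layers) - 1


def subpass_clear_mask(view_mask, pending_clear_views):
    """Views active in this subpass that still need to be cleared."""
    return view_mask & pending_clear_views


def simulate_render_pass(num_layers, subpass_view_masks):
    """Return the views cleared by each subpass and the leftover pending mask.

    Assumption: pending bits are dropped right after the subpass clears them.
    """
    pending = initial_pending_views(num_layers)
    cleared_per_subpass = []
    for view_mask in subpass_view_masks:
        to_clear = subpass_clear_mask(view_mask, pending)
        cleared_per_subpass.append(to_clear)
        pending &= ~to_clear
    return cleared_per_subpass, pending
```

For a 4-layer attachment and subpass view masks 0b0011, 0b0110 and 0b1000, the second subpass clears only view 2, because view 1 was already cleared by the first subpass.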
[Mesa-dev] [Bug 105755] Mesa freezes when the GLSL shader contains a `for` loop with an uninitialized `i` index/counter variable
https://bugs.freedesktop.org/show_bug.cgi?id=105755

--- Comment #16 from i...@yahoo.com ---
(In reply to Ilia Mirkin from comment #11)
> As an aside... there's no compilation bug here. Perhaps $other driver
> happens to get you a value of 0, but nothing guarantees that. Could just be
> luck in precisely how they do RA. Certainly nothing in the spec.
>
> And there are legitimate situations where a variable might be uninitialized
> prior to (compiler-proven) use, e.g.
>
> int x;
> int y = 0;
> loop {
>   if (a)
>     y += x;
>   if (b)
>     x = 5;
> }
>
> The compiler couldn't reasonably prove that a never happens before b does
> (except in some circumstances).
>
> Would it be the end of the world if one were to add code to zero-initialize
> all variables? No - but it'd add unnecessary code to otherwise functioning
> shaders.
>
> Mesa tends to stick to what's required by the spec.

The spec might not define what you should do in this case, but this means that we are free to handle the case as we see fit. Especially if we do something reasonable and consistent.

If the compiler cannot reasonably prove that the variable will be initialized and the loop will finish, then maybe that is because it won't. It is possible that the shader would work properly only for a limited range of inputs. If these inputs are the result of previous rendering, it might cause bugs/hangs that are extremely hard to trigger and reproduce.

With SSA and phi it is very easy to find when variable is used uninitialized and handle the case in deterministic way.
[Mesa-dev] [PATCH] glsl: remove unreachable assert()
From: Emil Velikov

Earlier commit enforced that we'll bail out if the number of terminators is different than 2. With that in mind, the assert() will never trigger.

Fixes: 56b867395de ("glsl: fix infinite loop caused by bug in loop unrolling pass")
Cc: Timothy Arceri
Signed-off-by: Emil Velikov
---
 src/compiler/glsl/loop_unroll.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/compiler/glsl/loop_unroll.cpp b/src/compiler/glsl/loop_unroll.cpp
index f6efe6475a..874f418568 100644
--- a/src/compiler/glsl/loop_unroll.cpp
+++ b/src/compiler/glsl/loop_unroll.cpp
@@ -528,8 +528,6 @@ loop_unroll_visitor::visit_leave(ir_loop *ir)
    unsigned term_count = 0;
    bool first_term_then_continue = false;
    foreach_in_list(loop_terminator, t, >terminators) {
-      assert(term_count < 2);
-
       ir_if *ir_if = t->ir->as_if();
       assert(ir_if != NULL);
 
-- 
2.16.0
Re: [Mesa-dev] [PATCH 2/2 v2] nir: mako all the intrinsics
Rob Herringwrites: > On Wed, Mar 28, 2018 at 10:18 AM, Rob Clark wrote: >> On Wed, Mar 28, 2018 at 10:43 AM, Rob Herring wrote: >>> On Sun, Mar 25, 2018 at 1:10 PM, Rob Clark wrote: I threatened to do this a long time ago.. I probably *should* have done it a long time ago when there where many fewer intrinsics. But the system of macro/#include magic for dealing with intrinsics is a bit annoying, and python has the nice property of optional fxn params, making it possible to define new intrinsics while ignoring parameters that are not applicable (and naming optional params). And not having to specify various array lengths explicitly is nice too. I think the end result makes it easier to add new intrinsics. v2: couple small fixes found with a test program to compare the old and new tables v3: misc comments, don't rely on capture=true for meson.build, get rid of system_values table to avoid return value of intrinsic() and *mostly* remove side-effects, add autotools build support Signed-off-by: Rob Clark --- So, new scheme is, I think, a reasonable compromise between keeping the python "clean" and keeping the intrinsic declarations easy to follow. It still has the side-effect that intrinsic() adds to the table, but drops the separate system_values table so that intrinsic() doesn't return a value. The alternative would require the helper for various specialized intrinsic categories to be declared far from where they are used, which is, I think, suboptimal. And it keeps intrinsic() and various wrappers pretty straightforward, so I don't think this should ever pose a problem for refactoring (and certainly less of a problem than the previous solution using cpp macros, so regardless of what your opinion about the py code, I guess anyone could agree that this is an improvement over the current state ;-)) Also added autotools build support. Sorry scons and android. (Are we ready to drop either of these in favor of nir?) >>> >>> You mean meson? For Android, no. 
I don't see that happening anytime >>> soon. I looked into it some by having a prebuilt target in Android.mk >>> that calls meson. The problem is getting all the Android build >>> environment such as include paths out of Android build system and >>> passed into meson. I don't know how to do that in a way that is not >>> manual and fragile. >>> >>> It looks like you'd just need to do some copy-n-paste of rules for >>> Android. And you know you can push an 'android/*' branch to trigger an >>> Android build of mesa? >>> >> >> no, I didn't realize that.. on the main git tree? > > Yep. No one uses it AFAICT. I was told that this mechanism was not useful because it builds with -Werror. Is that still true? Clayton implemented a buildtest for android within the i965 CI, so anyone testing there will be notified when they break android. We are waiting on some additional hardware before enabling it for developer branches. > Rob > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
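The argument in the quoted commit message is that Python's optional function parameters let each intrinsic declaration mention only the parameters that apply to it, with a table filled in as a side effect of calling `intrinsic()`. A minimal sketch of that idea follows. It is not the contents of the actual NIR script: the parameter names, the table layout, the wrapper, and the example intrinsics and their index/flag names are all hypothetical.

```python
# Table populated as a side effect of each intrinsic() call.
INTRINSICS = {}


def intrinsic(name, src_components=(), dest_components=-1, indices=(), flags=()):
    """Register one intrinsic; everything after `name` is optional.

    Array lengths (number of sources, number of indices) are derived from
    the sequences instead of being spelled out by hand.
    """
    if name in INTRINSICS:
        raise ValueError(f"duplicate intrinsic: {name}")
    INTRINSICS[name] = {
        "num_srcs": len(src_components),
        "src_components": tuple(src_components),
        "has_dest": dest_components >= 0,
        "dest_components": max(dest_components, 0),
        "num_indices": len(indices),
        "indices": tuple(indices),
        "flags": tuple(flags),
    }


def load(name, num_srcs, indices=(), flags=()):
    """Hypothetical wrapper for one category: loads with one-component sources."""
    intrinsic("load_" + name, [1] * num_srcs, dest_components=0,
              indices=indices, flags=flags)


# Declarations only name what is relevant to them.
intrinsic("example_barrier")
load("example_uniform", 1, indices=["BASE"], flags=["CAN_ELIMINATE"])
```

Keeping the wrapper next to the declarations that use it is the trade-off the commit message describes: `intrinsic()` keeps its side effect on the table, but no longer needs to return a value.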
[Mesa-dev] [PATCH mesa 5/5] egl: add symbol checking script for the glvnd case
Signed-off-by: Eric Engestrom
---
Am I reading it right [1], that no other symbol should be exposed?
If so, we're not doing that and should fix it (before landing this patch).

[1] https://github.com/NVIDIA/libglvnd/blob/f6d236e8dc8efbdf117fb3016d7815c96917a3e4/include/glvnd/libeglabi.h#L425
---
 src/egl/Makefile.am             |  2 +-
 src/egl/egl-glvnd-symbols-check | 12
 src/egl/meson.build             |  6 +-
 3 files changed, 18 insertions(+), 2 deletions(-)
 create mode 100755 src/egl/egl-glvnd-symbols-check

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index 1cab3ee231e800baafb7..46580e09d511dbf65f96 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -215,7 +215,7 @@ egl_HEADERS = \
 TESTS = egl-entrypoint-check
 
 if USE_LIBGLVND
-#TODO: glvnd symbol check
+TESTS += egl-glvnd-symbols-check
 else
 TESTS += egl-symbols-check
 endif
diff --git a/src/egl/egl-glvnd-symbols-check b/src/egl/egl-glvnd-symbols-check
new file mode 100755
index ..021cc5107d637f9bd1dc
--- /dev/null
+++ b/src/egl/egl-glvnd-symbols-check
@@ -0,0 +1,12 @@
+#!/bin/sh
+set -eu
+
+LIB=.libs/libEGL_mesa.so
+
+# Official ABI, taken from the header.
+REQ_FUNCS="
+__egl_Main
+"
+
+# Run the checks
+source "$top_srcdir"/scripts/symbols-check
diff --git a/src/egl/meson.build b/src/egl/meson.build
index 6537e4bdee61a49e5ae8..f55d1d9a12332adf5a36 100644
--- a/src/egl/meson.build
+++ b/src/egl/meson.build
@@ -203,7 +203,11 @@ endif
 
 if with_tests
   if with_glvnd
-    # TODO: add glvnd symbol check
+    test('egl-glvnd-symbols-check',
+      find_program('egl-glvnd-symbols-check'),
+      env : env_test,
+      args : libegl
+    )
   else
     test('egl-symbols-check',
       find_program('egl-symbols-check'),
-- 
Cheers,
  Eric
[Mesa-dev] [PATCH mesa 4/5] gbm: use new symbols checking script
Signed-off-by: Eric Engestrom--- src/gbm/Makefile.am | 1 + src/gbm/gbm-symbols-check | 20 ++-- 2 files changed, 7 insertions(+), 14 deletions(-) diff --git a/src/gbm/Makefile.am b/src/gbm/Makefile.am index 5097212cda0aa54fb57c..e22e72dec922367d73e2 100644 --- a/src/gbm/Makefile.am +++ b/src/gbm/Makefile.am @@ -52,6 +52,7 @@ libgbm_la_LIBADD += \ endif TESTS = gbm-symbols-check +AM_TESTS_ENVIRONMENT = top_srcdir='$(top_srcdir)' EXTRA_DIST = gbm-symbols-check meson.build include $(top_srcdir)/install-lib-links.mk diff --git a/src/gbm/gbm-symbols-check b/src/gbm/gbm-symbols-check index 9b1508ede00e7e12c7af..ffce33728232473f4f24 100755 --- a/src/gbm/gbm-symbols-check +++ b/src/gbm/gbm-symbols-check @@ -1,15 +1,10 @@ #!/bin/sh set -eu -LIB=${1-.libs/libgbm.so} +LIB=.libs/libgbm.so -if ! [ -f "$LIB" ] -then - exit 1 -fi - -FUNCS=$($NM -D --defined-only $LIB | grep -o "T .*" | cut -c 3- | while read func; do -( grep -q "^$func$" || echo $func )
[Mesa-dev] [PATCH mesa 3/5] egl: use new symbols checking script
Signed-off-by: Eric Engestrom--- This currently fails on my system (meson), haven't had time to investigate yet: New ABI detected - If intentional, update the test. wl_drm_interface zwp_linux_buffer_params_v1_interface zwp_linux_dmabuf_v1_interface A priori, these should only be in libwayland-egl, not libEGL, right? --- src/egl/Makefile.am | 2 ++ src/egl/egl-symbols-check | 20 ++-- .../wayland/wayland-egl/wayland-egl-symbols-check | 21 ++--- 3 files changed, 14 insertions(+), 29 deletions(-) diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am index 086a4a1e63020caa8c05..1cab3ee231e800baafb7 100644 --- a/src/egl/Makefile.am +++ b/src/egl/Makefile.am @@ -220,6 +220,8 @@ else TESTS += egl-symbols-check endif +AM_TESTS_ENVIRONMENT = top_srcdir='$(top_srcdir)' + EXTRA_DIST = \ $(TESTS) \ SConscript \ diff --git a/src/egl/egl-symbols-check b/src/egl/egl-symbols-check index 460e61a357c7ab6d02c3..a62692e433a1bf70ce85 100755 --- a/src/egl/egl-symbols-check +++ b/src/egl/egl-symbols-check @@ -1,15 +1,10 @@ #!/bin/sh set -eu -LIB=${1-.libs/libEGL.so} +LIB=.libs/libEGL.so -if ! [ -f "$LIB" ] -then - exit 1 -fi - -FUNCS=$($NM -D --defined-only $LIB | grep -o "T .*" | cut -c 3- | while read func; do -( grep -q "^$func$" || echo $func )
[Mesa-dev] [PATCH mesa 2/5] gles: use new symbols checking script
Signed-off-by: Eric Engestrom--- This will fail unless [1] lands first. [1] https://patchwork.freedesktop.org/patch/213409/ --- meson.build | 1 + src/mapi/Makefile.am | 1 + src/mapi/es1api/ABI-check | 22 -- src/mapi/es2api/ABI-check | 22 -- 4 files changed, 18 insertions(+), 28 deletions(-) diff --git a/meson.build b/meson.build index acc6688921e5ad06f651..31907b06fe5d32782da5 100644 --- a/meson.build +++ b/meson.build @@ -1321,6 +1321,7 @@ pkg = import('pkgconfig') env_test = environment() env_test.set('NM', find_program('nm').path()) +env_test.set('top_srcdir', meson.source_root()) subdir('include') subdir('bin') diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am index 3da1a193d284a68af528..22ede2ce4a206e1d17fb 100644 --- a/src/mapi/Makefile.am +++ b/src/mapi/Makefile.am @@ -21,6 +21,7 @@ SUBDIRS = TESTS = +AM_TESTS_ENVIRONMENT = top_srcdir='$(top_srcdir)' BUILT_SOURCES = CLEANFILES = $(BUILT_SOURCES) diff --git a/src/mapi/es1api/ABI-check b/src/mapi/es1api/ABI-check index d8501e5d8c5c2fdd1ca1..d8616840b1329a8394cd 100755 --- a/src/mapi/es1api/ABI-check +++ b/src/mapi/es1api/ABI-check @@ -10,23 +10,18 @@ set -eu case "$(uname)" in Darwin) - LIB=${1-es1api/.libs/libGLESv1_CM.dylib} + LIB=es1api/.libs/libGLESv1_CM.dylib ;; CYGWIN*) - LIB=${1-es1api/.libs/cygGLESv1_CM-1.dll} + LIB=es1api/.libs/cygGLESv1_CM-1.dll ;; *) - LIB=${1-es1api/.libs/libGLESv1_CM.so.1} + LIB=es1api/.libs/libGLESv1_CM.so.1 ;; esac -if ! [ -f "$LIB" ] -then - exit 1 -fi - -FUNCS=$($NM -D --defined-only $LIB | grep -o 'T gl.*' | cut -c 3- | while read func; do -( grep -q "^$func$" || echo $func )
[Mesa-dev] [PATCH mesa 1/5] symbols-check: add new meta-script
The next few commits will convert existing tests to use this.

Signed-off-by: Eric Engestrom
---
 scripts/symbols-check | 68 +++
 1 file changed, 68 insertions(+)
 create mode 100755 scripts/symbols-check

diff --git a/scripts/symbols-check b/scripts/symbols-check
new file mode 100755
index ..29760b8224ccaf2e30bc
--- /dev/null
+++ b/scripts/symbols-check
@@ -0,0 +1,68 @@
+#!/bin/sh
+set -eu
+set -o pipefail
+
+# Platform specific symbols
+# These will simply be ignored
+PLAT_FUNCS="
+__bss_start
+_init
+_fini
+_end
+_edata
+"
+
+if [ -z "$LIB" ]; then
+  echo "\$LIB needs to be defined for autotools to be able to run this test"
+  exit 1
+fi
+
+# The lib name is passed in with Meson but autotools doesn't support that
+# so it needs to be hardcoded and overwritten here
+if [ $# -ge 1 ]; then
+  LIB=$1
+fi
+
+if ! [ -f "$LIB" ]; then
+  echo "lib $LIB doesn't exist"
+  exit 1
+fi
+
+if [ -z "$NM" ]; then
+  echo "\$NM is undefined or empty"
+  exit 1
+elif ! command -v $NM >/dev/null; then
+  echo "\$NM is not a valid command"
+  exit 1
+fi
+
+AVAIL_FUNCS="$($NM -D --format=bsd --defined-only $LIB | awk '{print $3}')"
+
+NEW_ABI=$(echo "$AVAIL_FUNCS" | while read func; do
+  echo "$REQ_FUNCS" | grep -q "^$func$" && continue
+  echo "$PLAT_FUNCS" | grep -q "^$func$" && continue
+
+  echo $func
+done)
+
+REMOVED_ABI=$(echo "$REQ_FUNCS" | while read func; do
+  echo "$AVAIL_FUNCS" | grep -q "^$func$" && continue
+
+  echo $func
+done)
+
+if [ -n "$NEW_ABI" ]; then
+  echo "New ABI detected - If intentional, update the test."
+  echo "$NEW_ABI"
+fi
+
+if [ -n "$REMOVED_ABI" ]; then
+  echo "ABI break detected - Required symbol(s) no longer exported!"
+  echo "$REMOVED_ABI"
+fi
+
+if [ -z "$NEW_ABI" ] && [ -z "$REMOVED_ABI" ]; then
+  exit 0
+else
+  exit 1
+fi
-- 
Cheers,
  Eric
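The core of the meta-script is two set differences: exported symbols that are neither required nor platform noise count as new ABI, and required symbols that are no longer exported count as an ABI break. A minimal Python sketch of just that comparison is shown below. It does not run nm, the function name is hypothetical, and it uses exact string comparison where the shell script relies on anchored grep patterns.

```python
# Platform-specific symbols the script ignores (copied from PLAT_FUNCS).
PLATFORM_SYMBOLS = frozenset(
    ["__bss_start", "_init", "_fini", "_end", "_edata"])


def check_symbols(exported, required, platform=PLATFORM_SYMBOLS):
    """Return (new_abi, removed_abi) as sorted lists of symbol names.

    `exported` stands in for the nm output and `required` for REQ_FUNCS.
    """
    exported = set(exported)
    required = set(required)
    new_abi = sorted(exported - required - set(platform))
    removed_abi = sorted(required - exported)
    return new_abi, removed_abi


def passes(exported, required):
    """The test passes only when both lists are empty, like the exit code."""
    new_abi, removed_abi = check_symbols(exported, required)
    return not new_abi and not removed_abi
```

As in the script, either kind of mismatch fails the test, so adding an exported symbol requires updating the list of required symbols in the same change.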
[Mesa-dev] [PATCH mesa] u_endian: use non-underscore-prefixed BYTE_ORDER names
Cc: Jonathan Gray
Signed-off-by: Eric Engestrom
---
Note: scons was already defining _DEFAULT_SOURCE
---
 Android.common.mk   | 2 +-
 configure.ac        | 2 +-
 meson.build         | 1 +
 src/util/u_endian.h | 8 ++--
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/Android.common.mk b/Android.common.mk
index e8aed48c31ab1704cbcf..74d1c2bf47920da53cb5 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -22,7 +22,7 @@
 # DEALINGS IN THE SOFTWARE.

 ifeq ($(LOCAL_IS_HOST_MODULE),true)
-LOCAL_CFLAGS += -D_GNU_SOURCE
+LOCAL_CFLAGS += -D_GNU_SOURCE -D_DEFAULT_SOURCE
 endif

 LOCAL_C_INCLUDES += \
diff --git a/configure.ac b/configure.ac
index 99805e0f2bfd380fae4f..3618bc2ae259174b4f42 100644
--- a/configure.ac
+++ b/configure.ac
@@ -283,7 +283,7 @@ case "$host_os" in
     android=yes
     ;;
 linux*|*-gnu*|gnu*|cygwin*)
-    DEFINES="$DEFINES -D_GNU_SOURCE"
+    DEFINES="$DEFINES -D_GNU_SOURCE -D_DEFAULT_SOURCE"
     ;;
 solaris*)
     DEFINES="$DEFINES -DSVR4"
diff --git a/meson.build b/meson.build
index 31907b06fe5d32782da5..098722d09d8b6738645e 100644
--- a/meson.build
+++ b/meson.build
@@ -747,6 +747,7 @@ endif
 # TODO: this is very incomplete
 if ['linux', 'cygwin'].contains(host_machine.system())
   pre_args += '-D_GNU_SOURCE'
+  pre_args += '-D_DEFAULT_SOURCE'
 endif

 # Check for generic C arguments
diff --git a/src/util/u_endian.h b/src/util/u_endian.h
index e11b381588dbc960e8c3..c40293e6e3c6ff8479ac 100644
--- a/src/util/u_endian.h
+++ b/src/util/u_endian.h
@@ -30,9 +30,13 @@
 #ifdef HAVE_ENDIAN_H
 #include <endian.h>

-#if __BYTE_ORDER == __LITTLE_ENDIAN
+#ifndef BYTE_ORDER
+#error "BYTE_ORDER undefined"
+#endif
+
+#if BYTE_ORDER == LITTLE_ENDIAN
 # define PIPE_ARCH_LITTLE_ENDIAN
-#elif __BYTE_ORDER == __BIG_ENDIAN
+#elif BYTE_ORDER == BIG_ENDIAN
 # define PIPE_ARCH_BIG_ENDIAN
 #endif

--
Cheers,
  Eric
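For readers who want to try the detection pattern outside the Mesa tree, here is a minimal sketch. It assumes a glibc-style <endian.h> that exposes BYTE_ORDER, LITTLE_ENDIAN and BIG_ENDIAN once _DEFAULT_SOURCE is defined, which is what the build-system hunks above arrange. The example_* helpers are hypothetical and are not part of the patch; they only cross-check the preprocessor result against a run-time probe.

```c
/* Ask the C library for the non-underscore names. This has to be defined
 * before the first system header is included. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <assert.h>
#include <endian.h>   /* assumption: a glibc-style <endian.h> is available */
#include <stdint.h>
#include <string.h>

/* Same shape as the patched u_endian.h: fail loudly instead of silently
 * comparing two undefined macros (which would evaluate as 0 == 0). */
#ifndef BYTE_ORDER
#error "BYTE_ORDER undefined"
#endif

#if BYTE_ORDER == LITTLE_ENDIAN
# define PIPE_ARCH_LITTLE_ENDIAN
#elif BYTE_ORDER == BIG_ENDIAN
# define PIPE_ARCH_BIG_ENDIAN
#endif

/* Hypothetical helper: what the preprocessor decided at compile time. */
static int example_compile_time_little_endian(void)
{
#if defined(PIPE_ARCH_LITTLE_ENDIAN)
   return 1;
#else
   return 0;
#endif
}

/* Hypothetical helper: look at the first byte in memory of the value 1. */
static int example_run_time_little_endian(void)
{
   const uint32_t probe = 1;
   unsigned char first_byte;

   memcpy(&first_byte, &probe, sizeof(first_byte));
   return first_byte == 1;
}

/* Hypothetical self-check tying the two answers together. */
static void example_check_byte_order(void)
{
   assert(example_compile_time_little_endian() ==
          example_run_time_little_endian());
}
```

The #error guard is the part that matters: with the old underscore-prefixed spelling, a libc that does not define those names would make both sides of the #if expand to 0 and quietly select little endian.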
Re: [Mesa-dev] [PATCH mesa] gles: remove entrypoint check that shouldn't be there
Adding Ian, he understands which symbols are and aren't supposed to be exposed better than anyone. Quoting Eric Engestrom (2018-03-28 07:43:01) > Signed-off-by: Eric Engestrom> --- > If I understand the comment correctly, these should *not* be exposed, right? > They aren't in any build I checked, and will cause the updated tests to fail > if the check is left here. > --- > src/mapi/es1api/ABI-check | 3 --- > src/mapi/es2api/ABI-check | 3 --- > 2 files changed, 6 deletions(-) > > diff --git a/src/mapi/es1api/ABI-check b/src/mapi/es1api/ABI-check > index 11b4923dea280be2f849..d8501e5d8c5c2fdd1ca1 100755 > --- a/src/mapi/es1api/ABI-check > +++ b/src/mapi/es1api/ABI-check > @@ -4,7 +4,6 @@ set -eu > # Print defined gl.* functions not in GL ES 1.1 or in > # (FIXME, none of these should be part of the ABI) > # GL_EXT_multi_draw_arrays > -# GL_OES_EGL_image > > # or in extensions that are part of the ES 1.1 extension pack. > # (see > http://www.khronos.org/registry/gles/specs/1.1/opengles_spec_1_1_extension_pack.pdf) > @@ -65,8 +64,6 @@ glDisable > glDisableClientState > glDrawArrays > glDrawElements > -glEGLImageTargetRenderbufferStorageOES > -glEGLImageTargetTexture2DOES > glEnable > glEnableClientState > glFinish > diff --git a/src/mapi/es2api/ABI-check b/src/mapi/es2api/ABI-check > index a04b03d7d6006ad7f2cc..2d92d1c0028697f1439e 100755 > --- a/src/mapi/es2api/ABI-check > +++ b/src/mapi/es2api/ABI-check > @@ -4,7 +4,6 @@ set -eu > # Print defined gl.* functions not in GL ES 3.0 or in > # (FIXME, none of these should be part of the ABI) > # GL_EXT_multi_draw_arrays > -# GL_OES_EGL_image > > case "$(uname)" in > Darwin) > @@ -118,8 +117,6 @@ glDrawElementsInstanced > glDrawElementsInstancedBaseVertex > glDrawRangeElements > glDrawRangeElementsBaseVertex > -glEGLImageTargetRenderbufferStorageOES > -glEGLImageTargetTexture2DOES > glEnable > glEnableVertexAttribArray > glEnablei > -- > Cheers, > Eric > signature.asc Description: signature ___ mesa-dev mailing list 
mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] wayland-drm: Expose server-side xbgr2101010 and abgr2101010 formats.
On Tue, Mar 27, 2018 at 7:45 PM, Daniel Stone wrote:
> Hi Ilia,
>
> On 14 March 2018 at 19:02, Ilia Mirkin wrote:
>> On Tue, Mar 13, 2018 at 5:30 AM, Daniel Stone wrote:
>>> On 12 March 2018 at 20:45, Mario Kleiner wrote:
>>>> This way the wayland server can signal support for these formats
>>>> to wayland EGL clients. This is currently used by nouveau for
>>>> 10 bpc support. Tested with glmark2-wayland and
>>>> glmark2-es2-wayland under weston to now expose 10 bpc EGL configs
>>>> under nouveau.
>>>
>>> Do we need a way to ensure that the backend driver does actually
>>> support BGR for texturing? AFAIK, if a client happens to select a BGR
>>> config on other drivers now - using a compositor which does not
>>> implement wl_drm - this will break for them.
>>
>> I think in practice, every hw driver can support both for texturing if
>> it can support one, since swizzles are always possible (due to
>> ARB_texture_swizzle).
>>
>> In practice at least nouveau prior to Mario's patches only supported
>> it one way. I just checked r600, radeonsi, i965 and freedreno, and
>> they appear to support both for texturing. I think that covers the
>> majority of the likely 10bpc users.
>
> Fair enough. My only remaining issue - and there's nothing the patch
> can really do about it - is a bit of a crapshoot. wayland-drm has no
> hint that the underlying KMS device prefers ABGR to ARGB, and clients
> have no way of determining the channel order even if they did want to
> hardcode it for a specific usecase. So nothing in here really
> guarantees that we'll get scanout. But, that being said, this seems
> harmless enough, so this patch is:
> Reviewed-by: Daniel Stone
>
> Cheers,
> Daniel

Agreed. It doesn't seem to hurt, but isn't guaranteed to work in all cases, at least not for prime renderoffload.
While it did work on single-gpu combos, and when testing prime renderoffload from amd to nouveau, from nouveau to amd, and from intel to amd, it didn't work on the prime combo with intel as wayland server gpu and nouveau as renderoffload client gpu. In principle intel hw does support sampling that format, according to the format table in the i965 driver, but the driver doesn't expose it. So the common "Optimus" Intel igpu + NVidia dgpu case doesn't work atm.

Until last Wednesday I worked on a patch that may get around that, then I got sidetracked by new dri3.2 problems and other things. Testing of the half-finished patch doesn't look too bad, but I'm not yet sure if it will work out or run into other obstacles.

The idea is that in the wayland server part, we can figure out which gpu driver is associated with the dri2_dpy, and from there get to the dri_configs exposed by the driver for rendering, assuming that if a driver can render to a format, it will also be able to sample from it for compositing - and ideally scan out from it if it's our lucky day.

On the wayland-client side, the code that generates eglconfigs from supported visuals and driver dri_configs does this:

1. Build the list of eglconfigs like now in the wayland client egl code.

2. Check if there are dri_config formats supported by the client driver that didn't get assigned in step 1 because the wayland server doesn't support them for import. If so, check if there is a fallback format supported by the server, e.g., for abgr2101010 the fallback would be argb2101010. If that's the case, add the format to the list of eglconfigs, as if it were natively supported.

3. In the get_back_bo path, detect if we deal with the fallback case from 2. If so, assign the fallback format for creation of the linear_copy buffer.
This way the actual back buffer used for rendering has the format supported by the client driver, e.g., abgr2101010 for nouveau, but the linear_copy buffer has the format of the server driver, e.g., argb2101010 for intel. The blitImage detiling blit used for prime renderoffload will then not only convert the tiled renderbuffer into a linear buffer for import by the server, but also perform a pixel format conversion to what the server gpu supports.

This does at least work for "Optimus" with the Intel gpu assigned to weston, exporting argb2101010 as supported format, and nouveau assigned to the wayland client, which then renders abgr2101010 but converts to argb2101010 during the blitImage detiling blit. The nice thing here is that for fullscreen wl_surfaces/buffers it allows the wayland server to directly pageflip the wl_buffer to the scanout, as it is in the optimal format for the server gpu.

Attached are the current diffs for reference. The client bit needs cleanup; the server bit is only just enough that I could check with the debugger attached whether I can get to the needed info at all. Also in the client, using dmabuf for import is totally untested, only the wl-drm method.

-mario

diff --git
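The format selection described in steps 1 to 3 above can be sketched roughly as follows. This is not the attached patch: the enum, the function names and the xbgr2101010 to xrgb2101010 mapping are assumptions made for illustration (the mail only names abgr2101010 falling back to argb2101010), and real code would work on wl_drm or DRM fourcc values rather than a private enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical format tags standing in for the real wl_drm format codes. */
enum example_format {
   EXAMPLE_FORMAT_NONE = 0,
   EXAMPLE_FORMAT_ARGB2101010,
   EXAMPLE_FORMAT_XRGB2101010,
   EXAMPLE_FORMAT_ABGR2101010,
   EXAMPLE_FORMAT_XBGR2101010,
};

/* Does the wayland server advertise this format for import? */
static bool
example_server_supports(const enum example_format *server_formats,
                        size_t count, enum example_format format)
{
   assert(server_formats != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (server_formats[i] == format)
         return true;
   }
   return false;
}

/* Step 2: the channel-swapped sibling of a client render format.
 * Only the ABGR case comes from the mail; the XBGR case is assumed
 * by analogy. */
static enum example_format
example_fallback_for(enum example_format format)
{
   switch (format) {
   case EXAMPLE_FORMAT_ABGR2101010:
      return EXAMPLE_FORMAT_ARGB2101010;
   case EXAMPLE_FORMAT_XBGR2101010:
      return EXAMPLE_FORMAT_XRGB2101010;
   default:
      return EXAMPLE_FORMAT_NONE;
   }
}

/* Step 3: pick the format for the linear_copy buffer. Returns NONE when
 * neither the render format nor its fallback can be imported, in which
 * case no eglconfig would be exposed for render_format. */
static enum example_format
example_linear_copy_format(const enum example_format *server_formats,
                           size_t count, enum example_format render_format)
{
   enum example_format fallback;

   if (example_server_supports(server_formats, count, render_format))
      return render_format;

   fallback = example_fallback_for(render_format);
   if (fallback != EXAMPLE_FORMAT_NONE &&
       example_server_supports(server_formats, count, fallback))
      return fallback;

   return EXAMPLE_FORMAT_NONE;
}
```

In the Optimus case from the mail, the server list would hold the argb/xrgb variants and the client would render abgr2101010, so the linear_copy buffer ends up as argb2101010 and the detiling blit does the channel swap.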
[Mesa-dev] [Bug 105784] mesa-18.0.0/src/intel/vulkan/anv_nir_apply_pipeline_layout.c:150: bad assert ?
https://bugs.freedesktop.org/show_bug.cgi?id=105784

Lionel Landwerlin changed:

           What    |Removed                     |Added
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH 1/2] autotools: include meson_get_version
On Monday, 2018-03-26 11:20:03 -0700, Dylan Baker wrote:
> Otherwise meson won't read the VERSION file and won't set a version.
> That means that pkg-config files will have version unset as well.
>
> Signed-off-by: Dylan Baker
> fixes: 3e9533d9b88d75d99632fa40e38cfed842d10842
>        ("meson: Add script to use VERSION file for getting version")

Both are
Reviewed-by: Eric Engestrom

> ---
>  Makefile.am | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/Makefile.am b/Makefile.am
> index 804b1d85353..a83f2bcb5f4 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -64,7 +64,8 @@ EXTRA_DIST = \
>  	meson_options.txt \
>  	bin/meson.build \
>  	include/meson.build \
> -	bin/install_megadrivers.py
> +	bin/install_megadrivers.py \
> +	bin/meson_get_version.py
>
>  noinst_HEADERS = \
>  	include/c99_alloca.h \
> --
> 2.16.2
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

--- Comment #2 from Amarildo ---
I'll try, although I'm just a regular user ;-)

Which exact package do I need to rebuild? I'd think it's not necessary to re-build everything, perhaps just "mesa-vulkan-drivers"?

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH mesa] docs: fix 18.0 release note version
On 28 March 2018 at 11:18, Eric Engestrom wrote:
> Fixes: 839fb3a696679bfe975c2 "docs: Update 18.0.0 release notes"
> Cc: "18.0"
> Cc: Emil Velikov
Reviewed-by: Emil Velikov

-Emil
/me searches for a brown paper bag
Re: [Mesa-dev] [PATCH] st: Don't try to finalize the texture in st_render_texture().
On 03/27/2018 10:14 PM, Eric Anholt wrote: We can't necessarily finalize the texture at this point if we're rendering to a texture image whose format is different from the baselevel's format. This is just a test suite scenario, right? It's not the sort of thing a real app would do, I hope. This was introduced as a fix for fbo-incomplete-texture-03 in de414f491526610bb260c73805c81ba413388e20, but the later fix for vmware on that testcase in 95d5c48f68b598cfa6db25f44aac52b3e11403cc made it unnecessary. Fixes assertion failures in util_resource_copy_region() in KHR-GLES3.copy_tex_image_conversions.forbidden.* when trying to finalize an R8 texture image to the RG8 texture object's pt. Looks OK to me. Go ahead and check it in. The next time I do a piglit run with our driver, I'll be on the lookout for any unexpected regressions. Reviewed-by: Brian Paul--- src/mesa/state_tracker/st_cb_fbo.c | 4 1 file changed, 4 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_fbo.c b/src/mesa/state_tracker/st_cb_fbo.c index 02ae8e1380e3..f859133e399e 100644 --- a/src/mesa/state_tracker/st_cb_fbo.c +++ b/src/mesa/state_tracker/st_cb_fbo.c @@ -509,14 +509,10 @@ st_render_texture(struct gl_context *ctx, struct gl_renderbuffer_attachment *att) { struct st_context *st = st_context(ctx); - struct pipe_context *pipe = st->pipe; struct gl_renderbuffer *rb = att->Renderbuffer; struct st_renderbuffer *strb = st_renderbuffer(rb); struct pipe_resource *pt; - if (!st_finalize_texture(ctx, pipe, att->Texture, att->CubeMapFace)) - return; - pt = get_teximage_resource(att->Texture, att->CubeMapFace, att->TextureLevel); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2 v2] nir: mako all the intrinsics
On Wed, Mar 28, 2018 at 10:18 AM, Rob Clarkwrote: > On Wed, Mar 28, 2018 at 10:43 AM, Rob Herring wrote: >> On Sun, Mar 25, 2018 at 1:10 PM, Rob Clark wrote: >>> I threatened to do this a long time ago.. I probably *should* have done >>> it a long time ago when there where many fewer intrinsics. But the >>> system of macro/#include magic for dealing with intrinsics is a bit >>> annoying, and python has the nice property of optional fxn params, >>> making it possible to define new intrinsics while ignoring parameters >>> that are not applicable (and naming optional params). And not having to >>> specify various array lengths explicitly is nice too. >>> >>> I think the end result makes it easier to add new intrinsics. >>> >>> v2: couple small fixes found with a test program to compare the old and >>> new tables >>> v3: misc comments, don't rely on capture=true for meson.build, get rid >>> of system_values table to avoid return value of intrinsic() and >>> *mostly* remove side-effects, add autotools build support >>> >>> Signed-off-by: Rob Clark >>> --- >>> So, new scheme is, I think, a reasonable compromise between keeping the >>> python "clean" and keeping the intrinsic declarations easy to follow. >>> It still has the side-effect that intrinsic() adds to the table, but >>> drops the separate system_values table so that intrinsic() doesn't >>> return a value. The alternative would require the helper for various >>> specialized intrinsic categories to be declared far from where they are >>> used, which is, I think, suboptimal. And it keeps intrinsic() and >>> various wrappers pretty straightforward, so I don't think this should >>> ever pose a problem for refactoring (and certainly less of a problem >>> than the previous solution using cpp macros, so regardless of what your >>> opinion about the py code, I guess anyone could agree that this is an >>> improvement over the current state ;-)) >>> >>> Also added autotools build support. Sorry scons and android. 
(Are we >>> ready to drop either of these in favor of nir?) >> >> You mean meson? For Android, no. I don't see that happening anytime >> soon. I looked into it some by having a prebuilt target in Android.mk >> that calls meson. The problem is getting all the Android build >> environment such as include paths out of Android build system and >> passed into meson. I don't know how to do that in a way that is not >> manual and fragile. >> >> It looks like you'd just need to do some copy-n-paste of rules for >> Android. And you know you can push an 'android/*' branch to trigger an >> Android build of mesa? >> > > no, I didn't realize that.. on the main git tree? Yep. No one uses it AFAICT. Rob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2 v2] nir: mako all the intrinsics
On Wed, Mar 28, 2018 at 10:43 AM, Rob Herringwrote: > On Sun, Mar 25, 2018 at 1:10 PM, Rob Clark wrote: >> I threatened to do this a long time ago.. I probably *should* have done >> it a long time ago when there where many fewer intrinsics. But the >> system of macro/#include magic for dealing with intrinsics is a bit >> annoying, and python has the nice property of optional fxn params, >> making it possible to define new intrinsics while ignoring parameters >> that are not applicable (and naming optional params). And not having to >> specify various array lengths explicitly is nice too. >> >> I think the end result makes it easier to add new intrinsics. >> >> v2: couple small fixes found with a test program to compare the old and >> new tables >> v3: misc comments, don't rely on capture=true for meson.build, get rid >> of system_values table to avoid return value of intrinsic() and >> *mostly* remove side-effects, add autotools build support >> >> Signed-off-by: Rob Clark >> --- >> So, new scheme is, I think, a reasonable compromise between keeping the >> python "clean" and keeping the intrinsic declarations easy to follow. >> It still has the side-effect that intrinsic() adds to the table, but >> drops the separate system_values table so that intrinsic() doesn't >> return a value. The alternative would require the helper for various >> specialized intrinsic categories to be declared far from where they are >> used, which is, I think, suboptimal. And it keeps intrinsic() and >> various wrappers pretty straightforward, so I don't think this should >> ever pose a problem for refactoring (and certainly less of a problem >> than the previous solution using cpp macros, so regardless of what your >> opinion about the py code, I guess anyone could agree that this is an >> improvement over the current state ;-)) >> >> Also added autotools build support. Sorry scons and android. (Are we >> ready to drop either of these in favor of nir?) > > You mean meson? For Android, no. 
I don't see that happening anytime > soon. I looked into it some by having a prebuilt target in Android.mk > that calls meson. The problem is getting all the Android build > environment such as include paths out of Android build system and > passed into meson. I don't know how to do that in a way that is not > manual and fragile. > > It looks like you'd just need to do some copy-n-paste of rules for > Android. And you know you can push an 'android/*' branch to trigger an > Android build of mesa? > no, I didn't realize that.. on the main git tree? BR, -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 3/4] st/mesa: add support for nvidia conservative rasterization extensions
On 03/28/2018 04:35 AM, Rhys Perry wrote: --- src/mesa/state_tracker/st_atom_rasterizer.c | 15 + src/mesa/state_tracker/st_context.c | 2 ++ src/mesa/state_tracker/st_extensions.c | 34 + 3 files changed, 51 insertions(+) diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c b/src/mesa/state_tracker/st_atom_rasterizer.c index 1be072e6e3..5b747a924e 100644 --- a/src/mesa/state_tracker/st_atom_rasterizer.c +++ b/src/mesa/state_tracker/st_atom_rasterizer.c @@ -298,5 +298,20 @@ st_update_rasterizer(struct st_context *st) raster->clip_plane_enable = ctx->Transform.ClipPlanesEnabled; raster->clip_halfz = (ctx->Transform.ClipDepthMode == GL_ZERO_TO_ONE); +/* ST_NEW_RASTERIZER */ + if (ctx->ConservativeRasterization) { + if (ctx->ConservativeRasterMode == GL_CONSERVATIVE_RASTER_MODE_POST_SNAP_NV) + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_POST_SNAP; + else + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_PRE_SNAP; + } else { + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_OFF; + } + + raster->conservative_raster_dilate = ctx->ConservativeRasterDilate; + + raster->subpixel_precision_x = ctx->NvSubpixelPrecisionBias[0]; + raster->subpixel_precision_y = ctx->NvSubpixelPrecisionBias[1]; + cso_set_rasterizer(st->cso_context, raster); } diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index 90b7f9359a..0709681e16 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -344,6 +344,8 @@ st_init_driver_flags(struct st_context *st) f->NewPolygonState = ST_NEW_RASTERIZER; f->NewPolygonStipple = ST_NEW_POLY_STIPPLE; f->NewViewport = ST_NEW_VIEWPORT; + f->NewNvConservativeRasterization = ST_NEW_RASTERIZER; + f->NewNvConservativeRasterizationParams = ST_NEW_RASTERIZER; } diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index bea61f21cb..02832f3951 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ 
b/src/mesa/state_tracker/st_extensions.c @@ -494,6 +494,16 @@ void st_init_limits(struct pipe_screen *screen, c->UseSTD430AsDefaultPacking = screen->get_param(screen, PIPE_CAP_LOAD_CONSTBUF); + c->MaxSubpixelPrecisionBiasBits = + screen->get_param(screen, PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS); + + c->ConservativeRasterDilateRange[0] = + screen->get_paramf(screen, PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE); + c->ConservativeRasterDilateRange[1] = + screen->get_paramf(screen, PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE); + c->ConservativeRasterDilateGranularity = + screen->get_paramf(screen, PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY); + /* limit the max combined shader output resources to a driver limit */ temp = screen->get_param(screen, PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES); if (temp > 0 && c->MaxCombinedShaderOutputResources > temp) @@ -1363,4 +1373,28 @@ void st_init_extensions(struct pipe_screen *screen, extensions->ARB_texture_cube_map_array && extensions->ARB_texture_stencil8 && extensions->ARB_texture_multisample; + + if (screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES) && + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES) && + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE)) { + float max_dilate; + bool pre_snap_triangles, pre_snap_points_lines; + + max_dilate = screen->get_paramf(screen, PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE); + + pre_snap_triangles = + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES); + pre_snap_points_lines = + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES); + + extensions->NV_conservative_raster = + screen->get_param(screen, PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS) > 1; + + if (extensions->NV_conservative_raster) { + extensions->NV_conservative_raster_dilate = max_dilate>=0.75; Spaces before/after >= + 
extensions->NV_conservative_raster_pre_snap_triangles = pre_snap_triangles; + extensions->NV_conservative_raster_pre_snap = +pre_snap_triangles && pre_snap_points_lines; + } + } } Other than that, patches 2&3 look OK to me. Reviewed-by: Brian Paul___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa] gbm: remove never-implemented function
On 26 March 2018 at 15:14, Eric Engestrom wrote:
> I assume this was implemented in a previous version of that commit, but
> was removed in the version that actually landed.
>
Actually it seems like a leftover from the prototyping stage. Even the first version sent to the list had the declaration.

Reviewed-by: Emil Velikov

-Emil
[Mesa-dev] [PATCH mesa] gles: remove entrypoint check that shouldn't be there
Signed-off-by: Eric Engestrom--- If I understand the comment correctly, these should *not* be exposed, right? They aren't in any build I checked, and will cause the updated tests to fail if the check is left here. --- src/mapi/es1api/ABI-check | 3 --- src/mapi/es2api/ABI-check | 3 --- 2 files changed, 6 deletions(-) diff --git a/src/mapi/es1api/ABI-check b/src/mapi/es1api/ABI-check index 11b4923dea280be2f849..d8501e5d8c5c2fdd1ca1 100755 --- a/src/mapi/es1api/ABI-check +++ b/src/mapi/es1api/ABI-check @@ -4,7 +4,6 @@ set -eu # Print defined gl.* functions not in GL ES 1.1 or in # (FIXME, none of these should be part of the ABI) # GL_EXT_multi_draw_arrays -# GL_OES_EGL_image # or in extensions that are part of the ES 1.1 extension pack. # (see http://www.khronos.org/registry/gles/specs/1.1/opengles_spec_1_1_extension_pack.pdf) @@ -65,8 +64,6 @@ glDisable glDisableClientState glDrawArrays glDrawElements -glEGLImageTargetRenderbufferStorageOES -glEGLImageTargetTexture2DOES glEnable glEnableClientState glFinish diff --git a/src/mapi/es2api/ABI-check b/src/mapi/es2api/ABI-check index a04b03d7d6006ad7f2cc..2d92d1c0028697f1439e 100755 --- a/src/mapi/es2api/ABI-check +++ b/src/mapi/es2api/ABI-check @@ -4,7 +4,6 @@ set -eu # Print defined gl.* functions not in GL ES 3.0 or in # (FIXME, none of these should be part of the ABI) # GL_EXT_multi_draw_arrays -# GL_OES_EGL_image case "$(uname)" in Darwin) @@ -118,8 +117,6 @@ glDrawElementsInstanced glDrawElementsInstancedBaseVertex glDrawRangeElements glDrawRangeElementsBaseVertex -glEGLImageTargetRenderbufferStorageOES -glEGLImageTargetTexture2DOES glEnable glEnableVertexAttribArray glEnablei -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2 v2] nir: mako all the intrinsics
On Sun, Mar 25, 2018 at 1:10 PM, Rob Clarkwrote: > I threatened to do this a long time ago.. I probably *should* have done > it a long time ago when there where many fewer intrinsics. But the > system of macro/#include magic for dealing with intrinsics is a bit > annoying, and python has the nice property of optional fxn params, > making it possible to define new intrinsics while ignoring parameters > that are not applicable (and naming optional params). And not having to > specify various array lengths explicitly is nice too. > > I think the end result makes it easier to add new intrinsics. > > v2: couple small fixes found with a test program to compare the old and > new tables > v3: misc comments, don't rely on capture=true for meson.build, get rid > of system_values table to avoid return value of intrinsic() and > *mostly* remove side-effects, add autotools build support > > Signed-off-by: Rob Clark > --- > So, new scheme is, I think, a reasonable compromise between keeping the > python "clean" and keeping the intrinsic declarations easy to follow. > It still has the side-effect that intrinsic() adds to the table, but > drops the separate system_values table so that intrinsic() doesn't > return a value. The alternative would require the helper for various > specialized intrinsic categories to be declared far from where they are > used, which is, I think, suboptimal. And it keeps intrinsic() and > various wrappers pretty straightforward, so I don't think this should > ever pose a problem for refactoring (and certainly less of a problem > than the previous solution using cpp macros, so regardless of what your > opinion about the py code, I guess anyone could agree that this is an > improvement over the current state ;-)) > > Also added autotools build support. Sorry scons and android. (Are we > ready to drop either of these in favor of nir?) You mean meson? For Android, no. I don't see that happening anytime soon. 
I looked into it some by having a prebuilt target in Android.mk that calls meson. The problem is getting all the Android build environment such as include paths out of Android build system and passed into meson. I don't know how to do that in a way that is not manual and fragile. It looks like you'd just need to do some copy-n-paste of rules for Android. And you know you can push an 'android/*' branch to trigger an Android build of mesa? Rob ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 1/4] mesa: add support for nvidia conservative rasterization extensions
Looks good overall. Just some style nit-picks below. -Brian On 03/28/2018 04:35 AM, Rhys Perry wrote: Although the specs are written against compatibility GL 4.3 and allows core profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+. --- src/mapi/glapi/gen/gl_API.xml | 47 +++ src/mapi/glapi/gen/gl_genexec.py| 1 + src/mesa/Makefile.sources | 2 + src/mesa/main/attrib.c | 60 +++--- src/mesa/main/conservativeraster.c | 138 src/mesa/main/conservativeraster.h | 48 +++ src/mesa/main/context.c | 10 +++ src/mesa/main/dlist.c | 86 src/mesa/main/enable.c | 14 src/mesa/main/extensions_table.h| 4 + src/mesa/main/get.c | 3 + src/mesa/main/get_hash_params.py| 13 +++ src/mesa/main/mtypes.h | 28 ++- src/mesa/main/tests/dispatch_sanity.cpp | 27 +++ src/mesa/main/viewport.c| 57 + src/mesa/main/viewport.h| 6 ++ src/mesa/meson.build| 2 + 17 files changed, 535 insertions(+), 11 deletions(-) create mode 100644 src/mesa/main/conservativeraster.c create mode 100644 src/mesa/main/conservativeraster.h diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 38c1921047..db312370b1 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -12871,6 +12871,53 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + http://www.w3.org/2001/XInclude"/> diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py index aaff9f230b..be8013b62b 100644 --- a/src/mapi/glapi/gen/gl_genexec.py +++ b/src/mapi/glapi/gen/gl_genexec.py @@ -62,6 +62,7 @@ header = """/** #include "main/colortab.h" #include "main/compute.h" #include "main/condrender.h" +#include "main/conservativeraster.h" #include "main/context.h" #include "main/convolve.h" #include "main/copyimage.h" diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 0446078136..43ec55f580 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -49,6 +49,8 @@ MAIN_FILES = \ main/condrender.c \ 
 	main/condrender.h \
 	main/config.h \
+	main/conservativeraster.c \
+	main/conservativeraster.h \
 	main/context.c \
 	main/context.h \
 	main/convolve.c \
diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
index 9d3aa728a1..a8873f2988 100644
--- a/src/mesa/main/attrib.c
+++ b/src/mesa/main/attrib.c
@@ -138,6 +138,9 @@ struct gl_enable_attrib
 
    /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
    GLboolean sRGBEnabled;
+
+   /* GL_NV_conservative_raster */
+   GLboolean ConservativeRasterization;
 };
 
 
@@ -178,6 +181,13 @@ struct texture_state
 };
 
 
+struct viewport_state
+{
+   struct gl_viewport_attrib ViewportArray[MAX_VIEWPORTS];
+   GLuint SubpixelPrecisionBias[2];
+};
+
+
 /** An unused GL_*_BIT value */
 #define DUMMY_BIT 0x1000
 
@@ -394,6 +404,9 @@ _mesa_PushAttrib(GLbitfield mask)
 
       /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
       attr->sRGBEnabled = ctx->Color.sRGBEnabled;
+
+      /* GL_NV_conservative_raster */
+      attr->ConservativeRasterization = ctx->ConservativeRasterization;
    }
 
    if (mask & GL_EVAL_BIT) {
@@ -545,11 +558,23 @@ _mesa_PushAttrib(GLbitfield mask)
    }
 
    if (mask & GL_VIEWPORT_BIT) {
-      if (!push_attrib(ctx, &head, GL_VIEWPORT_BIT,
-                       sizeof(struct gl_viewport_attrib)
-                       * ctx->Const.MaxViewports,
-                       (void*)&ctx->ViewportArray))
+      struct viewport_state *viewstate = CALLOC_STRUCT(viewport_state);
+      if (!viewstate) {
+         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_VIEWPORT_BIT)");
+         goto end;
+      }
+
+      if (!save_attrib_data(&head, GL_VIEWPORT_BIT, viewstate)) {
+         free(viewstate);
+         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_VIEWPORT_BIT)");
          goto end;
+      }
+
+      memcpy(&viewstate->ViewportArray, &ctx->ViewportArray,
+             sizeof(struct gl_viewport_attrib)*ctx->Const.MaxViewports);
+
+      viewstate->SubpixelPrecisionBias[0] = ctx->NvSubpixelPrecisionBias[0];
+      viewstate->SubpixelPrecisionBias[1] = ctx->NvSubpixelPrecisionBias[1];
    }
 
    /* GL_ARB_multisample */
@@ -714,6 +739,13 @@ pop_enable_group(struct gl_context *ctx, const struct gl_enable_attrib *enable)
    TEST_AND_UPDATE(ctx->Color.sRGBEnabled, enable->sRGBEnabled,
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

Alex Smith changed:

           What    |Removed                     |Added
                 CC|                            |asm...@feralinteractive.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4 05/18] intel/isl: Add support to emit clear value address.
On Tue, Mar 27, 2018 at 11:01:34AM -0700, Jason Ekstrand wrote:
> On Tue, Mar 27, 2018 at 4:31 AM, Pohjolainen, Topi <
> topi.pohjolai...@gmail.com> wrote:
>
> > On Thu, Mar 08, 2018 at 08:48:58AM -0800, Rafael Antognolli wrote:
> > > gen10 can emit the clear color by setting it on a buffer somewhere, and
> > > then adding only the address to the surface state.
> > >
> > > This commit add support for that on isl_surf_fill_state, and if that is
> > > requested, skip setting the clear value itself.
> > >
> > > v2: Add assert to make sure we are at least on gen10.
> > >
> > > Signed-off-by: Rafael Antognolli
> > > ---
> > >  src/intel/isl/isl.h               |  9 +
> > >  src/intel/isl/isl_surface_state.c | 18 ++
> > >  2 files changed, 23 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> > > index 2edf0522e32..c50b78d4701 100644
> > > --- a/src/intel/isl/isl.h
> > > +++ b/src/intel/isl/isl.h
> > > @@ -1307,6 +1307,15 @@ struct isl_surf_fill_state_info {
> > >      */
> > >     union isl_color_value clear_color;
> > >
> > > +   /**
> > > +    * Send only the clear value address
> > > +    *
> > > +    * If set, we only pass the clear address to the GPU and it will
> > fetch it
> > > +    * from wherever it is.
> > > +    */
> > > +   bool use_clear_address;
> > > +   uint64_t clear_address;
> > > +
> > >     /**
> > >      * Surface write disables for gen4-5
> > >      */
> > > diff --git a/src/intel/isl/isl_surface_state.c
> > b/src/intel/isl/isl_surface_state.c
> > > index 32a5429f2bf..bff9693f02d 100644
> > > --- a/src/intel/isl/isl_surface_state.c
> > > +++ b/src/intel/isl/isl_surface_state.c
> > > @@ -637,11 +637,21 @@ isl_genX(surf_fill_state_s)(const struct
> > isl_device *dev, void *state,
> > >  #endif
> > >
> > >     if (info->aux_usage != ISL_AUX_USAGE_NONE) {
> > > +      if (info->use_clear_address) {
> > > +#if GEN_GEN >= 10
> > > +         s.ClearValueAddressEnable = true;
> >
> > In order to make sampler working, I had to add:
> >
> > #if GEN_GEN >= 11
> >       /* From the Bspec:
> >        *
> >        * Enables Pixel backend hw to convert clear values into native
> > format
> >        * and write back to clear address, so that display and sampler can
> > use
> >        * the converted value for resolving fast cleared RTs.
> >        */
> >       s.ClearColorConversionEnable = s.ClearValueAddressEnable;
> >
>
> Yeah... That's a fun bit... I think we want to be careful with that bit
> since it causes a write to the CLEAR_COLOR state as part of the draw. It
> *may* be safe to always set it but I'm not convinced. We should probably
> add another bit to isl_surf_init_info and only set it from BLORP when doing
> a fast-clear. Another option would be to make BLORP just do the conversion
> in software and write the clear value out manually. In either case, yes,
> we need to do something for ICL.
>
> Given that ICL fast-clears aren't working yet, I'd be a fan (if no one's
> opposed) to plumbing that through as a follow-on.

You are correct that we should be careful. Setting it like this unconditionally
breaks at least:

ext_framebuffer_multisample-accuracy 2 color

I'll give a spin to your suggestion of setting it only in blorp clear.

> >
> > #endif
> >
> >
> > That along with another tweak in one of the later patches fixes at least
> > these
> > two on ICL:
> >
> > fbo-clearmipmap -auto -fbo
> > fcc-read-after-clear sample tex -fbo -auto
> >
> > > +         s.ClearValueAddress = info->clear_address;
> > > +#else
> > > +         unreachable("Gen9 and earlier do not support indirect clear
> > colors");
> > > +#endif
> > > +      }
> > >  #if GEN_GEN >= 9
> > > -      s.RedClearColor = info->clear_color.u32[0];
> > > -      s.GreenClearColor = info->clear_color.u32[1];
> > > -      s.BlueClearColor = info->clear_color.u32[2];
> > > -      s.AlphaClearColor = info->clear_color.u32[3];
> > > +      if (!info->use_clear_address) {
> > > +         s.RedClearColor = info->clear_color.u32[0];
> > > +         s.GreenClearColor = info->clear_color.u32[1];
> > > +         s.BlueClearColor = info->clear_color.u32[2];
> > > +         s.AlphaClearColor = info->clear_color.u32[3];
> > > +      }
> > >  #elif GEN_GEN >= 7
> > >        /* Prior to Sky Lake, we only have one bit for the clear color
> > which
> > >         * gives us 0 or 1 in whatever the surface's format happens to be.
> > > --
> > > 2.14.3
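For readers following the thread, the branch structure being discussed can be sketched in isolation. This is a minimal model under stated assumptions, not the isl code: `fill_info`, `surface_state` and `fill_clear_state` are hypothetical names, the generation is passed as a runtime parameter (isl selects it at compile time with GEN_GEN), and the single-bit clear color path for gen7-8 is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the isl_surf_fill_state_info fields
 * discussed in the thread. */
struct fill_info {
   bool use_clear_address;
   uint64_t clear_address;
   uint32_t clear_color[4];
};

/* Hypothetical stand-in for the packed surface state. */
struct surface_state {
   bool clear_value_address_enable;
   uint64_t clear_value_address;
   uint32_t clear_color[4];
};

/* Returns false where the patch would hit unreachable(): an indirect
 * clear color was requested on a generation older than gen10. */
static bool
fill_clear_state(int gen, const struct fill_info *info,
                 struct surface_state *s)
{
   if (info->use_clear_address) {
      if (gen < 10)
         return false;
      s->clear_value_address_enable = true;
      s->clear_value_address = info->clear_address;
      return true;
   }

   /* No address in use: store the clear color inline instead. */
   for (int i = 0; i < 4; i++)
      s->clear_color[i] = info->clear_color[i];
   return true;
}
```

The point of the sketch is only that the two paths are exclusive: when the address is used, the inline color fields are left untouched.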
Re: [Mesa-dev] [PATCH] radv: Unset ZRANGE_PRECISION when depth was zeroed
No final resolution yet. I was trying to fix my minor comment, but looks like I have a bunch of CTS regressions here with the original patch, so still working on it. On Tue, Mar 27, 2018 at 2:03 PM, Juan A. Suarez Romerowrote: > On Thu, 2018-03-22 at 12:31 +, James Legg wrote: >> On Thu, 2018-03-22 at 02:36 +0100, Bas Nieuwenhuizen wrote: >> > On Thu, Mar 8, 2018 at 12:59 PM, James Legg >> > wrote: >> > > This avoids bug 105396 somehow. I suspect it is a VI and GFX9 hardware >> > > bug which PAL calls WaTcCompatZRange, but I don't know for sure. >> > > >> > > In the VK_FORMAT_D32_SFLOAT case, TILE_STENCIL_DISABLE is not set for >> > > tc compatible image formats regardless of not having a stencil aspect. >> > > If TILE_STENCIL_DISABLE was set, ZRANGE_PRECISION would have no effect >> > > and the bug would occur again. >> > > >> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396 >> > > CC: >> > > CC: Dave Airlie >> > > CC: Bas Nieuwenhuizen >> > > CC: Samuel Pitoiset >> > > --- >> > > src/amd/vulkan/radv_cmd_buffer.c | 52 >> > > +--- >> > > 1 file changed, 49 insertions(+), 3 deletions(-) >> > > >> > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c >> > > b/src/amd/vulkan/radv_cmd_buffer.c >> > > index 3e0ed0e9a9..89e31a0347 100644 >> > > --- a/src/amd/vulkan/radv_cmd_buffer.c >> > > +++ b/src/amd/vulkan/radv_cmd_buffer.c >> > > @@ -915,6 +915,37 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer >> > > *cmd_buffer, >> > > >> > > } >> > > >> > > + if (image->surface.htile_size) >> > > + { >> > > + /* If the last depth clear value was 0.0f, set >> > > ZRANGE_PRECISION >> > > +* to 0 in dp_z_info for more accuracy with reverse >> > > depth; and >> > > +* to avoid >> > > https://bugs.freedesktop.org/show_bug.cgi?id=105396. >> > > +* Otherwise, we leave it set to 1. 
>> > > +*/ >> > > + radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0)); >> > > + >> > > + const uint32_t write_space = 0 << 8;/* register */ >> > > + const uint32_t poll_space = 1 << 4; /* memory */ >> > > + const uint32_t function = 3 << 0; /* equal to the >> > > reference */ >> > > + const uint32_t options = write_space | poll_space | >> > > function; >> > > + radeon_emit(cmd_buffer->cs, options); >> > > + >> > > + /* poll address - location of the depth clear value */ >> > > + uint64_t va = radv_buffer_get_va(image->bo); >> > > + va += image->offset + image->clear_value_offset; >> > > + radeon_emit(cmd_buffer->cs, va); >> > > + radeon_emit(cmd_buffer->cs, va >> 32); >> > > + >> > > + radeon_emit(cmd_buffer->cs, fui(0.0f)); /* >> > > reference value */ >> > > + radeon_emit(cmd_buffer->cs, (uint32_t)-1); /* >> > > comparison mask */ >> > > + radeon_emit(cmd_buffer->cs, R_028040_DB_Z_INFO >> 2); /* >> > > write address low */ >> > > + radeon_emit(cmd_buffer->cs, 0u);/* write >> > > address high */ >> > > + >> > > + /* The value to write data when the condition passes */ >> > > + uint32_t db_z_info_clear_zero = db_z_info & >> > > C_028040_ZRANGE_PRECISION; >> > > + radeon_emit(cmd_buffer->cs, db_z_info_clear_zero); >> > > + } >> > > + >> > > radeon_set_context_reg(cmd_buffer->cs, >> > > R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL, >> > >ds->pa_su_poly_offset_db_fmt_cntl); >> > > } >> > > @@ -3479,7 +3510,8 @@ void radv_CmdEndRenderPass( >> > > >> > > /* >> > > * For HTILE we have the following interesting clear words: >> > > - * 0xf30f: Uncompressed, full depth range, for depth+stencil HTILE >> > > + * 0xf30f: Uncompressed, full depth range, for depth+stencil >> > > HTILE when ZRANGE_PRECISION is 1 >> > > + * 0x0003f30f: Uncompressed, full depth range, for depth+stencil >> > > HTILE when ZRANGE_PRECISION is 0 >> > > * 0xfffc000f: Uncompressed, full depth range, for depth only HTILE. 
>> > > * 0xfff0: Clear depth to 1.0 >> > > * 0x: Clear depth to 0.0 >> > > @@ -3528,8 +3560,22 @@ static void >> > > radv_handle_depth_image_transition(struct radv_cmd_buffer *cmd_buffe >> > > radv_initialize_htile(cmd_buffer, image, range, 0); >> > > } else if (!radv_layout_is_htile_compressed(image, src_layout, >> > > src_queue_mask) && >> > >radv_layout_is_htile_compressed(image, dst_layout, >> > > dst_queue_mask)) { >> > > - uint32_t clear_value = >> > > vk_format_is_stencil(image->vk_format) ?
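The dword values built up in the quoted COND_WRITE packet can be checked with a small standalone sketch. The three field encodings are copied from the quoted patch; everything else is an assumption for illustration: the macro and function names are hypothetical, and ZRANGE_PRECISION is assumed to sit in bit 31 of DB_Z_INFO (the real mask is C_028040_ZRANGE_PRECISION in the driver headers and should be checked there).

```c
#include <stdint.h>

/* Field encodings as written in the quoted patch. */
#define WRITE_SPACE_REGISTER (0u << 8)
#define POLL_SPACE_MEMORY    (1u << 4)
#define FUNCTION_EQUAL       (3u << 0)

/* Assumption for illustration only: ZRANGE_PRECISION is taken to be
 * bit 31 of DB_Z_INFO, so the "clear" mask keeps every other bit. */
#define ZRANGE_PRECISION_CLEAR_MASK 0x7FFFFFFFu

/* Options dword: write to a register, poll memory, compare for equality. */
static uint32_t
cond_write_options(void)
{
   return WRITE_SPACE_REGISTER | POLL_SPACE_MEMORY | FUNCTION_EQUAL;
}

/* The poll address is emitted as two dwords, low half first. */
static void
split_address(uint64_t va, uint32_t *lo, uint32_t *hi)
{
   *lo = (uint32_t)va;
   *hi = (uint32_t)(va >> 32);
}

/* Value written when the last depth clear value equals 0.0f. */
static uint32_t
db_z_info_for_zero_clear(uint32_t db_z_info)
{
   return db_z_info & ZRANGE_PRECISION_CLEAR_MASK;
}
```

With these encodings the options dword works out to 0x13.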
[Mesa-dev] [Bug 105775] F1 2017 crashes on GCN 1.0 cards
https://bugs.freedesktop.org/show_bug.cgi?id=105775

mirh changed:

           What    |Removed                     |Added
                 CC|                            |m...@protonmail.ch
[Mesa-dev] [PATCH] nir+drivers: add helpers to get # of src/dest components
Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.

Signed-off-by: Rob Clark
---
 src/compiler/nir/nir.h | 22 ++
 src/compiler/nir/nir_opt_copy_propagate.c | 5 +
 src/compiler/nir/nir_validate.c | 10 ++
 src/compiler/spirv/spirv_to_nir.c | 3 +--
 .../drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +-
 src/intel/compiler/brw_fs_nir.cpp | 11 +--
 6 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 9fff1f4647d..2f4ff193fe6 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1134,6 +1134,28 @@ typedef struct {
 
 extern const nir_intrinsic_info nir_intrinsic_infos[nir_num_intrinsics];
 
+static inline unsigned
+nir_intrinsic_src_components(nir_intrinsic_instr *intr, unsigned srcn)
+{
+   const nir_intrinsic_info *info = &nir_intrinsic_infos[intr->intrinsic];
+   assert(srcn < info->num_srcs);
+   if (info->src_components[srcn])
+      return info->src_components[srcn];
+   else
+      return intr->num_components;
+}
+
+static inline unsigned
+nir_intrinsic_dest_components(nir_intrinsic_instr *intr)
+{
+   const nir_intrinsic_info *info = &nir_intrinsic_infos[intr->intrinsic];
+   if (!info->has_dest)
+      return 0;
+   else if (info->dest_components)
+      return info->dest_components;
+   else
+      return intr->num_components;
+}
 
 #define INTRINSIC_IDX_ACCESSORS(name, flag, type) \
 static inline type \
diff --git a/src/compiler/nir/nir_opt_copy_propagate.c b/src/compiler/nir/nir_opt_copy_propagate.c
index c4001fa73f5..3cd476a1b97 100644
--- a/src/compiler/nir/nir_opt_copy_propagate.c
+++ b/src/compiler/nir/nir_opt_copy_propagate.c
@@ -257,10 +257,7 @@ copy_prop_instr(nir_instr *instr)
       nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
       for (unsigned i = 0;
            i < nir_intrinsic_infos[intrin->intrinsic].num_srcs; i++) {
-         unsigned num_components =
-            nir_intrinsic_infos[intrin->intrinsic].src_components[i];
-         if (!num_components)
-            num_components = intrin->num_components;
+         unsigned num_components = nir_intrinsic_src_components(intrin, i);
 
          while (copy_prop_src(&intrin->src[i], instr, NULL, num_components))
             progress = true;
diff --git a/src/compiler/nir/nir_validate.c b/src/compiler/nir/nir_validate.c
index 725ba43152c..81376f98b22 100644
--- a/src/compiler/nir/nir_validate.c
+++ b/src/compiler/nir/nir_validate.c
@@ -483,10 +483,7 @@ validate_intrinsic_instr(nir_intrinsic_instr *instr, validate_state *state)
    unsigned num_srcs = nir_intrinsic_infos[instr->intrinsic].num_srcs;
    for (unsigned i = 0; i < num_srcs; i++) {
-      unsigned components_read =
-         nir_intrinsic_infos[instr->intrinsic].src_components[i];
-      if (components_read == 0)
-         components_read = instr->num_components;
+      unsigned components_read = nir_intrinsic_src_components(instr, i);
 
       validate_assert(state, components_read > 0);
 
@@ -499,10 +496,7 @@ validate_intrinsic_instr(nir_intrinsic_instr *instr, validate_state *state)
    }
 
    if (nir_intrinsic_infos[instr->intrinsic].has_dest) {
-      unsigned components_written =
-         nir_intrinsic_infos[instr->intrinsic].dest_components;
-      if (components_written == 0)
-         components_written = instr->num_components;
+      unsigned components_written = nir_intrinsic_dest_components(instr);
 
       validate_assert(state, components_written > 0);
 
diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index 7888e1b7463..da4fac2e577 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -2430,8 +2430,7 @@ vtn_handle_image(struct vtn_builder *b, SpvOp opcode,
       struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_ssa);
       struct vtn_type *type = vtn_value(b, w[1], vtn_value_type_type)->type;
 
-      unsigned dest_components =
-         nir_intrinsic_infos[intrin->intrinsic].dest_components;
+      unsigned dest_components = nir_intrinsic_dest_components(intrin);
       if (intrin->intrinsic == nir_intrinsic_image_var_size) {
          dest_components = intrin->num_components =
             glsl_get_vector_elements(type->type);
diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index a3e82ab593b..f42ba7a8c6e 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1985,11 +1985,7 @@ emit_intrinsic(struct ir3_context *ctx, nir_intrinsic_instr *intr)
 	int idx, comp;
 
 	if (info->has_dest) {
-
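The fallback rule the two helpers encode (a per-intrinsic component count of 0 means "use the instruction's own num_components") can be tried out without NIR. The following is a minimal sketch with mock types: `intrinsic_info`, `intrinsic_instr`, `MAX_SRCS` and the function names are hypothetical stand-ins, and the info is reached through a pointer rather than through the real nir_intrinsic_infos table.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SRCS 4 /* arbitrary bound for this sketch */

/* Mock of the per-intrinsic static description. */
struct intrinsic_info {
   unsigned num_srcs;
   unsigned src_components[MAX_SRCS]; /* 0 = take it from the instruction */
   bool has_dest;
   unsigned dest_components;          /* 0 = take it from the instruction */
};

/* Mock of an intrinsic instruction. */
struct intrinsic_instr {
   const struct intrinsic_info *info;
   unsigned num_components;
};

static unsigned
src_components(const struct intrinsic_instr *intr, unsigned srcn)
{
   assert(srcn < intr->info->num_srcs);
   if (intr->info->src_components[srcn])
      return intr->info->src_components[srcn];
   return intr->num_components;
}

static unsigned
dest_components(const struct intrinsic_instr *intr)
{
   if (!intr->info->has_dest)
      return 0;
   if (intr->info->dest_components)
      return intr->info->dest_components;
   return intr->num_components;
}
```

Callers then no longer need to repeat the "if zero, fall back" check at each use site, which is what the hunks in the patch remove.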
[Mesa-dev] [Bug 104626] broadcom/vc5: double compare
https://bugs.freedesktop.org/show_bug.cgi?id=104626

Grazvydas Ignotas changed:

           What    |Removed                     |Added
                 CC|                            |dcb...@hotmail.com

--- Comment #2 from Grazvydas Ignotas ---
*** Bug 105783 has been marked as a duplicate of this bug. ***
[Mesa-dev] [Bug 105783] mesa-18.0.0/src/gallium/drivers/vc5/vc5_draw.c:589: duplicate expression ?
https://bugs.freedesktop.org/show_bug.cgi?id=105783

Grazvydas Ignotas changed:

           What    |Removed                     |Added
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #2 from Grazvydas Ignotas ---
*** This bug has been marked as a duplicate of bug 104626 ***
[Mesa-dev] [Bug 105783] mesa-18.0.0/src/gallium/drivers/vc5/vc5_draw.c:589: duplicate expression ?
https://bugs.freedesktop.org/show_bug.cgi?id=105783

Eric Engestrom changed:

           What    |Removed                     |Added
           Assignee|mesa-dev@lists.freedesktop. |e...@anholt.net
                   |org                         |

--- Comment #1 from Eric Engestrom ---
This was introduced in 2e3c7beb1e60a47e1f5dd "broadcom/vc5: Pack clear colors
according to the TLB internal format/type."
Re: [Mesa-dev] [PATCH 1/2] radv: add support for VK_EXT_sampler_filter_minmax
Reviewed-by: Bas Nieuwenhuizenfor the series. On Sun, Mar 25, 2018 at 8:15 PM, Samuel Pitoiset wrote: > The driver only supports the required formats for now. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_device.c | 35 ++- > src/amd/vulkan/radv_formats.c | 36 > 2 files changed, 70 insertions(+), 1 deletion(-) > > diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c > index 9c82fd059f..5eeff1da22 100644 > --- a/src/amd/vulkan/radv_device.c > +++ b/src/amd/vulkan/radv_device.c > @@ -44,6 +44,7 @@ > #include "vk_format.h" > #include "sid.h" > #include "gfx9d.h" > +#include "addrlib/gfx9/chip/gfx9_enum.h" > #include "util/debug.h" > > static int > @@ -943,6 +944,14 @@ void radv_GetPhysicalDeviceProperties2( > properties->maxMemoryAllocationSize = 0xull; > break; > } > + case > VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLER_FILTER_MINMAX_PROPERTIES_EXT: { > + VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT > *properties = > + > (VkPhysicalDeviceSamplerFilterMinmaxPropertiesEXT *)ext; > + /* GFX6-8 only support single channel min/max filter. 
> */ > + properties->filterMinmaxImageComponentMapping = > pdevice->rad_info.chip_class >= GFX9; > + properties->filterMinmaxSingleComponentFormats = true; > + break; > + } > default: > break; > } > @@ -3962,6 +3971,22 @@ radv_tex_aniso_filter(unsigned filter) > return 4; > } > > +static unsigned > +radv_tex_filter_mode(VkSamplerReductionModeEXT mode) > +{ > + switch (mode) { > + case VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE_EXT: > + return SQ_IMG_FILTER_MODE_BLEND; > + case VK_SAMPLER_REDUCTION_MODE_MIN_EXT: > + return SQ_IMG_FILTER_MODE_MIN; > + case VK_SAMPLER_REDUCTION_MODE_MAX_EXT: > + return SQ_IMG_FILTER_MODE_MAX; > + default: > + break; > + } > + return 0; > +} > + > static void > radv_init_sampler(struct radv_device *device, > struct radv_sampler *sampler, > @@ -3971,6 +3996,13 @@ radv_init_sampler(struct radv_device *device, > (uint32_t) pCreateInfo->maxAnisotropy > : 0; > uint32_t max_aniso_ratio = radv_tex_aniso_filter(max_aniso); > bool is_vi = (device->physical_device->rad_info.chip_class >= VI); > + unsigned filter_mode = SQ_IMG_FILTER_MODE_BLEND; > + > + const struct VkSamplerReductionModeCreateInfoEXT *sampler_reduction = > + vk_find_struct_const(pCreateInfo->pNext, > +SAMPLER_REDUCTION_MODE_CREATE_INFO_EXT); > + if (sampler_reduction) > + filter_mode = > radv_tex_filter_mode(sampler_reduction->reductionMode); > > sampler->state[0] = > (S_008F30_CLAMP_X(radv_tex_wrap(pCreateInfo->addressModeU)) | > > S_008F30_CLAMP_Y(radv_tex_wrap(pCreateInfo->addressModeV)) | > @@ -3981,7 +4013,8 @@ radv_init_sampler(struct radv_device *device, > S_008F30_ANISO_THRESHOLD(max_aniso_ratio >> 1) | > S_008F30_ANISO_BIAS(max_aniso_ratio) | > S_008F30_DISABLE_CUBE_WRAP(0) | > -S_008F30_COMPAT_MODE(is_vi)); > +S_008F30_COMPAT_MODE(is_vi) | > +S_008F30_FILTER_MODE(filter_mode)); > sampler->state[1] = > (S_008F34_MIN_LOD(S_FIXED(CLAMP(pCreateInfo->minLod, 0, 15), 8)) | > > S_008F34_MAX_LOD(S_FIXED(CLAMP(pCreateInfo->maxLod, 0, 15), 8)) | > 
S_008F34_PERF_MIP(max_aniso_ratio ? > max_aniso_ratio + 6 : 0)); > diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c > index da341a3a84..efb1d78790 100644 > --- a/src/amd/vulkan/radv_formats.c > +++ b/src/amd/vulkan/radv_formats.c > @@ -541,6 +541,35 @@ static bool radv_is_zs_format_supported(VkFormat format) > return radv_translate_dbformat(format) != V_028040_Z_INVALID || > format == VK_FORMAT_S8_UINT; > } > > +static bool radv_is_filter_minmax_format_supported(VkFormat format) > +{ > + /* From the Vulkan spec 1.1.71: > +* > +* "The following formats must support the > +* VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_MINMAX_BIT_EXT feature with > +* VK_IMAGE_TILING_OPTIMAL, if they support > +* VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT." > +*/ > + /* TODO: enable more formats. */ > + switch (format) { > + case
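The sampler part of the quoted patch does two things: it looks for an optional reduction-mode struct in the pNext chain, and it maps the reduction mode to a hardware filter mode, defaulting to blend. A minimal sketch of that pattern, under stated assumptions: all type names, enum values and helpers below are hypothetical stand-ins (the patch itself uses vk_find_struct_const, VkSamplerReductionModeCreateInfoEXT and the SQ_IMG_FILTER_MODE_* values).

```c
#include <stddef.h>

/* Hypothetical structure-type tags, standing in for VkStructureType. */
enum { STYPE_SAMPLER_CREATE_INFO = 1, STYPE_SAMPLER_REDUCTION_MODE = 2 };

enum reduction_mode { REDUCTION_WEIGHTED_AVERAGE, REDUCTION_MIN, REDUCTION_MAX };
enum filter_mode { FILTER_BLEND, FILTER_MIN, FILTER_MAX };

/* Every chained struct starts with the same two members. */
struct base_in {
   int sType;
   const void *pNext;
};

struct reduction_info {
   int sType;
   const void *pNext;
   enum reduction_mode mode;
};

/* Walk the chain and return the first struct with a matching tag. */
static const void *
find_struct(const void *chain, int sType)
{
   for (const struct base_in *s = chain; s != NULL; s = s->pNext) {
      if (s->sType == sType)
         return s;
   }
   return NULL;
}

static enum filter_mode
tex_filter_mode(enum reduction_mode mode)
{
   switch (mode) {
   case REDUCTION_MIN:
      return FILTER_MIN;
   case REDUCTION_MAX:
      return FILTER_MAX;
   default:
      return FILTER_BLEND;
   }
}

/* Blend unless the application chained a reduction-mode struct. */
static enum filter_mode
sampler_filter_mode(const struct base_in *create_info)
{
   const struct reduction_info *r =
      find_struct(create_info->pNext, STYPE_SAMPLER_REDUCTION_MODE);
   return r ? tex_filter_mode(r->mode) : FILTER_BLEND;
}
```

Defaulting to blend when the struct is absent keeps samplers created without the extension behaving as before.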
Re: [Mesa-dev] [PATCH] android: Use new nir intrinsics python scripts
Thanks, r-b and pushed!

On 27.03.2018 22:40, Stefan Schake wrote:
> Fixes: 76dfed8ae2d5 ("nir: mako all the intrinsics")
> Signed-off-by: Stefan Schake
> Acked-by: Rob Clark
> ---
>  src/compiler/Android.nir.gen.mk | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/compiler/Android.nir.gen.mk b/src/compiler/Android.nir.gen.mk
> index aaa2712..fa0707e 100644
> --- a/src/compiler/Android.nir.gen.mk
> +++ b/src/compiler/Android.nir.gen.mk
> @@ -103,3 +103,12 @@ $(intermediates)/spirv/vtn_gather_types.c:: $(LOCAL_PATH)/spirv/vtn_gather_types
>  	@mkdir -p $(dir $@)
>  	$(hide) $(MESA_PYTHON2) $^ $@ || ($(RM) $@; false)
>  
> +nir_intrinsics_h_gen := $(LOCAL_PATH)/nir/nir_intrinsics_h.py
> +$(intermediates)/nir/nir_intrinsics.h: $(LOCAL_PATH)/nir/nir_intrinsics.py $(nir_intrinsics_h_gen)
> +	@mkdir -p $(dir $@)
> +	$(hide) $(MESA_PYTHON2) $(nir_intrinsics_h_gen) --outdir $(dir $@) || ($(RM) $@; false)
> +
> +nir_intrinsics_c_gen := $(LOCAL_PATH)/nir/nir_intrinsics_c.py
> +$(intermediates)/nir/nir_intrinsics.c: $(LOCAL_PATH)/nir/nir_intrinsics.py $(nir_intrinsics_c_gen)
> +	@mkdir -p $(dir $@)
> +	$(hide) $(MESA_PYTHON2) $(nir_intrinsics_c_gen) --outdir $(dir $@) || ($(RM) $@; false)
[Mesa-dev] [Bug 105784] mesa-18.0.0/src/intel/vulkan/anv_nir_apply_pipeline_layout.c:150: bad assert ?
https://bugs.freedesktop.org/show_bug.cgi?id=105784

--- Comment #1 from Lionel Landwerlin ---
Thanks, this was fixed recently in commit:

commit 0cc7370733e9d20999d13c4c8565f0c91846a45c
Author: Grazvydas Ignotas
Date:   Tue Jan 23 00:44:36 2018 +0200

    anv: correct a duplicate check in an assert

I don't think we'll bother to cherry pick it back to 17.3 as it's not critical.
[Mesa-dev] [Bug 105784] mesa-18.0.0/src/intel/vulkan/anv_nir_apply_pipeline_layout.c:150: bad assert ?
https://bugs.freedesktop.org/show_bug.cgi?id=105784

            Bug ID: 105784
           Summary: mesa-18.0.0/src/intel/vulkan/anv_nir_apply_pipeline_layout.c:150: bad assert ?
           Product: Mesa
           Version: 17.3
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: dcb...@hotmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

[mesa-18.0.0/src/intel/vulkan/anv_nir_apply_pipeline_layout.c:150]: (style)
Same expression on both sides of '&&'.

Source code is

    assert(intrin->src[0].is_ssa && intrin->src[0].is_ssa);

maybe better code

    assert(intrin->src[0].is_ssa && intrin->src[1].is_ssa);
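To see why the duplicated operand matters, here is a small standalone sketch comparing the check as reported with the check as the reporter suggests. The `src` and `intrin` types and both function names are hypothetical mock-ups, not the NIR structures.

```c
#include <stdbool.h>

/* Mock source operand and instruction, for illustration only. */
struct src {
   bool is_ssa;
};

struct intrin {
   struct src src[2];
};

/* As reported: src[0] is tested twice, so src[1] is never looked at. */
static bool
check_as_reported(const struct intrin *i)
{
   return i->src[0].is_ssa && i->src[0].is_ssa;
}

/* As suggested in the report: both sources are tested. */
static bool
check_as_suggested(const struct intrin *i)
{
   return i->src[0].is_ssa && i->src[1].is_ssa;
}
```

For an instruction whose second source is not SSA, the first check still passes while the second one fails, so the original assert could not catch that case.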
[Mesa-dev] [Bug 105783] mesa-18.0.0/src/gallium/drivers/vc5/vc5_draw.c:589: duplicate expression ?
https://bugs.freedesktop.org/show_bug.cgi?id=105783

            Bug ID: 105783
           Summary: mesa-18.0.0/src/gallium/drivers/vc5/vc5_draw.c:589: duplicate expression ?
           Product: Mesa
           Version: 17.3
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: dcb...@hotmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

[mesa-18.0.0/src/gallium/drivers/vc5/vc5_draw.c:589]: (style) Same expression
on both sides of '||'.

Source code is

    if (surf->format == PIPE_FORMAT_B4G4R4A4_UNORM ||
        surf->format == PIPE_FORMAT_B4G4R4A4_UNORM) {

Suggest either simplify expression or use some other constant.
Re: [Mesa-dev] [PATCH] i965: annotate brw_oa.py's --header and --code as required
Hi,

On 20.03.2018 19:06, Dylan Baker wrote:
> Quoting Emil Velikov (2018-03-20 09:29:00)
> [snip]
> >      gens = []
> >      for xml_file in args.xml_files:
> > @@ -617,7 +610,7 @@ def main():
> >
> >          """))
> >
> > -    c("#include \"" + os.path.basename(args.header) + "\"")
> > +    c("#include \"" + os.path.basename(header_file) + "\"")
>
> You're calling os.path.basename on a file object, which isn't valid. This
> should still be args.header.

One could also use .name.

	- Eero

> >      c(textwrap.dedent("""\
> >          #include "brw_context.h"
Re: [Mesa-dev] GfxBench & CSDof failures
Hi,

On 28.03.2018 13:27, Eero Tamminen wrote:

Mesa built from the following commit (from last evening):

commit 76dfed8ae2d5c6c509eb2661389be3c6a25077df
Author:     Rob Clark
AuthorDate: Thu Mar 15 18:42:44 2018 -0400
Commit:     Rob Clark
CommitDate: Tue Mar 27 08:36:37 2018 -0400

    nir: mako all the intrinsics

fails with the following GL 4.3 benchmarks I tested:
* GfxBench Manhattan 3.1, CarChase and AztecRuins
* SynMark CSDof

The issues are as follows:
* CSDof & AztecRuins: compile takes so long that the test is killed

Sorry, I should have read more of my 2k mesa-dev mail backlog, not just checked bugs. It was the memory usage issue mentioned by Ian & Clayton, not a compile time issue.

* Manhattan 3.1 shader linking failed:
  error: declarations for uniform `depth_parameters` are inside block `cameraConsts` and outside a block

And this was also using all memory, so I assume the issue was due to the Mesa shader compiler handling memory allocation failure badly (continuing to a bogus compile error instead of reporting a compile failure due to allocation failure).

A commit from one day earlier didn't have any problems. Neither does the latest Mesa git version. Which commit fixed the regression, is it this one:

I assume it was:

commit 5f21a7afe072f8a6e558ccc47407a0a94e0d1313
Author:     Jason Ekstrand
AuthorDate: Tue Mar 27 16:12:16 2018 -0700
Commit:     Jason Ekstrand
CommitDate: Tue Mar 27 18:18:26 2018 -0700

    nir/intrinsics: Don't report negative dest_components

	- Eero
[Mesa-dev] [PATCH v3 1/4] mesa: add support for nvidia conservative rasterization extensions
Although the specs are written against compatibility GL 4.3 and allows core profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+. --- src/mapi/glapi/gen/gl_API.xml | 47 +++ src/mapi/glapi/gen/gl_genexec.py| 1 + src/mesa/Makefile.sources | 2 + src/mesa/main/attrib.c | 60 +++--- src/mesa/main/conservativeraster.c | 138 src/mesa/main/conservativeraster.h | 48 +++ src/mesa/main/context.c | 10 +++ src/mesa/main/dlist.c | 86 src/mesa/main/enable.c | 14 src/mesa/main/extensions_table.h| 4 + src/mesa/main/get.c | 3 + src/mesa/main/get_hash_params.py| 13 +++ src/mesa/main/mtypes.h | 28 ++- src/mesa/main/tests/dispatch_sanity.cpp | 27 +++ src/mesa/main/viewport.c| 57 + src/mesa/main/viewport.h| 6 ++ src/mesa/meson.build| 2 + 17 files changed, 535 insertions(+), 11 deletions(-) create mode 100644 src/mesa/main/conservativeraster.c create mode 100644 src/mesa/main/conservativeraster.h diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 38c1921047..db312370b1 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -12871,6 +12871,53 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + http://www.w3.org/2001/XInclude"/> diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py index aaff9f230b..be8013b62b 100644 --- a/src/mapi/glapi/gen/gl_genexec.py +++ b/src/mapi/glapi/gen/gl_genexec.py @@ -62,6 +62,7 @@ header = """/** #include "main/colortab.h" #include "main/compute.h" #include "main/condrender.h" +#include "main/conservativeraster.h" #include "main/context.h" #include "main/convolve.h" #include "main/copyimage.h" diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 0446078136..43ec55f580 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -49,6 +49,8 @@ MAIN_FILES = \ main/condrender.c \ main/condrender.h \ main/config.h \ + main/conservativeraster.c \ + main/conservativeraster.h \ main/context.c \ 
 	main/context.h \
 	main/convolve.c \
diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
index 9d3aa728a1..a8873f2988 100644
--- a/src/mesa/main/attrib.c
+++ b/src/mesa/main/attrib.c
@@ -138,6 +138,9 @@ struct gl_enable_attrib
 
    /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
    GLboolean sRGBEnabled;
+
+   /* GL_NV_conservative_raster */
+   GLboolean ConservativeRasterization;
 };
 
 
@@ -178,6 +181,13 @@ struct texture_state
 };
 
 
+struct viewport_state
+{
+   struct gl_viewport_attrib ViewportArray[MAX_VIEWPORTS];
+   GLuint SubpixelPrecisionBias[2];
+};
+
+
 /** An unused GL_*_BIT value */
 #define DUMMY_BIT 0x1000
 
@@ -394,6 +404,9 @@ _mesa_PushAttrib(GLbitfield mask)
 
       /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */
       attr->sRGBEnabled = ctx->Color.sRGBEnabled;
+
+      /* GL_NV_conservative_raster */
+      attr->ConservativeRasterization = ctx->ConservativeRasterization;
    }
 
    if (mask & GL_EVAL_BIT) {
@@ -545,11 +558,23 @@ _mesa_PushAttrib(GLbitfield mask)
    }
 
    if (mask & GL_VIEWPORT_BIT) {
-      if (!push_attrib(ctx, &head, GL_VIEWPORT_BIT,
-                       sizeof(struct gl_viewport_attrib)
-                       * ctx->Const.MaxViewports,
-                       (void*)&ctx->ViewportArray))
+      struct viewport_state *viewstate = CALLOC_STRUCT(viewport_state);
+      if (!viewstate) {
+         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_VIEWPORT_BIT)");
+         goto end;
+      }
+
+      if (!save_attrib_data(&head, GL_VIEWPORT_BIT, viewstate)) {
+         free(viewstate);
+         _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_VIEWPORT_BIT)");
          goto end;
+      }
+
+      memcpy(&viewstate->ViewportArray, &ctx->ViewportArray,
+             sizeof(struct gl_viewport_attrib)*ctx->Const.MaxViewports);
+
+      viewstate->SubpixelPrecisionBias[0] = ctx->NvSubpixelPrecisionBias[0];
+      viewstate->SubpixelPrecisionBias[1] = ctx->NvSubpixelPrecisionBias[1];
    }
 
    /* GL_ARB_multisample */
@@ -714,6 +739,13 @@ pop_enable_group(struct gl_context *ctx, const struct gl_enable_attrib *enable)
    TEST_AND_UPDATE(ctx->Color.sRGBEnabled, enable->sRGBEnabled,
                    GL_FRAMEBUFFER_SRGB);
+   /* GL_NV_conservative_raster */
+   if (ctx->Extensions.NV_conservative_raster) {
+
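The GL_VIEWPORT_BIT change above replaces a plain copy of the viewport array with a dedicated struct that also carries the subpixel precision bias. A minimal sketch of that save/restore idea, assuming mock types: `context`, `viewport_attrib`, `push_viewport_state` and `pop_viewport_state` are hypothetical names, `MAX_VIEWPORTS` is set to 16 only for this example, and the attribute-stack bookkeeping and GL error reporting of the real code are omitted.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_VIEWPORTS 16 /* value chosen for this sketch */

struct viewport_attrib {
   float X, Y, Width, Height;
};

/* Mock of the pieces of the GL context touched by the patch. */
struct context {
   unsigned MaxViewports;
   struct viewport_attrib ViewportArray[MAX_VIEWPORTS];
   unsigned SubpixelPrecisionBias[2];
};

/* Saved state: the viewports plus the subpixel precision bias. */
struct viewport_state {
   struct viewport_attrib ViewportArray[MAX_VIEWPORTS];
   unsigned SubpixelPrecisionBias[2];
};

/* Returns NULL on allocation failure; the caller owns the result. */
static struct viewport_state *
push_viewport_state(const struct context *ctx)
{
   struct viewport_state *vs = calloc(1, sizeof(*vs));
   if (!vs)
      return NULL;

   /* Only the viewports the implementation exposes are copied. */
   memcpy(vs->ViewportArray, ctx->ViewportArray,
          sizeof(struct viewport_attrib) * ctx->MaxViewports);
   vs->SubpixelPrecisionBias[0] = ctx->SubpixelPrecisionBias[0];
   vs->SubpixelPrecisionBias[1] = ctx->SubpixelPrecisionBias[1];
   return vs;
}

/* Restores the saved state into the context and releases it. */
static void
pop_viewport_state(struct context *ctx, struct viewport_state *vs)
{
   memcpy(ctx->ViewportArray, vs->ViewportArray,
          sizeof(struct viewport_attrib) * ctx->MaxViewports);
   ctx->SubpixelPrecisionBias[0] = vs->SubpixelPrecisionBias[0];
   ctx->SubpixelPrecisionBias[1] = vs->SubpixelPrecisionBias[1];
   free(vs);
}
```

Bundling the bias with the viewports means one push/pop pair covers both pieces of state.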
[Mesa-dev] [PATCH v3 2/4] gallium: add initial support for conservative rasterization
--- src/gallium/docs/source/cso/rasterizer.rst | 23 +++ src/gallium/docs/source/screen.rst | 18 ++ src/gallium/drivers/etnaviv/etnaviv_screen.c | 10 ++ src/gallium/drivers/freedreno/freedreno_screen.c | 10 ++ src/gallium/drivers/i915/i915_screen.c | 13 + src/gallium/drivers/llvmpipe/lp_screen.c | 12 src/gallium/drivers/nouveau/nv30/nv30_screen.c | 10 ++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 10 ++ src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 10 ++ src/gallium/drivers/r300/r300_screen.c | 10 ++ src/gallium/drivers/r600/r600_pipe.c | 6 ++ src/gallium/drivers/r600/r600_pipe_common.c | 4 src/gallium/drivers/radeonsi/si_get.c| 10 ++ src/gallium/drivers/softpipe/sp_screen.c | 12 src/gallium/drivers/svga/svga_screen.c | 13 + src/gallium/drivers/swr/swr_screen.cpp | 10 ++ src/gallium/drivers/vc4/vc4_screen.c | 13 - src/gallium/drivers/vc5/vc5_screen.c | 13 - src/gallium/drivers/virgl/virgl_screen.c | 10 ++ src/gallium/include/pipe/p_defines.h | 20 src/gallium/include/pipe/p_state.h | 8 21 files changed, 243 insertions(+), 2 deletions(-) diff --git a/src/gallium/docs/source/cso/rasterizer.rst b/src/gallium/docs/source/cso/rasterizer.rst index 616e4511a2..4dabcc032f 100644 --- a/src/gallium/docs/source/cso/rasterizer.rst +++ b/src/gallium/docs/source/cso/rasterizer.rst @@ -340,3 +340,26 @@ clip_plane_enable If any clip distance output is written, those half-spaces for which no clip distance is written count as disabled; i.e. user clip planes and shader clip distances cannot be mixed, and clip distances take precedence. + +conservative_raster_mode +The conservative rasterization mode. For PIPE_CONSERVATIVE_RASTER_OFF, +conservative rasterization is disabled. For IPE_CONSERVATIVE_RASTER_POST_SNAP +or PIPE_CONSERVATIVE_RASTER_PRE_SNAP, conservative rasterization is nabled. +When conservative rasterization is enabled, the polygon smooth, line mooth, +point smooth and line stipple settings are ignored. 
+With the post-snap mode, unlike the pre-snap mode, fragments are never +generated for degenerate primitives. Degenerate primitives, when rasterized, +are considered back-facing and the vertex attributes and depth are that of +the provoking vertex. +If the post-snap mode is used with an unsupported primitive, the pre-snap +mode is used, if supported. Behavior is similar for the pre-snap mode. +If the pre-snap mode is used, fragments are generated with respect to the primitive +before vertex snapping. + +conservative_raster_dilate +The amount of dilation during conservative rasterization. + +subpixel_precision_x +A bias added to the horizontal subpixel precision during conservative rasterization. +subpixel_precision_y +A bias added to the vertical subpixel precision during conservative rasterization. diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index 3837360fb4..5bc6ee99f0 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -420,6 +420,18 @@ The integer capabilities: by the driver, and the driver can throw assertion failures. * ``PIPE_CAP_PACKED_UNIFORMS``: True if the driver supports packed uniforms as opposed to padding to vec4s. +* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES``: Whether the + PIPE_CONSERVATIVE_RASTER_POST_SNAP mode is supported for triangles. +* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES``: Whether the +PIPE_CONSERVATIVE_RASTER_POST_SNAP mode is supported for points and lines. +* ``PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES``: Whether the +PIPE_CONSERVATIVE_RASTER_PRE_SNAP mode is supported for triangles. +* ``PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES``: Whether the +PIPE_CONSERVATIVE_RASTER_PRE_SNAP mode is supported for points and lines. +* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE``: Whether PIPE_CAP_POST_DEPTH_COVERAGE +works with conservative rasterization. 
+* ``PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS``: The maximum +subpixel precision bias in bits during conservative rasterization. .. _pipe_capf: @@ -437,6 +449,12 @@ The floating-point capabilities are: applied to anisotropically filtered textures. * ``PIPE_CAPF_MAX_TEXTURE_LOD_BIAS``: The maximum :term:`LOD` bias that may be applied to filtered textures. +* ``PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE``: The minimum conservative rasterization + dilation. +* ``PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE``: The maximum conservative rasterization + dilation. +*
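The fallback rule described in the rasterizer.rst hunk of this patch (a mode that is unsupported for the current primitive falls back to the other mode, if that one is supported) could be sketched roughly as below. The enum, struct and function names are hypothetical stand-ins rather than the Gallium definitions, and returning "off" when neither mode is supported is an assumption that the documentation text does not state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PIPE_CONSERVATIVE_RASTER_* values. */
enum cr_mode { CR_OFF, CR_POST_SNAP, CR_PRE_SNAP };

/* Hypothetical snapshot of the four per-primitive capability queries. */
struct cr_caps {
   bool post_snap_triangles;
   bool post_snap_points_lines;
   bool pre_snap_triangles;
   bool pre_snap_points_lines;
};

/* Is the given mode supported for this primitive class? */
static bool
cr_mode_supported(const struct cr_caps *caps, enum cr_mode mode,
                  bool is_triangle)
{
   switch (mode) {
   case CR_POST_SNAP:
      return is_triangle ? caps->post_snap_triangles
                         : caps->post_snap_points_lines;
   case CR_PRE_SNAP:
      return is_triangle ? caps->pre_snap_triangles
                         : caps->pre_snap_points_lines;
   default:
      return false;
   }
}

/* Pick the mode that would actually be used for a primitive. */
static enum cr_mode
cr_effective_mode(const struct cr_caps *caps, enum cr_mode requested,
                  bool is_triangle)
{
   enum cr_mode other;

   if (requested == CR_OFF)
      return CR_OFF;
   if (cr_mode_supported(caps, requested, is_triangle))
      return requested;

   /* Fall back to the other snap mode if the hardware supports it. */
   other = (requested == CR_POST_SNAP) ? CR_PRE_SNAP : CR_POST_SNAP;
   if (cr_mode_supported(caps, other, is_triangle))
      return other;

   /* Assumption: with neither mode available, nothing is enabled. */
   return CR_OFF;
}
```

With only post-snap triangles supported, a pre-snap request for a triangle would resolve to post-snap, while any request for points or lines would resolve to off under the stated assumption.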
[Mesa-dev] [PATCH v3 3/4] st/mesa: add support for nvidia conservative rasterization extensions
--- src/mesa/state_tracker/st_atom_rasterizer.c | 15 + src/mesa/state_tracker/st_context.c | 2 ++ src/mesa/state_tracker/st_extensions.c | 34 + 3 files changed, 51 insertions(+) diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c b/src/mesa/state_tracker/st_atom_rasterizer.c index 1be072e6e3..5b747a924e 100644 --- a/src/mesa/state_tracker/st_atom_rasterizer.c +++ b/src/mesa/state_tracker/st_atom_rasterizer.c @@ -298,5 +298,20 @@ st_update_rasterizer(struct st_context *st) raster->clip_plane_enable = ctx->Transform.ClipPlanesEnabled; raster->clip_halfz = (ctx->Transform.ClipDepthMode == GL_ZERO_TO_ONE); +/* ST_NEW_RASTERIZER */ + if (ctx->ConservativeRasterization) { + if (ctx->ConservativeRasterMode == GL_CONSERVATIVE_RASTER_MODE_POST_SNAP_NV) + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_POST_SNAP; + else + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_PRE_SNAP; + } else { + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_OFF; + } + + raster->conservative_raster_dilate = ctx->ConservativeRasterDilate; + + raster->subpixel_precision_x = ctx->NvSubpixelPrecisionBias[0]; + raster->subpixel_precision_y = ctx->NvSubpixelPrecisionBias[1]; + cso_set_rasterizer(st->cso_context, raster); } diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index 90b7f9359a..0709681e16 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -344,6 +344,8 @@ st_init_driver_flags(struct st_context *st) f->NewPolygonState = ST_NEW_RASTERIZER; f->NewPolygonStipple = ST_NEW_POLY_STIPPLE; f->NewViewport = ST_NEW_VIEWPORT; + f->NewNvConservativeRasterization = ST_NEW_RASTERIZER; + f->NewNvConservativeRasterizationParams = ST_NEW_RASTERIZER; } diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index bea61f21cb..02832f3951 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -494,6 +494,16 @@ 
void st_init_limits(struct pipe_screen *screen, c->UseSTD430AsDefaultPacking = screen->get_param(screen, PIPE_CAP_LOAD_CONSTBUF); + c->MaxSubpixelPrecisionBiasBits = + screen->get_param(screen, PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS); + + c->ConservativeRasterDilateRange[0] = + screen->get_paramf(screen, PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE); + c->ConservativeRasterDilateRange[1] = + screen->get_paramf(screen, PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE); + c->ConservativeRasterDilateGranularity = + screen->get_paramf(screen, PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY); + /* limit the max combined shader output resources to a driver limit */ temp = screen->get_param(screen, PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES); if (temp > 0 && c->MaxCombinedShaderOutputResources > temp) @@ -1363,4 +1373,28 @@ void st_init_extensions(struct pipe_screen *screen, extensions->ARB_texture_cube_map_array && extensions->ARB_texture_stencil8 && extensions->ARB_texture_multisample; + + if (screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES) && + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES) && + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE)) { + float max_dilate; + bool pre_snap_triangles, pre_snap_points_lines; + + max_dilate = screen->get_paramf(screen, PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE); + + pre_snap_triangles = + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES); + pre_snap_points_lines = + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES); + + extensions->NV_conservative_raster = + screen->get_param(screen, PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS) > 1; + + if (extensions->NV_conservative_raster) { + extensions->NV_conservative_raster_dilate = max_dilate>=0.75; + extensions->NV_conservative_raster_pre_snap_triangles = pre_snap_triangles; + extensions->NV_conservative_raster_pre_snap = 
+pre_snap_triangles && pre_snap_points_lines; + } + } } -- 2.14.3
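The extension gating in the st_extensions.c hunk of this patch can be restated as a small standalone function, which may make the dependencies between the caps easier to see. This is only a sketch: the structs and the function name are hypothetical, and the real code queries the screen through get_param/get_paramf rather than reading a struct.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the caps the patch queries from the screen. */
struct cr_screen_caps {
   bool post_snap_triangles;
   bool post_snap_points_lines;
   bool post_depth_coverage;
   bool pre_snap_triangles;
   bool pre_snap_points_lines;
   int max_subpixel_precision_bias;
   float max_dilate;
};

/* Hypothetical mirror of the four extension flags the patch sets. */
struct cr_extensions {
   bool conservative_raster;
   bool conservative_raster_dilate;
   bool conservative_raster_pre_snap_triangles;
   bool conservative_raster_pre_snap;
};

static struct cr_extensions
cr_pick_extensions(const struct cr_screen_caps *c)
{
   struct cr_extensions e = { false, false, false, false };

   /* Post-snap support for all primitives is the baseline requirement. */
   if (!(c->post_snap_triangles &&
         c->post_snap_points_lines &&
         c->post_depth_coverage))
      return e;

   e.conservative_raster = c->max_subpixel_precision_bias > 1;

   /* The remaining extensions are only exposed on top of the base one. */
   if (e.conservative_raster) {
      e.conservative_raster_dilate = c->max_dilate >= 0.75;
      e.conservative_raster_pre_snap_triangles = c->pre_snap_triangles;
      e.conservative_raster_pre_snap =
         c->pre_snap_triangles && c->pre_snap_points_lines;
   }
   return e;
}
```

Note that a driver reporting a maximum subpixel precision bias of 1 or less would get none of the extensions, even if every other cap is set.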
[Mesa-dev] [PATCH v3 4/4] nvc0: add conservative rasterization support
Subpixel precision bias, dilation and the post-snap mode are supported on GM200 and newer. The pre-snap mode is supported for triangle primitives on GP100. --- src/gallium/drivers/nouveau/nvc0/mme/com9097.mme | 32 ++ src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h | 22 +++ src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h | 5 src/gallium/drivers/nouveau/nvc0/nvc0_macros.h | 4 ++- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 19 + src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 14 ++ src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 2 +- 7 files changed, 90 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme index 7c5ec8f52b..83032da9de 100644 --- a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme +++ b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme @@ -550,3 +550,35 @@ qbw_postclamp: qbw_done: exit send (extrinsrt 0x0 $r4 0x0 0x10 0x10) maddrsend 0x44 + +/* NVC0_3D_MACRO_CONSERVATIVE_RASTER_STATE: + * + * This sets basically all the conservative rasterization state. It sets + * CONSERVATIVE_RASTER to one while doing so. 
+ * + * arg = biasx | biasy<<4 | (dilation*4)<<8 | mode<<10 + */ +.section #mme9097_conservative_raster_state + /* Mode and dilation */ + maddr 0x1d00 /* SCRATCH[0] */ + send 0x0 /* unknown */ + send (extrinsrt 0x0 $r1 8 3 23) /* value */ + mov $r2 0x7 + send (extrinsrt 0x0 $r2 0 3 23) /* write mask */ + maddr 0x18c4 /* FIRMWARE[4] */ + mov $r2 0x831 + send (extrinsrt 0x0 $r2 0 12 11) /* sends 0x418800 */ + /* Subpixel precision */ + mov $r2 0xf + mov $r2 (and $r1 $r2) + mov $r2 (extrinsrt $r2 $r1 4 4 8) + maddr 0x8287 /* SUBPIXEL_PRECISION[0] (incrementing by 8 methods) */ + mov $r3 16 /* loop counter */ + mov $r4 1 /* loop decrement */ +loop: + mov $r3 (sub $r3 $r4) + branz $r3 #loop + send $r2 + /* Enable */ + exit maddr 0x1452 /* CONSERVATIVE_RASTER */ + send 0x1 diff --git a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h index 9618da6e28..b8b69eb544 100644 --- a/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h +++ b/src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h @@ -373,3 +373,25 @@ uint32_t mme9097_query_buffer_write[] = { 0x840100c2, 0x00110071, }; + +uint32_t mme9097_conservative_raster_state[] = { + 0x07400021, + 0x0041, + 0xb8d04042, + 0x0001c211, + 0xb8c08042, + 0x06310021, + 0x020c4211, + 0x5b008042, + 0x0003c211, + 0x00148a10, + 0x41085212, + 0x20a1c021, + 0x00040311, + 0x4411, + 0x00051b10, + 0xd817, + 0x1041, + 0x051480a1, + 0x4041, +}; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h index d7245fbcae..c5456e48b5 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h @@ -447,6 +447,10 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
#define NVC0_3D_VIEWPORT_TRANSLATE_Z__ESIZE0x0020 #define NVC0_3D_VIEWPORT_TRANSLATE_Z__LEN 0x0010 +#define NVC0_3D_SUBPIXEL_PRECISION(i0)(0x0a1c + 0x20*(i0)) +#define NVC0_3D_SUBPIXEL_PRECISION__ESIZE 0x0020 +#define NVC0_3D_SUBPIXEL_PRECISION__LEN 0x0010 + #define NVC0_3D_VIEWPORT_HORIZ(i0)(0x0c00 + 0x10*(i0)) #define NVC0_3D_VIEWPORT_HORIZ__ESIZE 0x0010 #define NVC0_3D_VIEWPORT_HORIZ__LEN0x0010 @@ -780,6 +784,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. #define NVC0_3D_UNK11400x1140 #define NVC0_3D_UNK11440x1144 +#define NVC0_3D_CONSERVATIVE_RASTER0x1148 #define NVC0_3D_VTX_ATTR_DEFINE0x114c #define NVC0_3D_VTX_ATTR_DEFINE_ATTR__MASK 0x00ff diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h b/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h index eeacc714f3..7aa0633795 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_macros.h @@ -35,6 +35,8 @@ #define NVC0_3D_MACRO_QUERY_BUFFER_WRITE 0x3858 -#define NVC0_CP_MACRO_LAUNCH_GRID_INDIRECT 0x3860 +#define NVC0_CP_MACRO_LAUNCH_GRID_INDIRECT 0x3860 + +#define NVC0_3D_MACRO_CONSERVATIVE_RASTER_STATE 0x3868 #endif /* __NVC0_MACROS_H__ */ diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
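The macro comment in this patch gives the argument layout as biasx | biasy<<4 | (dilation*4)<<8 | mode<<10. A host-side packing helper for that layout might look like the sketch below. The function name, parameter types and masks are assumptions: the field widths (4 bits per bias, 2 bits for dilation in quarter steps, 1 bit for mode) are inferred from the shifts and from the extract widths in the macro body, and this is not the code the driver uses.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack the conservative raster state into the single
 * macro argument described as
 *    arg = biasx | biasy<<4 | (dilation*4)<<8 | mode<<10
 * Field widths are inferred from the shifts, not taken from the driver.
 */
static uint32_t
cr_pack_macro_arg(unsigned bias_x, unsigned bias_y, float dilate,
                  unsigned mode)
{
   /* Dilation is expressed in quarter-pixel steps (0, 0.25, 0.5, 0.75). */
   uint32_t dilate_quarters = (uint32_t)(dilate * 4.0f);

   return (bias_x & 0xf) |
          ((bias_y & 0xf) << 4) |
          ((dilate_quarters & 0x3) << 8) |
          ((mode & 0x1) << 10);
}
```

For example, a bias of (1, 2), a dilation of 0.75 and mode bit 1 would pack to 0x721 under these assumptions.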
[Mesa-dev] [PATCH v3 0/4] Implement Various Conservative Rasterization Extensions
This patch-set adds support for GL_NV_conservative_raster and GL_NV_conservative_raster_dilate on GM2xx and newer. It also adds support for GL_NV_conservative_raster_pre_snap_triangles on GP1xx. In doing so, it implements various functions in mesa core, extends the Gallium API, connects the new mesa core functions and the Gallium API through st/mesa and implements support for the Gallium API in the Nouveau driver. Changes in v3: - rename SubpixelPrecisionBias to NVSubpixelPrecisionBias - move the subpixel precision bias into pipe_rasterizer_state - set the conservative rasterization state using a PGRAPH macro Changes in v2: - fix indentation error in gl_API.xml - fix code to handle earlier hardware Rhys Perry (4): mesa: add support for nvidia conservative rasterization extensions gallium: add initial support for conservative rasterization st/mesa: add support for nvidia conservative rasterization extensions nvc0: add conservative rasterization support src/gallium/docs/source/cso/rasterizer.rst | 23 src/gallium/docs/source/screen.rst | 18 +++ src/gallium/drivers/etnaviv/etnaviv_screen.c | 10 ++ src/gallium/drivers/freedreno/freedreno_screen.c | 10 ++ src/gallium/drivers/i915/i915_screen.c | 13 ++ src/gallium/drivers/llvmpipe/lp_screen.c | 12 ++ src/gallium/drivers/nouveau/nv30/nv30_screen.c | 10 ++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 10 ++ src/gallium/drivers/nouveau/nvc0/mme/com9097.mme | 32 + src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h | 22 src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h | 5 + src/gallium/drivers/nouveau/nvc0/nvc0_macros.h | 4 +- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 17 +++ src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 14 +++ src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 2 +- src/gallium/drivers/r300/r300_screen.c | 10 ++ src/gallium/drivers/r600/r600_pipe.c | 6 + src/gallium/drivers/r600/r600_pipe_common.c| 4 + src/gallium/drivers/radeonsi/si_get.c | 10 ++ src/gallium/drivers/softpipe/sp_screen.c | 12 ++ 
src/gallium/drivers/svga/svga_screen.c | 13 ++ src/gallium/drivers/swr/swr_screen.cpp | 10 ++ src/gallium/drivers/vc4/vc4_screen.c | 13 +- src/gallium/drivers/vc5/vc5_screen.c | 13 +- src/gallium/drivers/virgl/virgl_screen.c | 10 ++ src/gallium/include/pipe/p_defines.h | 20 +++ src/gallium/include/pipe/p_state.h | 8 ++ src/mapi/glapi/gen/gl_API.xml | 47 +++ src/mapi/glapi/gen/gl_genexec.py | 1 + src/mesa/Makefile.sources | 2 + src/mesa/main/attrib.c | 60 +++-- src/mesa/main/conservativeraster.c | 138 + src/mesa/main/conservativeraster.h | 48 +++ src/mesa/main/context.c| 10 ++ src/mesa/main/dlist.c | 86 + src/mesa/main/enable.c | 14 +++ src/mesa/main/extensions_table.h | 4 + src/mesa/main/get.c| 3 + src/mesa/main/get_hash_params.py | 13 ++ src/mesa/main/mtypes.h | 28 - src/mesa/main/tests/dispatch_sanity.cpp| 27 src/mesa/main/viewport.c | 57 + src/mesa/main/viewport.h | 6 + src/mesa/meson.build | 2 + src/mesa/state_tracker/st_atom_rasterizer.c| 15 +++ src/mesa/state_tracker/st_context.c| 2 + src/mesa/state_tracker/st_extensions.c | 34 + 47 files changed, 913 insertions(+), 15 deletions(-) create mode 100644 src/mesa/main/conservativeraster.c create mode 100644 src/mesa/main/conservativeraster.h -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] vbo: Use alloca for _vbo_draw_indirect.
From: Mathias Fröhlich Marek, do you mean adding the patch below as the 9th change in the series? I would like to keep that change separate from #3, since patch #3 just moves the already existing implementation to the driver_functions level, using the identical implementation except that it calls into struct driver_functions instead of the vbo module draw function. Also, I do not want to call blindly into alloca with possibly large counts, so the implementation uses an upper bound above which it uses malloc instead of alloca. OK with that? best Mathias Avoid using malloc in the draw path of mesa. Since the draw_count is a user api input, fall back to malloc if the amount of consumed stack space may get too high. Signed-off-by: Mathias Fröhlich --- src/mesa/vbo/vbo_context.c | 70 +++--- 1 file changed, 47 insertions(+), 23 deletions(-) diff --git a/src/mesa/vbo/vbo_context.c b/src/mesa/vbo/vbo_context.c index b8c28ceffb..06b8f820ee 100644 --- a/src/mesa/vbo/vbo_context.c +++ b/src/mesa/vbo/vbo_context.c @@ -233,25 +233,17 @@ _vbo_DestroyContext(struct gl_context *ctx) } -void -_vbo_draw_indirect(struct gl_context *ctx, GLuint mode, -struct gl_buffer_object *indirect_data, -GLsizeiptr indirect_offset, unsigned draw_count, -unsigned stride, -struct gl_buffer_object *indirect_draw_count_buffer, -GLsizeiptr indirect_draw_count_offset, -const struct _mesa_index_buffer *ib) +static void +draw_indirect(struct gl_context *ctx, GLuint mode, + struct gl_buffer_object *indirect_data, + GLsizeiptr indirect_offset, unsigned draw_count, + unsigned stride, + struct gl_buffer_object *indirect_draw_count_buffer, + GLsizeiptr indirect_draw_count_offset, + const struct _mesa_index_buffer *ib, + struct _mesa_prim *space) { - struct _mesa_prim *prim; - - prim = calloc(draw_count, sizeof(*prim)); - if (prim == NULL) { - _mesa_error(ctx, GL_OUT_OF_MEMORY, "gl%sDraw%sIndirect%s", - (draw_count > 1) ? "Multi" : "", - ib ? "Elements" : "Arrays", - indirect_data ? 
"CountARB" : ""); - return; - } + struct _mesa_prim *prim = space; prim[0].begin = 1; prim[draw_count - 1].end = 1; @@ -266,10 +258,42 @@ _vbo_draw_indirect(struct gl_context *ctx, GLuint mode, /* This should always be true at this time */ assert(indirect_data == ctx->DrawIndirectBuffer); - ctx->Driver.Draw(ctx, prim, draw_count, - ib, false, 0, ~0, - NULL, 0, - indirect_data); + ctx->Driver.Draw(ctx, prim, draw_count, ib, false, 0u, ~0u, +NULL, 0, indirect_data); +} + - free(prim); +void +_vbo_draw_indirect(struct gl_context *ctx, GLuint mode, + struct gl_buffer_object *indirect_data, + GLsizeiptr indirect_offset, unsigned draw_count, + unsigned stride, + struct gl_buffer_object *indirect_draw_count_buffer, + GLsizeiptr indirect_draw_count_offset, + const struct _mesa_index_buffer *ib) +{ + /* Use alloca for the prim space if we are somehow in bounds. */ + if (draw_count*sizeof(struct _mesa_prim) < 1024) { + struct _mesa_prim *space = alloca(draw_count*sizeof(struct _mesa_prim)); + memset(space, 0, draw_count*sizeof(struct _mesa_prim)); + + draw_indirect(ctx, mode, indirect_data, indirect_offset, draw_count, +stride, indirect_draw_count_buffer, +indirect_draw_count_offset, ib, space); + } else { + struct _mesa_prim *space = calloc(draw_count, sizeof(struct _mesa_prim)); + if (space == NULL) { + _mesa_error(ctx, GL_OUT_OF_MEMORY, "gl%sDraw%sIndirect%s", + (draw_count > 1) ? "Multi" : "", + ib ? "Elements" : "Arrays", + indirect_data ? "CountARB" : ""); + return; + } + + draw_indirect(ctx, mode, indirect_data, indirect_offset, draw_count, +stride, indirect_draw_count_buffer, +indirect_draw_count_offset, ib, space); + + free(space); + } } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radv: only enable one channel when exporting prim id
Reviewed-by: Bas Nieuwenhuizen On Tue, Mar 20, 2018 at 10:07 AM, Samuel Pitoiset wrote: > It's a 32-bit integer like the layer. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_nir_to_llvm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/amd/vulkan/radv_nir_to_llvm.c > b/src/amd/vulkan/radv_nir_to_llvm.c > index ad046adfdb..c8d383e021 100644 > --- a/src/amd/vulkan/radv_nir_to_llvm.c > +++ b/src/amd/vulkan/radv_nir_to_llvm.c > @@ -2357,7 +2357,7 @@ handle_vs_outputs_post(struct radv_shader_context *ctx, > for (unsigned j = 1; j < 4; j++) > values[j] = ctx->ac.f32_0; > > - radv_export_param(ctx, param_count, values, 0xf); > + radv_export_param(ctx, param_count, values, 0x1); > > outinfo->vs_output_param_offset[VARYING_SLOT_PRIMITIVE_ID] = > param_count++; > outinfo->export_prim_id = true; > -- > 2.16.2
[Mesa-dev] [PATCH mesa] docs: fix 18.0 release note version
Fixes: 839fb3a696679bfe975c2 "docs: Update 18.0.0 release notes" Cc: "18.0" Cc: Emil Velikov Signed-off-by: Eric Engestrom --- docs/relnotes/18.0.0.html | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/relnotes/18.0.0.html b/docs/relnotes/18.0.0.html index 2b374be6321d62940192..6fa6370a06a58234bb96 100644 --- a/docs/relnotes/18.0.0.html +++ b/docs/relnotes/18.0.0.html @@ -14,15 +14,15 @@ The Mesa 3D Graphics Library -Mesa 17.4.0 Release Notes / March 27 2018 +Mesa 18.0.0 Release Notes / March 27 2018 -Mesa 17.4.0 is a new development release. +Mesa 18.0.0 is a new development release. People who are concerned with stability and reliability should stick -with a previous release or wait for Mesa 17.4.1. +with a previous release or wait for Mesa 18.0.1. -Mesa 17.4.0 implements the OpenGL 4.5 API, but the version reported by +Mesa 18.0.0 implements the OpenGL 4.5 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don't support all the features required in OpenGL 4.5. OpenGL -- Cheers, Eric
[Mesa-dev] GfxBench & CSDof failures
Hi,

Mesa built from the following commit (from last evening):

commit 76dfed8ae2d5c6c509eb2661389be3c6a25077df
Author: Rob Clark
AuthorDate: Thu Mar 15 18:42:44 2018 -0400
Commit: Rob Clark
CommitDate: Tue Mar 27 08:36:37 2018 -0400

nir: mako all the intrinsics

fails with the following GL 4.3 / compute shader benchmarks I tested:
* GfxBench Manhattan 3.1, CarChase and AztecRuins
* SynMark CSDof

The issues are the following:
* CSDof & AztecRuins: compile takes so long that the test is killed
* Manhattan 3.1 shader linking failed:
  error: declarations for uniform `depth_parameters` are inside block `cameraConsts` and outside a block

The commit from one day earlier didn't have any problems. Neither does the latest Mesa git version.

Which commit fixed the regression? Is it this one:

commit 629ee690addad9b3dc8f68cfff5ae09858f31caf
Author: Timothy Arceri
AuthorDate: Mon Mar 26 11:41:51 2018 +1100
Commit: Timothy Arceri
CommitDate: Wed Mar 28 09:59:38 2018 +1100

nir: fix crash in loop unroll corner case

?

- Eero
Re: [Mesa-dev] [PATCH 1/4] ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8, add assertions
Reviewed-by: Bas NieuwenhuizenOn Tue, 27 Mar 2018, 10:08 Samuel Pitoiset, wrote: > Tested-by: Samuel Pitoiset > > On 03/27/2018 02:39 AM, Marek Olšák wrote: > > From: Marek Olšák > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738 > > --- > > src/amd/common/ac_surface.c | 8 > > 1 file changed, 8 insertions(+) > > > > diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c > > index 12dfc0cb1f2..81882576baf 100644 > > --- a/src/amd/common/ac_surface.c > > +++ b/src/amd/common/ac_surface.c > > @@ -1150,32 +1150,39 @@ static int gfx9_compute_surface(ADDR_HANDLE > addrlib, > > break; > > case 16: > > AddrSurfInfoIn.format = ADDR_FMT_BC3; > > break; > > default: > > assert(0); > > } > > } else { > > switch (surf->bpe) { > > case 1: > > + assert(!(surf->flags & RADEON_SURF_ZBUFFER)); > > AddrSurfInfoIn.format = ADDR_FMT_8; > > break; > > case 2: > > + assert(surf->flags & RADEON_SURF_ZBUFFER || > > +!(surf->flags & RADEON_SURF_SBUFFER)); > > AddrSurfInfoIn.format = ADDR_FMT_16; > > break; > > case 4: > > + assert(surf->flags & RADEON_SURF_ZBUFFER || > > +!(surf->flags & RADEON_SURF_SBUFFER)); > > AddrSurfInfoIn.format = ADDR_FMT_32; > > break; > > case 8: > > + assert(!(surf->flags & RADEON_SURF_Z_OR_SBUFFER)); > > AddrSurfInfoIn.format = ADDR_FMT_32_32; > > break; > > case 16: > > + assert(!(surf->flags & RADEON_SURF_Z_OR_SBUFFER)); > > AddrSurfInfoIn.format = ADDR_FMT_32_32_32_32; > > break; > > default: > > assert(0); > > } > > AddrSurfInfoIn.bpp = surf->bpe * 8; > > } > > > > AddrSurfInfoIn.flags.color = !(surf->flags & > RADEON_SURF_Z_OR_SBUFFER); > > AddrSurfInfoIn.flags.depth = (surf->flags & RADEON_SURF_ZBUFFER) > != 0; > > @@ -1251,20 +1258,21 @@ static int gfx9_compute_surface(ADDR_HANDLE > addrlib, > > /* Calculate texture layout information. */ > > r = gfx9_compute_miptree(addrlib, config, surf, compressed, > >); > > if (r) > > return r; > > > > /* Calculate texture layout information for stencil. 
*/ > > if (surf->flags & RADEON_SURF_SBUFFER) { > > AddrSurfInfoIn.flags.stencil = 1; > > AddrSurfInfoIn.bpp = 8; > > + AddrSurfInfoIn.format = ADDR_FMT_8; > > > > if (!AddrSurfInfoIn.flags.depth) { > > r = gfx9_get_preferred_swizzle_mode(addrlib, > , false, > > > ); > > if (r) > > return r; > > } else > > AddrSurfInfoIn.flags.depth = 0; > > > > r = gfx9_compute_miptree(addrlib, config, surf, compressed, > > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] compiler/spirv: set is_image_sample_dref when required
Fixes crashes in: dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.* --- src/compiler/spirv/spirv_to_nir.c | 4 1 file changed, 4 insertions(+) diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index 7888e1b746..719e74c386 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -2029,12 +2029,15 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode, break; } + bool is_image_sample_dref = false; unsigned gather_component = 0; switch (opcode) { case SpvOpImageSampleDrefImplicitLod: case SpvOpImageSampleDrefExplicitLod: case SpvOpImageSampleProjDrefImplicitLod: case SpvOpImageSampleProjDrefExplicitLod: + is_image_sample_dref = true; + /* Fallthrough */ case SpvOpImageDrefGather: /* These all have an explicit depth value as their next source */ (*p++) = vtn_tex_src(b, w[idx++], nir_tex_src_comparator); @@ -2107,6 +2110,7 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode, instr->is_shadow = is_shadow; instr->is_new_style_shadow = is_shadow && glsl_get_components(ret_type->type) == 1; + instr->is_image_sample_dref = is_image_sample_dref; instr->component = gather_component; switch (glsl_get_sampler_result_type(image_type)) { -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] compiler/nir: add a is_image_sample_dref flag to texture instructions
So we can recognize image sampling instructions that involve a depth comparison against a reference, such as SPIR-V's OpImageSample{Proj}Dref{Explicit,Implicit}Lod and we can acknowledge that they return a single scalar value instead of a vec4. --- src/compiler/nir/nir.h | 9 + src/compiler/nir/nir_clone.c | 1 + src/compiler/nir/nir_instr_set.c | 2 ++ src/compiler/nir/nir_lower_tex.c | 5 - src/compiler/nir/nir_serialize.c | 5 - 5 files changed, 20 insertions(+), 2 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index 0d207d0ea5..625092cd2b 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -1231,6 +1231,12 @@ typedef struct { */ bool is_new_style_shadow; + /** +* If is_image_sample_dref is true, this is an image sample with depth +* comparing. +*/ + bool is_image_sample_dref; + /* gather component selector */ unsigned component : 2; @@ -1316,6 +1322,9 @@ nir_tex_instr_dest_size(const nir_tex_instr *instr) if (instr->is_shadow && instr->is_new_style_shadow) return 1; + if (instr->is_image_sample_dref) + return 1; + return 4; } } diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c index bcfdaa7594..7d6cfd896f 100644 --- a/src/compiler/nir/nir_clone.c +++ b/src/compiler/nir/nir_clone.c @@ -415,6 +415,7 @@ clone_tex(clone_state *state, const nir_tex_instr *tex) ntex->is_array = tex->is_array; ntex->is_shadow = tex->is_shadow; ntex->is_new_style_shadow = tex->is_new_style_shadow; + ntex->is_image_sample_dref = tex->is_image_sample_dref; ntex->component = tex->component; ntex->texture_index = tex->texture_index; diff --git a/src/compiler/nir/nir_instr_set.c b/src/compiler/nir/nir_instr_set.c index 9cb9ed43e8..5563f6f095 100644 --- a/src/compiler/nir/nir_instr_set.c +++ b/src/compiler/nir/nir_instr_set.c @@ -155,6 +155,7 @@ hash_tex(uint32_t hash, const nir_tex_instr *instr) hash = HASH(hash, instr->is_array); hash = HASH(hash, instr->is_shadow); hash = HASH(hash, instr->is_new_style_shadow); + hash = 
HASH(hash, instr->is_image_sample_dref); unsigned component = instr->component; hash = HASH(hash, component); hash = HASH(hash, instr->texture_index); @@ -310,6 +311,7 @@ nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2) tex1->is_array != tex2->is_array || tex1->is_shadow != tex2->is_shadow || tex1->is_new_style_shadow != tex2->is_new_style_shadow || + tex1->is_image_sample_dref != tex2->is_image_sample_dref || tex1->component != tex2->component || tex1->texture_index != tex2->texture_index || tex1->texture_array_size != tex2->texture_array_size || diff --git a/src/compiler/nir/nir_lower_tex.c b/src/compiler/nir/nir_lower_tex.c index 1062afd97f..03e7555679 100644 --- a/src/compiler/nir/nir_lower_tex.c +++ b/src/compiler/nir/nir_lower_tex.c @@ -114,6 +114,7 @@ get_texture_size(nir_builder *b, nir_tex_instr *tex) txs->is_array = tex->is_array; txs->is_shadow = tex->is_shadow; txs->is_new_style_shadow = tex->is_new_style_shadow; + txs->is_image_sample_dref = tex->is_image_sample_dref; txs->texture_index = tex->texture_index; txs->texture = nir_deref_var_clone(tex->texture, txs); txs->sampler_index = tex->sampler_index; @@ -343,6 +344,7 @@ replace_gradient_with_lod(nir_builder *b, nir_ssa_def *lod, nir_tex_instr *tex) txl->is_array = tex->is_array; txl->is_shadow = tex->is_shadow; txl->is_new_style_shadow = tex->is_new_style_shadow; + txl->is_image_sample_dref = tex->is_image_sample_dref; txl->sampler_index = tex->sampler_index; txl->texture = nir_deref_var_clone(tex->texture, txl); txl->sampler = nir_deref_var_clone(tex->sampler, txl); @@ -794,7 +796,8 @@ nir_lower_tex_block(nir_block *block, nir_builder *b, if (((1 << tex->texture_index) & options->swizzle_result) && !nir_tex_instr_is_query(tex) && - !(tex->is_shadow && tex->is_new_style_shadow)) { + !(tex->is_shadow && tex->is_new_style_shadow) && + !tex->is_image_sample_dref) { swizzle_result(b, tex, options->swizzles[tex->texture_index]); progress = true; } diff --git 
a/src/compiler/nir/nir_serialize.c b/src/compiler/nir/nir_serialize.c index 00df49c2ef..dcbe1f0c13 100644 --- a/src/compiler/nir/nir_serialize.c +++ b/src/compiler/nir/nir_serialize.c @@ -583,10 +583,11 @@ union packed_tex_data { unsigned is_array:1; unsigned is_shadow:1; unsigned is_new_style_shadow:1; + unsigned is_image_sample_dref:1; unsigned component:2; unsigned has_texture_deref:1; unsigned has_sampler_deref:1; - unsigned unused:10; /* Mark unused for valgrind. */ + unsigned unused:9; /* Mark unused for valgrind. */ } u; }; @@ -607,6 +608,7 @@ write_tex(write_ctx *ctx, const nir_tex_instr
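The effect of the new flag on the destination size can be illustrated without the rest of NIR. The sketch below covers only the default branch that the nir.h hunk of this patch touches; struct tex_flags and tex_dest_components are hypothetical names, not the NIR types.

```c
#include <stdbool.h>

/* Hypothetical subset of the nir_tex_instr flags relevant here. */
struct tex_flags {
   bool is_shadow;
   bool is_new_style_shadow;
   bool is_image_sample_dref;
};

/*
 * Number of components a (non-query) texture instruction returns,
 * following the logic the patch adds to nir_tex_instr_dest_size().
 */
static unsigned
tex_dest_components(const struct tex_flags *tex)
{
   /* New-style shadow samplers already return a single scalar. */
   if (tex->is_shadow && tex->is_new_style_shadow)
      return 1;

   /* Depth-comparing image samples also return a single scalar. */
   if (tex->is_image_sample_dref)
      return 1;

   return 4;
}
```

The same condition shows up in the nir_lower_tex.c hunk, where the result swizzle is skipped for instructions that return a scalar.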
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
On Wednesday, 2018-03-28 14:05:00 +1100, Jonathan Gray wrote: > On Tue, Mar 27, 2018 at 07:36:17PM +0100, Emil Velikov wrote: > > On 25 March 2018 at 09:06, Jonathan Gray wrote: > > > On Wed, Mar 21, 2018 at 05:09:17PM +, Eric Engestrom wrote: > > >> Cc: Maxin B. John > > >> Cc: Khem Raj > > >> Suggested-by: Jon Turney > > >> Signed-off-by: Eric Engestrom > > >> --- > > >> configure.ac| 1 + > > >> meson.build | 2 +- > > >> src/util/u_endian.h | 2 +- > > >> 3 files changed, 3 insertions(+), 2 deletions(-) > > > > > > OpenBSD and I suspect other systems have an endian.h that does not have > > > the __ defines like glibc. > > > > > Sigh, I guess the C/POSIX committee should really wake up and add that > > to the standard. > > ... one way or another. > > > > Jonathan can you play around with AC_CHECK_DECLS and send a patch that > > works on your end? > > > > Thanks > > Emil > > > > [1] > > https://www.gnu.org/software/autoconf/manual/autoconf-2.62/html_node/Generic-Declarations.html > > Or just change the header? Or just add `_DEFAULT_SOURCE` to the build, allowing glibc to use non-underscored names? I was meaning to send a patch with this, but I'm swamped right now and haven't had a chance to. I'd rather not duplicate the block as suggested below, though. Automatic ack from me on a patch that adds this define to all the build systems, replaces all 3 names with their non-underscored variants and adds a `#ifndef BYTE_ORDER #error "BYTE_ORDER undefined" #endif`. > > Some care is needed as '#if undefined == undefined' becomes '#if 0 == 0' > which is true... 
Agreed :) > > diff --git a/src/util/u_endian.h b/src/util/u_endian.h > index e11b381588..bf3b8707a1 100644 > --- a/src/util/u_endian.h > +++ b/src/util/u_endian.h > @@ -30,9 +30,17 @@ > #ifdef HAVE_ENDIAN_H > #include > > -#if __BYTE_ORDER == __LITTLE_ENDIAN > +/* glibc */ > +#if defined(__BYTE_ORDER) && (__BYTE_ORDER == __LITTLE_ENDIAN) > # define PIPE_ARCH_LITTLE_ENDIAN > -#elif __BYTE_ORDER == __BIG_ENDIAN > +#elif defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN) > +# define PIPE_ARCH_BIG_ENDIAN > +#endif > + > +/* OpenBSD */ > +#if defined(BYTE_ORDER) && (BYTE_ORDER == LITTLE_ENDIAN) > +# define PIPE_ARCH_LITTLE_ENDIAN > +#elif defined(BYTE_ORDER) && (BYTE_ORDER == BIG_ENDIAN) > # define PIPE_ARCH_BIG_ENDIAN > #endif > > @@ -54,8 +62,8 @@ > # define PIPE_ARCH_BIG_ENDIAN > #endif > > -#elif defined(__OpenBSD__) || defined(__NetBSD__) || \ > - defined(__FreeBSD__) || defined(__DragonFly__) > +#elif defined(__NetBSD__) || defined(__FreeBSD__) || \ > + defined(__DragonFly__) > #include > #include > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
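The '#if undefined == undefined' pitfall mentioned in this thread is easy to reproduce in isolation. In the sketch below, HYPOTHETICAL_ORDER and HYPOTHETICAL_LITTLE are deliberately undefined placeholder names (not real endian.h macros); the unguarded test matches anyway, while the defined() guard used in the quoted diff does not. It assumes the compiler is not run with -Wundef promoted to an error.

```c
/*
 * In #if expressions, identifiers that are not defined as macros are
 * replaced by 0, so comparing two undefined names yields 0 == 0.
 */
#if HYPOTHETICAL_ORDER == HYPOTHETICAL_LITTLE
# define UNGUARDED_MATCH 1   /* taken, although nothing is defined */
#else
# define UNGUARDED_MATCH 0
#endif

/* Guarding with defined() avoids the false positive. */
#if defined(HYPOTHETICAL_ORDER) && (HYPOTHETICAL_ORDER == HYPOTHETICAL_LITTLE)
# define GUARDED_MATCH 1
#else
# define GUARDED_MATCH 0     /* taken: the macro is not defined */
#endif
```

An `#ifndef ... #error` check, as suggested earlier in the thread, is another way to turn the silent false positive into a build failure.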