[Mesa-dev] [PATCH] gallium/swr: Fix multi-context sync fence deadlock.

2019-01-04 Thread Bruce Cherniak
Various recreation scenarios lead to API thread getting stuck in swr_fence_finish(). This is a multi-context issue, whereby one context overwrites the fence read-value with a previous sync's lesser value. The fence sync value is supposed to be always increasing. In swr_fence_cb(), only update
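
The "always increasing" rule described above can be sketched as follows. This is a minimal illustration, not the swr code: `Fence`, `read` and `fence_signal` are hypothetical names, and a C++ atomic compare-exchange loop is an assumed stand-in for whatever swr_fence_cb() actually does.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a fence; not the swr fence structure.
struct Fence {
    std::atomic<uint64_t> read{0};  // highest sync value seen as complete
};

// Publish `value` only if it is greater than the stored value, so a late
// callback carrying an older sync value can never move the fence backwards.
inline void fence_signal(Fence &f, uint64_t value)
{
    uint64_t cur = f.read.load(std::memory_order_relaxed);
    while (cur < value &&
           !f.read.compare_exchange_weak(cur, value,
                                         std::memory_order_release,
                                         std::memory_order_relaxed)) {
        // compare_exchange_weak refreshed `cur`; the loop re-checks cur < value.
    }
}
```

With this shape, signalling 5 and then 3 leaves the read value at 5, so a waiter comparing against it cannot miss its target.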

[Mesa-dev] [PATCH] swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.

2017-12-12 Thread Bruce Cherniak
Environment variable KNOB_MAX_WORKER_THREADS allows the user to override default thread creation and thread binding. A previous commit that adjusted Linux CPU topology caused setting this KNOB to bind all threads to a single core. This patch restores correct functionality of the override. Cc:

[Mesa-dev] [PATCH v2] swr: Correct texture allocation and limit max size to 2GB

2017-11-30 Thread Bruce Cherniak
This patch fixes piglit tex3d-maxsize by correcting 4 things: The total_size calculation was using 32-bit math, therefore a >4GB allocation request overflowed and was not returning false (unsupported). Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle >4GB allocations.
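
The overflow described above can be illustrated with a small sketch. It is not the patch itself: `texture_size_supported` and `kMaxTextureBytes` are hypothetical names, and treating exactly 2GB as still allowed is an assumption.

```cpp
#include <cstdint>

// Assumed cap taken from the patch title (2GB); the name is hypothetical.
constexpr uint64_t kMaxTextureBytes = 2ull << 30;

// Returns true and stores the byte count if the request fits under the cap.
// The product is built in 64-bit math and checked after each step, so it can
// neither wrap at 32 bits (2048 * 2048 * 2048 texels is 2^33, which wraps to
// 0 in 32-bit unsigned math) nor overflow 64 bits.
inline bool texture_size_supported(uint32_t width, uint32_t height,
                                   uint32_t depth, uint32_t bytes_per_texel,
                                   uint64_t *total_size)
{
    uint64_t size = static_cast<uint64_t>(width) * height;
    if (size > kMaxTextureBytes)
        return false;
    size *= depth;
    if (size > kMaxTextureBytes)
        return false;
    size *= bytes_per_texel;
    if (size > kMaxTextureBytes)
        return false;
    *total_size = size;
    return true;
}
```

The resulting byte count would then be passed to an allocator taking size_t rather than "unsigned int", as the commit message describes.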

[Mesa-dev] [PATCH] swr: Correct texture allocation and limit max size to 2GB

2017-11-20 Thread Bruce Cherniak
This patch fixes piglit tex3d-maxsize by correcting 4 things: The total_size calculation was using 32-bit math, therefore a 4GB allocation request overflowed and was not returning false (unsupported). Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle 4GB allocations.

[Mesa-dev] [PATCH] swr: Fixed an uncommon freed-memory access during state validation

2017-11-08 Thread Bruce Cherniak
State validation is performed during clear and draw calls. Validation during clear was still accessing vertex buffer state. When the currently set vertex buffers are client arrays, this could lead to accessing freed memory. Such is the case with the VMD application. Previously, vertex buffer

[Mesa-dev] [PATCH 1/2] st/mesa: only try to create 1x msaa surfaces for "fake" msaa drivers

2017-08-25 Thread Bruce Cherniak
From: Brian Paul For software drivers where we want "fake" msaa support for GL 3.x, we treat 1 sample as being msaa. For drivers with real msaa support, start format probing at 2x msaa. For drivers with fake msaa support, start format probing at 1x msaa. This also tweaks the

[Mesa-dev] [PATCH 2/2] swr: Report format max_samples=1 to maintain support for "fake" msaa.

2017-08-25 Thread Bruce Cherniak
Accompanying patch "st/mesa: only try to create 1x msaa surfaces for 'fake' msaa" requires driver to report max_samples=1 to enable "fake" msaa. Previously, 0 and 1 were treated equivalently in st_init_extensions() and either could enable "fake" msaa. This patch raises the swr default

[Mesa-dev] [PATCH] st/mesa: add osmesa framebuffer iface hash table per st manager

2017-08-02 Thread Bruce Cherniak
Commit bbc29393d3 didn't include osmesa state_tracker. This patch adds necessary initialization. Fixes crash in OSMesa initialization. Created-by: Charmaine Lee <charmai...@vmware.com> Tested-by: Bruce Cherniak <bruce.chern...@intel.com> Cc: Charmaine Lee <charmai...@vmware.com>

[Mesa-dev] [PATCH] st/mesa: add osmesa framebuffer iface hash table per st manager

2017-08-02 Thread Bruce Cherniak
Commit bbc29393d3 didn't include osmesa state_tracker. This patch adds necessary initialization. Fixes crash in OSMesa initialization. Created-by: Charmaine Lee <charmai...@vmware.com> Tested-by: Bruce Cherniak <bruce.chern...@intel.com> Cc: 17.2 <mesa-sta...@lists.freedeskto

[Mesa-dev] [PATCH v2 3/3] swr: Add path to draw directly from client memory without copy.

2017-07-12 Thread Bruce Cherniak
If size of client memory copy is too large, don't copy. The draw will access user-buffer directly and then block. This is faster and more efficient than queuing many large client draws. Applications that still use large client arrays benefit from this. VMD is an example. The threshold for this

[Mesa-dev] [PATCH v2 1/3] swr: Remove hard-coded constant and "todo" comment.

2017-07-12 Thread Bruce Cherniak
Removed the hard-coded constant in favor of a #define. Also removed TODO comment. The constant value doesn't need an environment configurable option. --- src/gallium/drivers/swr/swr_scratch.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git

[Mesa-dev] [PATCH v2 2/3] swr: Move environment config options into separate function.

2017-07-12 Thread Bruce Cherniak
Moved reading of environment config options out of swr_create_screen_internal, into a separate swr_validate_env_options. This is to keep from cluttering create_screen. --- src/gallium/drivers/swr/swr_screen.cpp | 60 +++--- 1 file changed, 34 insertions(+), 26

[Mesa-dev] [PATCH v2 0/3] swr: Optimize large draws from client arrays.

2017-07-12 Thread Bruce Cherniak
for this path defaults to 32KB. This value can be overridden by setting environment variable SWR_CLIENT_COPY_LIMIT. v2: Use #define for default value, rather than hard-coded constant. Bruce Cherniak (3): swr: Remove hard-coded constant and "todo" comment. swr: Move environment conf
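
A rough sketch of how such a threshold might be read and applied is given below. Only the environment variable name and the 32KB default come from the cover letter. The helper names, the assumption that the value is given in bytes, and the inclusive comparison are all hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>

// 32KB default per the cover letter; the constant name is hypothetical.
constexpr std::size_t kDefaultClientCopyLimit = 32 * 1024;

// Parse an override string, falling back to the default when the text is
// absent or is not a complete number.
inline std::size_t parse_copy_limit(const char *text)
{
    if (!text || !*text)
        return kDefaultClientCopyLimit;
    char *end = nullptr;
    unsigned long long v = std::strtoull(text, &end, 0);
    if (end == text || *end != '\0')
        return kDefaultClientCopyLimit;
    return static_cast<std::size_t>(v);
}

// Read the limit from the environment (assumed here to be in bytes).
inline std::size_t client_copy_limit()
{
    return parse_copy_limit(std::getenv("SWR_CLIENT_COPY_LIMIT"));
}

// Small client arrays are copied; larger ones would be drawn in place.
inline bool should_copy_client_memory(std::size_t bytes, std::size_t limit)
{
    return bytes <= limit;
}
```

Under these assumptions, running with `SWR_CLIENT_COPY_LIMIT=65536` would raise the copy threshold to 64KB.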

[Mesa-dev] [PATCH 3/3] swr: Add path to draw directly from client memory without copy.

2017-07-11 Thread Bruce Cherniak
If size of client memory copy is too large, don't copy. The draw will access user-buffer directly and then block. This is faster and more efficient than queuing many large client draws. Applications that use large draws from client arrays benefit from this. VMD is an example. The threshold for

[Mesa-dev] [PATCH 0/3] swr: Optimize large draws from client arrays.

2017-07-11 Thread Bruce Cherniak
for this path defaults to 32KB. This value can be overridden by setting environment variable SWR_CLIENT_COPY_LIMIT. Bruce Cherniak (3): swr: Remove hard-coded constant and "todo" comment. swr: Move environment config options into separate function. swr: Add path to draw directly f

[Mesa-dev] [PATCH 2/3] swr: Move environment config options into separate function.

2017-07-11 Thread Bruce Cherniak
Moved reading of environment config options out of swr_create_screen_internal, into a separate swr_validate_env_options. This is to keep from cluttering create_screen. --- src/gallium/drivers/swr/swr_screen.cpp | 60 +++--- 1 file changed, 34 insertions(+), 26

[Mesa-dev] [PATCH 1/3] swr: Remove hard-coded constant and "todo" comment.

2017-07-11 Thread Bruce Cherniak
Removed the hard-coded constant in favor of a #define. Also removed TODO comment; the constant value doesn't need an environment configurable option. --- src/gallium/drivers/swr/swr_scratch.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git

[Mesa-dev] [PATCH] swr: Limit memory held by defer deleted resources.

2017-06-30 Thread Bruce Cherniak
This patch limits the number of items on the fence work queue (the deferred deletion list) by submitting a sync fence when the queue size exceeds a threshold. This initiates deferred deletion of all resources on the list and decreases the total amount of memory held waiting for "deferred
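
The threshold behaviour described above might look roughly like the sketch below. `DeferredQueue` and its members are hypothetical, and for simplicity the "sync fence" here completes immediately and runs the queued deleters inline, which is not how an asynchronous driver would behave.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of a deferred deletion list with a size limit.
struct DeferredQueue {
    std::vector<std::function<void()>> work;  // pending deleters
    std::size_t threshold;                    // maximum items held
    std::size_t syncs_submitted = 0;          // how often the limit was hit

    explicit DeferredQueue(std::size_t t) : threshold(t) {}

    // Queue a deleter; once the list grows past the threshold, force a sync
    // so the memory held by pending deletions stays bounded.
    void add(std::function<void()> deleter)
    {
        work.push_back(std::move(deleter));
        if (work.size() > threshold)
            submit_sync();
    }

    // Stand-in for submitting a sync fence: runs and drops all queued work.
    void submit_sync()
    {
        ++syncs_submitted;
        for (auto &w : work)
            w();
        work.clear();
    }
};
```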

[Mesa-dev] [PATCH] swr: Minor cleanup of variable usage, no functional change.

2017-06-29 Thread Bruce Cherniak
In swr_update_derived, for consistency, index buffer validation should be using the p_draw_info copy "info" rather than referencing p_draw_info. No functional change. --- src/gallium/drivers/swr/swr_state.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git

[Mesa-dev] [PATCH] swr: Remove need to allocate vertex buffer scratch space all in one go.

2017-06-28 Thread Bruce Cherniak
Deferred deletion (via "fence_work") has obsoleted the need to allocate all client vertex buffer scratch space in a single chunk. Scratch allocations are now valid until the referenced fence is complete. --- src/gallium/drivers/swr/swr_state.cpp | 25 ++--- 1 file changed, 2

[Mesa-dev] [PATCH] swr: conditionally validate vertex buffer state

2017-06-27 Thread Bruce Cherniak
Vertex buffer state doesn't need to be validated on every call, only on dirty _NEW_VERTEX or indexed draws. Unconditional validation was introduced as part of patch 330d0607ed6, "remove pipe_index_buffer and set_index_buffer", with the expectation we'd optimize later. ---

[Mesa-dev] [PATCH] swr: set an explicit clear_rect if scissor is not enabled.

2017-06-26 Thread Bruce Cherniak
Fix regression of "no rendering" on simple apps like glxgears by setting an explicit full surface clear_rect when scissor is not enabled. This regressed with commit 00173d91 "st/mesa: don't set 16 scissors and 16 viewports if they're unused" due to an assumption that a default scissor rect is

[Mesa-dev] [PATCH v2] swr: Don't crash when encountering a VBO with stride = 0.

2017-06-15 Thread Bruce Cherniak
The swr driver uses vertex_buffer->stride to determine the number of elements in a VBO. A recent change to the state-tracker made VBOs with stride=0 possible. This resulted in a divide-by-zero crash in the driver. The solution is to use the pre-calculated vertex element stream_pitch in
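
The failure mode can be shown with a small guard. This is only a sketch: `element_count` is a hypothetical helper, `pitch_bytes` stands in for the pre-calculated stream pitch, and treating a zero pitch as "one element, re-read for every vertex" is an assumption rather than something the truncated message confirms.

```cpp
#include <cstdint>

// Number of whole elements in a buffer of `size_bytes`, stepping by
// `pitch_bytes`. A zero pitch must never reach the division.
inline uint32_t element_count(uint32_t size_bytes, uint32_t pitch_bytes)
{
    if (pitch_bytes == 0) {
        // Assumption: with no stepping, the same element is reused.
        return size_bytes ? 1u : 0u;
    }
    return size_bytes / pitch_bytes;
}
```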

[Mesa-dev] [PATCH] swr: Don't crash when encountering a VBO with stride = 0.

2017-06-13 Thread Bruce Cherniak
The swr driver uses vertex_buffer->stride to determine the number of elements in a VBO. A recent change to the state-tracker made VBOs with stride=0 possible. This resulted in a divide-by-zero crash in the driver. The solution is to use the pre-calculated vertex element stream_pitch in

[Mesa-dev] [PATCH v3] swr: move msaa resolve to generalized StoreTile

2017-05-04 Thread Bruce Cherniak
v3: list piglit tests fixed by this patch. Fixed typo Tim pointed out. v2: Reword commit message to more closely adhere to community guidelines. This patch moves msaa resolve down into core/StoreTiles where the surface format conversion routines are available. The previous "experimental" resolve

[Mesa-dev] [PATCH v2] swr: move msaa resolve to generalized StoreTile

2017-04-27 Thread Bruce Cherniak
v2: Reword commit message to more closely adhere to community guidelines. This patch moves msaa resolve down into core/StoreTiles where the surface format conversion routines are available. The previous "experimental" resolve was limited to 8-bit unsigned render targets. This fixes a number of

[Mesa-dev] [PATCH] swr: MSAA fixes: piglit crashes, additional formats, improve perf.

2017-04-26 Thread Bruce Cherniak
This patch moves msaa resolve down into core/StoreTiles where it can take advantage of all the surface formats - previous resolve was limited to 8-bit unsigned. This fixes a number of piglit msaa tests that were crashing. MSAA performance is also greatly improved because resolve is done in

[Mesa-dev] [PATCH] swr: Enable MSAA in OpenSWR software renderer

2017-04-13 Thread Bruce Cherniak
This patch enables multisample antialiasing in the OpenSWR software renderer. MSAA is a proof-of-concept/work-in-progress with bug fixes and performance on the way. We wanted to get the changes out now to allow several customers to begin experimenting with MSAA in a software renderer. So as not

[Mesa-dev] [PATCH] swr: Removed unnecessary PIPE_BIND flags from swr_is_format_supported

2017-04-12 Thread Bruce Cherniak
Removed unnecessary and probably wrong PIPE_BIND_SCANOUT and PIPE_BIND_SHARED flags in favor of check on single PIPE_BIND_DISPLAY_TARGET flag. Reference llvmpipe change --- src/gallium/drivers/swr/swr_screen.cpp | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git

[Mesa-dev] [PATCH] swr: Align swr_context allocation to SIMD alignment.

2017-04-12 Thread Bruce Cherniak
The context now contains SIMD vectors which must be aligned (specifically samplePositions in the rastState in the derived state). Failure to align can result in segv crash on unaligned memory access in vector instructions. --- src/gallium/drivers/swr/swr_context.cpp | 7 +-- 1 file changed,
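
As an illustration of the alignment requirement, the sketch below allocates a structure containing a SIMD-aligned member from suitably aligned storage. `Context`, `context_create`, `context_destroy` and the 32-byte alignment are assumptions made for the example. The driver's own allocation helper is not shown in the truncated message, so POSIX `posix_memalign` is used here instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Assumed alignment for 256-bit vectors; the name is hypothetical.
constexpr std::size_t kSimdAlign = 32;

// Hypothetical context holding data that vector instructions will load.
struct Context {
    alignas(kSimdAlign) float sample_positions[8];
    int other_state;
};

// Allocate storage with the required alignment, then construct in place.
// A plain malloc only guarantees fundamental alignment, which can be too
// small for aligned vector loads.
inline Context *context_create()
{
    void *mem = nullptr;
    if (posix_memalign(&mem, kSimdAlign, sizeof(Context)) != 0)
        return nullptr;
    return new (mem) Context();
}

inline void context_destroy(Context *ctx)
{
    if (!ctx)
        return;
    ctx->~Context();
    free(ctx);
}
```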

[Mesa-dev] [PATCH] st/glx: Add awareness for multisample pixel formats to st/glx-xlib.

2017-04-07 Thread Bruce Cherniak
In preparation for enabling MSAA in OpenSWR, the state trackers need to be aware of multisample pixel formats for software renderers. This patch allows glx-xlib to query the renderer for support of pixel formats with multisample, and create multisample resources. This change is benign to

[Mesa-dev] [PATCH] swr: Fix crash in swr_update_derived following st/mesa state changes.

2017-03-01 Thread Bruce Cherniak
Recent change to st/mesa state update logic caused major regressions to swr validation code. swr uses the same validation logic (swr_update_derived) for both draw and Clear calls. New st/mesa state update logic results in certain state objects not being set/bound during Clear. This was causing

[Mesa-dev] [PATCH] docs: update features.txt for GL_ARB_clear_texture with swr

2017-02-25 Thread Bruce Cherniak
--- docs/features.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/features.txt b/docs/features.txt index d9528e9..c42581a 100644 --- a/docs/features.txt +++ b/docs/features.txt @@ -192,7 +192,7 @@ GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi

[Mesa-dev] [PATCH] swr: enable clear_texture with util_clear_texture

2017-02-25 Thread Bruce Cherniak
Passes corresponding piglit tests. --- src/gallium/drivers/swr/swr_context.cpp | 1 + src/gallium/drivers/swr/swr_screen.cpp | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/swr_context.cpp b/src/gallium/drivers/swr/swr_context.cpp index

[Mesa-dev] [PATCH] swr: [rasterizer core] Removed unused clip code.

2017-02-03 Thread Bruce Cherniak
Removed unused Clip() and FRUSTUM_CLIP_MASK define. --- src/gallium/drivers/swr/rasterizer/core/clip.cpp | 22 -- src/gallium/drivers/swr/rasterizer/core/clip.h | 4 2 files changed, 26 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/clip.cpp

[Mesa-dev] [PATCH v2] swr: [rasterizer core] Remove dead code Clipper::ClipScalar()

2017-02-02 Thread Bruce Cherniak
v2: includes bugzilla reference, same code change Clipper::ClipScalar() is dead code and should be removed. It is causing an error with gcc-7 because it references a now defunct member. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99633 CC: "13.0 17.0"

[Mesa-dev] [PATCH] swr: [rasterizer core] Remove dead code Clipper::ClipScalar()

2017-02-02 Thread Bruce Cherniak
Clipper::ClipScalar() is dead code and should be removed. It is causing an error with gcc-7 because it references a now defunct member. CC: "13.0 17.0" --- src/gallium/drivers/swr/rasterizer/core/clip.h | 39 -- 1 file changed, 39

[Mesa-dev] [PATCH] gallium: Reduce trace_dump_box_bytes size by box->x.

2017-02-01 Thread Bruce Cherniak
If stride is supplied (as either stride or slice_stride), trace_dump_box_bytes will try to read stride bytes, regardless of whether the start address is offset by box->x. This causes access outside the mapped region, and a possible segv. (transfer_map stride and layer_stride are not adjusted for box

[Mesa-dev] [PATCH] swr: Prune empty nodes in CalculateProcessorTopology.

2017-01-19 Thread Bruce Cherniak
CalculateProcessorTopology tries to figure out system topology by parsing /proc/cpuinfo to determine the number of threads, cores, and NUMA nodes. There are some architectures where the "physical id" begins with 1 rather than 0, which was creating an empty "0" node and causing a crash in
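
Pruning of this kind can be sketched with the erase-remove idiom. The `Node` and `Core` types and `prune_empty_nodes` are hypothetical stand-ins. The snippet does not show how the actual patch removes the empty node.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical topology types: a node holds cores, a core holds thread ids.
struct Core {
    std::vector<unsigned> thread_ids;
};

struct Node {
    std::vector<Core> cores;
};

// Drop nodes that ended up with no cores, e.g. a node "0" that was created
// only because "physical id" numbering started at 1.
inline void prune_empty_nodes(std::vector<Node> &nodes)
{
    nodes.erase(std::remove_if(nodes.begin(), nodes.end(),
                               [](const Node &n) { return n.cores.empty(); }),
                nodes.end());
}
```

After pruning, code that walks the node list no longer has to special-case a node without cores.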

[Mesa-dev] [PATCH] swr: Fix BugID 9919 compile error (icc-only).

2016-12-22 Thread Bruce Cherniak
ICC doesn't like the use of nullptr (std::nullptr_t) argument in p_atomic_set. GCC and clang don't complain. --- src/gallium/drivers/swr/swr_fence_work.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/swr_fence_work.cpp

[Mesa-dev] [PATCH] swr: Implement fence attached work queues for deferred deletion.

2016-12-12 Thread Bruce Cherniak
Work can now be added to fences and triggered by fence completion. This allows for deferred resource deletion, and other asynchronous tasks. --- src/gallium/drivers/swr/Makefile.sources | 2 + src/gallium/drivers/swr/swr_context.cpp| 7 +- src/gallium/drivers/swr/swr_fence.cpp |

[Mesa-dev] [PATCH] swr: Fix active_queries count

2016-12-01 Thread Bruce Cherniak
The active_query count was incorrect for query types that don't require a begin_query. Removed the unnecessary assert. --- src/gallium/drivers/swr/swr_query.cpp | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/swr/swr_query.cpp

[Mesa-dev] [PATCH] swr: Removed stalling SwrWaitForIdle from queries.

2016-09-27 Thread Bruce Cherniak
Previous fundamental change in stats gathering added a temporary SwrWaitForIdle to begin_query and end_query. Code has been reworked to remove stall. --- src/gallium/drivers/swr/swr_context.cpp | 33 +++ src/gallium/drivers/swr/swr_context.h | 11 ++-

[Mesa-dev] [PATCH v2] swr: Update screen->context pointer with multiple contexts.

2016-06-17 Thread Bruce Cherniak
A pipe pointer in the screen allows for access to current device context in flush_frontbuffer and resource_destroy. This wasn't tracking current context in multi-context situations. v2: More caffeine. Corrected compare, removed unnecessary set of screen-pipe in create_context, and added a few

[Mesa-dev] [PATCH] swr: Update screen->context pointer with multiple contexts.

2016-06-17 Thread Bruce Cherniak
A pipe pointer in the screen allows for access to current device context in flush_frontbuffer and resource_destroy. This wasn't tracking current context in multi-context situations. --- src/gallium/drivers/swr/swr_context.cpp |6 -- src/gallium/drivers/swr/swr_state.cpp |4 2

[Mesa-dev] [PATCH] swr: [rasterizer] Correctly select optimized primitive assembly.

2016-05-24 Thread Bruce Cherniak
Indexed primitives were always using cut-aware primitive assembly, whether primitive_restart was enabled or not. Correctly pass down primitive_restart and select optimized PA when possible. --- src/gallium/drivers/swr/rasterizer/core/api.cpp|2 ++