[Mesa-dev] [Bug 80243] Mesa and libclc build failure after llvm = llvm-3.5svn r211259
https://bugs.freedesktop.org/show_bug.cgi?id=80243 Iaroslav Andrusyak pontost...@gmail.com changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #2 from Iaroslav Andrusyak pontost...@gmail.com --- fixed by 564821c917f4a9d5a0de2ee77b90b0cd85e3d3a6 -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Testing OpenCL with gegl
Hmm. I get GEGL_DEBUG=opencl ./run-compositions.py apply-lens.xml GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:508: libOpenCL.so: 'clCreateFromGLTexture2D': /usr/lib/x86_64-linux-gnu/libOpenCL.so: undefined symbol: clCreateFromGLTexture2D GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:636: OpenCL is disabled GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:636: OpenCL is disabled /media/bigdata/compile/gegl/tests/compositions/reference/apply-lens.png and /media/bigdata/compile/gegl/tests/compositions/output/apply-lens.png are identical PASS apply-lens.xml SKIP apply-lens.xml (OpenCL) === Test Results === tests passed: 1 tests skipped: 1 tests failed: 0 == PASS == I have built llvm, kernel,mesa from git with opencl on (A8-6500 APU, ARUBA). Simple opencl tests work. Using gimp (built from git) i have the same undefined symbol error in the console and all operations use 1 CPU core, the GPU doesnt seem to be used at all. If i make all tests, all opencl tests are skipped and one (non-opencl i suppose) even fails (gegl.xml). Any ideas how i could use opencl? -- O zi buna, Kertesz Laszlo ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/11] Makes DRI3 use invalidate.
On Thu, Jun 19, 2014 at 1:27 PM, Axel Davy axel.d...@ens.fr wrote: This doesn't change anything to the intel DRI3 implementation, but enables the gallium implementation to use dri2.stamp instead of relying on the stamp shared with the st backend. Signed-off-by: Axel Davy axel.d...@ens.fr Seems to fix the same issue as the patch in Keith's tree was fixing for me. Reviewed-by: Ben Skeggs bske...@redhat.com --- src/glx/dri3_glx.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index 55eed39..b309cd4 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -1308,9 +1308,14 @@ static const __DRIimageLoaderExtension imageLoaderExtension = { .flushFrontBuffer= dri3_flush_front_buffer, }; +const __DRIuseInvalidateExtension dri3UseInvalidate = { + .base = { __DRI_USE_INVALIDATE, 1 } +}; + static const __DRIextension *loader_extensions[] = { imageLoaderExtension.base, systemTimeExtension.base, + dri3UseInvalidate.base, NULL }; @@ -1384,6 +1389,8 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, ++(*priv-stamp); } + (*psc-f-invalidate)(priv-driDrawable); + return ret; } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] scons: avoid building any piece of i915
On Thu, Jun 19, 2014 at 9:02 PM, Emil Velikov emil.l.veli...@gmail.com wrote: Leftover from commit c21fca8bf24. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/gallium/SConscript| 11 --- src/gallium/drivers/i915/SConscript | 12 src/gallium/winsys/i915/sw/SConscript | 12 3 files changed, 35 deletions(-) delete mode 100644 src/gallium/drivers/i915/SConscript delete mode 100644 src/gallium/winsys/i915/sw/SConscript Reviewed-by: Jakob Bornecrantz wallbra...@gmail.com diff --git a/src/gallium/SConscript b/src/gallium/SConscript index d6ca993..2ce8a46 100644 --- a/src/gallium/SConscript +++ b/src/gallium/SConscript @@ -21,12 +21,6 @@ SConscript([ 'drivers/trace/SConscript', ]) -if not env['msvc']: -# These drivers do not build on MSVC compilers -SConscript([ -'drivers/i915/SConscript', -]) - # # State trackers # @@ -62,11 +56,6 @@ if env['platform'] == 'windows': 'winsys/sw/gdi/SConscript', ]) -if not env['msvc']: -SConscript([ -'winsys/i915/sw/SConscript', -]) - if env['platform'] == 'haiku': SConscript([ 'winsys/sw/hgl/SConscript', diff --git a/src/gallium/drivers/i915/SConscript b/src/gallium/drivers/i915/SConscript deleted file mode 100644 index 22de67d..000 --- a/src/gallium/drivers/i915/SConscript +++ /dev/null @@ -1,12 +0,0 @@ -Import('*') - -env = env.Clone() - -i915 = env.ConvenienceLibrary( - target = 'i915', - source = env.ParseSourceList('Makefile.sources', 'C_SOURCES') - ) - -env.Alias('i915', i915) - -Export('i915') diff --git a/src/gallium/winsys/i915/sw/SConscript b/src/gallium/winsys/i915/sw/SConscript deleted file mode 100644 index 9d78519..000 --- a/src/gallium/winsys/i915/sw/SConscript +++ /dev/null @@ -1,12 +0,0 @@ -Import('*') - -env = env.Clone() - -i915_sw_sources = env.ParseSourceList('Makefile.sources', 'C_SOURCES') - -i915sw = env.ConvenienceLibrary( -target ='i915sw', -source = i915_sw_sources, -) - -Export('i915sw') -- 2.0.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 80266] Many instances of 131, which is undefined in C99
https://bugs.freedesktop.org/show_bug.cgi?id=80266 --- Comment #12 from Vittorio zec...@gmail.com --- In texstore.c lines 1281 and 1340 255 24 is computed in PACK_COLOR_. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Testing OpenCL with gegl
Hi, Have you built gegl with the patches that Jan sent to the mailing list? You can find them here: http://lists.freedesktop.org/archives/mesa-dev/2014-June/061880.html Hope it helps! Bruno On Sat, 2014-06-21 at 09:17 +0300, Kertesz Laszlo wrote: Hmm. I get GEGL_DEBUG=opencl ./run-compositions.py apply-lens.xml GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:508: libOpenCL.so: 'clCreateFromGLTexture2D': /usr/lib/x86_64-linux-gnu/libOpenCL.so: undefined symbol: clCreateFromGLTexture2D GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:636: OpenCL is disabled GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:636: OpenCL is disabled /media/bigdata/compile/gegl/tests/compositions/reference/apply-lens.png and /media/bigdata/compile/gegl/tests/compositions/output/apply-lens.png are identical PASS apply-lens.xml SKIP apply-lens.xml (OpenCL) === Test Results === tests passed: 1 tests skipped: 1 tests failed: 0 == PASS == I have built llvm, kernel,mesa from git with opencl on (A8-6500 APU, ARUBA). Simple opencl tests work. Using gimp (built from git) i have the same undefined symbol error in the console and all operations use 1 CPU core, the GPU doesnt seem to be used at all. If i make all tests, all opencl tests are skipped and one (non-opencl i suppose) even fails (gegl.xml). Any ideas how i could use opencl? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 01/23] glsl: Add parsing support for multi-stream output in geometry shaders.
On Fri, Jun 20, 2014 at 11:17 PM, Timothy Arceri t_arc...@yahoo.com.au wrote: On Fri, 2014-06-20 at 10:46 -0700, Ian Romanick wrote: I care a lot less about what is often done than I do about what should be done. If there is an argument to be made that stand-alone comments (not on a line with other code) are better, that would be good data to have. Well I'm not sure how common a use case this is for Mesa devs but when doing any coding on a laptop the less wasted lines the better. Especially when the aspect ratio of all new laptops is suited to watching dvds rather than any kind of document viewing/editing. I personally do most of my Mesa contributions on the train travelling to and from work. For the same reason I've always wondered why the preferred width of lines is 78 columns? Is this just a historical thing? Or does it make sense to use this restriction e.g for debugging in the terminal? The devinfo.html doesn't give an actual reason, and 78 columns seems rather small considering the size and resolutions of modern screens. I think a shorter line limit is *especially* important in the age of modern screens that are way too wide. If you were to, say, double the line length to 160, some lines would be very long but most would remain short. So you would lose code density (half the window would be empty), and you would lose the ability to have more windows side-by-side. Also, it so happens that 2 80-char windows fit *perfectly* side-by-side on a rotated 24 monitor (1200x1920). -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/23] Megadrivers galore
On 21/06/14 01:30, Thomas Helland wrote: Hi again, Autogen seems to work flawlessly with the updated branch. Everything builds and compiles as normal on my Ivy Bridge setup. When trying to do a make install though, something fails. I'm not sure if it's related to your changes, but at least master gets further into the process before failing due to LLVM incompatibility when trying to install libxatracker I'm not sure what llvm has to do with libxatracker. I'm suspecting that 564821c917f will help on that regard. make[3]: Entering directory '/home/helland/Mesa/src/gallium/targets/vdpau' CXXLDlibvdpau_gallium.la .libs/libvdpau_gallium_la-target.o: In function `pipe_r600_create_screen': /home/helland/Mesa/src/gallium/targets/vdpau/../../../../src/gallium/auxiliary/target-helpers/inline_drm_helper.h:178: undefined reference to `radeon_drm_winsys_create' collect2: error: ld returned 1 exit status Makefile:729: recipe for target 'libvdpau_gallium.la' failed make[3]: *** [libvdpau_gallium.la] Error 1 make[3]: Leaving directory '/home/helland/Mesa/src/gallium/targets/vdpau' Makefile:524: recipe for target 'install-recursive' failed make[2]: *** [install-recursive] Error 1 make[2]: Leaving directory '/home/helland/Mesa/src/gallium/targets' Makefile:530: recipe for target 'install-recursive' failed make[1]: *** [install-recursive] Error 1 make[1]: Leaving directory '/home/helland/Mesa/src' Makefile:579: recipe for target 'install-recursive' failed make: *** [install-recursive] Error 1 As soon as I saw this I realised what Christian meant while going through the radeon patches. Cheers for that. The updated series (and rebased on top of master) can be found at branch static-or-shared-pipe-drivers-v3 in the usual place. Thanks again Emil I might just be doing something wrong here though. I'm not really that familiar with build-systems yet. (I'm spoiled with having started my programming career in java) Cheers, Thomas ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/13] mesa: Group gl_system_value values by the stage where they exist
Am 21.06.2014 03:00, schrieb Ian Romanick: From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: Ilia Mirkin imir...@alum.mit.edu Cc: Marek Olšák marek.ol...@amd.com Cc: Roland Scheidegger srol...@vmware.com Cc: 10.2 mesa-sta...@lists.freedesktop.org --- src/mesa/main/mtypes.h | 32 +++--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +-- 2 files changed, 35 insertions(+), 9 deletions(-) diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 8b7ee30..3899e7f 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2035,13 +2035,31 @@ typedef enum */ typedef enum { - SYSTEM_VALUE_FRONT_FACE, /** Fragment shader only (not done yet) */ - SYSTEM_VALUE_VERTEX_ID, /** Vertex shader only */ - SYSTEM_VALUE_INSTANCE_ID,/** Vertex shader only */ - SYSTEM_VALUE_SAMPLE_ID, /** Fragment shader only */ - SYSTEM_VALUE_SAMPLE_POS, /** Fragment shader only */ - SYSTEM_VALUE_SAMPLE_MASK_IN, /** Fragment shader only */ - SYSTEM_VALUE_INVOCATION_ID, /** Geometry shader only */ + /** +* \name Vertex shader system values +*/ + /*@{*/ + SYSTEM_VALUE_VERTEX_ID, + SYSTEM_VALUE_INSTANCE_ID, + /*@}*/ + + /** +* \name Geometry shader system values +*/ + /*@{*/ + SYSTEM_VALUE_INVOCATION_ID, + /*@}*/ + + /** +* \name Fragment shader system values +*/ + /*@{*/ + SYSTEM_VALUE_FRONT_FACE, /** (not done yet) */ + SYSTEM_VALUE_SAMPLE_ID, + SYSTEM_VALUE_SAMPLE_POS, + SYSTEM_VALUE_SAMPLE_MASK_IN, + /*@}*/ + SYSTEM_VALUE_MAX /** Number of values */ } gl_system_value; diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index fbfbea6..fa95493 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -4236,13 +4236,21 @@ struct st_translate { /** Map Mesa's SYSTEM_VALUE_x to TGSI_SEMANTIC_x */ const unsigned _mesa_sysval_to_semantic[SYSTEM_VALUE_MAX] = { - TGSI_SEMANTIC_FACE, + /* Vertex shader +*/ TGSI_SEMANTIC_VERTEXID, TGSI_SEMANTIC_INSTANCEID, + + /* Geometry shader +*/ + TGSI_SEMANTIC_INVOCATIONID, + + /* Fragment shader +*/ + TGSI_SEMANTIC_FACE, TGSI_SEMANTIC_SAMPLEID, TGSI_SEMANTIC_SAMPLEPOS, TGSI_SEMANTIC_SAMPLEMASK, - TGSI_SEMANTIC_INVOCATIONID, }; /** Looks good to me (though it does mean you have to add new values in the middle in the future if you group them by stage). 1-4 are Reviewed-by: Roland Scheidegger srol...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/13] Fix gl_VertexID on i965
Am 21.06.2014 03:00, schrieb Ian Romanick: This patch series fixes bugs in the i965 w.r.t. several uses of gl_VertexID. OpenGL (desktop and ES) have the following expectations of gl_VertexID: 1. When used with BaseVertex drawing commands, gl_VertexID will include the value of basevertex. This differens from the other API, but the change in OpenGL was based on feedback from application developers. This only affects OpenGL 3.2+. I don't question this, but note some binary drivers may not agree. There are unforunately lots of man pages around of glDrawElementsInstancedBaseVertexBaseInstance too which state: The basevertex has no effect on the shader-visible value of gl_VertexID. There's a bug filed about this (https://www.khronos.org/bugzilla/show_bug.cgi?id=932) which is still open though maybe it has all been resolved... (I'll just mention that llvmpipe right now will do d3d10-style vertexID though I guess indeed at some point we need to make it either switchable or implement GL_ARB_shader_draw_parameters so it can be worked around easily in the state tracker.) Roland 2. When used with DrawArrays drawing commands, gl_VertexID will count from the 'start' value instead of zero. This affects OpenGL 3.0+ and OpenGL ES 3.0+. The i965 driver botched both of these for slightly different reasons. For #1, our hardware was designed with the other API's semantic in mind, so gl_VertexID doesn't include the basevertex value. For #2, we implement DrawArrays with a non-zero start index by reprogramming the array base offset. As a result, gl_VertexID always counts from zero. I suspect other hardware may suffer from one or both of these issues. I have sent tests to the piglit list to reproduce them. To fix both of these issues, the shader needs to know the basevertex value. A later GL extension, GL_ARB_shader_draw_parameters, adds gl_BaseVertex and gl_BaseInstance as built-in variables. I believe some hardware implements these as system values much like gl_VertexID. I started this series with the assumption that we could have a SYSTEM_VALUE_BASE_VERTEX that might come from a uniform. In the end, I couldn't make that work. I left patch 7 in the series, but we may want to remove it. I added a STATE_BASE_VERTEX uniform instead. There is now a lowering pass that converts gl_VertexID to gl_VertexIDMESA + gl_BaseVertex (backed by a uniform). Some of these names should probably be changed. I have also contemplated adding an extension that exposes the other semantic for gl_VertexID for applications that actually want that. This is primarily things that are porting content (or bridging content) from the other API. That, however, can happen later. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-devk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=E56BlL24I31lr1fFtEsKBJjAprQ%2BQdhUczD1EsQ3dIY%3D%0As=2634834afd3400c841ab62f11a9b5350d604592529cfea97f2297c9398c0fd93 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug
Bruno Jiménez brunoji...@gmail.com writes: Hi, This is my second attempt to fix the mapping bug adding all the suggestions that Tom Stellard sent, and, so far, it seems that it is resolved. This series changes completely how OpenCL buffers are handled by the r600g driver. Before this, we would add them directly to a pool, and this pool would grow whenever we needed more space. But this process implied destroying the pool and creating a new one. There could be cases where a buffer would be mapped and the pool would grow, leaving one side of the mapping pointed to where the item was. This is the 'mapping bug' Now, Items will have an intermediate resource, where all mappings can be done, and when a buffer is going to be used with a kernel it is promoted to the pool. In the case where a promoted item is going to be mapped, it is previously demoted, so even if the pool changes its location due to growing, the map remains valid. In the case of a buffer mapped for reading, and used by a kernel to read from it, we will duplicate this buffer, having the intermediate buffer, where the user has its map, and an item in the pool, which is the one that the kernel is going to use. As a summary for v2: Patches 1-8: These are the main part of the series, and solve the mapping bug. Patches 1 and 7 now use less explicit castings Patch 2 is new and introduces the 'is_item_in_pool' function, which is used in patches 3 and 8 Patch 9: Is a complete rewrite of v1 patch 8 using gallium utils for double lists Patches 10 and 11: These are just a proof of concept for avoiding transfers GPU - GPU when using all CL Read/Write functions. They are v1 patch 9 splited in two to separate r600g changes from clover changes. Now, in clover's side it introduces and uses 'CLOVER_TRANSFER_MAP_DIRECTLY' so it doesen't collide with any other OpenCL flag. Please review and Thanks :) Bruno Jiménez (11): r600g/compute: Add an intermediate resource for OpenCL buffers r600g/compute: Add an util function to know if an item is in the pool r600g/compute: Add statuses to the compute_memory_items r600g/compute: divide the item list in two r600g/compute: Only move to the pool the buffers marked for promoting r600g/compute: Avoid problems when promoting items mapped for reading r600g/compute: Implement compute_memory_demote_item r600g/compute: Map only against intermediate buffers r600g/compute: Use gallium util functions for double lists r600g/compute: Map directly the pool in some cases clover: Use PIPE_TRANSFER_MAP_DIRECTLY when writing/reading buffers src/gallium/drivers/r600/compute_memory_pool.c | 294 - src/gallium/drivers/r600/compute_memory_pool.h | 31 ++- src/gallium/drivers/r600/evergreen_compute.c | 38 ++- src/gallium/state_trackers/clover/api/transfer.cpp | 4 +- src/gallium/state_trackers/clover/core/object.hpp | 4 + .../state_trackers/clover/core/resource.cpp| 2 + 6 files changed, 233 insertions(+), 140 deletions(-) -- 2.0.0 Hi Bruno, I'm sorry for having taken so long to comment on this, I haven't had time to read through your patch series carefully until last night. I'm not convinced that this is going to be a satisfactory solution, it punishes well-behaving applications by duplicating memory usage under some circumstances, and by slowing down buffer mapping with (in most cases) unnecessary shadow copies: Even though using a buffer while mapped is something to keep in mind because it's allowed by the spec with some restrictions, I think it's reasonable to assume that we're only going to see a comparatively small number of active buffer mappings during the execution of any compute kernel, so it seems unfortunate to me that the rest of mappings will have to pay for this one corner case. The implementation of PIPE_TRANSFER_MAP_DIRECTLY introduced in PATCH 10 has somewhat worrying semantics: A mapping with this flag might become stale unpredictably if a kernel is run, maybe from a different command queue. Clover's transfer functions don't hit that path right now on single-threaded applications, but they might in the future as we start accelerating the APIs we currently implement with soft_copy_op(). This is a bug IMHO: even direct mappings should last until the corresponding unmap call. I'm not advocating a revert of the series because it fixes a serious bug, but please don't push patches 10-11, we should probably start looking for a different solution. Some suggestions are: - Why do you even need a pool? Wouldn't it be possible to create a huge RAT, e.g. covering a 4GB portion of the GPU memory and then use a special memory domain or some sort of flag to tell the kernel to allocate a buffer from that region (or relocate if it's already been allocated elsewhere)? This is especially easy on hardware with virtual memory, as
Re: [Mesa-dev] [PATCH 3/5] clover: Report default values for half and double fp configs
Tom Stellard thomas.stell...@amd.com writes: From: Matt Arsenault arse...@gmail.com --- src/gallium/state_trackers/clover/api/device.cpp | 11 +-- src/gallium/state_trackers/clover/core/device.cpp | 24 +++ src/gallium/state_trackers/clover/core/device.hpp | 3 +++ 3 files changed, 36 insertions(+), 2 deletions(-) diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp index 1bc2692..dc8e22c 100644 --- a/src/gallium/state_trackers/clover/api/device.cpp +++ b/src/gallium/state_trackers/clover/api/device.cpp @@ -201,8 +201,15 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, break; case CL_DEVICE_SINGLE_FP_CONFIG: - buf.as_scalarcl_device_fp_config() = - CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; + buf.as_scalarcl_device_fp_config() = dev.single_fp_config(); + break; + + case CL_DEVICE_DOUBLE_FP_CONFIG: + buf.as_scalarcl_device_fp_config() = dev.double_fp_config(); + break; + + case CL_DEVICE_HALF_FP_CONFIG: + buf.as_scalarcl_device_fp_config() = dev.half_fp_config(); break; case CL_DEVICE_GLOBAL_MEM_CACHE_TYPE: diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp index bc3e3e6..6d52dd4 100644 --- a/src/gallium/state_trackers/clover/core/device.cpp +++ b/src/gallium/state_trackers/clover/core/device.cpp @@ -163,6 +163,30 @@ device::max_clock_frequency() const { PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY)[0]; } +cl_device_fp_config +device::single_fp_config() const { + // TODO: Get these from somewhere. + return CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; +} + +cl_device_fp_config +device::double_fp_config() const { + // TODO: Get these from somewhere. This is the mandated minimum double + // precision floating-point capability +return CL_FP_FMA Weird indentation here, otherwise looks good to me. + | CL_FP_ROUND_TO_NEAREST + | CL_FP_ROUND_TO_ZERO + | CL_FP_ROUND_TO_INF + | CL_FP_INF_NAN + | CL_FP_DENORM; +} + +cl_device_fp_config +device::half_fp_config() const { + // TODO: Get these from somewhere. + return CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; +} + std::vectorsize_t device::max_block_size() const { auto v = get_compute_paramuint64_t(pipe, PIPE_COMPUTE_CAP_MAX_BLOCK_SIZE); diff --git a/src/gallium/state_trackers/clover/core/device.hpp b/src/gallium/state_trackers/clover/core/device.hpp index 3662c6b..380029e 100644 --- a/src/gallium/state_trackers/clover/core/device.hpp +++ b/src/gallium/state_trackers/clover/core/device.hpp @@ -62,6 +62,9 @@ namespace clover { size_t max_threads_per_block() const; cl_ulong max_mem_alloc_size() const; cl_uint max_clock_frequency() const; + cl_device_fp_config single_fp_config() const; + cl_device_fp_config double_fp_config() const; + cl_device_fp_config half_fp_config() const; std::vectorsize_t max_block_size() const; std::string device_name() const; -- 1.8.1.4 pgpBM336cJ0NR.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/5] clover: Fix not setting build log if the build succeeds.
Tom Stellard thomas.stell...@amd.com writes: From: Matt Arsenault arse...@gmail.com If there were only warnings, they would not be added to the log. Also fixes valgrind use after free errors. --- src/gallium/state_trackers/clover/core/compiler.hpp | 3 ++- src/gallium/state_trackers/clover/core/error.hpp | 2 +- src/gallium/state_trackers/clover/core/program.cpp| 11 +++ src/gallium/state_trackers/clover/llvm/invocation.cpp | 14 +++--- 4 files changed, 17 insertions(+), 13 deletions(-) diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp b/src/gallium/state_trackers/clover/core/compiler.hpp index 49cd022..3ce132f 100644 --- a/src/gallium/state_trackers/clover/core/compiler.hpp +++ b/src/gallium/state_trackers/clover/core/compiler.hpp @@ -32,7 +32,8 @@ namespace clover { module compile_program_llvm(const compat::string source, pipe_shader_ir ir, const compat::string target, - const compat::string opts); + const compat::string opts, + std::string log_out); This doesn't work. I'm afraid you need to use compat::string on the compiler interface because the C++98 and C++11 versions of std::string are not guaranteed to be binary compatible. This mess will go away once we can drop support for the non-C++11 versions of LLVM. Have a look at the attached patch for the memory management issues with compat::string. And maybe rename the output argument to r_log as is usual everywhere else in clover? module compile_program_tgsi(const compat::string source); } diff --git a/src/gallium/state_trackers/clover/core/error.hpp b/src/gallium/state_trackers/clover/core/error.hpp index 28459f3..9802195 100644 --- a/src/gallium/state_trackers/clover/core/error.hpp +++ b/src/gallium/state_trackers/clover/core/error.hpp @@ -66,7 +66,7 @@ namespace clover { class build_error : public error { public: - build_error(const compat::string log) : + build_error(const compat::string log = ) : Can you rename the argument to what as it's no longer going to hold the compilation log? error(CL_BUILD_PROGRAM_FAILURE, log) { } }; diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp index 3aaa652..91ee553 100644 --- a/src/gallium/state_trackers/clover/core/program.cpp +++ b/src/gallium/state_trackers/clover/core/program.cpp @@ -52,15 +52,18 @@ program::build(const ref_vectordevice devs, const char *opts) { _opts.insert({ dev, opts }); + std::string build_log; + try { auto module = (dev.ir_format() == PIPE_SHADER_IR_TGSI ? compile_program_tgsi(_source) : compile_program_llvm(_source, dev.ir_format(), -dev.ir_target(), build_opts(dev))); +dev.ir_target(), build_opts(dev), +build_log)); _binaries.insert({ dev, module }); - - } catch (build_error e) { -_logs.insert({ dev, e.what() }); +_logs.insert({ dev, build_log }); + } catch (const build_error ) { +_logs.insert({ dev, build_log }); throw; } } diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index 48810bd..0dc1f50 100644 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp @@ -120,12 +120,11 @@ namespace { compile(llvm::LLVMContext llvm_ctx, const std::string source, const std::string name, const std::string triple, const std::string processor, const std::string opts, - clang::LangAS::Map address_spaces) { + clang::LangAS::Map address_spaces, std::string log_out) { clang::CompilerInstance c; clang::EmitLLVMOnlyAction act(llvm_ctx); - std::string log; - llvm::raw_string_ostream s_log(log); + llvm::raw_string_ostream s_log(log_out); std::string libclc_path = LIBCLC_LIBEXECDIR + processor + - + triple + .bc; @@ -220,10 +219,10 @@ namespace { // Compile the code if (!c.ExecuteAction(act)) - throw build_error(log); + throw build_error(); // Get address spaces map to be able to find kernel argument address space - memcpy(address_spaces, c.getTarget().getAddressSpaceMap(), + memcpy(address_spaces, c.getTarget().getAddressSpaceMap(), sizeof(address_spaces));
Re: [Mesa-dev] [PATCH 5/5] clover: Enable cl_khr_fp64 for devices that support doubles
Tom Stellard thomas.stell...@amd.com writes: --- src/gallium/state_trackers/clover/api/device.cpp | 4 +++- src/gallium/state_trackers/clover/core/device.cpp | 6 ++ src/gallium/state_trackers/clover/core/device.hpp | 1 + 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp index dc8e22c..275542d 100644 --- a/src/gallium/state_trackers/clover/api/device.cpp +++ b/src/gallium/state_trackers/clover/api/device.cpp @@ -290,7 +290,9 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, break; case CL_DEVICE_EXTENSIONS: - buf.as_string() = ; + // The trailing space is intentional. It is a spec-ism that there is a + // trailing space at the end of the list of extensions. Hm... Really? Where does the spec say that? + buf.as_string() = dev.cl_khr_fp64() ? cl_khr_fp64 : ; break; case CL_DEVICE_PLATFORM: diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp index 6d52dd4..51b54fa 100644 --- a/src/gallium/state_trackers/clover/core/device.cpp +++ b/src/gallium/state_trackers/clover/core/device.cpp @@ -187,6 +187,12 @@ device::half_fp_config() const { return CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; } +bool +device::cl_khr_fp64() const { + return pipe-get_shader_param(pipe, PIPE_SHADER_COMPUTE, + PIPE_SHADER_CAP_DOUBLES); +} + Can we call this function has_doubles() or something similar? The extension name is somewhat cryptic. Or, I don't know, maybe use 'device::double_fp_config()' to check if doubles are supported? As Bruno said, it should return 0 if doubles are not supported by the implementation. std::vectorsize_t device::max_block_size() const { auto v = get_compute_paramuint64_t(pipe, PIPE_COMPUTE_CAP_MAX_BLOCK_SIZE); diff --git a/src/gallium/state_trackers/clover/core/device.hpp b/src/gallium/state_trackers/clover/core/device.hpp index 380029e..38bea54 100644 --- a/src/gallium/state_trackers/clover/core/device.hpp +++ b/src/gallium/state_trackers/clover/core/device.hpp @@ -65,6 +65,7 @@ namespace clover { cl_device_fp_config single_fp_config() const; cl_device_fp_config double_fp_config() const; cl_device_fp_config half_fp_config() const; + bool cl_khr_fp64() const; std::vectorsize_t max_block_size() const; std::string device_name() const; -- 1.8.1.4 pgpSUyj6s8rWe.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Testing OpenCL with gegl
On 06/21/2014 12:57 PM, Bruno Jimenez wrote: Hi, Have you built gegl with the patches that Jan sent to the mailing list? You can find them here: http://lists.freedesktop.org/archives/mesa-dev/2014-June/061880.html Hope it helps! Bruno Yes it did help. Although some of the tests managed to hang and reset the gpu... Gimp seems to use the gpu for some tests. In Gimp (launched with 'GEGL_DEBUG=opencl GEGL_USE_OPENCL=gpu gimp') i get many errors like: GEGL-gegl-cl-color.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-color.c:243: Missing OpenCL conversion for R'G'B' u8 - RGBA float Are those indications of some error or just debug info? On Sat, 2014-06-21 at 09:17 +0300, Kertesz Laszlo wrote: Hmm. I get GEGL_DEBUG=opencl ./run-compositions.py apply-lens.xml GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:508: libOpenCL.so: 'clCreateFromGLTexture2D': /usr/lib/x86_64-linux-gnu/libOpenCL.so: undefined symbol: clCreateFromGLTexture2D GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:636: OpenCL is disabled GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:636: OpenCL is disabled /media/bigdata/compile/gegl/tests/compositions/reference/apply-lens.png and /media/bigdata/compile/gegl/tests/compositions/output/apply-lens.png are identical PASS apply-lens.xml SKIP apply-lens.xml (OpenCL) === Test Results === tests passed: 1 tests skipped: 1 tests failed: 0 == PASS == I have built llvm, kernel,mesa from git with opencl on (A8-6500 APU, ARUBA). Simple opencl tests work. Using gimp (built from git) i have the same undefined symbol error in the console and all operations use 1 CPU core, the GPU doesnt seem to be used at all. If i make all tests, all opencl tests are skipped and one (non-opencl i suppose) even fails (gegl.xml). Any ideas how i could use opencl? -- O zi buna, Kertesz Laszlo ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Testing OpenCL with gegl
I readded the mesa list again, as the info might be interesting to others. On Sat, 2014-06-21 at 11:50 +0200, Bruno Jimenez wrote: On Fri, 2014-06-20 at 17:01 -0400, Jan Vesely wrote: On Fri, 2014-06-20 at 21:58 +0200, Bruno Jimenez wrote: [snip] I also found a reference to BABL_PATH env var in the babl code, it can provide a list of directories to search for extensions. you can try setting it to proper location. by default it looks in $PREFIX/$LIBDIR/babl-0.1 (set during configure phase), so a symlnk might also be an option This was it! Setting BABL_PATH to the path of the extensions lib has worked. Thanks. Now the tests have started passing, although edge-laplace have hanged completely my gpu and I have had to hard-reboot. Which tests are failing for you? maybe I can just launch them and see if it happens too here. Right now, the failure count is at 4: - clones.xml: r600_blit.c:592:r600_resource_copy_region: Assertion `u_max_sample(dst) == u_max_sample(src)' failed. - gegl.xml: r600_blit.c:592:r600_resource_copy_region: Assertion `u_max_sample(dst) == u_max_sample(src)' failed. - noise-cell.xml: %lsr.iv861 = phi [3 x float]* [ %closest, % land.rhs.i69.preheader ], [ %60, %for.cond24.i64 ] Do not know how to replace this instruction with vector op - stretch-contrast.xml: It used to be only noise-cell.xml and stretch-contrast.xml. noise-cell needs support for large program scope arrays (same reason the second patch I sent you disables color kernels). it's a known problem in both llvm and clover. although the original failure was different from the currently reported one. (might be a new bug) stretch-contrast needs memory fences (libclc and llvm) I started some work on libclc side [0], but it'll need more changes on llvm side. These are known and afaik they affect radeonsi too. [0]http://www.pcc.me.uk/pipermail/libclc-dev/2014-April/000284.html assertion failures in clones.xml and gegl.xml are new and introduced by the opencl mem pool changes. I haven't bisected but '96a95f mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.' works OK. you might want to have a look as you know the code better. I also used the two fixes I sent so that the colors.xml won't hang Sorry for any inconvenience I may be causing. no worries, it's good that someone else is trying to get the thing running on mesa/clover Just for the r600g driver I'm afraid, if it's a radeonsi issue I won't be able to help, neither if it's in LLVM. I'm also on r600, and I believe cypress and turks should use much of the same code paths. so it should be possible for you to get at least the same pass rate as my gpu. I'm sure the amd guys would appreciate bug reports :) regards, Jan Thanks for everything! Bruno Jan Thanks in advance! Bruno regards, Jan [snip] -- Jan Vesely jan.ves...@rutgers.edu signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Testing OpenCL with gegl
On Sat, 2014-06-21 at 20:09 +0300, Kertesz Laszlo wrote: On 06/21/2014 12:57 PM, Bruno Jimenez wrote: Hi, Have you built gegl with the patches that Jan sent to the mailing list? You can find them here: http://lists.freedesktop.org/archives/mesa-dev/2014-June/061880.html Hope it helps! Bruno Yes it did help. Although some of the tests managed to hang and reset the gpu... Gimp seems to use the gpu for some tests. In Gimp (launched with 'GEGL_DEBUG=opencl GEGL_USE_OPENCL=gpu gimp') i get many errors like: GEGL-gegl-cl-color.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-color.c:243: Missing OpenCL conversion for R'G'B' u8 - RGBA float Are those indications of some error or just debug info? These are caused by the second patch that disables use of color conversion kernels. These kernels are used often and the compilation crashes, so I disabled them to avoid false failures. GEGL falls back on cpu path if it can't use an opencl kernel so these don't matter much. regards, Jan On Sat, 2014-06-21 at 09:17 +0300, Kertesz Laszlo wrote: Hmm. I get GEGL_DEBUG=opencl ./run-compositions.py apply-lens.xml GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:508: libOpenCL.so: 'clCreateFromGLTexture2D': /usr/lib/x86_64-linux-gnu/libOpenCL.so: undefined symbol: clCreateFromGLTexture2D GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:636: OpenCL is disabled GEGL-gegl-cl-init.c-Message: [GEGL_DEBUG_OPENCL] gegl-cl-init.c:636: OpenCL is disabled /media/bigdata/compile/gegl/tests/compositions/reference/apply-lens.png and /media/bigdata/compile/gegl/tests/compositions/output/apply-lens.png are identical PASS apply-lens.xml SKIP apply-lens.xml (OpenCL) === Test Results === tests passed: 1 tests skipped: 1 tests failed: 0 == PASS == I have built llvm, kernel,mesa from git with opencl on (A8-6500 APU, ARUBA). Simple opencl tests work. Using gimp (built from git) i have the same undefined symbol error in the console and all operations use 1 CPU core, the GPU doesnt seem to be used at all. If i make all tests, all opencl tests are skipped and one (non-opencl i suppose) even fails (gegl.xml). Any ideas how i could use opencl? -- Jan Vesely jan.ves...@rutgers.edu signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] clover: Enable cl_khr_fp64 for devices that support doubles
On Jun 21, 2014, at 9:37 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard thomas.stell...@amd.com writes: --- src/gallium/state_trackers/clover/api/device.cpp | 4 +++- src/gallium/state_trackers/clover/core/device.cpp | 6 ++ src/gallium/state_trackers/clover/core/device.hpp | 1 + 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp index dc8e22c..275542d 100644 --- a/src/gallium/state_trackers/clover/api/device.cpp +++ b/src/gallium/state_trackers/clover/api/device.cpp @@ -290,7 +290,9 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, break; case CL_DEVICE_EXTENSIONS: - buf.as_string() = ; + // The trailing space is intentional. It is a spec-ism that there is a + // trailing space at the end of the list of extensions. Hm... Really? Where does the spec say that? I’m not really sure. I remember reporting a bug before about a trailing space in these outputs, and got the response that it is required by the spec, but never really finding where. I know it’s required for the versions (which also currently fails the conformance test with “A space must appear after the minor version! (returned: OpenCL C 1.1)” + buf.as_string() = dev.cl_khr_fp64() ? cl_khr_fp64 : ; break; case CL_DEVICE_PLATFORM: diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp index 6d52dd4..51b54fa 100644 --- a/src/gallium/state_trackers/clover/core/device.cpp +++ b/src/gallium/state_trackers/clover/core/device.cpp @@ -187,6 +187,12 @@ device::half_fp_config() const { return CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; } +bool +device::cl_khr_fp64() const { + return pipe-get_shader_param(pipe, PIPE_SHADER_COMPUTE, + PIPE_SHADER_CAP_DOUBLES); +} + Can we call this function has_doubles() or something similar? The extension name is somewhat cryptic. Or, I don't know, maybe use 'device::double_fp_config()' to check if doubles are supported? As Bruno said, it should return 0 if doubles are not supported by the implementation. The extension is also “removed” in 1.2 and doubles become an “optional core extension” whatever that means, and is just required to be reported anyway for compatibility. std::vectorsize_t device::max_block_size() const { auto v = get_compute_paramuint64_t(pipe, PIPE_COMPUTE_CAP_MAX_BLOCK_SIZE); diff --git a/src/gallium/state_trackers/clover/core/device.hpp b/src/gallium/state_trackers/clover/core/device.hpp index 380029e..38bea54 100644 --- a/src/gallium/state_trackers/clover/core/device.hpp +++ b/src/gallium/state_trackers/clover/core/device.hpp @@ -65,6 +65,7 @@ namespace clover { cl_device_fp_config single_fp_config() const; cl_device_fp_config double_fp_config() const; cl_device_fp_config half_fp_config() const; + bool cl_khr_fp64() const; std::vectorsize_t max_block_size() const; std::string device_name() const; -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] clover: Enable cl_khr_fp64 for devices that support doubles
Matt Arsenault arse...@gmail.com writes: On Jun 21, 2014, at 9:37 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard thomas.stell...@amd.com writes: [...] case CL_DEVICE_EXTENSIONS: - buf.as_string() = ; + // The trailing space is intentional. It is a spec-ism that there is a + // trailing space at the end of the list of extensions. Hm... Really? Where does the spec say that? I’m not really sure. I remember reporting a bug before about a trailing space in these outputs, and got the response that it is required by the spec, but never really finding where. I know it’s required for the versions (which also currently fails the conformance test with “A space must appear after the minor version! (returned: OpenCL C 1.1)” That's because of the space-separated vendor-specific information that CL_DEVICE_VERSION is supposed to include after the minor version, it doesn't mean that the version string should necessarily be space-terminated. I don't think that the trailing space is required in the extension string, let's get rid of it unless you can find evidence from the spec that it should be otherwise. Thanks. [...] pgpEtxErptRfQ.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] clover: Enable cl_khr_fp64 for devices that support doubles
On Jun 21, 2014, at 12:32 PM, Francisco Jerez curroje...@riseup.net wrote: Matt Arsenault arse...@gmail.com writes: On Jun 21, 2014, at 9:37 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard thomas.stell...@amd.com writes: [...] case CL_DEVICE_EXTENSIONS: - buf.as_string() = ; + // The trailing space is intentional. It is a spec-ism that there is a + // trailing space at the end of the list of extensions. Hm... Really? Where does the spec say that? I’m not really sure. I remember reporting a bug before about a trailing space in these outputs, and got the response that it is required by the spec, but never really finding where. I know it’s required for the versions (which also currently fails the conformance test with “A space must appear after the minor version! (returned: OpenCL C 1.1)” That's because of the space-separated vendor-specific information that CL_DEVICE_VERSION is supposed to include after the minor version, it doesn't mean that the version string should necessarily be space-terminated. I don't think that the trailing space is required in the extension string, let's get rid of it unless you can find evidence from the spec that it should be otherwise. Thanks. Fine with me ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug
On Sat, 2014-06-21 at 17:39 +0200, Francisco Jerez wrote: Bruno Jiménez brunoji...@gmail.com writes: Hi, This is my second attempt to fix the mapping bug adding all the suggestions that Tom Stellard sent, and, so far, it seems that it is resolved. This series changes completely how OpenCL buffers are handled by the r600g driver. Before this, we would add them directly to a pool, and this pool would grow whenever we needed more space. But this process implied destroying the pool and creating a new one. There could be cases where a buffer would be mapped and the pool would grow, leaving one side of the mapping pointed to where the item was. This is the 'mapping bug' Now, Items will have an intermediate resource, where all mappings can be done, and when a buffer is going to be used with a kernel it is promoted to the pool. In the case where a promoted item is going to be mapped, it is previously demoted, so even if the pool changes its location due to growing, the map remains valid. In the case of a buffer mapped for reading, and used by a kernel to read from it, we will duplicate this buffer, having the intermediate buffer, where the user has its map, and an item in the pool, which is the one that the kernel is going to use. As a summary for v2: Patches 1-8: These are the main part of the series, and solve the mapping bug. Patches 1 and 7 now use less explicit castings Patch 2 is new and introduces the 'is_item_in_pool' function, which is used in patches 3 and 8 Patch 9: Is a complete rewrite of v1 patch 8 using gallium utils for double lists Patches 10 and 11: These are just a proof of concept for avoiding transfers GPU - GPU when using all CL Read/Write functions. They are v1 patch 9 splited in two to separate r600g changes from clover changes. Now, in clover's side it introduces and uses 'CLOVER_TRANSFER_MAP_DIRECTLY' so it doesen't collide with any other OpenCL flag. Please review and Thanks :) Bruno Jiménez (11): r600g/compute: Add an intermediate resource for OpenCL buffers r600g/compute: Add an util function to know if an item is in the pool r600g/compute: Add statuses to the compute_memory_items r600g/compute: divide the item list in two r600g/compute: Only move to the pool the buffers marked for promoting r600g/compute: Avoid problems when promoting items mapped for reading r600g/compute: Implement compute_memory_demote_item r600g/compute: Map only against intermediate buffers r600g/compute: Use gallium util functions for double lists r600g/compute: Map directly the pool in some cases clover: Use PIPE_TRANSFER_MAP_DIRECTLY when writing/reading buffers src/gallium/drivers/r600/compute_memory_pool.c | 294 - src/gallium/drivers/r600/compute_memory_pool.h | 31 ++- src/gallium/drivers/r600/evergreen_compute.c | 38 ++- src/gallium/state_trackers/clover/api/transfer.cpp | 4 +- src/gallium/state_trackers/clover/core/object.hpp | 4 + .../state_trackers/clover/core/resource.cpp| 2 + 6 files changed, 233 insertions(+), 140 deletions(-) -- 2.0.0 Hi Bruno, I'm sorry for having taken so long to comment on this, I haven't had time to read through your patch series carefully until last night. Hi Francisco, It's Ok, we can't be always here to review everything :) I'm not convinced that this is going to be a satisfactory solution, it punishes well-behaving applications by duplicating memory usage under some circumstances, and by slowing down buffer mapping with (in most cases) unnecessary shadow copies: Even though using a buffer while mapped is something to keep in mind because it's allowed by the spec with some restrictions, I think it's reasonable to assume that we're only going to see a comparatively small number of active buffer mappings during the execution of any compute kernel, so it seems unfortunate to me that the rest of mappings will have to pay for this one corner case. I have to say that I understand perfectly your concerns about memory duplication and slow mappings. In fact, the memory duplication is mainly caused by the possibility of having a buffer mapped for reading that a kernel uses + relocation of the pool. And the slow mappings are caused by any map that could stay alive while a kernel is launched + relocation. For the OpenCL code that I have seen, I haven't yet encountered one that left a buffer mapped for reading and at the same time launched a kernel that reads from it. But I do have encountered programs that left buffers mapped while they launched kernels that don't touch those buffers. But I agree that punishing mappings made simply for read/write between kernel executions is something that we have to find a way to avoid. That's mainly why I tried to avoid the copies in the case of simple
Re: [Mesa-dev] [Mesa-stable] [PATCH 12/13] i965: If STATE_BASE_VERTEX is used, set its value
This looks like it's adding a pile of uniform scanning overhead even to draws which don't need the feature. Can we stash the index we need earlier? On Sat, Jun 21, 2014 at 1:01 PM, Ian Romanick i...@freedesktop.org wrote: From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: 10.2 mesa-sta...@lists.freedesktop.org --- src/mesa/drivers/dri/i965/brw_draw.c | 37 1 file changed, 37 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index ac21656..dd914b6 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -40,6 +40,7 @@ #include swrast/swrast.h #include swrast_setup/swrast_setup.h #include drivers/common/meta.h +#include program/prog_parameter.h #include brw_blorp.h #include brw_draw.h @@ -392,6 +393,27 @@ static bool brw_try_draw_prims( struct gl_context *ctx, bool retval = true; GLuint i; bool fail_next = false; + GLint last_start = -1; + GLint base_vertex_idx = -1; + + struct gl_program_parameter_list *const plist = + ctx-VertexProgram._Current-Base.Parameters; + + /* Find the storage associated with STATE_BASE_VERTEX. If there is +* storage, then we'll need to update its value each time +* prims[i].basevertex changes. +*/ + for (unsigned j = 0; j plist-NumParameters; j++) { + if (plist-Parameters[j].StateIndexes[0] == STATE_INTERNAL + plist-Parameters[j].StateIndexes[1] == STATE_BASE_VERTEX) { + base_vertex_idx = j; + break; + } + } + + if (base_vertex_idx = 0) { + ctx-VertexProgram._Current-Base.Parameters-ParameterValues[base_vertex_idx][0].i = brw-basevertex; + } if (ctx-NewState) _mesa_update_state( ctx ); @@ -462,11 +484,26 @@ static bool brw_try_draw_prims( struct gl_context *ctx, brw-basevertex != prims[i].basevertex) { brw-num_instances = prims[i].num_instances; brw-basevertex = prims[i].basevertex; + + if (base_vertex_idx = 0) { +plist-ParameterValues[base_vertex_idx][0].i = brw-basevertex; +ctx-NewState |= _NEW_PROGRAM_CONSTANTS; +_mesa_update_state(ctx); + } + if (i 0) { /* For i == 0 we just did this before the loop */ brw-state.dirty.brw |= BRW_NEW_VERTICES; brw_merge_inputs(brw, arrays); } } + + if (prims[i].start != last_start base_vertex_idx = 0) { + plist-ParameterValues[base_vertex_idx][0].i = prims[i].start; + ctx-NewState |= _NEW_PROGRAM_CONSTANTS; + _mesa_update_state(ctx); + last_start = prims[i].start; + } + if (brw-gen 6) brw_set_prim(brw, prims[i]); else -- 1.8.1.4 ___ mesa-stable mailing list mesa-sta...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-stable ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/13] Fix gl_VertexID on i965
This will be broken for indirect draws too, and possibly performance-crippling to fix there, since we don't have the baseVertex value available to shove into a uniform. On Sun, Jun 22, 2014 at 3:36 AM, Roland Scheidegger srol...@vmware.com wrote: Am 21.06.2014 03:00, schrieb Ian Romanick: This patch series fixes bugs in the i965 w.r.t. several uses of gl_VertexID. OpenGL (desktop and ES) have the following expectations of gl_VertexID: 1. When used with BaseVertex drawing commands, gl_VertexID will include the value of basevertex. This differens from the other API, but the change in OpenGL was based on feedback from application developers. This only affects OpenGL 3.2+. I don't question this, but note some binary drivers may not agree. There are unforunately lots of man pages around of glDrawElementsInstancedBaseVertexBaseInstance too which state: The basevertex has no effect on the shader-visible value of gl_VertexID. There's a bug filed about this (https://www.khronos.org/bugzilla/show_bug.cgi?id=932) which is still open though maybe it has all been resolved... (I'll just mention that llvmpipe right now will do d3d10-style vertexID though I guess indeed at some point we need to make it either switchable or implement GL_ARB_shader_draw_parameters so it can be worked around easily in the state tracker.) Roland 2. When used with DrawArrays drawing commands, gl_VertexID will count from the 'start' value instead of zero. This affects OpenGL 3.0+ and OpenGL ES 3.0+. The i965 driver botched both of these for slightly different reasons. For #1, our hardware was designed with the other API's semantic in mind, so gl_VertexID doesn't include the basevertex value. For #2, we implement DrawArrays with a non-zero start index by reprogramming the array base offset. As a result, gl_VertexID always counts from zero. I suspect other hardware may suffer from one or both of these issues. I have sent tests to the piglit list to reproduce them. To fix both of these issues, the shader needs to know the basevertex value. A later GL extension, GL_ARB_shader_draw_parameters, adds gl_BaseVertex and gl_BaseInstance as built-in variables. I believe some hardware implements these as system values much like gl_VertexID. I started this series with the assumption that we could have a SYSTEM_VALUE_BASE_VERTEX that might come from a uniform. In the end, I couldn't make that work. I left patch 7 in the series, but we may want to remove it. I added a STATE_BASE_VERTEX uniform instead. There is now a lowering pass that converts gl_VertexID to gl_VertexIDMESA + gl_BaseVertex (backed by a uniform). Some of these names should probably be changed. I have also contemplated adding an extension that exposes the other semantic for gl_VertexID for applications that actually want that. This is primarily things that are porting content (or bridging content) from the other API. That, however, can happen later. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-devk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=E56BlL24I31lr1fFtEsKBJjAprQ%2BQdhUczD1EsQ3dIY%3D%0As=2634834afd3400c841ab62f11a9b5350d604592529cfea97f2297c9398c0fd93 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/13] mesa: Group gl_system_value values by the stage where they exist
For 1-4: Reviewed-by: Marek Olšák marek.ol...@amd.com Marek On Sat, Jun 21, 2014 at 5:19 PM, Roland Scheidegger srol...@vmware.com wrote: Am 21.06.2014 03:00, schrieb Ian Romanick: From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: Ilia Mirkin imir...@alum.mit.edu Cc: Marek Olšák marek.ol...@amd.com Cc: Roland Scheidegger srol...@vmware.com Cc: 10.2 mesa-sta...@lists.freedesktop.org --- src/mesa/main/mtypes.h | 32 +++--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 12 +-- 2 files changed, 35 insertions(+), 9 deletions(-) diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 8b7ee30..3899e7f 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2035,13 +2035,31 @@ typedef enum */ typedef enum { - SYSTEM_VALUE_FRONT_FACE, /** Fragment shader only (not done yet) */ - SYSTEM_VALUE_VERTEX_ID, /** Vertex shader only */ - SYSTEM_VALUE_INSTANCE_ID,/** Vertex shader only */ - SYSTEM_VALUE_SAMPLE_ID, /** Fragment shader only */ - SYSTEM_VALUE_SAMPLE_POS, /** Fragment shader only */ - SYSTEM_VALUE_SAMPLE_MASK_IN, /** Fragment shader only */ - SYSTEM_VALUE_INVOCATION_ID, /** Geometry shader only */ + /** +* \name Vertex shader system values +*/ + /*@{*/ + SYSTEM_VALUE_VERTEX_ID, + SYSTEM_VALUE_INSTANCE_ID, + /*@}*/ + + /** +* \name Geometry shader system values +*/ + /*@{*/ + SYSTEM_VALUE_INVOCATION_ID, + /*@}*/ + + /** +* \name Fragment shader system values +*/ + /*@{*/ + SYSTEM_VALUE_FRONT_FACE, /** (not done yet) */ + SYSTEM_VALUE_SAMPLE_ID, + SYSTEM_VALUE_SAMPLE_POS, + SYSTEM_VALUE_SAMPLE_MASK_IN, + /*@}*/ + SYSTEM_VALUE_MAX /** Number of values */ } gl_system_value; diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index fbfbea6..fa95493 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -4236,13 +4236,21 @@ struct st_translate { /** Map Mesa's SYSTEM_VALUE_x to TGSI_SEMANTIC_x */ const unsigned _mesa_sysval_to_semantic[SYSTEM_VALUE_MAX] = { - TGSI_SEMANTIC_FACE, + /* Vertex shader +*/ TGSI_SEMANTIC_VERTEXID, TGSI_SEMANTIC_INSTANCEID, + + /* Geometry shader +*/ + TGSI_SEMANTIC_INVOCATIONID, + + /* Fragment shader +*/ + TGSI_SEMANTIC_FACE, TGSI_SEMANTIC_SAMPLEID, TGSI_SEMANTIC_SAMPLEPOS, TGSI_SEMANTIC_SAMPLEMASK, - TGSI_SEMANTIC_INVOCATIONID, }; /** Looks good to me (though it does mean you have to add new values in the middle in the future if you group them by stage). 1-4 are Reviewed-by: Roland Scheidegger srol...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/13] Fix gl_VertexID on i965
That's right. A uniform won't work with ARB_draw_indirect unless you lower it to direct draws, which would be very bad if it was applied to all drivers. Radeonsi indeed supports BaseVertex and BaseInstance as system values in the vertex shader. Well, vertex fetching has to be done in the vertex shader on that hardware, so the system values are kinda required there. Marek On Sun, Jun 22, 2014 at 12:27 AM, Chris Forbes chr...@ijw.co.nz wrote: This will be broken for indirect draws too, and possibly performance-crippling to fix there, since we don't have the baseVertex value available to shove into a uniform. On Sun, Jun 22, 2014 at 3:36 AM, Roland Scheidegger srol...@vmware.com wrote: Am 21.06.2014 03:00, schrieb Ian Romanick: This patch series fixes bugs in the i965 w.r.t. several uses of gl_VertexID. OpenGL (desktop and ES) have the following expectations of gl_VertexID: 1. When used with BaseVertex drawing commands, gl_VertexID will include the value of basevertex. This differens from the other API, but the change in OpenGL was based on feedback from application developers. This only affects OpenGL 3.2+. I don't question this, but note some binary drivers may not agree. There are unforunately lots of man pages around of glDrawElementsInstancedBaseVertexBaseInstance too which state: The basevertex has no effect on the shader-visible value of gl_VertexID. There's a bug filed about this (https://www.khronos.org/bugzilla/show_bug.cgi?id=932) which is still open though maybe it has all been resolved... (I'll just mention that llvmpipe right now will do d3d10-style vertexID though I guess indeed at some point we need to make it either switchable or implement GL_ARB_shader_draw_parameters so it can be worked around easily in the state tracker.) Roland 2. When used with DrawArrays drawing commands, gl_VertexID will count from the 'start' value instead of zero. This affects OpenGL 3.0+ and OpenGL ES 3.0+. The i965 driver botched both of these for slightly different reasons. For #1, our hardware was designed with the other API's semantic in mind, so gl_VertexID doesn't include the basevertex value. For #2, we implement DrawArrays with a non-zero start index by reprogramming the array base offset. As a result, gl_VertexID always counts from zero. I suspect other hardware may suffer from one or both of these issues. I have sent tests to the piglit list to reproduce them. To fix both of these issues, the shader needs to know the basevertex value. A later GL extension, GL_ARB_shader_draw_parameters, adds gl_BaseVertex and gl_BaseInstance as built-in variables. I believe some hardware implements these as system values much like gl_VertexID. I started this series with the assumption that we could have a SYSTEM_VALUE_BASE_VERTEX that might come from a uniform. In the end, I couldn't make that work. I left patch 7 in the series, but we may want to remove it. I added a STATE_BASE_VERTEX uniform instead. There is now a lowering pass that converts gl_VertexID to gl_VertexIDMESA + gl_BaseVertex (backed by a uniform). Some of these names should probably be changed. I have also contemplated adding an extension that exposes the other semantic for gl_VertexID for applications that actually want that. This is primarily things that are porting content (or bridging content) from the other API. That, however, can happen later. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-devk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=E56BlL24I31lr1fFtEsKBJjAprQ%2BQdhUczD1EsQ3dIY%3D%0As=2634834afd3400c841ab62f11a9b5350d604592529cfea97f2297c9398c0fd93 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/13] Fix gl_VertexID on i965
In the indirect case we can possibly just adjust the indirect setup code to write the uniform with MI_SRM, if we know where it will be -- so not too awful -- but it is different. On Sun, Jun 22, 2014 at 1:59 PM, Marek Olšák mar...@gmail.com wrote: That's right. A uniform won't work with ARB_draw_indirect unless you lower it to direct draws, which would be very bad if it was applied to all drivers. Radeonsi indeed supports BaseVertex and BaseInstance as system values in the vertex shader. Well, vertex fetching has to be done in the vertex shader on that hardware, so the system values are kinda required there. Marek On Sun, Jun 22, 2014 at 12:27 AM, Chris Forbes chr...@ijw.co.nz wrote: This will be broken for indirect draws too, and possibly performance-crippling to fix there, since we don't have the baseVertex value available to shove into a uniform. On Sun, Jun 22, 2014 at 3:36 AM, Roland Scheidegger srol...@vmware.com wrote: Am 21.06.2014 03:00, schrieb Ian Romanick: This patch series fixes bugs in the i965 w.r.t. several uses of gl_VertexID. OpenGL (desktop and ES) have the following expectations of gl_VertexID: 1. When used with BaseVertex drawing commands, gl_VertexID will include the value of basevertex. This differens from the other API, but the change in OpenGL was based on feedback from application developers. This only affects OpenGL 3.2+. I don't question this, but note some binary drivers may not agree. There are unforunately lots of man pages around of glDrawElementsInstancedBaseVertexBaseInstance too which state: The basevertex has no effect on the shader-visible value of gl_VertexID. There's a bug filed about this (https://www.khronos.org/bugzilla/show_bug.cgi?id=932) which is still open though maybe it has all been resolved... (I'll just mention that llvmpipe right now will do d3d10-style vertexID though I guess indeed at some point we need to make it either switchable or implement GL_ARB_shader_draw_parameters so it can be worked around easily in the state tracker.) Roland 2. When used with DrawArrays drawing commands, gl_VertexID will count from the 'start' value instead of zero. This affects OpenGL 3.0+ and OpenGL ES 3.0+. The i965 driver botched both of these for slightly different reasons. For #1, our hardware was designed with the other API's semantic in mind, so gl_VertexID doesn't include the basevertex value. For #2, we implement DrawArrays with a non-zero start index by reprogramming the array base offset. As a result, gl_VertexID always counts from zero. I suspect other hardware may suffer from one or both of these issues. I have sent tests to the piglit list to reproduce them. To fix both of these issues, the shader needs to know the basevertex value. A later GL extension, GL_ARB_shader_draw_parameters, adds gl_BaseVertex and gl_BaseInstance as built-in variables. I believe some hardware implements these as system values much like gl_VertexID. I started this series with the assumption that we could have a SYSTEM_VALUE_BASE_VERTEX that might come from a uniform. In the end, I couldn't make that work. I left patch 7 in the series, but we may want to remove it. I added a STATE_BASE_VERTEX uniform instead. There is now a lowering pass that converts gl_VertexID to gl_VertexIDMESA + gl_BaseVertex (backed by a uniform). Some of these names should probably be changed. I have also contemplated adding an extension that exposes the other semantic for gl_VertexID for applications that actually want that. This is primarily things that are porting content (or bridging content) from the other API. That, however, can happen later. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-devk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=E56BlL24I31lr1fFtEsKBJjAprQ%2BQdhUczD1EsQ3dIY%3D%0As=2634834afd3400c841ab62f11a9b5350d604592529cfea97f2297c9398c0fd93 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev