[Mesa-dev] [Bug 103586] OpenCL/Clover: AMD Turks: corrupt output buffer (depending on dimension order?)
https://bugs.freedesktop.org/show_bug.cgi?id=103586

GitLab Migration User changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #17 from GitLab Migration User ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/615.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
https://bugs.freedesktop.org/show_bug.cgi?id=103586

Jan Vesely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Gallium/StateTracker/Clover |Drivers/Gallium/r600
https://bugs.freedesktop.org/show_bug.cgi?id=103586

Timothy Arceri changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Other                       |Gallium/StateTracker/Clover
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #16 from Jan Vesely ---
(In reply to Dave Gilbert from comment #15)
> Hi Jan,
> Yes, doing:
> --- a/ocl.cpp
> +++ b/ocl.cpp
> @@ -65,6 +65,7 @@ static int got_dev(cl::Platform , std::vector , cl::Dev
>  events.push_back(event);
>  cl::Event eventMap;
>  queue.enqueueBarrierWithWaitList();
> +event.wait();
>  mapped = (cl_uint*)queue.enqueueMapBuffer(output, CL_TRUE /* blocking */, CL_MAP_READ,
>  0 /* offset */,
>  SIZE * SIZE * SIZE * sizeof(cl_uint) /* size */,
>
> does seem to work.

thanks, that means the kernel work event works correctly. I'll need to double check the specs wrt synchronization points. we either miss a wait, or fail to update mapped buffers after kernel finishes execution.

> Vedran: I've only got a Turks to play with; feel free to try my test on
> something else.
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #15 from Dave Gilbert ---
Hi Jan,
Yes, doing:

--- a/ocl.cpp
+++ b/ocl.cpp
@@ -65,6 +65,7 @@ static int got_dev(cl::Platform , std::vector , cl::Dev
 events.push_back(event);
 cl::Event eventMap;
 queue.enqueueBarrierWithWaitList();
+event.wait();
 mapped = (cl_uint*)queue.enqueueMapBuffer(output, CL_TRUE /* blocking */, CL_MAP_READ,
 0 /* offset */,
 SIZE * SIZE * SIZE * sizeof(cl_uint) /* size */,

does seem to work.

Vedran: I've only got a Turks to play with; feel free to try my test on something else.
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #14 from Jan Vesely ---
(In reply to Dave Gilbert from comment #12)
> It doesn't seem to help, if I add:
> --- a/ocl.cpp
> +++ b/ocl.cpp
> @@ -74,6 +74,7 @@ static int got_dev(cl::Platform , std::vector , cl::Dev
>  cl::Event eventBarrier2;
>  queue.enqueueBarrierWithWaitList(NULL,);
>  std::cerr << __func__ << "enqueueMapBuffer gave: " << err << std::endl;
> +event.wait();
>  eventMap.wait();
>  eventBarrier2.wait();
>
> that doesn't seem to help and I think event is the event triggered by the
> kernel.

can you move it few lines up? (before the call to mapBuffer).
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #13 from Vedran Miletić ---
(In reply to Dave Gilbert from comment #6)
> (In reply to Jan Vesely from comment #5)
> > Can you try waiting for the kernel execution to complete explicitly before
> > mapping the buffer?
> > Ideally call clFinish() on line 63.
>
> Since I'm on the C++ binding (probably a mistake) I used:
> queue.finish();
>
> and it seems to be working.
>
> (This also corresponds possibly to what I'm seeing on a more complex kernel;
> with a more complex kernel I'm seeing on a whole pile of data on the last
> few Z slices as being bogus suggesting it's not finished).
>
> Dave

This reminds me of a certain issue I experienced with OpenMM. Is it limited to Turks, or it happens on SI+ cards?
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #12 from Dave Gilbert ---
(In reply to Jan Vesely from comment #11)
> (In reply to Dave Gilbert from comment #10)
> > so I *think* it's using my build.
>
> yes, that looks OK.
>
> > and I believe I'm still seeing it.
> > Is my test valid or do I really need that finish?
>
> it should be OK. Can you replace the clFinish with clWaitForEvents (or the
> respective C++ method) to wait for kernel execution?
> It looks to me that clover creates new map without waiting for all the dep
> events.

It doesn't seem to help, if I add:

--- a/ocl.cpp
+++ b/ocl.cpp
@@ -74,6 +74,7 @@ static int got_dev(cl::Platform , std::vector , cl::Dev
 cl::Event eventBarrier2;
 queue.enqueueBarrierWithWaitList(NULL,);
 std::cerr << __func__ << "enqueueMapBuffer gave: " << err << std::endl;
+event.wait();
 eventMap.wait();
 eventBarrier2.wait();

that doesn't seem to help and I think event is the event triggered by the kernel.
https://bugs.freedesktop.org/show_bug.cgi?id=103586

Vedran Miletić changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ved...@miletic.net
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #11 from Jan Vesely ---
(In reply to Dave Gilbert from comment #10)
> I believe I'm still seeing this:
>
> dg@hath:~/ocl2$ clinfo
> Number of platforms                               1
>   Platform Name                                   Clover
>   Platform Vendor                                 Mesa
>   Platform Version                                OpenCL 1.1 Mesa 17.4.0-devel (git-a16dc04ad5)
>
> dg@hath:~/ocl2$ echo $LD_LIBRARY_PATH
> /home/dg/mesa/try/lib:
>
> so I *think* it's using my build.

yes, that looks OK.

> and I believe I'm still seeing it.
> Is my test valid or do I really need that finish?

it should be OK. Can you replace the clFinish with clWaitForEvents (or the respective C++ method) to wait for kernel execution?
It looks to me that clover creates new map without waiting for all the dep events.
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #10 from Dave Gilbert ---
I believe I'm still seeing this:

dg@hath:~/ocl2$ clinfo
Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 17.4.0-devel (git-a16dc04ad5)

dg@hath:~/ocl2$ echo $LD_LIBRARY_PATH
/home/dg/mesa/try/lib:

so I *think* it's using my build.

and I believe I'm still seeing it.
Is my test valid or do I really need that finish?
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #9 from Dave Gilbert ---
(In reply to Jan Vesely from comment #8)
> thanks for testing. I see you are using mesa 17.2.
>
> there were few changes to blocking call synchronization that went to mesa
> 17.3:
> 02f8ac6b70033a1b240d497c4664c359d2398cc3 (clover: Wrap event::wait_count in
> a method taking care of the required locking.)
> bc4000ee40c78efe1e5e8a6244d4bb55389d8418 (clover: Run the associated action
> before an event is signalled.)
> 3a5b69c09ba355c616c274b0c7f5aba3bd21fd54 (clover: Wait for requested
> operation if blocking flag is set)
>
> which might help address the issue. Can you test mesa 17.3?

Yeh, I'll figure out how to get 17.3 built on this box.
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #8 from Jan Vesely ---
(In reply to Dave Gilbert from comment #6)
> (In reply to Jan Vesely from comment #5)
> > Can you try waiting for the kernel execution to complete explicitly before
> > mapping the buffer?
> > Ideally call clFinish() on line 63.
>
> Since I'm on the C++ binding (probably a mistake) I used:
> queue.finish();
>
> and it seems to be working.
>
> (This also corresponds possibly to what I'm seeing on a more complex kernel;
> with a more complex kernel I'm seeing on a whole pile of data on the last
> few Z slices as being bogus suggesting it's not finished).
>
> Dave

thanks for testing. I see you are using mesa 17.2.

there were few changes to blocking call synchronization that went to mesa 17.3:
02f8ac6b70033a1b240d497c4664c359d2398cc3 (clover: Wrap event::wait_count in a method taking care of the required locking.)
bc4000ee40c78efe1e5e8a6244d4bb55389d8418 (clover: Run the associated action before an event is signalled.)
3a5b69c09ba355c616c274b0c7f5aba3bd21fd54 (clover: Wait for requested operation if blocking flag is set)

which might help address the issue. Can you test mesa 17.3?
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #7 from Jan Vesely ---
Created attachment 135318
  --> https://bugs.freedesktop.org/attachment.cgi?id=135318&action=edit
annotated asm dump
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #6 from Dave Gilbert ---
(In reply to Jan Vesely from comment #5)
> (In reply to Dave Gilbert from comment #4)
> > Created attachment 135313 [details]
> > foo.link-0.ll
> >
> > That's all 3 of the debug files it produced.
> > (I wasn't sure which were the llvm and which the isa dumps; I guess the asm
> > is the isa? and the ll's are both llvm dumps?)
>
> yes. the first .ll is from compilation step, the other one is from linking
> step.
>
> .ll dump looks correct.
> .asm also looks correct.
>
> you can try producing multiple asm dumps for working and non-working runs.
> But I don't think that the llvm is the culprit here.
>
> Can you try waiting for the kernel execution to complete explicitly before
> mapping the buffer?
> Ideally call clFinish() on line 63.

Since I'm on the C++ binding (probably a mistake) I used:
queue.finish();

and it seems to be working.

(This also corresponds possibly to what I'm seeing on a more complex kernel; with a more complex kernel I'm seeing on a whole pile of data on the last few Z slices as being bogus suggesting it's not finished).

Dave
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #5 from Jan Vesely ---
(In reply to Dave Gilbert from comment #4)
> Created attachment 135313 [details]
> foo.link-0.ll
>
> That's all 3 of the debug files it produced.
> (I wasn't sure which were the llvm and which the isa dumps; I guess the asm
> is the isa? and the ll's are both llvm dumps?)

yes. the first .ll is from compilation step, the other one is from linking step.

.ll dump looks correct.
.asm also looks correct.

you can try producing multiple asm dumps for working and non-working runs. But I don't think that the llvm is the culprit here.

Can you try waiting for the kernel execution to complete explicitly before mapping the buffer?
Ideally call clFinish() on line 63.
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #4 from Dave Gilbert ---
Created attachment 135313
  --> https://bugs.freedesktop.org/attachment.cgi?id=135313&action=edit
foo.link-0.ll

That's all 3 of the debug files it produced.
(I wasn't sure which were the llvm and which the isa dumps; I guess the asm is the isa? and the ll's are both llvm dumps?)
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #2 from Dave Gilbert ---
Created attachment 135311
  --> https://bugs.freedesktop.org/attachment.cgi?id=135311&action=edit
foo.ll from debug run
https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #3 from Dave Gilbert ---
Created attachment 135312
  --> https://bugs.freedesktop.org/attachment.cgi?id=135312&action=edit
foo.link-0.asm
https://bugs.freedesktop.org/show_bug.cgi?id=103586

Jan Vesely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |99553

--- Comment #1 from Jan Vesely ---
can you run using CLOVER_DEBUG=llvm,native CLOVER_DEBUG_FILE=foo and attach both llvm and isa dumps?

Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99553
[Bug 99553] Tracker bug for runnning OpenCL applications on Clover
https://bugs.freedesktop.org/show_bug.cgi?id=103586

            Bug ID: 103586
           Summary: OpenCL/Clover: AMD Turks: corrupt output buffer (depending on dimension order?)
           Product: Mesa
           Version: 17.2
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: freedesk...@treblig.org
        QA Contact: mesa-dev@lists.freedesktop.org

I've got a trivial kernel that draws a sphere in a voxel cube; each voxel should end up as 0 or 1; if I use global id 0 as z, 1 as y, 2 as x I get corruptions where some voxels have random junk in; if I reverse the order so that global id 0 is x, 1 is y and 2 is z then it's happy.
(Confirmed the code is clean with oclgrind and happy on Intel.)

Versions:

Number of devices                                 1
  Device Name                                     AMD TURKS (DRM 2.50.0 / 4.13.0-1-amd64, LLVM 5.0.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 17.2.4
  Driver Version                                  17.2.4
  Device OpenCL C Version                         OpenCL C 1.1

(on debian testing, was on stable, but same behaviour)

01:00.0 0300: 1002:6841
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Thames [Radeon HD 7550M/7570M/7650M] (prog-if 00 [VGA controller])
        Subsystem: Hewlett-Packard Company Thames [Radeon HD 7550M/7570M/7650M]
        Flags: bus master, fast devsel, latency 0, IRQ 37
        Memory at c000 (64-bit, prefetchable) [size=256M]
        Memory at d430 (64-bit, non-prefetchable) [size=128K]
        I/O ports at 4000 [size=256]
        Expansion ROM at 000c [disabled] [size=128K]
        Capabilities:
        Kernel driver in use: radeon
        Kernel modules: radeon

in an HP Elitebook laptop.

Code that triggers this:
https://github.com/penguin42/opencl-play/commit/c98470685874769e4a59975791459180564b6f6e

build and run with:
g++ -O2 ocl.cpp -lOpenCL && ./a.out 2> z

then check output with:
tr '01' ' '
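For anyone wanting a host-only cross-check of the expectation in this report, the following is a minimal sketch and not the code from ocl.cpp. It assumes a cube edge of 16 (`kSize`), a centred sphere, and the helper names `voxel_value`, `fill_cube` and `count_corrupt`, all of which are made up here for illustration. It emulates a 3D range on the CPU with the global ids mapped to axes in either order, and checks the two properties the report relies on: every voxel ends up as 0 or 1, and the result does not depend on which global id drives which axis.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cube edge length; the real test defines its own SIZE.
constexpr std::size_t kSize = 16;

// Hypothetical sphere test: 1 inside a centred sphere, 0 outside.
inline std::uint32_t voxel_value(std::size_t x, std::size_t y, std::size_t z) {
    const long c = static_cast<long>(kSize) / 2;  // centre coordinate
    const long r = c - 1;                         // radius (assumption)
    const long dx = static_cast<long>(x) - c;
    const long dy = static_cast<long>(y) - c;
    const long dz = static_cast<long>(z) - c;
    return (dx * dx + dy * dy + dz * dz <= r * r) ? 1u : 0u;
}

// Emulate a 3D range on the host. axis_of_id[d] names the axis driven by
// global id d (0 = x, 1 = y, 2 = z). The buffer is stored x-fastest and is
// pre-filled with a poison value so unwritten voxels show up as junk.
std::vector<std::uint32_t> fill_cube(const std::array<int, 3>& axis_of_id) {
    std::vector<std::uint32_t> out(kSize * kSize * kSize, 0xdeadbeefu);
    for (std::size_t g0 = 0; g0 < kSize; ++g0)
        for (std::size_t g1 = 0; g1 < kSize; ++g1)
            for (std::size_t g2 = 0; g2 < kSize; ++g2) {
                const std::size_t gid[3] = {g0, g1, g2};
                std::size_t xyz[3] = {0, 0, 0};
                for (int d = 0; d < 3; ++d) xyz[axis_of_id[d]] = gid[d];
                out[(xyz[2] * kSize + xyz[1]) * kSize + xyz[0]] =
                    voxel_value(xyz[0], xyz[1], xyz[2]);
            }
    return out;
}

// Count voxels holding anything other than 0 or 1.
std::size_t count_corrupt(const std::vector<std::uint32_t>& buf) {
    std::size_t bad = 0;
    for (std::uint32_t v : buf)
        if (v > 1u) ++bad;
    return bad;
}

// Both orderings must give the same, junk-free cube.
void self_check() {
    const auto zyx = fill_cube({2, 1, 0});  // global id 0 is z, 1 is y, 2 is x
    const auto xyz = fill_cube({0, 1, 2});  // global id 0 is x, 1 is y, 2 is z
    assert(count_corrupt(zyx) == 0);
    assert(zyx == xyz);
}
```

A buffer read back from the device could be passed to `count_corrupt` and compared against `fill_cube` in the same way, provided the host reference is adapted to the actual kernel's size and sphere definition.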