Hi Michal,

thanks for the bugfix. The crashes have now disappeared and more tests are passing with your bugfix version. However, several unit tests that work with AMD and Intel still fail. Briefly looking at the results, I see lots of NaN entries in the pocl output. I will try to pin this down further and then report back to you.

Best wishes

Timo

On Mon, 11 Mar 2019 at 10:50, Michal Babej (TAU) <[email protected]> wrote:

> Hello,
>
> I remember trying to fix this bug last year, but then i got sidetracked by
> other things. (BTW it would be preferable if you reported bugs as github
> issues in the future)
>
> Anyway, i've hopefully fixed it. Can you test your program with master
> branch from https://github.com/franz/pocl
>
> Regards,
>
> -- mb
> ------------------------------
> *From:* Timo Betcke <[email protected]>
> *Sent:* Friday, March 8, 2019 3:48:34 AM
> *To:* Portable Computing Language development discussion
> *Subject:* Re: [pocl-devel] POCL Crash in vmovaps operation
>
> Dear Pekka,
>
> I have now cooked up a small example that crashes in vmovaps. The gist is
> available here (uses PyOpenCL to run):
>
> https://gist.github.com/tbetcke/b4da01465b587e85cc88801aafdced0a
>
> The example is fairly nonsensical and was derived by reducing a crashing
> kernel as far as possible while retaining the crash.
> It runs fine under Intel CPU OpenCL on a Xeon and Rocm OpenCL on an AMD
> GPU. My platform is Ubuntu 18.04 with llvm 6. If necessary
> I can create an environment with updated llvm, but would like to avoid it
> (unless it is llvm 6 related). Pocl is the most recent git master.
>
> The code crashes at the following assembler instructions:
>
>    0x00007fffe02575e3 <+195>: xor r9d,r9d
>    0x00007fffe02575e6 <+198>: xor r10d,r10d
>    0x00007fffe02575e9 <+201>: nop DWORD PTR [rax+0x0]
>    0x00007fffe02575f0 <+208>: mov QWORD PTR [rdx+r9*1],0x0
> => 0x00007fffe02575f8 <+216>: vmovaps XMMWORD PTR [rdi+r9*1-0x10],xmm0
>    0x00007fffe02575ff <+223>: mov QWORD PTR [rdi+r9*1],0x0
>    0x00007fffe0257607 <+231>: vmovaps XMMWORD PTR [rdx+r9*1-0x10],xmm0
>    0x00007fffe025760e <+238>: vmovupd xmm1,XMMWORD PTR [rdi+r9*1-0x8]
>    0x00007fffe0257615 <+245>: vaddpd xmm1,xmm1,XMMWORD PTR [rdx+r9*1-0x8]
>    0x00007fffe025761c <+252>: vmovupd XMMWORD PTR [rdx+r9*1-0x8],xmm1
>    0x00007fffe0257623 <+259>: mov r8,r11
>    0x00007fffe0257626 <+262>: sar r8,0x20
>    0x00007fffe025762a <+266>: lea rsi,[r8+r8*2]
>
> Removing any of the for loops or the localResult variable (or removing its
> __local attribute) leads to the kernel working on Pocl.
> It would be great to get to the source of this. Please let me know if you
> need more information from me.
>
> Best wishes
>
> Timo
>
> On Wed, 6 Mar 2019 at 21:21, Timo Betcke <[email protected]> wrote:
>
> Hi Pekka,
>
> thanks for your hints and the link. I had one buffer in the kernel call
> that had a cast from a float type to a vector type. I have fixed this. But
> the segfault remains. In the next few days I will try to cook up a simple
> example that produces the segfault. Fortunately, the kernel itself is not
> too complicated, so should be able to reduce it.
>
> Best wishes
>
> Timo
>
> On Wed, 6 Mar 2019 at 10:20, Pekka Jääskeläinen (TAU) <[email protected]> wrote:
>
> Yes, now that I look at it more closely,
> your stack trace looks _very_ much to the common data alignment
> issues people have. I think this might be worth a FAQ item somewhere.
>
> https://stackoverflow.com/questions/5983389/how-to-align-stack-at-32-byte-boundary-in-gcc
>
> On 6.3.2019 8.45, Pekka Jääskeläinen (TAU) wrote:
> > Hi Timo,
> >
> > Shooting in the dark here, but since just yesterday I debugged a similar
> > looking issue
> > which was caused by an illegal cast in the source code from float* to
> > float4*. It trusted
> > the alignment is still fine, which it wasn't after vectorization. A very
> > target specific programming
> > error which many ocl targets can easily hide.
> >
> > If this is something else, we need a test case, smaller the better, to
> > help you here.
> > Before opening an issue though, please test with the latest master and LLVM 8.
> >
> > Pekka
> >
> > ------------------------------------------------------------------------
> > *From:* Timo Betcke <[email protected]>
> > *Sent:* Tuesday, March 5, 2019 11:27:12 PM
> > *To:* Portable Computing Language development discussion
> > *Subject:* [pocl-devel] POCL Crash in vmovaps operation
> > Dear Pocl community,
> >
> > I was just testing the newest Pocl Version (github master branch) with
> > our software. During execution of one of our kernels Pocl crashed.
> > Disassembling the crash shows the following operations during the crash:
> >
> > ------------------
> >    0x00007fffb81efdd8 <+664>: vmulpd xmm2,xmm2,xmm6
> >    0x00007fffb81efddc <+668>: vsubpd xmm2,xmm5,xmm2
> >    0x00007fffb81efde0 <+672>: vpermilpd xmm5,xmm4,0x1
> >    0x00007fffb81efde6 <+678>: vmulsd xmm3,xmm3,xmm5
> >    0x00007fffb81efdea <+682>: vmulsd xmm4,xmm15,xmm4
> >    0x00007fffb81efdee <+686>: vsubsd xmm3,xmm3,xmm4
> >    0x00007fffb81efdf2 <+690>: vpermilpd xmm1,xmm1,0x1
> >    0x00007fffb81efdf8 <+696>: vmulpd xmm0,xmm0,xmm1
> >    0x00007fffb81efdfc <+700>: vpermilpd xmm1,xmm0,0x1
> >    0x00007fffb81efe02 <+706>: vsubsd xmm0,xmm0,xmm1
> >    0x00007fffb81efe06 <+710>: lea rsi,[rdx+rdx*2]
> >    0x00007fffb81efe0a <+714>: mov rdx,QWORD PTR [rbx+0x38]
> > => 0x00007fffb81efe0e <+718>: vmovaps XMMWORD PTR [rdx+rsi*8],xmm12
> > ---Type <return> to continue, or q <return> to quit---
> >    0x00007fffb81efe13 <+723>: mov QWORD PTR [rbx+0x40],rsi
> >    0x00007fffb81efe17 <+727>: mov QWORD PTR [rdx+rsi*8+0x10],0x0
> >    0x00007fffb81efe20 <+736>: vinsertf32x4 ymm1,ymm16,xmm0,0x1
> > -----------------------------
> > This seems to be a similar bug that I discussed a year ago on the
> > mailing list. See the thread here:
> > https://www.mail-archive.com/[email protected]/msg01087.html.
> > In summary, the issue was related to us using arrays of arrays within
> > our kernels and pocl creating wrong code for it.
> >
> > During that time a gist was suggested for Pocl, which I tested but did
> > not improve things. Afterwards I let it drop for a while as we were in
> > early development and had loads of building sites. But our software is
> > now close to release ready and it would be great to get it working with
> > pocl.
> >
> > Any help would be greatly appreciated.
> >
> > Best wishes
> >
> > Timo
> >
> > --
> > Timo Betcke
> > Professor of Computational Mathematics
> > University College London
> > Department of Mathematics
> > E-Mail: [email protected] <mailto:[email protected]>
> > Tel.: +44 (0) 20-3108-4068
> >
> > _______________________________________________
> > pocl-devel mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/pocl-devel
> >
>
> --
> Pekka

--
Timo Betcke
Professor of Computational Mathematics
University College London
Department of Mathematics
E-Mail: [email protected]
Tel.: +44 (0) 20-3108-4068
_______________________________________________ pocl-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/pocl-devel
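
Note on the alignment issue discussed in the thread: vmovaps with a 128-bit memory operand requires a 16-byte-aligned address, so a store can fault even when the pointer is valid. The sketch below is only an illustration of that arithmetic, not a diagnosis of the pocl bug. It models addresses as plain integers and lists which element indices would be misaligned for two cases: a float* reinterpreted as float4* (4-byte steps), and rows of three doubles (24-byte steps), which is one possible reading of the `lea rsi,[rdx+rdx*2]` followed by `vmovaps XMMWORD PTR [rdx+rsi*8],xmm12` pattern in the first trace. The base address 0x1000 and the helper names are hypothetical; the traces do not show the actual base pointers.

```python
# Sketch only: addresses are plain integers, no OpenCL or pocl involved.

XMM_ALIGNMENT = 16  # bytes required by vmovaps with a 128-bit memory operand


def is_aligned(address, alignment=XMM_ALIGNMENT):
    """Return True if address is a multiple of alignment."""
    return address % alignment == 0


def misaligned_indices(base, stride, count, alignment=XMM_ALIGNMENT):
    """Return the indices i in range(count) where base + i * stride is misaligned."""
    return [i for i in range(count)
            if not is_aligned(base + i * stride, alignment)]


# Hypothetical 16-byte-aligned base address (an assumption, not from the trace).
base = 0x1000

# float* viewed as float4*: consecutive floats are 4 bytes apart.
print(misaligned_indices(base, stride=4, count=8))   # [1, 2, 3, 5, 6, 7]

# Rows of three doubles: consecutive rows are 24 bytes apart.
print(misaligned_indices(base, stride=24, count=6))  # [1, 3, 5]
```

Under these assumptions every odd row of a 3-double layout starts 8 bytes off a 16-byte boundary, so an aligned vector store to such a row would fault while an unaligned one (vmovups/vmovupd) would not.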
