Hi,

I have opened an issue and summarised the discussion so far in

https://github.com/pocl/pocl/issues/701

Best wishes

Timo
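
Note on the first crash site quoted further down (a sketch, not a
diagnosis): there, `lea rsi,[rdx+rdx*2]` computes 3*i and the faulting
store is `vmovaps XMMWORD PTR [rdx+rsi*8],xmm12`, so the store goes to
base + 24*i bytes. For an XMMWORD memory operand, vmovaps requires a
16-byte aligned address. Assuming the rows are three doubles wide and the
base pointer itself is 16-byte aligned (both are assumptions, and the base
address 0x1000 below is made up), every odd row would be misaligned:

```python
XMM_ALIGNMENT = 16   # bytes required by vmovaps for an XMMWORD memory operand
ROW_BYTES = 3 * 8    # assumption: rows of three 8-byte doubles (index 3*i, scaled by 8)


def is_aligned(address: int, alignment: int = XMM_ALIGNMENT) -> bool:
    """Return True if address is a multiple of alignment."""
    return address % alignment == 0


def misaligned_rows(base: int, n_rows: int) -> list:
    """Indices i for which base + i * ROW_BYTES is not 16-byte aligned."""
    return [i for i in range(n_rows) if not is_aligned(base + i * ROW_BYTES)]


base = 0x1000  # hypothetical, 16-byte aligned base address
print(misaligned_rows(base, 4))
```

Whether this is what actually happens in the code pocl generates is not
established in this thread; the sketch only shows why a 24-byte stride and
aligned 16-byte stores do not combine safely.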


On Thu, 14 Mar 2019 at 10:20, Pekka Jääskeläinen (TAU)
<[email protected]> wrote:

> Hi Timo,
>
> Can you please open an issue for this? It's easier to track
> on GitHub.
>
> Thanks,
> Pekka
>
> On 14.3.2019 1.49, Timo Betcke wrote:
> > Hi,
> >
> > I have pinned down the next failed test. It still seems related to the
> > multi-indexing even with your bugfixed version. The corresponding gist
> > is here:
> >
> > https://gist.github.com/tbetcke/0bf7e12a2f3ab8032339cc38b8441b6e
> >
> > At the end of the kernel all entries in shapeIntegral should have the
> > value 1.0. However, while shapeIntegral[0][0] is correct,
> > shapeIntegral[1][0] is not.
> > If I move the second print statement for shapeIntegral[1][0] into the
> > for loop, the variables are correctly updated.
> >
> > Some context: the actual kernel from which this example is derived
> > computes a finite element integral on a triangle. The test values are
> > from the test space and the trial values from the domain space. Via C
> > macros I adapt the dimensions of the arrays to the actual number of
> > test and trial functions. The crash happens for trial dimension 1 and
> > test dimension 3.
> >
> > Thanks again for your help. I am excited about getting Pocl to work with
> > our software.
> >
> > Best wishes
> >
> > Timo
> >
> >
> > On Wed, 13 Mar 2019 at 23:23, Timo Betcke <[email protected]> wrote:
> >
> >     Hi Michal,
> >
> >     thanks for the bugfix. The crashes have now disappeared and more
> >     tests are passing with your bugfix version. However, several unit
> >     tests that work with AMD and Intel still fail. Briefly looking at
> >     the results, I see lots of NaN entries in the pocl output. I will try
> >     to pin this down further and then report back to you.
> >
> >     Best wishes
> >
> >     Timo
> >
> >     On Mon, 11 Mar 2019 at 10:50, Michal Babej (TAU)
> >     <[email protected]> wrote:
> >
> >         Hello,
> >
> >
> >         I remember trying to fix this bug last year, but then I got
> >         sidetracked by other things. (BTW, it would be preferable if you
> >         reported bugs as GitHub issues in the future.)
> >
> >
> >         Anyway, I've hopefully fixed it. Can you test your program with
> >         the master branch from https://github.com/franz/pocl
> >
> >
> >         Regards,
> >
> >         -- mb
> >
> >
> >         ------------------------------------------------------------------------
> >         *From:* Timo Betcke <[email protected]>
> >         *Sent:* Friday, March 8, 2019 3:48:34 AM
> >         *To:* Portable Computing Language development discussion
> >         *Subject:* Re: [pocl-devel] POCL Crash in vmovaps operation
> >         Dear Pekka,
> >
> >         I have now cooked up a small example that crashes in vmovaps.
> >         The gist is available here (uses PyOpenCL to run):
> >
> >         https://gist.github.com/tbetcke/b4da01465b587e85cc88801aafdced0a
> >
> >         The example is fairly nonsensical and was derived by reducing a
> >         crashing kernel as far as possible while retaining the crash.
> >         It runs fine under Intel CPU OpenCL on a Xeon and ROCm OpenCL on
> >         an AMD GPU. My platform is Ubuntu 18.04 with LLVM 6. If necessary
> >         I can create an environment with an updated LLVM, but would like
> >         to avoid it (unless the crash is LLVM 6 related). Pocl is the most
> >         recent git master.
> >
> >         The code crashes at the following assembler instructions:
> >
> >             0x00007fffe02575e3 <+195>:   xor    r9d,r9d
> >             0x00007fffe02575e6 <+198>:   xor    r10d,r10d
> >             0x00007fffe02575e9 <+201>:   nop    DWORD PTR [rax+0x0]
> >             0x00007fffe02575f0 <+208>:   mov    QWORD PTR [rdx+r9*1],0x0
> >          => 0x00007fffe02575f8 <+216>:   vmovaps XMMWORD PTR [rdi+r9*1-0x10],xmm0
> >             0x00007fffe02575ff <+223>:   mov    QWORD PTR [rdi+r9*1],0x0
> >             0x00007fffe0257607 <+231>:   vmovaps XMMWORD PTR [rdx+r9*1-0x10],xmm0
> >             0x00007fffe025760e <+238>:   vmovupd xmm1,XMMWORD PTR [rdi+r9*1-0x8]
> >             0x00007fffe0257615 <+245>:   vaddpd xmm1,xmm1,XMMWORD PTR [rdx+r9*1-0x8]
> >             0x00007fffe025761c <+252>:   vmovupd XMMWORD PTR [rdx+r9*1-0x8],xmm1
> >             0x00007fffe0257623 <+259>:   mov    r8,r11
> >             0x00007fffe0257626 <+262>:   sar    r8,0x20
> >             0x00007fffe025762a <+266>:   lea    rsi,[r8+r8*2]
> >
> >         Removing any of the for loops or the localResult variable (or
> >         removing its __local attribute) leads to the kernel working on
> >         Pocl. It would be great to get to the source of this. Please let
> >         me know if you need more information from me.
> >
> >         Best wishes
> >
> >         Timo
> >
> >
> >         On Wed, 6 Mar 2019 at 21:21, Timo Betcke <[email protected]> wrote:
> >
> >             Hi Pekka,
> >
> >             thanks for your hints and the link. I had one buffer in the
> >             kernel call that had a cast from a float type to a vector
> >             type. I have fixed this, but the segfault remains. In the
> >             next few days I will try to cook up a simple example that
> >             produces the segfault. Fortunately, the kernel itself is not
> >             too complicated, so I should be able to reduce it.
> >
> >             Best wishes
> >
> >             Timo
> >
> >             On Wed, 6 Mar 2019 at 10:20, Pekka Jääskeläinen (TAU)
> >             <[email protected]> wrote:
> >
> >                 Yes, now that I look at it more closely,
> >                 your stack trace looks _very_ much like the common data
> >                 alignment issues people have. I think this might be worth
> >                 a FAQ item somewhere.
> >
> >
> >                 https://stackoverflow.com/questions/5983389/how-to-align-stack-at-32-byte-boundary-in-gcc
> >
> >                 On 6.3.2019 8.45, Pekka Jääskeläinen (TAU) wrote:
> >                  > Hi Timo,
> >                  >
> >                  > Shooting in the dark here, but just yesterday I debugged
> >                  > a similar-looking issue, which was caused by an illegal
> >                  > cast in the source code from float* to float4*. The code
> >                  > trusted that the alignment was still fine, which it wasn't
> >                  > after vectorization. A very target-specific programming
> >                  > error which many OpenCL targets can easily hide.
> >                  >
> >                  > If this is something else, we need a test case, the
> >                  > smaller the better, to help you here.
> >                  > Before opening an issue though, please test with the
> >                  > latest master and LLVM 8.
> >                  > Pekka
> >                  >
> >                  >
> >
> >                  > ------------------------------------------------------------------------
> >                  > *From:* Timo Betcke <[email protected]>
> >                  > *Sent:* Tuesday, March 5, 2019 11:27:12 PM
> >                  > *To:* Portable Computing Language development discussion
> >                  > *Subject:* [pocl-devel] POCL Crash in vmovaps operation
> >                  > Dear Pocl community,
> >                  >
> >                  > I was just testing the newest Pocl version (GitHub
> >                  > master branch) with our software. During execution of
> >                  > one of our kernels Pocl crashed. Disassembling the crash
> >                  > shows the following instructions around the crash site:
> >                  >
> >                  > ------------------
> >                  >     0x00007fffb81efdd8 <+664>:   vmulpd xmm2,xmm2,xmm6
> >                  >     0x00007fffb81efddc <+668>:   vsubpd xmm2,xmm5,xmm2
> >                  >     0x00007fffb81efde0 <+672>:   vpermilpd xmm5,xmm4,0x1
> >                  >     0x00007fffb81efde6 <+678>:   vmulsd xmm3,xmm3,xmm5
> >                  >     0x00007fffb81efdea <+682>:   vmulsd xmm4,xmm15,xmm4
> >                  >     0x00007fffb81efdee <+686>:   vsubsd xmm3,xmm3,xmm4
> >                  >     0x00007fffb81efdf2 <+690>:   vpermilpd xmm1,xmm1,0x1
> >                  >     0x00007fffb81efdf8 <+696>:   vmulpd xmm0,xmm0,xmm1
> >                  >     0x00007fffb81efdfc <+700>:   vpermilpd xmm1,xmm0,0x1
> >                  >     0x00007fffb81efe02 <+706>:   vsubsd xmm0,xmm0,xmm1
> >                  >     0x00007fffb81efe06 <+710>:   lea    rsi,[rdx+rdx*2]
> >                  >     0x00007fffb81efe0a <+714>:   mov    rdx,QWORD PTR [rbx+0x38]
> >                  >  => 0x00007fffb81efe0e <+718>:   vmovaps XMMWORD PTR [rdx+rsi*8],xmm12
> >                  >     0x00007fffb81efe13 <+723>:   mov    QWORD PTR [rbx+0x40],rsi
> >                  >     0x00007fffb81efe17 <+727>:   mov    QWORD PTR [rdx+rsi*8+0x10],0x0
> >                  >     0x00007fffb81efe20 <+736>:   vinsertf32x4 ymm1,ymm16,xmm0,0x1
> >                  > -----------------------------
> >                  > This seems to be similar to a bug that I discussed a
> >                  > year ago on the mailing list. See the thread here:
> >                  >
> >                  > https://www.mail-archive.com/[email protected]/msg01087.html
> >                  >
> >                  > In summary, the issue was related to us using arrays
> >                  > of arrays within our kernels and pocl generating wrong
> >                  > code for them.
> >                  >
> >                  > At that time a gist was suggested for Pocl, which I
> >                  > tested, but it did not improve things. Afterwards I let
> >                  > it drop for a while, as we were in early development and
> >                  > had plenty of other work in progress. But our software is
> >                  > now close to release-ready and it would be great to get
> >                  > it working with pocl.
> >                  >
> >                  > Any help would be greatly appreciated.
> >                  > Best wishes
> >                  >
> >                  > Timo
> >                  >
> >                  > --
> >                  > Timo Betcke
> >                  > Professor of Computational Mathematics
> >                  > University College London
> >                  > Department of Mathematics
> >                  > E-Mail: [email protected]
> >                 <mailto:[email protected]> <mailto:[email protected]
> >                 <mailto:[email protected]>>
> >                  > Tel.: +44 (0) 20-3108-4068
> >                  >
> >                  >
> >                  > _______________________________________________
> >                  > pocl-devel mailing list
> >                  > [email protected]
> >                  > https://lists.sourceforge.net/lists/listinfo/pocl-devel
> >                  >
> >
> >                 --
> >                 Pekka
> >
> >
>
> --
> Pekka
>
>


-- 
Timo Betcke
Professor of Computational Mathematics
University College London
Department of Mathematics
E-Mail: [email protected]
Tel.: +44 (0) 20-3108-4068
_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel
