> -----Original Message-----
> From: Kalle Raiskila [mailto:[email protected]]
> Sent: 08 October 2014 07:02
> To: Portable Computing Language development discussion
> Subject: Re: [pocl-devel] shuffle() and shuffle2() swaps M and N?
>
> Hi,
>
> On 07.10.2014 17:53, ext Daniel Sanders wrote:
> > Hi,
> >
> > I'm debugging a problem with the shuffle() and shuffle2() functions on
> > MIPS. I think I've found (part of) the problem but if I'm correct, then
> > I don't see how it can be working for other targets.
>
> The macros that make up the kernel are admittedly a bit twisty, and
> shuffle is no exception. I've racked my brain several times on these.
>
> > Starting with test/kernel/test_shuffle.cc, I've reduced the testcase down
> > to n=2 and m=16 and changed stimuli in testcase() so that it only contains
> > 13's. My testcase produces this output:
> >   Error in shuffle short 16 => short 2 :[1, 1] = shuffle( [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], [13, 13]);
> >   element 0 should be 13 (mask 13), got 1
> >   element 1 should be 13 (mask 13), got 1
> >   Error in shuffle2 short 16 => short 2 :[1, 1] = shuffle2( [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], [13, 13]);
> > As you can see, the mask is selecting element 1 instead of the 13 that was
> > expected.
>
> Ok, this is clearly an incorrect output.
>
> > lib/kernel/shuffle.cl contains the following code:
> > #define _CL_IMPLEMENT_SHUFFLE(ELTYPE, MTYPE, N, M)              \
> >   ELTYPE##N __attribute__ ((overloadable))                      \
> >   shuffle(ELTYPE##M in, MTYPE##N mask)                          \
> >   {                                                             \
> >     MTYPE msize = M==3 ? 4 : M;                                 \
> >     ELTYPE##N out;                                              \
> >     for (int i=0; i<N; ++i) {                                   \
> >       MTYPE m = mask[i] & (msize-1);                            \
> >       out[i] = in[m];                                           \
> >     }                                                           \
> >     return out;                                                 \
> >   }
> >
> > #define _CL_IMPLEMENT_SHUFFLE_M(ELTYPE, MTYPE, M)               \
> >   _CL_IMPLEMENT_SHUFFLE(ELTYPE, MTYPE, M, 2)                    \
> >   ...
> >
> > Hasn't this swapped M and N so that the mask is being limited to 0-N (0-1)
> > instead of 0-M (0-15)?
>
> The naming here follows the OCL spec - Out and Mask are vectors of size
> (0-N), the inputs are of size (0-M). The content of mask values must be
> limited to the element count of the inputs.
>
> Looking at the above - the variable M in _CL_IMPLEMENT_SHUFFLE_M *might*
> be poorly named - N might be a better name here... But that should not
> affect the functionality (?), as all permutations of N and M are generated.
>
> Or is there some other thing I overlooked?
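For reference, the arithmetic behind the reported symptom can be modelled in plain C. This is a minimal sketch and not pocl code: shuffle_model, reduced_case and the bound parameter are hypothetical names, and plain arrays stand in for OpenCL vectors. It only shows what the quoted loop computes when the mask is limited by the input size (16) versus by the output size (2).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of the shuffle() loop quoted above.
 * in:    input elements (4 slots must be readable when bound == 3)
 * mask:  n indices; out: n results
 * bound: element count used to limit the mask; the macro uses M here. */
static void shuffle_model(const short *in, const unsigned short *mask,
                          short *out, size_t n, size_t bound)
{
    size_t msize = (bound == 3) ? 4 : bound;          /* same rounding as the macro */
    assert(msize != 0 && (msize & (msize - 1)) == 0); /* masking needs a power of two */
    for (size_t i = 0; i < n; ++i) {
        size_t idx = mask[i] & (msize - 1);           /* keep only the low index bits */
        out[i] = in[idx];
    }
}

/* Runs the reduced testcase (m = 16, n = 2, mask = [13, 13]) and returns
 * element 0 of the result for the given bound. */
static short reduced_case(size_t bound)
{
    short in[16];
    for (int i = 0; i < 16; ++i)
        in[i] = (short)i;
    const unsigned short mask[2] = {13, 13};
    short out[2] = {0, 0};
    shuffle_model(in, mask, out, 2, bound);
    return out[0];
}
```

With the bound set to 16, 13 & 15 is 13 and reduced_case(16) returns 13. With the bound set to 2, 13 & 1 is 1 and reduced_case(2) returns 1, which matches the "got 1" in the failing output. That is only consistent with the symptom; it says nothing about where a wrong bound would come from.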
Hmm, I agree. All the permutations are generated and the ones I've expanded
manually are doing the right thing for their prototype. I can't explain why
swapping the arguments (and fixing a vector alignment bug in LLVM) made my
tests pass so I'll keep digging. Thanks.

> Note also, that the spec doesn't define shuffle for vectors of 3 - this
> extension went in unnoticed, as some other OCL implementation had it too.
>
> kalle
>
> _______________________________________________
> pocl-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/pocl-devel
