> Are you referring to MIPS implementation of the following code? > > http://cgit.freedesktop.org/pixman/tree/pixman/pixman-fast-path.c?id=pixman- 0.29.2#n389
Yes. > Looks like a lot of changes for only adding a missing shift. Are you > really just fixing a single bug and not also introducing something > unrelated? Yes, it really does look like a huge change for couple of missing shifts. When I wrote this code in the first place, I misplaced those shifts, which allowed me to combine code for over operation and: UN8x4_MUL_UN8x4 (s, ma); UN8x4_MUL_UN8 (ma, srca); ma = ~ma; UN8x4_MUL_UN8x4_ADD_UN8x4 (d, ma, s); where shifts are not present (for ma). So I decided to rewrite that piece of code from scratch. I changed logic, so now assembly code mimic code from pixman-fast-path.c but process two pixels at a time. This code should be easier to debug and maintain. > Also appears that this is not the only problem in the MIPS DSPr2 > code. Using "test/fuzzer-find-diff.pl" script, I can reproduce one > more failure: I'll look into this, and upload separate patch with fix for this. Thanks, Nemanja Lukic -----Original Message----- From: Siarhei Siamashka [mailto:siarhei.siamas...@gmail.com] Sent: Sunday, March 03, 2013 3:42 AM To: Nemanja Lukic Cc: pixman@lists.freedesktop.org; Nemanja Lukic Subject: Re: [Pixman] [PATCH] MIPS: DSPr2: Fix bug in over_n_8888_8888_ca/over_n_8888_0565_ca routines On Fri, 1 Mar 2013 10:10:16 +0100 Nemanja Lukic <nlu...@mips.com> wrote: > From: Nemanja Lukic <nemanja.lu...@rt-rk.com> > > After introducing new PRNG (pseudorandom number generator) a bug in two DSPr2 > routines was revealed. Bug manifested by wrong calculation in composite and > glyph tests, which caused make check to fail for MIPS DSPr2 optimizations. Thanks for spotting and addressing this issue. The test suite has a relatively good coverage, but a chance for missing some bugs always exists. Increasing the number of iterations in random tests can reduce this probability, but makes the tests run longer. One of the reasons for introducing the new PRNG was the intention to improve performance. It was the biggest performance bottleneck for the blitters-test program (another bottleneck is CRC32 calculation there) as can be seen from the profiling logs included in the commit message: http://cgit.freedesktop.org/pixman/commit/?id=b31a696263f1ae9a Having less than 4 seconds to run blitters-test on a reasonably fast x86 hardware, we could increase the number of randomly tested compositing operations quite significantly and improve reliability. But unfortunately some slow hardware such as MIPS and ARM11 is holding us back. The MIPS build of blitters-test needs ~7.5 minutes to run on MIPS 74K 480MHz hardware or ~2.5 minutes in QEMU on Intel Core i7 860 (with a single thread). That's quite a lot. CRC32 can be still improved quite significantly by using a better split-by-four or split-by-eight implementation from zlib or xz (either by borrowing the code or by adding one of these libraries as an optional dependency). But that's more like just tens percents of overall performance improvement and not a magic solution. > Bug was in the calculation of the: > *dst = over (src, *dst) when ma == 0xffffffff Are you referring to MIPS implementation of the following code? http://cgit.freedesktop.org/pixman/tree/pixman/pixman-fast-path.c?id=pixman- 0.29.2#n389 Indeed, in order to test this branch in the blitters-test program, we need to encounter four consecutive 0xff bytes in the mask. The randomly generated images are already having more 0xff and 0x00 bytes. But maybe adding some code to increase the probability of getting large clusters of 0xff and 0x00 in the randomly generated images could improve the reliability of testing. > In this case src was not negated and shifted right by 24 bits, > it was only negated. Routines are rewritten, and now make check passes > for DPSr2 optimizations. Performance improvement remained the same as in > original commit. > > The bug was revealed in commit b31a6962. Errors were detected by composite > and glyph tests. > --- > pixman/pixman-mips-dspr2-asm.S | 298 ++++++++++++++++++++++----------------- > 1 files changed, 168 insertions(+), 130 deletions(-) Looks like a lot of changes for only adding a missing shift. Are you really just fixing a single bug and not also introducing something unrelated? Also appears that this is not the only problem in the MIPS DSPr2 code. Using "test/fuzzer-find-diff.pl" script, I can reproduce one more failure: op=PIXMAN_OP_IN src_fmt=a8r8g8b8, dst_fmt=a8, mask_fmt=null src_width=1, src_height=1, dst_width=124, dst_height=14 src_x=0, src_y=0, dst_x=4, dst_y=0 src_stride=12, dst_stride=128 w=13, h=12 4114763: checksum=023EF000 The problematic conditions can be reproduced by running: ./blitters-test 4114763 Which shows us that blitters-test would need to run 2-3x more iterations to detect this problem (right now it runs 2000000 tests). It is also a good idea to run fuzzer-find-diff.pl script in the "infinite" mode overnight from time to time, or at least once before releases. -- Best regards, Siarhei Siamashka _______________________________________________ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman