Siarhei Siamashka writes:

>> == a8r8g8b8 OVER r5g6b5 ==
>>
>> When OVER compositing the a8r8g8b8 pixel 0x0f00c300 with the x14r5g6b5
>
> Did you actually mean x14r6g6b6?

Yes, thanks.

>> pixel 0x03c0, the true floating point value of the resulting green
>> channel is:
>>
>>    0xc3 / 255.0 + (1.0 - 0x0f / 255.0) * (0x0f / 63.0) = 0.9887955
>>
>> but when compositing 8 bit values, where the 6-bit green channel is
>> converted to 8 bit through bit replication, the 8-bit result is:
>>
>>    0xc3 + ((255 - 0x0f) * 0x3c + 127) / 255 = 251
>>
>> which corresponds to a real value of 0.984314. The difference from the
>> true value is 0.004482 which is bigger than the acceptable deviation
>> of 0.004. So, if we were to compute all the CONJOINT/DISJOINT
>> operators in floating point, or otherwise make them more accurate, the
>> acceptable deviation could be set at 0.0045.
>>
>> If we were doing the 6-bit conversion with rounding:
>>
>>    (x / 63.0 * 255.0 + 0.5)
>>
>> instead of bit replication, the deviation in this particular case
>> would be only 0.0005, so we may want to consider this at some
>> point.
>
> This has been also discussed here:
>
>    http://comments.gmane.org/gmane.comp.graphics.pixman/1891
>
> Though the bit replication when converting to 8-bit is not so bad.
> Dropping lower bits when converting back introduces a bigger error.
>
> Anyway, if I remember correctly, the accuracy loss has been well known
> since the time when bitexact testing was introduced. Other than using
> less accurate but faster conversion approximations, currently there
> is also an assumption that separate "fetch -> combine -> store" steps
> must provide exactly the same results as the fast path functions doing
> the same operations in one go. This restriction surely inhibits
> performance and accuracy. Certain platforms (ARM11 and MIPS32) should
> be able to improve performance a bit if we go away from bitexact
> correctness testing and allow more freedom in implementations. So this
> patchset indeed looks rather useful.
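
The green-channel arithmetic quoted above can be reproduced with a
small C sketch. The helper names are hypothetical and not part of
pixman; the sketch assumes a 6-bit destination green value and uses
the same (x + 127) / 255 rounding as the quoted 8-bit formula.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these are pixman API. */

/* Expand a 6-bit channel to 8 bits by replicating its top bits. */
static inline uint8_t expand6_replicate (uint8_t v)
{
    assert (v < 64);
    return (uint8_t) ((v << 2) | (v >> 4));
}

/* Expand a 6-bit channel to 8 bits by scaling and rounding. */
static inline uint8_t expand6_round (uint8_t v)
{
    assert (v < 64);
    return (uint8_t) (v / 63.0 * 255.0 + 0.5);
}

/* One channel of OVER on 8-bit values: s + (255 - a) * d / 255.
 * The sum is returned unsaturated, as in the worked example. */
static inline unsigned over_8bit (uint8_t s, uint8_t a, uint8_t d8)
{
    return s + ((255u - a) * d8 + 127u) / 255u;
}

/* The same channel computed in floating point from the 6-bit value. */
static inline double over_exact (uint8_t s, uint8_t a, uint8_t d6)
{
    return s / 255.0 + (1.0 - a / 255.0) * (d6 / 63.0);
}
```

With s = 0xc3, a = 0x0f and a 6-bit destination green of 0x0f,
over_8bit() gives 251 after bit replication and 252 after rounding,
against an exact value of about 0.98880 (roughly 252.1 on the 8-bit
scale).
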
>
> However I think that we may need to come to an agreement on the primary
> purpose of the 8-bit pipeline, especially now that we also have a
> floating point pipeline. In my opinion, the 8-bit integer pipeline
> should always favour performance over accuracy in the case of doubt.

I agree that the primary purpose of the 8-bit pipeline is
performance. If performance didn't matter, we could just use floating
point for everything. But clearly we can't allow arbitrary deviation
from the exact computation, so the question has to be how much
deviation is acceptable.

> Moreover, anyone using r5g6b5 format is most likely either memory or
> performance constrained, so they would not particularly appreciate the
> more accurate, but slower conversion between a8r8g8b8 and r5g6b5.

It's not an academic discussion btw. If we add dithering, the
difference between shifting and rounding becomes very obvious. Here
are two images, both containing a gradient rendered three different
ways: once onto r5g6b5 without dithering, once onto a8r8g8b8 without
dithering, and once with dithering onto r5g6b5.

In the first image, bitshifting is used:

    http://people.freedesktop.org/~sandmann/dither-shift.png

In the second, rounding is used:

    http://people.freedesktop.org/~sandmann/dither-round.png

In the first image, there is an obvious darkening in the dithered
gradient. In the second, the difference is visible, but fairly
subtle. Even the undithered gradient, while ugly in both cases, is
rendered visibly more faithfully with rounding.

> There are also other libraries and alternative solutions out
> there. The competition between different mobile browsers and UI
> toolkits for the embedded systems seems to be heavily focused on
> performance. Every little bit is relevant.

Well, we could start doing division by 255 in this way:

    (a * b + 0xff) >> 8

The error is not that severe, and it would be a little bit faster than

    (t = a * b + 0x80, (t + (t >> 8)) >> 8).

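
The two division-by-255 formulas above can be compared exhaustively,
since there are only 256 * 256 input pairs. The following is a
minimal sketch with hypothetical helper names (not pixman API); it
takes round-to-nearest a * b / 255 as the reference, which is an
assumption about what counts as the exact result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names for illustration; not pixman API. */

/* Reference: a * b / 255 rounded to nearest, using a real division. */
static inline unsigned mul_div255_ref (uint8_t a, uint8_t b)
{
    return ((unsigned) a * b + 127u) / 255u;
}

/* The two-shift formula: t = a * b + 0x80, (t + (t >> 8)) >> 8. */
static inline unsigned mul_div255_two_shift (uint8_t a, uint8_t b)
{
    unsigned t = (unsigned) a * b + 0x80u;
    return (t + (t >> 8)) >> 8;
}

/* The cheaper single-shift formula: (a * b + 0xff) >> 8. */
static inline unsigned mul_div255_one_shift (uint8_t a, uint8_t b)
{
    return ((unsigned) a * b + 0xffu) >> 8;
}

/* Largest absolute difference between formula f and the reference,
 * taken over all 256 * 256 input pairs. */
static inline unsigned max_error (unsigned (*f) (uint8_t, uint8_t))
{
    unsigned worst = 0;

    for (unsigned a = 0; a < 256; a++)
    {
        for (unsigned b = 0; b < 256; b++)
        {
            unsigned r = mul_div255_ref ((uint8_t) a, (uint8_t) b);
            unsigned v = f ((uint8_t) a, (uint8_t) b);
            unsigned e = v > r ? v - r : r - v;

            if (e > worst)
                worst = e;
        }
    }
    return worst;
}
```

Under these assumptions, max_error (mul_div255_two_shift) is 0 and
max_error (mul_div255_one_shift) is 1. The single-shift form is off
by one for a = b = 1, for example, where it gives 1 rather than 0.
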
If we had started out doing divisions in the above way, would we now
be debating whether the additional shift instruction in the

    (t + (t >> 8)) >> 8

formula would be worth the higher precision?

The question I'm trying to answer is how much deviation should be
considered acceptable. The answer is unlikely to be: "We got it
precisely right back when the bitexact test suite was added",
especially since, as you pointed out, there are places where we could
improve both performance and accuracy.

That goes for r5g6b5 too btw. For the green channel in
over_8888_0565(), this:

    s + DIV_63 ((255 - a) * d)

would likely be both faster and more accurate than

    s + DIV_255 ((255 - a) * ((d << 2) | (d >> 4)))

> And while we are talking about this, bilinear interpolation precision
> is also somewhat related here (the choice of 7-bit vs. 4-bit) and
> whether we can avoid doing correct rounding for it or not.

To use the tolerance based tests as implemented by do_composite(), I
think both the reference and the test subject have to use the same
subsampling precision. (Dithered rounding could also be used here,
btw.)

> On the other hand, the floating point pipeline is a good place to
> implement sRGB, accurate format conversions and the other nice things.
> In other words, it can favour accuracy over performance.

In my view, the floating point pipeline should eventually implement
everything with high accuracy so that it can be used both as a
reference for a tolerance based test suite, and as a fallback for
operations that don't have fast paths. I have a start on that here:

    http://cgit.freedesktop.org/~sandmann/pixman/log/?h=float-imp

Trying to verify that that branch fixes the a2r10g10b10->a8r8g8b8
precision loss is what prompted this patch set and some upcoming fixes
for the PDF operators.


Søren
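
The over_8888_0565() remark about dividing by 63 directly can be
illustrated for the green channel with a short C sketch. DIV_63 and
DIV_255 are not defined in the message, so the lower-case helpers
below are hypothetical and assume round-to-nearest division; the
function names are likewise illustrative and not pixman API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics: integer division rounded to nearest. */
static inline unsigned div_63 (unsigned x)  { return (x + 31u) / 63u; }
static inline unsigned div_255 (unsigned x) { return (x + 127u) / 255u; }

/* Blend the 6-bit destination green directly: s + DIV_63 ((255 - a) * d). */
static inline unsigned over_green_direct (uint8_t s, uint8_t a, uint8_t d6)
{
    assert (d6 < 64);
    return s + div_63 ((255u - a) * d6);
}

/* Expand the destination to 8 bits by bit replication first, then
 * blend with a division by 255. */
static inline unsigned over_green_replicated (uint8_t s, uint8_t a, uint8_t d6)
{
    assert (d6 < 64);
    unsigned d8 = ((unsigned) d6 << 2) | (d6 >> 4);
    return s + div_255 ((255u - a) * d8);
}
```

For the values from the worked example earlier in the thread
(s = 0xc3, a = 0x0f, d = 0x0f), the direct form gives 252 and the
replicated form gives 251, where the exact result is about 252.1 on
the 8-bit scale. The sketch says nothing about speed, which would
have to be measured.
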
