Hi Roland!

On 4/11/19 8:18 PM, Roland Scheidegger wrote:
What version of mesa are you using?
The original results were generated using version 19.0.2 (from the Arch Linux repositories), but I got the same results using the current git version (98934e6aa19795072a353dae6020dafadc76a1e3).
The debug flags were changed a while ago (so that those perf tweaks can
be disabled on release builds too), it needs to be either:
GALLIVM_PERF=no_rho_approx,no_brilinear,no_quad_lod
or easier
GALLIVM_PERF=no_filter_hacks (which disables these 3 things above together)

Although all of that only really affects filtering with mipmaps (not
sure if you do?).
Using GALLIVM_PERF does not make a difference either, but that is to be expected because I'm not using mipmaps, just "regular" linear filtering (GL_LINEAR).
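
For anyone who wants to script such a run (for example in CI), here is a minimal Python sketch. The helper name `run_with_gallivm_perf` is hypothetical, and it assumes a Mesa build in which LIBGL_ALWAYS_SOFTWARE=1 selects llvmpipe as the software rasterizer. The child process below is only a stand-in for a real test binary.

```python
import os
import subprocess
import sys

def run_with_gallivm_perf(command, gallivm_perf="no_filter_hacks"):
    """Run `command` with software rendering and the given GALLIVM_PERF flags."""
    env = dict(os.environ)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"   # ask Mesa for a software rasterizer
    env["GALLIVM_PERF"] = gallivm_perf   # flags as quoted in the mail above
    return subprocess.run(command, env=env, capture_output=True, text=True)

# Stand-in for the real test binary: a child process that echoes the variable.
result = run_with_gallivm_perf(
    [sys.executable, "-c", "import os; print(os.environ['GALLIVM_PERF'])"])
print(result.stdout.strip())
```

Passing "no_rho_approx,no_brilinear,no_quad_lod" as `gallivm_perf` would set the three flags individually instead.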


(more below)
See my responses below as well.


Am 11.04.19 um 18:00 schrieb Dominik Drees:
Running with the suggested flags in the environment does not change the
result for the test case I described below. The results with and without
the environment variables set are pixel-wise equal.

By the way, in case this is of interest: For GL_NEAREST sampling the results
from hardware and llvmpipe are equal as well.

Best,
Dominik

On 4/11/19 4:36 PM, Ilia Mirkin wrote:
llvmpipe takes a number of shortcuts in the interest of speed which
cause inaccurate texturing. Try running with

GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod

and see if the issue still occurs.

Cheers,

    -ilia



On Thu, Apr 11, 2019 at 8:30 AM Dominik Drees <dominik.dr...@wwu.de>
wrote:

Hello, everyone!

I have a question regarding the interpolation precision of llvmpipe.
Feel free to redirect me to somewhere else if this is not the right
place to ask. Consider the following scenario: In a fragment shader we
are sampling from a 16x16, 8 bit texture with values between 0 and 3
using linear interpolation. Then we write white to the screen if the
sampled value is > 1/255 and black otherwise. The output looks very
different when rendered with llvmpipe compared to the result produced by
rendering hardware (for both intel (mesa i965) and nvidia (proprietary
driver)).

I've uploaded example output images here
(https://imgur.com/a/D1udpez)

and the corresponding fragment shader here
(https://pastebin.com/pa808Req).
The shader looks iffy to me, how do you use that vec4 in the if clause?



My hypothesis is that llvmpipe (in contrast to hardware) only uses 8 bit
for the interpolation computation when reading from 8 bit textures and
thus loses precision in the lower bits. Is that correct? If so, does
anyone know of a workaround?

So, in theory it is indeed possible the results are less accurate with
llvmpipe (I believe all recent hw does rgba8 filtering with more than 8
bit precision).
For formats fitting into rgba8, we have a fast path in llvmpipe
(gallivm) for the lerp, which unpacks the 8bit values into 16bit values,
does the lerp with that and packs back to 8 bit. The result is
accurately rounded there (to 8 bit) but only for 1 lerp step - for a 2d
texture there are 3 of those (one per direction, and a final one
combining the result). And yes this means the filtered result only has 8
bits.
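
To make the rounding effect concrete, here is a minimal Python sketch of the scenario from the original question. It is a simplified model and not the gallivm code: `lerp8` (a hypothetical helper) rounds every lerp result to the nearest integer, standing in for the pack back to 8 bit, while `bilinear_float` keeps full precision. The texel values and weights are made up for illustration.

```python
import math

def lerp(a, b, w):
    """Plain linear interpolation in floating point."""
    return a + (b - a) * w

def lerp8(a, b, w):
    """Lerp whose result is rounded to the nearest integer (8-bit texel units)."""
    return math.floor(lerp(a, b, w) + 0.5)

def bilinear_float(t00, t10, t01, t11, wx, wy):
    """Two lerps along x, one along y, all in full precision."""
    return lerp(lerp(t00, t10, wx), lerp(t01, t11, wx), wy)

def bilinear_8bit(t00, t10, t01, t11, wx, wy):
    """Same three lerps, but each result is rounded back to 8 bit."""
    return lerp8(lerp8(t00, t10, wx), lerp8(t01, t11, wx), wy)

def shade(value):
    """The test from the shader description: white if the sample is > 1/255."""
    return "white" if value / 255.0 > 1.0 / 255.0 else "black"

# Texel values in 8-bit units (0..3), as in the 16x16 test texture.
texels = (0, 3, 0, 3)
wx, wy = 0.4, 0.5

exact = bilinear_float(*texels, wx, wy)   # stays slightly above 1
rounded = bilinear_8bit(*texels, wx, wy)  # intermediate results are integers

print(shade(exact), shade(rounded))
```

In this model the float result lies above the 1/255 threshold while the rounded one lands exactly on it, so the same sample position is shaded differently.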
Do I understand you correctly in that for the 2D case, the results of the first two lerps (done in 16 bit) are converted to 8 bit, then converted back to 16 bit for the final (second stage) lerp?

If so, and if I'm understanding this correctly, for 2D (i.e., a 2-stage linear interpolation) we potentially have an error on the order of one bit in the final 8 bit value due to the intermediate 16->8->16 conversion. For sampling from a 3D texture (i.e., a 3-stage linear interpolation) the effect would be amplified: The extra stage could cause an error with a magnitude of two bits in the final 8 bit result (if I'm doing the math in my head correctly).
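
A rough way to check this estimate is the brute-force Python sketch below. It assumes a simplified model (round-to-nearest after every stage, exact float weights) rather than the fixed-point arithmetic gallivm actually uses, and all names in it are hypothetical. In this model each stage can add at most half an LSB, so the error against the unrounded result is bounded by 0.5 LSB per stage: 1 LSB for 2D and 1.5 LSB for 3D.

```python
import itertools
import math

def lerp(a, b, w):
    """Plain linear interpolation in floating point."""
    return a + (b - a) * w

def round8(x):
    """Round to the nearest integer, standing in for the pack to 8 bit."""
    return math.floor(x + 0.5)

def filter_chain(texels, weights, quantize):
    """Reduce 2**n texels with n lerp stages, one weight per stage.

    With quantize=True every lerp result is rounded (8-bit intermediates);
    with quantize=False everything stays in floating point.
    """
    values = list(texels)
    for w in weights:
        values = [lerp(values[i], values[i + 1], w)
                  for i in range(0, len(values), 2)]
        if quantize:
            values = [round8(v) for v in values]
    return values[0]

def max_error(stages, texel_values=(0, 3), steps=4):
    """Largest |quantized - float| over all texel combinations and a weight grid."""
    grid = [i / steps for i in range(steps + 1)]
    worst = 0.0
    for texels in itertools.product(texel_values, repeat=2 ** stages):
        for weights in itertools.product(grid, repeat=stages):
            err = abs(filter_chain(texels, weights, True)
                      - filter_chain(texels, weights, False))
            worst = max(worst, err)
    return worst

for stages in (1, 2, 3):
    print(stages, "stage(s): max error", max_error(stages), "LSB")
```

The search only covers texel values 0 and 3 and a coarse weight grid, so it gives a lower estimate of the worst case within the stated bound, not a proof.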

Is there any (conceptual) reason why the result of a one dimensional interpolation step is reduced back to 8 bits before the second stage interpolation? Would avoiding these conversions not actually be faster (in addition to the improved accuracy)?

I do believe you should not rely on implementations having more accuracy
- as far as I know the filtering we do is conformant there (it is tricky
to do better using the fast path).
In principle you are correct. In our regression tests we actually have (per test) configurable thresholds for the maximum pixel distance, the maximum number of differing pixels, the neighborhood search radius, etc. We could just increase these thresholds, but we would risk missing regressions that (for example) only affect a very small portion of the screen. For the larger part of our test suite llvmpipe actually works quite well within the established limits. For some other cases where we render a relatively small 8 bit 3D volume, the differences far exceeded the previously set thresholds and were quite visible to the naked eye.
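
To illustrate what such thresholds can look like, here is a small Python sketch of a tolerant image comparison. It is not the Voreen implementation: the function names, the grayscale list-of-rows image format, and the exact semantics of the three thresholds are assumptions made for this example.

```python
def count_differing_pixels(reference, actual, max_distance, radius):
    """Count pixels of `actual` with no close match near the same position.

    A pixel matches if some reference pixel within `radius` (in x and y)
    differs from it by at most `max_distance`.
    """
    height, width = len(reference), len(reference[0])
    differing = 0
    for y in range(height):
        for x in range(width):
            neighbours = (
                reference[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            )
            if min(abs(actual[y][x] - r) for r in neighbours) > max_distance:
                differing += 1
    return differing

def images_match(reference, actual, max_distance, max_differing, radius=0):
    """True if at most `max_differing` pixels fail the tolerant comparison."""
    count = count_differing_pixels(reference, actual, max_distance, radius)
    return count <= max_differing

reference = [[0, 0, 255],
             [0, 0, 255]]
actual = [[0, 1, 255],
          [0, 255, 255]]

print(images_match(reference, actual, max_distance=1, max_differing=0))
print(images_match(reference, actual, max_distance=1, max_differing=0, radius=1))
```

Loosening any of the three parameters hides small differences like the ones above, which is exactly why raising them globally risks masking real regressions.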


There would be code to actually do filtering with full float precision,
although there's no way to reach it with rgba8 formats unless you change
the code. If you want to try out the theory, look at
lp_bld_sample_soa.c: lp_build_sample_soa_code() determines whether to
use the fast (aos) filtering path (use_aos, determined mostly by
util_format_fits_8unorm()). If you set this to false it will use the
full float filtering path. (FWIW I was actually thinking a while ago we
should force this path when there's only 1 channel, albeit I never got
around to testing (benchmarking) it. This is because the AoS filtering
path is really optimized for rgba8 formats, and if you only have 1 channel
it's quite possible float filtering is actually faster, since this
handles the channels individually.)
I guess though if the full float precision filtering is useful in
general, we could add that to GALLIVM_PERF.
Forcing float precision indeed fixes the test case described below and our volume rendering regression tests! If this cannot be fixed in general I would be very happy about an option to force float precision via GALLIVM_PERF. FWIW, with forced float precision our test suite actually runs faster (~6 minutes) than on "stock" master (~6:40), but these timings may be highly biased, of course.

Best,
Dominik

Roland





A little bit of background about the use case: We are trying to move the
CI of Voreen (https://www.uni-muenster.de/Voreen/) to the Gitlab-CI
running in docker without any hardware dependencies. Using llvmpipe for
our regression tests works in principle, but shows significant
differences in the raycasting rendering of an 8-bit-per-voxel dataset.
(The effect is of course less visible than the constructed example case
linked above, but still quite noticeable for a human.)

Any help or pointers would be appreciated!

Best,
Dominik

--
Dominik Drees

Department of Computer Science
Westfaelische Wilhelms-Universitaet Muenster

email: dominik.dr...@wwu.de
web: https://www.wwu.de/PRIA/personen/drees.shtml

phone: +49 251 83 - 38448

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev






--
Dominik Drees

Department of Computer Science
Westfaelische Wilhelms-Universitaet Muenster

email: dominik.dr...@wwu.de
web: https://www.wwu.de/PRIA/personen/drees.shtml
phone: +49 251 83 - 38448
