On Sat, 18 Apr 2020 15:45:32 GMT, Kevin Rushforth <k...@openjdk.org> wrote:
>> I discussed this with a graphics engineer. He said that a couple of branches
>> do not have any real performance impact even on modern mobile devices, and
>> that, e.g., on iOS 7 using half floats instead of floats was improving shader
>> execution dramatically. Desktops with NVIDIA or AMD and even modern Intel
>> cards can process dozens of branches with no significant performance
>> degradation. He actually suggested having all the light types in a single
>> shader file (looking ahead here). He also suggested not to permute on shaders
>> based on the number of lights and just pass in a uniform for that number and
>> loop over it. The permutations on the bump, specular and self-illumination
>> components are correct (not sure we are not doing that for the diffuse
>> component). If we add shadows later, which is not on my near-term to-do list,
>> then we should permute there. It also depends on our target hardware. If we
>> take into account hardware from, say, 2005 then maybe branching will cause
>> significant performance loss, but that hinders our ability to increase
>> performance for newer hardware. What is the policy here? I have a Win10
>> laptop with a GeForce 610M that I will test this weekend to see if the mobile
>> NVIDIA cards have some issue.
>
> I think most of those are good suggestions going forward. As for the
> performance drop, the only place we've seen it so far is on graphics
> accelerators that are a few years old by now. Integrated graphics chipsets
> (such as Intel HD), whether old or new, seem largely unaffected by the shader
> changes. What we are missing is performance metrics from newer graphics
> accelerators on Mac and Windows. Even with the performance drop on older
> graphics devices, I'm leaning towards not having the shaders doubled, since
> this is an artificial stress test with huge quads. If we could get performance
> data from a couple more recent graphics accelerators that would be best.
Here is a slightly modified test program. It fixes a compilation error in the previous version and adds a system property to set the number of quads. It creates 200 quads by default. If you need to increase or decrease this to get something in the ~10 fps range, you can do so with `-DnumQuads=NNNN`.

[pointlighttest.zip](https://github.com/openjdk/jfx/files/4526179/pointlighttest.zip)

-------------

PR: https://git.openjdk.java.net/jfx/pull/43
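
For anyone adapting the test, here is a minimal sketch of how a `numQuads` system property with a default of 200 could be read in Java. It is not taken from the attached test program: the class name `NumQuadsProperty` and the method `readNumQuads` are hypothetical, and falling back to the default for non-positive values is an assumption. Only `Integer.getInteger` is a standard JDK call.

```java
// Hypothetical helper; not the code from pointlighttest.zip.
class NumQuadsProperty {
    // Default taken from the message above: 200 quads.
    static final int DEFAULT_NUM_QUADS = 200;

    // Reads -DnumQuads=NNNN. Integer.getInteger returns the default
    // when the property is missing or is not a valid integer.
    static int readNumQuads() {
        int n = Integer.getInteger("numQuads", DEFAULT_NUM_QUADS);
        // Assumption: treat zero or negative values as "use the default".
        return n > 0 ? n : DEFAULT_NUM_QUADS;
    }

    public static void main(String[] args) {
        System.out.println("numQuads = " + readNumQuads());
    }
}
```

With this sketch, something like `java -DnumQuads=500 NumQuadsProperty` would pick up the override, while running without the flag would use the default.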