Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
Have you considered turning all inline functions into macros, so that the compiler doesn't have to inline them?

Marek

On Fri, Sep 12, 2014 at 12:58 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
> Yeah, my recommendation would be hacking the macros to not unroll and keep the patch locally. If you've got a better idea as to how to organize the code so the compiler likes it, I'm open as long as we don't lose performance.
>
> --Jason

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On 09/11/2014 04:58 PM, Jason Ekstrand wrote:
> Yeah, my recommendation would be hacking the macros to not unroll and keep the patch locally. If you've got a better idea as to how to organize the code so the compiler likes it, I'm open as long as we don't lose performance.

It looks like a release build with MSVC is taking quite a while to compile this file too (actually at link time, when the optimizer kicks in). But even on my fast Linux system with gcc, the difference in compile time between -O0 and -O3 is pretty big (2 seconds vs. 1 minute, 3 seconds).

I'm still prototyping something, but it looks like breaking the top-level switch cases in _mesa_swizzle_and_convert() into separate functions reduces the time quite a bit. Let me pursue that a bit further and see how it goes...

-Brian
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
Forgot to reply-all.

On Sep 12, 2014 9:05 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
> The teximage-colors test that I pushed to piglit a week or two ago takes a --benchmark flag that bumps the texture size and does the upload 1000 times and gives you the average time to upload.
>
> --Jason
>
> On Sep 12, 2014 9:01 AM, Brian Paul bri...@vmware.com wrote:
>> On 09/12/2014 08:49 AM, Jason Ekstrand wrote:
>>> On Sep 12, 2014 7:09 AM, Brian Paul bri...@vmware.com wrote:
>>>> It looks like a release build with MSVC is taking quite a while to compile this file too (actually at link time when the optimizer kicks in). But even on my fast Linux system with gcc, the difference in compile time between -O0 and -O3 is pretty big (2 seconds vs. 1
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On 09/12/2014 08:09 AM, Brian Paul wrote:
> I'm still prototyping something but it looks like breaking the top-level switch cases in _mesa_swizzle_and_convert() into separate functions reduces the time quite a bit. Let me pursue that a bit further and see how it goes...

OK, I'm posting a couple patches:

mesa: break up _mesa_swizzle_and_convert() to reduce compile time

This reduces -O3 compile time with gcc to 1/4 of what it was. Seems to reduce compile time with MSVC too, but I haven't really measured it. Dieter, can you test this patch on your system?

mesa: move i, j var decls into SWIZZLE_CONVERT_LOOP() macro

I think the optimizer can sometimes do a better job when loop variables are declared per-loop, rather than declared per-function. But this patch increases the size of the .o file from 2556528 to 2933216 bytes (15%). Jason, if you have a benchmark to measure the speed of this code, I'd be interested to know if this patch helps much.

Moving the declaration of 'j'
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On 15.08.2014 04:50, Jason Ekstrand wrote:
> If we'd like, the way the macros are set up, it would be easy to change it so that we do less unrolling in the cases where we are actually doing substantial format conversion and wouldn't notice the extra logic quite as much. I'll play with it a bit tomorrow or next week and see how much of a hit we would actually take if we unrolled a little less in places.
>
> --Jason Ekstrand

Ping.

In a second it took 11+ minutes, here...

Thanks!

Dieter
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On Thu, Sep 11, 2014 at 2:55 PM, Dieter Nützel die...@nuetzel-hh.de wrote:
> Ping.
>
> In a second it took 11+ minutes, here...

11 minutes! What system are you running? And are you using -O3 or something?

Yes, we can do something to cut it down, but it will probably require a configure flag; the question is what flag.

--Jason
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On Thu, Sep 11, 2014 at 3:31 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
> 11 minutes! What system are you running? And are you using -O3 or something?
>
> Yes, we can do something to cut it down, but it will probably require a configure flag; the question is what flag.

I'm also open to re-factoring it in some way that's a little easier on compilers. That said, it's very sensitive to performance, so whatever refactor gets done will have to be done carefully.

--Jason
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On 12.09.2014 00:31, Jason Ekstrand wrote:
> 11 minutes! What system are you running? And are you using -O3 or something?

See above, the old children's system... ;-)

-O2 -m32 -march=athlon-mp -mtune=athlon-mp -m3dnow -msse -mmmx -mfpmath=sse,387 -pipe

Bad? - Worked for ages on AthlonMP 8-)

Maybe it is bad on Duron (the MP thing, much smaller cache and better GCC), now.

Dieter
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On Thu, Sep 11, 2014 at 3:53 PM, Dieter Nützel die...@nuetzel-hh.de wrote:
> See above, the old children's system... ;-)
>
> -O2 -m32 -march=athlon-mp -mtune=athlon-mp -m3dnow -msse -mmmx -mfpmath=sse,387 -pipe
>
> Bad? - Worked for ages on AthlonMP 8-)
>
> Maybe it is bad on Duron (the MP thing, much smaller cache and better GCC), now.

Yeah, my recommendation would be hacking the macros to not unroll and keep the patch locally. If you've got a better idea as to how to organize the code so the compiler likes it, I'm open as long as we don't lose performance.

--Jason
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On 08/02/2014 02:11 PM, Jason Ekstrand wrote:
> Most format conversion operations required by GL can be performed by converting one channel at a time, shuffling the channels around, and optionally filling missing channels with zeros and ones. This adds a function to do just that in a general, yet efficient, way.
>
> v2:
>  * Add better comments including full docs for functions
>  * Don't use __typeof__
>  * Use inline helpers instead of writing out conversions by hand,
>  * Force full loop unrolling for better performance

This file seems to anger gcc a lot. It seems to take upwards of a minute or two to compile here.

gcc 4.8.3 on 32-bit x86.

Dave.
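For readers who have not looked at the patch, the operation described in the commit message (convert each channel, shuffle channels, fill missing ones with 0 or 1) can be sketched for a single type pair as below. This is a minimal sketch under assumptions: the function name, its signature and the SWZ_ZERO/SWZ_ONE codes are hypothetical and are not taken from the patch, and only normalized 8-bit to float conversion is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical swizzle codes: 0..3 select a source channel, the two
 * values below fill the destination channel with a constant. */
enum { SWZ_ZERO = 4, SWZ_ONE = 5 };

/* Convert `count` pixels of normalized 8-bit channels to floats,
 * reordering the channels according to `swizzle`. */
void
swizzle_ubyte_to_float(float *dst, int dst_channels,
                       const uint8_t *src, int src_channels,
                       const uint8_t swizzle[4], int count)
{
   assert(dst_channels >= 1 && dst_channels <= 4);
   assert(src_channels >= 1 && src_channels <= 4);

   for (int p = 0; p < count; p++) {
      for (int c = 0; c < dst_channels; c++) {
         uint8_t s = swizzle[c];

         if (s == SWZ_ZERO) {
            dst[c] = 0.0f;               /* missing channel: fill with 0 */
         } else if (s == SWZ_ONE) {
            dst[c] = 1.0f;               /* missing channel: fill with 1 */
         } else {
            assert(s < src_channels);
            dst[c] = src[s] / 255.0f;    /* per-channel conversion */
         }
      }
      dst += dst_channels;
      src += src_channels;
   }
}
```

For example, uploading RGB bytes into a BGRA float destination would use the swizzle {2, 1, 0, SWZ_ONE}. A general version has to handle every source/destination type pair, which is presumably where the large number of specialized loops in the patch comes from.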
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On 15.08.2014 02:36, Dave Airlie wrote:
> This file seems to anger gcc a lot. It seems to take upwards of a minute or two to compile here.
>
> gcc 4.8.3 on 32-bit x86.

For me (on our poor little Duron 1800/2 GB) it ran ~5 minutes...

gcc 4.8.1 on 32-bit x86.

Dieter
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On Aug 14, 2014 7:13 PM, Dieter Nützel die...@nuetzel-hh.de wrote:
> For me (on our poor little Duron 1800/2 GB) it ran ~5 minutes...
>
> gcc 4.8.1 on 32-bit x86.

If we'd like, the way the macros are set up, it would be easy to change it so that we do less unrolling in the cases where we are actually doing substantial format conversion and wouldn't notice the extra logic quite as much. I'll play with it a bit tomorrow or next week and see how much of a hit we would actually take if we unrolled a little less in places.

--Jason Ekstrand
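The trade-off Jason mentions can be illustrated with a pair of macros. These are not the macros from the patch; the names and the single ubyte-to-float helper are assumptions made for the sketch. One variant keeps a single loop with a run-time channel count, the other stamps out one copy of the loop per channel count so that each copy has a literal bound the compiler can unroll fully.

```c
#include <assert.h>
#include <stdint.h>

/* Per-channel conversion helper (hypothetical name). */
static inline float
ubyte_to_float(uint8_t x)
{
   return x / 255.0f;
}

/* One loop, channel count N known only at run time: little code. */
#define CONVERT_LOOP(DST, SRC, N, COUNT, CONV)                   \
   do {                                                          \
      for (int p_ = 0; p_ < (COUNT); p_++)                       \
         for (int c_ = 0; c_ < (N); c_++)                        \
            (DST)[p_ * (N) + c_] = CONV((SRC)[p_ * (N) + c_]);   \
   } while (0)

/* One copy of the loop per channel count.  Each copy has a literal
 * inner bound, which gives the compiler the chance to unroll it, at
 * the price of four times as much code to optimize. */
#define CONVERT_LOOP_SPECIALIZED(DST, SRC, N, COUNT, CONV)       \
   do {                                                          \
      switch (N) {                                               \
      case 1: CONVERT_LOOP(DST, SRC, 1, COUNT, CONV); break;     \
      case 2: CONVERT_LOOP(DST, SRC, 2, COUNT, CONV); break;     \
      case 3: CONVERT_LOOP(DST, SRC, 3, COUNT, CONV); break;     \
      case 4: CONVERT_LOOP(DST, SRC, 4, COUNT, CONV); break;     \
      default: assert(0); break;                                 \
      }                                                          \
   } while (0)

void
ubyte_to_float_generic(float *dst, const uint8_t *src,
                       int channels, int count)
{
   CONVERT_LOOP(dst, src, channels, count, ubyte_to_float);
}

void
ubyte_to_float_specialized(float *dst, const uint8_t *src,
                           int channels, int count)
{
   CONVERT_LOOP_SPECIALIZED(dst, src, channels, count, ubyte_to_float);
}
```

Both functions produce the same output. Whether the specialized copies are actually unrolled, and whether that is worth the extra compile time, depends on the compiler and optimization level, so it would need to be measured with a benchmark.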
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On 08/02/2014 02:11 PM, Jason Ekstrand wrote:
> Most format conversion operations required by GL can be performed by converting one channel at a time, shuffling the channels around, and optionally filling missing channels with zeros and ones. This adds a function to do just that in a general, yet efficient, way.
>
> v2:
>  * Add better comments including full docs for functions
>  * Don't use __typeof__
>  * Use inline helpers instead of writing out conversions by hand,
>  * Force full loop unrolling for better performance
>
> Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com
> ---
>  src/mesa/main/format_utils.c | 844 +++
>  src/mesa/main/format_utils.h |   5 +
>  2 files changed, 849 insertions(+)
>
> diff --git a/src/mesa/main/format_utils.c b/src/mesa/main/format_utils.c
> index 241c158..d60aeb3 100644
> --- a/src/mesa/main/format_utils.c
> +++ b/src/mesa/main/format_utils.c
> @@ -54,3 +54,847 @@ _mesa_srgb_ubyte_to_linear_float(uint8_t cl)
>     return lut[cl];
>  }
> +
> +/* A bunch of format conversion macros and helper functions used below */
> +
> +/* Only guaranteed to work for BITS <= 32 */
> +#define MAX_UINT(BITS) ((BITS) == 32 ? UINT32_MAX : ((1u << (BITS)) - 1))
> +#define MAX_INT(BITS) (int)MAX_UINT((BITS) - 1)

I'd probably put one more set of parens around the whole macro body, just to be safe.

-Brian
Re: [Mesa-dev] [PATCH v2 08/13] mesa/format_utils: Add a general format conversion function
On Mon, Aug 4, 2014 at 7:55 AM, Brian Paul bri...@vmware.com wrote:
> On 08/02/2014 02:11 PM, Jason Ekstrand wrote:
>> +/* Only guaranteed to work for BITS <= 32 */
>> +#define MAX_UINT(BITS) ((BITS) == 32 ? UINT32_MAX : ((1u << (BITS)) - 1))
>> +#define MAX_INT(BITS) (int)MAX_UINT((BITS) - 1)
>
> I'd probably put one more set of parens around the whole macro body, just to be safe.

Yup, fixed.

--Jason
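A sketch of what the parenthesized version presumably looks like, for reference. It assumes the `<<` and `<=` that appear to have been lost from the quoted diff in the archive. The TWICE_* macros and max_int_for_bits() are hypothetical and exist only to show the general hazard that the outer parentheses guard against.

```c
#include <assert.h>
#include <stdint.h>

/* Only guaranteed to work for BITS <= 32 */
#define MAX_UINT(BITS) ((BITS) == 32 ? UINT32_MAX : ((1u << (BITS)) - 1))

/* With the extra parentheses the whole body, cast included, expands
 * to a single parenthesized expression. */
#define MAX_INT(BITS) ((int)MAX_UINT((BITS) - 1))

/* Hypothetical pair showing the general problem: without the outer
 * parentheses the expansion can be split by a neighbouring operator. */
#define TWICE_UNSAFE(X) (X) + (X)
#define TWICE_SAFE(X)   ((X) + (X))

/* Largest value of a signed integer with `bits` bits, 2 <= bits <= 32. */
int
max_int_for_bits(int bits)
{
   assert(bits >= 2 && bits <= 32);
   return MAX_INT(bits);
}
```

With these definitions `3 * TWICE_UNSAFE(2)` expands to `3 * (2) + (2)`, whereas `3 * TWICE_SAFE(2)` multiplies the whole sum. A leading cast binds tightly enough that the original MAX_INT is fine in most expressions, which fits Brian's "just to be safe" wording.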