Hello,
Here's a small set of patches that slightly improve the performance of
FreeType when compiled for ARM and x86_64 with GCC. I also checked that they
don't negatively affect x86 performance.
On ARM, loading glyphs is about 3% faster, and rendering gray bitmaps is
about 6% faster.
On x86_64, loading glyphs is about 6% faster, and rendering gray bitmaps is
about 2.5% faster.
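For anyone who wants to re-derive these percentages from the ftbench numbers in the PS, here is a minimal sketch. The helper name percent_faster is made up for illustration and is not part of FreeType or ftbench; it assumes both inputs are times in us/op.

```c
#include <assert.h>

/* Hypothetical helper (not part of FreeType or ftbench): percentage
   reduction in time per operation between two measurements in us/op.
   A positive result means the "after" build is faster. */
double
percent_faster( double  before,
                double  after )
{
  assert( before > 0.0 );

  return ( before - after ) / before * 100.0;
}
```

Applied to the Load and Render rows below, this gives roughly 2.7% and 6.2% on ARM, and roughly 5.6% and 2.5% on x86_64.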
The optimizations were found by inspecting the generated machine code in
hot spots.
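For context, two of the attached patches target FT_MulFix, which multiplies a value by a 16.16 fixed-point factor with rounding. The following is only a portable sketch of that operation, assuming a 64-bit intermediate and rounding of halves away from zero; the name mulfix_16_16 is hypothetical, and this is not the code from the patches or from FreeType itself.

```c
#include <stdint.h>

/* Hypothetical illustration, not FreeType's implementation: multiply
   `a` by the 16.16 fixed-point factor `b`, rounding to nearest with
   halves rounded away from zero.  The caller must make sure the result
   fits in 32 bits. */
int32_t
mulfix_16_16( int32_t  a,
              int32_t  b )
{
  int       negative = ( a < 0 ) != ( b < 0 );

  /* Work on magnitudes so the rounding is symmetric around zero. */
  uint64_t  ua = (uint64_t)( a < 0 ? -(int64_t)a : (int64_t)a );
  uint64_t  ub = (uint64_t)( b < 0 ? -(int64_t)b : (int64_t)b );

  /* Add half a unit (0x8000) before dropping the 16 fractional bits. */
  int64_t   r = (int64_t)( ( ua * ub + 0x8000U ) >> 16 );

  return (int32_t)( negative ? -r : r );
}
```

With a factor of 0x8000 (0.5), an input of 100 gives 50, and an input of 3 gives 2 because 1.5 is rounded away from zero. How a compiler lowers the 64-bit multiply, add, and shift differs between targets, which is the kind of detail that shows up when reading the generated code.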
Let me know if you find any issues.
- David
PS: Everything was measured with:
./ftbench -p -t 5 -s 14 -f 0008 Arial.ttf
(0008 is FT_LOAD_NO_BITMAP)
ARM:
CFLAGS="-O2 -fomit-frame-pointer -march=armv7-a -mthumb" ./configure \
  --disable-shared --without-zlib --without-png --without-bzip2 \
  --host=arm-linux-androideabi
Before:
Load                     34.287 us/op
Load_Advances (Normal)   34.317 us/op
Load_Advances (Fast)      0.176 us/op
Render                   23.544 us/op
Get_Glyph                 6.661 us/op
Get_CBox                  1.957 us/op
Get_Char_Index            0.261 us/op
Iterate CMap            121.696 us/op
New_Face                115.143 us/op
Embolden                  1.428 us/op
Get_BBox                  3.313 us/op
After:
Load                     33.358 us/op
Load_Advances (Normal)   33.330 us/op
Load_Advances (Fast)      0.176 us/op
Render                   22.079 us/op
Get_Glyph                 6.494 us/op
Get_CBox                  1.937 us/op
Get_Char_Index            0.232 us/op
Iterate CMap            120.793 us/op
New_Face                115.759 us/op
Embolden                  1.450 us/op
Get_BBox                  3.384 us/op
x86_64:
CFLAGS="-O2 -fomit-frame-pointer" ./configure --disable-shared \
  --without-zlib --without-png --without-bzip2
Before:
Load                      4.890 us/op
Load_Advances (Normal)    4.849 us/op
Load_Advances (Fast)      0.027 us/op
Render                    2.813 us/op
Get_Glyph                 0.473 us/op
Get_CBox                  0.076 us/op
Get_Char_Index            0.024 us/op
Iterate CMap             13.982 us/op
New_Face                 12.341 us/op
Embolden                  0.027 us/op
Get_BBox                  0.303 us/op
After:
Load                      4.617 us/op
Load_Advances (Normal)    4.537 us/op
Load_Advances (Fast)      0.028 us/op
Render                    2.743 us/op
Get_Glyph                 0.441 us/op
Get_CBox                  0.076 us/op
Get_Char_Index            0.023 us/op
Iterate CMap             13.508 us/op
New_Face                 12.298 us/op
Embolden                  0.027 us/op
Get_BBox                  0.296 us/op
x86:
CFLAGS="-O2 -fomit-frame-pointer -m32" LDFLAGS=-m32 ./configure \
  --disable-shared --without-zlib --without-png --without-bzip2
Before:
Load                      4.973 us/op
Load_Advances (Normal)    4.910 us/op
Load_Advances (Fast)      0.023 us/op
Render                    3.140 us/op
Get_Glyph                 0.641 us/op
Get_CBox                  0.243 us/op
Get_Char_Index            0.027 us/op
Iterate CMap             15.303 us/op
New_Face                 13.041 us/op
Embolden                  0.167 us/op
Get_BBox                  0.527 us/op
After:
Load                      4.930 us/op
Load_Advances (Normal)    4.895 us/op
Load_Advances (Fast)      0.023 us/op
Render                    3.131 us/op
Get_Glyph                 0.620 us/op
Get_CBox                  0.237 us/op
Get_Char_Index            0.027 us/op
Iterate CMap             15.051 us/op
New_Face                 13.133 us/op
Embolden                  0.163 us/op
Get_BBox                  0.524 us/op
0001-arm-Enable-FT_MulFix_arm-for-thumb2-compilation.patch
0002-x86_64-Optimize-FT_MulFix-for-x86_64-GCC-builds.patch
0003-arm-x86-x86_64-Optimized-TT_MulFix14-TT_DivFix14.patch
0004-arm-Improve-gray-rasterizer-performance.patch
___
Freetype-devel mailing list
Freetype-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/freetype-devel