Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: 915489b71ff7119698f25679e6b7ccb47a2dd7c2
https://github.com/WebKit/WebKit/commit/915489b71ff7119698f25679e6b7ccb47a2dd7c2
Author: Chris Dumez <[email protected]>
Date: 2026-01-06 (Tue, 06 Jan 2026)
Changed paths:
M Source/WTF/wtf/text/StringCommon.h
M Tools/TestWebKitAPI/Tests/WTF/StringCommon.cpp
Log Message:
-----------
Micro-optimize `WTF::copyElements(std::span<float> destinationSpan,
std::span<const double> sourceSpan)`
https://bugs.webkit.org/show_bug.cgi?id=304928
Reviewed by Yusuke Suzuki.
Micro-optimize the function as directed by Claude AI.
In particular, the following improvements were made:
1. Replaced simde_vld1q_f64_x4 with individual loads
   - vld1q_f64_x4 loads four consecutive vectors into a single four-register
     structure (it does not de-interleave), which may have overhead
   - Individual vld1q_f64 loads are simpler and give the compiler more
     flexibility to schedule instructions
   - On ARM64, the compiler can pipeline these loads better
2. Replaced simde_vst1q_f32_x2 with individual stores
   - As with the loads, the structure store may have overhead
   - Individual stores are simpler and more efficient
   - Avoids creating the temporary simde_float32x4x2_t structure
   - Better instruction scheduling
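The two changes listed above can be sketched roughly as follows. This is an illustrative sketch only, not the code in StringCommon.h: the function name `copyDoublesToFloats` is hypothetical, it takes raw pointers rather than `std::span`, and it uses the native NEON intrinsics directly (with a scalar path on other targets) instead of the simde_ wrappers.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helper (not WebKit's implementation): narrows `count` doubles
// to floats. On ARM64 it handles 8 elements per iteration with separate loads
// and stores; any remainder (and every other target) uses the scalar loop.
inline void copyDoublesToFloats(float* destination, const double* source, std::size_t count)
{
    assert(!count || (destination && source));
    std::size_t i = 0;
#if defined(__aarch64__)
    for (; i + 8 <= count; i += 8) {
        // Four independent 128-bit loads instead of one vld1q_f64_x4.
        float64x2_t d0 = vld1q_f64(source + i);
        float64x2_t d1 = vld1q_f64(source + i + 2);
        float64x2_t d2 = vld1q_f64(source + i + 4);
        float64x2_t d3 = vld1q_f64(source + i + 6);

        // Narrow each pair of doubles and pack two pairs into one float vector.
        float32x4_t low = vcombine_f32(vcvt_f32_f64(d0), vcvt_f32_f64(d1));
        float32x4_t high = vcombine_f32(vcvt_f32_f64(d2), vcvt_f32_f64(d3));

        // Two independent stores instead of one vst1q_f32_x2, so no
        // float32x4x2_t temporary is needed.
        vst1q_f32(destination + i, low);
        vst1q_f32(destination + i + 4, high);
    }
#endif
    // Scalar tail: elements left over after the 8-wide loop.
    for (; i < count; ++i)
        destination[i] = static_cast<float>(source[i]);
}
```

With the default rounding mode, the vector path should produce the same values as `static_cast<float>` on each element, which is the property the tests listed below exercise.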
Micro-benchmarking results:
Length  Buggy (ns)  Fixed (ns)   Speedup  Improvement   Status
--------------------------------------------------------------------------------
     1        1.46        1.22    1.197x       +19.7%   OK
     2        1.33        0.94    1.424x       +42.4%   OK
     3        1.23        1.33    0.924x        -7.6%   OK
     4        1.25        1.22    1.028x        +2.8%   OK
     5        1.45        1.46    0.998x        -0.2%   OK
     6        1.44        1.44    1.000x        -0.0%   OK
     7        1.65        1.63    1.012x        +1.2%   OK
     8        0.72        0.61    1.180x       +18.0%   OK
     9        0.97        0.93    1.046x        +4.6%   OK
    10        1.25        1.21    1.036x        +3.6%   OK
    12        1.47        1.31    1.121x       +12.1%   OK
    15        1.71        1.73    0.992x        -0.8%   OK
    16        1.22        0.92    1.334x       +33.4%   OK
    20        1.98        1.75    1.134x       +13.4%   OK
    24        1.73        1.22    1.415x       +41.5%   OK
    31        2.98        2.49    1.195x       +19.5%   OK
    32        2.21        1.57    1.410x       +41.0%   OK
    48        3.20        2.31    1.387x       +38.7%   OK
    64        4.20        3.02    1.389x       +38.9%   OK
    96        6.12        4.54    1.346x       +34.6%   OK
   128        8.18        6.07    1.348x       +34.8%   OK
   192       12.06        8.98    1.343x       +34.3%   OK
   256       15.86       12.02    1.319x       +31.9%   OK
   512       31.63       23.92    1.322x       +32.2%   OK
  1024       67.34       52.24    1.289x       +28.9%   OK
  2048      132.41      101.49    1.305x       +30.5%   OK
  4096      263.98      236.39    1.117x       +11.7%   OK
  8192      617.88      615.62    1.004x        +0.4%   OK
 16384     1006.77      774.89    1.299x       +29.9%   OK
--------------------------------------------------------------------------------
Average speedup: 1.204x (20.4% improvement)
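For readers checking the table, the derived columns can be reproduced with a small sketch like the one below. The helper names are hypothetical. The sketch assumes Speedup is the old time divided by the new time, Improvement is (speedup - 1) * 100, and the reported average is the arithmetic mean of the Speedup column; the mean of the 29 listed values does come to about 1.204. Recomputing a speedup from the rounded nanosecond columns can differ from the table in the last digits (for example, length 32), presumably because the table was produced from unrounded timings.

```cpp
#include <cmath>
#include <cstddef>

// Speedup column values copied from the table above, in row order.
constexpr double kSpeedups[] = {
    1.197, 1.424, 0.924, 1.028, 0.998, 1.000, 1.012, 1.180, 1.046, 1.036,
    1.121, 0.992, 1.334, 1.134, 1.415, 1.195, 1.410, 1.387, 1.389, 1.346,
    1.348, 1.343, 1.319, 1.322, 1.289, 1.305, 1.117, 1.004, 1.299,
};
constexpr std::size_t kSpeedupCount = sizeof(kSpeedups) / sizeof(kSpeedups[0]);

// Hypothetical helper: speedup is the old time divided by the new time.
inline double speedup(double beforeNs, double afterNs)
{
    return beforeNs / afterNs;
}

// Hypothetical helper: improvement in percent, rounded to one decimal place.
inline double improvementPercent(double speedupValue)
{
    return std::round((speedupValue - 1.0) * 1000.0) / 10.0;
}

// Hypothetical helper: plain arithmetic mean of `count` values.
inline double arithmeticMean(const double* values, std::size_t count)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        sum += values[i];
    return count ? sum / static_cast<double>(count) : 0.0;
}
```

An arithmetic mean weights every length equally, so the sub-2 ns rows, where measurement noise is proportionally largest, count as much as the large-buffer rows.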
Test: Tools/TestWebKitAPI/Tests/WTF/StringCommon.cpp
* Source/WTF/wtf/text/StringCommon.h:
(WTF::copyElements):
* Tools/TestWebKitAPI/Tests/WTF/StringCommon.cpp:
(TestWebKitAPI::CopyElementsDoubleToFloatTest::testConversion):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, VerySmallSizes)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, ExactlySIMDWidth)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, JustAboveSIMDWidth)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, ExactlyTwoSIMDIterations)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, MediumSizes)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, LargeSizes)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, EdgeCasesAroundSIMDBoundaries)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, SpecialValues)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, PrecisionLoss)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, StressTestMultipleIterations)):
(TestWebKitAPI::TEST_F(CopyElementsDoubleToFloatTest, AlignmentVariations)):
Canonical link: https://commits.webkit.org/305183@main