https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122455
Bug ID: 122455
Summary: [16 regression] Regression in Snappy tests due to
incorrect speculation of devirtualization target
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: soumyaa at gcc dot gnu.org
CC: jh at suse dot cz
Target Milestone: ---
Hi,
The following Snappy tests:
BM_UValidate/2/1
BM_UValidate/2/2
BM_UValidate/3/1
BM_UValidate/3/2
regress by ~10% after commit 9ee937b2, "Add --param max-devirt-targets"
[https://gcc.gnu.org/cgit/gcc/commit/?id=9ee937b2]
---
Initial impressions:
The function SnappyDecompressor::RefillTag() calls the virtual functions Peek()
and Skip(), which are implemented by two classes:
- ByteArraySource
- SnappyIOVecReader
With this patch, GCC seems to pick SnappyIOVecReader as the first speculative
target, as seen here:
ldr x2, [x2, :got_lo12:_ZN6snappy17SnappyIOVecReader4SkipEm]
ldr x9, [x0]
ldr x3, [x9, 32]
cmp x3, x2 # Compare vtable slot with speculated target
beq .L885
blr x3
However, the profile data for this function shows that only the
ByteArraySource implementation is ever called:
_ZN6snappy18SnappyDecompressor9RefillTagEv total:1415550 head:81082
0: 81082
1: 81082
2: 81082
4: 80174 _ZN6snappy15ByteArraySource4SkipEm:80988
5: 80370
6: 80682 _ZN6snappy15ByteArraySource4PeekEPm:80489
Which implementation is called depends on the specific test. The slowdown is
consistent across all tests that use the ByteArraySource implementation.
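For illustration, the guarded call in the assembly above corresponds roughly to
the source-level pattern below. This is only a sketch, not GCC's implementation
and not Snappy's code: the names Source, SkipFn, Impl and SpeculativeSkip are
hypothetical, and the vtable slot is modelled as a plain function pointer so
that the guard can be written in portable C++. When the guess is wrong (as with
SnappyIOVecReader here), every call pays for the compare and still goes through
the indirect call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a class with one virtual Skip() slot.
enum class Impl { None, ByteArray, IOVec };

struct Source;
using SkipFn = void (*)(Source*, std::size_t);

struct Source {
    SkipFn skip;                 // stands in for the Skip() vtable slot
    std::size_t pos = 0;         // bytes skipped so far
    Impl last = Impl::None;      // which implementation ran last
};

// Stand-ins for ByteArraySource::Skip and SnappyIOVecReader::Skip.
void ByteArraySkip(Source* s, std::size_t n) {
    s->pos += n;
    s->last = Impl::ByteArray;
}

void IOVecSkip(Source* s, std::size_t n) {
    s->pos += n;
    s->last = Impl::IOVec;
}

// Speculative devirtualization, written by hand:
// returns true if the guarded direct call (fast path) was taken.
bool SpeculativeSkip(Source* s, std::size_t n, SkipFn guess) {
    SkipFn target = s->skip;     // load the slot      (ldr x3, [x9, 32])
    if (target == guess) {       // compare with guess (cmp x3, x2; beq)
        guess(s, n);             // direct call, a candidate for inlining
        return true;
    }
    target(s, n);                // fallback indirect call (blr x3)
    return false;
}
```

With an object whose slot holds ByteArraySkip, passing IOVecSkip as the guess
always falls through to the indirect call, which matches the profile above.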
---
Steps to reproduce:
# Clone
git clone https://github.com/google/snappy.git
cd snappy
git submodule update --init
# Build
mkdir builddir
cd builddir
CC=/path/to/gcc-install-dir/bin/gcc CXX=/path/to/gcc-install-dir/bin/g++
OPT="<compile flags>" SRC=$(pwd)
LD_LIBRARY_PATH=/path/to/gcc-install-dir/lib64:$LD_LIBRARY_PATH
cmake .. -DCMAKE_C_COMPILER=$CC -DCMAKE_CXX_COMPILER=$CXX \
  -DCMAKE_C_FLAGS="$OPT" -DCMAKE_CXX_FLAGS="$OPT" -DCMAKE_BUILD_TYPE="Release" \
  ${SRC}
make -j $(nproc)
# Run
cp -r ../testdata/ .
./snappy_benchmark --benchmark_filter="BM_UValidate/2/1" \
  --benchmark_min_warmup_time=2 --benchmark_time_unit=us