On Wed, Mar 26, 2025 at 09:32:27PM +0800, LongPing Wei wrote: > On 2025/3/26 19:04, Mikulas Patocka wrote: > > > > > > On Wed, 26 Mar 2025, LongPing Wei wrote: > > > > > Calling verity_verify_io in bh for IO of all sizes is not suitable for > > > embedded devices. From our tests, it can improve the performance of 4K > > > synchronise random reads. > > > For example: > > > ./fio --name=rand_read --ioengine=psync --rw=randread --bs=4K \ > > > --direct=1 --numjobs=8 --runtime=60 --time_based --group_reporting \ > > > --filename=/dev/block/mapper/xx-verity > > > > > > But it will degrade the performance of 512K synchronise sequential reads > > > on our devices. > > > For example: > > > ./fio --name=read --ioengine=psync --rw=read --bs=512K --direct=1 \ > > > --numjobs=8 --runtime=60 --time_based --group_reporting \ > > > --filename=/dev/block/mapper/xx-verity > > > > > > A parameter array is introduced by this change. And users can modify the > > > default config by /sys/module/dm_verity/parameters/use_bh_bytes. > > > > > > The default limits for NONE/RT/BE/IDLE is set to 4096. > > > > > > Call verity_verify_io directly when verity_end_io is in softirq. > > > > > > Signed-off-by: LongPing Wei <weilongp...@oppo.com> > > > > Are you sure that 4096 bytes is the correct threshold? > > > > I suggest that you run the benchmarks for 4k, 8k, 16k, 32k, 64k, 128k, > > 256k, 512k and set the default threshold to the largest value where bh > > code performs better than non-bh code. > > > > Mikulas > >
4096 bytes sounds good to me. As I mentioned elsewhere in the thread (https://lore.kernel.org/dm-devel/20250326164057.GA1243@sol.localdomain/), objections keep getting raised to doing more work in softirq context due to potential impact on real-time tasks. So while more I/O benchmarks would be interesting, they are also not the whole story and we should probably prefer a more conservative threshold. Also, dm-verity I/O requests have a bimodal distribution with peaks at 4 KiB (for synchronous reads) and 128 KiB or read_ahead_kb (for readahead) anyway, so most of the impact should come from 4 KiB anyway. 8 KiB could also be reasonable when taking into account multibuffer hashing though, since multibuffer hashing makes hashing two 4 KiB blocks almost as fast as hashing a single 4 KiB block. (Though multibuffer hashing is currently Android-only, as the crypto maintainer keeps rejecting it upstream.) - Eric