On Thu, Mar 27, 2025 at 08:52:45AM +0800, LongPing Wei wrote:
> On 2025/3/27 4:48, Eric Biggers wrote:
> > On Wed, Mar 26, 2025 at 07:50:34PM +0100, Mikulas Patocka wrote:
> > >
> > > On Wed, 26 Mar 2025, Eric Biggers wrote:
> > >
> > > > On Wed, Mar 26, 2025 at 09:32:27PM +0800, LongPing Wei wrote:
> > > > > On 2025/3/26 19:04, Mikulas Patocka wrote:
> > > > > >
> > > > > > On Wed, 26 Mar 2025, LongPing Wei wrote:
> > > > > >
> > > > > > > Calling verity_verify_io in bh for IO of all sizes is not
> > > > > > > suitable for embedded devices. From our tests, it can
> > > > > > > improve the performance of 4K synchronous random reads.
> > > > > > > For example:
> > > > > > > ./fio --name=rand_read --ioengine=psync --rw=randread \
> > > > > > >   --bs=4K --direct=1 --numjobs=8 --runtime=60 \
> > > > > > >   --time_based --group_reporting \
> > > > > > >   --filename=/dev/block/mapper/xx-verity
> > > > > > >
> > > > > > > But it will degrade the performance of 512K synchronous
> > > > > > > sequential reads on our devices.
> > > > > > > For example:
> > > > > > > ./fio --name=read --ioengine=psync --rw=read --bs=512K \
> > > > > > >   --direct=1 --numjobs=8 --runtime=60 --time_based \
> > > > > > >   --group_reporting --filename=/dev/block/mapper/xx-verity
> > > > > > >
> > > > > > > This change introduces a parameter array; users can modify
> > > > > > > the default configuration via
> > > > > > > /sys/module/dm_verity/parameters/use_bh_bytes.
> > > > > > >
> > > > > > > The default limits for NONE/RT/BE/IDLE are all set to 4096.
> > > > > > >
> > > > > > > Call verity_verify_io directly when verity_end_io is in a
> > > > > > > softirq.
> > > > > > >
> > > > > > > Signed-off-by: LongPing Wei <weilongp...@oppo.com>
> > > > > >
> > > > > > Are you sure that 4096 bytes is the correct threshold?
> > > > > >
> > > > > > I suggest that you run the benchmarks for 4k, 8k, 16k, 32k,
> > > > > > 64k, 128k, 256k and 512k, and set the default threshold to
> > > > > > the largest value where the bh code performs better than the
> > > > > > non-bh code.
> > > > > >
> > > > > > Mikulas
> > > >
> > > > 4096 bytes sounds good to me. As I mentioned elsewhere in the
> > > > thread
> > > > (https://lore.kernel.org/dm-devel/20250326164057.GA1243@sol.localdomain/),
> > > > objections keep getting raised to doing more work in softirq
> > > > context due to the potential impact on real-time tasks. So while
> > > > more I/O benchmarks would be interesting, they are also not the
> > > > whole story, and we should probably prefer a more conservative
> > > > threshold. Also, dm-verity I/O requests have a bimodal
> > > > distribution with peaks at 4 KiB (for synchronous reads) and
> > > > 128 KiB or read_ahead_kb (for readahead), so most of the impact
> > > > should come from 4 KiB anyway.
> > > >
> > > > 8 KiB could also be reasonable when taking multibuffer hashing
> > > > into account, though, since multibuffer hashing makes hashing
> > > > two 4 KiB blocks almost as fast as hashing a single 4 KiB block.
> > > > (Though multibuffer hashing is currently Android-only, as the
> > > > crypto maintainer keeps rejecting it upstream.)
> > > >
> > > > - Eric
> > >
> > > OK, so I'll set it to 8 KiB.
> > >
> > > Do you think that I should increase this limit to "unlimited" for
> > > IOPRIO_CLASS_RT and disable this feature entirely for
> > > IOPRIO_CLASS_NONE?
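(For reference: from the patch description above, the per-priority-class
threshold could be structured roughly as in the following sketch. The
identifier names are illustrative and not necessarily the ones used in
the posted patch.)

/*
 * One bh-verification byte limit per I/O priority class, writable via
 * /sys/module/dm_verity/parameters/use_bh_bytes.
 */
static unsigned int dm_verity_use_bh_bytes[4] = {
	4096,	/* IOPRIO_CLASS_NONE */
	4096,	/* IOPRIO_CLASS_RT   */
	4096,	/* IOPRIO_CLASS_BE   */
	4096	/* IOPRIO_CLASS_IDLE */
};
module_param_array_named(use_bh_bytes, dm_verity_use_bh_bytes,
			 uint, NULL, 0644);

/*
 * Called from the end_io path: verify in bh (softirq) context only if
 * the request is small enough for its priority class.
 */
static bool verity_use_bh(unsigned int bytes, unsigned short ioprio)
{
	return ioprio <= IOPRIO_CLASS_IDLE &&
	       bytes <= READ_ONCE(dm_verity_use_bh_bytes[ioprio]);
}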
> > IOPRIO_CLASS_NONE means an I/O priority isn't set, so it probably
> > should be treated the same as IOPRIO_CLASS_BE, which is the "normal"
> > priority.
> >
> > For now I think it makes sense to use 4 or 8 KiB even for
> > IOPRIO_CLASS_RT. That covers the single-block request case, which is
> > the one that matters most. We can bump it up later if it can be
> > shown that something higher is better. Softirqs normally get
> > prioritized over all real-time tasks, so while it should be more
> > acceptable to use them for IOPRIO_CLASS_RT, it's not a perfect
> > solution.
> >
> > Another concern I have with longer requests is that when dm-verity
> > runs the verification work in a softirq and finds that a Merkle tree
> > block isn't cached, it punts the request to a kworker and restarts
> > it from the beginning. That not only defeats the point of using
> > softirq context, but it also throws away work that was already
> > done. I think we could fix it to not throw away work, but having to
> > fall back at all would still defeat the point of using softirq
> > context.
> >
> > - Eric
>
> Hi, Eric
>
> It seems that verity_prefetch_io doesn't work efficiently:
>
>   dm_bufio_prefetch_with_ioprio
>     __dm_bufio_prefetch
>       blk_start_plug
>       for loop:
>         __bufio_new
>         submit_io
>         cond_resched
>       blk_finish_plug
>
> If more than one hash block needs to be prefetched, cond_resched will
> be called in each loop iteration, while blk_finish_plug is only
> called at the end.
>
> So the requests for the hash blocks may not have been dispatched yet
> by the time the requests for the data blocks have already completed.
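(The loop being described is roughly the following; this is a
simplified sketch of __dm_bufio_prefetch() in drivers/md/dm-bufio.c,
with the locking, buffer-cache management, and error handling elided.)

static void __dm_bufio_prefetch(struct dm_bufio_client *c,
				sector_t block, unsigned int n_blocks,
				unsigned short ioprio)
{
	struct blk_plug plug;
	LIST_HEAD(write_list);

	blk_start_plug(&plug);

	for (; n_blocks--; block++) {
		int need_submit = 0;
		struct dm_buffer *b;

		/* Look up the buffer; for NF_PREFETCH this never waits. */
		b = __bufio_new(c, block, NF_PREFETCH, &need_submit,
				&write_list);
		if (b) {
			if (need_submit)
				submit_io(b, REQ_OP_READ, ioprio,
					  read_endio);
			dm_bufio_release(b);

			/* Called on every iteration of the loop ... */
			cond_resched();
		}
	}

	/* ... while the plug is only finished here, after the loop. */
	blk_finish_plug(&plug);
}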
Well, it's always possible for the prefetch I/O to be too slow, but the
way it works does seem to be unnecessarily bad. The prefetch work runs
on a kworker, which is going to slow things down.

The way it should work is that verity_map() does the prefetch work
itself, starting (but not waiting for) any I/O that is needed. (And of
course, most of the time no I/O will actually be needed.)

- Eric
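(For context: the prefetch dispatch Eric is referring to currently goes
through a workqueue, roughly as in the simplified sketch below, based
on drivers/md/dm-verity-target.c. The suggestion would amount to having
verity_map() invoke the dm_bufio prefetch directly instead, which is
possible because the prefetch only submits, and never waits for, the
reads it issues.)

/* Today: verity_map() defers the prefetch to a kworker. */
static void verity_submit_prefetch(struct dm_verity *v,
				   struct dm_verity_io *io)
{
	struct dm_verity_prefetch_work *pw;

	pw = kmalloc(sizeof(*pw), GFP_NOIO | __GFP_NORETRY |
		     __GFP_NOMEMALLOC | __GFP_NOWARN);
	if (!pw)
		return;

	/* verity_prefetch_io() runs later, on v->verify_wq ... */
	INIT_WORK(&pw->work, verity_prefetch_io);
	pw->v = v;
	pw->block = io->block;
	pw->n_blocks = io->n_blocks;
	queue_work(v->verify_wq, &pw->work);

	/*
	 * ... whereas the suggested approach would call the prefetch
	 * body (ultimately dm_bufio_prefetch()) right here, so that the
	 * hash-block reads are already in flight by the time the data
	 * I/O is submitted.
	 */
}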