Re: [dm-devel] dm bufio: Reduce dm_bufio_lock contention

2018-06-12 Thread Mike Snitzer
On Tue, Jun 12 2018 at  4:03am -0400,
Jing Xia  wrote:

> Performance testing on Android reports that the phone sometimes hangs
> and shows a black screen for several minutes. The sysdump shows:
> 1. kswapd and other tasks that enter the direct-reclaim path are waiting
> on the dm_bufio_lock;

Do you have an understanding of where they are waiting?  Is it in
dm_bufio_shrink_scan()?

> 2. the task that holds the dm_bufio_lock is stalled waiting for IO
> completions; the relevant stack trace is:
> 
> PID: 22920  TASK: ffc0120f1a00  CPU: 1   COMMAND: "kworker/u8:2"
>  #0 [ffc0282af3d0] __switch_to at ff8008085e48
>  #1 [ffc0282af3f0] __schedule at ff8008850cc8
>  #2 [ffc0282af450] schedule at ff8008850f4c
>  #3 [ffc0282af470] schedule_timeout at ff8008853a0c
>  #4 [ffc0282af520] schedule_timeout_uninterruptible at ff8008853aa8
>  #5 [ffc0282af530] wait_iff_congested at ff8008181b40
>  #6 [ffc0282af5b0] shrink_inactive_list at ff8008177c80
>  #7 [ffc0282af680] shrink_lruvec at ff8008178510
>  #8 [ffc0282af790] mem_cgroup_shrink_node_zone at ff80081793bc
>  #9 [ffc0282af840] mem_cgroup_soft_limit_reclaim at ff80081b6040

Understanding the root cause of why the IO isn't completing quickly
enough would be nice.  Is the backing storage just overwhelmed?

> This patch aims to reduce dm_bufio_lock contention when multiple
> tasks do shrink_slab() at the same time. It is acceptable for a task
> to reclaim from other shrinkers, or to reclaim from dm-bufio the next
> time around, rather than stall on the dm_bufio_lock.

Your patch just looks to be papering over the issue.  Like you're
treating the symptom rather than the problem.

> Signed-off-by: Jing Xia 
> Signed-off-by: Jing Xia 

You only need one Signed-off-by.

> ---
>  drivers/md/dm-bufio.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> index c546b56..402a028 100644
> --- a/drivers/md/dm-bufio.c
> +++ b/drivers/md/dm-bufio.c
> @@ -1647,10 +1647,19 @@ static unsigned long __scan(struct dm_bufio_client *c, unsigned long nr_to_scan,
>  static unsigned long
>  dm_bufio_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
>  {
> +	unsigned long count;
> +	unsigned long retain_target;
> +
>  	struct dm_bufio_client *c = container_of(shrink, struct dm_bufio_client, shrinker);
> -	unsigned long count = READ_ONCE(c->n_buffers[LIST_CLEAN]) +
> +
> +	if (!dm_bufio_trylock(c))
> +		return 0;
> +
> +	count = READ_ONCE(c->n_buffers[LIST_CLEAN]) +
>  		      READ_ONCE(c->n_buffers[LIST_DIRTY]);
> -	unsigned long retain_target = get_retain_buffers(c);
> +	retain_target = get_retain_buffers(c);
> +
> +	dm_bufio_unlock(c);
>  
>  	return (count < retain_target) ? 0 : (count - retain_target);
>  }
> -- 
> 1.9.1
> 

The reality of your patch is, on a heavily used bufio-backed volume,
you're effectively disabling the ability to reclaim bufio memory via the
shrinker.

Because chances are the bufio lock will always be contended for a
heavily used bufio client.

But after a quick look, I'm left wondering why dm_bufio_shrink_scan()'s
dm_bufio_trylock() isn't sufficient to short-circuit the shrinker for
your use-case?
Maybe __GFP_FS is set so dm_bufio_shrink_scan() only ever uses
dm_bufio_lock()?
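
For reference, the shrink-scan path currently looks roughly like the
sketch below (paraphrased from memory of the 4.17-era
drivers/md/dm-bufio.c, so treat it as approximate rather than a verbatim
quote): it only falls back to the blocking dm_bufio_lock() when the
reclaim context allows __GFP_FS, otherwise it already bails out on a
contended lock.

/* Approximate sketch of the existing shrinker entry point, reproduced
 * from memory to frame the question above; not copied from the tree. */
static unsigned long
dm_bufio_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
{
	struct dm_bufio_client *c =
		container_of(shrink, struct dm_bufio_client, shrinker);
	unsigned long freed;

	if (sc->gfp_mask & __GFP_FS)
		dm_bufio_lock(c);	/* may sleep on a contended lock */
	else if (!dm_bufio_trylock(c))
		return SHRINK_STOP;	/* non-FS reclaim already bails out */

	freed = __scan(c, sc->nr_to_scan, sc->gfp_mask);
	dm_bufio_unlock(c);
	return freed;
}

So the trylock-based short-circuit only helps reclaim contexts without
__GFP_FS; any __GFP_FS reclaim will still sleep on the lock.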

Is a shrinker able to be reentered by the VM subsystem
(e.g. shrink_slab() calling down into the same shrinker from multiple
tasks that hit direct reclaim)?
If so, a better fix could be to add a flag to the bufio client so we can
know if the same client is being re-entered via the shrinker (though
it'd likely be a bug for the shrinker to do that!), and have
dm_bufio_shrink_scan() check that flag and return SHRINK_STOP if set.
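
Something like this, completely untested, and the 'shrinker_entered'
field is made up purely to illustrate the idea (it does not exist in
dm-bufio today):

/* Hypothetical field added to struct dm_bufio_client:
 *     unsigned long shrinker_entered;
 */
static unsigned long
dm_bufio_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
{
	struct dm_bufio_client *c =
		container_of(shrink, struct dm_bufio_client, shrinker);
	unsigned long freed;

	/* refuse to re-enter the shrinker for a client already shrinking */
	if (test_and_set_bit(0, &c->shrinker_entered))
		return SHRINK_STOP;

	if (sc->gfp_mask & __GFP_FS)
		dm_bufio_lock(c);
	else if (!dm_bufio_trylock(c)) {
		clear_bit(0, &c->shrinker_entered);
		return SHRINK_STOP;
	}

	freed = __scan(c, sc->nr_to_scan, sc->gfp_mask);
	dm_bufio_unlock(c);
	clear_bit(0, &c->shrinker_entered);
	return freed;
}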

That said, it could be that other parts of dm-bufio are monopolizing the
lock as part of issuing normal IO (to your potentially slow backend), in
which case just taking the lock from the shrinker even once will block
like you've reported.

It does seem like additional analysis is needed to pinpoint exactly what
is occurring, or at least some additional clarification (e.g. are the
multiple tasks waiting for the bufio lock, as you reported in "1" above,
all stuck in the same shrinker waiting for the same bufio lock?).

But Mikulas, please have a look at this reported issue and let us know
your thoughts.

Thanks,
Mike



Re: [dm-devel] dm-thin: Why is DATA_DEV_BLOCK_SIZE_MIN_SECTORS set to 64k?

2018-06-12 Thread Joe Thornber
On Sat, Jun 09, 2018 at 07:31:54PM +, Eric Wheeler wrote:
> I understand the choice.  What I am asking is this: would it be safe to 
> let others make their own choice about block size provided they are warned 
> about the metadata-chunk-size/pool-size limit tradeoff?
> 
> If it is safe, can we relax the restriction?  For example, 16k chunks 
> still enable ~4TB pools, but with 1/4th of the CoW IO overhead on heavily 
> snapshotted environments.

Yes, it would be safe.  There are downsides though: all IO gets split
on block-size boundaries, so dropping to 16k or smaller could
seriously increase CPU usage.  Smaller blocks also mean more
mappings, more metadata, and more kernel memory consumption.
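
(For reference, the restriction being discussed is the lower bound
defined in drivers/md/dm-thin.c; quoting from memory, so double-check
the tree, but it is roughly:

/* The data block size of the pool must be between 64KB and 1GB. */
#define DATA_DEV_BLOCK_SIZE_MIN_SECTORS (64 * 1024 >> SECTOR_SHIFT)
#define DATA_DEV_BLOCK_SIZE_MAX_SECTORS (1024 * 1024 * 1024 >> SECTOR_SHIFT)

so relaxing it would just mean lowering DATA_DEV_BLOCK_SIZE_MIN_SECTORS
and accepting the costs above.)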

- Joe



Re: [dm-devel] [Query] Failed to create dm-crypt device when using AEAD type

2018-06-12 Thread Xiongfeng Wang
Hi Milan,

Thanks so much! It worked. No integrity errors are printed after I wipe the 
device.
I had been stuck on these errors for about a week. Thanks again!

Regards,
Xiongfeng

On 2018/6/12 13:50, Milan Broz wrote:
> On 06/12/2018 07:37 AM, Xiongfeng Wang wrote:
>> Hi Dm-crypt maintainers,
>>
>> Recently, I was testing dm-crypt, but I failed to create a dm-crypt device 
>> when using the AEAD type.
>> I would really appreciate it if you could give some help.
>> The error info is as follows:
>> localhost:~ # export SIZE_INT=997376
>> localhost:~ # dmsetup create integ1 --table "0 $SIZE_INT integrity /dev/sdd2 0 28 J 0"
>> localhost:~ # dmsetup create crypt1 --table "0 $SIZE_INT crypt capi:gcm(aes)-random \
>>  11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \
>>  0 /dev/mapper/integ1 0 1 integrity:28:aead"
>> [ 1746.631559] device-mapper: crypt: Integrity AEAD, tag size 16, IV size 12.
>> [ 1746.649796] device-mapper: crypt: INTEGRITY AEAD ERROR, sector 997248
>> [ 1746.656382] device-mapper: crypt: INTEGRITY AEAD ERROR, sector 997248
>> [ 1746.662826] Buffer I/O error on dev dm-3, logical block 124656, async page read
> 
> These errors actually say that it works as expected! :)
> 
> If the underlying device has no integrity tags initialized, *every* access
> to the device must generate an integrity failure (because the integrity tag
> is simply not correct).
> 
> And the errors above are perhaps udev scans triggered by inotify when the
> new device appears; blkid tries to find signatures on the uninitialized
> disk.
> 
> If you use cryptsetup, it will try to wipe the device; alternatively you can
> use dd (just be sure to use direct-io, since the page cache can generate
> some reads that fail as well), so in your case something like this:
> 
>   # dd if=/dev/zero of=/dev/mapper/crypt1 bs=1M oflag=direct
> 
> should wipe the device (and store integrity tags).
> 
> If you activate your devices again (with the same parameters), no integrity 
> errors should be present.
> 
> (I will write more documentation in the coming weeks regarding all this
> stuff; we now have better AEAD ciphers in 4.18.)
> 
> Thanks,
> Milan
>  
>>
>> I tested it both on qemu and on hardware, and it printed the same error.
>> The error always seems to be in the last several sectors within the 
>> SIZE_INT I designated.
>> When I change SIZE_INT, the error sector number also changes.
>> I think something went wrong in the software, not the hardware.
>>
>> My board doesn't have an AEAD accelerator, so it uses the 
>> software-implemented cipher.
>> My kernel version is 4.17-rc3.
>>
>> The command is as follows:
>> export SIZE_INT=997376
>> dmsetup create integ1 --table "0 $SIZE_INT integrity /dev/sdd2 0 28 J 0"
>> dmsetup create crypt1 --table "0 $SIZE_INT crypt capi:gcm(aes)-random \
>>  11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \
>>  0 /dev/mapper/integ1 0 1 integrity:28:aead"
>>
>> This command comes from the commit message of the commit which introduced 
>> AEAD.
>> (commit ef43aa38063a6b2b3c6618e28ab35794f4f1fe29
>> dm crypt: add cryptographic data integrity protection (authenticated 
>> encryption))
>> I only changed 'aes-gcm-random' to 'capi:gcm(aes)-random'.
>>
>> I would really appreciate it if you could have a look at it. Thanks!
>>
>> Regards,
>> Xiongfeng
>>
>>
> 
> 
