Verification of this commit with the linux-hwe-edge kernel in -proposed,
using the attached test-case "io_setup_v2.c"
commit 2a8a98673c13cb2a61a6476153acf8344adfa992
Author: Mauricio Faria de Oliveira <[email protected]>
Date: Wed Jul 5 10:53:16 2017 -0300
fs: aio: fix the increment of aio-nr and counting against aio-max-nr
Test-case (attached)
$ sudo apt-get install gcc libaio-dev
$ gcc -o io_setup_v2 io_setup_v2.c -laio
Original kernel:
- Only 409 io_contexts could be allocated,
but that took 130880 [ div by 2, per bug] = 65440 slots out of 65535
$ uname -rv
4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017
$ ./io_setup_v2 1 65536
nr_events: 1, nr_requests: 65536
rc = -11, i = 409
^Z
[1]+ Stopped ./io_setup_v2 1 65536
$ cat /proc/sys/fs/aio-nr
130880
$ cat /proc/sys/fs/aio-max-nr
65536
$ kill %%
Patched kernel:
- Now 65515 io_contexts could be allocated out of 65535 (much better)
(and reporting correctly, without div by 2.)
$ uname -rv
4.11.0-140-generic #20~16.04.1+bz146489 SMP Tue Sep 19 17:46:15 CDT 2017
$ ./io_setup_v2 1 65536
nr_events: 1, nr_requests: 65536
rc = -12, i = 65515
^Z
[1]+ Stopped ./io_setup_v2 1 65536
$ cat /proc/sys/fs/aio-nr
65515
$ kill %%
** Description changed:
Problem Description
=================================
I am facing this issue with Texan Flash storage 840 disks which are coming
from the Coho and Sailfish adapters.
The Coho adapter with 840 storage presents 3G disks, and the Sailfish
adapter with 840 presents 12G disks.
I am able to see those disks in lsblk output but not in multipath -ll
command output.
0004:01:00.0 Coho: Saturn-X U78C9.001.WZS0060-P1-C6
0x10000090fa2a51f8 host10 Online
0004:01:00.1 Coho: Saturn-X U78C9.001.WZS0060-P1-C6
0x10000090fa2a51f9 host11 Online
0005:09:00.0 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9
0x21000024ff787778 host2 Online
0005:09:00.1 Sailfish: QLogic 8GB U78C9.001.WZS0060-P1-C9
0x21000024ff787779 host4 Online
-
root@luckyv1:/dev/disk# multipath -ll | grep "size=3.0G" -B 1
root@luckyv1:/dev/disk# multipath -ll | grep "size=12G" -B 1
root@luckyv1:/dev/disk#
== Comment: #3 - Luciano Chavez <[email protected]> - 2016-09-20 20:22:20 ==
I edited /etc/multipath.conf and added
verbosity 6
to crank up the output and ran multipath -ll and saved it off to a text
file (attached). All the paths using the directio checker failed and those
using the tur checker seem to work.
Sep 20 20:07:36 | loading //lib/multipath/libcheckdirectio.so checker
Sep 20 20:07:36 | loading //lib/multipath/libprioconst.so prioritizer
Sep 20 20:07:36 | Discover device
/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai
Sep 20 20:07:36 | sdai: udev property ID_WWN whitelisted
Sep 20 20:07:36 | sdai: not found in pathvec
Sep 20 20:07:36 | sdai: mask = 0x25
Sep 20 20:07:36 | sdai: dev_t = 66:32
Sep 20 20:07:36 | open
'/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai/size'
Sep 20 20:07:36 | sdai: size = 20971520
Sep 20 20:07:36 | sdai: vendor = IBM
Sep 20 20:07:36 | sdai: product = FlashSystem-9840
Sep 20 20:07:36 | sdai: rev = 1442
Sep 20 20:07:36 | sdai: h:b:t:l = 3:0:0:0
Sep 20 20:07:36 | SCSI target 3:0:0 -> FC rport 3:0-2
Sep 20 20:07:36 | sdai: tgt_node_name = 0x500507605e839800
Sep 20 20:07:36 | open
'/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/state'
Sep 20 20:07:36 | sdai: path state = running
Sep 20 20:07:36 | sdai: get_state
Sep 20 20:07:36 | sdai: path_checker = directio (internal default)
Sep 20 20:07:36 | sdai: checker timeout = 30 ms (internal default)
Sep 20 20:07:36 | io_setup failed
Sep 20 20:07:36 | sdai: checker init failed
- == Comment: #4 - Luciano Chavez <[email protected]> - 2016-09-20 21:00:23 ==
- Hello Mauricio,
-
- I edited /etc/multipath.conf to use the tur path checker but the output
- still shows directio. Ideas?
-
- devices {
- device {
- vendor "IBM "
- product "FlashSystem-9840"
- path_selector "round-robin 0"
- path_grouping_policy multibus
- path_checker tur
- rr_weight uniform
- no_path_retry fail
- failback immediate
- dev_loss_tmo 300
- fast_io_fail_tmo 25
- }
- }
-
== Comment: #7 - Mauricio Faria De Oliveira <[email protected]> -
2016-09-27 18:32:57 ==
The function is failing at the io_setup() system call.
- @ checkers/directio.c
+ @ checkers/directio.c
- int libcheck_init (struct checker * c)
- {
- unsigned long pgsize = getpagesize();
- struct directio_context * ct;
- long flags;
+ int libcheck_init (struct checker * c)
+ {
+ unsigned long pgsize = getpagesize();
+ struct directio_context * ct;
+ long flags;
- ct = malloc(sizeof(struct directio_context));
- if (!ct)
- return 1;
- memset(ct, 0, sizeof(struct directio_context));
+ ct = malloc(sizeof(struct directio_context));
+ if (!ct)
+ return 1;
+ memset(ct, 0, sizeof(struct directio_context));
- if (io_setup(1, &ct->ioctx) != 0) {
- condlog(1, "io_setup failed");
- free(ct);
- return 1;
- }
- <...>
+ if (io_setup(1, &ct->ioctx) != 0) {
+ condlog(1, "io_setup failed");
+ free(ct);
+ return 1;
+ }
+ <...>
The syscall is failing with EAGAIN:
- # grep ^io_setup multipath_-v2_-d.strace
- io_setup(1, 0x100163c9130) = -1 EAGAIN (Resource temporarily unavailable)
- io_setup(1, 0x10015bae2c0) = -1 EAGAIN (Resource temporarily unavailable)
- io_setup(1, 0x100164d65a0) = -1 EAGAIN (Resource temporarily unavailable)
- io_setup(1, 0x10016429f20) = -1 EAGAIN (Resource temporarily unavailable)
- io_setup(1, 0x100163535c0) = -1 EAGAIN (Resource temporarily unavailable)
- io_setup(1, 0x10016368510) = -1 EAGAIN (Resource temporarily unavailable)
- <...>
+ # grep ^io_setup multipath_-v2_-d.strace
+ io_setup(1, 0x100163c9130) = -1 EAGAIN (Resource temporarily unavailable)
+ io_setup(1, 0x10015bae2c0) = -1 EAGAIN (Resource temporarily unavailable)
+ io_setup(1, 0x100164d65a0) = -1 EAGAIN (Resource temporarily unavailable)
+ io_setup(1, 0x10016429f20) = -1 EAGAIN (Resource temporarily unavailable)
+ io_setup(1, 0x100163535c0) = -1 EAGAIN (Resource temporarily unavailable)
+ io_setup(1, 0x10016368510) = -1 EAGAIN (Resource temporarily unavailable)
+ <...>
According to the manpage (man 2 io_setup)
- NAME
- io_setup - create an asynchronous I/O context
+ NAME
+ io_setup - create an asynchronous I/O context
- DESCRIPTION
- The io_setup() system call creates an asynchronous I/O context
suitable for concurrently processing nr_events operations. <...>
+ DESCRIPTION
+ The io_setup() system call creates an asynchronous I/O context
suitable for concurrently processing nr_events operations. <...>
- ERRORS
- EAGAIN The specified nr_events exceeds the user's limit of
available events, as defined in /proc/sys/fs/aio-max-nr.
-
+ ERRORS
+ EAGAIN The specified nr_events exceeds the user's limit of available
events, as defined in /proc/sys/fs/aio-max-nr.
On luckyv1:
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr
- 65536
+ root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr
+ 65536
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 130560
+ root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
+ 130560
According to linux's Documentation/sysctl/fs.txt [1]
- aio-nr & aio-max-nr:
+ aio-nr & aio-max-nr:
- aio-nr is the running total of the number of events specified on the
- io_setup system call for all currently active aio contexts. If aio-nr
- reaches aio-max-nr then io_setup will fail with EAGAIN. Note that
- raising aio-max-nr does not result in the pre-allocation or re-sizing
- of any kernel data structures.
+ aio-nr is the running total of the number of events specified on the
+ io_setup system call for all currently active aio contexts. If aio-nr
+ reaches aio-max-nr then io_setup will fail with EAGAIN. Note that
+ raising aio-max-nr does not result in the pre-allocation or re-sizing
+ of any kernel data structures.
Interestingly, aio-nr is greater than aio-max-nr. Hm.
Increased aio-max-nr to 262144, and could get some more maps created.
- Accidentally killed multipathd and got this:
-
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr
- 262144
-
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 0
-
- With just a few io_setup failures (previously there were hundreds..)
-
- root@luckyv1:~/mauricfo/bz146849/sep27# multipath -v2
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | io_setup failed
- Sep 27 17:11:22 | mpathcs: ignoring map
- Sep 27 17:11:22 | mpathcs: ignoring map
-
- Then noticed a huge increase:
-
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 354560
-
- multipathd was restarted automatically.
-
-
- The number decreases when multipathd is being stopped/shutdown:
-
- let this running:
- root@luckyv1:~# systemctl stop multipathd
-
- and watch it:
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 354560
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 276480
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 267520
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 262400
- ...
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 43520
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 29440
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 0
-
- Start it, and aio-nr goes high:
-
- root@luckyv1:~# cat /proc/sys/fs/aio-nr;
- 0
-
- root@luckyv1:~# systemctl start multipathd
-
- root@luckyv1:~# cat /proc/sys/fs/aio-nr;
- 523520
-
-
- And it seems there's something very wrong w/ the number of requests that are
successful, and the number reported in aio-nr:
-
- root@luckyv1:~/mauricfo/bz146849/sep27# grep -c 'io_setup.*= 0'
strace.multipathd_-d_-s.io_setup
- 409
-
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 519680
-
- root@luckyv1:~/mauricfo/bz146849/sep27# killall -9 multipathd
-
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
- 0
-
- All request are for a single (one) AIO context (nr_events parameter):
-
- root@luckyv1:~/mauricfo/bz146849/sep27# grep -m3 'io_setup.*= 0'
strace.multipathd_-d_-s.io_setup
- 130184 io_setup(1, [70366829412352]) = 0
- 130184 io_setup(1, [70366818795520]) = 0
- 130184 io_setup(1, [70366818729984]) = 0
-
- Checking further..
-
- [1] https://www.kernel.org/doc/Documentation/sysctl/fs.txt
== Comment: #8 - Mauricio Faria De Oliveira <[email protected]> -
2016-09-27 18:56:08 ==
This attached test-case demonstrates that for each io_setup() request of 1
nr_event, 1280 events actually seem to be allocated.
-
root@luckyv1:~/mauricfo/bz146849/sep27# gcc -o io_setup io_setup.c -laio
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
+ root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
0
root@luckyv1:~/mauricfo/bz146849/sep27# ./io_setup &
[1] 12352
io_setup rc = 0
sleeping 10 seconds...
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
+ root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
1280
<...>
io_destroy rc = 0
[1]+ Done ./io_setup
- root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
+ root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
0
-
- == Comment: #9 - Mauricio Faria De Oliveira <[email protected]> -
2016-09-27 19:17:13 ==
- (In reply to comment #8)
- > This attached test-case demonstrates that for each io_setup() request of 1
- > nr_event, actually 1280 seem to be allocated.
-
- The kernel code for the io_setup() syscall (call to ioctx_alloc())
- indeed does that.
-
- SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
- ...
- ioctx = ioctx_alloc(nr_events);
- ...
-
- static struct kioctx *ioctx_alloc(unsigned nr_events)
- ...
- nr_events = max(nr_events, num_possible_cpus() * 4);
- nr_events *= 2;
- ...
-
-
- The math is:
-
- root@luckyv1:~# cat /sys/devices/system/cpu/possible
- 0-159
-
- root@luckyv1:~# echo $((160*4*2))
- 1280
-
- Now, need to understand why.
-
- == Comment: #10 - Mauricio Faria De Oliveira <[email protected]> -
2016-09-27 19:22:44 ==
- Anyway, the number of paths to FlashSystem-9840 in the system is 424.
-
- root@luckyv1:~/mauricfo/bz146849/sep27# grep FlashSystem-9840
/sys/block/sd*/device/model | wc -l
- 424
-
- For each one there's an io_setup() call to allocate 1 nr_event, which
- actually becomes 1280, so we have in total... 542720, when the default
- aio-max-nr is 65536
-
- root@luckyv1:~/mauricfo/bz146849/sep27# echo $((424*1280))
- 542720
-
- So, obviously, it will exceed the max amount available (it already did,
- for some reason, as aio-nr showed 130560).
-
- The next step is to understand why this is done, whether this is a bug at
- all (you know, that multiplier of 160 * 4 is big on Power because we have
- lots of threads w/ P8/SMT8), and if so, whether there's a proper "fix" for
- this. It may very well be working correctly, and then we'll document this
- along with instructions to increase the limit for FlashSystem-9840.
-
- == Comment: #11 - Mauricio Faria De Oliveira <[email protected]> -
2016-09-27 19:30:47 ==
- (In reply to comment #10)
- > For each one there's an io_setup() call to allocate 1 nr_event, which
- > actually becomes 1280, so we have in total... 542720, when the default
- > aio-max-nr is 65536
- <...>
- > So, obviously, it will exceed the max amount available (it already did, for
- > some reason, as aio-nr showed 130560).
-
- Got that; the check is against 2x aio-max-nr:
-
- if (aio_nr + nr_events > (aio_max_nr * 2UL)
-
- 130560 + 1280 = 131840
- which is greater than
- 65536 * 2 = 131072
-
- so the check fails from that point on.
-
- 130560 (default of 64k aio-max-nr) is enough for 102 paths (130560 /
- 1280).
-
- == Comment: #14 - LEKSHMI C. PILLAI <[email protected]> - 2016-09-28
07:57:03 ==
- Hi
-
- I applied the workaround to increase aio-max-nr.
-
- I did it on luckyv1. I changed the default value to 1048576; it didn't
- work. Then I changed it to 4194304.
- root@luckyv1:/test/lucky# cat /proc/sys/fs/aio-max-nr
- 4194304
- root@luckyv1:/test/lucky#
-
- How I did it:
-
- Edited the value in /etc/sysctl.conf and ran sysctl -p /etc/sysctl.conf.
-
- After that I am able to see all the disks
-
- Thanks for the workaround
-
-
- Thanks
- Lekshmi
-
- == Comment: #15 - Mauricio Faria De Oliveira <[email protected]> -
2016-09-28 08:44:39 ==
- (In reply to comment #14)
- > I applied the workaround to increase aio-max-nr.
- >
- > I did it on luckyv1. I changed the default value to 1048576; it didn't
- > work. Then I changed it to 4194304.
- <...>
- > After that I am able to see all the disks
-
- Yes, the actual value that works may vary depending on what multipathd
- is doing, and by which time did you run which commands before/after
- multipathd was running.
-
- The point is, multipathd allocates a number of events. I'd expect 424
- FS9840 paths * 1280 events/path = 542720 events (if multipathd doesn't
- allocate a few for itself, path-independent), so theoretically, half of
- that (271360) or close should pass.
- However, if multipathd was already running (i.e., had allocated a number
- of events), and /then/ you run the multipath command to test, multipath
- would try to allocate /more/ events on top of those already allocated by
- multipathd, which would be much more likely to fail because a large
- number was already allocated.
-
- > Thanks for the workaround
-
- You're welcome!
-
- == Comment: #19 - Mauricio Faria De Oliveira <[email protected]> -
- 2016-10-03 10:44:10 ==
-
== Comment: #45 - Mauricio Faria De Oliveira <[email protected]> -
2017-09-19 18:32:10 ==
Verification of this commit with the linux-hwe-edge kernel in -proposed,
- using the attached test-case "io_setup_v2.c"
+ using the attached test-case "io_setup_v2.c"
- commit 2a8a98673c13cb2a61a6476153acf8344adfa992
- Author: Mauricio Faria de Oliveira <[email protected]>
- Date: Wed Jul 5 10:53:16 2017 -0300
+ commit 2a8a98673c13cb2a61a6476153acf8344adfa992
+ Author: Mauricio Faria de Oliveira <[email protected]>
+ Date: Wed Jul 5 10:53:16 2017 -0300
- fs: aio: fix the increment of aio-nr and counting against aio-max-nr
+ fs: aio: fix the increment of aio-nr and counting against aio-max-nr
Test-case (attached)
- $ sudo apt-get install gcc libaio-dev
- $ gcc -o io_setup_v2 io_setup_v2.c -laio
+ $ sudo apt-get install gcc libaio-dev
+ $ gcc -o io_setup_v2 io_setup_v2.c -laio
Original kernel:
- - Only 409 io_contexts could be allocated,
- but that took 130880 [ div by 2, per bug] = 65440 slots out of 65535
+ - Only 409 io_contexts could be allocated,
+ but that took 130880 [ div by 2, per bug] = 65440 slots out of 65535
- $ uname -rv
- 4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017
+ $ uname -rv
+ 4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017
- $ ./io_setup_v2 1 65536
- nr_events: 1, nr_requests: 65536
- rc = -11, i = 409
- ^Z
- [1]+ Stopped ./io_setup_v2 1 65536
+ $ ./io_setup_v2 1 65536
+ nr_events: 1, nr_requests: 65536
+ rc = -11, i = 409
+ ^Z
+ [1]+ Stopped ./io_setup_v2 1 65536
- $ cat /proc/sys/fs/aio-nr
- 130880
+ $ cat /proc/sys/fs/aio-nr
+ 130880
- $ cat /proc/sys/fs/aio-max-nr
- 65536
+ $ cat /proc/sys/fs/aio-max-nr
+ 65536
- $ kill %%
+ $ kill %%
Patched kernel:
- - Now 65515 io_contexts could be allocated out of 65535 (much better)
- (and reporting correctly, without div by 2.)
+ - Now 65515 io_contexts could be allocated out of 65535 (much better)
+ (and reporting correctly, without div by 2.)
- $ uname -rv
- 4.11.0-140-generic #20~16.04.1+bz146489 SMP Tue Sep 19 17:46:15 CDT 2017
+ $ uname -rv
+ 4.11.0-140-generic #20~16.04.1+bz146489 SMP Tue Sep 19 17:46:15 CDT 2017
- $ ./io_setup_v2 1 65536
- nr_events: 1, nr_requests: 65536
- rc = -12, i = 65515
- ^Z
- [1]+ Stopped ./io_setup_v2 1 65536
+ $ ./io_setup_v2 1 65536
+ nr_events: 1, nr_requests: 65536
+ rc = -12, i = 65515
+ ^Z
+ [1]+ Stopped ./io_setup_v2 1 65536
- $ cat /proc/sys/fs/aio-nr
- 65515
+ $ cat /proc/sys/fs/aio-nr
+ 65515
- $ kill %%
+ $ kill %%
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1718397
Title:
multipath -ll is not showing the disks which are actually multipath
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1718397/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs