Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-12-04 Thread Josh Hunt
On Tue, Dec 3, 2013 at 9:19 AM, Tejun Heo t...@kernel.org wrote:
 Hello,

 On Tue, Dec 03, 2013 at 08:28:43AM -0600, Josh Hunt wrote:
 You're right. Thanks for pointing this out. I did not realize there
 was a bug in the init script. The version of initramfs-tools I was
 using had the following bug:
 https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1215911

 Updating to 0.99ubuntu13.4 of initramfs-tools resolved my boot hangs.

 I did try using the workaround as suggested by Linus. In my setup the
 dm_init() code was hit, however it still appeared to be too late at
 times. I also tried moving the call to async_synchronize_full() above
 the for loop and it still had the same issue (patch attached.) Out of
 around 10 reboot tests it failed to find root 1 or 2 times.

 The ubuntu scripts don't ever actually call do_mount() if it can't
 find the device. It seems to rely on some udev functionality to tell
 it when the device is present, and if that fails it just bails out.

 This change has introduced a regression. However, I only noticed it
 b/c my init script had a bug which caused it not to wait around for
 the device to appear.

 Hmmm so, read the bug report, digged and asked around a bit.
 Here's the root problem - ubuntu's initramfs uses a tool to wait for
 the root device which uses libudev to listen for the device event;
 unfortunately, its rx buffer is not set large enough and the receiver
 isn't fast enough, which means that netlink broadcast messages from
 the kernel can overrun the buffer.  When that happens, it sets an
 error on the socket, so the next recv fails with -ENOBUFS.  If that
 happens, the wait for root aborts immediately and initramfs proceeds
 to mount non-existent root device.

 The only thing which changes by these patches is the timing of events.
 The problem likely wasn't as exposed before because things were slow
 enough so that either the messages could be consumed fast enough or
 there's enough delay between libata module load and the root device
 wait hiding the bug in the wait logic.

 So, yeah, it's a full blown timing bug.  I'm not sure what we can do
 to work around from kernel side except for randomly slowing things
 down or forcefully enlarging rx buffer size.  There really is no
 interlocking to take advantage of. :(

So there used to be a call to async_synchronize_full() in
ata_host_register(), but it was removed by
f29d3b23238e1955a8094e038c72546e99308e61 as part of some fastboot
changes. Adding it back (in the attached patch) seems to resolve the
issue when using the broken initrd. I'm guessing adding it back isn't
an option, but I wanted to point it out.

-- 
Josh
Index: b/drivers/ata/libata-core.c
===
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6181,12 +6181,14 @@ int ata_host_register(struct ata_host *h
 	/* perform each probe asynchronously */
 	for (i = 0; i  host-n_ports; i++) {
 		struct ata_port *ap = host-ports[i];
 		async_schedule(async_port_probe, ap);
 	}
 
+	async_synchronize_full();
+
 	return 0;
 
  err_tadd:
 	while (--i = 0) {
 		ata_tport_delete(host-ports[i]);
 	}


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-12-03 Thread Josh Hunt
On Tue, Nov 26, 2013 at 4:29 PM, Tejun Heo t...@kernel.org wrote:
 Hello,

 On Tue, Nov 26, 2013 at 04:12:41PM -0600, Josh Hunt wrote:
 I should have clarified that I'm not using dm/md in my setup. I know
 the modules are getting loaded in the log I attached, but root is not
 a md/dm device.

 Can you please still try it?  The init script is broken and we're now
 just trying to restore just enough of the old behavior so that the
 issue is not exposed.  The boot script in use seems to load md/dm
 modules after storage drivers and use their termination as the signal
 for storage ready, so it could be a good enough bandaid even if
 you're not using dm/md.

 Thanks.

 --
 tejun

Tejun

You're right. Thanks for pointing this out. I did not realize there
was a bug in the init script. The version of initramfs-tools I was
using had the following bug:
https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1215911

Updating to 0.99ubuntu13.4 of initramfs-tools resolved my boot hangs.

I did try using the workaround as suggested by Linus. In my setup the
dm_init() code was hit, however it still appeared to be too late at
times. I also tried moving the call to async_synchronize_full() above
the for loop and it still had the same issue (patch attached.) Out of
around 10 reboot tests it failed to find root 1 or 2 times.

The ubuntu scripts don't ever actually call do_mount() if it can't
find the device. It seems to rely on some udev functionality to tell
it when the device is present, and if that fails it just bails out.

This change has introduced a regression. However, I only noticed it
b/c my init script had a bug which caused it not to wait around for
the device to appear.

-- 
Josh
Index: linux-3.10/drivers/md/dm.c
===
--- linux-3.10.orig/drivers/md/dm.c
+++ linux-3.10/drivers/md/dm.c
@@ -16,12 +16,13 @@
 #include linux/bio.h
 #include linux/mempool.h
 #include linux/slab.h
 #include linux/idr.h
 #include linux/hdreg.h
 #include linux/delay.h
+#include linux/async.h
 
 #include trace/events/block.h
 
 #define DM_MSG_PREFIX core
 
 #ifdef CONFIG_PRINTK
@@ -275,12 +276,15 @@ static void (*_exits[])(void) = {
 static int __init dm_init(void)
 {
 	const int count = ARRAY_SIZE(_inits);
 
 	int r, i;
 
+	printk(KERN_CRIT DBG: %s: Calling async_synchronize_full();\n, __func__);
+	async_synchronize_full();
+
 	for (i = 0; i  count; i++) {
 		r = _inits[i]();
 		if (r)
 			goto bad;
 	}
 
Index: linux-3.10/drivers/md/md.c
===
--- linux-3.10.orig/drivers/md/md.c
+++ linux-3.10/drivers/md/md.c
@@ -48,12 +48,13 @@
 #include linux/file.h
 #include linux/compat.h
 #include linux/delay.h
 #include linux/raid/md_p.h
 #include linux/raid/md_u.h
 #include linux/slab.h
+#include linux/async.h
 #include md.h
 #include bitmap.h
 
 #ifndef MODULE
 static void autostart_arrays(int part);
 #endif
@@ -8573,12 +8574,14 @@ static void autostart_arrays(int part)
 	dev_t dev;
 	int i_scanned, i_passed;
 
 	i_scanned = 0;
 	i_passed = 0;
 
+	printk(KERN_CRIT DBG: %s: Calling async_synchronize_full()\n, __func__);
+	async_synchronize_full();
 	printk(KERN_INFO md: Autodetecting RAID arrays.\n);
 
 	while (!list_empty(all_detected_devices)  i_scanned  INT_MAX) {
 		i_scanned++;
 		node_detected_dev = list_entry(all_detected_devices.next,
 	struct detected_devices_node, list);


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-12-03 Thread Tejun Heo
Hello,

On Tue, Dec 03, 2013 at 08:28:43AM -0600, Josh Hunt wrote:
 You're right. Thanks for pointing this out. I did not realize there
 was a bug in the init script. The version of initramfs-tools I was
 using had the following bug:
 https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1215911
 
 Updating to 0.99ubuntu13.4 of initramfs-tools resolved my boot hangs.
 
 I did try using the workaround as suggested by Linus. In my setup the
 dm_init() code was hit, however it still appeared to be too late at
 times. I also tried moving the call to async_synchronize_full() above
 the for loop and it still had the same issue (patch attached.) Out of
 around 10 reboot tests it failed to find root 1 or 2 times.
 
 The ubuntu scripts don't ever actually call do_mount() if it can't
 find the device. It seems to rely on some udev functionality to tell
 it when the device is present, and if that fails it just bails out.
 
 This change has introduced a regression. However, I only noticed it
 b/c my init script had a bug which caused it not to wait around for
 the device to appear.

Hmmm so, read the bug report, digged and asked around a bit.
Here's the root problem - ubuntu's initramfs uses a tool to wait for
the root device which uses libudev to listen for the device event;
unfortunately, its rx buffer is not set large enough and the receiver
isn't fast enough, which means that netlink broadcast messages from
the kernel can overrun the buffer.  When that happens, it sets an
error on the socket, so the next recv fails with -ENOBUFS.  If that
happens, the wait for root aborts immediately and initramfs proceeds
to mount non-existent root device.

The only thing which changes by these patches is the timing of events.
The problem likely wasn't as exposed before because things were slow
enough so that either the messages could be consumed fast enough or
there's enough delay between libata module load and the root device
wait hiding the bug in the wait logic.

So, yeah, it's a full blown timing bug.  I'm not sure what we can do
to work around from kernel side except for randomly slowing things
down or forcefully enlarging rx buffer size.  There really is no
interlocking to take advantage of. :(

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-11-26 Thread Josh Hunt
On Mon, Aug 12, 2013 at 10:09 AM, Tejun Heo t...@kernel.org wrote:
 Hello, Jonathan.

 On Mon, Aug 12, 2013 at 12:04:11AM -0700, Jonathan Nieder wrote:
 My laptop fails to boot[1] with the message 'Volume group data not
 found'.  Bisects to v3.8-rc4~17 (the above commit).  Reverting that
 commit on top of current master (d92581fcad18, 2013-08-10) produces
 a working kernel.  dmesg output from that working kernel attached.
 More details, including .config, at [2].

 Any ideas for tracking this down?

 Which initrd / boot script are you using?  It looks like lvm assemble
 scripts are running before sdX are detected leading to volume assembly
 failure.  Before the patch, any module loading would end up
 synchronizing async probes but after the patch modprobe invocations
 which don't schedule them won't be.  Does your boot script happen to
 run multiple modprobes in parallel and proceed to configure lvm
 without waiting for modprobes of libata drivers to finish?

 Thanks.

 --
 tejun
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/

I'm also hitting a regression b/c of this patch. Booting 3.10.20 on a
number of older machines with onboard sata controllers are unable to
find their root device quickly enough. I bisected the issue down to
774a1221e862b343388347bac9b318767336b20b. Reverting it allows my
systems to boot fine. I'm seeing this with both sata_svw and ahci. I'm
using ubuntu precise userspace which is using initramfs-tools
0.99ubuntu13.1. From what I can tell the modprobes in this initrd are
done in serial, but the port probing is async. This allows init to
continue on to try and mount root, but it's not there yet.

I'm attaching some serial log output with initcall_debug enabled.

Both ahci and sata_svw call ata_host_activate(), which call
ata_host_register() and async_schedule(async_port_probe, ap).

Let me know what other information you may need.
Thanks

-- 
Josh


seriallog
Description: Binary data


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-11-26 Thread Linus Torvalds
On Tue, Nov 26, 2013 at 1:29 PM, Josh Hunt joshhun...@gmail.com wrote:

 Both ahci and sata_svw call ata_host_activate(), which call
 ata_host_register() and async_schedule(async_port_probe, ap).

Well, with the modern logic (only wait for async probing if the
module itself did async probing) the ahci and svw modules didn't
really change any behavior.

But other modules did. I wonder, for example, if people insmod the dm
module, and expect all devices to exist afterwards. Which the old
logic of we always wait for all async code regardless of whether we
started it ourselves would do, but the new logic does not.

Something similar might hit the (non-modular) md auto-detect ioctl.

So maybe we should just special-case those two issues, and say let's
just wait for async requests here

Something like the appended (whitespace-damaged) diff. Does that make
a difference to you guys? And if it does, can you check *which* of the
two async_synchronize_full() calls it is that matters for your cases?

 Linus

--- duh, apply by hand --

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 0704c523a76b..7e7a2f743b11 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -351,6 +351,7 @@ static int __init dm_init(void)
goto bad;
}

+   async_synchronize_full();
return 0;

   bad:
diff --git a/drivers/md/md.c b/drivers/md/md.c
index b6b7a2866c9e..1d173dc662fc 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8602,6 +8602,7 @@ static void autostart_arrays(int part)
i_scanned = 0;
i_passed = 0;

+   async_synchronize_full();
printk(KERN_INFO md: Autodetecting RAID arrays.\n);

while (!list_empty(all_detected_devices)  i_scanned  INT_MAX) {
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-11-26 Thread Josh Hunt
On Tue, Nov 26, 2013 at 3:53 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
 On Tue, Nov 26, 2013 at 1:29 PM, Josh Hunt joshhun...@gmail.com wrote:

 Both ahci and sata_svw call ata_host_activate(), which call
 ata_host_register() and async_schedule(async_port_probe, ap).

 Well, with the modern logic (only wait for async probing if the
 module itself did async probing) the ahci and svw modules didn't
 really change any behavior.

 But other modules did. I wonder, for example, if people insmod the dm
 module, and expect all devices to exist afterwards. Which the old
 logic of we always wait for all async code regardless of whether we
 started it ourselves would do, but the new logic does not.

 Something similar might hit the (non-modular) md auto-detect ioctl.

 So maybe we should just special-case those two issues, and say let's
 just wait for async requests here

 Something like the appended (whitespace-damaged) diff. Does that make
 a difference to you guys? And if it does, can you check *which* of the
 two async_synchronize_full() calls it is that matters for your cases?

  Linus

 --- duh, apply by hand --

 diff --git a/drivers/md/dm.c b/drivers/md/dm.c
 index 0704c523a76b..7e7a2f743b11 100644
 --- a/drivers/md/dm.c
 +++ b/drivers/md/dm.c
 @@ -351,6 +351,7 @@ static int __init dm_init(void)
 goto bad;
 }

 +   async_synchronize_full();
 return 0;

bad:
 diff --git a/drivers/md/md.c b/drivers/md/md.c
 index b6b7a2866c9e..1d173dc662fc 100644
 --- a/drivers/md/md.c
 +++ b/drivers/md/md.c
 @@ -8602,6 +8602,7 @@ static void autostart_arrays(int part)
 i_scanned = 0;
 i_passed = 0;

 +   async_synchronize_full();
 printk(KERN_INFO md: Autodetecting RAID arrays.\n);

 while (!list_empty(all_detected_devices)  i_scanned  INT_MAX) 
 {

I should have clarified that I'm not using dm/md in my setup. I know
the modules are getting loaded in the log I attached, but root is not
a md/dm device.

-- 
Josh
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-11-26 Thread Tejun Heo
Hello,

On Tue, Nov 26, 2013 at 04:12:41PM -0600, Josh Hunt wrote:
 I should have clarified that I'm not using dm/md in my setup. I know
 the modules are getting loaded in the log I attached, but root is not
 a md/dm device.

Can you please still try it?  The init script is broken and we're now
just trying to restore just enough of the old behavior so that the
issue is not exposed.  The boot script in use seems to load md/dm
modules after storage drivers and use their termination as the signal
for storage ready, so it could be a good enough bandaid even if
you're not using dm/md.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-11-26 Thread Linus Torvalds
On Tue, Nov 26, 2013 at 2:12 PM, Josh Hunt joshhun...@gmail.com wrote:

 I should have clarified that I'm not using dm/md in my setup. I know
 the modules are getting loaded in the log I attached, but root is not
 a md/dm device.

Hmm. The initcall debugging doesn't actually show any of the wait for
async events, because those debug messages come from
do_one_initcall(), and the waiting happens later. Plus your messages
don't actually show where you are trying to - and failing - to mount
the root filesystem.

Without that kind of information, it's kind of hard to guess. Maybe
you could add a few printk's to your kernel? Add one to
do_init_module() *after* the

if (current-flags  PF_USED_ASYNC)
async_synchronize_full();

thing, and another to fs/namespace.c: do_mount() (just put something like

printk(do_mount: %s at %s\n, dev_name, dir_name);

or whatever, so that we can see when that happens..

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-08-12 Thread Jonathan Nieder
Hi,

Tejun Heo wrote:

 This avoids the described deadlock because iosched module doesn't use
 async and thus wouldn't invoke async_synchronize_full().  This is
 hacky and incomplete.  It will deadlock if async module loading nests;
 however, this works around the known problem case and seems to be the
 best of bad options.

 For more details, please refer to the following thread.

   http://thread.gmane.org/gmane.linux.kernel/1420814

My laptop fails to boot[1] with the message 'Volume group data not
found'.  Bisects to v3.8-rc4~17 (the above commit).  Reverting that
commit on top of current master (d92581fcad18, 2013-08-10) produces
a working kernel.  dmesg output from that working kernel attached.
More details, including .config, at [2].

Any ideas for tracking this down?

Thanks,
Jonathan

[1] Screenshot: 
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=bad_3.10.3-1.jpg;att=1;bug=719464
Screenshot in recovery mode: 
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=bad_3.10.3-1_recovery.jpg;att=2;bug=719464
[2] http://bugs.debian.org/719464
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 3.11.0-rc4+ (jrn@elie) (gcc version 4.7.3 (Debian 
4.7.3-6) ) #2 SMP Sun Aug 11 23:20:20 PDT 2013
[0.00] Command line: BOOT_IMAGE=/vmlinuz-3.11.0-rc4+ 
root=/dev/mapper/data-root ro quiet
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x6e545fff] usable
[0.00] BIOS-e820: [mem 0x6e546000-0x6e745fff] ACPI NVS
[0.00] BIOS-e820: [mem 0x6e746000-0x6fd3efff] usable
[0.00] BIOS-e820: [mem 0x6fd3f000-0x6fdbefff] reserved
[0.00] BIOS-e820: [mem 0x6fdbf000-0x6febefff] ACPI NVS
[0.00] BIOS-e820: [mem 0x6febf000-0x6fef6fff] ACPI data
[0.00] BIOS-e820: [mem 0x6fef7000-0x6fef] usable
[0.00] BIOS-e820: [mem 0xf800-0xf8ff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xffe0-0x] reserved
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.6 present.
[0.00] DMI: TOSHIBA Satellite C650D/Portable PC, BIOS 1.60 09/02/2010
[0.00] e820: update [mem 0x-0x0fff] usable == reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] No AGP bridge found
[0.00] e820: last_pfn = 0x6ff00 max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B uncachable
[0.00]   C-F write-through
[0.00] MTRR variable ranges enabled:
[0.00]   0 base  mask C000 write-back
[0.00]   1 base 4000 mask E000 write-back
[0.00]   2 base 6000 mask F000 write-back
[0.00]   3 base 6FEBE000 mask F000 uncachable
[0.00]   4 base FFE0 mask FFE0 write-protect
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[0.00] Base memory trampoline at [88099000] 99000 size 24576
[0.00] Using GB pages for direct mapping
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00]  [mem 0x-0x000f] page 4k
[0.00] BRK [0x0184c000, 0x0184cfff] PGTABLE
[0.00] BRK [0x0184d000, 0x0184dfff] PGTABLE
[0.00] BRK [0x0184e000, 0x0184efff] PGTABLE
[0.00] init_memory_mapping: [mem 0x6fa0-0x6fbf]
[0.00]  [mem 0x6fa0-0x6fbf] page 2M
[0.00] BRK [0x0184f000, 0x0184] PGTABLE
[0.00] init_memory_mapping: [mem 0x6c00-0x6e545fff]
[0.00]  [mem 0x6c00-0x6e3f] page 2M
[0.00]  [mem 0x6e40-0x6e545fff] page 4k
[0.00] BRK [0x0185, 0x01850fff] PGTABLE
[0.00] init_memory_mapping: [mem 0x6e746000-0x6f9f]
[0.00]  [mem 0x6e746000-0x6e7f] page 4k
[0.00]  [mem 0x6e80-0x6f9f] page 2M
[0.00] init_memory_mapping: [mem 0x0010-0x6bff]
[0.00]  [mem 0x0010-0x001f] page 4k
[0.00]  [mem 0x0020-0x6bff] page 2M
[0.00] 

Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-08-12 Thread Tejun Heo
Hello, Jonathan.

On Mon, Aug 12, 2013 at 12:04:11AM -0700, Jonathan Nieder wrote:
 My laptop fails to boot[1] with the message 'Volume group data not
 found'.  Bisects to v3.8-rc4~17 (the above commit).  Reverting that
 commit on top of current master (d92581fcad18, 2013-08-10) produces
 a working kernel.  dmesg output from that working kernel attached.
 More details, including .config, at [2].
 
 Any ideas for tracking this down?

Which initrd / boot script are you using?  It looks like lvm assemble
scripts are running before sdX are detected leading to volume assembly
failure.  Before the patch, any module loading would end up
synchronizing async probes but after the patch modprobe invocations
which don't schedule them won't be.  Does your boot script happen to
run multiple modprobes in parallel and proceed to configure lvm
without waiting for modprobes of libata drivers to finish?

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-16 Thread Alex Riesen
On Wed, Jan 16, 2013 at 3:52 AM, Tejun Heo t...@kernel.org wrote:
 This avoids the described deadlock because iosched module doesn't use
 async and thus wouldn't invoke async_synchronize_full().  This is
 hacky and incomplete.  It will deadlock if async module loading nests;
 however, this works around the known problem case and seems to be the
 best of bad options.

I confirm it fixes the original problem.

Tested-by: Alex Riesen raa.l...@gmail.com

[   27.594951] hub 1-1:1.0: state 7 ports 6 chg  evt 0004
[   27.595245] hub 1-1:1.0: port 2, status 0101, change 0001, 12 Mb/s
[   27.698995] hub 1-1:1.0: debounce: port 2: total 100ms stable 100ms
status 0x101
[   27.709977] hub 1-1:1.0: port 2 not reset yet, waiting 10ms
[   27.771888] usb 1-1.2: new high-speed USB device number 3 using ehci-pci
[   27.782871] hub 1-1:1.0: port 2 not reset yet, waiting 10ms
[   27.857503] usb 1-1.2: default language 0x0409
[   27.858248] usb 1-1.2: udev 3, busnum 1, minor = 2
[   27.858258] usb 1-1.2: New USB device found, idVendor=0781, idProduct=5408
[   27.858263] usb 1-1.2: New USB device strings: Mfr=1, Product=2,
SerialNumber=3
[   27.858267] usb 1-1.2: Product: U3 Titanium
[   27.858271] usb 1-1.2: Manufacturer: SanDisk Corporation
[   27.858275] usb 1-1.2: SerialNumber: 187A3A60F1E9
[   27.858800] usb 1-1.2: usb_probe_device
[   27.858815] usb 1-1.2: configuration #1 chosen from 1 choice
[   27.858940] usb 1-1.2: adding 1-1.2:1.0 (config #1, interface 0)
[   27.859246] usb-storage 1-1.2:1.0: usb_probe_interface
[   27.859258] usb-storage 1-1.2:1.0: usb_probe_interface - got id
[   27.859516] scsi6 : usb-storage 1-1.2:1.0
[   28.865771] io scheduler deadline registered (default)
[   28.866705] scsi 6:0:0:0: Direct-Access SanDisk  U3 Titanium
  2.18 PQ: 0 ANSI: 2
[   28.869483] sd 6:0:0:0: Attached scsi generic sg1 type 0
[   28.869700] sd 6:0:0:0: [sdb] 4013713 512-byte logical blocks:
(2.05 GB/1.91 GiB)
[   28.870197] sd 6:0:0:0: [sdb] Write Protect is off
[   28.870204] sd 6:0:0:0: [sdb] Mode Sense: 03 00 00 00
[   28.870692] sd 6:0:0:0: [sdb] No Caching mode page present
[   28.870697] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[   28.873565] sd 6:0:0:0: [sdb] No Caching mode page present
[   28.873575] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[   28.883895]  sdb: sdb1
[   28.887775] sd 6:0:0:0: [sdb] No Caching mode page present
[   28.887783] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[   28.887789] sd 6:0:0:0: [sdb] Attached SCSI removable disk

The filesystem can be mounted and files can be read.
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-16 Thread Arjan van de Ven



As Arjan suggested, trying to load the default modules right after the
initial rootfs mount could be an acceptable compromise and it would be
really nice (for both code sanity and avoiding future problems) to be
able to declare module loading nested inside async unspported.


we can even try twice

the first time right after we mount the initramfs
the second time when the initramfs code exits, and before we exec init
(the initramfs supposedly mounted the real root fs at this point)

if you want your elevator to apply to your root filesystem storage, the rule
will then be put the module in the initramfs... but to be honest,
that's not a restriction that is unreasonable or unexpected.


for doing a module loading from inside an async handler..we can then just make
use of the normal load this module async way of requesting a module.

--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-16 Thread Linus Torvalds
On Wed, Jan 16, 2013 at 9:03 AM, Arjan van de Ven ar...@linux.intel.com wrote:

 we can even try twice

 the first time right after we mount the initramfs
 the second time when the initramfs code exits, and before we exec init
 (the initramfs supposedly mounted the real root fs at this point)

Yes. This, together with don't try request_module for the default
elevator, and the warn if somebody does request_module from async
context would, I think, be the right thing to do.

In the meantime, I've applied Tejun's patch. It possibly speeds things
up regardless of this particular deadlock thing, and while it's not
pretty it certainly isn't horribly nasty or very invasive either, so I
don't see any reason to delay it just because there might be a better
solution some day.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 6:52 PM, Tejun Heo t...@kernel.org wrote:

 It makes me feel dirty but makes the problem go away and I can't think
 of anything better, so here is the implementation of used async
 workaround.

Ok, people, can we get a tested-by (or Nope, doesn't work) from the
people who saw this?

That said, maybe we could just make the rule be that you can't pick a
default IO scheduler that is modular.

And I *would* like to see the warning we discussed. Maybe there are
other situations that can trigger this?

Because something like that

WARN_ON_ONCE(wait  i_am_async()  system_state == SYSTEM_RUNNING);

in kernel/kmod.c (__request_module()) still sounds like a good idea to
verify that this is the only thing that triggers it (of course, we'd
need to somehow avoid the warning for the known case with the known
workaround).

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-15 Thread Tejun Heo
Hello, Linus.

On Tue, Jan 15, 2013 at 07:00:31PM -0800, Linus Torvalds wrote:
 That said, maybe we could just make the rule be that you can't pick a
 default IO scheduler that is modular.

This is definitely much more preferable but it would affect use case
where everything is built modular and the elevator is selected via
kernel param.  This is way outside the usual usage and we can warn
about the new behavior but it still is an observable behavior change.
Do you think this would be okay?

 And I *would* like to see the warning we discussed. Maybe there are
 other situations that can trigger this?
 
 Because something like that
 
 WARN_ON_ONCE(wait  i_am_async()  system_state == SYSTEM_RUNNING);
 
 in kernel/kmod.c (__request_module()) still sounds like a good idea to
 verify that this is the only thing that triggers it (of course, we'd
 need to somehow avoid the warning for the known case with the known
 workaround).

And then this warning can be added without introducing
request_module_but_dont_warn_about_being_called_from_async().

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-15 Thread Ming Lei
On Wed, Jan 16, 2013 at 10:52 AM, Tejun Heo t...@kernel.org wrote:
 If the default iosched is built as module, the kernel may deadlock
 while trying to load the iosched module on device probe if the probing
 was running off async.  This is because async_synchronize_full() at
 the end of module init ends up waiting for the async job which
 initiated the module loading.

  async Amodprobe

  1. finds a device
  2. registers the block device
  3. request_module(default iosched)
 4. modprobe in userland
 5. load and init module
 6. async_synchronize_full()

 Async A waits for modprobe to finish in request_module() and modprobe
 waits for async A to finish in async_synchronize_full().

 Because there's no easy to track dependency once control goes out to
 userland, implementing properly nested flushing is difficult.  For
 now, make module init perform async_synchronize_full() iff module init
 has queued async jobs as suggested by Linus.

 This avoids the described deadlock because iosched module doesn't use
 async and thus wouldn't invoke async_synchronize_full().  This is
 hacky and incomplete.  It will deadlock if async module loading nests;
 however, this works around the known problem case and seems to be the
 best of bad options.

 For more details, please refer to the following thread.

   http://thread.gmane.org/gmane.linux.kernel/1420814

 Signed-off-by: Tejun Heo t...@kernel.org
 Reported-by: Alex Riesen raa.l...@gmail.com
 Cc: Linus Torvalds torva...@linux-foundation.org
 ---

Looks it does fix the deadlock problem on my Pandaboard,
also the scsi disk device node(/dev/sdX) comes just
after loading module of 'sd_mod'.

Tested-by: Ming Lei ming@canonical.com

Thanks,
--
Ming Lei
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 7:25 PM, Tejun Heo t...@kernel.org wrote:
 Hello, Linus.

 On Tue, Jan 15, 2013 at 07:00:31PM -0800, Linus Torvalds wrote:
 That said, maybe we could just make the rule be that you can't pick a
 default IO scheduler that is modular.

 This is definitely much more preferable but it would affect use case
 where everything is built modular and the elevator is selected via
 kernel param.  This is way outside the usual usage and we can warn
 about the new behavior but it still is an observable behavior change.
 Do you think this would be okay?

I do want the same user-visible semantics, so it's not some one-liner.

The compiled-in elevator would be easy enough to handle in the Kconfig
file (maybe we do already, I didn't even bother to check). The real
problem is the chosen_elevator one, which is dynamic with the kernel
command line. And we could handle that one by just trying to load the
module early (but exactly _when_?) and then instead of looking things
up by name, just keep a pointer to the default elevator around.

But no, it's not just some trivial one-liner. Especially the question
about when to try to load the module that is given on the kernel
command line is not trivial. Do we require that the module is in the
initrd and loadable basically immediate at boot? Do we try again after
switching the root filesystem? Things like that..

 And then this warning can be added without introducing
 request_module_but_dont_warn_about_being_called_from_async().

I do agree that it would be much nicer that way.

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-15 Thread Rusty Russell
Tejun Heo t...@kernel.org writes:
 --- a/kernel/module.c
 +++ b/kernel/module.c
 @@ -3058,8 +3064,25 @@ static int do_init_module(struct module
   blocking_notifier_call_chain(module_notify_list,
MODULE_STATE_LIVE, mod);
  
 - /* We need to finish all async code before the module init sequence is 
 done */
 - async_synchronize_full();

Linus put async_synchronize_full() here as a fix but beware: you can
start using the module before this call.  Normally the potential caller
is the one requesting the module load so it works, but if we get more
async stuff we may land in that hole.

Changing every caller of any async-initializing service is not going to
be pretty, but maybe put an async_cookie_t in struct module for
module_init to use, and sync it in try_module_get()?  Which would now
need a can_sleep flag... but the result would be more async.

Cheers,
Rusty.
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html