Bug#536860: linux-image-2.6.26-2-sparc64: Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c
On Mon, 2011-08-15 at 00:48 -0500, Jonathan Nieder wrote: Hi, Jonathan Nieder wrote: I wonder how this should be dealt with wrt squeeze. For your amusement, here's an ugly proof-of-concept patch (against 2.6.32.y) that just does unaligned reads from struct taskstats. The only redeeming feature is that it doesn't break ABI. I'd be curious to hear whether it works and the effect on performance. If it were to make sense to actually apply something like this, it would only be on 64-bit arches that do not have efficient unaligned access. get_unaligned() and put_unaligned() will be 'free' on the architectures that do have efficient unaligned access; i.e. the compiler should expand the various inline functions and macros to a simple load or store instruction. So these functions should be sufficient: static inline u64 get_misaligned_64(const u64 *ptr) { if (sizeof(long) == 8) return get_unaligned(ptr); else return *ptr; } static inline void put_misaligned_64(u64 value, u64 *ptr) { if (sizeof(long) == 8) put_unaligned(value, ptr); else *ptr = value; } Ben. signature.asc Description: This is a digitally signed message part
Bug#536860: linux-image-2.6.26-2-sparc64: Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c
Hi, Jonathan Nieder wrote: I wonder how this should be dealt with wrt squeeze. For your amusement, here's an ugly proof-of-concept patch (against 2.6.32.y) that just does unaligned reads from struct taskstats. The only redeeming feature is that it doesn't break ABI. I'd be curious to hear whether it works and the effect on performance. If it were to make sense to actually apply something like this, it would only be on 64-bit arches that do not have efficient unaligned access. From: Jonathan Nieder jrnie...@gmail.com Date: Mon, 15 Aug 2011 00:32:02 -0500 Subject: taskstats: use unaligned accesses to work around alignment issues As upstream commit 4be2c95d1f77 explains, the taskstats struct is internally aligned to 8-byte boundaries but because it is preceded by two NLA headers (4 bytes each) and a pid (4 bytes) ends up being misaligned when returned from mk_reply(), resulting in unaligned accesses when it is passed around and populated after that. A few options for fixing that: - go back to populating a taskstats struct on the stack or heap and copying it back. Since the structure weighs in at 328 bytes, that is somewhat costly - change mk_reply to start the reply at an 8-byte boundary. This is what is done upstream on the affected platforms, but unfortunately that breaks compatibility with iotop versions before 0.4.2. - use unaligned stores and loads on the taskstats struct returned from mk_reply. That is what this patch does. Intended for testing only, not for upstream application or application to any public tree for that matter. This patch would not have any benefit unless on a 64-bit arch without efficient unaligned accesses. Signed-off-by: Jonathan Nieder jrnie...@gmail.com --- kernel/delayacct.c | 54 --- kernel/taskstats.c | 11 ++--- kernel/tsacct.c| 54 --- 3 files changed, 71 insertions(+), 48 deletions(-) diff --git a/kernel/delayacct.c b/kernel/delayacct.c index ead9b610aa71..69f7e8fe7b53 100644 --- a/kernel/delayacct.c +++ b/kernel/delayacct.c @@ -19,6 +19,7 @@ #include linux/time.h #include linux/sysctl.h #include linux/delayacct.h +#include asm/unaligned.h int delayacct_on __read_mostly = 1;/* Delay accounting turned on/off */ struct kmem_cache *delayacct_cache; @@ -97,6 +98,18 @@ void __delayacct_blkio_end(void) current-delays-blkio_count); } +static void update_stat(s64 *stat, s64 updated) +{ + put_unaligned(updated get_unaligned(stat) ? 0 : updated, + stat); +} + +static void update_stat_unsigned(u64 *stat, u64 updated) +{ + put_unaligned(updated get_unaligned(stat) ? 0 : updated, + stat); +} + int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) { s64 tmp; @@ -111,16 +124,15 @@ int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) if (!tsk-delays) goto done; - tmp = (s64)d-cpu_run_real_total; + tmp = get_unaligned((s64 *) d-cpu_run_real_total); cputime_to_timespec(tsk-utime + tsk-stime, ts); tmp += timespec_to_ns(ts); - d-cpu_run_real_total = (tmp (s64)d-cpu_run_real_total) ? 0 : tmp; + update_stat((s64 *) d-cpu_run_real_total, tmp); - tmp = (s64)d-cpu_scaled_run_real_total; + tmp = get_unaligned((s64 *) d-cpu_scaled_run_real_total); cputime_to_timespec(tsk-utimescaled + tsk-stimescaled, ts); tmp += timespec_to_ns(ts); - d-cpu_scaled_run_real_total = - (tmp (s64)d-cpu_scaled_run_real_total) ? 0 : tmp; + update_stat((s64 *) d-cpu_scaled_run_real_total, tmp); /* * No locking available for sched_info (and too expensive to add one) @@ -130,27 +142,29 @@ int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk) t2 = tsk-sched_info.run_delay; t3 = tsk-se.sum_exec_runtime; - d-cpu_count += t1; + put_unaligned(get_unaligned(d-cpu_count) + t1, d-cpu_count); - tmp = (s64)d-cpu_delay_total + t2; - d-cpu_delay_total = (tmp (s64)d-cpu_delay_total) ? 0 : tmp; + tmp = get_unaligned((s64 *) d-cpu_delay_total) + t2; + update_stat((s64 *) d-cpu_delay_total, tmp); - tmp = (s64)d-cpu_run_virtual_total + t3; - d-cpu_run_virtual_total = - (tmp (s64)d-cpu_run_virtual_total) ? 0 : tmp; + tmp = get_unaligned((s64 *) d-cpu_run_virtual_total) + t3; + update_stat((s64 *) d-cpu_run_virtual_total, tmp); /* zero XXX_total, non-zero XXX_count implies XXX stat overflowed */ spin_lock_irqsave(tsk-delays-lock, flags); - tmp = d-blkio_delay_total + tsk-delays-blkio_delay; - d-blkio_delay_total = (tmp d-blkio_delay_total) ? 0 : tmp; - tmp = d-swapin_delay_total + tsk-delays-swapin_delay; - d-swapin_delay_total = (tmp d-swapin_delay_total) ? 0 :
Bug#536860: linux-image-2.6.26-2-sparc64: Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c
Paul Wise wrote: I tried to backport the iotop fixes to 0.4 but there were some merge conflicts so I abandoned that effort for now. (Applying a7334ca3 (Removing dead code) along with 08211d20 and 7f1773e7 seems to do it fwiw.) If you use a newer kernel than squeeze, you should use backports iotop. Right --- since the kernel change breaks ABI, I think we shouldn't backport it to squeeze. If the dmesg spam and slow unaligned accesses start to hurt, it's not hard to use a newer kernel and backports iotop. Thanks, that helped. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110810044559.ga27...@elie.gateway.2wire.net
Bug#536860: linux-image-2.6.26-2-sparc64: Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c
tags 536860 = upstream fixed-upstream quit Hi, Gabriel VLASIU wrote: Running iotop I get lots of: Kernel unaligned access at TPC[48bef4] __delayacct_add_tsk+0x2c/0x15c I suspect this was fixed by v2.6.38-rc1~296 (taskstats: use better ifdef for alignment, 2011-01-12). Could you test a recent kernel from sid or experimental, or try the following patch? Thanks, Jonathan commit 9ab020cf Author: Jeff Mahoney je...@suse.com Date: Wed Jan 12 17:00:48 2011 -0800 taskstats: use better ifdef for alignment Commit 4be2c95d (taskstats: pad taskstats netlink response for aligment issues on ia64) added a null field to align the taskstats structure but the discussion centered around ia64. The issue exists on other platforms with inefficient unaligned access and adding them piecemeal would be an unmaintainable mess. This patch uses Dave Miller's suggestion of using a combination of CONFIG_64BIT !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to determine whether alignment is needed. Note that this will cause breakage on those platforms with applications like iotop which had hard-coded offsets into the packet to access the taskstats structure. The message seen on systems without the alignment fixes looks like: kernel unaligned access to 0xe23879dca9bc, ip=0xa00100133d10 The addresses may vary but resolve to locations inside __delayacct_add_tsk. iotop makes what I'd call unreasonable assumptions about the contents of a netlink genetlink packet containing generic attributes. They're typed and have headers that specify value lengths, so the client can (should) identify and skip the ones the client doesn't understand. The kernel, as of version 2.6.36, presented a packet like so: ++ | genlmsghdr - 4 bytes | ++ | NLA header - 4 bytes | /* Aggregate header */ +-+--+ | | NLA header - 4 bytes | /* PID header */ | +--+ | | pid/tgid - 4 bytes | | +--+ | | NLA header - 4 bytes | /* stats header */ | + -+ - oops. aligned on 4 byte boundary | | struct taskstats - 328 bytes | +-+--+ The iotop code expects that the kernel will behave as it did then, assuming that the packet format is set in stone. The format is set in stone, but the packet offsets are not. There's nothing in the packet format that guarantees that the packet will be sent in exactly the same way. The attribute contents are set (or versioned) and the aggregate contents are set but they can be anywhere in the packet. The issue here isn't that an unaligned structure gets passed to userspace, it's that the NLA infrastructure has something of a weakness: The 4 byte attribute header may force the payload to be unaligned. The taskstats structure is created at an unaligned location and then 64-bit values are operated on inside the kernel, so the unaligned access warnings gets spewed everywhere. It's possible to use the unaligned access API to operate on the structure in the kernel but it seems like a wasted effort to work around userspace code that isn't following the packet format. Any new additions would also need the be worked around. It's a maintenance nightmare. The conclusion of the earlier discussion seemed to be ok fine, if we have to break it, don't break it on arches that don't have the problem. Dave pointed out that the unaligned access problem doesn't only exist on ia64, but also on other 64-bit arches that don't have efficient unaligned access and it should be fixed there as well. The committed version of the patch and this addition keep with the conclusion of that discussion not to break it unnecessarily, which the pid padding and the packet padding fixes did do. x86_64 and powerpc don't suffer this problem so they shouldn't suffer the solution. Other 64-bit architectures do and will, though. Signed-off-by: Jeff Mahoney je...@suse.com Reported-by: David S. Miller da...@davemloft.net Acked-by: David S. Miller da...@davemloft.net Cc: Dan Carpenter erro...@gmail.com Cc: Balbir Singh bal...@in.ibm.com Cc: Florian Mickler flor...@mickler.org Cc: Guillaume Chazarain guic...@gmail.com Signed-off-by: Andrew Morton a...@linux-foundation.org Signed-off-by: Linus Torvalds torva...@linux-foundation.org diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 69691eb4..3971c6b9 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -348,7 +348,7 @@ static int parse(struct nlattr *na, struct cpumask *mask) return ret; } -#ifdef CONFIG_IA64 +#if defined(CONFIG_64BIT)
Bug#536860: linux-image-2.6.26-2-sparc64: Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c
Jonathan Nieder wrote: Could you test a recent kernel from sid or experimental, or try the following patch? [...] taskstats: use better ifdef for alignment The following changes would need to be applied first for it to work, of course. commit 85893120 Author: Jeff Mahoney je...@suse.com Date: Wed Oct 27 15:34:43 2010 -0700 delayacct: align to 8 byte boundary on 64-bit systems prepare_reply() sets up an skb for the response. The payload contains: ++ | genlmsghdr - 4 bytes | ++ | NLA header - 4 bytes | /* Aggregate header */ +-+--+ | | NLA header - 4 bytes | /* PID header */ | +--+ | | pid/tgid - 4 bytes | | +--+ | | NLA header - 4 bytes | /* stats header */ | + -+ - oops. aligned on 4 byte boundary | | struct taskstats - 328 bytes | +-+--+ The start of the taskstats struct must be 8 byte aligned on IA64 (and other systems with 8 byte alignment rules for 64-bit types) or runtime alignment warnings will be issued. This patch pads the pid/tgid field out to sizeof(long), which forces the alignment of taskstats. The getdelays userspace code is ok with this since it assumes 32-bit pid/tgid and then honors that header's length field. An array is used to avoid exposing kernel memory contents to userspace in the response. Signed-off-by: Jeff Mahoney je...@suse.com Cc: Balbir Singh bal...@in.ibm.com Signed-off-by: Andrew Morton a...@linux-foundation.org Signed-off-by: Linus Torvalds torva...@linux-foundation.org diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 11281d57..5a651aa6 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -360,6 +360,12 @@ static struct taskstats *mk_reply(struct sk_buff *skb, int type, u32 pid) struct nlattr *na, *ret; int aggr; + /* If we don't pad, we end up with alignment on a 4 byte boundary. +* This causes lots of runtime warnings on systems requiring 8 byte +* alignment */ + u32 pids[2] = { pid, 0 }; + int pid_size = ALIGN(sizeof(pid), sizeof(long)); + aggr = (type == TASKSTATS_TYPE_PID) ? TASKSTATS_TYPE_AGGR_PID : TASKSTATS_TYPE_AGGR_TGID; @@ -367,7 +373,7 @@ static struct taskstats *mk_reply(struct sk_buff *skb, int type, u32 pid) na = nla_nest_start(skb, aggr); if (!na) goto err; - if (nla_put(skb, type, sizeof(pid), pid) 0) + if (nla_put(skb, type, pid_size, pids) 0) goto err; ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats)); if (!ret) commit 4be2c95d Author: Jeff Mahoney je...@suse.com Date: Tue Dec 21 17:24:30 2010 -0800 taskstats: pad taskstats netlink response for aligment issues on ia64 The taskstats structure is internally aligned on 8 byte boundaries but the layout of the aggregrate reply, with two NLA headers and the pid (each 4 bytes), actually force the entire structure to be unaligned. This causes the kernel to issue unaligned access warnings on some architectures like ia64. Unfortunately, some software out there doesn't properly unroll the NLA packet and assumes that the start of the taskstats structure will always be 20 bytes from the start of the netlink payload. Aligning the start of the taskstats structure breaks this software, which we don't want. So, for now the alignment only happens on architectures that require it and those users will have to update to fixed versions of those packages. Space is reserved in the packet only when needed. This ifdef should be removed in several years e.g. 2012 once we can be confident that fixed versions are installed on most systems. We add the padding before the aggregate since the aggregate is already a defined type. Commit 85893120 (delayacct: align to 8 byte boundary on 64-bit systems) previously addressed the alignment issues by padding out the pid field. This was supposed to be a compatible change but the circumstances described above mean that it wasn't. This patch backs out that change, since it was a hack, and introduces a new NULL attribute type to provide the padding. Padding the response with 4 bytes avoids allocating an aligned taskstats structure and copying it back. Since the structure weighs in at 328 bytes, it's too big to do it on the stack. Signed-off-by: Jeff Mahoney je...@suse.com Reported-by: Brian Rogers br...@xyzw.org Cc: Jeff Mahoney je...@suse.com Cc: Guillaume Chazarain guic...@gmail.com Cc: Balbir Singh bal...@in.ibm.com
Bug#536860: linux-image-2.6.26-2-sparc64: Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c
Package: linux-image-2.6.26-2-sparc64 Version: 2.6.26-17 Severity: important Running iotop I get lots of: Jul 14 09:44:30 mailq kernel: [1285719.964489] Kernel unaligned access at TPC[48bef4] __delayacct_add_tsk+0x2c/0x15c Jul 14 09:44:30 mailq kernel: [1285719.964558] Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c Jul 14 09:44:30 mailq kernel: [1285719.964594] Kernel unaligned access at TPC[48bf14] __delayacct_add_tsk+0x4c/0x15c Jul 14 09:44:30 mailq kernel: [1285719.964630] Kernel unaligned access at TPC[48bf28] __delayacct_add_tsk+0x60/0x15c Jul 14 09:44:30 mailq kernel: [1285719.964667] Kernel unaligned access at TPC[48bf48] __delayacct_add_tsk+0x80/0x15c Jul 14 09:44:36 mailq kernel: [1285726.011965] Kernel unaligned access at TPC[48bef4] __delayacct_add_tsk+0x2c/0x15c Jul 14 09:44:36 mailq kernel: [1285726.012031] Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c Jul 14 09:44:36 mailq kernel: [1285726.012066] Kernel unaligned access at TPC[48bf14] __delayacct_add_tsk+0x4c/0x15c Jul 14 09:44:36 mailq kernel: [1285726.012100] Kernel unaligned access at TPC[48bf28] __delayacct_add_tsk+0x60/0x15c Jul 14 09:44:36 mailq kernel: [1285726.012135] Kernel unaligned access at TPC[48bf48] __delayacct_add_tsk+0x80/0x15c Jul 14 09:45:14 mailq kernel: [1285764.277939] Kernel unaligned access at TPC[48bef4] __delayacct_add_tsk+0x2c/0x15c Jul 14 09:45:14 mailq kernel: [1285764.278022] Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c Jul 14 09:45:14 mailq kernel: [1285764.278057] Kernel unaligned access at TPC[48bf14] __delayacct_add_tsk+0x4c/0x15c Jul 14 09:45:14 mailq kernel: [1285764.278091] Kernel unaligned access at TPC[48bf28] __delayacct_add_tsk+0x60/0x15c Jul 14 09:45:14 mailq kernel: [1285764.278127] Kernel unaligned access at TPC[48bf48] __delayacct_add_tsk+0x80/0x15c Jul 14 09:45:19 mailq kernel: [1285769.512383] Kernel unaligned access at TPC[48bef4] __delayacct_add_tsk+0x2c/0x15c Jul 14 09:45:19 mailq kernel: [1285769.512464] Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c Jul 14 09:45:19 mailq kernel: [1285769.512500] Kernel unaligned access at TPC[48bf14] __delayacct_add_tsk+0x4c/0x15c Jul 14 09:45:19 mailq kernel: [1285769.512535] Kernel unaligned access at TPC[48bf28] __delayacct_add_tsk+0x60/0x15c Jul 14 09:45:19 mailq kernel: [1285769.512572] Kernel unaligned access at TPC[48bf48] __delayacct_add_tsk+0x80/0x15c Jul 14 10:00:09 mailq kernel: [1286659.651335] Kernel unaligned access at TPC[48bef4] __delayacct_add_tsk+0x2c/0x15c Jul 14 10:00:10 mailq kernel: [1286659.651417] Kernel unaligned access at TPC[48bf10] __delayacct_add_tsk+0x48/0x15c Jul 14 10:00:10 mailq kernel: [1286659.651453] Kernel unaligned access at TPC[48bf14] __delayacct_add_tsk+0x4c/0x15c Jul 14 10:00:10 mailq kernel: [1286659.651488] Kernel unaligned access at TPC[48bf28] __delayacct_add_tsk+0x60/0x15c Jul 14 10:00:10 mailq kernel: [1286659.651525] Kernel unaligned access at TPC[48bf48] __delayacct_add_tsk+0x80/0x15c -- System Information: Debian Release: 5.0.2 APT prefers stable APT policy: (500, 'stable') Architecture: sparc (sparc64) Kernel: Linux 2.6.26-2-sparc64 Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/bash Versions of packages linux-image-2.6.26-2-sparc64 depends on: ii debconf [debconf-2.0] 1.5.24 Debian configuration management sy ii initramfs-tools [linux-initra 0.92o tools for generating an initramfs ii module-init-tools 3.4-1 tools for managing Linux kernel mo linux-image-2.6.26-2-sparc64 recommends no packages. Versions of packages linux-image-2.6.26-2-sparc64 suggests: pn fdutilsnone(no description available) pn linux-doc-2.6.26 none(no description available) ii silo 1.4.13a+git20070930-3 Sparc Improved LOader -- debconf information: linux-image-2.6.26-2-sparc64/preinst/abort-install-2.6.26-2-sparc64: linux-image-2.6.26-2-sparc64/postinst/bootloader-test-error-2.6.26-2-sparc64: shared/kernel-image/really-run-bootloader: true linux-image-2.6.26-2-sparc64/preinst/abort-overwrite-2.6.26-2-sparc64: linux-image-2.6.26-2-sparc64/postinst/depmod-error-2.6.26-2-sparc64: false linux-image-2.6.26-2-sparc64/postinst/old-initrd-link-2.6.26-2-sparc64: true linux-image-2.6.26-2-sparc64/preinst/failed-to-move-modules-2.6.26-2-sparc64: linux-image-2.6.26-2-sparc64/preinst/elilo-initrd-2.6.26-2-sparc64: true linux-image-2.6.26-2-sparc64/postinst/bootloader-error-2.6.26-2-sparc64: linux-image-2.6.26-2-sparc64/preinst/lilo-has-ramdisk: linux-image-2.6.26-2-sparc64/postinst/create-kimage-link-2.6.26-2-sparc64: true linux-image-2.6.26-2-sparc64/postinst/old-system-map-link-2.6.26-2-sparc64: true linux-image-2.6.26-2-sparc64/prerm/removing-running-kernel-2.6.26-2-sparc64: true linux-image-2.6.26-2-sparc64/preinst/initrd-2.6.26-2-sparc64: