Re: [PATCH 0/2] file capabilities: Introduction
On Mon, May 14, 2007 at 08:00:11PM +, Pavel Machek wrote:
> Hi!
>
> > "Serge E. Hallyn" <[EMAIL PROTECTED]> wrote:
> >
> > > Following are two patches which have been sitting for some time in -mm.
> >
> > Where "some time" == "nearly six months".
> >
> > We need help considering, reviewing and testing this code, please.
>
> I did quick scan, and it looks ok. Plus, it means we can finally start
> using that old capabilities subsystem... so I think we should do it.

FWIW, I looked through it recently as well, and it looked reasonable enough
to me, though I'm not a security expert. I did have a question about testing
corner cases etc, which Serge has tried to address.

Serge, are you planning to post an update without STRICTXATTR ? That should
simplify the second patch.

Regards
Suparna

> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
> -
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] bluetooth: fix locking in hci_sock_dev_event()
From: Satyam Sharma <[EMAIL PROTECTED]>
Date: Thu, 17 May 2007 11:13:36 +0530 (IST)

> [PATCH] bluetooth: fix locking in hci_sock_dev_event()
>
> We presently use lock_sock() to acquire a lock on a socket in
> hci_sock_dev_event(), but this goes BUG because lock_sock()
> can sleep and we're already holding a read-write spinlock at
> that point. So, we must use the non-sleeping BH version,
> bh_lock_sock().
>
> However, hci_sock_dev_event() is called from user context and
> hence using simply bh_lock_sock() will deadlock against a
> concurrent softirq that tries to acquire a lock on the same
> socket. Hence, disabling BH's before acquiring the socket lock
> and enable them afterwards, is the proper solution to fix
> socket locking in hci_sock_dev_event().
>
> Cc: David Miller <[EMAIL PROTECTED]>
> Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>
> Signed-off-by: Marcel Holtmann <[EMAIL PROTECTED]>
> Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>

Thanks, I'll merge this in.
[PATCH] bluetooth: fix locking in hci_sock_dev_event()
We presently use lock_sock() to acquire a lock on a socket in
hci_sock_dev_event(), but this goes BUG because lock_sock()
can sleep and we're already holding a read-write spinlock at
that point. So, we must use the non-sleeping BH version,
bh_lock_sock().

However, hci_sock_dev_event() is called from user context and
hence using simply bh_lock_sock() will deadlock against a
concurrent softirq that tries to acquire a lock on the same
socket. Hence, disabling BH's before acquiring the socket lock
and enabling them afterwards is the proper solution to fix
socket locking in hci_sock_dev_event().

Cc: David Miller <[EMAIL PROTECTED]>
Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>
Signed-off-by: Marcel Holtmann <[EMAIL PROTECTED]>
Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>

---

 net/bluetooth/hci_sock.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

---

diff -ruNp a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
--- a/net/bluetooth/hci_sock.c 2007-05-16 17:31:06.0 +0530
+++ b/net/bluetooth/hci_sock.c 2007-05-16 17:38:35.0 +0530
@@ -665,7 +665,8 @@ static int hci_sock_dev_event(struct not
 		/* Detach sockets from device */
 		read_lock(&hci_sk_list.lock);
 		sk_for_each(sk, node, &hci_sk_list.head) {
-			lock_sock(sk);
+			local_bh_disable();
+			bh_lock_sock_nested(sk);
 			if (hci_pi(sk)->hdev == hdev) {
 				hci_pi(sk)->hdev = NULL;
 				sk->sk_err = EPIPE;
@@ -674,7 +675,8 @@ static int hci_sock_dev_event(struct not
 				hci_dev_put(hdev);
 			}
-			release_sock(sk);
+			bh_unlock_sock(sk);
+			local_bh_enable();
 		}
 		read_unlock(&hci_sk_list.lock);
 	}
RE: Software raid0 will crash the file-system, when each disk is 5TB
Yeah, seems you've locked it down, :D. I've written 600GB of data now, and
everything is still fine. Will let it run overnight, and fill the whole 11T.
I'll post the result tomorrow.

Thanks a lot though.

Jeff

> -Original Message-
> From: Neil Brown [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 17 May 2007 5:31 p.m.
> To: [EMAIL PROTECTED]; Jeff Zheng; Michal Piotrowski; Ingo Molnar; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
> Subject: RE: Software raid0 will crash the file-system, when each disk is 5TB
>
> On Thursday May 17, [EMAIL PROTECTED] wrote:
> >
> > Uhm, I just noticed something.
> > 'chunk' is unsigned long, and when it gets shifted up, we might lose
> > bits. That could still happen with the 4*2.75T arrangement, but is
> > much more likely in the 2*5.5T arrangement.
>
> Actually, it cannot be a problem with the 4*2.75T arrangement.
>   chunk << chunksize_bits
>
> will not exceed the size of the underlying device *in*kilobytes*.
> In that case that is 0xAE9EC800 which will fit in a 32bit long.
> We don't double it to make sectors until after we add
> zone->dev_offset, which is "sector_t" and so 64bit arithmetic is used.
>
> So I'm quite certain this bug will cause exactly the problems
> experienced!!
>
> >
> > Jeff, can you try this patch?
>
> Don't bother about the other tests I mentioned, just try this one.
> Thanks.
>
> NeilBrown
>
> > Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
> >
> > ### Diffstat output
> >  ./drivers/md/raid0.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
> > --- .prev/drivers/md/raid0.c 2007-05-17 10:33:30.0 +1000
> > +++ ./drivers/md/raid0.c 2007-05-17 15:02:15.0 +1000
> > @@ -475,7 +475,7 @@ static int raid0_make_request (request_q
> >  		x = block >> chunksize_bits;
> >  		tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
> >  	}
> > -	rsect = (((chunk << chunksize_bits) + zone->dev_offset)<<1)
> > +	rsect = ((((sector_t)chunk << chunksize_bits) + zone->dev_offset)<<1)
> >  		+ sect_in_chunk;
> >
> >  	bio->bi_bdev = tmp_dev->bdev;
RE: Software raid0 will crash the file-system, when each disk is 5TB
On Thursday May 17, [EMAIL PROTECTED] wrote:
>
> Uhm, I just noticed something.
> 'chunk' is unsigned long, and when it gets shifted up, we might lose
> bits. That could still happen with the 4*2.75T arrangement, but is
> much more likely in the 2*5.5T arrangement.

Actually, it cannot be a problem with the 4*2.75T arrangement.
  chunk << chunksize_bits

will not exceed the size of the underlying device *in*kilobytes*.
In that case that is 0xAE9EC800 which will fit in a 32bit long.
We don't double it to make sectors until after we add
zone->dev_offset, which is "sector_t" and so 64bit arithmetic is used.

So I'm quite certain this bug will cause exactly the problems
experienced!!

>
> Jeff, can you try this patch?

Don't bother about the other tests I mentioned, just try this one.
Thanks.

NeilBrown

> Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
>
> ### Diffstat output
>  ./drivers/md/raid0.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
> --- .prev/drivers/md/raid0.c 2007-05-17 10:33:30.0 +1000
> +++ ./drivers/md/raid0.c 2007-05-17 15:02:15.0 +1000
> @@ -475,7 +475,7 @@ static int raid0_make_request (request_q
>  		x = block >> chunksize_bits;
>  		tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
>  	}
> -	rsect = (((chunk << chunksize_bits) + zone->dev_offset)<<1)
> +	rsect = ((((sector_t)chunk << chunksize_bits) + zone->dev_offset)<<1)
>  		+ sect_in_chunk;
>
>  	bio->bi_bdev = tmp_dev->bdev;
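To make the truncation concrete, here is a small userspace sketch (not kernel code). It assumes a 32-bit `unsigned long`, modelled here with `uint32_t`, and a 64-bit `sector_t`, modelled with `uint64_t`. The names `ulong32_t`, `sector64_t`, `rsect_narrow` and `rsect_wide` are made up for illustration. The first function mirrors the pre-patch expression, the second the patched one with the cast applied before the shift.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for "unsigned long" on a 32-bit build. */
typedef uint32_t ulong32_t;
/* Assumed stand-in for a 64-bit sector_t. */
typedef uint64_t sector64_t;

/*
 * Pre-patch arithmetic: the shift is done in 32 bits, so any bits
 * above bit 31 are lost before dev_offset widens the expression.
 */
static sector64_t rsect_narrow(ulong32_t chunk, unsigned int chunksize_bits,
			       sector64_t dev_offset, ulong32_t sect_in_chunk)
{
	assert(chunksize_bits < 32);
	ulong32_t shifted = (ulong32_t)(chunk << chunksize_bits);

	return (((sector64_t)shifted + dev_offset) << 1) + sect_in_chunk;
}

/*
 * Patched arithmetic: chunk is widened to 64 bits first, so the
 * shift cannot drop high bits for these sizes.
 */
static sector64_t rsect_wide(ulong32_t chunk, unsigned int chunksize_bits,
			     sector64_t dev_offset, ulong32_t sect_in_chunk)
{
	assert(chunksize_bits < 32);

	return ((((sector64_t)chunk << chunksize_bits) + dev_offset) << 1)
		+ sect_in_chunk;
}
```

With a hypothetical 64K chunk size (chunksize_bits = 6), chunk number 0x04000000 shifts to exactly 2^32 kilobytes. The narrow version wraps that to 0, while the wide version yields sector 0x200000000. For a 2.75T component (0xAE9EC800 kilobytes) both versions agree, which matches the observation that the 4*2.75T arrangement is not affected.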
[patch 3/3] Support removal of unused dentry entries via SLUB defrag interface
This patch allows the removal of unused dentry entries in a partially
populated slab page. It is (as yet) very limited in what it can do for
reclaim, but it catches bad cases in which we have a long list of partial
slabs with a few entries in each of them. We can free up the slabs that
have only unused dentry entries in them.

get_dentry() takes the dcache lock and then uses dget_locked() to obtain
a reference to the dentry. An additional complication is that the dentry
may be in the process of being freed or it may just have been allocated.
In that case d_inode is NULL. If we discover this then we simply stay away
from the object and return 1 to indicate to the defrag logic that this
object will be freed. Otherwise we increment the refcount and return
success.

kick_dentry() is called after get_dentry() has been used and after the
slab has dropped all of its own locks. The dentry pruning for unused
entries works in a straightforward way.

Note: The code here could be significantly improved. If we could get to
a point where all used dentries could be moved then full dentry slab
defragmentation and vacating of dentry slab pages would become possible.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 fs/dcache.c | 89 ++--
 1 file changed, 81 insertions(+), 8 deletions(-)

Index: slub/fs/dcache.c
===
--- slub.orig/fs/dcache.c 2007-05-16 20:58:02.0 -0700
+++ slub/fs/dcache.c 2007-05-16 20:59:27.0 -0700
@@ -2114,18 +2114,91 @@ static void __init dcache_init_early(voi
 		INIT_HLIST_HEAD(&dentry_hashtable[loop]);
 }
 
+/*
+ * The slab is holding off frees. Thus we can safely examine
+ * the object without the danger of it vanishing from under us.
+ */
+static int get_dentry(struct kmem_cache *s, void *private)
+{
+	struct dentry *dentry = private;
+	int result = 0;
+
+	spin_lock(&dcache_lock);
+	/*
+	 * dentry->d_inode is set to NULL when the dentry
+	 * is freed. Use that as an indicator that we should
+	 * not interfere with the freeing process.
+	 */
+	if (dentry->d_inode) {
+		dget_locked(dentry);
+		if (atomic_read(&dentry->d_count) > 2)
+			/*
+			 * Moving of dentries in use not
+			 * implemented yet.
+			 */
+			result = -EINVAL;
+	} else
+		result = 1;
+	spin_unlock(&dcache_lock);
+	return result;
+}
+
+static void put_dentry(struct kmem_cache *s, void *private)
+{
+	struct dentry *dentry = private;
+
+	dput(dentry);
+}
+
+/*
+ * Slab has dropped all the locks. Get rid of the
+ * refcount we obtained earlier and also rid of the
+ * object.
+ */
+static int kick_dentry(struct kmem_cache *s, void *private)
+{
+	struct dentry *dentry = private;
+
+	spin_lock(&dcache_lock);
+	spin_lock(&dentry->d_lock);
+	if (atomic_read(&dentry->d_count) > 1) {
+		/*
+		 * Reference count was increased.
+		 * We need to abandon the freeing of
+		 * objects.
+		 */
+		spin_unlock(&dentry->d_lock);
+		spin_unlock(&dcache_lock);
+		dput(dentry);
+		return -EBUSY;
+	}
+
+	/* Remove from LRU */
+	if (!list_empty(&dentry->d_lru)) {
+		dentry_stat.nr_unused--;
+		list_del_init(&dentry->d_lru);
+	}
+	/* Drop the entry */
+	prune_one_dentry(dentry, 1);
+	spin_unlock(&dcache_lock);
+	return 0;
+}
+
+static struct kmem_cache_ops dentry_kmem_cache_ops = {
+	.get = get_dentry,
+	.put = put_dentry,
+	.kick = kick_dentry,
+	.sync = synchronize_rcu
+};
+
 static void __init dcache_init(unsigned long mempages)
 {
 	int loop;
 
-	/*
-	 * A constructor could be added for stable state like the lists,
-	 * but it is probably not worth it because of the cache nature
-	 * of the dcache.
-	 */
-	dentry_cache = KMEM_CACHE(dentry,
-		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
-
+	dentry_cache = KMEM_CACHE_OPS(dentry,
+		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD,
+		&dentry_kmem_cache_ops);
+
 	register_shrinker(&dcache_shrinker);
 
 	/* Hash may have been set up in dcache_init_early */
[patch 1/3] SLUB: add support for kmem_cache_ops
We use the parameter formerly used by the destructor to pass an optional
pointer to a kmem_cache_ops structure to kmem_cache_create.

kmem_cache_ops is created as empty. Later patches populate kmem_cache_ops.

Create a KMEM_CACHE_OPS macro that allows the specification of the
kmem_cache_ops.

Code to handle kmem_cache_ops is added to SLUB. SLAB and SLOB are updated
to be able to take a kmem_cache_ops structure but will ignore it.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/slab.h | 13 +
 include/linux/slub_def.h |1 +
 mm/slab.c|6 +++---
 mm/slob.c|2 +-
 mm/slub.c| 44 ++--
 5 files changed, 44 insertions(+), 22 deletions(-)

Index: slub/include/linux/slab.h
===
--- slub.orig/include/linux/slab.h 2007-05-15 21:19:51.0 -0700
+++ slub/include/linux/slab.h 2007-05-15 21:27:07.0 -0700
@@ -38,10 +38,13 @@ typedef struct kmem_cache kmem_cache_t _
 void __init kmem_cache_init(void);
 int slab_is_available(void);
 
+struct kmem_cache_ops {
+};
+
 struct kmem_cache *kmem_cache_create(const char *, size_t, size_t,
 			unsigned long,
 			void (*)(void *, struct kmem_cache *, unsigned long),
-			void (*)(void *, struct kmem_cache *, unsigned long));
+			const struct kmem_cache_ops *s);
 void kmem_cache_destroy(struct kmem_cache *);
 int kmem_cache_shrink(struct kmem_cache *);
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
@@ -59,9 +62,11 @@ int kmem_ptr_validate(struct kmem_cache
  * f.e. add cacheline_aligned_in_smp to the struct declaration
  * then the objects will be properly aligned in SMP configurations.
  */
-#define KMEM_CACHE(__struct, __flags) kmem_cache_create(#__struct,\
-		sizeof(struct __struct), __alignof__(struct __struct),\
-		(__flags), NULL, NULL)
+#define KMEM_CACHE_OPS(__struct, __flags, __ops) \
+	kmem_cache_create(#__struct, sizeof(struct __struct), \
+	__alignof__(struct __struct), (__flags), NULL, (__ops))
+
+#define KMEM_CACHE(__struct, __flags) KMEM_CACHE_OPS(__struct, __flags, NULL)
 
 #ifdef CONFIG_NUMA
 extern void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
Index: slub/mm/slub.c
===
--- slub.orig/mm/slub.c 2007-05-15 21:25:46.0 -0700
+++ slub/mm/slub.c 2007-05-15 21:29:36.0 -0700
@@ -294,6 +294,9 @@ static inline int check_valid_pointer(st
 	return 1;
 }
 
+struct kmem_cache_ops slub_default_ops = {
+};
+
 /*
  * Slow version of get and set free pointer.
  *
@@ -2003,11 +2006,13 @@ static int calculate_sizes(struct kmem_c
 static int kmem_cache_open(struct kmem_cache *s, gfp_t gfpflags,
 		const char *name, size_t size,
 		size_t align, unsigned long flags,
-		void (*ctor)(void *, struct kmem_cache *, unsigned long))
+		void (*ctor)(void *, struct kmem_cache *, unsigned long),
+		const struct kmem_cache_ops *ops)
 {
 	memset(s, 0, kmem_size);
 	s->name = name;
 	s->ctor = ctor;
+	s->ops = ops;
 	s->objsize = size;
 	s->flags = flags;
 	s->align = align;
@@ -2191,7 +2196,7 @@ static struct kmem_cache *create_kmalloc
 
 	down_write(&slub_lock);
 	if (!kmem_cache_open(s, gfp_flags, name, size, ARCH_KMALLOC_MINALIGN,
-			flags, NULL))
+			flags, NULL, &slub_default_ops))
 		goto panic;
 
 	list_add(&s->list, &slab_caches);
@@ -2505,12 +2510,16 @@ static int slab_unmergeable(struct kmem_
 	if (s->ctor)
 		return 1;
 
+	if (s->ops != &slub_default_ops)
+		return 1;
+
 	return 0;
 }
 
 static struct kmem_cache *find_mergeable(size_t size,
 		size_t align, unsigned long flags,
-		void (*ctor)(void *, struct kmem_cache *, unsigned long))
+		void (*ctor)(void *, struct kmem_cache *, unsigned long),
+		const struct kmem_cache_ops *ops)
 {
 	struct list_head *h;
 
@@ -2520,6 +2529,9 @@ static struct kmem_cache *find_mergeable
 	if (ctor)
 		return NULL;
 
+	if (ops != &slub_default_ops)
+		return NULL;
+
 	size = ALIGN(size, sizeof(void *));
 	align = calculate_alignment(flags, align, size);
 	size = ALIGN(size, align);
@@ -2555,13 +2567,15 @@ static struct kmem_cache *find_mergeable
 struct kmem_cache *kmem_cache_create(const char *name, size_t size,
 		size_t align, unsigned long flags,
 		void (*ctor)(void *, struct kmem_cache *, unsigned long),
-		void (*dtor)(void *, struct kmem_cache *, unsigned long))
+		const struct kmem_cache_ops *ops)
 {
 	struct kmem_cache
[patch 2/3] SLUB: Implement targeted reclaim and partial list defragmentation
Targeted reclaim allows targeting a single slab for reclaim. This is done by
calling

	kmem_cache_vacate(page);

It will return 1 on success, 0 if the operation failed.

The vacate functionality is also used for slab shrinking. During the shrink
operation SLUB will generate a list sorted by the number of objects in use.
We extract pages off that list that are less than a quarter full. These
pages are then processed using kmem_cache_vacate.

In order for a slabcache to support this functionality a couple of functions
must be defined via kmem_cache_ops. These are

int get(struct kmem_cache *s, void *)

	Must obtain a reference to the indicated object. SLUB guarantees that
	the object is still allocated. However, another thread may be blocked
	in slab_free attempting to free the same object. It may succeed as
	soon as get() returns to the slab allocator. The function must detect
	this situation and return 1 if that is the case. If the object cannot
	be freed then a negative -Exx code must be returned indicating the
	reason for the failure. get() returns 0 on success.

	No slab operations may be performed in get(). Interrupts are
	disabled. What can be done is very limited. The slab lock for the
	page with the object is taken. Any attempt to perform a slab
	operation may lead to a deadlock.

void put(struct kmem_cache *, void *)

	Used to restore the reference count obtained by get() if the reclaim
	logic decides to abandon the attempt to vacate all objects in a slab.
	This is usually the case if get() indicates that an object is not
	freeable. put() is optional. If it is not defined then it is assumed
	that we can simply abandon get()s on slab objects.

int kick(struct kmem_cache *, void *)

	After SLUB has established references to the remaining objects in a
	slab it will drop all locks and then use kick() on each of the
	objects. The existence of the object is guaranteed by virtue of the
	earlier obtained reference. The callback may perform any slab
	operation since no locks are held at the time of call.

	The function must return 0 if the object was successfully freed.
	Return -Exxx to indicate that the object is not freeable and to stop
	further attempts to free objects in this slab.

	The callback should remove the object from the slab in some way. This
	may be accomplished by reclaiming the object and then running
	kmem_cache_free() or reallocating it and then running
	kmem_cache_free(). Reallocation is advantageous because the partial
	slabs were just sorted to have the partial slabs with the most objects
	first. Allocation is likely to result in filling up a slab so that it
	can be removed from the partial list.

void sync(void)

	After all objects have been removed by kick()s this function will be
	called to ensure that all free operations have completed. Typically
	the function called here is synchronize_rcu() if the slab cache uses
	RCU to free objects. The function is optional. If it is not specified
	then no synchronization is done before removing the slab.

If a kmem_cache_vacate on a page fails then the slab usually has a pretty
low usage ratio. Go through the slab and resequence the freelist so that
object addresses increase as we allocate objects. This will trigger the
cacheline prefetcher and increase allocation speed.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/slab.h | 34 +
 mm/slab.c|9 +
 mm/slob.c|9 +
 mm/slub.c| 304 +--
 4 files changed, 346 insertions(+), 10 deletions(-)

Index: slub/include/linux/slab.h
===
--- slub.orig/include/linux/slab.h 2007-05-16 22:12:43.0 -0700
+++ slub/include/linux/slab.h 2007-05-16 22:12:44.0 -0700
@@ -39,6 +39,39 @@ void __init kmem_cache_init(void);
 int slab_is_available(void);
 
 struct kmem_cache_ops {
+	/*
+	 * Called with slab lock held and interrupts disabled.
+	 * No slab operation may be performed.
+	 *
+	 * Return 0 if reference was successfully obtained
+	 * Return 1 if a concurrent kmem_cache_free is waiting to free object
+	 * Return -errcode if it is not possible to free the object.
+	 * No reference was obtained.
+	 */
+	int (*get)(struct kmem_cache *, void *);
+
+	/*
+	 * Use to restore the reference count if we abandon the
+	 * attempt to vacate a slab page due to an unmovable
+	 * object.
+	 */
+	void (*put)(struct kmem_cache *, void *);
+
+	/*
+	 * Called with no locks held and interrupts enabled.
+	 * Any operation may be
[patch 0/3] Slab Defrag / Slab Targeted Reclaim
Initial support for Slab defragmentation and targeted reclaim. The
functionality here is minimal. We establish a slab API to allow removal
or moving of objects between slabs.

The only user provided here is a dentry cache reclaim capability. This is
limited to the removal of unused dentries for now. It is planned to later
add a similar inode reclaim capability and then extend the move/reclaim
to support moving of dentries and inodes.

Slab defragmentation is performed during kmem_cache_shrink. This is
usually triggered through the slab shrinkers but can also be manually
triggered through the slabinfo command.

Support is provided for antifrag/defrag to evict a specific slab page
through the kmem_cache_vacate function call. Since we can only reclaim
unused dentries for now that functionality is pretty limited (we need to
put some work into making dentries and inodes more reclaimable or movable)
but we can increase the capabilities over time which will allow us to
move slabs from the reclaimable area into the movable area. This will
shrink the reclaimable area significantly.

Since we can target the vacating of pages this may allow the antifrag
code to remove a page that hinders the freeing of a higher order page.
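The get/put/kick/sync protocol described in patch 2/3 can be sketched in userspace as follows. This is only a toy model under stated assumptions, not the SLUB implementation: `struct obj`, `struct cache_ops`, `vacate()` and the `demo_*` callbacks are hypothetical names, the callbacks drop the `struct kmem_cache *` argument for brevity, and no real locking is done. In particular, releasing the remaining references with put() after a failed kick() is an assumption of this sketch; the description above does not spell out that case.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJS 16

/* Toy object standing in for a slab object such as a dentry. */
struct obj {
	int refs;        /* references taken by get() */
	int busy;        /* in use: cannot be freed */
	int being_freed; /* a concurrent free is pending */
	int freed;       /* set once kick() has released it */
};

/* Simplified callback table (hypothetical, no kmem_cache argument). */
struct cache_ops {
	int  (*get)(void *);
	void (*put)(void *);	/* optional */
	int  (*kick)(void *);
	void (*sync)(void);	/* optional */
};

static int sync_calls;

static int demo_get(void *p)
{
	struct obj *o = p;

	if (o->being_freed)
		return 1;	/* someone else is already freeing it */
	if (o->busy)
		return -16;	/* stands in for -EBUSY */
	o->refs++;
	return 0;
}

static void demo_put(void *p)
{
	((struct obj *)p)->refs--;
}

static int demo_kick(void *p)
{
	struct obj *o = p;

	o->refs--;		/* drop the reference taken by get() */
	o->freed = 1;
	return 0;
}

static void demo_sync(void)
{
	sync_calls++;
}

static const struct cache_ops demo_ops = {
	demo_get, demo_put, demo_kick, demo_sync
};

/* Returns 1 if the attempt completed, 0 if it was abandoned. */
static int vacate(const struct cache_ops *ops, void **objs, size_t n)
{
	unsigned char held[MAX_OBJS] = { 0 };
	size_t i, j;

	if (n > MAX_OBJS)
		return 0;

	/* Phase 1: pin every object (SLUB does this under the slab lock). */
	for (i = 0; i < n; i++) {
		int r = ops->get(objs[i]);

		if (r == 0) {
			held[i] = 1;
		} else if (r < 0) {
			/* Unfreeable object: undo the references taken so far. */
			for (j = 0; j < i; j++)
				if (held[j] && ops->put)
					ops->put(objs[j]);
			return 0;
		}
		/* r == 1: a concurrent free is pending, leave it alone. */
	}

	/* Phase 2: locks dropped, ask the cache to remove each pinned object. */
	for (i = 0; i < n; i++) {
		if (!held[i])
			continue;
		if (ops->kick(objs[i]) != 0) {
			/* Assumption of this sketch: release the rest via put(). */
			for (j = i + 1; j < n; j++)
				if (held[j] && ops->put)
					ops->put(objs[j]);
			return 0;
		}
	}

	/* Phase 3: wait for deferred frees (e.g. RCU) if the cache asks. */
	if (ops->sync)
		ops->sync();
	return 1;
}
```

Calling `vacate(&demo_ops, objs, n)` on an array of `struct obj` pointers walks the three phases in order. An object whose get() returns 1 is skipped rather than treated as a failure, which is the behaviour get_dentry() relies on for dentries with a NULL d_inode.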
Re: usb-scanner-cameras kernel-2.6.22 and udev-095 problem
On Wed, May 16, 2007 at 02:58:22PM -0500, [EMAIL PROTECTED] wrote:
> greg
>
> CONFIG_SYSFS_DEPRECATED=Y
>
> check that we have
> rwxrwxrwx 1 root root 15 May 16 13:43 scanner-usbdev2.12 -> bus/usb/002/012
>
> in /dev directory after usb-scanner connection for kernel-2.6.20
> we don't have this for kernel-2.6.22

Who creates that? udev? What udev rule does that? Can you run
'udevtest' to see what is supposed to be matching here? Odds are you
have a broken rule somehow.

> 2.6.20 usb-scanner connect
>
> # /usr/sbin/udevmonitor
>
> UEVENT[1179340996.886805] add@/devices/pci:00/:00:0b.1/usb2/2-2/2-2.2
> UEVENT[1179340996.886864] add@/class/usb_endpoint/usbdev2.12_ep00
> UEVENT[1179340996.887438] add@/devices/pci:00/:00:0b.1/usb2/2-2/2-2.2/2-2.2:1.0
> UEVENT[1179340996.887467] add@/class/usb_endpoint/usbdev2.12_ep81
> UEVENT[1179340996.887484] add@/class/usb_endpoint/usbdev2.12_ep02
> UEVENT[1179340996.887499] add@/class/usb_endpoint/usbdev2.12_ep83
> UEVENT[1179340996.887514] add@/class/usb_device/usbdev2.12
> UDEV [1179340996.921506] add@/devices/pci:00/:00:0b.1/usb2/2-2/2-2.2
> UDEV [1179340996.936005] add@/class/usb_endpoint/usbdev2.12_ep00
> UDEV [1179340996.960144] add@/class/usb_endpoint/usbdev2.12_ep81
> UDEV [1179340996.963672] add@/class/usb_endpoint/usbdev2.12_ep02
> UDEV [1179340996.967439] add@/class/usb_endpoint/usbdev2.12_ep83
> UDEV [1179340997.240934] add@/devices/pci:00/:00:0b.1/usb2/2-2/2-2.2/2-2.2:1.0
> UDEV [1179340997.473142] add@/class/usb_device/usbdev2.12

This last device is correct, and what your udev rule should be using to
create your symlink.

You didn't answer my, "what distro are you using" question. Also, what
package created the udev rule that creates the above symlink?

thanks,

greg k-h
RE: Software raid0 will crash the file-system, when each disk is 5TB
> What is the nature of the corruption? Is it data in a file
> that is wrong when you read it back, or does the filesystem
> metadata get corrupted?

The corruption is in fs metadata. jfs is completely destroyed: after
umount, fsck does not recognize it as jfs anymore. Xfs gives a kernel
crash, but still seems recoverable.

> Can you try the configuration that works, and sha1sum the
> files after you have written them to make sure that they
> really are correct?

We have verified the data on the working configuration: we wrote
around 900 identical 10G files and verified that the md5sum is
the same for all of them. The verification took two days though :)

> My thought here is "maybe there is a bad block on one device,
> and the block is used for data in the 'working' config, and
> for metadata in the 'broken' config.
>
> Can you try a degraded raid10 configuration. e.g.
>
>    mdadm -C /dev/md1 --level=10 --raid-disks=4 /dev/first missing \
>    /dev/second missing
>
> That will lay out the data in exactly the same place as with
> raid0, but will use totally different code paths to access
> it. If you still get a problem, then it isn't in the raid0 code.

I will try this later today, as I'm now trying different sizes of the
components. 3.4T seems to be working. Testing 4.1T right now.

> Maybe try version 1 metadata (mdadm --metadata=1). I doubt
> that would make a difference, but as I am grasping at straws
> already, it may be a straw worth trying.

Well, the problem may also be in the 3ware disk array, or the disk array
driver. The guy complaining about the same problem is also using a 3ware
disk array controller. But there is no way to verify that, and a single
disk array has been working fine for us.

Jeff
Re: user pointers and race conditions
From: sk b <[EMAIL PROTECTED]>
Date: Wed, 16 May 2007 22:56:22 -0600

> 3:    if (!access_ok(VERIFY_READ,stp,sizeof(struct st)))
> 4:        return;
> 5:    if (!access_ok(VERIFY_WRITE,stp->u,sizeof(int)))
> 6:        return;

This code would not exist in the kernel, the kernel cannot
dereference stp->u.

The stp->u dereference would silently work on x86 and x86_64
but it would generate an exception on sparc64 and other platforms.

User space accesses must go through the proper copy_from_user(),
copy_to_user(), get_user(), and put_user() interfaces.

It must first copy stp into a local kernel space copy, then
it may inspect the value of stp->u.

And yes sparse would catch this problem in your code, because
the "__user" annotations would catch the illegal "stp->u"
dereference.
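The "copy once, then inspect the copy" rule can be illustrated with a userspace mock. This is a sketch under loud assumptions, not kernel code: `mock_copy_from_user()` and `mock_access_ok()` are hypothetical stand-ins for copy_from_user() and access_ok(), `user_mem` and `kernel_word` play the roles of user and kernel memory, and `racing_writer()` plays the user thread that rewrites stp->u during the slow call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct st { int *u; };

/* Pretend address space: user_mem is "user space", kernel_word is "kernel". */
static int user_mem[4];
static int kernel_word = 42;

/* Hypothetical stand-in for access_ok(): does the int at p lie in user_mem? */
static int mock_access_ok(const int *p)
{
	uintptr_t a  = (uintptr_t)p;
	uintptr_t lo = (uintptr_t)&user_mem[0];
	uintptr_t hi = (uintptr_t)&user_mem[4];

	return a >= lo && a + sizeof(int) <= hi;
}

/* Hypothetical stand-in for copy_from_user(); returns 0 on success. */
static int mock_copy_from_user(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/* Plays the racing user thread: repoints the shared struct at "kernel" memory. */
static void racing_writer(struct st *ustp)
{
	ustp->u = &kernel_word;
}

static int handle_request(struct st *ustp, void (*slow_work)(struct st *))
{
	struct st local;

	/* Fetch the user struct exactly once into a private copy. */
	if (mock_copy_from_user(&local, ustp, sizeof(local)) != 0)
		return -1;

	/* Validate the copied pointer, not the user-visible one. */
	if (!mock_access_ok(local.u))
		return -1;

	/* Window in which user space may rewrite ustp->u. */
	if (slow_work)
		slow_work(ustp);

	/* Use the validated copy; real kernel code would use put_user() here. */
	*local.u = 0;
	return 0;
}
```

Because the pointer that was checked is the same private value that is later written through, a change to the user-visible struct in the window between check and use has no effect on where the write lands. The race in the question only reappears if the code fetches the struct from user memory a second time after validating it.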
Re: user pointers and race conditions
On Wed, May 16, 2007 at 10:56:22PM -0600, sk b wrote:
>
> Hello,
>
> I'm wondering whether there is an exploitable TOCTTOU race condition in the
> way user pointers are handled in the kernel. Consider the following code:
>
> 1: struct st { int *u; };
> 2: void syscall(struct st * stp) {
> 3:    if (!access_ok(VERIFY_READ,stp,sizeof(struct st)))
> 4:        return;
> 5:    if (!access_ok(VERIFY_WRITE,stp->u,sizeof(int)))

... and there's your bug - direct access to userland data. The normal
variant is to use accessors (get_user() or copy_from_user()) to fetch
the value of stp->u. At which point races of the kind you mentioned
take obviously dumb code (explicitly copying the same struct from
userland _twice_).
RE: Software raid0 will crash the file-system, when each disk is 5TB
On Wednesday May 16, [EMAIL PROTECTED] wrote:
> On Thu, 17 May 2007, Neil Brown wrote:
>
> > On Thursday May 17, [EMAIL PROTECTED] wrote:
> >>
> >>> The only difference of any significance between the working
> >>> and non-working configurations is that in the non-working,
> >>> the component devices are larger than 2Gig, and hence have
> >>> sector offsets greater than 32 bits.
> >>
> >> Do u mean 2T here?, but in both configurations, the component devices are
> >> larger than 2T (2.25T&5.5T).
> >
> > Yes, I meant 2T, and yes, the components are always over 2T.
>
> 2T decimal or 2T binary?
>

Either. The smallest is actually 2.75T (typo above).
Precisely it was
   2929641472 kilobytes
or 5859282944 sectors
or 0x15D3D9000 sectors.

So it is over 32bits already...

Uhm, I just noticed something.
'chunk' is unsigned long, and when it gets shifted up, we might lose
bits. That could still happen with the 4*2.75T arrangement, but is
much more likely in the 2*5.5T arrangement.

Jeff, can you try this patch?

Thanks.
NeilBrown

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/raid0.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
--- .prev/drivers/md/raid0.c 2007-05-17 10:33:30.0 +1000
+++ ./drivers/md/raid0.c 2007-05-17 15:02:15.0 +1000
@@ -475,7 +475,7 @@ static int raid0_make_request (request_q
 		x = block >> chunksize_bits;
 		tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
 	}
-	rsect = (((chunk << chunksize_bits) + zone->dev_offset)<<1)
+	rsect = ((((sector_t)chunk << chunksize_bits) + zone->dev_offset)<<1)
 		+ sect_in_chunk;

 	bio->bi_bdev = tmp_dev->bdev;
Re: Convert namespace_sem to a mutex
On 5/17/07, Bharata B Rao <[EMAIL PROTECTED]> wrote:
> From: Bharata B Rao <[EMAIL PROTECTED]>
>
> namespace_sem is a rwsem. It is acquired as read sem at only one place(used
> by /proc/mounts, /proc/<pid>/mounts and /proc/<pid>/mountstats). In all other
> cases it is acquired as a write sem. So, as there is not more than one reader
> for this sem, this can be a generic sem (and not rwsem) and if so it can be
> a mutex.

Actually, being acquired as a read sem at only one place does not mean
that there is not more than one reader. Multiple threads could be reading
mounts or mountstats, and otoh mount(2) and umount(2) (which acquire it
as write sem) could be less frequent?
Re: Convert namespace_sem to a mutex
On Thu, May 17, 2007 at 10:20:41AM +0530, Bharata B Rao wrote:
> From: Bharata B Rao <[EMAIL PROTECTED]>
>
> namespace_sem is a rwsem. It is acquired as read sem at only one place(used
> by /proc/mounts, /proc/<pid>/mounts and /proc/<pid>/mountstats). In all other
> cases it is acquired as a write sem. So, as there is not more than one reader
> for this sem, this can be a generic sem (and not rwsem) and if so it can be
> a mutex.

Except that read accesses outnumber write ones by far and we have no
reason for serializing them against each other. NAK.
user pointers and race conditions
Hello,

I'm wondering whether there is an exploitable TOCTTOU race condition in the way user pointers are handled in the kernel. Consider the following code:

1: struct st { int *u; };
2: void syscall(struct st * stp) {
3:    if (!access_ok(VERIFY_READ,stp,sizeof(struct st)))
4:        return;
5:    if (!access_ok(VERIFY_WRITE,stp->u,sizeof(int)))
6:        return;
7:    foo(); //user app writes a kernel address to stp->u
8:    *(stp->u) = 0;
9: }

Suppose syscall is some system call and, thus, stp and stp->u are user pointers. The function checks the stp and stp->u pointers using the access_ok macro on lines 3 and 5. Also suppose that the call to foo on line 7 takes a non-trivial amount of time to execute. During the time it takes foo to execute, the user application writes a kernel address to stp->u. Note that this write occurs after the check on line 5. Then, on line 8, the kernel writes to stp->u which contains a kernel address. So, the user application could force the kernel to overwrite itself.

Is it possible to exploit this race condition? If so, does Sparse check for this?

-SKB
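The pattern in the question is often called a double fetch: the user-controlled field is read once for the check and again for the use. Below is a minimal user-space sketch of the difference between that and taking a single snapshot. Everything in it is a model made up for illustration: `USER_LIMIT`, `range_ok()`, the `attacker()` callback and both `target_*` functions are hypothetical names, not kernel interfaces. In kernel code the same effect is normally obtained by copying the user data into kernel memory once (for example with `copy_from_user()` or `get_user()`) and storing through `put_user()`, rather than dereferencing the user pointer directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: addresses below USER_LIMIT count as "user" memory. */
#define USER_LIMIT ((uintptr_t)0x1000)

/* u models a pointer value that lives in memory the other party can write. */
struct st { uintptr_t u; };

/* Stand-in for a range check in the spirit of access_ok(). */
static int range_ok(uintptr_t addr, size_t len)
{
    return addr < USER_LIMIT && len <= USER_LIMIT - addr;
}

/* Racy shape: the shared field is fetched for the check and fetched again
 * for the use. Returns the address that would be written, 0 if rejected. */
static uintptr_t target_double_fetch(volatile struct st *shared,
                                     void (*window)(volatile struct st *))
{
    assert(shared != NULL);
    if (!range_ok(shared->u, sizeof(int)))
        return 0;
    window(shared);      /* the other party may rewrite shared->u here */
    return shared->u;    /* second fetch: this value was never checked */
}

/* Safer shape: fetch once, then check and use only the private snapshot. */
static uintptr_t target_single_fetch(volatile struct st *shared,
                                     void (*window)(volatile struct st *))
{
    assert(shared != NULL);
    uintptr_t snap = shared->u;
    if (!range_ok(snap, sizeof(int)))
        return 0;
    window(shared);      /* later changes to shared->u no longer matter */
    return snap;
}

/* Simulated concurrent writer that swaps in an out-of-range address. */
static void attacker(volatile struct st *shared)
{
    shared->u = (uintptr_t)0xC0000000u;
}
```

With a valid initial value and `attacker` passed as the window, `target_double_fetch()` ends up with the out-of-range address while `target_single_fetch()` keeps the checked one. The sketch does not address whether Sparse detects this pattern.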
RE: Software raid0 will crash the file-system, when each disk is 5TB
On Thu, 17 May 2007, Neil Brown wrote:

> On Thursday May 17, [EMAIL PROTECTED] wrote:
>>
>>> The only difference of any significance between the working
>>> and non-working configurations is that in the non-working,
>>> the component devices are larger than 2Gig, and hence have
>>> sector offsets greater than 32 bits.
>>
>> Do u mean 2T here?, but in both configuartion, the component devices are
>> larger than 2T (2.25T&5.5T).
>
> Yes, I meant 2T, and yes, the components are always over 2T.

2T decimal or 2T binary?

> So I'm at a complete loss. The raid0 code follows the same
> paths and does the same things and uses 64bit arithmetic where needed.
>
> So I have no idea how there could be a difference between these two
> cases.
>
> I'm at a loss...
>
> NeilBrown
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
Linus Torvalds wrote: > > On Wed, 16 May 2007, H. Peter Anvin wrote: >> It gets turned on by the code in arch/i386/kernel/cpu. It's just that >> the new code that Andi added runs during setup, i.e. in real mode, so >> *way* earlier than that. > > Ahh. Do we really need it that early? The reason to do it early is so that we can still get a message out if the CPU doesn't have the necessary features. This is generic code and not specific to CX8. Since I'm rewriting the setup code in C, I have added code to enable features on VIA and Transmeta CPUs (there was already code in there to enable features on AMD; Intel isn't known to hide any features other than PAE on 400 MHz FSB Pentium-M.) I think the early feature detection makes good sense. It's a heckuva lot nicer to get a message on your screen saying that you can't boot this kernel on this CPU than a crash, or an early_printk which may never actually get to you. -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Convert namespace_sem to a mutex
From: Bharata B Rao <[EMAIL PROTECTED]> namespace_sem is a rwsem. It is acquired as read sem at only one place(used by /proc/mounts, /proc//mounts and /proc//mountstats). In all other cases it is acquired as a write sem. So, as there is not more than one reader for this sem, this can be a generic sem (and not rwsem) and if so it can be a mutex. Patch is for 2.6.22-rc1-mm1. Signed-off-by: Bharata B Rao <[EMAIL PROTECTED]> --- fs/namespace.c | 48 1 files changed, 24 insertions(+), 24 deletions(-) --- a/fs/namespace.c +++ b/fs/namespace.c @@ -37,7 +37,7 @@ static int event; static struct list_head *mount_hashtable __read_mostly; static int hash_mask __read_mostly, hash_bits __read_mostly; static struct kmem_cache *mnt_cache __read_mostly; -static struct rw_semaphore namespace_sem; +static struct mutex namespace_mutex; int nr_user_mounts; int max_user_mounts = 1024; @@ -396,7 +396,7 @@ static void *m_start(struct seq_file *m, struct list_head *p; loff_t l = *pos; - down_read(_sem); + mutex_lock(_mutex); list_for_each(p, >list) if (!l--) return list_entry(p, struct vfsmount, mnt_list); @@ -413,7 +413,7 @@ static void *m_next(struct seq_file *m, static void m_stop(struct seq_file *m, void *v) { - up_read(_sem); + mutex_unlock(_mutex); } static inline void mangle(struct seq_file *m, const char *s) @@ -683,7 +683,7 @@ static int do_umount(struct vfsmount *mn return retval; } - down_write(_sem); + mutex_lock(_mutex); spin_lock(_lock); event++; @@ -696,7 +696,7 @@ static int do_umount(struct vfsmount *mn spin_unlock(_lock); if (retval) security_sb_umount_busy(mnt); - up_write(_sem); + mutex_unlock(_mutex); release_mounts(_list); return retval; } @@ -1002,12 +1002,12 @@ static int do_change_type(struct nameida if (nd->dentry != nd->mnt->mnt_root) return -EINVAL; - down_write(_sem); + mutex_lock(_mutex); spin_lock(_lock); for (m = mnt; m; m = (recurse ? 
next_mnt(m, mnt) : NULL)) change_mnt_propagation(m, type); spin_unlock(_lock); - up_write(_sem); + mutex_unlock(_mutex); return 0; } @@ -1030,7 +1030,7 @@ static int do_loopback(struct nameidata if (err) return err; - down_write(_sem); + mutex_lock(_mutex); err = -EINVAL; if (IS_MNT_UNBINDABLE(old_nd.mnt)) goto out; @@ -1062,7 +1062,7 @@ static int do_loopback(struct nameidata } out: - up_write(_sem); + mutex_unlock(_mutex); path_release(_nd); return err; } @@ -1124,7 +1124,7 @@ static int do_move_mount(struct nameidat if (err) return err; - down_write(_sem); + mutex_lock(_mutex); while (d_mountpoint(nd->dentry) && follow_down(>mnt, >dentry)) ; err = -EINVAL; @@ -1176,7 +1176,7 @@ static int do_move_mount(struct nameidat out1: mutex_unlock(>dentry->d_inode->i_mutex); out: - up_write(_sem); + mutex_unlock(_mutex); if (!err) path_release(_nd); path_release(_nd); @@ -1238,7 +1238,7 @@ int do_add_mount(struct vfsmount *newmnt { int err; - down_write(_sem); + mutex_lock(_mutex); /* Something was mounted here while we slept */ while (d_mountpoint(nd->dentry) && follow_down(>mnt, >dentry)) ; @@ -1267,11 +1267,11 @@ int do_add_mount(struct vfsmount *newmnt list_add_tail(>mnt_expire, fslist); spin_unlock(_lock); } - up_write(_sem); + mutex_unlock(_mutex); return 0; unlock: - up_write(_sem); + mutex_unlock(_mutex); mntput(newmnt); return err; } @@ -1337,9 +1337,9 @@ static void expire_mount_list(struct lis get_mnt_ns(ns); spin_unlock(_lock); - down_write(_sem); + mutex_lock(_mutex); expire_mount(mnt, mounts, ); - up_write(_sem); + mutex_unlock(_mutex); release_mounts(); mntput(mnt); put_mnt_ns(ns); @@ -1612,12 +1612,12 @@ static struct mnt_namespace *dup_mnt_ns( init_waitqueue_head(_ns->poll); new_ns->event = 0; - down_write(_sem); + mutex_lock(_mutex); /* First pass: copy the tree topology */ new_ns->root = copy_tree(mnt_ns->root, mnt_ns->root->mnt_root, CL_COPY_ALL | CL_EXPIRE, 0); if (IS_ERR(new_ns->root)) { - up_write(_sem); + mutex_unlock(_mutex); kfree(new_ns); return 
NULL; } @@ -1651,7 +1651,7 @@ static struct
Re: [BUG] (regression) AMD k6-III/450 won't boot w/2.6.22-rc1
Dave Jones wrote: > Bob, does this patch make it boot again for you? > > Dave > > Some AMD K6's advertise machine check capability, but don't actually > have an Intel compatible implementation. It also doesn't actually work, > so don't advertise it as being present. > > Signed-off-by: Dave Jones <[EMAIL PROTECTED]> NAK. No difference. Identical panic message. (Yes, I double-checked to make sure I was booting the patched kernel :-)). -- --- Bob Tracy WTO + WIPO = DMCA? http://www.anti-dmca.org [EMAIL PROTECTED] --- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Software raid0 will crash the file-system, when each disk is 5TB
On Thursday May 17, [EMAIL PROTECTED] wrote:
> I tried the patch, same problem show up, but no bug_on report
>
> Is there any other things I can do?
>

What is the nature of the corruption? Is it data in a file that is wrong when you read it back, or does the filesystem metadata get corrupted?

Can you try the configuration that works, and sha1sum the files after you have written them to make sure that they really are correct? My thought here is "maybe there is a bad block on one device, and the block is used for data in the 'working' config, and for metadata in the 'broken' config".

Can you try a degraded raid10 configuration? e.g.

   mdadm -C /dev/md1 --level=10 --raid-disks=4 /dev/first missing \
       /dev/second missing

That will lay out the data in exactly the same place as with raid0, but will use totally different code paths to access it. If you still get a problem, then it isn't in the raid0 code.

Maybe try version 1 metadata (mdadm --metadata=1). I doubt that would make a difference, but as I am grasping at straws already, it may be a straw worth trying.

NeilBrown
Re: [PATCH] make hci_notifier a blocking notifier (was Re: BUG: sleeping function called from invalid context at net/core/sock.c:1523)
On 5/16/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:
> This issue has actually been resolved, see the patch at:
> http://lkml.org/lkml/2007/5/16/149

Ah, excellent. Thanks!

Ray
Re: [PATCH] x86-64 highres/dyntick support 2.6.22-rc1-v5
Frank Sorenson wrote:
> After adding *lots* of early_printks, I see that it hangs in
> hpet_is_known(hdp) called from hpet_alloc(), so something in the hpet
> code is still buggy. Adding nohpet to the kernel command line allows it
> to boot correctly.

Hrm. Looks like it gets past the hpet_is_known. There's still something in the hpet detection code, but I didn't get to the bottom of it yet. I'll do some more debugging to track down where it's really hanging.

Sorry for the noise.

Frank
-- 
Frank Sorenson - KD7TZK
Linux Systems Engineer, DSS Engineering, UBS AG
[EMAIL PROTECTED]
Re: PROBLEM: 2.6.21 - "make modules" with GREP_OPTIONS="-C1" (and other)
Martin Christoph wrote: [1] Summary: If i have some GREP_OPTIONS set (like -C1 or other) i get several errors while trying to do "make modules". [2] Full description: With some GREP_OPTIONS set "make modules" drops several errors like that: [EMAIL PROTECTED] /usr/src/linux # GREP_OPTIONS="-C1" make modules CHK include/linux/version.h CHK include/linux/utsrelease.h Building modules, stage 2. [...] WARNING: "aes_enc_blk" [arch/i386/crypto/aes.ko] undefined! WARNING: "aes_dec_blk" [arch/i386/crypto/aes.ko] undefined! [...] make[1]: *** [__modpost] Error 1 make: *** [modules] Error 2 [3] Keywords: "make modules", "GREP_OPTIONS", "WARNING", "undefined" [X.] Suggestion to fix: Unset GREP_OPTIONS within make process. While I admit that this will break the build, I think it's safe to say that there are hundreds of environment variables that will influence the kbuild system and makefiles. It's going to be an uphill battle if you want to fix each and every occurrence of a *possible* build breakage due to an environment variable being set wrongly. I think it's perfectly fine for the kbuild system to expect a reasonably sane and clean build system. Those who want to set specific variables to influence their build should be able to do so as well, without getting settings removed. In your case, I would suggest not setting this option by default in your shell ;) Cheers, Auke - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.22-rc1-mm1
On 5/16/07, H. Peter Anvin <[EMAIL PROTECTED]> wrote:
> Andy Whitcroft wrote:
> > Getting this on both x86 and x86_64 boxes, they are the older boxen so
> > likely older compilers:
>
> Please give the gcc version number.
>
> >   CC      arch/x86_64/boot/memory.o
> > arch/i386/boot/memory.c: In function `detect_memory':
> > arch/i386/boot/memory.c:32: error: can't find a register in class `DREG'
> > while reloading `asm'
> >
> > Seems to come from git-netsetup, but that tree isn't pulled into your
> > git version of -mm so I can't be more specific.
>
> Does the following patch work for you?
>
>       -hpa
>
> diff --git a/arch/i386/boot/memory.c b/arch/i386/boot/memory.c
> index 8a82aa9..d7b250b 100644
> --- a/arch/i386/boot/memory.c
> +++ b/arch/i386/boot/memory.c
> @@ -30,7 +30,7 @@ static int detect_memory_e820(void)
>  		size = sizeof(struct e820entry);
>  		id = SMAP;
>  		asm("int $0x15; setc %0"
> -		    : "=dm" (err), "+b" (next), "+d" (id), "+c" (size),
> +		    : "=am" (err), "+b" (next), "+d" (id), "+c" (size),
>  		      "=m" (*desc)
>  		    : "D" (desc), "a" (0xe820));

Observed same problem with gcc version 3.4.4 20050721 (Red Hat 3.4.4-2) and binutils-2.15.92.0.2-15 and the above patch fixes it.

Regards,
Bharata.
-- 
"Men come and go but mountains remain" -- Ruskin Bond.
Re: [PATCH] make hci_notifier a blocking notifier (was Re: BUG: sleeping function called from invalid context at net/core/sock.c:1523)
On 5/17/07, Ray Lee <[EMAIL PROTECTED]> wrote: Apologies for taking so long to get back to you -- I've been on the road for the last week and have finally got to a point where I could test the patch. On 5/6/07, Satyam Sharma <[EMAIL PROTECTED]> wrote: > (Dropped Pavel, Rafael and linux-pm from CC list, this isn't a PM > error so don't want to spam them; and added bluez-devel) > > On 5/7/07, Ray Lee <[EMAIL PROTECTED]> wrote: > > On 5/6/07, Alan Stern <[EMAIL PROTECTED]> wrote: > > > On Sun, 6 May 2007, Satyam Sharma wrote: > > > > > > > Anyway, the hci_notifier is called from the following six call sites: > > > > > > > > hci_dev_open() and hci_dev_close() -> both called from > > > > hci_sock_ioctl() => both can sleep > > > > hci_register_dev() and hci_unregister_dev() => again both are capable > > > > of sleeping > > > > hci_suspend_dev() and hci_resume_dev() -> called from the .suspend() > > > > and .resume() of the hci_usb_driver, and again both of these can sleep > > > > > > > > Is there any other reason why hci_notifier must be an atomic notifier? > > > > > > > > (CC'ing Alan Stern just in case, apparently hci_notifier became atomic > > > > when notifier chains were classified into atomic / blocking) > > > > > > I don't remember exactly why this particular choice was made. Perhaps we > > > found that the notifier callout routines didn't use any blocking > > > primitives (we may have been mistaken about this -- there was a lot of > > > code to check) and so therefore the choice didn't matter. In that case we > > > probably just decided to make it an atomic notifier to keep things simple. > > > > > > As you found, changing it to a blocking notifier is very easy. Provided > > > all the callers are non-atomic it should work just fine. > > > > Okay, I'll go ahead and try the patch, then, and report back. > > You'd still get the BUG message. 
To fully resolve the problem, we need > to make the hci_sock_dev_event() notifier callout blocking (which > happened with this patch) but also convert hci_sk_list.lock to a > rwsem, but some users of that rwlock (other than hci_sock_dev_event) > are atomic. > > However, please do try and get back, as your testing would still be > helpful to see whether converting hci_notifier to blocking had other > side-effects -- if you only see the same message again and otherwise > things seem fine, then we're good as far as at least this change was > concerned. Yes, it's roughly the same trace. There are some differences, though those are likely due to me finding a new way to trigger the issue. (My laptop has a button to turn the WiFi/Bluetooth on and off. Hitting that and causing a disconnect of the internal Bluetooth connector triggers the same issue without going through a suspend/resume cycle.) Hi Ray, This issue has actually been resolved, see the patch at: http://lkml.org/lkml/2007/5/16/149 [ We've slightly altered the locking scheme, but it's also good to know that converting hci_notifier to a blocking notifier doesn't cause any troubles either. If this is fine with other drivers too, this could actually be a separate patch. ] I'll also soon send that patch to Andrew, will Cc you too. Thanks, Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
PROBLEM: 2.6.21 - "make modules" with GREP_OPTIONS="-C1" (and other)
[1] Summary:
If I have some GREP_OPTIONS set (like -C1 or other) I get several errors while trying to do "make modules".

[2] Full description:
With some GREP_OPTIONS set, "make modules" produces several errors like these:

[EMAIL PROTECTED] /usr/src/linux # GREP_OPTIONS="-C1" make modules
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  Building modules, stage 2.
[...]
WARNING: "aes_enc_blk" [arch/i386/crypto/aes.ko] undefined!
WARNING: "aes_dec_blk" [arch/i386/crypto/aes.ko] undefined!
[...]
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

[3] Keywords:
"make modules", "GREP_OPTIONS", "WARNING", "undefined"

[X.] Suggestion to fix:
Unset GREP_OPTIONS within make process.
Re: Weird hard disk noise on shutdown (bug #7674)
On Wed, 16 May 2007, Rob Landley wrote: > > But you need to detect if the kernel has proper SCSI device shutdown > > support, because if it does not, you have to do a cache flush and spindown > > on shutdown(8) if you can... > > Or (and this is just a thought), you could upgrade your kernel so it > correctly > handles your hardware, treating this just like any other driver bug or other The distros can't update kernels that easily on their stable branches. And in the userland side, we are not breaking things any further for users of kernels before 2.6.22 anyway. Don't expect shutdown(8) to remove support for <2.6.22 any time soon, at least in Debian. That will happen only when we are forced by some other reason to completely break compatibility with such kernels. > Last I checked you didn't have to spin down a USB flash key. If SATA is But you have to spin down an USB HD... el-cheapo USB enclosures will NOT do it for you. It is not an easy problem. > SCSI, what the heck is SAS? (Answer: a cynical marketing hack to bleed SCSI > bigots for the huge margins they've always been bled for. But oh well.) It Actually, in my limited experience, SAS is marginally less crappy than SATA, and has a higher MTBF, probably because the manufacturers try to cut less corners. But if one can get high-quality SATA drives (where?!), I don't know why SAS would be superior to SATA. -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie." -- The Silicon Valley Tarot Henrique Holschuh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
On Wed, 16 May 2007, H. Peter Anvin wrote: > > It gets turned on by the code in arch/i386/kernel/cpu. It's just that > the new code that Andi added runs during setup, i.e. in real mode, so > *way* earlier than that. Ahh. Do we really need it that early? Now, it's easy enough to just turn off CONFIG_X86_CMPXCHG64 (it really should be "8B" instead of "64", but that's another issue) for those things, and nobody should really care, but still, maybe we could re-do the early bits to be more polite to those VIA CPU's? I thought the cmpxchg8b stuff was just used to page table setup. Do those things even _support_ PAE? What else uses it? Early setup in real mode? What am I missing? My grep powers are waning.. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] recalc_sigpending_tsk fixes
> We already discussed this, this is not so important, but how about
>
>	void recalc_sigpending_and_wake(struct task_struct *t)
>	{
>		int was_pending = signal_pending(t);
>
>		if (recalc_sigpending_tsk(t) && !was_pending)
>			signal_wake_up(t, 0);
>	}
>
> ?
>
> This "was_pending" is more a documenation than a optimization.

I don't object, but I think another comment about the wakeup being sometimes superfluous is enough, if anything.

Thanks,
Roland
RE: Software raid0 will crash the file-system, when each disk is 5TB
I tried the patch; the same problem shows up, but there is no bug_on report.

Is there anything else I can do?

Jeff

> Yes, I meant 2T, and yes, the components are always over 2T.
> So I'm at a complete loss. The raid0 code follows the same
> paths and does the same things and uses 64bit arithmetic where needed.
>
> So I have no idea how there could be a difference between
> these two cases.
>
> I'm at a loss...
>
> NeilBrown
>
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
Linus Torvalds wrote: > > On Thu, 17 May 2007, Christian wrote: > >> Linus Torvalds wrote: >>> Can you check? The Nehemian (C3-2) should be model 9 or greater. >> Yes, it's a Nehemiah > > Ok. If so, we should blacklist both MCYRIXIII and MVIAC3_2, I suspect. > >> lola:~ # cat /proc/cpuinfo >> flags : fpu vme de pse tsc msr cx8 sep mtrr pge cmov pat mmx fxsr >> sse rng rng_en ace ace_en > > However, it does seem to *claim* to support "cx8" aka cmpxchg8b. What's up > with that? It gets turned on by the code in arch/i386/kernel/cpu. It's just that the new code that Andi added runs during setup, i.e. in real mode, so *way* earlier than that. -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dock support on thinkpad x60s for DVD/CDROM
On Wed, 16 May 2007, George Nychis wrote: > I was wondering if any progress was made on the docking station support for > new thinkpads, > like the x60s ultradock. I'm looking for support for the CD/DVD burner on > it. I tried > enabling the most recent dock support in the kernel and still nothing. > (2.6.22-rc1) ACPI generic dock driver, or thinkpad-acpi dock driver? -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie." -- The Silicon Valley Tarot Henrique Holschuh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Weird hard disk noise on shutdown (bug #7674)
On Wednesday 16 May 2007 8:58 pm, Henrique de Moraes Holschuh wrote: > On Wed, 16 May 2007, Rob Landley wrote: > > Ok, so the change is to get shutdown to _stop_ doing something stupid > > (spinning down the disk without first flushing the cache), and the correct > > thing for shutdown to do is keep its' mitts off the thing and let the kernel > > power down the darn hardware? > > Yes, for *all* SCSI disk devices, libata or not. I realize that this time next year it won't be possible to use a ramdisk or a network block device without going through the SCSI layer, but while it remains an option I'm relishing _not_ using it, thanks. > But you need to detect if the kernel has proper SCSI device shutdown > support, because if it does not, you have to do a cache flush and spindown > on shutdown(8) if you can... Or (and this is just a thought), you could upgrade your kernel so it correctly handles your hardware, treating this just like any other driver bug or other lack of proper hardware support in the history of Linux. (Back when APM couldn't power off the machine at the end of the shutdown sequence, did we modify shutdown to try to work around this, or did we fix it in the kernel so it worked?) Why does everybody want to shoehorn everything through the SCSI layer, anyway? Last I checked you didn't have to spin down a USB flash key. If SATA is SCSI, what the heck is SAS? (Answer: a cynical marketing hack to bleed SCSI bigots for the huge margins they've always been bled for. But oh well.) It would be hilarious if I didn't have to put up with it renumbering my devices and imposing requirements for hardware I haven't got on hardware I have got... Rob - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/5] make slab gfp fair
On Mon, 14 May 2007, Peter Zijlstra wrote: > > In the interest of creating a reserve based allocator; we need to make the > slab > allocator (*sigh*, all three) fair with respect to GFP flags. > > That is, we need to protect memory from being used by easier gfp flags than it > was allocated with. If our reserve is placed below GFP_ATOMIC, we do not want > a > GFP_KERNEL allocation to walk away with it - a scenario that is perfectly > possible with the current allocators. And the solution is to fail the allocation of the process which tries to walk away with it. The failing allocation will lead to the killing of the process right? We already have an OOM killer which potentially kills random processes. We hate it. Could you please modify the patchset to *avoid* failure conditions. This patchset here only manages failure conditions. The system should not get into the failure conditions in the first place! For that purpose you may want to put processes to sleep etc. But in order to do so you need to figure out which processes you need to make progress. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Software raid0 will crash the file-system, when each disk is 5TB
On Thursday May 17, [EMAIL PROTECTED] wrote: > > > The only difference of any significance between the working > > and non-working configurations is that in the non-working, > > the component devices are larger than 2Gig, and hence have > > sector offsets greater than 32 bits. > > Do u mean 2T here?, but in both configuartion, the component devices are > larger than 2T (2.25T&5.5T). Yes, I meant 2T, and yes, the components are always over 2T. So I'm at a complete loss. The raid0 code follows the same paths and does the same things and uses 64bit arithmetic where needed. So I have no idea how there could be a difference between these two cases. I'm at a loss... NeilBrown - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] XFS: memory leak in xfs_inactive() - is xfs_trans_free() enough or do we need xfs_trans_cancel() ?
On Wed, May 16, 2007 at 11:31:16PM +0200, Jesper Juhl wrote:
> Hi,
>
> The Coverity checker found a memory leak in xfs_inactive().
> So, the code allocates a transaction, but in the case where 'truncate' is
> !=0 and xfs_itruncate_start(ip, XFS_ITRUNC_DEFINITE, 0); happens to return
> an error, we'll just return from the function without dealing with the
> memory allocated by xfs_trans_alloc() and assigned to 'tp', thus it'll be
> orphaned/leaked - not good.

Yeah, introduced by:

http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d3cf209476b72c83907a412b6708c5e498410aa7

Thanks for reporting the problem, Jesper.

> What I'm wondering is this; is it enough, at this point, to call
> xfs_trans_free(tp); (it would seem to me that would be OK, but I'm not
> intimite with this code) or do we need a full xfs_trans_cancel(tp, 0); ???

xfs_trans_free() is not supposed to be called by anything but the transaction code (it's static). So a xfs_trans_cancel() would need to be issued.

> In case I'm right and xfs_trans_free(tp); is all we need, then please
> consider the patch below. Otherwise please NACK the patch and I'll cook up
> another one :-)

NACK ;)

xfs_trans_cancel() is needed. Patch below.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---
 fs/xfs/xfs_vnodeops.c |    1 +
 1 file changed, 1 insertion(+)

Index: 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_vnodeops.c	2007-05-11 16:04:03.0 +1000
+++ 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c	2007-05-17 12:37:25.671399078 +1000
@@ -1710,6 +1710,7 @@ xfs_inactive(
 
 		error = xfs_itruncate_start(ip, XFS_ITRUNC_DEFINITE, 0);
 		if (error) {
+			xfs_trans_cancel(tp, 0);
 			xfs_iunlock(ip, XFS_IOLOCK_EXCL);
 			return VN_INACTIVE_CACHE;
 		}
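The general shape of the fix above is that every early return taken after a transaction has been allocated must release it through the owning subsystem's cancel path. The following is a minimal user-space sketch of that rule. The names `txn_alloc()`, `txn_cancel()`, `txn_commit()`, `do_inactive()` and the `live_txns` counter are all hypothetical and only stand in for the idea; they are not the XFS functions and do not reproduce their behaviour.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transaction object. */
struct txn { int reserved; };

/* Number of transactions allocated but not yet released. */
static int live_txns;

static struct txn *txn_alloc(void)
{
    struct txn *tp = calloc(1, sizeof(*tp));
    if (tp)
        live_txns++;
    return tp;
}

/* Public release path for a transaction that will not be committed. */
static void txn_cancel(struct txn *tp)
{
    assert(tp != NULL);
    live_txns--;
    free(tp);
}

/* Release path for a transaction that completed normally. */
static void txn_commit(struct txn *tp)
{
    assert(tp != NULL);
    live_txns--;
    free(tp);
}

/* step_fails simulates the intermediate step returning an error.
 * Returns 0 on success and -1 on failure. */
static int do_inactive(int step_fails)
{
    struct txn *tp = txn_alloc();
    if (!tp)
        return -1;

    if (step_fails) {
        txn_cancel(tp);   /* without this, the early return leaks tp */
        return -1;
    }

    txn_commit(tp);
    return 0;
}
```

After either outcome of `do_inactive()`, `live_txns` is back to zero, which is the property the one-line patch restores for the error branch.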
[Patch] Allocate sparsemem memmap above 4G on X86_64
On a system with a huge amount of physical memory, the VFS cache and the memory memmap may eat all available system memory under 4G, and then the system may fail to allocate the swiotlb bounce buffer.

There was a fix in arch/x86_64/mm/numa.c, but that fix does not cover the sparsemem model.

This patch adds the fix to the sparsemem model.

Signed-off-by: Zou Nan hai <[EMAIL PROTECTED]>
Acked-by: Siddha, Suresh <[EMAIL PROTECTED]>

---
 include/asm-x86_64/mmzone.h |    5 +++++
 include/linux/bootmem.h     |    3 +++
 mm/sparse.c                 |    5 +++++
 3 files changed, 13 insertions(+)

diff -Nraup a/include/asm-x86_64/mmzone.h b/include/asm-x86_64/mmzone.h
--- a/include/asm-x86_64/mmzone.h	2007-05-17 09:38:02.0 +0800
+++ b/include/asm-x86_64/mmzone.h	2007-05-17 09:54:10.0 +0800
@@ -52,5 +52,10 @@ extern int pfn_valid(unsigned long pfn);
 #define FAKE_NODE_MIN_HASH_MASK	(~(FAKE_NODE_MIN_SIZE - 1uL))
 #endif
 
+#define ARCH_HAS_ALLOC_BOOTMEM_HIGH_NODE 1
+#define alloc_bootmem_high_node(pgdat,size) \
+({__alloc_bootmem_core(pgdat->bdata, size, SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);})
+
+
 #endif
 #endif
diff -Nraup a/include/linux/bootmem.h b/include/linux/bootmem.h
--- a/include/linux/bootmem.h	2007-05-17 09:38:02.0 +0800
+++ b/include/linux/bootmem.h	2007-05-17 09:37:00.0 +0800
@@ -131,5 +131,8 @@ extern void *alloc_large_system_hash(con
 #endif
 extern int hashdist;		/* Distribute hashes across NUMA nodes? */
 
+#ifndef ARCH_HAS_ALLOC_BOOTMEM_HIGH_NODE
+#define alloc_bootmem_high_node(pgdat, size) ({NULL;})
+#endif
 
 #endif /* _LINUX_BOOTMEM_H */
diff -Nraup a/mm/sparse.c b/mm/sparse.c
--- a/mm/sparse.c	2007-05-17 09:38:03.0 +0800
+++ b/mm/sparse.c	2007-05-17 09:54:27.0 +0800
@@ -219,6 +219,11 @@ static struct page __init *sparse_early_
 	if (map)
 		return map;
 
+	map = alloc_bootmem_high_node(NODE_DATA(nid),
+			sizeof(struct page) * PAGES_PER_SECTION);
+	if (map)
+		return map;
+
 	map = alloc_bootmem_node(NODE_DATA(nid),
 			sizeof(struct page) * PAGES_PER_SECTION);
 	if (map)
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
On Thu, 17 May 2007, Christian wrote:
> Linus Torvalds wrote:
> > Can you check? The Nehemiah (C3-2) should be model 9 or greater.
>
> Yes, it's a Nehemiah

Ok. If so, we should blacklist both MCYRIXIII and MVIAC3_2, I suspect.

> lola:~ # cat /proc/cpuinfo
> flags : fpu vme de pse tsc msr cx8 sep mtrr pge cmov pat mmx fxsr
> sse rng rng_en ace ace_en

However, it does seem to *claim* to support "cx8" aka cmpxchg8b. What's up with that?

Linus
[PATCH]x86_64: early_print kernel console should send CRLF not LFCR
[PATCH]x86_64: early_print kernel console should send CRLF not LFCR

in commit d358788f3f30113e49882187d794832905e42592
Author: Russell King <[EMAIL PROTECTED]>
Date: Mon Mar 20 20:00:09 2006 +

    Glen Turner reported that writing LFCR rather than the more
    traditional CRLF causes issues with some terminals.

    Since this afflicts many serial drivers, extract the common code
    to a library function (uart_console_write) and arrange for each
    driver to supply a "putchar" function.

but early_printk was left out.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Russell King <[EMAIL PROTECTED]>

diff --git a/arch/x86_64/kernel/early_printk.c b/arch/x86_64/kernel/early_printk.c
index 56eaa25..296d2b0 100644
--- a/arch/x86_64/kernel/early_printk.c
+++ b/arch/x86_64/kernel/early_printk.c
@@ -91,9 +91,9 @@ static int early_serial_putc(unsigned char ch)
 static void early_serial_write(struct console *con, const char *s, unsigned n)
 {
 	while (*s && n-- > 0) {
-		early_serial_putc(*s);
 		if (*s == '\n')
 			early_serial_putc('\r');
+		early_serial_putc(*s);
 		s++;
 	}
 }
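[Editorial sketch, not part of the original message.] The effect of the reordering can be checked in userspace by writing into a buffer instead of a serial port. The loop below uses the same ordering as the patched early_serial_write(); `struct fake_port`, `fake_putc`, and `crlf_write` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the serial port: bytes are appended to a buffer. */
struct fake_port {
	char out[64];
	size_t len;
};

/* Appends one byte and keeps the buffer NUL-terminated; drops bytes when full. */
static void fake_putc(struct fake_port *p, char ch)
{
	if (p->len < sizeof(p->out) - 1)
		p->out[p->len++] = ch;
	p->out[p->len] = '\0';
}

/* Same ordering as the patched loop: '\r' is emitted before '\n'. */
static void crlf_write(struct fake_port *p, const char *s, unsigned n)
{
	while (*s && n-- > 0) {
		if (*s == '\n')
			fake_putc(p, '\r');
		fake_putc(p, *s);
		s++;
	}
}
```

With a zero-initialized `struct fake_port`, writing "ok\n" leaves "ok\r\n" in the buffer. Moving the unconditional `fake_putc(p, *s)` back above the `if` would produce "ok\n\r", the LFCR order the patch removes.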
Re: [PATCH] Serial 8250: Handle saving the clear-on-read bits from the LSR and MSR
On Wednesday 16 May 2007, Corey Minyard wrote: >Russell King wrote: >> On Sun, May 06, 2007 at 12:58:25PM -0400, Gene Heskett wrote: >>> [long message snipped] >>> >>> Thanks for your patience Corey. >> >> So, in one sentence or preferably one word, did Corey's patch cause a >> regression? > >>From what Gene said, I think the final outcome is that this patch didn't >seem to make any difference. It looks to me that the problems were >elsewhere. > >So what's the state of this patch? > >Thanks, > >-corey Gene here. My impression was that this patch did help in that it appeared to clean up what was thought to be less than optimum code in that area. There were a few times when it didn't seem to take quite as many kills and restarts of that ill-coded proprietary daemon to make things behave. OTOH, I get the very strong impression there is another, more serious buglet someplace else that does a pretty good job of masking any black and white comparisons one might make about this patch. Older code, as in 4 or 5 minor kernel versions back, appeared to work correctly, either on a serial port, or a usb port with a pl2303, if one could tolerate the miss-fires it did occasionally. Now of course the pl2303 seems to be broken, both of the ones I have quit working at all with 2.6.21 final. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) * |Rain| prepares for polygon soup <|Rain|> sweet merciful crap, it works? * |Rain| faints - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Software raid0 will crash the file-system, when each disk is 5TB
> The only difference of any significance between the working
> and non-working configurations is that in the non-working,
> the component devices are larger than 2Gig, and hence have
> sector offsets greater than 32 bits.

Do you mean 2T here? In both configurations, the component devices are larger than 2T (2.25T and 5.5T).

> This does cause a slightly different code path in one place,
> but I cannot see it making a difference. But maybe it does.
>
> What architecture is this running on?
> What C compiler are you using?

i386 (i686), gcc 4.0.2 20051125. The distro is Fedora Core; we've tried FC4 and FC6.

> Can you try with this patch? It is the only thing that I can
> find that could conceivably go wrong.
>

OK, I will try the patch and post the result.

Best Regards
Jeff Zheng
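[Editorial sketch, not part of the original message.] The question about 2T can be made concrete with a little arithmetic. The sketch assumes 512-byte sectors and binary terabytes, neither of which is stated in the thread, and the helper names are hypothetical. Under those assumptions a 2T device already needs more than 32 bits for its sector count, so both 2.25T and 5.5T components are past that limit.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* assumption: 512-byte sectors */

/* Number of whole sectors in a device of the given size in bytes. */
static uint64_t bytes_to_sectors(uint64_t bytes)
{
	return bytes >> SECTOR_SHIFT;
}

/* Nonzero when the sector count is representable in 32 bits. */
static int fits_in_32_bits(uint64_t sectors)
{
	return sectors <= UINT32_MAX;
}

/* What is left of a sector count when it is stored in a 32-bit variable. */
static uint32_t truncate_to_32_bits(uint64_t sectors)
{
	return (uint32_t)sectors;
}
```

For a 2.25T component the sector count is 4831838208, which no longer fits in 32 bits. Stored in a 32-bit variable it would wrap to 536870912, the kind of silent truncation a 64-bit sector type is meant to avoid.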
Re: [2.6.20.11] File system corruption with degraded md/RAID-5 device
On Wednesday May 16, [EMAIL PROTECTED] wrote:
>
> Here is what I am doing to test:
>
> fdisk /dev/sda1 and /dev/sdb1 to type fd/Linux raid auto
> mdadm --create /dev/md1 -c 128 -l 5 -n 3 /dev/sda1 /dev/sdb1 missing
> mke2fs -j -b 4096 -R stride=32 /dev/md1
> e2fsck -f /dev/md1
> -
> Result: FAILS - fsck errors (Example: "Inode 3930855 is in use, but
> has dtime set.")

Very odd. I cannot reproduce this, but then my drives are somewhat smaller than yours (though I'm not sure how that could be significant).

Can you try a raid0 across 2 drives? That would be more like the raid5 layout than raid1.

My guess is some subtle hardware problem, as I would be very surprised if the raid5 code is causing this. Maybe run memtest86?

NeilBrown
Re: 2.6.22-rc1-mm1
On Wed, May 16, 2007 at 09:41:33AM -0700, Andrew Morton wrote: > On Wed, 16 May 2007 18:24:44 +0200 Michal Piotrowski <[EMAIL PROTECTED]> > wrote: > > > Andrew Morton napisaÅ(a): > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/ > > > > > > > Almost every time when I try to run this script I hit a bug. I'm wondering > > why... > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc1-mm1/test_mount_fs.sh > > > > [ .713016] kernel BUG at /home/devel/linux-mm/include/linux/mm.h:288! > > static inline int put_page_testzero(struct page *page) > { > VM_BUG_ON(atomic_read(>_count) == 0); > return atomic_dec_and_test(>_count); > } I haven't seen that one. I expect that it will be the noaddr buffer allocation changes that have triggered this... > > [ .719690] invalid opcode: [#1] > > [ .723397] PREEMPT SMP > > [ .725999] Modules linked in: xfs loop pktgen ipt_MASQUERADE > > iptable_nat nf_nat autofs4 af_packet nf_conntrack_netbios_ns ipt_REJECT > > nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables > > ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc > > thermal processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus > > snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device > > snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd soundcore intel_agp > > agpgart snd_page_alloc i2c_i801 ide_cd cdrom rtc unix > > [ .776026] CPU:0 > > [ .776027] EIP:0060:[]Not tainted VLI > > [ .776028] EFLAGS: 00010202 (2.6.22-rc1-mm1 #3) > > [ .788519] EIP is at put_page+0x44/0xee > > [ .792491] eax: 0001 ebx: c549f728 ecx: c04b27e0 edx: 0001 > > [ .799345] esi: edi: 0080 ebp: d067e9e0 esp: d067e9c8 > > [ .806208] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 > > [ .812104] Process mount (pid: 9419, ti=d067e000 task=d00a4070 > > task.ti=d067e000) > > [ .819486] Stack: d8980180 0080 d067e9f0 d8980180 0080 > > d067e9f0 fdc8eda3 > > [ .828103]fffc d8980180 
d067ea20 fdc8f7ff fdc9b425 fdc96e5c > > 0008 > > [ .836635]c549dfd0 0200 cd44b8e0 2160 cd44b8e0 > > d067ea30 fdc78937 > > [ .845253] Call Trace: > > [ .847939] [] xfs_buf_free+0x41/0x61 [xfs] > > [ .853247] [] xfs_buf_get_noaddr+0x10c/0x118 [xfs] > > [ .859231] [] xlog_get_bp+0x65/0x69 [xfs] Yeah - that trace implies a memory allocation failure when allocating log buffer pages and the cleanup looks like it does a double free of the pages that got allocated. Patch attached below that should fix this problem. > > [ 6667.271984] XFS: Filesystem loop1 has duplicate UUID - can't mount > > > > ... > > > > [ 6670.074487] XFS: Filesystem loop1 has duplicate UUID - can't mount > > [ 6670.240395] XFS: Filesystem loop1 has duplicate UUID - can't mount > > [ 6670.350305] XFS: Filesystem loop1 has duplicate UUID - can't mount > > [ 6670.458773] XFS: Filesystem loop1 has duplicate UUID - can't mount I assume that the thread doing the mount got killed by the BUG and so the normal error handling path on log mount failure was not executed and hence the uuid for the filesystem never got removed from the table used to detect multiple mounts of the same filesystem Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/linux-2.6/xfs_buf.c | 21 + 1 file changed, 13 insertions(+), 8 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c === --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-05-11 16:03:26.0 +1000 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c2007-05-17 11:53:40.293585132 +1000 @@ -323,9 +323,16 @@ xfs_buf_free( for (i = 0; i < bp->b_page_count; i++) { struct page *page = bp->b_pages[i]; - if (bp->b_flags & _XBF_PAGE_CACHE) + /* handle noaddr allocation failure case */ + if (!page) + break; + + if (bp->b_flags & _XBF_PAGE_CACHE) { ASSERT(!PagePrivate(page)); - page_cache_release(page); + page_cache_release(page); + } else { + __free_page(page); + } } _xfs_buf_free_pages(bp); } @@ -766,6 +773,8 @@ xfs_buf_get_noaddr( goto fail; _xfs_buf_initialize(bp, target, 0, len, 0); + bp->b_flags |= _XBF_PAGES; + error = _xfs_buf_get_pages(bp, page_count, 0); if (error) goto fail_free_buf; @@ -773,15 +782,14 @@ xfs_buf_get_noaddr( for (i = 0; i <
dock support on thinkpad x60s for DVD/CDROM
Hey all, I was wondering if any progress was made on the docking station support for new thinkpads, like the x60s ultradock. I'm looking for support for the CD/DVD burner on it. I tried enabling the most recent dock support in the kernel and still nothing. (2.6.22-rc1) Thanks! George
[PATCH] fix kmalloc(0) in arch/ia64/pci/pci.c
Fix the following kmalloc(0) message. The patch is against 2.6.22-rc1-mm1.

==
BUG: at mm/slab.c:792 __find_general_cachep()
Call Trace:
 [] show_stack+0x40/0xa0 sp=e14042227b00 bsp=e14042220f90
 [] dump_stack+0x30/0x60 sp=e14042227cd0 bsp=e14042220f78
 [] kmem_find_general_cachep+0x90/0x140 sp=e14042227cd0 bsp=e14042220f48
 [] __kmalloc_node+0x30/0xa0 sp=e14042227cd0 bsp=e14042220f18
 [] pci_acpi_scan_root+0x180/0x4a0 sp=e14042227cd0 bsp=e14042220ed0
 [] acpi_pci_root_add+0x4e0/0x700 sp=e14042227cf0 bsp=e14042220e90
 [] acpi_device_probe+0xa0/0x160 sp=e14042227d10 bsp=e14042220e58
 [] driver_probe_device+0x250/0x380 sp=e14042227d10 bsp=e14042220e18
 [] __driver_attach+0xc0/0x160 sp=e14042227d10 bsp=e14042220de0
 [] bus_for_each_dev+0x80/0x100 sp=e14042227d10 bsp=e14042220da8
 [] driver_attach+0x40/0x60 sp=e14042227d30 bsp=e14042220d88
 [] bus_add_driver+0xf0/0x3c0 sp=e14042227d30 bsp=e14042220d48
 [] driver_register+0x140/0x160 sp=e14042227d30 bsp=e14042220d28
 [] acpi_bus_register_driver+0x50/0x80 sp=e14042227d30 bsp=e14042220d08
 [] acpi_pci_root_init+0x20/0x60 sp=e14042227d30 bsp=e14042220cf0
 [] kernel_init+0x450/0x7c0 sp=e14042227d30 bsp=e14042220ca8
 [] kernel_thread_helper+0x30/0x60 sp=e14042227e30 bsp=e14042220c80
 [] start_kernel_thread+0x20/0x40 sp=e14042227e30 bsp=e14042220c80
==

Fix kmalloc(0)

Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

Index: linux-2.6.22-rc1-mm1/arch/ia64/pci/pci.c
===
--- linux-2.6.22-rc1-mm1.orig/arch/ia64/pci/pci.c
+++ linux-2.6.22-rc1-mm1/arch/ia64/pci/pci.c
@@ -354,6 +354,8 @@ pci_acpi_scan_root(struct acpi_device *d
 	acpi_walk_resources(device->handle, METHOD_NAME__CRS, count_window,
 			&windows);
+	if (!windows)
+		goto out2;
 	controller->window = kmalloc_node(sizeof(*controller->window) * windows,
 			GFP_KERNEL, controller->node);
 	if (!controller->window)
Re: radeonfb and X800 cards
On Wed, 2007-05-16 at 21:47 -0400, Daniel Drake wrote:
> Hi,
>
> Did anything happen to the patch titled "radeonfb: add support for newer
> cards"?
> http://lwn.net/Articles/215965/
>
> Jimmy at http://bugs.gentoo.org/174063 has extended upon this with some
> further fixes based on code in the X11 driver. The patches are on the
> bug report.
>
> Ben, where can the most up-to-date radeonfb code be found?

Upstream. I haven't released anything else so far. Does the patch still apply?

Ben.
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
H. Peter Anvin wrote: > > Andi added code to verify that we can actually execute on the processor > before protected mode (so we can still get a message out through the > BIOS.) That code presumably doesn't know of the MSR that needs to be > touched. > > That code is in assembly in Andi's version, my rewritten version has it > in C. I should add this code. > The newsetup tree now has code to unmask features on VIA and Transmeta (as well as AMD, which was already in there): http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-newsetup.git;a=commitdiff;h=c9cf55604433b386d0b499ed7bed654fd01c3be2 -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
patch jfs-fix-race-waking-up-jfsio-kernel-thread.patch queued to 2.6.21-stable tree
This is a note to let you know that we have just queued up the patch titled

    Subject: JFS: Fix race waking up jfsIO kernel thread

to the 2.6.21-stable tree. Its filename is

    jfs-fix-race-waking-up-jfsio-kernel-thread.patch

A git repo of this tree can be found at
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

>From [EMAIL PROTECTED] Tue May 15 20:55:43 2007
From: Dave Kleikamp <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Date: Tue, 15 May 2007 22:53:36 -0500
Message-Id: <[EMAIL PROTECTED]>
Cc: linux-kernel
Subject: JFS: Fix race waking up jfsIO kernel thread

It's possible for a journal I/O request to be added to the log_redrive queue and the jfsIO thread to be awakened after the thread releases log_redrive_lock but before it sets its state to TASK_INTERRUPTIBLE.

The jfsIO thread should set the state before giving up the spinlock, so the waking thread will really wake it.

Signed-off-by: Dave Kleikamp <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 fs/jfs/jfs_logmgr.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-2.6.21.1.orig/fs/jfs/jfs_logmgr.c
+++ linux-2.6.21.1/fs/jfs/jfs_logmgr.c
@@ -2354,12 +2354,13 @@ int jfsIOWait(void *arg)
 			lbmStartIO(bp);
 			spin_lock_irq(&log_redrive_lock);
 		}
-		spin_unlock_irq(&log_redrive_lock);
 		if (freezing(current)) {
+			spin_unlock_irq(&log_redrive_lock);
 			refrigerator();
 		} else {
 			set_current_state(TASK_INTERRUPTIBLE);
+			spin_unlock_irq(&log_redrive_lock);
 			schedule();
 			current->state = TASK_RUNNING;
 		}

Patches currently in stable-queue which might be from [EMAIL PROTECTED] are

queue-2.6.21/jfs-fix-race-waking-up-jfsio-kernel-thread.patch
radeonfb and X800 cards
Hi,

Did anything happen to the patch titled "radeonfb: add support for newer cards"?
http://lwn.net/Articles/215965/

Jimmy at http://bugs.gentoo.org/174063 has extended upon this with some further fixes based on code in the X11 driver. The patches are on the bug report.

Ben, where can the most up-to-date radeonfb code be found?

Thanks,
Daniel
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
Dave Jones wrote:
> The C3s all have cx8, but it needs to be enabled in an MSR first.
> (See arch/i386/kernel/cpu/centaur.c , search for CX8)
>
> Did we add code that uses cmpxchg8b before identify_cpu() gets run ?
> I've not been paying attention to .22rc (busy trying to beat .21 into shape
> for F7) so I may have missed something obvious. Andi?
>
> Dave

Maybe I gave the wrong reason by pointing at the cmpxchg64 instruction, but disabling CONFIG_X86_CMPXCHG64 helps. The VIA C3 EBGA datasheet R1.9 tells me this instruction always works:

>The CMPXCHG8B instruction is provided and always enabled, however, it appears
>disabled in the corresponding CPUID function bit 0 to avoid a bug in an early
>version of Windows NT. However, this default can be changed via a bit in the
>FCR MSR.

Hmm, I should be able to add a few small "here I am" markers to my local boot code with a little hint. Anyway, I will try tomorrow to find this on my own. printfs for debugging are friendlier than assembler.

H. Peter Anvin wrote:
> Andi added code to verify that we can actually execute on the processor
> before protected mode (so we can still get a message out through the
> BIOS.) That code presumably doesn't know of the MSR that needs to be
> touched.

Best regards,
Christian
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
Linus Torvalds wrote:
> Can you check? The Nehemiah (C3-2) should be model 9 or greater.

Yes, it's a Nehemiah

lola:~ # cat /proc/cpuinfo
processor       : 0
vendor_id       : CentaurHauls
cpu family      : 6
model           : 9
model name      : VIA Nehemiah
stepping        : 8
cpu MHz         : 998.732
cache size      : 64 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr cx8 sep mtrr pge cmov pat mmx fxsr sse rng rng_en ace ace_en
bogomips        : 1999.51
clflush size    : 32
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
Dave Jones wrote:
>
> The C3s all have cx8, but it needs to be enabled in an MSR first.
> (See arch/i386/kernel/cpu/centaur.c , search for CX8)
>
> Did we add code that uses cmpxchg8b before identify_cpu() gets run ?
> I've not been paying attention to .22rc (busy trying to beat .21 into shape
> for F7) so I may have missed something obvious. Andi?
>

Andi added code to verify that we can actually execute on the processor before protected mode (so we can still get a message out through the BIOS.) That code presumably doesn't know of the MSR that needs to be touched.

That code is in assembly in Andi's version, my rewritten version has it in C. I should add this code.

-hpa
Re: UML doesn't compile in 2.6.21
On Wednesday 16 May 2007 6:49 pm, Robert Schwebel wrote:
> Jeff,
>
> Any idea how this could happen? I'm trying to build 2.6.21 for ARCH=um, and the
> linker stage explodes here:

2.6.21.1 built for me:

tar xvjf linux-2.6.21.1.tar.bz2 &&
cd linux-2.6.21.1 &&
cat > mini.conf << EOF
CONFIG_MODE_SKAS=y
CONFIG_BINFMT_ELF=y
CONFIG_HOSTFS=y
CONFIG_SYSCTL=y
CONFIG_STDERR_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_LBD=y
CONFIG_EXT2_FS=y
CONFIG_PROC_FS=y
EOF
make ARCH=um allnoconfig KCONFIG_ALLCONFIG=mini.conf &&
make ARCH=um &&
./linux rootfstype=hostfs rw init=/bin/sh

Does this not work for you?

Rob
Proposed update of the i386 boot document
I have noticed that a few items in the i386 boot document have become a bit incoherent and/or dated over the years. Attached is an attempt at rewriting a few of the sections; the main things is a field-by-field description for the setup header. I would appreciate comments as to if this makes it easier to follow or not. -hpa THE LINUX/I386 BOOT PROTOCOL H. Peter Anvin <[EMAIL PROTECTED]> Last update 2007-05-16 On the i386 platform, the Linux kernel uses a rather complicated boot convention. This has evolved partially due to historical aspects, as well as the desire in the early days to have the kernel itself be a bootable image, the complicated PC memory model and due to changed expectations in the PC industry caused by the effective demise of real-mode DOS as a mainstream operating system. Currently, the following versions of the Linux/i386 boot protocol exist. Old kernels:zImage/Image support only. Some very early kernels may not even support a command line. Protocol 2.00: (Kernel 1.3.73) Added bzImage and initrd support, as well as a formalized way to communicate between the boot loader and the kernel. setup.S made relocatable, although the traditional setup area still assumed writable. Protocol 2.01: (Kernel 1.3.76) Added a heap overrun warning. Protocol 2.02: (Kernel 2.4.0-test3-pre3) New command line protocol. Lower the conventional memory ceiling. No overwrite of the traditional setup area, thus making booting safe for systems which use the EBDA from SMM or 32-bit BIOS entry points. zImage deprecated but still supported. Protocol 2.03: (Kernel 2.4.18-pre1) Explicitly makes the highest possible initrd address available to the bootloader. Protocol 2.04: (Kernel 2.6.14) Extend the syssize field to four bytes. Protocol 2.05: (Kernel 2.6.20) Make protected mode kernel relocatable. Introduce relocatable_kernel and kernel_alignment fields. 
Protocol 2.06: (Kernel 2.6.22) Added a field that contains the size of the boot command line MEMORY LAYOUT The traditional memory map for the kernel loader, used for Image or zImage kernels, typically looks like: || 0A ++ | Reserved for BIOS | Do not use. Reserved for BIOS EBDA. 09A000 ++ | Command line | | Stack/heap| For use by the kernel real-mode code. 098000 ++ | Kernel setup | The kernel real-mode code. 090200 ++ | Kernel boot sector| The kernel legacy boot sector. 09 ++ | Protected-mode kernel | The bulk of the kernel image. 01 ++ | Boot loader | <- Boot sector entry point :7C00 001000 ++ | Reserved for MBR/BIOS | 000800 ++ | Typically used by MBR | 000600 ++ | BIOS use only | 00 ++ When using bzImage, the protected-mode kernel was relocated to 0x10 ("high memory"), and the kernel real-mode block (boot sector, setup, and stack/heap) was made relocatable to any address between 0x1 and end of low memory. Unfortunately, in protocols 2.00 and 2.01 the 0x9+ memory range is still used internally by the kernel; the 2.02 protocol resolves that problem. It is desirable to keep the "memory ceiling" -- the highest point in low memory touched by the boot loader -- as low as possible, since some newer BIOSes have begun to allocate some rather large amounts of memory, called the Extended BIOS Data Area, near the top of low memory. The boot loader should use the "INT 12h" BIOS call to verify how much low memory is available. Unfortunately, if INT 12h reports that the amount of memory is too low, there is usually nothing the boot loader can do but to report an error to the user. The boot loader should therefore be designed to take up as little space in low memory as it reasonably can. For zImage or old bzImage kernels, which need data written into the 0x9 segment, the boot loader should make sure not to use memory above the 0x9A000 point; too many BIOSes will break above that point. 
For a modern bzImage kernel with boot protocol version >= 2.02, a memory layout like the following is suggested: ~~ | Protected-mode kernel | 10 ++ | I/O memory hole | 0A ++ | Reserved for BIOS | Leave as
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
On Thu, 17 May 2007, Christian wrote:
>
> my small VIA C3_2 box does not boot with 2.6.22-rc1.
> It even does not uncompress the kernel.
>
> The configuration as M386 M486 works. But M586 + MVIAC3_2
> does not work.

Ahh, from the EPIA HOWTO:

    13.2. Is the C3 Pentium compatible?

    Yes. But Samuel 2, Ezra, Ezra T C3 processors have a problem with the
    cmpxchg8b (i.e. CMOV) opcode. Nehemiah and Antaur processors are not
    affected.

However, that would imply that you don't have a VIA C3-2 (Nehemiah) at all, but the older original CyrixIII/VIA C3.

Can you check? The Nehemiah (C3-2) should be model 9 or greater.

So afaik, you should use MCYRIXIII, and make _that_ be the one that disables the use of the cmpxchg8b instruction. Can you please verify?

Linus
Re: [PATCH] powernow-k8: depend on acpi-processor for SMP systems
Daniel Drake wrote:
> Joshua Hoblitt wrote:
>> I don't think this is quite right either as Ed Sweetman has reported that
>> this issue doesn't occur on single socket/multi-core systems.
>
> Where did he write that? In an off-list mail, Ed seemed to agree with my
> patch.
>
> Daniel

What I didn't agree with was the dependency on the acpi P-state driver for single socket multi-core systems, where in the original post of this thread, Joshua was stating that SMP systems required that driver. Later it was found that the acpi p-state driver was only being used to enforce the dependency on the acpi_processor driver, which is the actual driver we care about (dependency wise).

So yes, I do agree with your patch, insofar as my experience with the hardware goes.
Re: [PATCH 2.6.21-rc1-mm1] add check_highest_zone to build_zonelists_in_zone_order
On Wed, 16 May 2007 15:57:39 -0400 Lee Schermerhorn <[EMAIL PROTECTED]> wrote:
>
> [PATCH 2.6.21-rc1-mm1] add check_highest_zone to build_zonelists_in_zone_order
>
> We missed this in the "change zone order" series. We need to record
> the highest populated zone, just as build_zonelists_node() does.
> Memory policies apply only to this zone. Without this, we'll be
> applying policy to all zones, including DMA, I think. Not having
> thought about it much, I can't claim to understand the downside of
> doing so.
>
> Also, display selected "policy zone" during boot or reconfig
> of zonelist order, if 'NUMA. Inquiring minds [might] want to know...
>
> Cleanup: remove stale comment in set_zonelist_order()
>
> Signed-off-by: Lee Schermerhorn <[EMAIL PROTECTED]>
>

Acked-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

-Kame
Re: Weird hard disk noise on shutdown (bug #7674)
On Wed, 16 May 2007, Rob Landley wrote:
> Ok, so the change is to get shutdown to _stop_ doing something stupid
> (spinning down the disk without first flushing the cache), and the correct
> thing for shutdown to do is keep its mitts off the thing and let the kernel
> power down the darn hardware?

Yes, for *all* SCSI disk devices, libata or not. But you need to detect if the kernel has proper SCSI device shutdown support, because if it does not, you have to do a cache flush and spindown on shutdown(8) if you can...

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
Re: [PATCH] libata: implement ata_wait_after_reset()
On Wed, May 16, 2007 at 06:44:53PM +0200, Tejun Heo wrote:
> +	/* FIXME: GoVault needs 2s but we can't afford that without
> +	 * parallel probing. 800ms is enough for iVDR disk
> +	 * HHD424020F7SV00. Increase to 2secs when parallel probing
> +	 * is in place.
> +	 */
> +	ATA_TMOUT_FF_WAIT = 4 * HZ / 5,
> +

Changing this to 4 * HZ / 4 gets rid of the occasional COMRESET failure. So it would seem that 800ms is good enough for the common case, but it seems to be cutting it pretty close.
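[Editorial sketch, not part of the original message.] The two timeout expressions are easier to compare in milliseconds. The helpers below are hypothetical, and the tick rate is passed in as a parameter because the thread does not say which HZ value the test machine uses.

```c
#include <assert.h>

/* Converts a tick count to milliseconds for a given tick rate. */
static unsigned ticks_to_ms(unsigned ticks, unsigned hz)
{
	return ticks * 1000u / hz;
}

/* The value from the patch: 4 * HZ / 5. */
static unsigned tmout_patch(unsigned hz)
{
	return 4 * hz / 5;
}

/* The value tried in the reply: 4 * HZ / 4, which is simply HZ. */
static unsigned tmout_tested(unsigned hz)
{
	return 4 * hz / 4;
}
```

For tick rates of 100, 250, and 1000 the patch value works out to 800 ms and the tested value to 1000 ms, so the change adds 200 ms of wait, still well short of the 2 s the FIXME comment mentions for GoVault.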
Re: select(0, ..) is valid ?
On Wed, 2007-05-16 at 10:37 -0500, Anton Blanchard wrote:
> Hi Hugh,
>
> > It's interesting that compat_core_sys_select() shows this kmalloc(0)
> > failure but core_sys_select() does not. That's because core_sys_select()
> > avoids kmalloc by using a buffer on the stack for small allocations (and
> > 0 sure is small). Shouldn't compat_core_sys_select() do just the same?
> > Or is SLUB going to be so efficient that doing so is a waste of time?
>
> Nice catch, the original optimisation from Andi is:
>
> http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f
>
> And I think it makes sense for the compat code to do it too.
>
> Anton

Here it is .. Should I do one for poll() also ?

Thanks,
Badari

Optimize select by using stack space for small fd sets. core_sys_select() already has this optimization. This is for the compat version.

Signed-off-by: Badari Pulavarty <[EMAIL PROTECTED]>
---
 fs/compat.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

Index: linux-2.6.22-rc1/fs/compat.c
===
--- linux-2.6.22-rc1.orig/fs/compat.c	2007-05-12 18:45:56.0 -0700
+++ linux-2.6.22-rc1/fs/compat.c	2007-05-16 17:50:39.0 -0700
@@ -1544,9 +1544,10 @@ int compat_core_sys_select(int n, compat
 	compat_ulong_t __user *outp, compat_ulong_t __user *exp, s64 *timeout)
 {
 	fd_set_bits fds;
-	char *bits;
+	void *bits;
 	int size, max_fds, ret = -EINVAL;
 	struct fdtable *fdt;
+	long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
 
 	if (n < 0)
 		goto out_nofds;
@@ -1564,11 +1565,14 @@ int compat_core_sys_select(int n, compat
 	 * since we used fdset we need to allocate memory in units of
 	 * long-words.
 	 */
-	ret = -ENOMEM;
 	size = FDS_BYTES(n);
-	bits = kmalloc(6 * size, GFP_KERNEL);
-	if (!bits)
-		goto out_nofds;
+	bits = stack_fds;
+	if (size > sizeof(stack_fds) / 6) {
+		bits = kmalloc(6 * size, GFP_KERNEL);
+		ret = -ENOMEM;
+		if (!bits)
+			goto out_nofds;
+	}
 	fds.in = (unsigned long *) bits;
 	fds.out = (unsigned long *) (bits + size);
 	fds.ex = (unsigned long *) (bits + 2*size);
@@ -1600,7 +1604,8 @@ int compat_core_sys_select(int n, compat
 		compat_set_fd_set(n, exp, fds.res_ex))
 		ret = -EFAULT;
 out:
-	kfree(bits);
+	if (bits != stack_fds)
+		kfree(bits);
 out_nofds:
 	return ret;
 }
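The pattern in the patch above (use an on-stack buffer when the six bitmaps fit, fall back to the allocator otherwise, and only free what was allocated) can be tried outside the kernel. This is a rough userspace sketch, not the compat code itself: STACK_ALLOC is an assumed stand-in for SELECT_STACK_ALLOC, malloc()/free() stand in for kmalloc()/kfree(), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STACK_ALLOC 256	/* assumed size, stands in for SELECT_STACK_ALLOC */

/*
 * Zero six bitmaps of `size` bytes each. Returns the number of bytes
 * used, or -1 if the heap allocation failed. *used_heap reports which
 * path was taken.
 */
long fill_six_bitmaps(size_t size, int *used_heap)
{
	long stack_buf[STACK_ALLOC / sizeof(long)];
	char *bits = (char *)stack_buf;

	*used_heap = 0;
	if (size > sizeof(stack_buf) / 6) {
		/* Too big for the stack buffer: fall back to the heap. */
		bits = malloc(6 * size);
		if (!bits)
			return -1;
		*used_heap = 1;
	}

	memset(bits, 0, 6 * size);

	/* Free only what came from the allocator. */
	if (bits != (char *)stack_buf)
		free(bits);
	return (long)(6 * size);
}
```

Note that a size of 0 stays on the stack path and never reaches the allocator, which is the kmalloc(0) case that started this thread.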
Re: [PATCH] powernow-k8: depend on acpi-processor for SMP systems
Joshua Hoblitt wrote:
> I don't think this is quite right either as Ed Sweetman has reported that this issue doesn't occur on single socket/multi-core systems.

Where did he write that? In an off-list mail, Ed seemed to agree with my patch.

Daniel
Re: [PATCH] libata: implement ata_wait_after_reset()
On Wed, May 16, 2007 at 06:44:53PM +0200, Tejun Heo wrote:
> This patch is against the current libata-dev#upstream +
> pata_scc-fix-build-failure[1].
>
> [1] http://article.gmane.org/gmane.linux.kernel/528405
>
> Paul, please verify this fixes your problem. You can skip the
> pata_scc patch, it will cause pata_scc part to be rejected but doesn't
> matter.
>

Yes, this does get iVDR detection working again. The only problem seems to be that every now and then I end up with this:

scsi0 : sata_sil
scsi1 : sata_sil
ata1: SATA max UDMA/100 cmd 0xfd000280 ctl 0xfd00028a bmdma 0xfd000200 irq 0
ata2: SATA max UDMA/100 cmd 0xfd0002c0 ctl 0xfd0002ca bmdma 0xfd000208 irq 0
ata1: device not ready (errno=-19), forcing hardreset
ata1: COMRESET failed (errno=-19)
ata1: reset failed (errno=-19), retrying in 9 secs
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

So at least the drive detection works, but it would be nice not to trigger this 9-second retry.
Re: Resending: RT patches expose netdev race [was Re: [RFC] [patch 2/2] powerpc 2.6.21-rt1: fix kernel hang and/or panic
> I do not know why sk_buff->head would be null, or > would be set in a racy kind of way, or why the rt patches > would cause this. But the evidence implicates that. Would it be possible that a locking bug in spidernet would cause it under some circumstances to get a stale skb pointer ? Ben. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Software raid0 will crash the file-system, when each disk is 5TB
On Wednesday May 16, [EMAIL PROTECTED] wrote:
> Here is the information of the created raid0. Hope it is enough.

Thanks.

Everything looks fine here. The only difference of any significance between the working and non-working configurations is that in the non-working, the component devices are larger than 2Gig, and hence have sector offsets greater than 32 bits.

This does cause a slightly different code path in one place, but I cannot see it making a difference. But maybe it does.

What architecture is this running on?
What C compiler are you using?

Can you try with this patch? It is the only thing that I can find that could conceivably go wrong.

Thanks,
NeilBrown

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/raid0.c | 1 +
 1 file changed, 1 insertion(+)

diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
--- .prev/drivers/md/raid0.c	2007-05-17 10:33:30.0 +1000
+++ ./drivers/md/raid0.c	2007-05-17 10:34:02.0 +1000
@@ -461,6 +461,7 @@ static int raid0_make_request (request_q
 	while (block >= (zone->zone_offset + zone->size))
 		zone++;
+	BUG_ON(block < zone->zone_offset);
 	sect_in_chunk = bio->bi_sector & ((chunk_size<<1) -1);
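To put numbers on the "sector offsets greater than 32 bits" remark: with 512-byte sectors, a sector number stops fitting in 32 bits once a device reaches 2 TiB, so a 5TB component is well past that point. The sketch below only illustrates where a 32-bit truncation would bite; it makes no claim about where (or whether) raid0 truncates anything, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 512-byte sectors in a device of `bytes` bytes. */
uint64_t bytes_to_sectors(uint64_t bytes)
{
	return bytes >> 9;
}

/* Does a sector number survive a round trip through a 32-bit variable? */
int fits_in_32_bits(uint64_t sector)
{
	return (uint64_t)(uint32_t)sector == sector;
}
```

For a 5 * 10^12 byte disk, bytes_to_sectors() returns 9765625000, which fits_in_32_bits() rejects; a 10^12 byte disk gives 1953125000 sectors, which still fits.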
Re: 2.6.22-rc1-mm1: boot failure under qemu
H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>
>> H. Peter Anvin wrote:
>>
>>> Okay, I've established that this is a bug in the Qemu kernel loader: the
>>> Qemu loader puts zero in the loadflags, which is wrong no matter how you
>>> slice it.
>>>
>>> I have checked in a workaround in the git.newsetup tree; the workaround
>>> is to rely on a compile-time value for load low/load high instead of
>>> looking at loadflags.
>>>
>> Can you post a patch to try?
>>
>
> Cumulative diff from -rc1-mm1.
>

Thanks, this works for me.

J
Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
On Thu, May 17, 2007 at 02:09:16AM +0200, Christian wrote:
> my small VIA C3_2 box does not boot with 2.6.22-rc1.
> It even does not uncompress the kernel.
>
> The configuration as M386 M486 works. But M586 + MVIAC3_2
> does not work.
>
> solution for me, change arch/i386/Kconfig.cpu
>
> --- arch/i386/Kconfig.cpu.before 2007-05-17 01:38:26.0 +0200
> +++ arch/i386/Kconfig.cpu 2007-05-17 00:54:52.0 +0200
> @@ -299,5 +299,5 @@
>
> config X86_CMPXCHG64
> 	bool
> -	depends on !M386 && !M486
> +	depends on !M386 && !M486 && !MVIAC3_2
> 	default y
>
>
> The related #ifdef is in ./include/asm-i386/cmpxchg.h
> May be cmpxchg8b is not supported by VIAC3_2 ?
>
> May be some other non Intel/AMD need to be excluded from X86_CMPXCHG64 ?
> May be the generic option CONFIG_X86_GENERIC need to switch this off also ?

The C3s all have cx8, but it needs to be enabled in an MSR first. (See arch/i386/kernel/cpu/centaur.c , search for CX8)
Did we add code that uses cmpxchg8b before identify_cpu() gets run ?

I've not been paying attention to .22rc (busy trying to beat .21 into shape for F7) so I may have missed something obvious. Andi?

Dave

--
http://www.codemonkey.org.uk
Re: [PATCH] powernow-k8: depend on acpi-processor for SMP systems
On Wed, May 16, 2007 at 02:26:14PM -1000, Joshua Hoblitt wrote:
> I don't think this is quite right either as Ed Sweetman has reported
> that this issue doesn't occur on single socket/multi-core systems.

I'm not sure why [*], because this should be preventing it..

	if (num_online_cpus() != 1) {
		printk(KERN_ERR PFX "MP systems not supported by PSB BIOS structure\n");
		kfree(data);
		return -ENODEV;
	}

num_online_cpus will return 2 in a dual-core system, even though there's just one socket. Given they share a power plane, if there's a valid PSB structure however, it may be usable. Though this isn't necessarily true for all future dual-core AMD CPUs, and the ACPI tables really should be preferred.

Dave

[*] unless you have the second core disabled or CONFIG_SMP=n

--
http://www.codemonkey.org.uk
Re: [RFC] select and dependencies in Kconfig
Timur Tabi wrote:
> For example, if I want to add a new driver C that uses library B, I can
> just add this:
>
> C
> 	select B
>
> If I have to use "depends on", then I would have to change the Kconfig
> option for B like this:
>
> B
> 	depends on A || C

You mean, "B... serves A, C". However, it shouldn't matter which way around the dependencies are written down in the Kconfigs. What does matter is how "make {old,menu,...}config" deal with it.

--
Stefan Richter
-=-=-=== -=-= =---=
http://arcgraph.de/sr/
Re: [PATCH 1/2] scalable rw_mutex
On Wed, 16 May 2007 16:40:59 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Wed, 16 May 2007, Andrew Morton wrote:
>
> > (I hope. Might have race windows in which the percpu_counter_sum() count is
> > inaccurate?)
>
> The question is how do these race windows affect the locking scheme?

The race to which I refer here is if another CPU is running percpu_counter_sum() in the window between the clearing of the bit in cpu_online_map and the CPU_DEAD callout. Maybe that's too small to care about in the short-term, dunno.

Officially we should fix that by taking lock_cpu_hotplug() in percpu_counter_sum(), but I hate that thing.

I was thinking of putting a cpumask into the counter. If we do that then there's no race at all: everything happens under fbc->lock. This would be a preferable fix, if we need to fix it.

But I'd prefer that freezer-based cpu-hotplug comes along and saves us again.

umm, actually, we can fix the race by using CPU_DOWN_PREPARE instead of CPU_DEAD. Because it's OK if percpu_counter_sum() looks at a gone-away CPU's slot.
Resending: RT patches expose netdev race [was Re: [RFC] [patch 2/2] powerpc 2.6.21-rt1: fix kernel hang and/or panic
(resending, Owa-san was cut from cc list!??)

Hi,

On Tue, May 15, 2007 at 08:09:02PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2007-05-15 at 17:47 +0900, Tsutomu OWA wrote:
> > I encountered the following error when doing netperf from other machine
> > to Celleb running RT kernel. PREEMPT_NONE kernel works just fine as well.
>
> Hrm... sounds a bit weird. I wonder if there's a locking bug in the
> driver in the first place.
>
> Linas, what's your take ?

Heh. I almost deleted the entire email thread cause it didn't say "spidernet" in the subject line. :-) Seriously, I really almost did.

Since this is a long email, let me put a summary up front: I think the RT/preemption patches are exposing some sort of race in the ip header handling code. The rest of the note is forensics pointing to this.

Reading the patch, it looks like all it did was to move around the locks, without changing the semantics. Two comments about that:

-- The current spidernet locks are very fine-grained; this makes the whole thing function more smoothly. The patch would make them coarse-grained, I don't like that.

-- Moving around locks like that changes the timing completely, and changing the timing makes races come and go. The races seem to vanish, but that's only cause you are getting lucky.

Since I'm sick-n-tired of dealing with spidernet, I thought I'd give this one a little extra attention. The crash is a null pointer deref. The spidernet doesn't use locks to protect null pointers. The spidernet mostly doesn't play with pointers at all; they're mostly static. So this crash is "unusual" from the get-go.

>> Instruction dump:
>> 6000 81790088 901f000c 913f0018 913f0008 917f0004 48132e8d
>> 6000
>> a019009e 2f800800 409e0038 e9390038 <88690009> 2f830006 419e0010
>> 2f830011

The crashing instruction is <88690009> which is very unique:

	lbz r3,9(r9)

load byte ... at an offset of 9 bytes!? spidernet does nothing with bytes, so it's another reason it's not spidernet.
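The hand decode of <88690009> can be checked mechanically. A minimal sketch, assuming the standard PowerPC D-form layout (6-bit primary opcode, 5-bit RT, 5-bit RA, 16-bit signed displacement); the struct and function names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a PowerPC D-form instruction word. */
struct dform {
	unsigned int opcode;	/* primary opcode, top 6 bits */
	unsigned int rt;	/* target register */
	unsigned int ra;	/* base register */
	int16_t disp;		/* signed 16-bit displacement */
};

/* Split a 32-bit instruction word into its D-form fields. */
struct dform decode_dform(uint32_t insn)
{
	struct dform d;

	d.opcode = insn >> 26;
	d.rt = (insn >> 21) & 0x1f;
	d.ra = (insn >> 16) & 0x1f;
	d.disp = (int16_t)(insn & 0xffff);
	return d;
}
```

For 0x88690009 this gives primary opcode 34 (lbz), RT=3, RA=9 and displacement 9, i.e. lbz r3,9(r9). If r9 held zero at that point, the access would land at address 0x9, which would match the faulting data address in the oops.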
Below follows a manual disassembly. The guilty party appears to be the skb, and specifically, skb->head has not been set. You'll have to read the details below to see why.

I do not know why sk_buff->head would be null, or would be set in a racy kind of way, or why the rt patches would cause this. But the evidence implicates that.

--linas

Long stuff below. For the record:

> > Unable to handle kernel paging request for data at address 0x0009
> > Faulting instruction address: 0xc0295434
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > PREEMPT SMP NR_CPUS=2 NUMA
> > Modules linked in:
> > NIP: C0295434 LR: C0295420 CTR:
> > REGS: c95d6e30 TRAP: 0300 Not tainted (2.6.21-rc5-rt7)
> > MSR: 80009032 CR: 24000482 XER: 2000
> > DAR: 0009, DSISR: 4000
> > TASK = c1e7c440[626] 'netserver' THREAD: c95d4000 CPU: 0
> > GPR00: 0800 C95D70B0 C05D77B8 0001
> > GPR04: 0001 C95D7080
> > GPR08: C95D7030 C95D7040
> > GPR12: FC69925300080D5D C04DE680 00422208
> > GPR16: 0040 00420D10 C95D7C88
> > GPR20: C1E7C440 0001 C8ACEAE0
> > GPR24: 0020 C0E50C80 81F84C5E C1C00BE0
> > GPR28: C1C05430 C1C00B80 C0570F30 C1FD1720
> > NIP [C0295434] .spider_net_xmit+0x1dc/0x448
> > LR [C0295420] .spider_net_xmit+0x1c8/0x448
> > Call Trace:
> > [C95D70B0] [C0295420] .spider_net_xmit+0x1c8/0x448
> > (unreliable)
> > [C95D7160] [C0327EE8] .dev_hard_start_xmit+0x238/0x300
> > [C95D7200] [C033A7F4] .__qdisc_run+0xdc/0x2a4
> > [C95D72B0] [C032A948] .dev_queue_xmit+0x1b0/0x2fc
> > [C95D7350] [C034B470] .ip_output+0x280/0x2d8
> > [C95D73F0] [C034C6CC] .ip_queue_xmit+0x448/0x4d8
> > [C95D74F0] [C035F6D8] .tcp_transmit_skb+0x850/0x8c0
> > [C95D75C0] [C035C394] .__tcp_ack_snd_check+0x84/0xc0
> > [C95D7650] [C035E114] .tcp_rcv_established+0x4f0/0x8ac
> > [C95D7700] [C0365B24] .tcp_v4_do_rcv+0x5c/0x448
> > [C95D77D0] [C031C2C4] .release_sock+0x94/0x11c
> > [C95D7870] [C0354E7C] .tcp_recvmsg+0x374/0x8d8
> > [C95D7960] [C031B8A0] .sock_common_recvmsg+0x5c/0x84
> > [C95D79F0] [C031921C] .sock_recvmsg+0x110/0x15c
> > [C95D7C00] [C031AA50] .sys_recvfrom+0xf0/0x174
> > [C95D7D90] [C0339368] .compat_sys_socketcall+0x178/0x214
> > [C95D7E30] [C0008634] syscall_exit+0x0/0x40
> > Instruction dump:
> > 6000 81790088 901f000c 913f0018
Re: [PATCH] powernow-k8: depend on acpi-processor for SMP systems
I don't think this is quite right either as Ed Sweetman has reported that this issue doesn't occur on single socket/multi-core systems.

-J

--
On Thu, May 17, 2007 at 12:50:50AM +0100, Daniel Drake wrote:
> powernow-k8 uses PSB BIOS tables to read frequency info on UP systems, but
> on SMP it requires the acpi-processor driver. Kconfig should be updated
> accordingly to avoid the issues that users are running into.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=8075
> https://bugs.gentoo.org/show_bug.cgi?id=178585
>
> Signed-off-by: Daniel Drake <[EMAIL PROTECTED]>
>
> Index: linux/arch/i386/kernel/cpu/cpufreq/Kconfig
> ===
> --- linux.orig/arch/i386/kernel/cpu/cpufreq/Kconfig
> +++ linux/arch/i386/kernel/cpu/cpufreq/Kconfig
> @@ -81,6 +81,7 @@ config X86_POWERNOW_K7_ACPI
> config X86_POWERNOW_K8
> 	tristate "AMD Opteron/Athlon64 PowerNow!"
> 	select CPU_FREQ_TABLE
> +	select ACPI_PROCESSOR if SMP
> 	depends on EXPERIMENTAL
> 	help
> 	  This adds the CPUFreq driver for mobile AMD Opteron/Athlon64
> processors.
Re: [RFC] select and dependencies in Kconfig
Timur Tabi wrote:
> Stefan Richter wrote:
>> "A... select B" is just a flavor of "A... depends on B", ...
> I think you mean "A... select B" is just a flavor of "B... depends on
> A".

No, A requires B's symbols.

--
Stefan Richter
-=-=-=== -=-= =---=
http://arcgraph.de/sr/
2.6.22-rc1-mm1: strange GPF when panicing under kvm
When I boot 2.6.22-rc1-mm1 under kvm, but forget to specify a root filesystem, it panics as expected. However, when panicking, it gets a GPF in delay_tsc, and then starts recursively panicking.

I don't really understand what's going on; the instruction it's faulting on seems to be "pause" (ie, rep;nop), which seems like it shouldn't fault at all. It looks like some kvm artifact to me, but I'm not sure. Hm, given the error code, maybe it's a segment register problem.

VFS: Cannot open root device "" or unknown-block(254,0)
Please append a correct "root=" boot option; here are the available partitions:
0300 4194304 hda driver: ide-disk
0301 4192933 hda1
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0)
general protection fault: fffa [#1]
PREEMPT SMP
Modules linked in:
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 0297 (2.6.22-rc1-mm1-paravirt #1391)
EIP is at delay_tsc+0x20/0x42
eax: 00025431 ebx: ecx: edx: 0002
esi: 55c34bd8 edi: ebp: c1421e70 esp: c1421e5c
ds: 007b es: 007b fs: 00d8 gs: ss: 0068
Process swapper (pid: 1, ti=c142 task=c141f4f0 task.ti=c142)
Stack: 001c096b 55c0f7a7 0003 c1421e80 c021dc90 0001 c1421e90 c021dcb9 0003 c1421ec0 c01298f3 c0426f0e c051bd60 c1421ec0 c012a168 c041f0a4 c1421ecc c0430656 c1421ecc c1421ef4
Call Trace:
[] show_trace_log_lvl+0x1a/0x30
[] show_stack_log_lvl+0x9d/0xac
[] show_registers+0x1f7/0x336
[] die+0x119/0x21b
[] do_general_protection+0x1bf/0x1c7
[] error_code+0x72/0x78
[] __delay+0xc/0xe
[] __const_udelay+0x27/0x29
[] panic+0xf8/0x101
[] mount_block_root+0x221/0x236
[] mount_root+0x59/0x5f
[] prepare_namespace+0x102/0x149
[] kernel_init+0x2bf/0x2ce
[] kernel_thread_helper+0x7/0x10
===
INFO: lockdep is turned off.
Code: e2 8d 42 01 e8 cb ff ff ff c9 c3 55 89 e5 57 56 53 83 ec 08 89 45 ec 0f 31 8d 74 26 00 b9 00 00 00 00 89 c6 89 c8 09 f0 89 45 f0 90 0f 31 8d 74 26 00 b9 00 00 00 00 89 c6 89 c8 09 f0 2b 45
EIP: [] delay_tsc+0x20/0x42 SS:ESP 0068:c1421e5c
general protection fault: fffa [#2]
PREEMPT SMP
Modules linked in:
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 0282 (2.6.22-rc1-mm1-paravirt #1391)
EIP is at _spin_unlock_irqrestore+0x44/0x6d
eax: 0282 ebx: c048cd80 ecx: c01096f7 edx: 0001
esi: 0282 edi: 0068 ebp: c1421dbc esp: c1421db4
ds: 007b es: 007b fs: 00d8 gs: ss: 0068
Process swapper (pid: 1, ti=c142 task=c141f4f0 task.ti=c142)
Stack: c1421e24 c1421e5c c1421dec c01096f7 c0420228 0068 c1421e5c 0001 c048ecd4 c1421e24 0282 c1421e24 55c34bd8 fffa c1421e1c c037b992 fffa 000d 000b c011a748 0001 c11c2000 c141f6c0
Call Trace:
[] show_trace_log_lvl+0x1a/0x30
[] show_stack_log_lvl+0x9d/0xac
[] show_registers+0x1f7/0x336
[] die+0x119/0x21b
[] do_general_protection+0x1bf/0x1c7
[] error_code+0x72/0x78
[] die+0x188/0x21b
[] do_general_protection+0x1bf/0x1c7
[] error_code+0x72/0x78
[] __delay+0xc/0xe
[] __const_udelay+0x27/0x29
[] panic+0xf8/0x101
[] mount_block_root+0x221/0x236
[] mount_root+0x59/0x5f
[] prepare_namespace+0x102/0x149
[] kernel_init+0x2bf/0x2ce
[] kernel_thread_helper+0x7/0x10
===
INFO: lockdep is turned off.
Code: 89 d8 e8 7a 4f ea ff f7 c6 00 02 00 00 75 13 89 f0 50 9d 90 8d b4 26 00 00 00 00 e8 0d 9f dc ff eb 11 e8 b0 b4 dc ff 89 f0 50 9d <90> 8d b4 26 00 00 00 00 b8 01 00 00 00 e8 c9 99 da ff 89 e0 25
EIP: [] _spin_unlock_irqrestore+0x44/0x6d SS:ESP 0068:c1421db4
general protection fault: fffa [#3]

J
2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64
Hi,

my small VIA C3_2 box does not boot with 2.6.22-rc1.
It even does not uncompress the kernel.

The configuration as M386 M486 works. But M586 + MVIAC3_2 does not work.

solution for me, change arch/i386/Kconfig.cpu

--- arch/i386/Kconfig.cpu.before 2007-05-17 01:38:26.0 +0200
+++ arch/i386/Kconfig.cpu 2007-05-17 00:54:52.0 +0200
@@ -299,5 +299,5 @@

 config X86_CMPXCHG64
 	bool
-	depends on !M386 && !M486
+	depends on !M386 && !M486 && !MVIAC3_2
 	default y

The related #ifdef is in ./include/asm-i386/cmpxchg.h
May be cmpxchg8b is not supported by VIAC3_2 ?

May be some other non Intel/AMD need to be excluded from X86_CMPXCHG64 ?
May be the generic option CONFIG_X86_GENERIC need to switch this off also ?

Just write an email to me if you want to send a patch to test on a C3_2.

Best regards,
Christian
Re: [PATCH] powernow-k8: depend on acpi-processor for SMP systems
On Thu, May 17, 2007 at 12:50:50AM +0100, Daniel Drake wrote:
> powernow-k8 uses PSB BIOS tables to read frequency info on UP systems, but
> on SMP it requires the acpi-processor driver. Kconfig should be updated
> accordingly to avoid the issues that users are running into.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=8075
> https://bugs.gentoo.org/show_bug.cgi?id=178585

looks ok to me, but I'd like someone who has been seeing problems to confirm this works first.

Dave

--
http://www.codemonkey.org.uk
Re: [2/3] 2.6.22-rc1: known regressions v2 - XFS
On Wed, May 16, 2007 at 04:40:20PM -0700, Jeremy Fitzhardinge wrote:
> David Chinner wrote:
> > Jeremy has tentatively indicated that the patch has fixed the problem.
> > Have you seen any more problems since applying the patch, Jeremy?
> >
>
> No, it continues to seem sound with casual use; I would have expected to
> see the problem reoccur by now. I'd like to rerun the full set of tests
> I did before to be sure, but so far so good. No other apparent
> regressions either.

Good to hear. I think the problem is fixed, then.

> Also, the match between the observed symptoms and the bugfix is very
> good, which adds confidence (ie, no element of "it works now but we
> don't know why"). I guess the only remaining concern is whether there
> are any other paths which fail to dirty the inode.

There aren't any that I can see - if more come up we'll deal with them then.

> Did you manage to repro the problem?

xfs_io is my friend ;)

Without patch:

# touch /mnt/scratch/fred
# xfs_io -c "pwrite 0 5" -c "s" -c "pwrite 5 5" /mnt/scratch/fred
wrote 5/5 bytes at offset 0
5.00 bytes, 1 ops; 0. sec (78.755 KiB/sec and 16129.0323 ops/sec)
wrote 5/5 bytes at offset 5
5.00 bytes, 1 ops; 0. sec (542.535 KiB/sec and 11. ops/sec)
# umount /mnt/scratch; mount /mnt/scratch; ls -l /mnt/scratch/fred
-rw-r--r-- 1 root root 5 May 17 10:04 fred
#

So the second 5 byte write didn't change the file size. With patch:

# touch /mnt/scratch/fred
# xfs_io -c "pwrite 0 5" -c "s" -c "pwrite 5 5" /mnt/scratch/fred
wrote 5/5 bytes at offset 0
5.00 bytes, 1 ops; 0. sec (76 KiB/sec and 15625. ops/sec)
wrote 5/5 bytes at offset 5
5.00 bytes, 1 ops; 0. sec (610 KiB/sec and 125000. ops/sec)
# umount /mnt/scratch; mount /mnt/scratch; ls -l /mnt/scratch/fred
-rw-r--r-- 1 root root 10 May 17 09:53 fred
#

So yes, I've reproduced it and confirmed the patch fixes the problem.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
[PATCH] powernow-k8: depend on acpi-processor for SMP systems
powernow-k8 uses PSB BIOS tables to read frequency info on UP systems, but on SMP it requires the acpi-processor driver. Kconfig should be updated accordingly to avoid the issues that users are running into.

http://bugzilla.kernel.org/show_bug.cgi?id=8075
https://bugs.gentoo.org/show_bug.cgi?id=178585

Signed-off-by: Daniel Drake <[EMAIL PROTECTED]>

Index: linux/arch/i386/kernel/cpu/cpufreq/Kconfig
===
--- linux.orig/arch/i386/kernel/cpu/cpufreq/Kconfig
+++ linux/arch/i386/kernel/cpu/cpufreq/Kconfig
@@ -81,6 +81,7 @@ config X86_POWERNOW_K7_ACPI
 config X86_POWERNOW_K8
 	tristate "AMD Opteron/Athlon64 PowerNow!"
 	select CPU_FREQ_TABLE
+	select ACPI_PROCESSOR if SMP
 	depends on EXPERIMENTAL
 	help
 	  This adds the CPUFreq driver for mobile AMD Opteron/Athlon64 processors.
Re: Weird hard disk noise on shutdown (bug #7674)
On Wednesday 16 May 2007 9:49 am, Francesco Pretto wrote:
> 2007/5/16, Stephen Clark <[EMAIL PROTECTED]>:
> > >On Tuesday 15 May 2007 5:08 pm, Dave Jones wrote:
> > >
> > >I'm confused. Could someone please explain?
> > >
> > I agree. This didn't happen when I was just using the ide driver, why
> > can't libata work as well
> > as the old ide driver.
> >
>
> Read my reply to that post. To summarize: libata, prior to 2.6.22rc1,
> lacked the feature to spindown the hard disk. The last discussion was
> about who's responsible to issue the STANDBYNOW command to the hard
> disk. Response from the discussion is: the kernel. Trying to issue it
> from userspace (iff your shutdown(8) implementation does so) will now
> result in a big fat warning, until these compatibility measures will
> be dropped from the kernel (sooner or later).

The last bit was what threw me. It seemed that the kernel was changed to do the right thing, but only as a compatibility measure that would be dropped because userspace should be changed to start doing it (which seemed crazy).

It seems that the _warning_ is the compatibility measure that will be dropped (or perhaps the ability for userspace to do the wrong thing at all?), and the kernel will continue to DTRT.

It's a bit confusing for those of us coming in late in the discussion. :)

Rob
Re: Weird hard disk noise on shutdown (bug #7674)
On Wednesday 16 May 2007 7:41 am, Tejun Heo wrote:
> Hello,
>
> Rob Landley wrote:
> > Um, hang on. So libata can't reliably turn the system off without data loss
> > and potential damage to hardware unless userspace goes through a special song
> > and dance? And this is _not_ considered a defect in the kernel?
>
> Yeap, definitely a bug in the kernel and we're trying to fix it. Just
> for the record, we have _always_ issued FLUSH CACHE, so there hasn't
> been and won't be any data loss problem. The data loss problem was
> mentioned as why we can't do things completely inside kernel without
> updating userland shutdown(8) which issues STANDBYNOW.

Ok, so the change is to get shutdown to _stop_ doing something stupid (spinning down the disk without first flushing the cache), and the correct thing for shutdown to do is keep its mitts off the thing and let the kernel power down the darn hardware?

Woot,

Rob
Re: Weird hard disk noise on shutdown (bug #7674)
On Wednesday 16 May 2007 5:15 am, Francesco Pretto wrote:
> - everyone else:
> // continue to do nothing :-)
> reboot();

That would be cool, but the impression I got from http://linux-ata.org/shutdown.html was that shutdown commands were supposed to _add_ quiescing of drives in order to avoid emergency head parking on poweroff. That article says:

> Distros should update their shutdown(8) to do the followings.
> Check whether /sys/modules/libata/parameters/spindown_compat exists. If it
> does, write 0 to it. For each libata harddisk
> Check whether /sys/class/scsi_disk/h:c:i:l/manage_start_stop exists. If it
> doesn't, synchronize cache and spin the disk down as before. If it does, do
> nothing and continue to the next disk. The file is also accessible as
> /sys/block/sdX/device/scsi_disk:*/manage_start_stop.

You're saying all this is to work around kernels _before_ 2.6.22, and instead of updating your shutdown you could just update the kernel instead?

> If exists some, at this point, exotic shutdown(8) implementation that
> is trying to issue STANDBYNOW on libata/scsi devices, it will get a
> fat warning. The warning could also state that the only supported way
> is to leave complete responsibility of spinning down the hard disks to
> kernel, so eventually it could be cleaned of checks and compatibility
> options.

I'm all for leaving this to the kernel. I play in the embedded space a lot, so the less I can get away with doing, the better. :)

Rob
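For what it's worth, the two checks quoted from that article reduce to "does this sysfs file exist?". Below is a rough sketch of just that decision logic, assuming the caller passes in the paths named in the quote; the function names are hypothetical, and the actual cache flush and spin-down commands are deliberately left out.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the file exists, 0 otherwise. */
int path_exists(const char *path)
{
	return access(path, F_OK) == 0;
}

/*
 * Step 1: if the compat knob exists, write "0" to it.
 * Returns 1 if written, 0 if the knob is absent, -1 on a write error.
 */
int disable_spindown_compat(const char *knob_path)
{
	FILE *f;

	if (!path_exists(knob_path))
		return 0;
	f = fopen(knob_path, "w");
	if (!f)
		return -1;
	if (fputs("0", f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 1 : -1;
}

/*
 * Step 2, per disk: userspace spins the disk down itself only when the
 * kernel does not offer manage_start_stop for it.
 */
int userspace_must_spin_down(const char *manage_start_stop_path)
{
	return !path_exists(manage_start_stop_path);
}
```

On a kernel that has neither file, step 1 is a no-op and step 2 says "spin down as before", which is the old behaviour; on a kernel that has both, shutdown ends up doing nothing per disk.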
Re: [patch] CFS scheduler, -v12
Ingo Molnar wrote:
> * Peter Williams <[EMAIL PROTECTED]> wrote:
>
>>> As usual, any sort of feedback, bugreport, fix and suggestion is more than welcome,
>> Load balancing appears to be badly broken in this version. When I started 4 hard spinners on my 2 CPU machine one ended up on one CPU and the other 3 on the other CPU and they stayed there.
>
> hm, i cannot reproduce this on 4 different SMP boxen, trying various combinations of SCHED_SMT/MC

You may need to try more than once. Testing load balancing can be a pain as there's always a possibility you'll get a good result just by chance. I.e. you need a bunch of good results to say it's OK but only one bad result to say it's broken. This makes testing load balancing a pain.

> and other .config options that might make a difference to balancing. Could you send me your .config?

Sent separately.

Peter
--
Peter Williams [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
Re: [PATCH] recalc_sigpending_tsk fixes
On 05/16, Roland McGrath wrote:
>
> + * After recalculating TIF_SIGPENDING, we need to make sure the task wakes up.
> + * This is superfluous when called on current, the wakeup is a harmless no-op.
> + */
> +void recalc_sigpending_and_wake(struct task_struct *t)
> +{
> +	if (recalc_sigpending_tsk(t))
> +		signal_wake_up(t, 0);
> }

We already discussed this, this is not so important, but how about

	void recalc_sigpending_and_wake(struct task_struct *t)
	{
		int was_pending = signal_pending(t);

		if (recalc_sigpending_tsk(t) && !was_pending)
			signal_wake_up(t, 0);
	}

? This "was_pending" is more a documentation than an optimization.

Oleg.
Re: [PATCH 1/2] scalable rw_mutex
On Wed, 16 May 2007, Andrew Morton wrote:

> (I hope. Might have race windows in which the percpu_counter_sum() count is
> inaccurate?)

The question is how do these race windows affect the locking scheme?
Re: [PATCH 1/5][TAKE3] fallocate() implementation on i86, x86_64 and powerpc
On Wed, May 16, 2007 at 07:21:16AM -0500, Dave Kleikamp wrote: > On Wed, 2007-05-16 at 13:16 +1000, David Chinner wrote: > > On Wed, May 16, 2007 at 01:33:59AM +0530, Amit K. Arora wrote: > > > > Following changes were made to the previous version: > > > 1) Added description before sys_fallocate() definition. > > > 2) Return EINVAL for len<=0 (With new draft that Ulrich pointed to, > > > posix_fallocate should return EINVAL for len <= 0. > > > 3) Return EOPNOTSUPP if mode is not one of FA_ALLOCATE or FA_DEALLOCATE > > > 4) Do not return ENODEV for dirs (let individual file systems decide if > > > they want to support preallocation to directories or not. > > > 5) Check for wrap through zero. > > > 6) Update c/mtime if fallocate() succeeds. > > > > Please don't make this always happen. c/mtime updates should be dependent > > on the mode being used and whether there is visible change to the file. If > > no > > userspace visible changes to the file occurred, then timestamps should not > > be changed. > > i_blocks will be updated, so it seems reasonable to update ctime. mtime > shouldn't be changed, though, since the contents of the file will be > unchanged. That's assuming blocks were actually allocated - if the prealloc range already has underlying blocks there is no change and so we should not be changing mtime either. Only the filesystem will know if it has changed the file, so I think that timestamp updates need to be driven down to that level, not done blindly at the highest layer. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
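One reading of the timestamp policy discussed above can be sketched in userspace C. This is an illustration only: struct toy_inode, toy_fs_preallocate() and TOY_NBLOCKS are hypothetical names, and the rule encoded here (bump ctime only when blocks were really added, never touch mtime for a pure preallocation) is an assumption drawn from the thread, not the behaviour of any particular filesystem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBLOCKS 16 /* hypothetical file size limit, in blocks */

/* Hypothetical stand-in for an inode with a per-block allocation map. */
struct toy_inode {
    bool allocated[TOY_NBLOCKS];
    unsigned long i_blocks; /* number of allocated blocks */
    unsigned long ctime;    /* inode change time */
    unsigned long mtime;    /* data modification time */
};

/*
 * Filesystem-level preallocation: the filesystem itself decides whether
 * anything changed, and only then updates ctime. mtime is never touched
 * because the visible file contents stay the same.
 * Returns the number of newly allocated blocks, or -1 for a bad range.
 */
long toy_fs_preallocate(struct toy_inode *inode, size_t first,
                        size_t count, unsigned long now)
{
    long newly_allocated = 0;

    assert(inode != NULL);
    if (count == 0 || first >= TOY_NBLOCKS || count > TOY_NBLOCKS - first)
        return -1;

    for (size_t i = first; i < first + count; i++) {
        if (!inode->allocated[i]) {
            inode->allocated[i] = true;
            newly_allocated++;
        }
    }

    if (newly_allocated > 0) {
        inode->i_blocks += (unsigned long)newly_allocated;
        inode->ctime = now; /* i_blocks changed, so the inode changed */
    }
    return newly_allocated;
}
```

Preallocating a range that is already fully backed returns 0 in this model and leaves both timestamps alone, which a generic caller could not know without asking the filesystem.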
Re: [2/3] 2.6.22-rc1: known regressions v2 - XFS
David Chinner wrote: > Jeremy has tentatively indicated that the patch has fixed the problem. > Have you seen any more problems since applying the patch, Jeremy? > No, it continues to seem sound with casual use; I would have expected to see the problem reoccur by now. I'd like to rerun the full set of tests I did before to be sure, but so far so good. No other apparent regressions either. Also, the match between the observed symptoms and the bugfix is very good, which adds confidence (ie, no element of "it works now but we don't know why"). I guess the only remaining concern is whether there are any other paths which fail to dirty the inode. Did you manage to repro the problem? J
Re: [PATCH] Kconfig powernow-k8 driver should depend on ACPI P-States driver
Daniel Drake wrote: Ed Sweetman wrote: Like i mentioned off list, the problem here is that cpu freq modules dont depend (Kconfig) on CONFIG_ACPI_PROCESSOR, yet they do. Not really. Firstly, some of the cpufreq modules *do* depend on CONFIG_ACPI_PROCESSOR. Secondly, the ones that don't have an existing dependency do not actually depend on ACPI_PROCESSOR in some/most configurations. I'll send in a patch to fix the real problem soon. Daniel The way i patched it was to just include a "select ACPI_PROCESSOR" in the X86_POWERNOW_K8 Kconfig entry. Since the "driver" that the user sees in the Kconfig, X86_POWERNOW_K8, is actually not a driver at all, our actual driver behaves differently, since the "pseudo" driver only depends on CPUFREQ. This is misleading, as the actual driver beneath it depends on ACPI_PROCESSOR too. Now the driver beneath it responds as you'd think. It disables itself when its depends lines are invalid. Which is good. But the menu entry that we see is for X86_POWERNOW_K8, and that isn't disabled or anything when those depends lines of the driver it actually represents fail. This is easily fixed with the select line in the "pseudo" driver ...which i find a little more appropriate than a depends line. As for actual module dependency issues, i haven't bothered looking into that. As far as the Kconfig shows it shouldn't be allowed to have ACPI_PROCESSOR as a module at all. So maybe that's intended.
Re: 2.6.22-rc1-mm1
Correction, does *this patch* do it for you? -hpa diff --git a/arch/i386/boot/setup.ld b/arch/i386/boot/setup.ld index e9ca0c2..c9c5530 100644 --- a/arch/i386/boot/setup.ld +++ b/arch/i386/boot/setup.ld @@ -44,5 +44,5 @@ SECTIONS /DISCARD/ : { *(.note*) } - ASSERT(_end <= 0x8000, "Setup too big!") + . = ASSERT(_end <= 0x8000, "Setup too big!"); }
Re: [2/3] 2.6.22-rc1: known regressions v2 - XFS
On Wed, May 16, 2007 at 10:31:39PM +0200, Michal Piotrowski wrote: > Hi all, > > Here is a list of some known regressions in 2.6.22-rc1. > > Feel free to add new regressions/remove fixed etc. > http://kernelnewbies.org/known_regressions > > File systems > > Subject: 2.6.21-git10/11: files getting truncated on xfs > References : http://lkml.org/lkml/2007/5/9/410 > Submitter : Jeremy Fitzhardinge <[EMAIL PROTECTED]> > Handled-By : David Chinner <[EMAIL PROTECTED]> > Patch : http://lkml.org/lkml/2007/5/12/93 > Status : patch was suggested Jeremy has tentatively indicated that the patch has fixed the problem. Have you seen any more problems since applying the patch, Jeremy? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
Re: [PATCH 1/2] scalable rw_mutex
On Sat, 12 May 2007 11:06:24 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote: > On 12 May 2007 20:55:28 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote: > > > Andrew Morton <[EMAIL PROTECTED]> writes: > > > > > On Fri, 11 May 2007 10:07:17 -0700 (PDT) > > > Christoph Lameter <[EMAIL PROTECTED]> wrote: > > > > > > > On Fri, 11 May 2007, Andrew Morton wrote: > > > > > > > > > yipes. percpu_counter_sum() is expensive. > > > > > > > > Capable of triggering NMI watchdog on 4096+ processors? > > > > > > Well. That would be a millisecond per cpu which sounds improbable. And > > > we'd need to be calling it under local_irq_save() which we presently > > > don't. > > > And nobody has reported any problems against the existing callsites. > > > > > > But it's no speed demon, that's for sure. > > > > There is one possible optimization for this I did some time ago. You don't > > really > > need to sum all over the possible map, but only all CPUs that were ever > > online. But this only helps on systems where the possible map is bigger > > than online map in the common case. But that shouldn't be the case anymore > > on x86 > > -- it just used to be. If it's true on some other architectures it might > > be still worth it. > > > > hm, yeah. > > We could put a cpumask in percpu_counter, initialise it to > cpu_possible_map. Then, those callsites which have hotplug notifiers can > call into new percpu_counter functions which clear and set bits in that > cpumask and which drain percpu_counter.counts[cpu] into > percpu_counter.count. > > And percpu_counter_sum() gets taught to do for_each_cpu_mask(fbc->cpumask). Like this: From: Andrew Morton <[EMAIL PROTECTED]> per-cpu counters presently must iterate over all possible CPUs in the exhaustive percpu_counter_sum(). But it can be much better to only iterate over the presently-online CPUs. To do this, we must arrange for an offlined CPU's count to be spilled into the counter's central count. 
We can do this for all percpu_counters in the machine by linking them into a single global list and walking that list at CPU_DEAD time. (I hope. Might have race windows in which the percpu_counter_sum() count is inaccurate?) Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- include/linux/percpu_counter.h | 18 ++-- lib/percpu_counter.c | 66 +++ 2 files changed, 72 insertions(+), 12 deletions(-) diff -puN lib/percpu_counter.c~percpu_counters-use-cpu-notifiers lib/percpu_counter.c --- a/lib/percpu_counter.c~percpu_counters-use-cpu-notifiers +++ a/lib/percpu_counter.c @@ -3,8 +3,17 @@ */ #include +#include +#include +#include +#include #include +#ifdef CONFIG_HOTPLUG_CPU +static LIST_HEAD(percpu_counters); +static DEFINE_MUTEX(percpu_counters_lock); +#endif + void percpu_counter_mod(struct percpu_counter *fbc, s32 amount) { long count; @@ -44,3 +53,60 @@ s64 percpu_counter_sum(struct percpu_cou return ret < 0 ? 0 : ret; } EXPORT_SYMBOL(percpu_counter_sum); + +void percpu_counter_init(struct percpu_counter *fbc, s64 amount) +{ + spin_lock_init(&fbc->lock); + fbc->count = amount; + fbc->counters = alloc_percpu(s32); +#ifdef CONFIG_HOTPLUG_CPU + mutex_lock(&percpu_counters_lock); + list_add(&fbc->list, &percpu_counters); + mutex_unlock(&percpu_counters_lock); +#endif +} +EXPORT_SYMBOL(percpu_counter_init); + +void percpu_counter_destroy(struct percpu_counter *fbc) +{ + free_percpu(fbc->counters); +#ifdef CONFIG_HOTPLUG_CPU + mutex_lock(&percpu_counters_lock); + list_del(&fbc->list); + mutex_unlock(&percpu_counters_lock); +#endif +} +EXPORT_SYMBOL(percpu_counter_destroy); + +#ifdef CONFIG_HOTPLUG_CPU +static int __cpuinit percpu_counter_hotcpu_callback(struct notifier_block *nb, + unsigned long action, void *hcpu) +{ + unsigned int cpu; + struct percpu_counter *fbc; + + if (action != CPU_DEAD) + return NOTIFY_OK; + + cpu = (unsigned long)hcpu; + mutex_lock(&percpu_counters_lock); + list_for_each_entry(fbc, &percpu_counters, list) { + s32 *pcount; + + spin_lock(&fbc->lock); + pcount = per_cpu_ptr(fbc->counters, cpu); + fbc->count += 
*pcount; + *pcount = 0; + spin_unlock(&fbc->lock); + } + mutex_unlock(&percpu_counters_lock); + return NOTIFY_OK; +} + +static int __init percpu_counter_startup(void) +{ + hotcpu_notifier(percpu_counter_hotcpu_callback, 0); + return 0; +} +module_init(percpu_counter_startup); +#endif diff -puN include/linux/percpu.h~percpu_counters-use-cpu-notifiers include/linux/percpu.h diff -puN include/linux/percpu_counter.h~percpu_counters-use-cpu-notifiers include/linux/percpu_counter.h --- a/include/linux/percpu_counter.h~percpu_counters-use-cpu-notifiers +++ a/include/linux/percpu_counter.h @@ -8,6 +8,7 @@ #include #include +#include #include #include #include @@ -17,6 +18,9 @@
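The idea in the patch above (spill a dead CPU's per-CPU delta into the central count so that an exact sum only needs to visit online CPUs) can be tried out in a small single-threaded userspace model. Everything here is hypothetical: struct toy_percpu_counter, the toy_counter_* functions and TOY_NR_CPUS are illustration names, and the locking, the global list of counters and the hotplug notifier from the real patch are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define TOY_NR_CPUS 4 /* hypothetical number of possible CPUs */

/* Hypothetical single-threaded model of a per-CPU counter. */
struct toy_percpu_counter {
    int64_t count;                 /* central count */
    int32_t counters[TOY_NR_CPUS]; /* per-CPU deltas */
    bool online[TOY_NR_CPUS];      /* models the online CPU mask */
};

/* Start with all CPUs online and all per-CPU deltas at zero. */
void toy_counter_init(struct toy_percpu_counter *fbc, int64_t amount)
{
    assert(fbc != NULL);
    fbc->count = amount;
    for (size_t cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        fbc->counters[cpu] = 0;
        fbc->online[cpu] = true;
    }
}

/* Add a delta on behalf of one (online) CPU. */
void toy_counter_mod(struct toy_percpu_counter *fbc, size_t cpu,
                     int32_t amount)
{
    assert(cpu < TOY_NR_CPUS && fbc->online[cpu]);
    fbc->counters[cpu] += amount;
}

/* Models the CPU_DEAD step: drain the dead CPU's delta into the centre. */
void toy_counter_cpu_dead(struct toy_percpu_counter *fbc, size_t cpu)
{
    assert(cpu < TOY_NR_CPUS);
    fbc->count += fbc->counters[cpu];
    fbc->counters[cpu] = 0;
    fbc->online[cpu] = false;
}

/* Exact sum that visits online CPUs only; clamps at zero like the original. */
int64_t toy_counter_sum(const struct toy_percpu_counter *fbc)
{
    int64_t ret = fbc->count;

    for (size_t cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (fbc->online[cpu])
            ret += fbc->counters[cpu];
    }
    return ret < 0 ? 0 : ret;
}
```

Because the delta is moved rather than dropped, the sum is the same before and after a CPU goes away in this model. The race windows raised in the thread only appear once updates, the drain and the sum run concurrently, which this sketch does not attempt to show.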
[PATCH] spi: potential memleak in spidev_ioctl
'ioc' should be deallocated if __copy_from_user fails (found by Coverity - CID 1644). Signed-off-by: Florin Malita <[EMAIL PROTECTED]> --- spidev.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c index c0a6dce..2464f34 100644 --- a/drivers/spi/spidev.c +++ b/drivers/spi/spidev.c @@ -364,6 +364,7 @@ spidev_ioctl(struct inode *inode, struct file *filp, break; } if (__copy_from_user(ioc, (void __user *)arg, tmp)) { + kfree(ioc); retval = -EFAULT; break; }
Re: 2.6.22-rc1-mm1
Mel Gorman wrote: > LD arch/i386/boot/setup.elf > ld:arch/i386/boot/setup.ld:47: syntax error Does this patch fix it for you? -hpa diff --git a/arch/i386/boot/setup.ld b/arch/i386/boot/setup.ld index e9ca0c2..c9c5530 100644 --- a/arch/i386/boot/setup.ld +++ b/arch/i386/boot/setup.ld @@ -44,5 +44,5 @@ SECTIONS /DISCARD/ : { *(.note*) } - ASSERT(_end <= 0x8000, "Setup too big!") + . = ASSERT(_end <= 0x8000, "Setup too big!") }
Re: v2.6.21-rt2
Hi, |--==> Ingo Molnar writes: IM> i'm pleased to announce the v2.6.20-rt2 kernel, which can be downloaded IM> from the usual place: IM> http://redhat.com/~mingo/realtime-preempt/ This new version of the patch solves the amd64/udev bug I reported against previous releases: http://www.mail-archive.com/[EMAIL PROTECTED]/msg00353.html I'm going to test the patch on other amd64 machines as well, thanks a lot and keep up the good work! Ciao, Free
Re: [RFC] select and dependencies in Kconfig
Stefan Richter wrote: "A... select B" is just a flavor of "A... depends on B", with the additional instruction to the Kconfig UIs: Don't hide A if you can silently switch on B. I think you mean "A... select B" is just a flavor of "B... depends on A". There is one minor difference between the two. If A is a driver and B is a library, then it's more intuitive to update the Kconfig option for A than it is to update the Kconfig option for B. For example, if I want to add a new driver C that uses library B, I can just add this: C select B If I have to use "depends on", then I would have to change the Kconfig option for B like this: B depends on A || C And every time I create a new driver that depends on library B, I have to update that "depends on" line *in addition to* creating the Kconfig line for the new driver. If 10 drivers use library B, you'll have this: B depends on A || C || D || E || F || G || H || I || J || K How about throwing "select" out of the Kconfig language and improving the UIs instead, so that users find what they want and need? I know a lot of people don't like 'select', but I prefer it over 'depends on'. -- Timur Tabi Linux Kernel Developer @ Freescale
Re: [PATCH 1 of 2] block_page_mkwrite() Implementation V2
On Wed, May 16, 2007 at 11:19:29AM +0100, David Howells wrote: > > However, page_mkwrite() isn't told which bit of the page is going to be > written to. This means it has to ask prepare_write() to make sure the whole > page is filled in. In other words, offset and to must be equal (in AFS I set > them both to 0). The assumption is the page is already up to date and we are writing the whole page unless EOF lands inside the page. AFAICT, we can't get called with a page that is not uptodate and so page filling is not something we should be doing (or want to be doing) here. All we want to do is to be able to change the mapping from a read to a write mapping (e.g. a read mapping of a hole needs to be changed on write) and do the relevant space reservation/allocation and buffer mapping needed for this change. > However, if someone adds a syscall to punch holes in files, this may change... We already have them - ioctl(XFS_IOC_UNRESVSP) and madvise(MADV_REMOVE) - and another - fallocate(FA_DEALLOCATE) - is on its way. Racing with truncates should already be handled by the truncate code (i.e. partial page truncation does the zero filling). /me makes note to implement ->truncate_range() in XFS for MADV_REMOVE. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
bug seen with dynticks from CONFIG_HARDIRQS_SW_RESEND
Hi, In testing we were noticing that we were getting some intermittent crashes in profile_tick() when dyntick was enabled. The crashes were because the frame pointer per_cpuirq_regs value was 0. That code does a user_mode(get_irq_regs()). Currently regs is set only upon real hardware entry on an irq. The crash path shows resend_irqs() could be called within a context where set_irq_regs() was not executed. In one specific case this was from softirq->tasklet_action(resend_tasklet)->resend_irqs->handle_level_irq-> handle_IRQ_event->...->profile_tick. It seems anyone calling kernel/irq/manage.c:enable_irq() at the wrong time can trigger this crash. Creating a fake stack and doing a set_irq_regs() fixes the crash. Would it be useful to set a pointer to the entry context on all state changes? For ease I just hacked a default fake stack into the init process after fork time so there is never a 0 but that doesn't seem so nice. Regards, Richard W.
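The workaround described above (install a fake register frame around a software-resent interrupt so the tick code never sees a null pointer) can be modelled in a few lines of userspace C. All names here are hypothetical stand-ins: toy_pt_regs, toy_set_irq_regs(), toy_profile_tick() and toy_resend_irq() only mimic the save/install/restore pattern and are not the kernel functions themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs; only the mode bit is modelled. */
struct toy_pt_regs {
    int user_mode; /* 1 if the interrupted code ran in user mode */
};

/* Models the per-CPU "current interrupt registers" pointer. */
static struct toy_pt_regs *toy_irq_regs;

/* Install a new frame and hand back the previous one so it can be restored. */
static struct toy_pt_regs *toy_set_irq_regs(struct toy_pt_regs *new_regs)
{
    struct toy_pt_regs *old_regs = toy_irq_regs;

    toy_irq_regs = new_regs;
    return old_regs;
}

/* Read-only accessor for the currently installed frame. */
const struct toy_pt_regs *toy_get_irq_regs(void)
{
    return toy_irq_regs;
}

/*
 * Models the tick handler: it dereferences the saved frame, so a null
 * pointer here corresponds to the crash described in the mail.
 */
static int toy_profile_tick(void)
{
    assert(toy_irq_regs != NULL);
    return toy_irq_regs->user_mode;
}

/*
 * Models the software resend path with the workaround applied: a fake
 * kernel-mode frame is installed for the duration of the handler and the
 * previous value is restored afterwards.
 */
int toy_resend_irq(void)
{
    struct toy_pt_regs fake_regs = { 0 }; /* pretend we interrupted the kernel */
    struct toy_pt_regs *old_regs = toy_set_irq_regs(&fake_regs);
    int was_user = toy_profile_tick();

    toy_set_irq_regs(old_regs);
    return was_user;
}
```

Scoping the fake frame to the resend call, rather than leaving a default one installed forever, is a design choice of this sketch; whether that is preferable in the kernel is the open question the mail asks.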
Re: [PATCH] Kconfig powernow-k8 driver should depend on ACPI P-States driver
Ed Sweetman wrote: Like i mentioned off list, the problem here is that cpu freq modules dont depend (Kconfig) on CONFIG_ACPI_PROCESSOR, yet they do. Not really. Firstly, some of the cpufreq modules *do* depend on CONFIG_ACPI_PROCESSOR. Secondly, the ones that don't have an existing dependency do not actually depend on ACPI_PROCESSOR in some/most configurations. I'll send in a patch to fix the real problem soon. Daniel
Re: 2.6.21-rc7: BUG: sleeping function called from invalid context at net/core/sock.c:1523
On Wed, 16 May 2007, David Miller wrote: > > I have just verified that this locking scheme is indeed correct. So you > > can add > > > > Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]> > > > > if you wish to, and submit the patch to Andrew. > I guess I don't get sent networking patches any more? > :-) Well, this is bluetooth-specific, but it seemed to me that Marcel wasn't going to send pull requests to Linus any time soon, therefore I thought going through akpm is a thing to do. Honestly, I really don't care through which tree this goes in, so sorry if any offence was caused here :) -- Jiri Kosina
Re: Why can't we sleep in an ISR?
OK. I think the gap between you and me is the definition of the term *context*. If you go to Linux Kernel Development, 2nd Edition (ISBN 0-672-32720-1), Page 6, then you will read the following: in Linux, ... each processor is doing one of three things at any given moment: 1. In kernel-space, in process context, ... 2. In kernel-space, in interrupt context, not associated with a process, ... 3. In user-space ... This list is inclusive. ... Maybe you prefer another terminology system, but I do like the above definition given by Robert Love. So maybe in your system *context* means something at hardware level and you say ISR is in process context, but I think it is more like a logical level and agree with Robert's definition. And at hardware level, Robert's *context* definition also means something specific, which I have started to become aware of. That is, *in the same context* means kernel code is triggered by user-space code. *in different context* means kernel code is triggered by an external interrupt source other than user-space code. Context has nothing to do with whether an ISR borrows any data structure of a process; instead, it's something logical or related to causality. 2007/5/16, Phillip Susi <[EMAIL PROTECTED]>: Dong Feng wrote: > If what you say were true, then an ISR would be running in the same > context as the interrupted process. Yes, and it is, as others have said in this thread, which is a good reason why ISRs can't sleep. > But please check any article or > book, it will say ISR running in different context from any process. > So ISR is considered in its own context, although it shares a lot of > things with the interrupted process. I would only say *context* is a > higher-level logical concept. Depends on which book or article you are reading I suppose. 
The generally accepted and often used thought is that ISRs technically are running in the context of the interrupted process, but because that context is unknown and therefore should not be used, it is often said that they run in no context, or outside of any context. Sometimes people then assume that because they run outside of any ( particular ) process context, they must be in their own context, but this is a mistake.