Re: [PATCH 1/8] compacting file_ra_state
On Fri, Jul 20, 2007 at 09:27:01PM -0700, Linus Torvalds wrote:
>
>
> On Sat, 21 Jul 2007, Fengguang Wu wrote:
> >
> > Sorry, forgot to prefix the patch titles with [readahead].
> > Should I repost?
>
> Not for me, but on the other hand, I'd prefer for this to be in -mm a bit,

Haven't the readahead patches already essentially been in -mm* for some time?
I thought the new patches were some restructured code, but essentially the
tested algorithms?

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC, Announce] Unified x86 architecture, arch/x86
> On Saturday 21 July 2007 01:55, Michal Piotrowski wrote:
> >
> > I really like this idea - code duplication is a bad thing.
>
> Did you actually look at the patch? It doesn't have a single line
> less duplication than there was before. Everything that could
> be easily shared was shared already.
>
> It's just new window dressing without any real advantages.

And did you read what tglx wrote? This patch was the beginning of the
merger, not the end result. It strived for binary identical images. It
was to put everything together as a _starting_point_! The next thing to
do after this is to start the merging.

-- Steve
Re: [PATCH 6/7] radixtree: introduce radix_tree_scan_hole()
On Sat, 21 Jul 2007 12:43:06 +0800 Fengguang Wu <[EMAIL PROTECTED]> wrote:

> Introduce radix_tree_scan_hole(root, index, max_scan) to scan radix tree
> for the first hole. It will be used in interleaved readahead.

If you're ever feeling fantastically bored, please consider updating the
userspace radix-tree test harness for this? Cook up a couple of testcases
for the new functionality?

Thanks. http://www.zip.com.au/~akpm/linux/patches/stuff/rtth.tar.gz is the
latest.
Re: [PATCH 2/7] console: fix section mismatch warning in vgacon.c
On Sat, Jul 21, 2007 at 07:37:29AM +0800, Antonino A. Daplas wrote:
> On Fri, 2007-07-20 at 23:27 +0200, Sam Ravnborg wrote:
> > Fix following section mismatch warning:
> > WARNING: vmlinux.o(.text+0x121e62): Section mismatch: reference to
> > .init.text:__alloc_bootmem (between 'vgacon_startup' and
> > 'vgacon_scrolldelta')
> >
> > Browsing the code it seems that vgacon_scrollback_startup() is only
> > called during the init phase so the reference to the .init.text
> > section is OK.
> > Teach modpost not to warn using ___init_refok.
> >
> > Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
>
> Acked-by: Antonino Daplas <[EMAIL PROTECTED]>

Thanks.
Will you take care of forwarding it or do we rely on Andrew in this area?

	Sam
Re: [RFC, Announce] Unified x86 architecture, arch/x86
On Saturday 21 July 2007 01:55, Michal Piotrowski wrote:
> Hi,
>
> On 21/07/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> > We are pleased to announce a project we've been working on for some
> > time: the unified x86 architecture tree, or "arch/x86" - and we'd like
> > to solicit feedback about it.
> >
> > What is this about?
>
> [..]
>
> > As usual, comments and suggestions are welcome!
>
> I really like this idea - code duplication is a bad thing.

Did you actually look at the patch? It doesn't have a single line
less duplication than there was before. Everything that could
be easily shared was shared already.

It's just new window dressing without any real advantages.

-Andi
Re: [RFC, Announce] Unified x86 architecture, arch/x86
On Saturday 21 July 2007 00:32, Thomas Gleixner wrote:
> We are pleased to announce a project we've been working on for some
> time: the unified x86 architecture tree, or "arch/x86" - and we'd like
> to solicit feedback about it.

Well you know my position on this. I think it's a bad idea because
it means we can never get rid of any old junk. IMNSHO arch/x86_64
is significantly cleaner and simpler in many ways than arch/i386 and I would
like to preserve that. Also in general arch/x86_64 is much easier to hack
than arch/i386 because it's easier to regression test and in general
has to care about much less junk. And I don't know of any way to ever fix
that for i386 besides splitting the old stuff off completely.

Besides, radical file movements like this are bad anyways. They cause
a big break in patchkits and forward/backwards porting that doesn't
really help anybody.

> This causes double maintenance
> even for functionality that is conceptually the same for the 32-bit and
> the 64-bit tree. (such as support for standard PC platform architecture
> devices)

It's not really the same platform: one is PC hardware going back forever
with zillions of bugs, the other is modern PC platforms with far fewer
bugs and quirks.

To see it otherwise: it's more a junkification of arch/x86_64 than
a cleanup of arch/i386 -- in fact you didn't really clean up arch/i386 at all.

> How did we do it?
> -----------------
>
> As an initial matter, we made it painstakingly sure that the resulting
> .o files in a 32-bit build are bit for bit equal.

You got not a single line less code duplication then, so I don't
really see the point of this.

-Andi
Re: [linux-pm] Re: Hibernation considerations
Hi.

On Saturday 21 July 2007 08:43:20 [EMAIL PROTECTED] wrote:
> On Fri, 20 Jul 2007, Alan Stern wrote:
>
> > On Fri, 20 Jul 2007, Jeremy Maitin-Shepard wrote:
> >
> > when doing a suspend-to-ram you get to a point where you just don't use
> any userspace.
> >>
> >>> What do you mean? How can you prevent user tasks from running? That's
> >>> basically what the freezer does, and the whole point of this approach
> >>> is to eliminate the freezer. Right?
> >>
> >> Presumably no tasks at all would be scheduled.
> >
> > How would you prevent tasks from being scheduled? How would you
> > prevent drivers from deadlocking because in order to put their device
> > in a low-power state they need to acquire a lock which is held by a
> > user task?
>
> you give up on the suspend because you have no way of getting the user
> task to give up the lock.
>
> however, kernel locks should not be held by user tasks, user tasks are not
> expected to behave in rational ways, allowing them to compete with kernel
> tasks for locks is a sure way to get a deadlock or indefinite stall.
>
> what locks are accessed this way?

Any userspace process can do a syscall. In the process of the syscall, it can
take kernel locks, and it can schedule (eg, while seeking to take a second
lock).

Regards,

Nigel
Re: where is the code for read system call?
> My application reads from socket. I need to change the behavior of read
> system call for an experiment. Can someone point me to code?

Wouldn't it be easier to create a preload-library-wrapper around glibc?

Folkert van Heusden

--
MultiTail is a versatile tool for watching logfiles and output of
commands. Filtering, coloring, merging, diff-view, etc.
http://www.vanheusden.com/multitail/
--
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
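For readers unfamiliar with the preload approach suggested above, here is a minimal sketch of such a wrapper. It assumes a glibc system where dlsym(RTLD_NEXT, ...) is available; the counter name, the file name and the build/run lines in the comment are illustrative assumptions, not something taken from this thread.

```c
/*
 * Sketch of an LD_PRELOAD shim around read().
 * Build (assumed): gcc -shared -fPIC -o libreadshim.so readshim.c -ldl
 * Run   (assumed): LD_PRELOAD=./libreadshim.so ./your_app
 */
#define _GNU_SOURCE		/* needed for RTLD_NEXT */
#include <dlfcn.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical hook state: counts calls that pass through the shim. */
static unsigned long shim_read_calls;

ssize_t read(int fd, void *buf, size_t count)
{
	static ssize_t (*real_read)(int, void *, size_t);

	/* Look up the next definition of read() (normally glibc's) once. */
	if (!real_read)
		real_read = (ssize_t (*)(int, void *, size_t))
				dlsym(RTLD_NEXT, "read");
	if (!real_read) {
		errno = ENOSYS;
		return -1;
	}

	shim_read_calls++;	/* the experimental behaviour would go here */
	return real_read(fd, buf, count);
}
```

The shim only intercepts calls that go through the dynamically linked read() symbol; statically linked binaries or direct syscall() users would not be affected, which is one reason to treat it as an experiment aid rather than a general mechanism.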
[GIT PULL] please pull infiniband.git
Linus, please pull from

    master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This will get another small batch of changes for 2.6.23:

Arthur Jones (1):
      IB/ipath: Remove ipath_layer dead code

Florin Malita (1):
      IB/mlx4: Fix leaks in __mlx4_ib_modify_qp

Hoang-Nam Nguyen (3):
      IB/ehca: Support large page MRs
      IB/ehca: Generate async event when SRQ limit reached
      IB/ehca: Move ehca2ib_return_code() out of line

Joachim Fenkes (1):
      IB/ehca: Make internal_create/destroy_qp() static

Michael S. Tsirkin (1):
      IB/mthca: Change command token on timeout

Roland Dreier (2):
      mlx4_core: Change command token on timeout
      IB/mlx4: Fix error path in create_qp_common()

Stefan Roscher (1):
      IB/ehca: Support small QP queues

 drivers/infiniband/hw/ehca/ehca_classes.h |   50 +++--
 drivers/infiniband/hw/ehca/ehca_cq.c      |    8 +-
 drivers/infiniband/hw/ehca/ehca_eq.c      |    8 +-
 drivers/infiniband/hw/ehca/ehca_irq.c     |   42 +++-
 drivers/infiniband/hw/ehca/ehca_main.c    |   49 -
 drivers/infiniband/hw/ehca/ehca_mrmw.c    |  371 -
 drivers/infiniband/hw/ehca/ehca_mrmw.h    |    2 +-
 drivers/infiniband/hw/ehca/ehca_pd.c      |   25 ++-
 drivers/infiniband/hw/ehca/ehca_qp.c      |  178 --
 drivers/infiniband/hw/ehca/ehca_tools.h   |   19 +--
 drivers/infiniband/hw/ehca/ehca_uverbs.c  |    2 +-
 drivers/infiniband/hw/ehca/hcp_if.c       |   50 +++-
 drivers/infiniband/hw/ehca/ipz_pt_fn.c    |  222 +
 drivers/infiniband/hw/ehca/ipz_pt_fn.h    |   26 ++-
 drivers/infiniband/hw/ipath/Makefile      |    1 -
 drivers/infiniband/hw/ipath/ipath_layer.c |  365
 drivers/infiniband/hw/ipath/ipath_layer.h |   71 --
 drivers/infiniband/hw/ipath/ipath_verbs.h |    2 -
 drivers/infiniband/hw/mlx4/qp.c           |   20 +-
 drivers/infiniband/hw/mthca/mthca_cmd.c   |    3 +-
 drivers/net/mlx4/cmd.c                    |    3 +-
 21 files changed, 802 insertions(+), 715 deletions(-)
 delete mode 100644 drivers/infiniband/hw/ipath/ipath_layer.c
 delete mode 100644 drivers/infiniband/hw/ipath/ipath_layer.h
[PATCH 0/7] readahead cleanups and interleaved readahead take 3
Andrew,

The following patches are based on yesterday's discussions, compiled and
tested OK:

smaller file_ra_state:
	[PATCH 1/7] readahead: compacting file_ra_state
	[PATCH 2/7] readahead: mmap read-around simplification
	[PATCH 3/7] readahead: combine file_ra_state.prev_index/prev_offset into prev_

code cleanups:
	[PATCH 4/7] readahead: remove several readahead macros
	[PATCH 5/7] readahead: remove the limit max_sectors_kb imposed on max_readahead_kb

support of interleaved reads:
	[PATCH 6/7] radixtree: introduce radix_tree_scan_hole()
	[PATCH 7/7] readahead: basic support of interleaved reads

The diffstat is

 block/ll_rw_blk.c          |    9 -
 fs/ext3/dir.c              |    2 -
 fs/ext4/dir.c              |    2 -
 fs/splice.c                |    2 -
 include/linux/fs.h         |   14 +++-
 include/linux/mm.h         |    2 -
 include/linux/radix-tree.h |    2 +
 lib/radix-tree.c           |   34
 mm/filemap.c               |   17 +-
 mm/readahead.c             |   58 +++
 10 files changed, 86 insertions(+), 56 deletions(-)

Regards,
Fengguang Wu
--
[PATCH 2/7] readahead: mmap read-around simplification
Fold file_ra_state.mmap_hit into file_ra_state.mmap_miss
and make it an int.

Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 include/linux/fs.h |    3 +--
 mm/filemap.c       |    4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

--- linux-2.6.22-rc6-mm1.orig/include/linux/fs.h
+++ linux-2.6.22-rc6-mm1/include/linux/fs.h
@@ -777,8 +777,7 @@ struct file_ra_state {
 					   there are only # of pages ahead */
 
 	unsigned int ra_pages;		/* Maximum readahead window */
-	unsigned long mmap_hit;		/* Cache hit stat for mmap accesses */
-	unsigned long mmap_miss;	/* Cache miss stat for mmap accesses */
+	int mmap_miss;			/* Cache miss stat for mmap accesses */
 	unsigned long prev_index;	/* Cache last read() position */
 	unsigned int prev_offset;	/* Offset where last read() ended in a page */
 };
--- linux-2.6.22-rc6-mm1.orig/mm/filemap.c
+++ linux-2.6.22-rc6-mm1/mm/filemap.c
@@ -1389,7 +1389,7 @@ retry_find:
 		 * Do we miss much more than hit in this file? If so,
 		 * stop bothering with read-ahead. It will only hurt.
 		 */
-		if (ra->mmap_miss > ra->mmap_hit + MMAP_LOTSAMISS)
+		if (ra->mmap_miss > MMAP_LOTSAMISS)
 			goto no_cached_page;
 
 		/*
@@ -1415,7 +1415,7 @@ retry_find:
 	}
 
 	if (!did_readaround)
-		ra->mmap_hit++;
+		ra->mmap_miss--;
 
 	/*
 	 * We have a locked page in the page cache, now we need to check
--
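The folding in the patch above relies on "miss > hit + LIMIT" being the same test as "(miss - hit) > LIMIT" when the difference is kept in a signed counter. A small userspace sketch of that equivalence follows; the struct and function names are hypothetical, and the threshold of 100 is an assumed stand-in for MMAP_LOTSAMISS.

```c
#include <stdbool.h>

#define SKETCH_LOTSAMISS 100	/* assumed stand-in for MMAP_LOTSAMISS */

/* Before the patch: two counters that only ever grow. */
struct two_counters {
	unsigned long hit;
	unsigned long miss;
};

/* After the patch: one signed counter holding (misses - hits). */
struct one_counter {
	int miss;
};

static void note_miss(struct two_counters *a, struct one_counter *b)
{
	a->miss++;
	b->miss++;
}

static void note_hit(struct two_counters *a, struct one_counter *b)
{
	a->hit++;
	b->miss--;		/* a hit cancels one earlier miss */
}

/* Old test: misses exceed hits by more than the threshold. */
static bool stop_two(const struct two_counters *a)
{
	return a->miss > a->hit + SKETCH_LOTSAMISS;
}

/* New test: the folded counter alone exceeds the threshold. */
static bool stop_one(const struct one_counter *b)
{
	return b->miss > SKETCH_LOTSAMISS;
}
```

The counter has to be signed for this to work, since a run of hits can push the folded value below zero; that appears to be why the patch also changes the type to int.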
[PATCH 3/7] readahead: combine file_ra_state.prev_index/prev_offset into prev_pos
Combine the file_ra_state members
	unsigned long prev_index
	unsigned int prev_offset
into
	loff_t prev_pos

It is more consistent and better supports huge files.

Thanks to Peter for the nice proposal!

Cc: Peter Zijlstra <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 fs/ext3/dir.c      |    2 +-
 fs/ext4/dir.c      |    2 +-
 fs/splice.c        |    2 +-
 include/linux/fs.h |    3 +--
 mm/filemap.c       |   11 ++-
 mm/readahead.c     |   15 ---
 6 files changed, 18 insertions(+), 17 deletions(-)

--- linux-2.6.22-rc6-mm1.orig/include/linux/fs.h
+++ linux-2.6.22-rc6-mm1/include/linux/fs.h
@@ -778,8 +778,7 @@ struct file_ra_state {
 
 	unsigned int ra_pages;		/* Maximum readahead window */
 	int mmap_miss;			/* Cache miss stat for mmap accesses */
-	unsigned long prev_index;	/* Cache last read() position */
-	unsigned int prev_offset;	/* Offset where last read() ended in a page */
+	loff_t prev_pos;		/* Cache last read() position */
 };
 
 /*
--- linux-2.6.22-rc6-mm1.orig/mm/filemap.c
+++ linux-2.6.22-rc6-mm1/mm/filemap.c
@@ -881,8 +881,8 @@ void do_generic_mapping_read(struct addr
 
 	index = *ppos >> PAGE_CACHE_SHIFT;
 	next_index = index;
-	prev_index = ra.prev_index;
-	prev_offset = ra.prev_offset;
+	prev_index = ra.prev_pos >> PAGE_CACHE_SHIFT;
+	prev_offset = ra.prev_pos & (PAGE_CACHE_SIZE-1);
 	last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
 	offset = *ppos & ~PAGE_CACHE_MASK;
 
@@ -968,7 +968,6 @@ page_ok:
 		index += offset >> PAGE_CACHE_SHIFT;
 		offset &= ~PAGE_CACHE_MASK;
 		prev_offset = offset;
-		ra.prev_offset = offset;
 
 		page_cache_release(page);
 		if (ret == nr && desc->count)
@@ -1055,7 +1054,9 @@ no_cached_page:
 
 out:
 	*_ra = ra;
-	_ra->prev_index = prev_index;
+	_ra->prev_pos = prev_index;
+	_ra->prev_pos <<= PAGE_CACHE_SHIFT;
+	_ra->prev_pos |= prev_offset;
 
 	*ppos = ((loff_t) index << PAGE_CACHE_SHIFT) + offset;
 	if (filp)
@@ -1435,7 +1436,7 @@ retry_find:
 	 * Found the page and have a reference on it.
 	 */
 	mark_page_accessed(page);
-	ra->prev_index = page->index;
+	ra->prev_pos = page->index << PAGE_CACHE_SHIFT;
 	return page;
 
 outside_data_content:
--- linux-2.6.22-rc6-mm1.orig/mm/readahead.c
+++ linux-2.6.22-rc6-mm1/mm/readahead.c
@@ -45,7 +45,7 @@ void
 file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
 {
 	ra->ra_pages = mapping->backing_dev_info->ra_pages;
-	ra->prev_index = -1;
+	ra->prev_pos = -1;
 }
 EXPORT_SYMBOL_GPL(file_ra_state_init);
 
@@ -318,7 +318,7 @@ static unsigned long get_next_ra_size(st
  * indicator. The flag won't be set on already cached pages, to avoid the
  * readahead-for-nothing fuss, saving pointless page cache lookups.
  *
- * prev_index tracks the last visited page in the _previous_ read request.
+ * prev_pos tracks the last visited byte in the _previous_ read request.
  * It should be maintained by the caller, and will be used for detecting
  * small random reads. Note that the readahead algorithm checks loosely
  * for sequential patterns. Hence interleaved reads might be served as
@@ -342,11 +342,9 @@ ondemand_readahead(struct address_space
 		   bool hit_readahead_marker, pgoff_t offset,
 		   unsigned long req_size)
 {
-	int max;	/* max readahead pages */
-	int sequential;
-
-	max = ra->ra_pages;
-	sequential = (offset - ra->prev_index <= 1UL) || (req_size > max);
+	int max = ra->ra_pages;	/* max readahead pages */
+	pgoff_t prev_offset;
+	int sequential;
 
 	/*
 	 * It's the expected callback offset, assume sequential access.
@@ -360,6 +358,9 @@ ondemand_readahead(struct address_space
 		goto readit;
 	}
 
+	prev_offset = ra->prev_pos >> PAGE_CACHE_SHIFT;
+	sequential = offset - prev_offset <= 1UL || req_size > max;
+
 	/*
 	 * Standalone, small read.
 	 * Read as is, and do not pollute the readahead state.
--- linux-2.6.22-rc6-mm1.orig/fs/ext3/dir.c
+++ linux-2.6.22-rc6-mm1/fs/ext3/dir.c
@@ -143,7 +143,7 @@ static int ext3_readdir(struct file * fi
 					sb->s_bdev->bd_inode->i_mapping,
 					&filp->f_ra, filp,
 					index, 1);
-			filp->f_ra.prev_index = index;
+			filp->f_ra.prev_pos = index << PAGE_CACHE_SHIFT;
 			bh = ext3_bread(NULL, inode, blk, 0, &err);
 		}
 
--- linux-2.6.22-rc6-mm1.orig/fs/ext4/dir.c
+++
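The prev_pos encoding used in the patch above (page index in the high bits, in-page offset in the low bits) can be sketched in userspace as follows. This is an illustration only: the helper names are hypothetical, a 4 KiB page is assumed, and int64_t stands in for loff_t. The sketch widens the index to 64 bits before shifting, so a large index is not truncated on a 32-bit build.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12			/* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Pack a page index and an in-page byte offset into one 64-bit position. */
static int64_t pack_prev_pos(unsigned long index, unsigned int offset)
{
	/* Widen first: shifting a 32-bit index could drop the high bits. */
	return ((int64_t)index << SKETCH_PAGE_SHIFT) | offset;
}

/* Recover the page index, mirroring "prev_pos >> PAGE_CACHE_SHIFT". */
static unsigned long prev_index_of(int64_t pos)
{
	return (unsigned long)(pos >> SKETCH_PAGE_SHIFT);
}

/* Recover the offset, mirroring "prev_pos & (PAGE_CACHE_SIZE-1)". */
static unsigned int prev_offset_of(int64_t pos)
{
	return (unsigned int)(pos & (SKETCH_PAGE_SIZE - 1));
}
```

Packing index 5 with offset 100 gives byte position 20580 under these assumptions, and both fields are recovered unchanged by the two accessor helpers.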
[PATCH 1/7] readahead: compacting file_ra_state
Use 'unsigned int' instead of 'unsigned long' for readahead sizes.

This helps reduce memory consumption on 64bit CPU when
a lot of files are opened.

CC: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 include/linux/fs.h |    8
 mm/filemap.c       |    2 +-
 mm/readahead.c     |    2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

--- linux-2.6.22-rc6-mm1.orig/include/linux/fs.h
+++ linux-2.6.22-rc6-mm1/include/linux/fs.h
@@ -771,12 +771,12 @@ struct fown_struct {
  * Track a single file's readahead state
  */
 struct file_ra_state {
-	pgoff_t start;			/* where readahead started */
-	unsigned long size;		/* # of readahead pages */
-	unsigned long async_size;	/* do asynchronous readahead when
+	pgoff_t start;			/* where readahead started */
+	unsigned int size;		/* # of readahead pages */
+	unsigned int async_size;	/* do asynchronous readahead when
 					   there are only # of pages ahead */
 
-	unsigned long ra_pages;		/* Maximum readahead window */
+	unsigned int ra_pages;		/* Maximum readahead window */
 	unsigned long mmap_hit;		/* Cache hit stat for mmap accesses */
 	unsigned long mmap_miss;	/* Cache miss stat for mmap accesses */
 	unsigned long prev_index;	/* Cache last read() position */
--- linux-2.6.22-rc6-mm1.orig/mm/filemap.c
+++ linux-2.6.22-rc6-mm1/mm/filemap.c
@@ -840,7 +840,7 @@ static void shrink_readahead_size_eio(st
 	if (count > 5)
 		return;
 	count++;
-	printk(KERN_WARNING "Reducing readahead size to %luK\n",
+	printk(KERN_WARNING "Reducing readahead size to %dK\n",
 			ra->ra_pages << (PAGE_CACHE_SHIFT - 10));
 }
 
--- linux-2.6.22-rc6-mm1.orig/mm/readahead.c
+++ linux-2.6.22-rc6-mm1/mm/readahead.c
@@ -342,7 +342,7 @@ ondemand_readahead(struct address_space
 		   bool hit_readahead_marker, pgoff_t offset,
 		   unsigned long req_size)
 {
-	unsigned long max;	/* max readahead pages */
+	int max;		/* max readahead pages */
 	int sequential;
 
 	max = ra->ra_pages;
--
[PATCH 6/7] radixtree: introduce radix_tree_scan_hole()
Introduce radix_tree_scan_hole(root, index, max_scan) to scan radix tree
for the first hole. It will be used in interleaved readahead.

The implementation is dumb and obviously correct.
It can help debug (and document) the possible smart one in future.

Cc: Nick Piggin <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 include/linux/radix-tree.h |    2 ++
 lib/radix-tree.c           |   34 ++
 2 files changed, 36 insertions(+)

--- linux-2.6.22-rc6-mm1.orig/include/linux/radix-tree.h
+++ linux-2.6.22-rc6-mm1/include/linux/radix-tree.h
@@ -155,6 +155,8 @@ void *radix_tree_delete(struct radix_tre
 unsigned int
 radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
 			unsigned long first_index, unsigned int max_items);
+unsigned long radix_tree_scan_hole(struct radix_tree_root *root,
+				unsigned long index, unsigned long max_scan);
 int radix_tree_preload(gfp_t gfp_mask);
 void radix_tree_init(void);
 void *radix_tree_tag_set(struct radix_tree_root *root,
--- linux-2.6.22-rc6-mm1.orig/lib/radix-tree.c
+++ linux-2.6.22-rc6-mm1/lib/radix-tree.c
@@ -601,6 +601,40 @@ int radix_tree_tag_get(struct radix_tree
 EXPORT_SYMBOL(radix_tree_tag_get);
 #endif
 
+static unsigned long
+radix_tree_scan_hole_dumb(struct radix_tree_root *root,
+				unsigned long index, unsigned long max_scan)
+{
+	unsigned long i;
+
+	for (i = 0; i < max_scan; i++) {
+		if (!radix_tree_lookup(root, index))
+			break;
+		if (++index == 0)
+			break;
+	}
+
+	return index;
+}
+
+/**
+ *	radix_tree_scan_hole - scan for hole
+ *	@root:		radix tree root
+ *	@index:		index key
+ *	@max_scan:	advice on max items to scan (it may scan a little more)
+ *
+ *	Scan forward from @index for a hole/empty item, stop when
+ *	- hit hole
+ *	- wrap-around to index 0
+ *	- @max_scan or more items scanned
+ */
+unsigned long radix_tree_scan_hole(struct radix_tree_root *root,
+				unsigned long index, unsigned long max_scan)
+{
+	return radix_tree_scan_hole_dumb(root, index, max_scan);
+}
+EXPORT_SYMBOL(radix_tree_scan_hole);
+
 static unsigned int
 __lookup(struct radix_tree_node *slot, void **results, unsigned long index,
 	unsigned int max_items, unsigned long *next_index)
--
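As a rough userspace illustration of the scan semantics in the patch above (and of the kind of testcase asked for in the review of this patch), the sketch below models the tree as a plain presence array instead of a real radix tree. The function name and the array model are assumptions made for the example; slots past the end of the array are treated as holes, so the wrap-around branch of the kernel version has no counterpart here.

```c
#include <stdbool.h>

/*
 * Scan forward from index for the first empty slot, looking at no more
 * than max_scan slots. present[] is a hypothetical stand-in for
 * radix_tree_lookup(): present[i] is true when slot i holds an item.
 * Returns the index of the hole, or the index reached when the scan
 * budget ran out.
 */
static unsigned long sketch_scan_hole(const bool *present,
				      unsigned long nslots,
				      unsigned long index,
				      unsigned long max_scan)
{
	unsigned long i;

	for (i = 0; i < max_scan; i++) {
		if (index >= nslots || !present[index])
			break;		/* hit a hole */
		index++;
	}

	return index;
}
```

With slots 0..2 and 4 populated, a scan from 0 stops at the hole in slot 3, while a scan from 0 with a budget of 2 stops at index 2 without having seen a hole. The caller therefore cannot tell "hole found" from "budget exhausted" by the return value alone and has to compare the distance travelled against its own limit.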
[PATCH 5/7] readahead: remove the limit max_sectors_kb imposed on max_readahead_kb
Remove the size limit max_sectors_kb imposed on max_readahead_kb.

The size restriction is unreasonable, especially when max_sectors_kb cannot
grow larger than max_hw_sectors_kb, which can be rather small for some disk
drives.

Cc: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
Acked-by: Jens Axboe <[EMAIL PROTECTED]>
---
 block/ll_rw_blk.c |    9 -
 1 file changed, 9 deletions(-)

--- linux-2.6.22-rc6-mm1.orig/block/ll_rw_blk.c
+++ linux-2.6.22-rc6-mm1/block/ll_rw_blk.c
@@ -3945,7 +3945,6 @@ queue_max_sectors_store(struct request_q
 			max_hw_sectors_kb = q->max_hw_sectors >> 1,
 			page_kb = 1 << (PAGE_CACHE_SHIFT - 10);
 	ssize_t ret = queue_var_store(&max_sectors_kb, page, count);
-	int ra_kb;
 
 	if (max_sectors_kb > max_hw_sectors_kb || max_sectors_kb < page_kb)
 		return -EINVAL;
@@ -3954,14 +3953,6 @@ queue_max_sectors_store(struct request_q
 	 * values synchronously:
 	 */
 	spin_lock_irq(q->queue_lock);
-	/*
-	 * Trim readahead window as well, if necessary:
-	 */
-	ra_kb = q->backing_dev_info.ra_pages << (PAGE_CACHE_SHIFT - 10);
-	if (ra_kb > max_sectors_kb)
-		q->backing_dev_info.ra_pages =
-				max_sectors_kb >> (PAGE_CACHE_SHIFT - 10);
-
 	q->max_sectors = max_sectors_kb << 1;
 	spin_unlock_irq(q->queue_lock);
 
--
[PATCH 4/7] readahead: remove several readahead macros
Remove VM_MAX_CACHE_HIT, MAX_RA_PAGES and MIN_RA_PAGES.

Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 include/linux/mm.h |    2 --
 mm/readahead.c     |   10 +-
 2 files changed, 1 insertion(+), 11 deletions(-)

--- linux-2.6.22-rc6-mm1.orig/include/linux/mm.h
+++ linux-2.6.22-rc6-mm1/include/linux/mm.h
@@ -1148,8 +1148,6 @@ int write_one_page(struct page *page, in
 /* readahead.c */
 #define VM_MAX_READAHEAD	128	/* kbytes */
 #define VM_MIN_READAHEAD	16	/* kbytes (includes current page) */
-#define VM_MAX_CACHE_HIT	256	/* max pages in a row in cache before
-					 * turning readahead off */
 
 int do_page_cache_readahead(struct address_space *mapping, struct file *filp,
 			pgoff_t offset, unsigned long nr_to_read);
--- linux-2.6.22-rc6-mm1.orig/mm/readahead.c
+++ linux-2.6.22-rc6-mm1/mm/readahead.c
@@ -21,16 +21,8 @@ void default_unplug_io_fn(struct backing
 }
 EXPORT_SYMBOL(default_unplug_io_fn);
 
-/*
- * Convienent macros for min/max read-ahead pages.
- * Note that MAX_RA_PAGES is rounded down, while MIN_RA_PAGES is rounded up.
- * The latter is necessary for systems with large page size(i.e. 64k).
- */
-#define MAX_RA_PAGES	(VM_MAX_READAHEAD*1024 / PAGE_CACHE_SIZE)
-#define MIN_RA_PAGES	DIV_ROUND_UP(VM_MIN_READAHEAD*1024, PAGE_CACHE_SIZE)
-
 struct backing_dev_info default_backing_dev_info = {
-	.ra_pages	= MAX_RA_PAGES,
+	.ra_pages	= VM_MAX_READAHEAD * 1024 / PAGE_CACHE_SIZE,
 	.state		= 0,
 	.capabilities	= BDI_CAP_MAP_COPY,
 	.unplug_io_fn	= default_unplug_io_fn,
--
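The comment removed by the patch above notes that the maximum was rounded down while the minimum was rounded up, and that this matters for 64k pages. The arithmetic can be checked with a small userspace sketch; the function names are hypothetical, and the 128 and 16 kbyte constants are the VM_MAX_READAHEAD and VM_MIN_READAHEAD values shown in the hunk.

```c
#define SKETCH_MAX_READAHEAD_KB 128	/* VM_MAX_READAHEAD in the hunk */
#define SKETCH_MIN_READAHEAD_KB 16	/* VM_MIN_READAHEAD in the hunk */

/* Rounded down, as the removed MAX_RA_PAGES macro did. */
static unsigned long max_ra_pages(unsigned long page_size)
{
	return SKETCH_MAX_READAHEAD_KB * 1024UL / page_size;
}

/* Rounded up, as DIV_ROUND_UP did in the removed MIN_RA_PAGES macro. */
static unsigned long min_ra_pages(unsigned long page_size)
{
	return (SKETCH_MIN_READAHEAD_KB * 1024UL + page_size - 1) / page_size;
}
```

With 4 KiB pages this gives 32 and 4 pages. With 64 KiB pages the maximum becomes 2 pages and the rounded-up minimum is 1 page, whereas plain truncating division of 16 KiB by 64 KiB would have given 0.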
[PATCH 7/7] readahead: basic support of interleaved reads
This is a simplified version of the pagecache context based readahead.
It handles the case of multiple threads reading on the same fd and invalidating
each others' readahead state. It does the trick by scanning the pagecache and
recovering the current read stream's readahead status.

The algorithm works in an opportunistic way, in that it does not try to detect
interleaved reads _actively_, which requires a probe into the page cache
(which means a little more overhead for random reads). It only tries to
handle a previously started sequential readahead whose state was overwritten
by another concurrent stream, and it can do this job pretty well.

Negative and positive examples (or what you can expect from it):

1) it cannot detect and serve perfect request-by-request interleaved reads
   right:
	time	stream 1	stream 2
	0	1
	1			1001
	2	2
	3			1002
	4	3
	5			1003
	6	4
	7			1004
	8	5
	9			1005
   Here no single readahead will be carried out.

2) However, if it's two concurrent reads by two threads, the chance of the
   initial sequential readahead being started is huge. Once the first
   sequential readahead is started for a stream, this patch will ensure that
   the readahead window continues to ramp up and won't be disturbed by other
   streams.

	time	stream 1	stream 2
	0	1
	1	2
	2			1001
	3	3
	4			1002
	5			1003
	6	4
	7	5
	8			1004
	9	6
	10			1005
	11	7
	12			1006
	13			1007
   Here stream 1 will start a readahead at page 2, and stream 2 will start its
   first readahead at page 1003. From then on the two streams will be served
   right.

Cc: Rusty Russell <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 mm/readahead.c |   33 +++--
 1 file changed, 23 insertions(+), 10 deletions(-)

--- linux-2.6.22-rc6-mm1.orig/mm/readahead.c
+++ linux-2.6.22-rc6-mm1/mm/readahead.c
@@ -363,6 +363,29 @@ ondemand_readahead(struct address_space
 	}
 
 	/*
+	 * Hit a marked page without valid readahead state.
+	 * E.g. interleaved reads.
+	 * Query the pagecache for async_size, which normally equals to
+	 * readahead size. Ramp it up and use it as the new readahead size.
+	 */
+	if (hit_readahead_marker) {
+		pgoff_t start;
+
+		read_lock_irq(&mapping->tree_lock);
+		start = radix_tree_scan_hole(&mapping->page_tree, offset, max+1);
+		read_unlock_irq(&mapping->tree_lock);
+
+		if (!start || start - offset > max)
+			return 0;
+
+		ra->start = start;
+		ra->size = start - offset;	/* old async_size */
+		ra->size = get_next_ra_size(ra, max);
+		ra->async_size = ra->size;
+		goto readit;
+	}
+
+	/*
 	 * It may be one of
 	 * 	- first read on start of file
 	 * 	- sequential cache miss
@@ -373,16 +396,6 @@ ondemand_readahead(struct address_space
 	ra->size = get_init_ra_size(req_size, max);
 	ra->async_size = ra->size > req_size ? ra->size - req_size : ra->size;
 
-	/*
-	 * Hit on a marked page without valid readahead state.
-	 * E.g. interleaved reads.
-	 * Not knowing its readahead pos/size, bet on the minimal possible one.
-	 */
-	if (hit_readahead_marker) {
-		ra->start++;
-		ra->size = get_next_ra_size(ra, max);
-	}
-
 readit:
 	return ra_submit(ra, mapping, filp);
 }
--
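The recovery step added by the patch above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel code: the page cache is a plain presence array, all names are hypothetical, and the ramp-up rule (quadruple small windows, otherwise double, capped at max) is an assumed stand-in for get_next_ra_size().

```c
#include <stdbool.h>

struct sketch_ra {
	unsigned long start;		/* where the next readahead begins */
	unsigned long size;		/* pages in the next readahead */
	unsigned long async_size;	/* pages left when async readahead fires */
};

/* First uncached page at or after index, looking at max_scan pages at most. */
static unsigned long first_hole(const bool *cached, unsigned long npages,
				unsigned long index, unsigned long max_scan)
{
	unsigned long i;

	for (i = 0; i < max_scan; i++) {
		if (index >= npages || !cached[index])
			break;
		index++;
	}
	return index;
}

/* Assumed ramp-up rule: grow the window, never beyond max. */
static unsigned long next_ra_size(unsigned long cur, unsigned long max)
{
	unsigned long next = (cur < max / 16) ? 4 * cur : 2 * cur;

	return next < max ? next : max;
}

/*
 * Rebuild readahead state after hitting a marked page at offset when the
 * saved state belongs to another stream. Returns false when no hole is
 * found within max pages, in which case nothing is changed.
 */
static bool recover_ra_state(const bool *cached, unsigned long npages,
			     unsigned long offset, unsigned long max,
			     struct sketch_ra *ra)
{
	unsigned long start = first_hole(cached, npages, offset, max + 1);

	if (start - offset > max)
		return false;

	ra->start = start;
	/* pages cached ahead of the marker play the role of the old async_size */
	ra->size = next_ra_size(start - offset, max);
	ra->async_size = ra->size;
	return true;
}
```

For example, with pages 0 to 7 cached, a marker hit at page 4 and max set to 32, the scan finds the hole at page 8, takes 4 as the old window and ramps it up to 8 pages starting at page 8.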
Re: [PATCH 1/8] compacting file_ra_state
On Fri, Jul 20, 2007 at 09:27:01PM -0700, Linus Torvalds wrote:
>
>
> On Sat, 21 Jul 2007, Fengguang Wu wrote:
> >
> > Sorry, forgot to prefix the patch titles with [readahead].
> > Should I repost?
>
> Not for me, but on the other hand, I'd prefer for this to be in -mm a bit,
> even if it does mean missing the merge window this time around.

OK. Let me repost it...
Re: [PATCH 1/8] compacting file_ra_state
On Sat, 21 Jul 2007, Fengguang Wu wrote:
>
> Sorry, forgot to prefix the patch titles with [readahead].
> Should I repost?

Not for me, but on the other hand, I'd prefer for this to be in -mm a bit,
even if it does mean missing the merge window this time around.

		Linus
Re: [ofa-general] [PATCH 5/5] ehca: Support small QP queues
thanks, applied. I fixed this up myself to work with commit 20c2df83,
which got rid of the destructor argument to kmem_cache_create() -- you
probably want to check my tree to make sure it's OK.

Also the same as I said before about checkpatch.pl's warning:

WARNING: externs should be avoided in .c files
#337: FILE: drivers/infiniband/hw/ehca/ehca_pd.c:91:
+	extern struct kmem_cache *small_qp_cache;

please fix that up when you get a chance
Fixing labels after GNU indent (Re: [PATCH 1/2] run scripts/Lindent on it to match Documentation/CodingStyle)
[]
> > > sed -i -e 's/^\t* \(\w*:\)/ \1/' "$@"
> > >
> > > which will replace the leading tabs and spaces with one space.
> > > It should leave case labels unmolested, as they should be indented with
> > > tabs, not 6 spaces.
> > >
> > > Any regexp ninjas want to have a go at something better?
> >
> > I'm the one. Trying to write portable, optimized and easy to
> > understand scripts [0].
> >
> > Please, describe more what must be done, and i will do it. Case labels
> > are handled very strangely in your example.
>
> OK. indent will indent labels to a column number that's a multiple of
> 8, plus 6. So it may start in column 6, 14, 22, 30, etc. I'm not quite
> sure what the definition of a label is; I had it as \w*: up there, but I
> don't know if that would match the _. The point is to *not* handle case
> labels, only goto labels.

t=`printf '\t'`
sed -i "s_^\($t*\) *\([^:]*:\)_\1\2_" "$@"
^-_ I'm not sure about leaving one space `here,

otherwise it removes spaces between (supposedly right indented) line
start, i.e. nothing or tab(s), and a label, i.e. `label_name:' without
space before colon; `label_name' here actually not a colon, let's leave
that kind of breakage to compiler.

The variable $t is used for readability of the regex and because POSIX
BREs leave undefined characters after a backslash, POSIX sed defines only
\n as a new line.
--
-o--=O`C
 #oo'L O
<___=E M
Re: [PATCH 3/5] ehca: Make ehca2ib_return_code() non-inline
thanks, applied
Re: [PATCH 2/5] ehca: Generate event when SRQ limit reached
thanks, applied. BTW, does your SRQ-capable hardware support generating the "last WQE reached" event? There's not any reliable way to avoid problems when destroying QPs attached to an SRQ without it, and the IB spec requires CAs that support SRQs to generate it (o11-5.2.5 in chapter 11 of vol 1). I don't see any code in ehca to generate the event, and IPoIB CM at least will be very unhappy when using SRQs if the event is not generated. - R.
Re: [PATCH 1/8] compacting file_ra_state
Sorry, forgot to prefix the patch titles with [readahead]. Should I repost?
Re: [ofa-general] [PATCH 1/5] ehca: Supports large page MRs
I applied this, but I agree with checkpatch.pl:

> WARNING: externs should be avoided in .c files
> #227: FILE: drivers/infiniband/hw/ehca/ehca_mrmw.c:67:
> +extern int ehca_mr_largepage;
>
> WARNING: externs should be avoided in .c files
> #949: FILE: drivers/infiniband/hw/ehca/hcp_if.c:753:
> +extern int ehca_debug_level;

if you need to use a variable in more than one .c file, put the extern
declaration in a common header that's included everywhere you use the
variable, including the .c file that it is defined in. That way the compiler
can see if you get confused about the type of the variable.

When you get a chance, please post a follow-on patch to fix this.

 - R.
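A minimal sketch of the pattern described above, squeezed into one translation unit so that it compiles on its own. The names `example_debug_level` and `example_debug_enabled()` are made up for illustration and are not taken from the ehca driver; in a real tree the three sections would live in a shared header, the defining .c file, and a second .c file.

```c
#include <assert.h>

/* ---- would live in a shared header, e.g. a hypothetical example.h ---- */
extern int example_debug_level;		/* declaration only, no storage */

/* ---- defining .c file: includes the header, then defines the variable ---- */
/*
 * Because the declaration above is visible here, the compiler rejects a
 * mismatched definition such as "long example_debug_level;" with a
 * conflicting-types error.
 */
int example_debug_level = 0;

/* ---- any other .c file that includes the header ---- */
int example_debug_enabled(void)
{
	return example_debug_level > 0;
}
```

With this layout, `assert(example_debug_enabled() == 0)` holds until some code raises the level, and every user sees the same type for the variable.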
[PATCH 4/8] trivial filemap.c cleanups
- remove unused local next_index in do_generic_mapping_read() - convert some 'unsigned long' to pgoff_t - wrap a long line Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]> --- mm/filemap.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) --- linux-2.6.22-git15.orig/mm/filemap.c +++ linux-2.6.22-git15/mm/filemap.c @@ -866,11 +866,10 @@ void do_generic_mapping_read(struct addr read_actor_t actor) { struct inode *inode = mapping->host; - unsigned long index; - unsigned long offset; - unsigned long last_index; - unsigned long next_index; - unsigned long prev_index; + pgoff_t index; + pgoff_t offset; + pgoff_t last_index; + pgoff_t prev_index; unsigned int prev_offset; struct page *cached_page; int error; @@ -878,7 +877,6 @@ void do_generic_mapping_read(struct addr cached_page = NULL; index = *ppos >> PAGE_CACHE_SHIFT; - next_index = index; prev_index = ra.prev_pos >> PAGE_CACHE_SHIFT; prev_offset = ra.prev_pos & (PAGE_CACHE_SIZE-1); last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT; @@ -1219,7 +1217,8 @@ out: } EXPORT_SYMBOL(generic_file_aio_read); -int file_send_actor(read_descriptor_t * desc, struct page *page, unsigned long offset, unsigned long size) +int file_send_actor(read_descriptor_t * desc, struct page *page, + unsigned long offset, unsigned long size) { ssize_t written; unsigned long count = desc->count; @@ -1272,7 +1271,6 @@ asmlinkage ssize_t sys_readahead(int fd, } #ifdef CONFIG_MMU -static int FASTCALL(page_cache_read(struct file * file, unsigned long offset)); /** * page_cache_read - adds requested page to the page cache if not already there * @file: file to read @@ -1281,7 +1279,7 @@ static int FASTCALL(page_cache_read(stru * This adds the requested page to the page cache if it isn't already there, * and schedules an I/O to read in its contents from disk. 
*/ -static int fastcall page_cache_read(struct file * file, unsigned long offset) +static int fastcall page_cache_read(struct file * file, pgoff_t offset) { struct address_space *mapping = file->f_mapping; struct page *page; -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/8] mmap read-around simplification
Fold file_ra_state.mmap_hit into file_ra_state.mmap_miss
and make it an int.

Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 include/linux/fs.h |    3 +--
 mm/filemap.c       |    4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

--- linux-2.6.22-git15.orig/include/linux/fs.h
+++ linux-2.6.22-git15/include/linux/fs.h
@@ -703,8 +703,7 @@ struct file_ra_state {
 					   there are only # of pages ahead */
 
 	unsigned int ra_pages;		/* Maximum readahead window */
-	unsigned long mmap_hit;		/* Cache hit stat for mmap accesses */
-	unsigned long mmap_miss;	/* Cache miss stat for mmap accesses */
+	int mmap_miss;			/* Cache miss stat for mmap accesses */
 	unsigned long prev_index;	/* Cache last read() position */
 	unsigned int prev_offset;	/* Offset where last read() ended in a page */
 };
--- linux-2.6.22-git15.orig/mm/filemap.c
+++ linux-2.6.22-git15/mm/filemap.c
@@ -1369,7 +1369,7 @@ retry_find:
 		 * Do we miss much more than hit in this file? If so,
 		 * stop bothering with read-ahead. It will only hurt.
 		 */
-		if (ra->mmap_miss > ra->mmap_hit + MMAP_LOTSAMISS)
+		if (ra->mmap_miss > MMAP_LOTSAMISS)
 			goto no_cached_page;
 
 		/*
@@ -1395,7 +1395,7 @@ retry_find:
 	}
 
 	if (!did_readaround)
-		ra->mmap_hit++;
+		ra->mmap_miss--;
 
 	/*
 	 * We have a locked page in the page cache, now we need to check
--
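A rough userspace sketch of why one signed counter is enough: "miss > hit + LOTSAMISS" is the same test as "(miss - hit) > LOTSAMISS", so storing only the difference loses nothing. The helper names and the threshold value of 100 are assumptions for illustration; the hunks above do not show where the miss counter is incremented, so `note_miss()` is a guess at that side.

```c
#include <assert.h>

#define MMAP_LOTSAMISS_SKETCH 100	/* assumed threshold, for illustration */

struct ra_sketch {
	int mmap_miss;	/* misses minus hits; may go negative */
};

/* A read-around was needed: count one miss. */
static void note_miss(struct ra_sketch *ra)
{
	assert(ra);
	ra->mmap_miss++;
}

/* The page was found without read-around: a hit cancels one miss. */
static void note_hit(struct ra_sketch *ra)
{
	assert(ra);
	ra->mmap_miss--;
}

/* Same test as "miss > hit + LOTSAMISS", with hit folded into miss. */
static int too_many_misses(const struct ra_sketch *ra)
{
	assert(ra);
	return ra->mmap_miss > MMAP_LOTSAMISS_SKETCH;
}
```

The counter is signed because a long run of hits pushes it below zero, which an unsigned type would turn into a huge "miss" count.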
[PATCH 6/8] remove the limit max_sectors_kb imposed on max_readahead_kb
Remove the size limit max_sectors_kb imposed on max_readahead_kb. The size restriction is unreasonable. Especially when max_sectors_kb cannot grow larger than max_hw_sectors_kb, which can be rather small for some disk drives. Cc: Jens Axboe <[EMAIL PROTECTED]> Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]> Acked-by: Jens Axboe <[EMAIL PROTECTED]> --- block/ll_rw_blk.c |9 - 1 file changed, 9 deletions(-) --- linux-2.6.22-git15.orig/block/ll_rw_blk.c +++ linux-2.6.22-git15/block/ll_rw_blk.c @@ -3946,7 +3946,6 @@ queue_max_sectors_store(struct request_q max_hw_sectors_kb = q->max_hw_sectors >> 1, page_kb = 1 << (PAGE_CACHE_SHIFT - 10); ssize_t ret = queue_var_store(_sectors_kb, page, count); - int ra_kb; if (max_sectors_kb > max_hw_sectors_kb || max_sectors_kb < page_kb) return -EINVAL; @@ -3955,14 +3954,6 @@ queue_max_sectors_store(struct request_q * values synchronously: */ spin_lock_irq(q->queue_lock); - /* -* Trim readahead window as well, if necessary: -*/ - ra_kb = q->backing_dev_info.ra_pages << (PAGE_CACHE_SHIFT - 10); - if (ra_kb > max_sectors_kb) - q->backing_dev_info.ra_pages = - max_sectors_kb >> (PAGE_CACHE_SHIFT - 10); - q->max_sectors = max_sectors_kb << 1; spin_unlock_irq(q->queue_lock); -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 7/8] introduce radix_tree_scan_hole()
Introduce radix_tree_scan_hole(root, index, max_scan) to scan radix tree
for the first hole. It will be used in interleaved readahead.

The implementation is dumb and obviously correct.
It can help debug (and document) the possible smart one in future.

Cc: Nick Piggin <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 include/linux/radix-tree.h |    2 ++
 lib/radix-tree.c           |   34 ++
 2 files changed, 36 insertions(+)

--- linux-2.6.22-git15.orig/include/linux/radix-tree.h
+++ linux-2.6.22-git15/include/linux/radix-tree.h
@@ -155,6 +155,8 @@ void *radix_tree_delete(struct radix_tre
 unsigned int
 radix_tree_gang_lookup(struct radix_tree_root *root, void **results,
 			unsigned long first_index, unsigned int max_items);
+unsigned long radix_tree_scan_hole(struct radix_tree_root *root,
+				unsigned long index, unsigned long max_scan);
 int radix_tree_preload(gfp_t gfp_mask);
 void radix_tree_init(void);
 void *radix_tree_tag_set(struct radix_tree_root *root,
--- linux-2.6.22-git15.orig/lib/radix-tree.c
+++ linux-2.6.22-git15/lib/radix-tree.c
@@ -599,6 +599,40 @@ int radix_tree_tag_get(struct radix_tree
 EXPORT_SYMBOL(radix_tree_tag_get);
 #endif
 
+static unsigned long
+radix_tree_scan_hole_dumb(struct radix_tree_root *root,
+				unsigned long index, unsigned long max_scan)
+{
+	unsigned long i;
+
+	for (i = 0; i < max_scan; i++) {
+		if (!radix_tree_lookup(root, index))
+			break;
+		if (++index == 0)
+			break;
+	}
+
+	return index;
+}
+
+/**
+ * radix_tree_scan_hole - scan for hole
+ * @root: radix tree root
+ * @index: index key
+ * @max_scan: advice on max items to scan (it may scan a little more)
+ *
+ * Scan forward from @index for a hole/empty item, stop when
+ * - hit hole
+ * - wrap-around to index 0
+ * - @max_scan or more items scanned
+ */
+unsigned long radix_tree_scan_hole(struct radix_tree_root *root,
+				unsigned long index, unsigned long max_scan)
+{
+	return radix_tree_scan_hole_dumb(root, index, max_scan);
+}
+EXPORT_SYMBOL(radix_tree_scan_hole);
+
 static unsigned int
 __lookup(struct radix_tree_node *slot, void **results, unsigned long index,
 	unsigned int max_items, unsigned long *next_index)
--
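The stop conditions listed in the kernel-doc comment can be tried out in userspace with something like the sketch below. It is not the radix-tree test harness: a plain byte array stands in for the tree, and `slot_present()` and `scan_hole_sketch()` are hypothetical names. Only the loop shape is copied from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for radix_tree_lookup(): slot i is populated when i < nslots
 * and present[i] is non-zero. Anything past the array counts as a hole.
 */
static int slot_present(const unsigned char *present, unsigned long nslots,
			unsigned long index)
{
	return index < nslots && present[index];
}

/* Same loop shape as radix_tree_scan_hole_dumb() in the patch. */
static unsigned long scan_hole_sketch(const unsigned char *present,
				      unsigned long nslots,
				      unsigned long index,
				      unsigned long max_scan)
{
	unsigned long i;

	assert(present != NULL);
	for (i = 0; i < max_scan; i++) {
		if (!slot_present(present, nslots, index))
			break;		/* found a hole */
		if (++index == 0)
			break;		/* wrapped around to index 0 */
	}
	return index;
}
```

Note that a return value equal to index + max_scan means "no hole seen within the scan limit", not "hole found there", so a caller has to compare the distance against its own limit.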
[PATCH 3/8] combine file_ra_state.prev_index/prev_offset into prev_pos
Combine the file_ra_state members unsigned long prev_index unsigned int prev_offset into loff_t prev_pos It is more consistent and better supports huge files. Thanks to Peter for the nice proposal! Cc: Peter Zijlstra <[EMAIL PROTECTED]> Cc: Christoph Lameter <[EMAIL PROTECTED]> Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]> --- fs/ext3/dir.c |2 +- fs/ext4/dir.c |2 +- fs/splice.c|2 +- include/linux/fs.h |3 +-- mm/filemap.c | 11 ++- mm/readahead.c | 15 --- 6 files changed, 18 insertions(+), 17 deletions(-) --- linux-2.6.22-git15.orig/include/linux/fs.h +++ linux-2.6.22-git15/include/linux/fs.h @@ -704,8 +704,7 @@ struct file_ra_state { unsigned int ra_pages; /* Maximum readahead window */ int mmap_miss; /* Cache miss stat for mmap accesses */ - unsigned long prev_index; /* Cache last read() position */ - unsigned int prev_offset; /* Offset where last read() ended in a page */ + loff_t prev_pos;/* Cache last read() position */ }; /* --- linux-2.6.22-git15.orig/mm/filemap.c +++ linux-2.6.22-git15/mm/filemap.c @@ -879,8 +879,8 @@ void do_generic_mapping_read(struct addr cached_page = NULL; index = *ppos >> PAGE_CACHE_SHIFT; next_index = index; - prev_index = ra.prev_index; - prev_offset = ra.prev_offset; + prev_index = ra.prev_pos >> PAGE_CACHE_SHIFT; + prev_offset = ra.prev_pos & (PAGE_CACHE_SIZE-1); last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT; offset = *ppos & ~PAGE_CACHE_MASK; @@ -966,7 +966,6 @@ page_ok: index += offset >> PAGE_CACHE_SHIFT; offset &= ~PAGE_CACHE_MASK; prev_offset = offset; - ra.prev_offset = offset; page_cache_release(page); if (ret == nr && desc->count) @@ -1056,7 +1055,9 @@ no_cached_page: out: *_ra = ra; - _ra->prev_index = prev_index; + _ra->prev_pos = prev_index; + _ra->prev_pos <<= PAGE_CACHE_SHIFT; + _ra->prev_pos |= prev_offset; *ppos = ((loff_t) index << PAGE_CACHE_SHIFT) + offset; if (cached_page) @@ -1415,7 +1416,7 @@ retry_find: * Found the page and have a reference on it. 
*/ mark_page_accessed(page); - ra->prev_index = page->index; + ra->prev_pos = page->index << PAGE_CACHE_SHIFT; vmf->page = page; return ret | VM_FAULT_LOCKED; --- linux-2.6.22-git15.orig/mm/readahead.c +++ linux-2.6.22-git15/mm/readahead.c @@ -45,7 +45,7 @@ void file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping) { ra->ra_pages = mapping->backing_dev_info->ra_pages; - ra->prev_index = -1; + ra->prev_pos = -1; } EXPORT_SYMBOL_GPL(file_ra_state_init); @@ -326,7 +326,7 @@ static unsigned long get_next_ra_size(st * indicator. The flag won't be set on already cached pages, to avoid the * readahead-for-nothing fuss, saving pointless page cache lookups. * - * prev_index tracks the last visited page in the _previous_ read request. + * prev_pos tracks the last visited byte in the _previous_ read request. * It should be maintained by the caller, and will be used for detecting * small random reads. Note that the readahead algorithm checks loosely * for sequential patterns. Hence interleaved reads might be served as @@ -350,11 +350,9 @@ ondemand_readahead(struct address_space bool hit_readahead_marker, pgoff_t offset, unsigned long req_size) { - int max;/* max readahead pages */ - int sequential; - - max = ra->ra_pages; - sequential = (offset - ra->prev_index <= 1UL) || (req_size > max); + int max = ra->ra_pages; /* max readahead pages */ + pgoff_t prev_offset; + int sequential; /* * It's the expected callback offset, assume sequential access. @@ -368,6 +366,9 @@ ondemand_readahead(struct address_space goto readit; } + prev_offset = ra->prev_pos >> PAGE_CACHE_SHIFT; + sequential = offset - prev_offset <= 1UL || req_size > max; + /* * Standalone, small read. * Read as is, and do not pollute the readahead state. 
--- linux-2.6.22-git15.orig/fs/ext3/dir.c +++ linux-2.6.22-git15/fs/ext3/dir.c @@ -143,7 +143,7 @@ static int ext3_readdir(struct file * fi sb->s_bdev->bd_inode->i_mapping, >f_ra, filp, index, 1); - filp->f_ra.prev_index = index; + filp->f_ra.prev_pos = index << PAGE_CACHE_SHIFT; bh = ext3_bread(NULL, inode, blk, 0, );
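As a sketch of the packing this patch relies on: a page index and an in-page offset fit into one byte position, and both can be recovered by shift and mask. The names below and the 4 KiB page size are assumptions for illustration. The last two helpers show a pitfall that may be worth double-checking in hunks that shift an unsigned long page index before assigning it to a loff_t: when the index type is 32 bits wide, shifting before widening drops the high bits.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12			/* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

typedef int64_t sketch_loff_t;			/* stand-in for loff_t */

/* Pack a page index and an in-page offset into one byte position. */
static sketch_loff_t make_pos(uint64_t index, unsigned int offset)
{
	assert(offset < SKETCH_PAGE_SIZE);
	return (sketch_loff_t)((index << SKETCH_PAGE_SHIFT) | offset);
}

/* Recover the page index from a byte position. */
static uint64_t pos_index(sketch_loff_t pos)
{
	return (uint64_t)pos >> SKETCH_PAGE_SHIFT;
}

/* Recover the offset within the page from a byte position. */
static unsigned int pos_offset(sketch_loff_t pos)
{
	return (unsigned int)(pos & (SKETCH_PAGE_SIZE - 1));
}

/* With a 32-bit index type, shifting first drops the high bits ... */
static sketch_loff_t shift_then_widen(uint32_t index)
{
	return (sketch_loff_t)(uint32_t)(index << SKETCH_PAGE_SHIFT);
}

/* ... while widening first keeps them. */
static sketch_loff_t widen_then_shift(uint32_t index)
{
	return (sketch_loff_t)index << SKETCH_PAGE_SHIFT;
}
```

On a 64-bit build the two orders give the same result for any realistic index, so the difference would only show up on 32-bit systems with files larger than 4 GiB.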
[PATCH 5/8] remove several readahead macros
Remove VM_MAX_CACHE_HIT, MAX_RA_PAGES and MIN_RA_PAGES.

Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 include/linux/mm.h |    2 --
 mm/readahead.c     |   10 +-
 2 files changed, 1 insertion(+), 11 deletions(-)

--- linux-2.6.22-git15.orig/include/linux/mm.h
+++ linux-2.6.22-git15/include/linux/mm.h
@@ -1136,8 +1136,6 @@ int write_one_page(struct page *page, in
 /* readahead.c */
 #define VM_MAX_READAHEAD	128	/* kbytes */
 #define VM_MIN_READAHEAD	16	/* kbytes (includes current page) */
-#define VM_MAX_CACHE_HIT	256	/* max pages in a row in cache before
-					 * turning readahead off */
 
 int do_page_cache_readahead(struct address_space *mapping, struct file *filp,
 			pgoff_t offset, unsigned long nr_to_read);
--- linux-2.6.22-git15.orig/mm/readahead.c
+++ linux-2.6.22-git15/mm/readahead.c
@@ -21,16 +21,8 @@ void default_unplug_io_fn(struct backing
 }
 EXPORT_SYMBOL(default_unplug_io_fn);
 
-/*
- * Convienent macros for min/max read-ahead pages.
- * Note that MAX_RA_PAGES is rounded down, while MIN_RA_PAGES is rounded up.
- * The latter is necessary for systems with large page size(i.e. 64k).
- */
-#define MAX_RA_PAGES	(VM_MAX_READAHEAD*1024 / PAGE_CACHE_SIZE)
-#define MIN_RA_PAGES	DIV_ROUND_UP(VM_MIN_READAHEAD*1024, PAGE_CACHE_SIZE)
-
 struct backing_dev_info default_backing_dev_info = {
-	.ra_pages	= MAX_RA_PAGES,
+	.ra_pages	= VM_MAX_READAHEAD * 1024 / PAGE_CACHE_SIZE,
 	.state		= 0,
 	.capabilities	= BDI_CAP_MAP_COPY,
 	.unplug_io_fn	= default_unplug_io_fn,
--
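To make the rounding remark in the removed comment concrete, here is the arithmetic as a small sketch. The 128 and 16 kbyte values come from the hunk above; the function names are invented, and the page size is passed in as a parameter instead of using PAGE_CACHE_SIZE.

```c
#include <assert.h>

#define VM_MAX_READAHEAD 128	/* kbytes, as in the patch context */
#define VM_MIN_READAHEAD 16	/* kbytes, as in the patch context */

/* Rounded down, like the removed MAX_RA_PAGES. */
static unsigned long max_ra_pages(unsigned long page_size)
{
	assert(page_size != 0);
	return VM_MAX_READAHEAD * 1024UL / page_size;
}

/* Rounded up, like the removed MIN_RA_PAGES (DIV_ROUND_UP). */
static unsigned long min_ra_pages(unsigned long page_size)
{
	assert(page_size != 0);
	return (VM_MIN_READAHEAD * 1024UL + page_size - 1) / page_size;
}
```

With 4 KiB pages this gives 32 and 4 pages. With 64 KiB pages the maximum drops to 2 pages, and the minimum is 1 page only because of the round-up; plain division would yield 0.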
[PATCH 0/8] readahead cleanups and interleaved readahead take 2
Linus,

To save you from some merge conflicts, I rebased this readahead patchset to
2.6.22-git5. The following patches are based on yesterday's discussions,
compiled and tested OK.

smaller file_ra_state:
	[PATCH 1/8] compacting file_ra_state
	[PATCH 2/8] mmap read-around simplification
	[PATCH 3/8] combine file_ra_state.prev_index/prev_offset into prev_pos

code cleanups:
	[PATCH 4/8] trivial filemap.c cleanups
	[PATCH 5/8] remove several readahead macros
	[PATCH 6/8] remove the limit max_sectors_kb imposed on max_readahead_kb

support of interleaved reads:
	[PATCH 7/8] introduce radix_tree_scan_hole()
	[PATCH 8/8] basic support of interleaved reads

The diffstat is

 block/ll_rw_blk.c          |    9 -
 fs/ext3/dir.c              |    2 -
 fs/ext4/dir.c              |    2 -
 fs/splice.c                |    2 -
 include/linux/fs.h         |   14 +++-
 include/linux/mm.h         |    2 -
 include/linux/radix-tree.h |    2 +
 lib/radix-tree.c           |   34
 mm/filemap.c               |   31 +-
 mm/readahead.c             |   58 +++
 10 files changed, 92 insertions(+), 64 deletions(-)

Regards,
Fengguang Wu
---
[PATCH 8/8] basic support of interleaved reads
This is a simplified version of the pagecache context based readahead.
It handles the case of multiple threads reading on the same fd and invalidating
each other's readahead state. It does the trick by scanning the pagecache and
recovering the current read stream's readahead status.

The algorithm works in an opportunistic way, in that it does not try to detect
interleaved reads _actively_, which requires a probe into the page cache
(which means a little more overhead for random reads). It only tries to handle
a previously started sequential readahead whose state was overwritten by
another concurrent stream, and it can do this job pretty well.

Negative and positive examples (or what you can expect from it):

1) it cannot detect and serve perfect request-by-request interleaved reads
   right:
	time	stream 1	stream 2
	0	1
	1			1001
	2	2
	3			1002
	4	3
	5			1003
	6	4
	7			1004
	8	5
	9			1005
   Here no single readahead will be carried out.

2) However, if it's two concurrent reads by two threads, the chance of the
   initial sequential readahead being started is huge. Once the first
   sequential readahead is started for a stream, this patch will ensure that
   the readahead window continues to ramp up and won't be disturbed by other
   streams.

	time	stream 1	stream 2
	0	1
	1	2
	2			1001
	3	3
	4			1002
	5			1003
	6	4
	7	5
	8			1004
	9	6
	10			1005
	11	7
	12			1006
	13			1007
   Here stream 1 will start a readahead at page 2, and stream 2 will start its
   first readahead at page 1003. From then on the two streams will be served
   right.

Cc: Nick Piggin <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 mm/readahead.c |   33 +++--
 1 file changed, 23 insertions(+), 10 deletions(-)

--- linux-2.6.22-git15.orig/mm/readahead.c
+++ linux-2.6.22-git15/mm/readahead.c
@@ -371,6 +371,29 @@ ondemand_readahead(struct address_space
 	}
 
 	/*
+	 * Hit a marked page without valid readahead state.
+	 * E.g. interleaved reads.
+	 * Query the pagecache for async_size, which normally equals to
+	 * readahead size. Ramp it up and use it as the new readahead size.
+	 */
+	if (hit_readahead_marker) {
+		pgoff_t start;
+
+		read_lock_irq(&mapping->tree_lock);
+		start = radix_tree_scan_hole(&mapping->page_tree, offset, max+1);
+		read_unlock_irq(&mapping->tree_lock);
+
+		if (!start || start - offset > max)
+			return 0;
+
+		ra->start = start;
+		ra->size = start - offset;	/* old async_size */
+		ra->size = get_next_ra_size(ra, max);
+		ra->async_size = ra->size;
+		goto readit;
+	}
+
+	/*
 	 * It may be one of
 	 * - first read on start of file
 	 * - sequential cache miss
@@ -381,16 +404,6 @@ ondemand_readahead(struct address_space
 	ra->size = get_init_ra_size(req_size, max);
 	ra->async_size = ra->size > req_size ? ra->size - req_size : ra->size;
 
-	/*
-	 * Hit on a marked page without valid readahead state.
-	 * E.g. interleaved reads.
-	 * Not knowing its readahead pos/size, bet on the minimal possible one.
-	 */
-	if (hit_readahead_marker) {
-		ra->start++;
-		ra->size = get_next_ra_size(ra, max);
-	}
-
 readit:
 	return ra_submit(ra, mapping, filp);
 }
--
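The recovery step in the new hunk can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: a byte array plays the role of the page cache, `first_hole()` stands in for radix_tree_scan_hole(), and `next_size_sketch()` is a hypothetical ramp-up (double, capped at max) that is not claimed to match the kernel's get_next_ra_size().

```c
#include <assert.h>
#include <stddef.h>

struct ra_sketch {
	unsigned long start;		/* where the next readahead starts */
	unsigned int size;		/* pages to read ahead */
	unsigned int async_size;
};

/* Stand-in for radix_tree_scan_hole(): first uncached page at or after index. */
static unsigned long first_hole(const unsigned char *cached, unsigned long npages,
				unsigned long index, unsigned long max_scan)
{
	unsigned long i;

	for (i = 0; i < max_scan; i++) {
		if (index >= npages || !cached[index])
			break;
		if (++index == 0)
			break;
	}
	return index;
}

/* Hypothetical ramp-up: double the window, capped at max. */
static unsigned int next_size_sketch(unsigned int cur, unsigned int max)
{
	unsigned int next = 2 * cur;

	return next < max ? next : max;
}

/* Returns 1 and fills *ra if a window could be recovered, else 0. */
static int recover_window(struct ra_sketch *ra, const unsigned char *cached,
			  unsigned long npages, unsigned long offset,
			  unsigned int max)
{
	unsigned long start;

	assert(ra != NULL && cached != NULL);
	start = first_hole(cached, npages, offset, (unsigned long)max + 1);
	if (!start || start - offset > max)
		return 0;	/* wrapped, or cached run too long to be one window */

	ra->start = start;
	ra->size = next_size_sketch((unsigned int)(start - offset), max);
	ra->async_size = ra->size;
	return 1;
}
```

The distance from the marked page to the first hole is taken as the old async_size, which is why a cached run longer than max is treated as "not one of our windows" and left alone.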
[PATCH 1/8] compacting file_ra_state
Use 'unsigned int' instead of 'unsigned long' for readahead sizes.

This helps reduce memory consumption on 64bit CPU when
a lot of files are opened.

CC: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 include/linux/fs.h |    8
 mm/readahead.c     |    2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

--- linux-2.6.22-git15.orig/include/linux/fs.h
+++ linux-2.6.22-git15/include/linux/fs.h
@@ -697,12 +697,12 @@ struct fown_struct {
  * Track a single file's readahead state
  */
 struct file_ra_state {
-	pgoff_t start;			/* where readahead started */
-	unsigned long size;		/* # of readahead pages */
-	unsigned long async_size;	/* do asynchronous readahead when
+	pgoff_t start;			/* where readahead started */
+	unsigned int size;		/* # of readahead pages */
+	unsigned int async_size;	/* do asynchronous readahead when
 					   there are only # of pages ahead */
 
-	unsigned long ra_pages;		/* Maximum readahead window */
+	unsigned int ra_pages;		/* Maximum readahead window */
 	unsigned long mmap_hit;		/* Cache hit stat for mmap accesses */
 	unsigned long mmap_miss;	/* Cache miss stat for mmap accesses */
 	unsigned long prev_index;	/* Cache last read() position */
--- linux-2.6.22-git15.orig/mm/readahead.c
+++ linux-2.6.22-git15/mm/readahead.c
@@ -350,7 +350,7 @@ ondemand_readahead(struct address_space
 		   bool hit_readahead_marker, pgoff_t offset,
 		   unsigned long req_size)
 {
-	unsigned long max;	/* max readahead pages */
+	int max;	/* max readahead pages */
 	int sequential;
 
 	max = ra->ra_pages;
--
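A quick way to see the saving is to compare the two layouts directly. The structs below are cut-down sketches holding only the four fields touched by this patch, with `unsigned long` standing in for pgoff_t; the struct and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Leading fields as they looked before the patch (sketch only). */
struct ra_before {
	unsigned long start;
	unsigned long size;
	unsigned long async_size;
	unsigned long ra_pages;
};

/* The same fields with the three sizes narrowed to unsigned int. */
struct ra_after {
	unsigned long start;
	unsigned int size;
	unsigned int async_size;
	unsigned int ra_pages;
};

/* Bytes saved per instance on the platform this is compiled for. */
static size_t bytes_saved(void)
{
	assert(sizeof(struct ra_after) <= sizeof(struct ra_before));
	return sizeof(struct ra_before) - sizeof(struct ra_after);
}
```

On a typical LP64 system this reports 8 bytes for these four fields alone (32 versus 24 after padding); where long and int have the same width there is no difference.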
[PATCH] Kconfig: Remove top level menu "Code maturity level options"
This patch removes the top level menu "Code maturity level options", and moves its options into menu "General setup". This makes Kconfig less cluttered and easier to setup. Cc: Andrew Morton <[EMAIL PROTECTED]> Signed-off-by: Al Boldi <[EMAIL PROTECTED]> --- --- a/init/Kconfig 2007-07-09 06:38:47.0 +0300 +++ b/init/Kconfig 2007-07-21 06:42:06.0 +0300 @@ -7,7 +7,7 @@ config DEFCONFIG_LIST default "/boot/config-$UNAME_RELEASE" default "arch/$ARCH/defconfig" -menu "Code maturity level options" +menu "General setup" config EXPERIMENTAL bool "Prompt for development and/or incomplete code/drivers" @@ -61,9 +61,6 @@ config INIT_ENV_ARG_LIMIT Maximum of each of the number of arguments and environment variables passed to init from the kernel command line. -endmenu - -menu "General setup" config LOCALVERSION string "Local version - append to kernel release" - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
On 7/21/07, Satyam Sharma <[EMAIL PROTECTED]> wrote: Hopefully this bug should be 100% reproducible at boot time anyway. Don't care much for XFS and unionfs, but hoping deselecting ATA from the config doesn't change the variables much in this equation. ] Gargh! My system obviously cannot boot without libata. Guess it's time to go through git log and see how to fix that build breakage myself ... Michal, how did you even manage to build / boot this kernel! On 7/21/07, Greg KH <[EMAIL PROTECTED]> wrote: > On Fri, Jul 20, 2007 at 06:37:33PM -0700, Andrew Morton wrote: > > On Fri, 20 Jul 2007 18:02:57 -0700 Greg KH <[EMAIL PROTECTED]> wrote: > > > > > --- a/kernel/params.c > > > +++ b/kernel/params.c > > > @@ -567,7 +567,11 @@ static void __init kernel_param_sysfs_se > > > kobject_set_name(>kobj, name); > > > kobject_init(>kobj); > > > ret = kobject_add(>kobj); > > > - BUG_ON(ret < 0); > > > + if (ret) { > > > + printk(KERN_ERR "module '%s' failed to be added to sysfs, " > > > + "the system will be unstable now.\n", name); > > > + return; > > > + } > > > > It would be nice to print the value of `ret' too. What I'm surprised about is that %eax doesn't seem to contain the return value `ret' of kobject_add(). It's 1, which is funny, given: ret = kobject_add(>kobj); BUG_ON(ret < 0); One wouldn't expect BUG() -- or the corresponding exception handler -- to clobber registers, that would be a sad day. But I cracked this one alright. His .config has CONFIG_PROFILE_LIKELY=y which replaces unlikely() / likely() with do_check_likely() and forces gcc to clobber %eax with the condition itself, which in our case was (ret < 0) == TRUE, and thus, the "1" value we saw in %eax in the register dumps. We should probably document somewhere that CONFIG_PROFILE_LIKELY is not good for debugging. Hmmm ... thinking out aloud here, but probably I don't need to fix that libata breakage at all. 
I'll just put the BUG_ON(ret < 0) back in the code, deselect PROFILE_LIKELY, and this time we _will_ have the return of kobject_add() in %eax ... That'll at least clear up the EEXIST vs EINVAL mystery, that'll be a good data point, yes. Anyway, I guess I must stop my running commentary -- will only post after this is cleared up now :-) Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Documentation for sysfs, hotplug, and firmware loading.
On Friday 20 July 2007 4:09:36 am Greg KH wrote:
> On Fri, Jul 20, 2007 at 09:54:01AM +0200, Cornelia Huck wrote:
> > On Fri, 20 Jul 2007 00:00:01 -0700,
> >
> > Greg KH <[EMAIL PROTECTED]> wrote:
> > > > I don't insist on it, mknod insists on it. You cannot mknod a dev
> > > > node without specifying block or char.
> > > >
> > > > You're saying that sysfs should provide major and minor numbers
> > > > without anywhere specifying "char" or "block", meaning the major and
> > > > minor numbers cannot be _used_. I am insisting on getting the third
> > > > piece of information without which "major" and "minor" are useless.
> > > >
> > > > I asked very specifically about this at OLS, several times. What
> > > > you're telling me now seems to contradict what you told me then.
> > >
> > > Here's the rule:
> > > If the SUBSYSTEM is "block", it's a block device. Otherwise
> > > it's a char device.
> >
> > That's actually quite confusing to the casual reader, since:
> > > But also realize that the majority of events you will get have nothing
> > > to do with device nodes. I think you are forgetting this fact.
> >
> > So the rule should be:
> > If the SUBSYSTEM is "block" (implying major/minor are provided),
> > it's a block device.
> > If the SUBSYSTEM is not "block", and major/minor are provided,
> > it's a char device.
> > If major/minor are not provided, the event/device is not
> > relevant to device node creation.
>
> Yes, that is much more descriptive, thanks.

Agreed, thanks. I'll try to post an updated version of my hotplug
documentation later tonight. (Just a _touch_ jetlagged at the moment, though.
It may only be 9:47 California time, but it's 11:47 on the east coast. I
think.)

> greg k-h

Rob
--
"One of my most productive days was throwing away 1000 lines of code."
 - Ken Thompson.
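The three-way rule agreed on above could be written down as a small helper along these lines. This is only a sketch of the rule as stated in the thread; the enum, the function name and the `has_major_minor` flag are invented here and do not correspond to any kernel or udev interface.

```c
#include <assert.h>
#include <string.h>

enum node_kind {
	NODE_NONE  = 0,		/* event is not about a device node */
	NODE_BLOCK = 'b',	/* mknod type for block devices */
	NODE_CHAR  = 'c'	/* mknod type for char devices */
};

/*
 * Decide what kind of device node, if any, an event calls for.
 * subsystem: the SUBSYSTEM value of the event
 * has_major_minor: non-zero if the event carries major/minor numbers
 */
static enum node_kind classify_event(const char *subsystem, int has_major_minor)
{
	assert(subsystem != NULL);

	if (!has_major_minor)
		return NODE_NONE;
	if (strcmp(subsystem, "block") == 0)
		return NODE_BLOCK;
	return NODE_CHAR;
}
```

Checking for major/minor first reflects the point that most events have nothing to do with device nodes at all.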
Re: possible latency issues in seq_read
Chris Friesen wrote:
> Lee Revell wrote:
> > On 7/20/07, Chris Friesen <[EMAIL PROTECTED]> wrote:
> > > We've run into an issue (on 2.6.10) where calling "lsof" triggers lost
> > > packets on our server. Preempt is disabled, and NAPI is enabled.
> >
> > Can you reproduce with a recent kernel? Lots of latency issues have
> > been fixed since then.
>
> Unfortunately I have to fix it on this version (the bug was found on
> shipped product), so if there was a difference I'd have to isolate the
> changes and backport them. Also, I can't run the software that triggers
> the problem on a newer kernel as it has dependencies on various patches
> that are not in mainline.
>
> Basically what I'd like to know is whether calling schedule() in
> seq_read() is safe or whether it would break assumptions made by seq_file
> users.

It won't help much. seq_read() is fine in itself.

The problem is that established_get_next() and established_get_first() do not
allow softirq processing while scanning a possibly huge hash table, even if
few sockets are hashed in.

As cond_resched_softirq() was added in linux-2.6.11, you probably *need* to
check the diffs between linux-2.6.10 and linux-2.6.11 for these files:

include/linux/sched.h
net/core/sock.c (__release_sock() latency)
net/ipv4/tcp_ipv4.c (/proc/net/tcp latency)
Re: [PATCH] Use descriptor's functions instead of inline assembly
* Glauber de Oliveira Costa ([EMAIL PROTECTED]) wrote: > This patch provides a new set of functions for managing the descriptor > tables that can be used instead of putting the raw assembly in .c files. Looks alright, some cleanups below > Remodeling of store_tr() suggested by Frederik Deweerdt. > > Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]> > > diff --git a/arch/x86_64/kernel/head64.c b/arch/x86_64/kernel/head64.c > index 6c34bdd..dde41d7 100644 > --- a/arch/x86_64/kernel/head64.c > +++ b/arch/x86_64/kernel/head64.c > @@ -70,7 +70,7 @@ void __init x86_64_start_kernel(char * real_mode_data) > > for (i = 0; i < IDT_ENTRIES; i++) > set_intr_gate(i, early_idt_handler); > - asm volatile("lidt %0" :: "m" (idt_descr)); > + load_idt((const struct desc_ptr *)_descr); No need for extra casting > early_printk("Kernel alive\n"); > > diff --git a/arch/x86_64/kernel/reboot.c b/arch/x86_64/kernel/reboot.c > index 7503068..7c50a12 100644 > --- a/arch/x86_64/kernel/reboot.c > +++ b/arch/x86_64/kernel/reboot.c > @@ -11,6 +11,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -132,7 +133,7 @@ void machine_emergency_restart(void) > } > > case BOOT_TRIPLE: > - __asm__ __volatile__("lidt (%0)": :"r" (_idt)); > + load_idt((const struct desc_ptr *)_idt); same here, plus opportunity for cleanup > __asm__ __volatile__("int3"); > > reboot_type = BOOT_KBD; > diff --git a/arch/x86_64/kernel/setup64.c b/arch/x86_64/kernel/setup64.c > index 1200aaa..fef7290 100644 > --- a/arch/x86_64/kernel/setup64.c > +++ b/arch/x86_64/kernel/setup64.c > @@ -224,8 +224,8 @@ void __cpuinit cpu_init (void) > memcpy(cpu_gdt(cpu), cpu_gdt_table, GDT_SIZE); > > cpu_gdt_descr[cpu].size = GDT_SIZE; > - asm volatile("lgdt %0" :: "m" (cpu_gdt_descr[cpu])); > - asm volatile("lidt %0" :: "m" (idt_descr)); > + load_gdt((const struct desc_ptr *)_gdt_descr[cpu]); > + load_idt((const struct desc_ptr *)_descr); same here > memset(me->thread.tls_array, 0, 
GDT_ENTRY_TLS_ENTRIES * 8); > syscall_init(); > diff --git a/arch/x86_64/kernel/suspend.c b/arch/x86_64/kernel/suspend.c > index b39d478..ddedadf 100644 > --- a/arch/x86_64/kernel/suspend.c > +++ b/arch/x86_64/kernel/suspend.c > @@ -32,9 +32,9 @@ void __save_processor_state(struct saved_context *ctxt) > /* >* descriptor tables >*/ > - asm volatile ("sgdt %0" : "=m" (ctxt->gdt_limit)); > - asm volatile ("sidt %0" : "=m" (ctxt->idt_limit)); > - asm volatile ("str %0" : "=m" (ctxt->tr)); > + store_gdt((struct desc_ptr *)>gdt_limit); > + store_idt((struct desc_ptr *)>idt_limit); same here, opportunity for cleanup > + store_tr(ctxt->tr); > > /* XMM0..XMM15 should be handled by kernel_fpu_begin(). */ > /* > @@ -91,8 +91,9 @@ void __restore_processor_state(struct saved_context *ctxt) >* now restore the descriptor tables to their proper values >* ltr is done i fix_processor_context(). >*/ > - asm volatile ("lgdt %0" :: "m" (ctxt->gdt_limit)); > - asm volatile ("lidt %0" :: "m" (ctxt->idt_limit)); > + load_gdt((const struct desc_ptr *)>gdt_limit); > + load_idt((const struct desc_ptr *)>idt_limit); > + > > /* >* segment registers > diff --git a/include/asm-x86_64/desc.h b/include/asm-x86_64/desc.h > index ac991b5..f2b0a6f 100644 > --- a/include/asm-x86_64/desc.h > +++ b/include/asm-x86_64/desc.h > @@ -20,6 +20,15 @@ extern struct desc_struct cpu_gdt_table[GDT_ENTRIES]; > #define load_LDT_desc() asm volatile("lldt %w0"::"r" (GDT_ENTRY_LDT*8)) > #define clear_LDT() asm volatile("lldt %w0"::"r" (0)) > > +static inline unsigned long __store_tr(void) > +{ > + unsigned long tr; > + asm volatile ("str %w0":"=r" (tr)); > + return tr; > +} native_store_tr (although I've no objection to just fixing the interface) Index: linus-2.6/arch/x86_64/kernel/head64.c === --- linus-2.6.orig/arch/x86_64/kernel/head64.c +++ linus-2.6/arch/x86_64/kernel/head64.c @@ -70,7 +70,7 @@ void __init x86_64_start_kernel(char * r for (i = 0; i < IDT_ENTRIES; i++) set_intr_gate(i, early_idt_handler); - 
load_idt((const struct desc_ptr *)_descr); + load_idt(_descr); early_printk("Kernel alive\n"); Index: linus-2.6/arch/x86_64/kernel/reboot.c === --- linus-2.6.orig/arch/x86_64/kernel/reboot.c +++ linus-2.6/arch/x86_64/kernel/reboot.c @@ -24,7 +24,7 @@ void (*pm_power_off)(void); EXPORT_SYMBOL(pm_power_off); -static long no_idt[3]; +static struct desc_ptr no_idt; static enum { BOOT_TRIPLE = 't', BOOT_KBD = 'k' @@ -133,7 +133,7 @@ void
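The "no need for extra casting" remarks above come down to declaring the object with the right type in the first place. A minimal userspace sketch of that point, assuming a packed limit-plus-base layout for struct desc_ptr; load_idt_sketch() and last_loaded_idt are hypothetical stand-ins that only record the pointer, not the kernel's helpers (a real load_idt() would execute lidt):

```c
#include <stddef.h>

/* Assumed layout: a 16-bit limit immediately followed by the base address,
 * with no padding in between (GCC/Clang packed attribute). */
struct desc_ptr {
	unsigned short size;
	unsigned long address;
} __attribute__((packed));

/* Hypothetical stand-in for load_idt(): it only remembers the pointer. */
static const struct desc_ptr *last_loaded_idt;

static inline void load_idt_sketch(const struct desc_ptr *dtr)
{
	last_loaded_idt = dtr;
}

/* Declared with the right type, as in the reboot.c hunk, instead of
 * "static long no_idt[3]". */
static struct desc_ptr no_idt;
```

With the object typed this way, a call such as load_idt_sketch(&no_idt) compiles without a cast, because a pointer to a non-const object converts implicitly to a pointer to const.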
Re: [PATCH] infiniband mlx4: potential leaks in __mlx4_ib_modify_qp
thanks, applied.
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
[ Considering this has sufficiently excited me, I became the second
person to illegitimately download 2.6.22-mm1 and am presently building
Michal's config. The strange thing is that I couldn't get 22-mm1 to even
build with the posted .config -- so had to deselect XFS, ATA, unionfs.
Hopefully this bug should be 100% reproducible at boot time anyway.
Don't care much for XFS and unionfs, but hoping deselecting ATA from the
config doesn't change the variables much in this equation. ]

On 7/21/07, Greg KH <[EMAIL PROTECTED]> wrote:
> On Fri, Jul 20, 2007 at 06:37:33PM -0700, Andrew Morton wrote:
> > On Fri, 20 Jul 2007 18:02:57 -0700 Greg KH <[EMAIL PROTECTED]> wrote:
> >
> > > --- a/kernel/params.c
> > > +++ b/kernel/params.c
> > > @@ -567,7 +567,11 @@ static void __init kernel_param_sysfs_se
> > >  	kobject_set_name(&mk->kobj, name);
> > >  	kobject_init(&mk->kobj);
> > >  	ret = kobject_add(&mk->kobj);
> > > -	BUG_ON(ret < 0);
> > > +	if (ret) {
> > > +		printk(KERN_ERR "module '%s' failed to be added to sysfs, "
> > > +			"the system will be unstable now.\n", name);
> > > +		return;
> > > +	}
> >
> > It would be nice to print the value of `ret' too.

What I'm surprised about is that %eax doesn't seem to contain the
return value `ret' of kobject_add(). It's 1, which is funny, given:

	ret = kobject_add(&mk->kobj);
	BUG_ON(ret < 0);

One wouldn't expect BUG() -- or the corresponding exception handler --
to clobber registers, that would be a sad day.

> Ok, how about this version:
>
> ---
>  kernel/params.c |    7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> --- a/kernel/params.c
> +++ b/kernel/params.c
> @@ -567,7 +567,12 @@ static void __init kernel_param_sysfs_se
>  	kobject_set_name(&mk->kobj, name);
>  	kobject_init(&mk->kobj);
>  	ret = kobject_add(&mk->kobj);
> -	BUG_ON(ret < 0);
> +	if (ret) {
> +		printk(KERN_ERR "Module '%s' failed to be added to sysfs, "
> +			"error number %d\n", name, ret);
> +		printk(KERN_ERR "The system will be unstable now.\n");
> +		return;
> +	}
>  	param_sysfs_setup(mk, kparam, num_params, name_skip);
>  	kobject_uevent(&mk->kobj, KOBJ_ADD);
>  }

I'm building with this:

	if (ret) {
		printk("~ .%s.%d.%s. ~\n", name, ret, kparam->name);
		return;
	}

To also print out the evil kparam->name that caused us to crash.
When ret == EINVAL, name would be "", so not so helpful alone.

Also enabling netconsole, though I'm sure there's zero chances of
NET / ethXXX / netconsole being up _this_ early in the boot ...

Will keep you guys posted :-)

Satyam
Re: [PATCH 4/5] [V2] Define is_global_init() and is_container_init()
Andrew Morton [EMAIL PROTECTED] wrote:
| On Thu, 19 Jul 2007 00:21:58 -0700
| [EMAIL PROTECTED] wrote:
|
| > --- lx26-22-rc6-mm1a.orig/kernel/pid.c 2007-07-16 12:55:15.0 -0700
| > +++ lx26-22-rc6-mm1a/kernel/pid.c 2007-07-16 13:10:48.0 -0700
| > @@ -69,6 +69,13 @@ struct pid_namespace init_pid_ns = {
| >  	.last_pid = 0,
| >  	.child_reaper = &init_task
| >  };
| > +EXPORT_SYMBOL(init_pid_ns);
| > +
| > +int is_global_init(struct task_struct *tsk)
| > +{
| > +	return tsk == init_pid_ns.child_reaper;
| > +}
| > +EXPORT_SYMBOL(is_global_init);
|
| I don't immediately see why init_pid_ns was exported to modules.
|
| It would need to be exported if is_global_init() was made static inline in a
| header (which seems like a sensible thing to do), but it wasn't.

It did not need to be exported in this patch. I have a couple of
follow-on patches that cleaned up some header-file dependencies and
made is_global_init() inline. Those patches are changing a bit as I
merge them with Pavel Emelianov's pid ns changes.

I will send a separate patch to inline is_global_init().

Suka
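The point about exporting can be sketched in userspace: once is_global_init() is an inline in a header, every user of that header refers to init_pid_ns directly, so the symbol has to be visible to them. This is a rough illustration only; the struct definitions are simplified stand-ins for the kernel types, and the inline body simply mirrors the function in the quoted patch rather than any follow-on patch.

```c
/* Simplified stand-ins for the kernel types; field names follow the
 * quoted patch, everything else is an assumption. */
struct task_struct {
	int pid;
};

struct pid_namespace {
	int last_pid;
	struct task_struct *child_reaper;
};

/* In the kernel these objects live in .c files; an inline helper in a
 * header refers to init_pid_ns from every user, modules included. */
static struct task_struct init_task = { .pid = 1 };

static struct pid_namespace init_pid_ns = {
	.last_pid = 0,
	.child_reaper = &init_task,
};

/* What a header-inline variant of is_global_init() might look like. */
static inline int is_global_init(const struct task_struct *tsk)
{
	return tsk == init_pid_ns.child_reaper;
}
```

Calling is_global_init(&init_task) yields 1, and any other task yields 0. With the out-of-line version in the quoted patch, only is_global_init itself would need to be visible to callers.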
Re: [RFC, Announce] Unified x86 architecture, arch/x86
On 7/20/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Jeff Garzik <[EMAIL PROTECTED]> wrote:
>
> > I agree with Andi... it's quite nice to be able to leave some
> > arch/i386 stuff, and not carry it over to arch/x86-64.
>
> we can leave those few items in arch/x86 just as much. No need to keep
> around a legacy tree for that.

How about making all files and directories take _32 or _64 in the name,
except the files or dirs that are shared?

For example:
k8_bus.c is only needed by 64 ===> change it to k8_bus_64.c
mach-generic ===> mach-generic_32

YH
Re: PROBLEM: Dell Inspiron 1501 fails to boot in 2.6.21+
On 7/20/07, Mark Tiefenbruck <[EMAIL PROTECTED]> wrote:
> I'd appreciate any help on getting this report sent to the appropriate
> list and, of course, getting this fixed. I don't know what's useful, so
> you're getting everything. This will be a very long e-mail.
>
> My new laptop won't boot with kernel versions 2.6.21 or 2.6.22. No
> oops. No panic. It just stops printing messages. Maybe it would
> eventually continue if I wait long enough, but it's unacceptable either
> way.
>
> I include below the contents of dmesg for a working kernel up to the
> point where it halts. I'm also including what it usually does for a few
> lines after that point.
>
> I did git-bisect on the 2.6.21.y tree. I'm including the result of that
> as well. It mentions HPET, so I should mention my computer also fails to
> boot when I enable HPET in my BIOS. I don't have the details of this
> currently; I can reproduce it again if needed.
>
> I've also included my kernel configuration and ver_linux output. You'll
> notice that my gcc version is 4.2.0, but this also happens with 4.1.2.
>
> I'm including /proc/cpuinfo and lspci -vvv.
> I'm including /proc/ioports and /proc/iomem.
> I don't have a /proc/scsi.
>
> Thanks,
> Mark
>
> Here's the commit that causes the problem:
>
> e9e2cdb412412326c4827fc78ba27f410d837e6e is first bad commit
> commit e9e2cdb412412326c4827fc78ba27f410d837e6e
> Author: Thomas Gleixner <[EMAIL PROTECTED]>
> Date: Fri Feb 16 01:28:04 2007 -0800
>
>     [PATCH] clockevents: i386 drivers
>
>     Add clockevent drivers for i386: lapic (local) and PIT/HPET (global).
>     Update the timer IRQ to call into the PIT/HPET driver's event handler
>     and the lapic-timer IRQ to call into the lapic clockevent driver. The
>     assignement of timer functionality is delegated to the core framework
>     code and replaces the compile and runtime evalution in
>     do_timer_interrupt_hook()
>
>     Use the clockevents broadcast support and implement the
>     lapic_broadcast function for ACPI.
>
>     No changes to existing functionality.
>
>     [ kdump fix from Vivek Goyal <[EMAIL PROTECTED]> ]
>     [ fixes based on review feedback from Arjan van de Ven <[EMAIL PROTECTED]> ]
>     Cleanups-from: Adrian Bunk <[EMAIL PROTECTED]>
>     Build-fixes-from: Andrew Morton <[EMAIL PROTECTED]>
>     Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
>     Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
>     Cc: john stultz <[EMAIL PROTECTED]>
>     Cc: Roman Zippel <[EMAIL PROTECTED]>
>     Cc: Andi Kleen <[EMAIL PROTECTED]>
>     Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
>     Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

As a wild guess, I'd bet that the rcu queues are failing to get called
(probably some problem with the timer interrupt in the APs?), thus
preventing the system to get into a quiescent state. It does seem timer
related to me.

Maybe one of the timer gurus have any other word on this?

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."
Re: [RFC, Announce] Unified x86 architecture, arch/x86
On 7/20/07, Steven Rostedt <[EMAIL PROTECTED]> wrote:
> > I really like the idea of a unified source tree for the 2 x86 variants.
> > The technical differences are really small (of course there are
> > differences, especially in the boot sequence), and striving to unify as
> > much as possible while having a clean way to do per 32/64 bit parts as
> > well is something that imo is the right thing.
> >
>
> Not to mention all the paravirt stuff that's going on. Having a single
> x86 arch to work with would be greatly beneficial to the work being done
> to port paravirt to x86_64.

As for paravirt, it'd really help. As I had the tree lagged behind by so
much, a great part of the work now is checking where i386 is, seeing if
it applies for 64-bit, and so on. The differences are not so huge, and
I'm trying my best to not let them deviate too much. It could mostly be
built incrementally.

And I bet a huge part of the tree could be like this too: in most
places, they are different for no particular reason, just because two
people implemented it separately. It would take a huge effort to bring
those differences to an end, but I think it'd pay off in future
development speed (not to mention the duplicate bugs Linus has already
talked about).

Way to go, Thomas and Ingo! I am pretty much for it too.

-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."
Re: net/ipv4/inetpeer.c stack warnings
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Thu, 19 Jul 2007 14:48:59 +0200

> Gabriel C wrote:
> > Hello ,
> >
> > I noticed on current git this warning in net/ipv4/inetpeer.c
>
> Yeah, I have no idea why the gcc people thought that this was
> something worth warning about. Especially since explicitly
> checking for != NULL silences the warning again.

Sigh, applied :-)
Re: film at 11: kernel update breaks udev.
On 7/21/07, Kay Sievers <[EMAIL PROTECTED]> wrote: On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote: > On Sat, Jul 21, 2007 at 03:28:12AM +0200, Kay Sievers wrote: > > On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote: > > > On Sat, Jul 21, 2007 at 03:09:55AM +0200, Kay Sievers wrote: > > > > On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote: > > > > > Just one of my machines to 2.6.22.1, and got this during boot.. > > > > > > > > > > Starting udev: udevd-event[619]: udev_node_symlink: symlink(../../sdc/dev/disk/by-uuid/2d773baf-8174-10a6-14db-a78e0e676e89) failed: File exists > > > > > > > > > > Under 2.6.21, all was fine. > > > > > > > > > > sdc is one disk of a 3 disk raid5 set. > > > > > The raidset still manages to come up despite this. > > > > > > > > > > This is a Fedora 7 box, with udev-106-4.1.fc7 > > > > > > > > > > What changed this time? > > > > > > > > CONFIG_BLK_DEV_BSG=y? > > > > > > > > There's a name-clash, because bsg tries to create devices with the same name. > > > > James sent a patch, it's on lkml. > > > > > > BSG isn't in 2.6.22 > > > > Ok. There has nothing else changed, that I could think of what could cause this. > > > > The code in udev that prints this message looks like: > >err("symlink(%s, %s) failed: %s", linktarget, filename, strerror(errno)); > > > > That doesn't really match what you posted. Are there chars missing? > > Umm. Now I'm confused. Note above that it's talking about sdc. > /dev/disk/by-uuid/ contains .. 
> > lrwxrwxrwx 1 root root 9 2007-07-17 20:35 2d773baf-8174-10a6-14db-a78e0e676e89 -> ../../sdd > lrwxrwxrwx 1 root root 10 2007-07-20 18:44 3B69-1AFD -> ../../sdl1 > lrwxrwxrwx 1 root root 10 2007-07-20 19:06 46A1-3FCB -> ../../sdi1 > lrwxrwxrwx 1 root root 10 2007-07-17 20:35 4e728818-fcf1-21ee-07a5-302b72bc6129 -> ../../sdc1 > lrwxrwxrwx 1 root root 10 2007-07-17 20:35 5f435361-5797-4a8c-a285-c72fa455d401 -> ../../sda1 > lrwxrwxrwx 1 root root 9 2007-07-17 20:35 9502a546-dd98-41df-8916-45032d801b69 -> ../../md0 > lrwxrwxrwx 1 root root 10 2007-07-17 20:35 ed102ac9-5615-c34b-5fe7-1a9029705ebf -> ../../sda2 > > note that uuid matches sdd instead. > > > And what does: > > udevtest /block/sdc > > print? > > parse_file: reading '/etc/udev/rules.d/05-udev-early.rules' as rules file > parse_file: reading '/etc/udev/rules.d/40-multipath.rules' as rules file > parse_file: reading '/etc/udev/rules.d/50-udev.rules' as rules file > parse_file: reading '/etc/udev/rules.d/60-libsane.rules' as rules file > parse_file: reading '/etc/udev/rules.d/60-net.rules' as rules file > parse_file: reading '/etc/udev/rules.d/60-pcmcia.rules' as rules file > parse_file: reading '/etc/udev/rules.d/60-wacom.rules' as rules file > parse_file: reading '/etc/udev/rules.d/85-pcscd_ccid.rules' as rules file > parse_file: reading '/etc/udev/rules.d/85-pcscd_egate.rules' as rules file > parse_file: reading '/etc/udev/rules.d/90-alsa.rules' as rules file > parse_file: reading '/etc/udev/rules.d/90-hal.rules' as rules file > parse_file: reading '/etc/udev/rules.d/95-pam-console.rules' as rules file > parse_file: reading '/etc/udev/rules.d/bluetooth.rules' as rules file > This program is for debugging only, it does not create any node, > or run any program specified by a RUN key. It may show incorrect results, > if rules match against subsystem specfic kernel event variables. 
> > main: looking at device '/block/sdc' from subsystem 'block' > run_program: '/bin/bash -c '/sbin/lsmod | /bin/grep ^dm_multipath'' > run_program: '/bin/bash' (stdout) 'dm_multipath 28889 0 ' > run_program: '/bin/bash' returned with status 0 > run_program: '/lib/udev/usb_id -x' > run_program: '/lib/udev/usb_id' returned with status 1 > run_program: '/lib/udev/scsi_id -g -x -s /block/sdc -d /dev/.tmp-8-32' > run_program: '/lib/udev/scsi_id' (stdout) 'ID_VENDOR=ATA' > run_program: '/lib/udev/scsi_id' (stdout) 'ID_MODEL=WDC_WD2500KS-00M' > run_program: '/lib/udev/scsi_id' (stdout) 'ID_REVISION=02.0' > run_program: '/lib/udev/scsi_id' (stdout) 'ID_SERIAL=SATA_WDC_WD2500KS-00_WD-WCANK6187088' > run_program: '/lib/udev/scsi_id' (stdout) 'ID_SERIAL_SHORT=WD-WCANK6187088' > run_program: '/lib/udev/scsi_id' (stdout) 'ID_TYPE=disk' > run_program: '/lib/udev/scsi_id' (stdout) 'ID_BUS=scsi' > run_program: '/lib/udev/scsi_id' returned with status 0 > udev_rules_get_name: add symlink 'disk/by-id/scsi-SATA_WDC_WD2500KS-00_WD-WCANK6187088' > run_program: '/lib/udev/path_id /block/sdc' > run_program: '/lib/udev/path_id' (stdout) 'ID_PATH=pci-:05:05.0-scsi-0:0:0:0' > run_program: '/lib/udev/path_id' returned with status 0 > udev_rules_get_name: add symlink 'disk/by-path/pci-:05:05.0-scsi-0:0:0:0' > run_program: '/lib/udev/vol_id --export /dev/.tmp-8-32' > run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_USAGE=raid' > run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_TYPE=linux_raid_member' > run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_VERSION=0.90.0' >
Re: [RFC, Announce] Unified x86 architecture, arch/x86
On Sat, 21 Jul 2007, Arnd Bergmann wrote:
> On Saturday 21 July 2007, Thomas Gleixner wrote:
>
> In my experience, it's very helpful to have a single set of header
> files, and merging the two versions of one header usually exposes
> bugs that have been fixed in only one of the two, so you get
> to fix actual bugs in the process.

This can still be done after the merge tglx did.

>
> In the s390 merge, I also started out in an attempt to guarantee
> unchanged object files, much like what you describe. However, it
> turned out that fixing it in the process is actually easier.
> Either way, 'diff -D __x86_64__' is a great tool for a start, you
> should try it out to see how easy it is to merge a lot of files.
>
> To put it into perspective, I think the s390 merge was a lot easier
> than the x86 merge, because there is only a very limited set of
> hardware configurations for s390 compared to others. We ended up
> doing the full merge with three people within less than a week
> and no separate files at all.

This is the big reason they wanted to keep it binary identical, since
there are just way too many different configs out there in the x86 world.

>
> OTOH, the powerpc merge is now going into its third year, mostly
> because it was started with the intention to remove all cruft
> in the process and to only allow sane code into the new architecture.

I'd expect x86 to move much faster, just because there are more
developers and users of x86 PCs than there are for powerpc.

>
> The steps that I'd suggest instead are:
>
> * merge all exported header files of the two architectures. This
>   alone is a worthy goal, because it allows us to get rid of
>   the ugly code for deciding which version to use in installed
>   headers and elsewhere.

I don't see why this can't be done after the first "Big" merge.

>
> * Create an arch/x86/Makefile that descends into ../i386/* and
>   ../x86_64/* instead of its subdirectories.

The thing that Thomas pointed out is that the physical location of the
source actually does matter. Having two files side by side with the same
name except for a _32.c and _64.c makes a developer want to merge them.

A perfect example is looking at both arch/x86/kernel/module_{32,64}.c.
One would be encouraged to make that into a single file. But having an
arch/i386/kernel/module.c and an arch/x86_64/kernel/module.c would take
some time before anyone would care.

>
> * Merge the arch/x86/* subdirectories, one at a time, starting with
>   the low-hanging fruit like oprofile or pci, and do the hard
>   ones like mm and kernel last.

You're looking at a 10-year-plus merge with that approach. I think that
is exactly what Ingo and Thomas _don't_ want. Doing it the big bang way,
as posted in this patch, is the fastest way to get where we want to go.

>
> Unfortunately, I don't think I'll spend much time on this, so I
> don't get to decide on it, but you asked for feedback ;-)
>

I'm actually looking forward to helping out here ;-)

-- Steve
Re: film at 11: kernel update breaks udev.
On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote: On Sat, Jul 21, 2007 at 03:28:12AM +0200, Kay Sievers wrote: > On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote: > > On Sat, Jul 21, 2007 at 03:09:55AM +0200, Kay Sievers wrote: > > > On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote: > > > > Just one of my machines to 2.6.22.1, and got this during boot.. > > > > > > > > Starting udev: udevd-event[619]: udev_node_symlink: symlink(../../sdc/dev/disk/by-uuid/2d773baf-8174-10a6-14db-a78e0e676e89) failed: File exists > > > > > > > > Under 2.6.21, all was fine. > > > > > > > > sdc is one disk of a 3 disk raid5 set. > > > > The raidset still manages to come up despite this. > > > > > > > > This is a Fedora 7 box, with udev-106-4.1.fc7 > > > > > > > > What changed this time? > > > > > > CONFIG_BLK_DEV_BSG=y? > > > > > > There's a name-clash, because bsg tries to create devices with the same name. > > > James sent a patch, it's on lkml. > > > > BSG isn't in 2.6.22 > > Ok. There has nothing else changed, that I could think of what could cause this. > > The code in udev that prints this message looks like: >err("symlink(%s, %s) failed: %s", linktarget, filename, strerror(errno)); > > That doesn't really match what you posted. Are there chars missing? Umm. Now I'm confused. Note above that it's talking about sdc. /dev/disk/by-uuid/ contains .. lrwxrwxrwx 1 root root 9 2007-07-17 20:35 2d773baf-8174-10a6-14db-a78e0e676e89 -> ../../sdd lrwxrwxrwx 1 root root 10 2007-07-20 18:44 3B69-1AFD -> ../../sdl1 lrwxrwxrwx 1 root root 10 2007-07-20 19:06 46A1-3FCB -> ../../sdi1 lrwxrwxrwx 1 root root 10 2007-07-17 20:35 4e728818-fcf1-21ee-07a5-302b72bc6129 -> ../../sdc1 lrwxrwxrwx 1 root root 10 2007-07-17 20:35 5f435361-5797-4a8c-a285-c72fa455d401 -> ../../sda1 lrwxrwxrwx 1 root root 9 2007-07-17 20:35 9502a546-dd98-41df-8916-45032d801b69 -> ../../md0 lrwxrwxrwx 1 root root 10 2007-07-17 20:35 ed102ac9-5615-c34b-5fe7-1a9029705ebf -> ../../sda2 note that uuid matches sdd instead. 
> And what does: > udevtest /block/sdc > print? parse_file: reading '/etc/udev/rules.d/05-udev-early.rules' as rules file parse_file: reading '/etc/udev/rules.d/40-multipath.rules' as rules file parse_file: reading '/etc/udev/rules.d/50-udev.rules' as rules file parse_file: reading '/etc/udev/rules.d/60-libsane.rules' as rules file parse_file: reading '/etc/udev/rules.d/60-net.rules' as rules file parse_file: reading '/etc/udev/rules.d/60-pcmcia.rules' as rules file parse_file: reading '/etc/udev/rules.d/60-wacom.rules' as rules file parse_file: reading '/etc/udev/rules.d/85-pcscd_ccid.rules' as rules file parse_file: reading '/etc/udev/rules.d/85-pcscd_egate.rules' as rules file parse_file: reading '/etc/udev/rules.d/90-alsa.rules' as rules file parse_file: reading '/etc/udev/rules.d/90-hal.rules' as rules file parse_file: reading '/etc/udev/rules.d/95-pam-console.rules' as rules file parse_file: reading '/etc/udev/rules.d/bluetooth.rules' as rules file This program is for debugging only, it does not create any node, or run any program specified by a RUN key. It may show incorrect results, if rules match against subsystem specfic kernel event variables. 
main: looking at device '/block/sdc' from subsystem 'block' run_program: '/bin/bash -c '/sbin/lsmod | /bin/grep ^dm_multipath'' run_program: '/bin/bash' (stdout) 'dm_multipath 28889 0 ' run_program: '/bin/bash' returned with status 0 run_program: '/lib/udev/usb_id -x' run_program: '/lib/udev/usb_id' returned with status 1 run_program: '/lib/udev/scsi_id -g -x -s /block/sdc -d /dev/.tmp-8-32' run_program: '/lib/udev/scsi_id' (stdout) 'ID_VENDOR=ATA' run_program: '/lib/udev/scsi_id' (stdout) 'ID_MODEL=WDC_WD2500KS-00M' run_program: '/lib/udev/scsi_id' (stdout) 'ID_REVISION=02.0' run_program: '/lib/udev/scsi_id' (stdout) 'ID_SERIAL=SATA_WDC_WD2500KS-00_WD-WCANK6187088' run_program: '/lib/udev/scsi_id' (stdout) 'ID_SERIAL_SHORT=WD-WCANK6187088' run_program: '/lib/udev/scsi_id' (stdout) 'ID_TYPE=disk' run_program: '/lib/udev/scsi_id' (stdout) 'ID_BUS=scsi' run_program: '/lib/udev/scsi_id' returned with status 0 udev_rules_get_name: add symlink 'disk/by-id/scsi-SATA_WDC_WD2500KS-00_WD-WCANK6187088' run_program: '/lib/udev/path_id /block/sdc' run_program: '/lib/udev/path_id' (stdout) 'ID_PATH=pci-:05:05.0-scsi-0:0:0:0' run_program: '/lib/udev/path_id' returned with status 0 udev_rules_get_name: add symlink 'disk/by-path/pci-:05:05.0-scsi-0:0:0:0' run_program: '/lib/udev/vol_id --export /dev/.tmp-8-32' run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_USAGE=raid' run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_TYPE=linux_raid_member' run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_VERSION=0.90.0' run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_UUID=2d773baf-8174-10a6-14db-a78e0e676e89' run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_LABEL=' run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_LABEL_SAFE=' run_program:
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
On Fri, Jul 20, 2007 at 06:37:33PM -0700, Andrew Morton wrote:
> On Fri, 20 Jul 2007 18:02:57 -0700 Greg KH <[EMAIL PROTECTED]> wrote:
>
> > --- a/kernel/params.c
> > +++ b/kernel/params.c
> > @@ -567,7 +567,11 @@ static void __init kernel_param_sysfs_se
> >  	kobject_set_name(&mk->kobj, name);
> >  	kobject_init(&mk->kobj);
> >  	ret = kobject_add(&mk->kobj);
> > -	BUG_ON(ret < 0);
> > +	if (ret) {
> > +		printk(KERN_ERR "module '%s' failed to be added to sysfs, "
> > +			"the system will be unstable now.\n", name);
> > +		return;
> > +	}
>
> It would be nice to print the value of `ret' too.

Ok, how about this version:

---
 kernel/params.c |    7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/kernel/params.c
+++ b/kernel/params.c
@@ -567,7 +567,12 @@ static void __init kernel_param_sysfs_se
 	kobject_set_name(&mk->kobj, name);
 	kobject_init(&mk->kobj);
 	ret = kobject_add(&mk->kobj);
-	BUG_ON(ret < 0);
+	if (ret) {
+		printk(KERN_ERR "Module '%s' failed to be added to sysfs, "
+			"error number %d\n", name, ret);
+		printk(KERN_ERR "The system will be unstable now.\n");
+		return;
+	}
 	param_sysfs_setup(mk, kparam, num_params, name_skip);
 	kobject_uevent(&mk->kobj, KOBJ_ADD);
 }
Re: [PATCH] AFS: Fix file locking
On Fri, 20 Jul 2007, Nick Piggin wrote:
>
> So you did. Then to answer that, yes it could be faster because there are
> stupid volatiles sprinkled all over the bitops code so you could easily
> end up having to do more loads. Does it make a real difference? Unlikely,
> but David loves counting cycles :)

I thought we long long since removed the volatiles. They are buggy and
horrible, and we really want to let the compiler combine multiple
test-bits, and if they matter that implies locking is buggy or something
worse..

Ie we'd *want*

	if (test_bit(x, y) || test_bit(z,y))

to be rewritten by the compiler as testing bits x/z at the same time.

But now I'm too scared to look.

		Linus
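A rough userspace sketch of the combining described above, assuming both bits live in the same word. test_bit_plain() is a hypothetical non-volatile stand-in, not the kernel's test_bit(), and whether a given compiler actually merges the two tests depends on the compiler and options:

```c
#include <limits.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical non-volatile test_bit(): an ordinary load that the
 * compiler is free to merge with neighbouring loads of the same word. */
static inline int test_bit_plain(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG_SKETCH] >>
		(nr % BITS_PER_LONG_SKETCH)) & 1UL;
}

/* The form written in the mail: two separate bit tests. */
static inline int either_bit_set(unsigned int x, unsigned int z,
				 const unsigned long *y)
{
	return test_bit_plain(x, y) || test_bit_plain(z, y);
}

/* The combined form: one load and one mask test. This sketch assumes
 * x and z are both smaller than the number of bits in a long. */
static inline int either_bit_set_combined(unsigned int x, unsigned int z,
					  const unsigned long *y)
{
	unsigned long mask = (1UL << x) | (1UL << z);

	return (y[0] & mask) != 0;
}
```

Both helpers return the same answer for any word. With a volatile-qualified access in test_bit_plain(), the compiler would have to perform each load separately and could not fold the first form into the second.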
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
On Fri, 20 Jul 2007 18:02:57 -0700 Greg KH <[EMAIL PROTECTED]> wrote:

> --- a/kernel/params.c
> +++ b/kernel/params.c
> @@ -567,7 +567,11 @@ static void __init kernel_param_sysfs_se
>  	kobject_set_name(&mk->kobj, name);
>  	kobject_init(&mk->kobj);
>  	ret = kobject_add(&mk->kobj);
> -	BUG_ON(ret < 0);
> +	if (ret) {
> +		printk(KERN_ERR "module '%s' failed to be added to sysfs, "
> +			"the system will be unstable now.\n", name);
> +		return;
> +	}

It would be nice to print the value of `ret' too.
Re: [PATCH] hugetlbfs read() support
(sorry if this is a resend... something bad seems to have happened to me)

Andrew Morton wrote:
> On Thu, 19 Jul 2007 08:51:49 -0700 Badari Pulavarty <[EMAIL PROTECTED]> wrote:
>
> > > This code doesn't have all the ghastly tricks which we deploy to
> > > handle concurrent truncate.
> >
> > Do I need to ? Baaahh!! I don't want to deal with them.
>
> Nick, can you think of any serious consequences of a read/truncate race
> in there? I can't..

As it doesn't allow writes, then I _think_ it should be OK. If you
ever did want to add write(2) support, then you would have transient
zeroes problems.

But I'm not completely sure.. we've had a lot of (and still have some
known and probably unknown) bugs just in that single generic_mapping_read
function, most of which are due to our rabid aversion to doing any
locking whatsoever there.

So why not just hold i_mutex around the whole thing to be safe?

-- 
SUSE Labs, Novell Inc.
Re: film at 11: kernel update breaks udev.
On Sat, Jul 21, 2007 at 03:28:12AM +0200, Kay Sievers wrote:
 > On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote:
 > > On Sat, Jul 21, 2007 at 03:09:55AM +0200, Kay Sievers wrote:
 > > > On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote:
 > > > > Just one of my machines to 2.6.22.1, and got this during boot..
 > > > >
 > > > > Starting udev: udevd-event[619]: udev_node_symlink: symlink(../../sdc/dev/disk/by-uuid/2d773baf-8174-10a6-14db-a78e0e676e89) failed: File exists
 > > > >
 > > > > Under 2.6.21, all was fine.
 > > > >
 > > > > sdc is one disk of a 3 disk raid5 set.
 > > > > The raidset still manages to come up despite this.
 > > > >
 > > > > This is a Fedora 7 box, with udev-106-4.1.fc7
 > > > >
 > > > > What changed this time?
 > > >
 > > > CONFIG_BLK_DEV_BSG=y?
 > > >
 > > > There's a name-clash, because bsg tries to create devices with the same name.
 > > > James sent a patch, it's on lkml.
 > >
 > > BSG isn't in 2.6.22
 >
 > Ok. There has nothing else changed, that I could think of what could cause this.
 >
 > The code in udev that prints this message looks like:
 >    err("symlink(%s, %s) failed: %s", linktarget, filename, strerror(errno));
 >
 > That doesn't really match what you posted. Are there chars missing?

Umm. Now I'm confused. Note above that it's talking about sdc.
/dev/disk/by-uuid/ contains ..

lrwxrwxrwx 1 root root 9 2007-07-17 20:35 2d773baf-8174-10a6-14db-a78e0e676e89 -> ../../sdd
lrwxrwxrwx 1 root root 10 2007-07-20 18:44 3B69-1AFD -> ../../sdl1
lrwxrwxrwx 1 root root 10 2007-07-20 19:06 46A1-3FCB -> ../../sdi1
lrwxrwxrwx 1 root root 10 2007-07-17 20:35 4e728818-fcf1-21ee-07a5-302b72bc6129 -> ../../sdc1
lrwxrwxrwx 1 root root 10 2007-07-17 20:35 5f435361-5797-4a8c-a285-c72fa455d401 -> ../../sda1
lrwxrwxrwx 1 root root 9 2007-07-17 20:35 9502a546-dd98-41df-8916-45032d801b69 -> ../../md0
lrwxrwxrwx 1 root root 10 2007-07-17 20:35 ed102ac9-5615-c34b-5fe7-1a9029705ebf -> ../../sda2

note that uuid matches sdd instead.

 > And what does:
 >   udevtest /block/sdc
 > print?

parse_file: reading '/etc/udev/rules.d/05-udev-early.rules' as rules file
parse_file: reading '/etc/udev/rules.d/40-multipath.rules' as rules file
parse_file: reading '/etc/udev/rules.d/50-udev.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-libsane.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-net.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-pcmcia.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-wacom.rules' as rules file
parse_file: reading '/etc/udev/rules.d/85-pcscd_ccid.rules' as rules file
parse_file: reading '/etc/udev/rules.d/85-pcscd_egate.rules' as rules file
parse_file: reading '/etc/udev/rules.d/90-alsa.rules' as rules file
parse_file: reading '/etc/udev/rules.d/90-hal.rules' as rules file
parse_file: reading '/etc/udev/rules.d/95-pam-console.rules' as rules file
parse_file: reading '/etc/udev/rules.d/bluetooth.rules' as rules file
This program is for debugging only, it does not create any node,
or run any program specified by a RUN key. It may show incorrect results,
if rules match against subsystem specfic kernel event variables.

main: looking at device '/block/sdc' from subsystem 'block'
run_program: '/bin/bash -c '/sbin/lsmod | /bin/grep ^dm_multipath''
run_program: '/bin/bash' (stdout) 'dm_multipath 28889 0 '
run_program: '/bin/bash' returned with status 0
run_program: '/lib/udev/usb_id -x'
run_program: '/lib/udev/usb_id' returned with status 1
run_program: '/lib/udev/scsi_id -g -x -s /block/sdc -d /dev/.tmp-8-32'
run_program: '/lib/udev/scsi_id' (stdout) 'ID_VENDOR=ATA'
run_program: '/lib/udev/scsi_id' (stdout) 'ID_MODEL=WDC_WD2500KS-00M'
run_program: '/lib/udev/scsi_id' (stdout) 'ID_REVISION=02.0'
run_program: '/lib/udev/scsi_id' (stdout) 'ID_SERIAL=SATA_WDC_WD2500KS-00_WD-WCANK6187088'
run_program: '/lib/udev/scsi_id' (stdout) 'ID_SERIAL_SHORT=WD-WCANK6187088'
run_program: '/lib/udev/scsi_id' (stdout) 'ID_TYPE=disk'
run_program: '/lib/udev/scsi_id' (stdout) 'ID_BUS=scsi'
run_program: '/lib/udev/scsi_id' returned with status 0
udev_rules_get_name: add symlink 'disk/by-id/scsi-SATA_WDC_WD2500KS-00_WD-WCANK6187088'
run_program: '/lib/udev/path_id /block/sdc'
run_program: '/lib/udev/path_id' (stdout) 'ID_PATH=pci-:05:05.0-scsi-0:0:0:0'
run_program: '/lib/udev/path_id' returned with status 0
udev_rules_get_name: add symlink 'disk/by-path/pci-:05:05.0-scsi-0:0:0:0'
run_program: '/lib/udev/vol_id --export /dev/.tmp-8-32'
run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_USAGE=raid'
run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_TYPE=linux_raid_member'
run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_VERSION=0.90.0'
run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_UUID=2d773baf-8174-10a6-14db-a78e0e676e89'
run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_LABEL='
run_program: '/lib/udev/vol_id' (stdout) 'ID_FS_LABEL_SAFE='
run_program: '/lib/udev/vol_id' returned with
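The "File exists" text in the udev message in this thread is the strerror() string for EEXIST, which symlink(2) returns when the link name is already taken (here, the by-uuid name that already points at sdd). A minimal userspace sketch of that failure mode, assuming a POSIX system with a writable /tmp; the helper name and scratch paths are hypothetical and this is not udev's code:

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a link in a scratch directory, then try to
 * create a second link under the same name with a different target.
 * Returns the errno of the second attempt (0 if it unexpectedly
 * succeeded) or -1 if the scratch setup itself failed. */
int second_symlink_errno(void)
{
	char dir[] = "/tmp/symlink-demo-XXXXXX";
	char link[64];
	int err = 0;

	if (!mkdtemp(dir))
		return -1;
	snprintf(link, sizeof(link), "%s/by-uuid-link", dir);

	/* First device claims the name; dangling targets are fine. */
	if (symlink("../../sdd", link) != 0) {
		rmdir(dir);
		return -1;
	}

	/* Second device with the same UUID tries to claim it too. */
	if (symlink("../../sdc", link) != 0) {
		err = errno;
		fprintf(stderr, "symlink(%s, %s) failed: %s\n",
			"../../sdc", link, strerror(err));
	}

	unlink(link);
	rmdir(dir);
	return err;
}
```

On a system where the first link can be created, the second call fails with EEXIST rather than replacing the link; a caller that wants the newer target would have to unlink the old name first.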
Re: [PATCH 2/3] i386: use x86_64's desc_def.h
* Rusty Russell ([EMAIL PROTECTED]) wrote: > On Thu, 2007-07-19 at 09:27 +1000, Rusty Russell wrote: > > On Wed, 2007-07-18 at 09:19 -0700, Zachary Amsden wrote: > > > > +#define GET_CONTENTS(desc) (((desc)->raw32.b >> 10) & 3) > > > > +#define GET_WRITABLE(desc) (((desc)->raw32.b >> 9) & 1) > > > > > > You got rid of the duplicate definitions here, but then added new > > > duplicates (GET_CONTENTS / WRITABLE). Can you stick them in desc.h? > > > > To be honest, I got sick of counting bits at this point, and didn't want > > to introduce bugs. > > > > Here's the updated version of PATCH 1/3: > > And 2/3: > === > i386: use x86_64's desc_def.h plus this needed as well now Index: linus-2.6/include/asm-i386/xen/hypercall.h === --- linus-2.6.orig/include/asm-i386/xen/hypercall.h +++ linus-2.6/include/asm-i386/xen/hypercall.h @@ -359,8 +359,8 @@ MULTI_update_descriptor(struct multicall mcl->op = __HYPERVISOR_update_descriptor; mcl->args[0] = maddr; mcl->args[1] = maddr >> 32; - mcl->args[2] = desc.a; - mcl->args[3] = desc.b; + mcl->args[2] = desc.raw32.a; + mcl->args[3] = desc.raw32.b; } static inline void Index: linus-2.6/drivers/lguest/interrupts_and_traps.c === --- linus-2.6.orig/drivers/lguest/interrupts_and_traps.c +++ linus-2.6/drivers/lguest/interrupts_and_traps.c @@ -103,9 +103,9 @@ void maybe_do_interrupt(struct lguest *l } idt = >idt[FIRST_EXTERNAL_VECTOR+irq]; - if (idt_present(idt->a, idt->b)) { + if (idt_present(idt->raw32.a, idt->raw32.b)) { clear_bit(irq, lg->irqs_pending); - set_guest_interrupt(lg, idt->a, idt->b, 0); + set_guest_interrupt(lg, idt->raw32.a, idt->raw32.b, 0); } } @@ -116,7 +116,7 @@ static int has_err(unsigned int trap) int deliver_trap(struct lguest *lg, unsigned int num) { - u32 lo = lg->idt[num].a, hi = lg->idt[num].b; + u32 lo = lg->idt[num].raw32.a, hi = lg->idt[num].raw32.b; if (!idt_present(lo, hi)) return 0; @@ -139,7 +139,7 @@ static int direct_trap(const struct lgue return 0; /* Interrupt gates (0xE) or not present (0x0) can't go 
direct. */ - return idt_type(trap->a, trap->b) == 0xF; + return idt_type(trap->raw32.a, trap->raw32.b) == 0xF; } void pin_stack_pages(struct lguest *lg) @@ -170,15 +170,15 @@ static void set_trap(struct lguest *lg, u8 type = idt_type(lo, hi); if (!idt_present(lo, hi)) { - trap->a = trap->b = 0; + trap->raw32.a = trap->raw32.b = 0; return; } if (type != 0xE && type != 0xF) kill_guest(lg, "bad IDT type %i", type); - trap->a = ((__KERNEL_CS|GUEST_PL)<<16) | (lo&0x); - trap->b = (hi&0xEF00); + trap->raw32.a = ((__KERNEL_CS|GUEST_PL)<<16) | (lo&0x); + trap->raw32.b = (hi&0xEF00); } void load_guest_idt_entry(struct lguest *lg, unsigned int num, u32 lo, u32 hi) @@ -204,8 +204,8 @@ static void default_idt_entry(struct des if (trap == LGUEST_TRAP_ENTRY) flags |= (GUEST_PL << 13); - idt->a = (LGUEST_CS<<16) | (handler&0x); - idt->b = (handler&0x) | flags; + idt->raw32.a = (LGUEST_CS<<16) | (handler&0x); + idt->raw32.b = (handler&0x) | flags; } void setup_default_idt_entries(struct lguest_ro_state *state, Index: linus-2.6/drivers/lguest/lg.h === --- linus-2.6.orig/drivers/lguest/lg.h +++ linus-2.6/drivers/lguest/lg.h @@ -44,8 +44,8 @@ void free_pagetables(void); int init_pagetables(struct page **switcher_page, unsigned int pages); /* Full 4G segment descriptors, suitable for CS and DS. 
*/ -#define FULL_EXEC_SEGMENT ((struct desc_struct){0x, 0x00cf9b00}) -#define FULL_SEGMENT ((struct desc_struct){0x, 0x00cf9300}) +#define FULL_EXEC_SEGMENT ((struct desc_struct){ {0x00cf9b00ULL} }) +#define FULL_SEGMENT ((struct desc_struct){ {0x00cf9300ULL} }) struct lguest_dma_info { Index: linus-2.6/drivers/lguest/lguest.c === --- linus-2.6.orig/drivers/lguest/lguest.c +++ linus-2.6/drivers/lguest/lguest.c @@ -173,7 +173,7 @@ static void lguest_load_idt(const struct struct desc_struct *idt = (void *)desc->address; for (i = 0; i < (desc->size+1)/8; i++) - hcall(LHCALL_LOAD_IDT_ENTRY, i, idt[i].a, idt[i].b); + hcall(LHCALL_LOAD_IDT_ENTRY, i, idt[i].raw32.a, idt[i].raw32.b); } static void lguest_load_gdt(const struct Xgt_desc_struct *desc) Index: linus-2.6/drivers/lguest/segments.c === ---
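For readers following the .a/.b to .raw32.a/.raw32.b conversion, here is a minimal userspace sketch of the idea, not the kernel's actual desc_def.h. The type desc_sketch and the helper make_desc() are hypothetical names. The two accessor macros are the ones quoted in the review. The low word 0x0000ffff used in the usage note is an assumption taken from the usual flat 4G segment encoding, since the archived patch text shows only "0x" there.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared descriptor type: the same 8 bytes
 * viewed either as one 64-bit value or as two 32-bit halves. */
struct desc_sketch {
	union {
		uint64_t raw;
		struct {
			uint32_t a;	/* low word */
			uint32_t b;	/* high word */
		} raw32;
	};
};

/* The two accessors quoted in the review, applied to the high word. */
#define GET_CONTENTS(desc)	(((desc)->raw32.b >> 10) & 3)
#define GET_WRITABLE(desc)	(((desc)->raw32.b >> 9) & 1)

/* Build a descriptor through the 64-bit view.  The assert documents the
 * little-endian assumption (true on x86) that lets raw32 see the halves. */
static struct desc_sketch make_desc(uint32_t lo, uint32_t hi)
{
	struct desc_sketch d;

	d.raw = ((uint64_t)hi << 32) | lo;
	assert(d.raw32.a == lo && d.raw32.b == hi);
	return d;
}
```

With make_desc(0x0000ffff, 0x00cf9b00), GET_CONTENTS() gives 2 and GET_WRITABLE() gives 1; with a high word of 0x00cf9300, GET_CONTENTS() gives 0. This is why a caller that used to write desc.a only needs the extra raw32 step after the change.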
Re: [PATCH 3/3] i386: Replace struct Xgt_desc_struct with struct desc_ptr
* Rusty Russell ([EMAIL PROTECTED]) wrote: > Remove i386's Xgt_desc_struct definition and use desc_def.h's desc_ptr. plus this is needed now Index: linus-2.6/drivers/lguest/lg.h === --- linus-2.6.orig/drivers/lguest/lg.h +++ linus-2.6/drivers/lguest/lg.h @@ -91,13 +91,13 @@ struct lguest_ro_state { /* Host information we need to restore when we switch back. */ u32 host_cr3; - struct Xgt_desc_struct host_idt_desc; - struct Xgt_desc_struct host_gdt_desc; + struct desc_ptr host_idt_desc; + struct desc_ptr host_gdt_desc; u32 host_sp; /* Fields which are used when guest is running. */ - struct Xgt_desc_struct guest_idt_desc; - struct Xgt_desc_struct guest_gdt_desc; + struct desc_ptr guest_idt_desc; + struct desc_ptr guest_gdt_desc; struct i386_hw_tss guest_tss; struct desc_struct guest_idt[IDT_ENTRIES]; struct desc_struct guest_gdt[GDT_ENTRIES]; Index: linus-2.6/arch/i386/xen/enlighten.c === --- linus-2.6.orig/arch/i386/xen/enlighten.c +++ linus-2.6/arch/i386/xen/enlighten.c @@ -301,7 +301,7 @@ static void xen_set_ldt(const void *addr xen_mc_issue(PARAVIRT_LAZY_CPU); } -static void xen_load_gdt(const struct Xgt_desc_struct *dtr) +static void xen_load_gdt(const struct desc_ptr *dtr) { unsigned long *frames; unsigned long va = dtr->address; @@ -401,7 +401,7 @@ static int cvt_gate_to_trap(int vector, } /* Locations of each CPU's IDT */ -static DEFINE_PER_CPU(struct Xgt_desc_struct, idt_desc); +static DEFINE_PER_CPU(struct desc_ptr, idt_desc); /* Set an IDT entry. If the entry is part of the current IDT, then also update Xen. 
*/ @@ -433,7 +433,7 @@ static void xen_write_idt_entry(struct d preempt_enable(); } -static void xen_convert_trap_info(const struct Xgt_desc_struct *desc, +static void xen_convert_trap_info(const struct desc_ptr *desc, struct trap_info *traps) { unsigned in, out, count; @@ -452,7 +452,7 @@ static void xen_convert_trap_info(const void xen_copy_trap_info(struct trap_info *traps) { - const struct Xgt_desc_struct *desc = &__get_cpu_var(idt_desc); + const struct desc_ptr *desc = &__get_cpu_var(idt_desc); xen_convert_trap_info(desc, traps); } @@ -460,7 +460,7 @@ void xen_copy_trap_info(struct trap_info /* Load a new IDT into Xen. In principle this can be per-CPU, so we hold a spinlock to protect the static traps[] array (static because it avoids allocation, and saves stack space). */ -static void xen_load_idt(const struct Xgt_desc_struct *desc) +static void xen_load_idt(const struct desc_ptr *desc) { static DEFINE_SPINLOCK(lock); static struct trap_info traps[257]; Index: linus-2.6/drivers/lguest/lguest.c === --- linus-2.6.orig/drivers/lguest/lguest.c +++ linus-2.6/drivers/lguest/lguest.c @@ -167,7 +167,7 @@ static void lguest_write_idt_entry(struc hcall(LHCALL_LOAD_IDT_ENTRY, entrynum, low, high); } -static void lguest_load_idt(const struct Xgt_desc_struct *desc) +static void lguest_load_idt(const struct desc_ptr *desc) { unsigned int i; struct desc_struct *idt = (void *)desc->address; @@ -176,7 +176,7 @@ static void lguest_load_idt(const struct hcall(LHCALL_LOAD_IDT_ENTRY, i, idt[i].raw32.a, idt[i].raw32.b); } -static void lguest_load_gdt(const struct Xgt_desc_struct *desc) +static void lguest_load_gdt(const struct desc_ptr *desc) { BUG_ON((desc->size+1)/8 != GDT_ENTRIES); hcall(LHCALL_LOAD_GDT, __pa(desc->address), GDT_ENTRIES, 0); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
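The lguest hunks above depend on only two fields of the descriptor-table pointer, size and address. The following is a rough userspace sketch of that shape, assuming the usual limit-then-address layout. desc_ptr_sketch and desc_ptr_entries() are hypothetical names, not the kernel definitions.

```c
#include <stdint.h>

/* Hypothetical userspace mirror of a descriptor-table pointer: a 16-bit
 * limit (table size in bytes, minus one) followed by the table address. */
struct desc_ptr_sketch {
	uint16_t size;
	uintptr_t address;
} __attribute__((packed));

/* Each descriptor is 8 bytes, so the entry count is (size + 1) / 8,
 * the same expression lguest_load_idt() and lguest_load_gdt() use. */
static unsigned int desc_ptr_entries(const struct desc_ptr_sketch *desc)
{
	return (desc->size + 1u) / 8;
}
```

A 256-entry table therefore carries size = 256 * 8 - 1 = 2047. Since the rename does not touch either field, callers such as lguest_load_gdt() only need the new type name.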
Re: film at 11: kernel update breaks udev.
On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote:
> On Sat, Jul 21, 2007 at 03:09:55AM +0200, Kay Sievers wrote:
>  > On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote:
>  > > Just one of my machines to 2.6.22.1, and got this during boot..
>  > >
>  > > Starting udev: udevd-event[619]: udev_node_symlink: symlink(../../sdc/dev/disk/by-uuid/2d773baf-8174-10a6-14db-a78e0e676e89) failed: File exists
>  > >
>  > > Under 2.6.21, all was fine.
>  > >
>  > > sdc is one disk of a 3 disk raid5 set.
>  > > The raidset still manages to come up despite this.
>  > >
>  > > This is a Fedora 7 box, with udev-106-4.1.fc7
>  > >
>  > > What changed this time?
>  >
>  > CONFIG_BLK_DEV_BSG=y?
>  >
>  > There's a name-clash, because bsg tries to create devices with the same name.
>  > James sent a patch, it's on lkml.
>
> BSG isn't in 2.6.22

Ok. Nothing else has changed that I can think of that could cause
this.

The code in udev that prints this message looks like:
  err("symlink(%s, %s) failed: %s", linktarget, filename, strerror(errno));

That doesn't really match what you posted. Are there chars missing?
Can you please recheck?

And what does:
  udevtest /block/sdc
print?

Thanks,
Kay
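To make the mismatch concrete: the err() format quoted above takes two path arguments separated by ", ", while the boot message shows one unbroken path inside the parentheses. Below is a small sketch of what the format produces. format_symlink_error() is a hypothetical helper, not a udev function, and the example paths are made up.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format a failure message the way the quoted err() call does.
 * format_symlink_error() is a hypothetical helper, not a udev function. */
static int format_symlink_error(char *buf, size_t len, const char *linktarget,
				const char *filename, int err)
{
	return snprintf(buf, len, "symlink(%s, %s) failed: %s",
			linktarget, filename, strerror(err));
}
```

Called with "../../sdc", "/dev/disk/by-uuid/example" and EEXIST, the result starts with "symlink(../../sdc, /dev/disk/by-uuid/example) failed: ", so a message without the ", " separator has probably lost characters somewhere between udev and the report.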
Re: [PATCH] hugetlbfs read() support
Andrew Morton wrote:
> On Thu, 19 Jul 2007 08:51:49 -0700 Badari Pulavarty <[EMAIL PROTECTED]> wrote:
>>> This code doesn't have all the ghastly tricks which we deploy to handle
>>> concurrent truncate.
>>
>> Do I need to ? Baaahh!! I don't want to deal with them.
>
> Nick, can you think of any serious consequences of a read/truncate race in
> there? I can't..

As it doesn't allow writes, then I _think_ it should be OK. If you
ever did want to add write(2) support, then you would have transient
zeroes problems.

But I'm not completely sure.. we've had a lot of (and still have some
known and probably unknown) bugs just in that single generic_mapping_read
function, most of which are due to our rabid aversion to doing any
locking whatsoever there.

So why not just hold i_mutex around the whole thing to be safe?

-- 
SUSE Labs, Novell Inc.
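The "hold i_mutex around the whole thing" suggestion can be pictured with a userspace analogue. This is only a sketch under loud assumptions: file_sketch, sketch_read() and sketch_truncate() are hypothetical, a pthread mutex stands in for i_mutex, and nothing here is hugetlbfs code.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory "file"; lock plays the role i_mutex would. */
struct file_sketch {
	pthread_mutex_t lock;
	size_t size;
	char data[64];
};

/* Copy up to count bytes starting at pos.  The size cannot change while
 * we copy, because truncate takes the same lock. */
static size_t sketch_read(struct file_sketch *f, char *buf, size_t count,
			  size_t pos)
{
	size_t n = 0;

	pthread_mutex_lock(&f->lock);
	if (pos < f->size) {
		n = f->size - pos;
		if (n > count)
			n = count;
		memcpy(buf, f->data + pos, n);
	}
	pthread_mutex_unlock(&f->lock);
	return n;
}

/* Shrink the file; serialized against readers by the same lock. */
static void sketch_truncate(struct file_sketch *f, size_t new_size)
{
	pthread_mutex_lock(&f->lock);
	if (new_size < f->size)
		f->size = new_size;
	pthread_mutex_unlock(&f->lock);
}
```

The trade-off is the one the thread is weighing: one coarse lock makes the read/truncate ordering trivially safe, at the cost of serializing readers against each other.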
[PATCH 7/7] lguest: documentation pt VII: FIXMEs
Documentation: The FIXMEs Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- Documentation/lguest/lguest.c | 12 drivers/char/hvc_lguest.c |3 +++ drivers/lguest/interrupts_and_traps.c | 14 ++ drivers/lguest/io.c | 10 ++ drivers/lguest/lguest.c |8 drivers/lguest/lguest_asm.S | 14 ++ drivers/lguest/page_tables.c |5 + drivers/lguest/segments.c |4 drivers/net/lguest_net.c | 19 +++ 9 files changed, 89 insertions(+) === --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -1536,3 +1536,15 @@ int main(int argc, char *argv[]) /* Finally, run the Guest. This doesn't return. */ run_guest(lguest_fd, _list); } +/*:*/ + +/*M:999 + * Mastery is done: you now know everything I do. + * + * But surely you have seen code, features and bugs in your wanderings which + * you now yearn to attack? That is the real game, and I look forward to you + * patching and forking lguest into the Your-Name-Here-visor. + * + * Farewell, and good coding! + * Rusty Russell. + */ === --- a/drivers/char/hvc_lguest.c +++ b/drivers/char/hvc_lguest.c @@ -13,6 +13,9 @@ * functions. :*/ +/*M:002 The console can be flooded: while the Guest is processing input the + * Host can send more. Buffering in the Host could alleviate this, but it is a + * difficult problem in general. :*/ /* Copyright (C) 2006 Rusty Russell, IBM Corporation * * This program is free software; you can redistribute it and/or modify === --- a/drivers/lguest/interrupts_and_traps.c +++ b/drivers/lguest/interrupts_and_traps.c @@ -231,6 +231,20 @@ static int direct_trap(const struct lgue * go direct, of course 8) */ return idt_type(trap->a, trap->b) == 0xF; } +/*:*/ + +/*M:005 The Guest has the ability to turn its interrupt gates into trap gates, + * if it is careful. The Host will let trap gates can go directly to the + * Guest, but the Guest needs the interrupts atomically disabled for an + * interrupt gate. 
It can do this by pointing the trap gate at instructions + * within noirq_start and noirq_end, where it can safely disable interrupts. */ + +/*M:006 The Guests do not use the sysenter (fast system call) instruction, + * because it's hardcoded to enter privilege level 0 and so can't go direct. + * It's about twice as fast as the older "int 0x80" system call, so it might + * still be worthwhile to handle it in the Switcher and lcall down to the + * Guest. The sysenter semantics are hairy tho: search for that keyword in + * entry.S :*/ /*H:260 When we make traps go directly into the Guest, we need to make sure * the kernel stack is valid (ie. mapped in the page tables). Otherwise, the === --- a/drivers/lguest/io.c +++ b/drivers/lguest/io.c @@ -553,6 +553,16 @@ void release_all_dma(struct lguest *lg) up_read(>mm->mmap_sem); } +/*M:007 We only return a single DMA buffer to the Launcher, but it would be + * more efficient to return a pointer to the entire array of DMA buffers, which + * it can cache and choose one whenever it wants. + * + * Currently the Launcher uses a write to /dev/lguest, and the return value is + * the address of the DMA structure with the interrupt number placed in + * dma->used_len. If we wanted to return the entire array, we need to return + * the address, array size and interrupt number: this seems to require an + * ioctl(). :*/ + /*L:320 This routine looks for a DMA buffer registered by the Guest on the * given key (using the BIND_DMA hypercall). */ unsigned long get_dma_buffer(struct lguest *lg, === --- a/drivers/lguest/lguest.c +++ b/drivers/lguest/lguest.c @@ -251,6 +251,14 @@ static void irq_enable(void) { lguest_data.irq_enabled = X86_EFLAGS_IF; } +/*:*/ +/*M:003 Note that we don't check for outstanding interrupts when we re-enable + * them (or when we unmask an interrupt). 
This seems to work for the moment, + * since interrupts are rare and we'll just get the interrupt on the next timer + * tick, but now we have CONFIG_NO_HZ, we should revisit this. One way + * would be to put the "irq_enabled" field in a page by itself, and have the + * Host write-protect it when an interrupt comes in when irqs are disabled. + * There will then be a page fault as soon as interrupts are re-enabled. :*/ /*G:034 * The Interrupt Descriptor Table (IDT). === --- a/drivers/lguest/lguest_asm.S +++ b/drivers/lguest/lguest_asm.S @@ -41,6 +41,20 @@ LGUEST_PATCH(pushf, movl
Re: [PATCH] AFS: Fix file locking
Andrew Morton wrote:
> On Wed, 18 Jul 2007 15:56:53 +1000 Nick Piggin <[EMAIL PROTECTED]> wrote:
>> Andrew Morton wrote:
>>> On Tue, 17 Jul 2007 13:47:32 +0100 David Howells <[EMAIL PROTECTED]> wrote:
>>>> + if (type == AFS_LOCK_READ &&
>>>> +     vnode->flags & (1 << AFS_VNODE_READLOCKED)) {
>>>
>>> Here we use vnode->flags & (1 << foo)
>>>
>>>> + set_bit(AFS_VNODE_LOCKING, >flags);
>>>
>>> and elsewhere we use set_bit(foo, >flags) and clear_bit()
>>>
>>> This is a bit strange. Does the open-coded bit-test have any performance
>>> benefit on any architecture?
>>
>> Not on x86 at least, afaik. It uses locked operations on x86, but you
>> can use __set_bit instead (which should always be at least as efficient
>> as the C version).
>
> I said "bit-test". ie: test_bit(). That doesn't use a locked operation.

So you did. Then to answer that, yes it could be faster because there
are stupid volatiles sprinkled all over the bitops code so you could
easily end up having to do more loads.

Does it make a real difference? Unlikely, but David loves counting
cycles :)

-- 
SUSE Labs, Novell Inc.
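The two styles being compared can be put side by side in plain C. This is a sketch only: test_bit_sketch() is a hypothetical helper written in the style of a generic test_bit() with a volatile-qualified pointer, not the kernel's implementation for any particular architecture.

```c
#include <limits.h>

#define BITS_PER_LONG_SKETCH	(sizeof(unsigned long) * CHAR_BIT)

/* Open-coded test: the compiler sees a plain value and may reuse one it
 * already holds in a register. */
static int open_coded_test(unsigned long flags, unsigned int nr)
{
	return (flags & (1UL << nr)) != 0;
}

/* Helper in the style of a generic test_bit(): the volatile qualifier
 * forces a fresh load from memory on every call. */
static int test_bit_sketch(unsigned int nr, const volatile unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG_SKETCH] >> (nr % BITS_PER_LONG_SKETCH)) & 1UL;
}
```

Both return the same answer for any bit. The difference is only in how many loads the compiler is allowed to drop when several tests on the same word sit next to each other, which matches the "unlikely to matter" conclusion in the reply.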
[PATCH 6/7] lguest: documentation pt VI: Switcher
Documentation: The Switcher Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/lguest/core.c | 51 +++- drivers/lguest/switcher.S | 271 ++--- 2 files changed, 276 insertions(+), 46 deletions(-) === --- a/drivers/lguest/core.c +++ b/drivers/lguest/core.c @@ -394,46 +394,89 @@ static void set_ts(void) write_cr0(cr0|8); } +/*S:010 + * We are getting close to the Switcher. + * + * Remember that each CPU has two pages which are visible to the Guest when it + * runs on that CPU. This has to contain the state for that Guest: we copy the + * state in just before we run the Guest. + * + * Each Guest has "changed" flags which indicate what has changed in the Guest + * since it last ran. We saw this set in interrupts_and_traps.c and + * segments.c. + */ static void copy_in_guest_info(struct lguest *lg, struct lguest_pages *pages) { + /* Copying all this data can be quite expensive. We usually run the +* same Guest we ran last time (and that Guest hasn't run anywhere else +* meanwhile). If that's not the case, we pretend everything in the +* Guest has changed. */ if (__get_cpu_var(last_guest) != lg || lg->last_pages != pages) { __get_cpu_var(last_guest) = lg; lg->last_pages = pages; lg->changed = CHANGED_ALL; } - /* These are pretty cheap, so we do them unconditionally. */ + /* These copies are pretty cheap, so we do them unconditionally: */ + /* Save the current Host top-level page directory. */ pages->state.host_cr3 = __pa(current->mm->pgd); + /* Set up the Guest's page tables to see this CPU's pages (and no +* other CPU's pages). */ map_switcher_in_guest(lg, pages); + /* Set up the two "TSS" members which tell the CPU what stack to use +* for traps which do directly into the Guest (ie. traps at privilege +* level 1). */ pages->state.guest_tss.esp1 = lg->esp1; pages->state.guest_tss.ss1 = lg->ss1; - /* Copy direct trap entries. */ + /* Copy direct-to-Guest trap entries. 
*/ if (lg->changed & CHANGED_IDT) copy_traps(lg, pages->state.guest_idt, default_idt_entries); - /* Copy all GDT entries but the TSS. */ + /* Copy all GDT entries which the Guest can change. */ if (lg->changed & CHANGED_GDT) copy_gdt(lg, pages->state.guest_gdt); /* If only the TLS entries have changed, copy them. */ else if (lg->changed & CHANGED_GDT_TLS) copy_gdt_tls(lg, pages->state.guest_gdt); + /* Mark the Guest as unchanged for next time. */ lg->changed = 0; } +/* Finally: the code to actually call into the Switcher to run the Guest. */ static void run_guest_once(struct lguest *lg, struct lguest_pages *pages) { + /* This is a dummy value we need for GCC's sake. */ unsigned int clobber; + /* Copy the guest-specific information into this CPU's "struct +* lguest_pages". */ copy_in_guest_info(lg, pages); - /* Put eflags on stack, lcall does rest: suitable for iret return. */ + /* Now: we push the "eflags" register on the stack, then do an "lcall". +* This is how we change from using the kernel code segment to using +* the dedicated lguest code segment, as well as jumping into the +* Switcher. +* +* The lcall also pushes the old code segment (KERNEL_CS) onto the +* stack, then the address of this call. This stack layout happens to +* exactly match the stack of an interrupt... */ asm volatile("pushf; lcall *lguest_entry" +/* This is how we tell GCC that %eax ("a") and %ebx ("b") + * are changed by this routine. The "=" means output. */ : "=a"(clobber), "=b"(clobber) +/* %eax contains the pages pointer. ("0" refers to the + * 0-th argument above, ie "a"). %ebx contains the + * physical address of the Guest's top-level page + * directory. */ : "0"(pages), "1"(__pa(lg->pgdirs[lg->pgdidx].pgdir)) +/* We tell gcc that all these registers could change, + * which means we don't have to save and restore them in + * the Switcher. */ : "memory", "%edx", "%ecx", "%edi", "%esi"); } +/*:*/ /*H:030 Let's jump straight to the the main loop which runs the Guest. 
* Remember, this is called by the Launcher reading /dev/lguest, and we keep === --- a/drivers/lguest/switcher.S +++ b/drivers/lguest/switcher.S @@ -6,41 +6,131 @@ * are feeling invigorated and refreshed then the next, more
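The "changed" flags logic documented in copy_in_guest_info() above is a dirty-flag cache. The sketch below shows the control flow in userspace. The flag values, guest_sketch, cpu_sketch and copy_in_sketch() are all hypothetical, the last_pages check is left out for brevity, and the copies are replaced by counters.

```c
/* Hypothetical flag values; the patch only names the flags. */
enum {
	CHANGED_IDT	= 1,
	CHANGED_GDT	= 2,
	CHANGED_GDT_TLS	= 4,
	CHANGED_ALL	= 7,
};

struct guest_sketch {
	unsigned int changed;
};

struct cpu_sketch {
	const struct guest_sketch *last_guest;
	unsigned int idt_copies, gdt_copies, tls_copies;
};

static void copy_in_sketch(struct cpu_sketch *cpu, struct guest_sketch *lg)
{
	/* A different Guest ran here last: assume everything is stale. */
	if (cpu->last_guest != lg) {
		cpu->last_guest = lg;
		lg->changed = CHANGED_ALL;
	}
	if (lg->changed & CHANGED_IDT)
		cpu->idt_copies++;
	/* A full GDT copy covers the TLS entries too. */
	if (lg->changed & CHANGED_GDT)
		cpu->gdt_copies++;
	else if (lg->changed & CHANGED_GDT_TLS)
		cpu->tls_copies++;
	/* Mark the Guest as unchanged for next time. */
	lg->changed = 0;
}
```

Running the same Guest twice on one CPU costs one full copy and then nothing. Switching to another Guest forces the full copy again, which is the expensive case the patch comment describes.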
[PATCH 1/7] lguest: documentation pt I: Preparation
The netfilter code had very good documentation: the Netfilter Hacking HOWTO. Noone ever read it. So this time I'm trying something different, using a bit of Knuthiness. Start with drivers/lguest/README. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- Documentation/lguest/extract | 58 + Documentation/lguest/lguest.c |9 +++-- drivers/lguest/Makefile | 12 ++ drivers/lguest/README | 47 ++ drivers/lguest/core.c |7 ++- drivers/lguest/hypercalls.c |9 +++-- drivers/lguest/interrupts_and_traps.c | 13 +++ drivers/lguest/io.c |8 +++- drivers/lguest/lguest.c | 30 +++-- drivers/lguest/lguest_bus.c |3 + drivers/lguest/lguest_user.c |7 +++ drivers/lguest/page_tables.c | 10 - drivers/lguest/segments.c | 11 ++ drivers/lguest/switcher.S | 13 +++ 14 files changed, 218 insertions(+), 19 deletions(-) === --- /dev/null +++ b/Documentation/lguest/extract @@ -0,0 +1,58 @@ +#! /bin/sh + +set -e + +PREFIX=$1 +shift + +trap 'rm -r $TMPDIR' 0 +TMPDIR=`mktemp -d` + +exec 3>/dev/null +for f; do +while IFS=" +" read -r LINE; do + case "$LINE" in + *$PREFIX:[0-9]*:\**) + NUM=`echo "$LINE" | sed "s/.*$PREFIX:\([0-9]*\).*/\1/"` + if [ -f $TMPDIR/$NUM ]; then + echo "$TMPDIR/$NUM already exits prior to $f" + exit 1 + fi + exec 3>>$TMPDIR/$NUM + echo $f | sed 's,\.\./,,g' > $TMPDIR/.$NUM + /bin/echo "$LINE" | sed -e "s/$PREFIX:[0-9]*//" -e "s/:\*/*/" >&3 + ;; + *$PREFIX:[0-9]*) + NUM=`echo "$LINE" | sed "s/.*$PREFIX:\([0-9]*\).*/\1/"` + if [ -f $TMPDIR/$NUM ]; then + echo "$TMPDIR/$NUM already exits prior to $f" + exit 1 + fi + exec 3>>$TMPDIR/$NUM + echo $f | sed 's,\.\./,,g' > $TMPDIR/.$NUM + /bin/echo "$LINE" | sed "s/$PREFIX:[0-9]*//" >&3 + ;; + *:\**) + /bin/echo "$LINE" | sed -e "s/:\*/*/" -e "s,/\*\*/,," >&3 + echo >&3 + exec 3>/dev/null + ;; + *) + /bin/echo "$LINE" >&3 + ;; + esac +done < $f +echo >&3 +exec 3>/dev/null +done + +LASTFILE="" +for f in $TMPDIR/*; do +if [ "$LASTFILE" != $(cat $TMPDIR/.$(basename $f) ) ]; then + LASTFILE=$(cat $TMPDIR/.$(basename $f) ) + echo "[ $LASTFILE 
]" +fi +cat $f +done + === --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -1,5 +1,10 @@ -/* Simple program to layout "physical" memory for new lguest guest. - * Linked high to avoid likely physical memory. */ +/*P:100 This is the Launcher code, a simple program which lays out the + * "physical" memory for the new Guest by mapping the kernel image and the + * virtual devices, then reads repeatedly from /dev/lguest to run the Guest. + * + * The only trick: the Makefile links it statically at a high address, so it + * will be clear of the guest memory region. It means that each Guest cannot + * have more than 2.5G of memory on a normally configured Host. :*/ #define _LARGEFILE64_SOURCE #define _GNU_SOURCE #include === --- a/drivers/lguest/Makefile +++ b/drivers/lguest/Makefile @@ -5,3 +5,15 @@ obj-$(CONFIG_LGUEST) += lg.o obj-$(CONFIG_LGUEST) += lg.o lg-y := core.o hypercalls.o page_tables.o interrupts_and_traps.o \ segments.o io.o lguest_user.o switcher.o + +Preparation Preparation!: PREFIX=P +Guest: PREFIX=G +Drivers: PREFIX=D +Launcher: PREFIX=L +Host: PREFIX=H +Switcher: PREFIX=S +Mastery: PREFIX=M +Beer: + @for f in Preparation Guest Drivers Launcher Host Switcher Mastery; do echo "{==- $$f -==}"; make -s $$f; done; echo "{==-==}" +Preparation Preparation! Guest Drivers Launcher Host Switcher Mastery: + @sh ../../Documentation/lguest/extract $(PREFIX) `find ../../* -name '*.[chS]' -wholename '*lguest*'` === --- /dev/null +++ b/drivers/lguest/README @@ -0,0 +1,47 @@ +Welcome, friend reader, to lguest. + +Lguest is an adventure, with you, the reader, as Hero. I can't think of many +5000-line projects which offer both such capability and glimpses of future +potential; it is an exciting time to be delving into the source! + +But be warned; this is an arduous journey of several hours or more! And as we +know, all true Heroes are driven by a Noble Goal. Thus I offer a Beer (or +equivalent) to anyone
[PATCH 2/7] lguest: documentation pt II: Guest
Documentation: The Guest Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/lguest/lguest.c | 458 --- drivers/lguest/lguest_asm.S | 57 +++-- include/linux/lguest.h | 47 +++- 3 files changed, 512 insertions(+), 50 deletions(-) === --- a/drivers/lguest/lguest.c +++ b/drivers/lguest/lguest.c @@ -66,6 +66,12 @@ #include #include +/*G:010 Welcome to the Guest! + * + * The Guest in our tale is a simple creature: identical to the Host but + * behaving in simplified but equivalent ways. In particular, the Guest is the + * same kernel as the Host (or at least, built from the same source code). :*/ + /* Declarations for definitions in lguest_guest.S */ extern char lguest_noirq_start[], lguest_noirq_end[]; extern const char lgstart_cli[], lgend_cli[]; @@ -84,7 +90,26 @@ struct lguest_device_desc *lguest_device struct lguest_device_desc *lguest_devices; static cycle_t clock_base; -static enum paravirt_lazy_mode lazy_mode; +/*G:035 Notice the lazy_hcall() above, rather than hcall(). This is our first + * real optimization trick! + * + * When lazy_mode is set, it means we're allowed to defer all hypercalls and do + * them as a batch when lazy_mode is eventually turned off. Because hypercalls + * are reasonably expensive, batching them up makes sense. For example, a + * large mmap might update dozens of page table entries: that code calls + * lguest_lazy_mode(PARAVIRT_LAZY_MMU), does the dozen updates, then calls + * lguest_lazy_mode(PARAVIRT_LAZY_NONE). + * + * So, when we're in lazy mode, we call async_hypercall() to store the call for + * future processing. When lazy mode is turned off we issue a hypercall to + * flush the stored calls. + * + * There's also a hack where "mode" is set to "PARAVIRT_LAZY_FLUSH" which + * indicates we're to flush any outstanding calls immediately. This is used + * when an interrupt handler does a kmap_atomic(): the page table changes must + * happen immediately even if we're in the middle of a batch. 
Usually we're + * not, though, so there's nothing to do. */ +static enum paravirt_lazy_mode lazy_mode; /* Note: not SMP-safe! */ static void lguest_lazy_mode(enum paravirt_lazy_mode mode) { if (mode == PARAVIRT_LAZY_FLUSH) { @@ -108,6 +133,16 @@ static void lazy_hcall(unsigned long cal async_hcall(call, arg1, arg2, arg3); } +/* async_hcall() is pretty simple: I'm quite proud of it really. We have a + * ring buffer of stored hypercalls which the Host will run though next time we + * do a normal hypercall. Each entry in the ring has 4 slots for the hypercall + * arguments, and a "hcall_status" word which is 0 if the call is ready to go, + * and 255 once the Host has finished with it. + * + * If we come around to a slot which hasn't been finished, then the table is + * full and we just make the hypercall directly. This has the nice side + * effect of causing the Host to run all the stored calls in the ring buffer + * which empties it for next time! */ void async_hcall(unsigned long call, unsigned long arg1, unsigned long arg2, unsigned long arg3) { @@ -115,6 +150,9 @@ void async_hcall(unsigned long call, static unsigned int next_call; unsigned long flags; + /* Disable interrupts if not already disabled: we don't want an +* interrupt handler making a hypercall while we're already doing +* one! */ local_irq_save(flags); if (lguest_data.hcall_status[next_call] != 0xFF) { /* Table full, so do normal hcall which will flush table. */ @@ -124,7 +162,7 @@ void async_hcall(unsigned long call, lguest_data.hcalls[next_call].edx = arg1; lguest_data.hcalls[next_call].ebx = arg2; lguest_data.hcalls[next_call].ecx = arg3; - /* Make sure host sees arguments before "valid" flag. */ + /* Arguments must all be written before we mark it to go */ wmb(); lguest_data.hcall_status[next_call] = 0; if (++next_call == LHCALL_RING_SIZE) @@ -132,9 +170,14 @@ void async_hcall(unsigned long call, } local_irq_restore(flags); } - +/*:*/ + +/* Wrappers for the SEND_DMA and BIND_DMA hypercalls. 
This is mainly because + * Jeff Garzik complained that __pa() should never appear in drivers, and this + * helps remove most of them. But also, it wraps some ugliness. */ void lguest_send_dma(unsigned long key, struct lguest_dma *dma) { + /* The hcall might not write this if something goes wrong */ dma->used_len = 0; hcall(LHCALL_SEND_DMA, key, __pa(dma), 0); } @@ -142,11 +185,16 @@ int lguest_bind_dma(unsigned long key, s int lguest_bind_dma(unsigned long key, struct lguest_dma *dmas, unsigned int num, u8 irq) { + /* This is the only hypercall which actually wants 5
[PATCH 3/7] lguest: documentation pt III: Drivers
Documentation: The Drivers Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> --- drivers/block/lguest_blk.c | 171 +++--- drivers/char/hvc_lguest.c | 77 + drivers/lguest/lguest_bus.c | 72 drivers/net/lguest_net.c| 222 +++ include/linux/lguest_bus.h |5 include/linux/lguest_launcher.h | 60 ++ 6 files changed, 565 insertions(+), 42 deletions(-) === --- a/drivers/block/lguest_blk.c +++ b/drivers/block/lguest_blk.c @@ -1,6 +1,12 @@ -/* A simple block driver for lguest. - * - * Copyright 2006 Rusty Russell <[EMAIL PROTECTED]> IBM Corporation +/*D:400 + * The Guest block driver + * + * This is a simple block driver, which appears as /dev/lgba, lgbb, lgbc etc. + * The mechanism is simple: we place the information about the request in the + * device page, then use SEND_DMA (containing the data for a write, or an empty + * "ping" DMA for a read). + :*/ +/* Copyright 2006 Rusty Russell <[EMAIL PROTECTED]> IBM Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -25,27 +31,50 @@ static char next_block_index = 'a'; +/*D:420 Here is the structure which holds all the information we need about + * each Guest block device. + * + * I'm sure at this stage, you're wondering "hey, where was the adventure I was + * promised?" and thinking "Rusty sucks, I shall say nasty things about him on + * my blog". I think Real adventures have boring bits, too, and you're in the + * middle of one. But it gets better. Just not quite yet. */ struct blockdev { + /* The block queue infrastructure wants a spinlock: it is held while it +* calls our block request function. We grab it in our interrupt +* handler so the responses don't mess with new requests. */ spinlock_t lock; - /* The disk structure for the kernel. */ + /* The disk structure registered with kernel. */ struct gendisk *disk; - /* The major number for this disk. */ + /* The major device number for this disk, and the interrupt. 
We only +* really keep them here for completeness; we'd need them if we +* supported device unplugging. */ int major; int irq; + /* The physical address of this device's memory page */ unsigned long phys_addr; - /* The mapped block page. */ + /* The mapped memory page for convenient acces. */ struct lguest_block_page *lb_page; - /* We only have a single request outstanding at a time. */ + /* We only have a single request outstanding at a time: this is it. */ struct lguest_dma dma; struct request *req; }; -/* Jens gave me this nice helper to end all chunks of a request. */ +/*D:495 We originally used end_request() throughout the driver, but it turns + * out that end_request() is deprecated, and doesn't actually end the request + * (which seems like a good reason to deprecate it!). It simply ends the first + * bio. So if we had 3 bios in a "struct request" we would do all 3, + * end_request(), do 2, end_request(), do 1 and end_request(): twice as much + * work as we needed to do. + * + * This reinforced to me that I do not understand the block layer. + * + * Nonetheless, Jens Axboe gave me this nice helper to end all chunks of a + * request. This improved disk speed by 130%. */ static void end_entire_request(struct request *req, int uptodate) { if (end_that_request_first(req, uptodate, req->hard_nr_sectors)) @@ -55,30 +84,62 @@ static void end_entire_request(struct re end_that_request_last(req, uptodate); } +/* I'm told there are only two stories in the world worth telling: love and + * hate. So there used to be a love scene here like this: + * + * Launcher: We could make beautiful I/O together, you and I. + * Guest: My, that's a big disk! + * + * Unfortunately, it was just too raunchy for our otherwise-gentle tale. */ + +/*D:490 This is the interrupt handler, called when a block read or write has + * been completed for us. 
*/ static irqreturn_t lgb_irq(int irq, void *_bd) { + /* We handed our "struct blockdev" as the argument to request_irq(), so +* it is passed through to us here. This tells us which device we're +* dealing with in case we have more than one. */ struct blockdev *bd = _bd; unsigned long flags; + /* We weren't doing anything? Strange, but could happen if we shared +* interrupts (we don't!). */ if (!bd->req) { pr_debug("No work!\n"); return IRQ_NONE; } + /* Not done yet? That's equally strange. */ if (!bd->lb_page->result) { pr_debug("No result!\n"); return IRQ_NONE;
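The arithmetic in the D:495 comment can be checked with a toy count. This is only a sketch: both helper names are made up for illustration, and the model assumes that "work" means one pass over each bio still queued.

```c
/* Toy model only: count how many bio passes a driver makes when it
 * redoes every remaining bio after each partial completion (the
 * end_request() pattern described in the comment) versus handling
 * each bio once. These helper names are hypothetical and are not
 * part of the kernel. */
unsigned int passes_per_bio_completion(unsigned int nr_bios)
{
	unsigned int remaining, work = 0;

	for (remaining = nr_bios; remaining > 0; remaining--)
		work += remaining;	/* redo everything still queued */
	return work;
}

unsigned int passes_whole_request(unsigned int nr_bios)
{
	return nr_bios;			/* each bio handled exactly once */
}
```

For the three-bio case in the comment this gives 3 + 2 + 1 = 6 passes against 3, which is the "twice as much work" figure.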
Re: film at 11: kernel update breaks udev.
On Sat, Jul 21, 2007 at 03:09:55AM +0200, Kay Sievers wrote:
> On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote:
> > Just one of my machines to 2.6.22.1, and got this during boot..
> >
> > Starting udev: udevd-event[619]: udev_node_symlink:
> > symlink(../../sdc/dev/disk/by-uuid/2d773baf-8174-10a6-14db-a78e0e676e89)
> > failed: File exists
> >
> > Under 2.6.21, all was fine.
> >
> > sdc is one disk of a 3 disk raid5 set.
> > The raidset still manages to come up despite this.
> >
> > This is a Fedora 7 box, with udev-106-4.1.fc7
> >
> > What changed this time?
>
> CONFIG_BLK_DEV_BSG=y?
>
> There's a name-clash, because bsg tries to create devices with the same name.
> James sent a patch, it's on lkml.

BSG isn't in 2.6.22

Dave

--
http://www.codemonkey.org.uk
Re: where is the code for read system call?
On Saturday, 21 July 2007, Agarwal, Lomesh wrote:
> My application reads from socket. I need to change the behavior of read
> system call for an experiment. Can someone point me to code?

fs/read_write.c: line 356

asmlinkage ssize_t sys_read(unsigned int fd, char __user * buf, size_t count)
Re: film at 11: kernel update breaks udev.
On 7/21/07, Dave Jones <[EMAIL PROTECTED]> wrote:
> Just one of my machines to 2.6.22.1, and got this during boot..
>
> Starting udev: udevd-event[619]: udev_node_symlink:
> symlink(../../sdc/dev/disk/by-uuid/2d773baf-8174-10a6-14db-a78e0e676e89)
> failed: File exists
>
> Under 2.6.21, all was fine.
>
> sdc is one disk of a 3 disk raid5 set.
> The raidset still manages to come up despite this.
>
> This is a Fedora 7 box, with udev-106-4.1.fc7
>
> What changed this time?

CONFIG_BLK_DEV_BSG=y?

There's a name-clash, because bsg tries to create devices with the same name.
James sent a patch, it's on lkml.

Kay
Re: [PATCH] hugetlbfs read() support
Nishanth Aravamudan wrote: On 19.07.2007 [09:58:50 -0700], Andrew Morton wrote: On Thu, 19 Jul 2007 08:51:49 -0700 Badari Pulavarty <[EMAIL PROTECTED]> wrote: + } + + offset += ret; + retval += ret; + len -= ret; + index += offset >> HPAGE_SHIFT; + offset &= ~HPAGE_MASK; + + page_cache_release(page); + if (ret == nr && len) + continue; + goto out; + } +out: + return retval; +} This code doesn't have all the ghastly tricks which we deploy to handle concurrent truncate. Do I need to ? Baaahh!! I don't want to deal with them. Nick, can you think of any serious consequences of a read/truncate race in there? I can't.. All I want is a simple read() to get my oprofile working. Please advise. Did you consider changing oprofile userspace to read the executable with mmap? It's not actually oprofile's code, though, it's libbfd (used by oprofile). And it works fine (presumably) for other binaries. So... what's the problem with changing it? The fact that it is a library doesn't really make a difference except that you'll also help everyone else who links with it. It won't break backwards compatibility, and it will work on older kernels... -- SUSE Labs, Novell Inc.
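The index/offset bookkeeping in the quoted loop can be tried out in userspace. The following is a minimal sketch, assuming 2 MiB huge pages (HPAGE_SHIFT = 21, which is architecture-dependent) and using a hypothetical read_pos type and advance() helper that are not in the patch.

```c
/* Assumed for illustration: 2 MiB huge pages. The real value depends
 * on the architecture and kernel configuration. */
#define HPAGE_SHIFT 21
#define HPAGE_SIZE  (1UL << HPAGE_SHIFT)
#define HPAGE_MASK  (~(HPAGE_SIZE - 1))

struct read_pos {		/* hypothetical helper type */
	unsigned long index;	/* which huge page */
	unsigned long offset;	/* byte offset inside that page */
};

/* Advance the position by `ret` bytes, using the same three steps as
 * the quoted loop. */
struct read_pos advance(struct read_pos pos, unsigned long ret)
{
	pos.offset += ret;
	pos.index  += pos.offset >> HPAGE_SHIFT;	/* carry into the page index */
	pos.offset &= ~HPAGE_MASK;			/* keep only the in-page part */
	return pos;
}
```

Starting 100 bytes before the end of page 0 and copying 100 bytes moves the position to the start of page 1, which is the carry the shift and mask are there to handle.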
film at 11: kernel update breaks udev.
Just one of my machines to 2.6.22.1, and got this during boot..

Starting udev: udevd-event[619]: udev_node_symlink:
symlink(../../sdc/dev/disk/by-uuid/2d773baf-8174-10a6-14db-a78e0e676e89)
failed: File exists

Under 2.6.21, all was fine.

sdc is one disk of a 3 disk raid5 set.
The raidset still manages to come up despite this.

This is a Fedora 7 box, with udev-106-4.1.fc7

What changed this time?

Dave

--
http://www.codemonkey.org.uk
Re: [RFC, Announce] Unified x86 architecture, arch/x86
Thomas Gleixner wrote:
> [...]
> As usual, comments and suggestions are welcome!

Compiles and boots fine here (on my Dell Precision WorkStation 530 MT), and nothing has broken so far.

I only got some Kconfig warnings[1] with my config[2], but that is all.

(I don't know whether this matters, but it boots 7,52 seconds faster than current git head.)

[1] http://194.231.229.228/linux-x86/warning
[2] http://194.231.229.228/linux-x86/config-x86

>
>	Thomas, Ingo
>

Regards,

Gabriel C
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
On Sat, Jul 21, 2007 at 02:28:52AM +0200, Michal Piotrowski wrote:
> On 21/07/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:
> > Oh, which means ...
> >
> > On 7/21/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:
> > > On 7/21/07, Greg KH <[EMAIL PROTECTED]> wrote:
> > > > On Fri, Jul 20, 2007 at 03:59:12PM -0700, Andrew Morton wrote:
> > > > > On Fri, 20 Jul 2007 15:50:47 -0700
> > > > > Greg KH <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > On Fri, Jul 20, 2007 at 06:32:21PM +0200, Michal Piotrowski wrote:
> > > > > > > Hi Greg,
> > > > > > >
> > > > > > > This looks like a sysfs bug
> > > > > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/
> > > > > > > broken-out-2007-07-20-00-22/3.jpg
> > > > > > >
> > > > > > > l *kernel_param_sysfs_setup+0x75
> > > > > > > 0xc13c0894 is in kernel_param_sysfs_setup (kernel/params.c:570).
> > > > > > > 565		mk->mod = THIS_MODULE;
> > > > > > > 566		kobj_set_kset_s(mk, module_subsys);
> > > > > > > 567		kobject_set_name(&mk->kobj, name);
> >
> > Shouldn't the return of kobject_set_name() be checked here?
> >
> > [ Looking at code, and realizing that kobject_set_name() manages to
> > succeed even when given a null string! ]
> >
> > > > > > > 568		kobject_init(&mk->kobj);
> > > > > > > 569		ret = kobject_add(&mk->kobj);
> > > > > > > 570		BUG_ON(ret < 0);
> > > > > > > 571		param_sysfs_setup(mk, kparam, num_params, name_skip);
> > > > > > > 572		kobject_uevent(&mk->kobj, KOBJ_ADD);
> > > > > > > 573	}
> > > > > > > 574
> > > > > > >
> > > > > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/
> > > > > > > broken-out-2007-07-20-00-22/mm-config
> > > > > >
> > > > > > What kernel version is this happening on? The -mm tree? Can you try
> > > > > > Linus's tree instead?
> > > > > >
> > > > > > It looks like there was some needed information right before the first
> > > > > > stack dump, showing exactly what kobject was trying to be added that was
> > > > > > already present. Odds are this is a kernel parameter with the same name
> > > > > > as a duplicate one within the same module,
> > >
> > > I don't think that's an -EEXIST.
> > >
> > > I think what we have here is kobject_add() exiting with -EINVAL.
> > > (kobject attempted to be registered with no name!)
> > >
> > > [ The first trace on that screen shows: kobject_shadow_add+0x5b/0x189.
> > > That's the WARN_ON(1) at lib/kobject.c:176. If it was a EEXIST case,
> > > we would've seen an offset in kobject_shadow_add closer to 0x189,
> > > because the dump_stack() for EEXIST is barely 4 instructions before
> > > we return from that function. ]
> > >
> > > > > > but the trick is going to be
> > > > > > trying to figure out what module is causing this.
> > >
> > > So I'd guess we want to search for a module that's passing a kobject *
> > > to kobject_add() such that !kobj->k_name is true.
> >
> > Oh, that's kernel_param_sysfs_setup itself. So we actually need to
> > search for a built-in module in Michal's config that ... has an ... empty
> > "" modname !?
>
> I'll try to figure out this

Try the patch below to help you boot and figure out what went wrong.
Post the kernel log results and I'll try to help you out.

thanks,

greg k-h

---
 kernel/params.c |    6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/kernel/params.c
+++ b/kernel/params.c
@@ -567,7 +567,11 @@ static void __init kernel_param_sysfs_se
 	kobject_set_name(&mk->kobj, name);
 	kobject_init(&mk->kobj);
 	ret = kobject_add(&mk->kobj);
-	BUG_ON(ret < 0);
+	if (ret) {
+		printk(KERN_ERR "module '%s' failed to be added to sysfs, "
+			"the system will be unstable now.\n", name);
+		return;
+	}
 	param_sysfs_setup(mk, kparam, num_params, name_skip);
 	kobject_uevent(&mk->kobj, KOBJ_ADD);
 }
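The two failure modes being debated in this thread (-EEXIST for a duplicate, -EINVAL for an empty name) and the report-and-continue approach of the debugging patch can be modelled in userspace. This is only a sketch: toy_add() and toy_setup() are hypothetical stand-ins and make no claim about how kobject_add() is implemented.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_OBJS 8

static const char *registered[MAX_OBJS];
static int nr_registered;

/* Hypothetical stand-in for an "add" call: rejects an empty name with
 * -EINVAL and a duplicate name with -EEXIST. */
int toy_add(const char *name)
{
	int i;

	if (!name || !name[0])
		return -EINVAL;
	for (i = 0; i < nr_registered; i++)
		if (strcmp(registered[i], name) == 0)
			return -EEXIST;
	if (nr_registered == MAX_OBJS)
		return -ENOMEM;
	registered[nr_registered++] = name;
	return 0;
}

/* Report the failure and carry on, instead of halting on the first
 * error. Returns 0 on success or the negative error code. */
int toy_setup(const char *name)
{
	int ret = toy_add(name);

	if (ret) {
		fprintf(stderr, "module '%s' failed to be added (error %d)\n",
			name ? name : "(null)", ret);
		return ret;
	}
	return 0;
}
```

Printing the name and the error code is what separates the two cases in a log: an empty string between the quotes points at the no-name case rather than a clash.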
Re: [PATCH] Use the tsk argument in init_new_context()
On Thu, Jul 19, 2007 at 05:42:38PM -0700, Andrew Morton wrote:
> On Sun, 8 Jul 2007 22:55:08 -0300
> Diego Woitasen <[EMAIL PROTECTED]> wrote:
>
> > Signed-off-by: Diego Woitasen <[EMAIL PROTECTED]>
> > ---
> >  arch/i386/kernel/ldt.c   |    2 +-
> >  arch/x86_64/kernel/ldt.c |    2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/i386/kernel/ldt.c b/arch/i386/kernel/ldt.c
> > index e0b2d17..c2eb4fb 100644
> > --- a/arch/i386/kernel/ldt.c
> > +++ b/arch/i386/kernel/ldt.c
> > @@ -96,7 +96,7 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
> >
> >  	init_MUTEX(&mm->context.sem);
> >  	mm->context.size = 0;
> > -	old_mm = current->mm;
> > +	old_mm = tsk->mm;
> >  	if (old_mm && old_mm->context.size > 0) {
> >  		down(&old_mm->context.sem);
> >  		retval = copy_ldt(&mm->context, &old_mm->context);
> > diff --git a/arch/x86_64/kernel/ldt.c b/arch/x86_64/kernel/ldt.c
> > index bc9ffd5..99a92ed 100644
> > --- a/arch/x86_64/kernel/ldt.c
> > +++ b/arch/x86_64/kernel/ldt.c
> > @@ -100,7 +100,7 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
> >
> >  	init_MUTEX(&mm->context.sem);
> >  	mm->context.size = 0;
> > -	old_mm = current->mm;
> > +	old_mm = tsk->mm;
> >  	if (old_mm && old_mm->context.size > 0) {
> >  		down(&old_mm->context.sem);
> >  		retval = copy_ldt(&mm->context, &old_mm->context);
>
> When called from dup_mm(), `tsk' refers to the new task and `current'
> refers to the old one. I'd have expected this to crash during your testing?

Yes, sorry... that patch is bad.

Now my question is: why do all architectures have the task argument when none of them use it? I understand now that init_new_context() works with current, but what about the *tsk arg?

--
Diego Woitasen
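Andrew's point about `tsk' versus `current' can be shown with a toy model. This is a sketch under a stated assumption, namely that the new task has no mm attached yet when its context is initialised. The toy_* names are hypothetical and only mirror the shape of the code in the patch.

```c
#include <stddef.h>

struct toy_mm   { int ldt_size; };
struct toy_task { struct toy_mm *mm; };

/* Hypothetical model of the fork path: `src` is whichever task the
 * old mm is looked up through. Passing the parent plays the role of
 * `current`; passing the child plays the role of `tsk`. */
int toy_init_new_context(const struct toy_task *src, struct toy_mm *new_mm)
{
	const struct toy_mm *old_mm = src->mm;

	new_mm->ldt_size = 0;
	if (old_mm && old_mm->ldt_size > 0)
		new_mm->ldt_size = old_mm->ldt_size;	/* "copy" the LDT */
	return new_mm->ldt_size;
}
```

In this model, looking through the parent copies the four-entry LDT, while looking through the child finds nothing to copy, so the new context silently ends up empty.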
Re: Documentation for sysfs, hotplug, and firmware loading.
On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> > Always look at the parent devices themselves for determining device
> > context properties.
>
> For determining?
>
> What was the original language of this document?

Ok, that's just being mean, cut it out right now if you ever want my help again.

I'll gladly accept patches for this document that is in the kernel tree now if you want to send them. But criticizing the grammar of a document with statements like this one gets you nowhere and is damn rude.

I suggest you start this thread over if you want my feedback, I'm not going to respond anymore to this one.

greg k-h
Re: Documentation for sysfs, hotplug, and firmware loading.
On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> I'm not trying to document /sys/devices. I'm trying to document hotplug,
> populating /dev, and things like firmware loading that fall out of that.
> This requires use of sysfs, and I'm only trying to document as much of sysfs
> as you need to do that.

Like I stated before, you do not need to even have sysfs mounted to have a dynamic /dev.

And why do you need to document populating /dev dynamically? udev already solves this problem for you, it's not like people who are going off and reinventing udev for their own enjoyment would not at least look at how it solves this problem first. To do otherwise would be foolish :)

Firmware loading is fine to document if you wish to do so. But again, why? We already have multiple userspace programs that provide this feature for them. Perhaps you want to document how to add firmware to a system in order for these different programs to pick them up? Or perhaps you want to document how to add this kind of functionality to your kernel driver so that it can handle firmware loading by using the firmware interface that the kernel provides?

If you just want to document the hotplug/uevent api, then do just that. However I think you are overreaching with your scope here and getting mighty confused in the process.

thanks,

greg k-h
Re: Documentation for sysfs, hotplug, and firmware loading.
On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
> Ok, back up. /sys/devices does not contain all the information necessary to
> populate /dev, because it hasn't got things like
> ramdisks, /dev/zero, /dev/console which are THERE in sysfs, which may or may
> not be supported by the kernel (the kernel might have ramdisk support, might
> not).

Welcome to 2007:

$ ls /sys/devices/virtual/mem/
full  kmem  kmsg  mem  null  port  random  urandom  zero
$ ls /sys/devices/virtual/tty/
console  tty12  tty19  tty25  tty31  tty38  tty44  tty50  tty57  tty63
ptmx     tty13  tty2   tty26  tty32  tty39  tty45  tty51  tty58  tty7
tty      tty14  tty20  tty27  tty33  tty4   tty46  tty52  tty59  tty8
tty0     tty15  tty21  tty28  tty34  tty40  tty47  tty53  tty6   tty9
tty1     tty16  tty22  tty29  tty35  tty41  tty48  tty54  tty60
tty10    tty17  tty23  tty3   tty36  tty42  tty49  tty55  tty61
tty11    tty18  tty24  tty30  tty37  tty43  tty5   tty56  tty62

I suggest you take a close look at the kernel before making statements like the above :)

> These things could also, in future, have their major and minor numbers
> dynamically (even randomly) assigned. That's been discussed on this list.

I tried that once, it will require some core api kernel changes and a lot of infrastructure work to get that to work properly. Not that it will never happen in the future, but it's just not a trivial change at the moment...

thanks,

greg k-h
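For readers following the /dev-population side of this thread: a sysfs "dev" attribute holds the device number as text in MAJOR:MINOR form. A minimal sketch of parsing that text follows. The helper name parse_dev_attr() is made up for illustration, and whether each directory listed above carries such an attribute is an assumption, not something the listing shows.

```c
#include <stdio.h>

/* Parse the contents of a sysfs "dev" attribute, e.g. "1:5\n".
 * Returns 0 on success, -1 if the text is not in MAJOR:MINOR form. */
int parse_dev_attr(const char *text, unsigned int *major, unsigned int *minor)
{
	if (sscanf(text, "%u:%u", major, minor) != 2)
		return -1;
	return 0;
}
```

Because the numbers are read from the file rather than hard-coded, a tool written this way would not care if they were one day assigned dynamically.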
Re: Documentation for sysfs, hotplug, and firmware loading.
On Wednesday 18 July 2007 7:40:20 pm Greg KH wrote: > On Wed, Jul 18, 2007 at 01:39:53PM -0400, Rob Landley wrote: > > PICK ONE! JUST
[GIT PULL] MMC updates
Linus, please pull from

	git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc.git for-linus

to receive the following updates:

 MAINTAINERS                 |    7 ++-
 drivers/mmc/host/at91_mci.c |   13 -
 drivers/mmc/host/sdhci.c    |    2 ++
 drivers/mmc/host/sdhci.h    |    1 +
 4 files changed, 21 insertions(+), 2 deletions(-)

Marc Pignat (1):
      mmc: at91_mci: wakeup on card insertion (or removal)

Pierre Ossman (2):
      mmc: add maintainer for at91
      sdhci: make sure to clear the error interrupt

diff --git a/MAINTAINERS b/MAINTAINERS
index fbe0dca..c9fab2b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -645,7 +645,12 @@ W:	http://linux-atm.sourceforge.net
 S:	Maintained
 
 ATMEL AT91 MCI DRIVER
-S:	Orphan
+P:	Nicolas Ferre
+M:	[EMAIL PROTECTED]
+L:	[EMAIL PROTECTED] (subscribers-only)
+W:	http://www.atmel.com/products/AT91/
+W:	http://www.at91.com/
+S:	Maintained
 
 ATMEL MACB ETHERNET DRIVER
 P:	Haavard Skinnemoen
diff --git a/drivers/mmc/host/at91_mci.c b/drivers/mmc/host/at91_mci.c
index 28c8818..15aab37 100644
--- a/drivers/mmc/host/at91_mci.c
+++ b/drivers/mmc/host/at91_mci.c
@@ -903,8 +903,10 @@ static int __init at91_mci_probe(struct platform_device *pdev)
 	/*
 	 * Add host to MMC layer
 	 */
-	if (host->board->det_pin)
+	if (host->board->det_pin) {
 		host->present = !at91_get_gpio_value(host->board->det_pin);
+		device_init_wakeup(&pdev->dev, 1);
+	}
 	else
 		host->present = -1;
 
@@ -940,6 +942,7 @@ static int __exit at91_mci_remove(struct platform_device *pdev)
 	host = mmc_priv(mmc);
 
 	if (host->present != -1) {
+		device_init_wakeup(&pdev->dev, 0);
 		free_irq(host->board->det_pin, host);
 		cancel_delayed_work(&host->mmc->detect);
 	}
@@ -966,8 +969,12 @@ static int __exit at91_mci_remove(struct platform_device *pdev)
 static int at91_mci_suspend(struct platform_device *pdev, pm_message_t state)
 {
 	struct mmc_host *mmc = platform_get_drvdata(pdev);
+	struct at91mci_host *host = mmc_priv(mmc);
 	int ret = 0;
 
+	if (device_may_wakeup(&pdev->dev))
+		enable_irq_wake(host->board->det_pin);
+
 	if (mmc)
 		ret = mmc_suspend_host(mmc, state);
 
@@ -977,8 +984,12 @@ static int at91_mci_suspend(struct platform_device *pdev, pm_message_t state)
 static int at91_mci_resume(struct platform_device *pdev)
 {
 	struct mmc_host *mmc = platform_get_drvdata(pdev);
+	struct at91mci_host *host = mmc_priv(mmc);
 	int ret = 0;
 
+	if (device_may_wakeup(&pdev->dev))
+		disable_irq_wake(host->board->det_pin);
+
 	if (mmc)
 		ret = mmc_resume_host(mmc);
 
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 10d15c3..4a24db0 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1024,6 +1024,8 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
 
 	intmask &= ~(SDHCI_INT_CMD_MASK | SDHCI_INT_DATA_MASK);
 
+	intmask &= ~SDHCI_INT_ERROR;
+
 	if (intmask & SDHCI_INT_BUS_POWER) {
 		printk(KERN_ERR "%s: Card is consuming too much power!\n",
 			mmc_hostname(host->mmc));
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 7400f4b..a6c8704 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -107,6 +107,7 @@
 #define SDHCI_INT_CARD_INSERT	0x0040
 #define SDHCI_INT_CARD_REMOVE	0x0080
 #define SDHCI_INT_CARD_INT	0x0100
+#define SDHCI_INT_ERROR	0x8000
 #define SDHCI_INT_TIMEOUT	0x0001
 #define SDHCI_INT_CRC	0x0002
 #define SDHCI_INT_END_BIT	0x0004

--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
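The sdhci.c hunk is a plain bit-clearing step, and its effect can be sketched in isolation. The assumption here is that any bits left in intmask after the known ones are cleared get reported as unexpected. SDHCI_INT_ERROR is taken from the patch as it appears in this archive, while the two TOY_* values and the helper name are placeholders invented for the sketch.

```c
#include <stdint.h>

/* SDHCI_INT_ERROR is as shown in the patch; the other two values are
 * placeholders chosen for this sketch, not the real register layout. */
#define SDHCI_INT_ERROR   0x8000u
#define TOY_INT_HANDLED   0x0003u	/* hypothetical: bits already serviced */
#define TOY_INT_BUS_POWER 0x0800u	/* hypothetical */

/* Return the bits that remain after every known bit is cleared; a
 * non-zero result is what a handler would flag as unexpected. */
uint32_t unexpected_bits(uint32_t intmask)
{
	intmask &= ~TOY_INT_HANDLED;
	intmask &= ~SDHCI_INT_ERROR;	/* the step the patch adds */
	intmask &= ~TOY_INT_BUS_POWER;
	return intmask;
}
```

Without the middle line, an interrupt status with the error bit set would leave 0x8000 behind even when its cause had already been serviced.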
Re: [git patches] two warning fixes
On Fri, 2007-07-20 at 20:34 +0200, Krzysztof Halasa wrote:
> Linus Torvalds <[EMAIL PROTECTED]> writes:
>
> > More people *should* generally ask themselves: "was the warning worth it?"
> > and then, if the answer is "no", they shouldn't add code, they should
> > remove the thing that causes the warning in the first place.
>
> Sure. If a routine uses must_check yet its return value may be
> safely ignored then that must_check is simply misplaced and should
> be removed. It does not mean all must_checks are bad - each of them
> isn't bad unless one can demonstrate it is.
>
> Back to sysfs_create_bin_file() - if one can demonstrate a caller
> can safely ignore the return value (which, it seems, is the
> case), then exactly this very must_check should be removed

Typically, the EDID creation in radeonfb :-)

In fact, I'm not even sure there's -any- user of those sysfs files. I added them back then to allow distros to extract the EDID infos that were probed by radeonfb to properly configure the X server (because on some machines, the EDID is coming from the firmware/BIOS, not from DDC, and X can't get at it). I don't know if they ever used them.

In any case, it doesn't make sense to abort initialization of the driver if for some reason those files can't be created (for example, the core fbdev starts exposing EDID files, radeonfb isn't properly updated, name clash, error). Aborting the initialization will make sure that on some machines, such as powermacs with radeon, whatever error is displayed will never be seen by the user.

That's a typical example, but I have plenty more.

For example, the powermac thermal control drivers. They work pretty well by themselves. They also expose via sysfs all the current values, fan speeds, temps, etc... for the sake of whoever wants to do a GUI or "monitor" what's going on, but that is not critical to the operation of the driver. Thus, failure to create those files is not critical.

I have plenty of other examples.

Thus, we have two choices here:

 - The simple one: sysfs_create_blah() displays a warning when it fails and has no must_check

 - The one that adds code everywhere (the current one): sysfs_create_blah() returns an error, has must_check, and thus all callers like I described above need to add code to test it and print a warning. Lots of added .text and .data for little benefit.

Cheers,
Ben.
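The two choices can be put side by side in a userspace sketch. The kernel's __must_check maps to the compiler's warn_unused_result attribute, which is used directly here. The toy_* and create_file_* names are hypothetical and only illustrate where the checking code ends up in each case.

```c
#include <stdio.h>

/* GCC/Clang attribute behind the kernel's __must_check. */
#define toy_must_check __attribute__((warn_unused_result))

/* Hypothetical creator that fails on request (simulating a name clash). */
static int toy_create_file(const char *name, int should_fail)
{
	(void)name;
	return should_fail ? -1 : 0;
}

/* Choice 1: warn inside the helper; callers have nothing to check. */
void create_file_warn(const char *name, int should_fail)
{
	if (toy_create_file(name, should_fail))
		fprintf(stderr, "warning: could not create %s\n", name);
}

/* Choice 2: return the error and make the compiler insist on a check. */
toy_must_check int create_file_checked(const char *name, int should_fail)
{
	return toy_create_file(name, should_fail);
}

/* What every caller then has to add under choice 2.
 * Returns 1 if it had to warn, 0 otherwise. */
int caller_choice2(const char *name, int should_fail)
{
	if (create_file_checked(name, should_fail)) {
		fprintf(stderr, "warning: could not create %s\n", name);
		return 1;
	}
	return 0;
}
```

The if block in caller_choice2() is the code that gets repeated at every call site under the second choice, which is the extra .text and .data being objected to.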
Re: [RFC, Announce] Unified x86 architecture, arch/x86
On Saturday 21 July 2007, Thomas Gleixner wrote:
> The topic of sharing more x86 code has been discussed on LKML a number
> of times. Various approaches were discussed and we decided to advance
> the discussion by implementing a full solution that brings the
> transition to a shared tree to completion.

Great stuff. I've worked on doing the same for s390 and powerpc in the past, and really think it's the right thing to do. I've even started my own x86 merge two or three times in the past but never got very far because of the quickly moving source.

> In this initial implementation the old arch/i386 and arch/x86_64 trees
> are removed _immediately_, in the same commit, and all future x86
> development goes on in the new, shared tree. So the transition right now
> is one atomic operation.
>
> As a next step we plan to generate a gradual, fully bisectable, fully
> working switchover from the current code to the fully populated
> arch/x86 tree. It will result in about 1000-2000 commits. We are
> releasing our current solution because it 100% represents the finally
> resulting arch/x86 source tree already, and we first wanted to make
> sure that the new architecture layout works fine and folks are happy
> before we go and do the (even more complex) fine-grained work.

I don't think it's really good to do it this way, or maybe I'm still misunderstanding where you're going.

If you really want to end up with the exact set of files that you have in your tree now, I see absolutely zero point in making it bisectable. On the contrary, there is nothing particularly complicated in it, so once it has seen some amount of testing it can better get merged in one big changeset.

I'm just not convinced that it actually is what we want to end up with. In my experience, it's very helpful to have a single set of header files, and merging the two versions of one header usually exposes bugs that have been fixed in only one of the two, so you get to fix actual bugs in the process.

In the s390 merge, I also started out in an attempt to guarantee unchanged object files, much like what you describe. However, it turned out that fixing it in the process is actually easier. Either way, 'diff -D __x86_64__' is a great tool for a start, you should try it out to see how easy it is to merge a lot of files.

To put it into perspective, I think the s390 merge was a lot easier than the x86 merge, because there is only a very limited set of hardware configurations for s390 compared to others. We ended up doing the full merge with three people within less than a week and no separate files at all. OTOH, the powerpc merge is now going into its third year, mostly because it was started with the intention to remove all cruft in the process and to only allow sane code into the new architecture.

The steps that I'd suggest instead are:

* merge all exported header files of the two architectures. This alone is a worthy goal, because it allows us to get rid of the ugly code for deciding which version to use in installed headers and elsewhere.

* Merge the remaining header files, to end up with a single include/asm-x86 directory.

* Come up with a model that integrates the machine type selection of i386 with the way we build things on x86_64. One way would be to make X86_64 another platform next to X86_PC, X86_VOYAGER and the others.

* Create an arch/x86/Kconfig that handles the new common configuration

* Create an arch/x86/Makefile that descends into ../i386/* and ../x86_64/* instead of its subdirectories.

* Merge the arch/x86/* subdirectories, one at a time, starting with the low-hanging fruit like oprofile or pci, and do the hard ones like mm and kernel last.

Unfortunately, I don't think I'll spend much time on this, so I don't get to decide on it, but you asked for feedback ;-)

	Arnd <><
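For anyone who has not used the 'diff -D __x86_64__' trick mentioned in this message: run on the 32-bit and 64-bit versions of a file, it emits one merged file in which the differing parts are wrapped in preprocessor conditionals. A rough sketch of the resulting shape follows. The constant and function are invented for illustration and do not come from any real header.

```c
/* Sketch of the merged shape only; TOY_WORD_BITS is an invented name. */
#ifndef __x86_64__
#define TOY_WORD_BITS 32	/* text that was only in the 32-bit file */
#else /* __x86_64__ */
#define TOY_WORD_BITS 64	/* text that was only in the 64-bit file */
#endif /* __x86_64__ */

/* Lines common to both files appear once, outside any conditional. */
int toy_word_bits(void)
{
	return TOY_WORD_BITS;
}
```

The merged file builds on either architecture unchanged, and every place where the two versions disagree is left marked for a human to reconcile.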
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
On 21/07/07, Satyam Sharma <[EMAIL PROTECTED]> wrote: Oh, which means ... On 7/21/07, Satyam Sharma <[EMAIL PROTECTED]> wrote: > On 7/21/07, Greg KH <[EMAIL PROTECTED]> wrote: > > On Fri, Jul 20, 2007 at 03:59:12PM -0700, Andrew Morton wrote: > > > On Fri, 20 Jul 2007 15:50:47 -0700 > > > Greg KH <[EMAIL PROTECTED]> wrote: > > > > > > > On Fri, Jul 20, 2007 at 06:32:21PM +0200, Michal Piotrowski wrote: > > > > > Hi Greg, > > > > > > > > > > This looks like a sysfs bug > > > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/ > > > > > broken-out-2007-07-20-00-22/3.jpg > > > > > > > > > > l *kernel_param_sysfs_setup+0x75 > > > > > 0xc13c0894 is in kernel_param_sysfs_setup (kernel/params.c:570). > > > > > 565 mk->mod = THIS_MODULE; > > > > > 566 kobj_set_kset_s(mk, module_subsys); > > > > > 567 kobject_set_name(&mk->kobj, name); Shouldn't the return of kobject_set_name() be checked here? [ Looking at code, and realizing that kobject_set_name() manages to succeed even when given a null string! ] > > > > > 568 kobject_init(&mk->kobj); > > > > > 569 ret = kobject_add(&mk->kobj); > > > > > 570 BUG_ON(ret < 0); > > > > > 571 param_sysfs_setup(mk, kparam, num_params, name_skip); > > > > > 572 kobject_uevent(&mk->kobj, KOBJ_ADD); > > > > > 573 } > > > > > 574 > > > > > > > > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/ > > > > > broken-out-2007-07-20-00-22/mm-config > > > > > > > > What kernel version is this happening on? The -mm tree? Can you try > > > > Linus's tree instead? > > > > > > > > It looks like there was some needed information right before the first > > > > stack dump, showing exactly what kobject was trying to be added that was > > > > already present. Odds are this is a kernel parameter with the same name > > > > as a duplicate one within the same module, > > I don't think that's an -EEXIST. > > I think what we have here is kobject_add() exiting with -EINVAL. > (kobject attempted to be registered with no name!) > > [ The first trace on that screen shows: kobject_shadow_add+0x5b/0x189. > That's the WARN_ON(1) at lib/kobject.c:176. If it was a EEXIST case, > we would've seen an offset in kobject_shadow_add closer to 0x189, > because the dump_stack() for EEXIST is barely 4 instructions before > we return from that function. ] > > > > > but the trick is going to be > > > > trying to figure out what module is causing this. > > So I'd guess we want to search for a module that's passing a kobject * > to kobject_add() such that !kobj->k_name is true. Oh, that's kernel_param_sysfs_setup itself. So we actually need to search for a built-in module in Michal's config that ... has an ... empty "" modname !? I'll try to figure out this Shouldn't that turn up pretty quickly in a grep? How do I do that, btw? Regards, Michal -- LOG http://www.stardust.webpages.pl/log/
Re: v2.6.22.1-rt3
On Thu, 2007-07-19 at 20:37 -0700, Daniel Walker wrote:
> The broken out series is here,
> ftp://source.mvista.com/pub/dwalker/rt/patch-2.6.22.1-rt4-dw1.tar.gz

I'll pick that up soon. Thanks,

	tglx
Re: Documentation for sysfs, hotplug, and firmware loading.
On Thursday 19 July 2007 4:16:17 am Cornelia Huck wrote: > On Wed, 18 Jul 2007 13:39:53 -0400, > > Rob Landley <[EMAIL PROTECTED]> wrote: > > Nope. If you recurse down under /sys/class following symlinks, you go > > into an endless loop bouncing off of /sys/devices and getting pointed > > back. If you don't follow symlinks, it works fine up until about 2.6.20 > > at which point things that were previously directories BECAME symlinks > > because the directories got moved, and it all broke. > > I have no idea what you're doing. See the email to kay sievers. In 2.6.14 following symlinks hit an endless /sys/block/hda/device/block/device/block/device/block... This has changed since, like much of sysfs, but in the absence of either a spec or a stable API there's no guarantee it won't reoccur. > > Which is why I want it documented where to look for these suckers. Just > > give me ONE STABLE WAY TO FIND THIS INFORMATION, PLEASE. > > See Documentation/sysfs-rules.txt. Ok: Paragraph 1: "It's not stable." Paragraph 2: "It's not stable." Paragraph 3: If you really really need to access it directly... Paragraph 4: DO NOT DO $XXX. Paragraph 5: Expect it to be mounted at /sys Paragraph 6: DO NOT DO $XXX. (Specficially, the way you were distinguishing between block and char devices? Don't do that. No, we won't tell you what to replace it with, keep reading.) So far, not exactly gripping reading. Paragraph 7: What a devpath is. Ok, is it just me or does it say that applications shouldn't use the symlinks in sysfs? Why are they there, then? Paragraph 8: The kernel has a name for the device. Paragraph 9: Subsystem is a string. What it means, we leave for you to guess. Paragraph 10: Driver is the name of a driver. (Does this mean a driver is currently loaded and handling the device, or that the kernel is suggesting a driver based on something like PCI ID, through the kind of mechanism that used to be used to request module loading? 
Experimentally, it looks like the first, which makes sense but isn't specified. Does something like /sys/class/mem/zero or have a driver? Experimentally, no, it hasn't got a device link.) Paragraph 11: Attributes, and yet more DO NOT DO $XXX. It took me three reads of that to figure out they probably meant "Attributes belong to a device, don't confuse the attributes of another device with attributes of this device." (Following _which_ device symlink?) Ok, back up. /sys/devices does not contain all the information necessary to populate /dev, because it hasn't got things like ramdisks, /dev/zero, /dev/console which are THERE in sysfs, which may or may not be supported by the kernel (the kernel might have ramdisk support, might not). These things could also, in future, have their major and minor numbers dynamically (even randomly) assigned. That's been discussed on this list. I'm not trying to document /sys/devices. I'm trying to document hotplug, populating /dev, and things like firmware loading that fall out of that. This requires use of sysfs, and I'm only trying to document as much of sysfs as you need to do that. I'm not documenting stuff like /sys/devices/system/cpu. The consensus so far is "the udev implementation is the spec", except I watched the udev implementation change rather a lot before I stopped tracking it, and saw a number of people complain on this list about things breaking when they upgraded the kernel but not udev. Back to reading the document: > - Properties of parent devices never belong into a child device. Belong into? > Always look at the parent devices themselves for determining device > context properties. For determining? What was the original language of this document? > If the device 'eth0' or 'sda' does not have a > "driver"-link, then this device does not have a driver. Again, it doesn't say whether they mean "the kernel was not built with a driver that can handle this device" or "no driver is currently loaded and handling this device". 
It _sounds_ like "this device is not supported by Linux", which probably isn't what they meant. > Never copy any property of the parent-device into a child-device. I note that the only mention made so far of parent-child relationships in devices is in terms of "don'ts". I assume they're talking about how a partition can be the child of a block device, and a network controller card can be the child of a pci bus device? Ah, I see. The next paragraph is on hierarchy, yet doesn't actually explain anything, other than to imply that the device hierarchy being fully represented there is a dream to be achieved sometime in the future but not necessarily the truth with today's kernels, because stuff is still being _moved_ into /sys/devices. > - Classification by subsystem > There are currently three places for classification of devices: > /sys/block, /sys/class and /sys/bus. So if somebody wants to write code that runs on a current kernel, they have no alternative but to
Re: [RFC, Announce] Unified x86 architecture, arch/x86
On 21/07/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Michal Piotrowski <[EMAIL PROTECTED]> wrote:
>
> > > We are pleased to announce a project we've been working on for some
> > > time: the unified x86 architecture tree, or "arch/x86" - and we'd
> > > like to solicit feedback about it.
> > >
> > > What is this about?
> > [..]
> > > As usual, comments and suggestions are welcome!
> >
> > I really like this idea - code duplication is a bad thing.
> >
> > BTW. I don't see any regression here :)
>
> cool - could you tell us a bit more about on what type of box you tried
> it,

it is an old P4 (i386)

> and how wide and versatile the .config is?

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-git15/config

> Ingo

Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/
Re: [RFC, Announce] Unified x86 architecture, arch/x86
Alan Cox wrote:
> On Fri, 20 Jul 2007 18:38:39 -0400
> Jeff Garzik <[EMAIL PROTECTED]> wrote:
>
>> I agree with Andi... it's quite nice to be able to leave some arch/i386
>> stuff, and not carry it over to arch/x86-64.
>
> It's easy enough to push that stuff into arch/x86/legacy and have one
> subdirectory of stuff to pull in for ancient systems.

The other thing is that "legacy" in this context is fungible. No IOMMU was legacy until the Intel x86-64 chips came out, and I can promise you that some legacy code will be necessary once we start seeing VIA and others come out with embedded x86-64.

On the other hand, it's pretty bloody safe to assume that we'll never see an x86-64 chip without CPUID, CMOV, FXSAVE, SSE-2, CMPXCHG, etc.

	-hpa
Re: joydev.c and saitek cyborg evo force
On 20/06/07, Jiri Kosina <[EMAIL PROTECTED]> wrote: Could you please send me the report descriptor of the device, so that I could debug it locally here? Hi Jiri, sorry for the delay, below the report descriptor and attached is the full report when I've connected the joystick. report descriptor (size 851, read 851) = 05 01 09 04 a1 01 09 01 a1 00 85 06 09 30 15 00 26 00 10 35 00 46 00 10 75 10 95 01 81 02 09 31 81 02 05 02 09 bb 26 ff 00 46 ff 00 75 08 81 02 05 09 19 01 29 0c 25 01 45 01 75 01 95 0c 81 02 05 01 09 39 25 07 46 3b 01 55 00 65 44 75 04 95 01 81 42 65 00 05 02 09 ba 26 ff 00 46 ff 00 75 08 81 02 c0 05 0f 09 92 a1 02 85 02 09 a6 09 a4 09 a0 09 9f 25 01 45 00 75 01 95 04 81 02 75 04 95 01 81 03 09 22 75 07 25 09 81 02 09 94 75 01 25 01 81 02 75 08 81 03 c0 09 21 a1 02 85 0b 09 22 25 09 91 02 09 25 a1 02 09 26 09 30 09 32 09 31 09 33 09 34 09 40 09 41 15 01 25 08 91 00 c0 09 53 25 0c 75 05 91 02 09 56 15 00 25 01 75 01 91 02 09 55 a1 02 05 01 09 30 09 31 95 02 91 02 c0 05 0f 09 50 27 fe ff 00 00 47 fe ff 00 00 75 10 95 01 55 fd 66 01 10 91 02 55 00 65 00 09 57 26 ff 00 46 68 01 75 08 65 44 91 02 65 00 09 54 27 fe ff 00 00 47 fe ff 00 00 75 10 55 fd 66 01 10 91 02 55 00 65 00 09 58 a1 02 05 0a 09 01 09 02 26 2b 01 45 00 95 02 91 02 c0 05 0f 09 a7 27 fe ff 00 00 47 fe ff 00 00 95 01 55 fd 66 01 10 91 02 55 00 65 00 c0 09 5a a1 02 85 0c 09 23 26 2b 01 45 00 91 02 09 5c 26 10 27 46 10 27 55 fd 66 01 10 91 02 55 00 65 00 09 5b 25 7f 75 08 91 02 09 5e 26 10 27 75 10 55 fd 66 01 10 91 02 55 00 65 00 09 5d 25 7f 75 08 91 02 c0 09 73 a1 02 85 0d 09 23 26 2b 01 45 00 75 10 91 02 09 70 15 81 25 7f 36 f0 d8 46 10 27 75 08 91 02 c0 09 6e a1 02 85 0e 09 23 15 00 26 2b 01 35 00 45 00 75 10 91 02 09 70 25 7f 46 10 27 75 08 91 02 09 6f 15 81 36 f0 d8 91 02 09 71 15 00 26 ff 00 35 00 46 68 01 91 02 09 72 26 10 27 46 10 27 75 10 55 fd 66 01 10 91 02 55 00 65 00 c0 09 5f a1 02 85 0f 09 23 26 2b 01 45 00 91 02 09 61 15 9c 25 64 36 f0 d8 46 10 27 75 08 91 02 09 62 91 02 09 60 
16 0c fe 26 f4 01 75 10 91 02 09 65 15 00 26 e8 03 35 00 91 02 09 63 25 64 75 08 91 02 09 64 91 02 c0 09 77 a1 02 85 51 09 22 25 09 45 00 91 02 09 78 a1 02 09 7b 09 79 09 7a 15 01 25 03 91 00 c0 09 7c 15 00 26 fe 00 91 02 c0 09 92 a1 02 85 52 09 96 a1 02 09 9a 09 99 09 97 09 98 09 9b 09 9c 15 01 25 06 91 00 c0 c0 05 ff 0a 01 03 a1 02 85 40 0a 02 03 a1 02 1a 11 03 2a 20 03 25 10 91 00 c0 0a 03 03 15 00 27 ff ff 00 00 75 10 91 02 c0 05 0f 09 7d a1 02 85 43 09 7e 26 80 00 46 10 27 75 08 91 02 c0 09 85 a1 02 85 44 09 86 27 ff ff 00 00 45 00 75 10 91 02 09 87 91 02 09 88 91 02 c0 05 ff 0a 00 01 a1 02 85 81 05 01 09 30 15 81 25 7f 36 f0 d8 46 10 27 75 08 91 02 09 31 91 02 c0 05 0f 09 7f a1 02 85 0b 09 80 15 00 26 ff 7f 35 00 45 00 75 0f b1 03 09 a9 25 01 75 01 b1 03 09 83 26 ff 00 75 08 b1 03 09 84 25 10 b1 03 09 a8 a1 02 09 73 09 6e 09 5a 09 5f 95 04 b1 03 c0 c0 c0 cheers, --renato Reclaim your digital rights, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm joy-dmesg.log.gz Description: GNU Zip compressed data
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
Oh, which means ...

On 7/21/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:
> On 7/21/07, Greg KH <[EMAIL PROTECTED]> wrote:
> > On Fri, Jul 20, 2007 at 03:59:12PM -0700, Andrew Morton wrote:
> > > On Fri, 20 Jul 2007 15:50:47 -0700
> > > Greg KH <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Fri, Jul 20, 2007 at 06:32:21PM +0200, Michal Piotrowski wrote:
> > > > > Hi Greg,
> > > > >
> > > > > This looks like a sysfs bug
> > > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/broken-out-2007-07-20-00-22/3.jpg
> > > > >
> > > > > l *kernel_param_sysfs_setup+0x75
> > > > > 0xc13c0894 is in kernel_param_sysfs_setup (kernel/params.c:570).
> > > > > 565 mk->mod = THIS_MODULE;
> > > > > 566 kobj_set_kset_s(mk, module_subsys);
> > > > > 567 kobject_set_name(&mk->kobj, name);

Shouldn't the return of kobject_set_name() be checked here?

[ Looking at code, and realizing that kobject_set_name() manages to
succeed even when given a null string! ]

> > > > > 568 kobject_init(&mk->kobj);
> > > > > 569 ret = kobject_add(&mk->kobj);
> > > > > 570 BUG_ON(ret < 0);
> > > > > 571 param_sysfs_setup(mk, kparam, num_params, name_skip);
> > > > > 572 kobject_uevent(&mk->kobj, KOBJ_ADD);
> > > > > 573 }
> > > > > 574
> > > > >
> > > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/broken-out-2007-07-20-00-22/mm-config
> > > >
> > > > What kernel version is this happening on? The -mm tree? Can you try
> > > > Linus's tree instead?
> > > >
> > > > It looks like there was some needed information right before the first
> > > > stack dump, showing exactly what kobject was trying to be added that was
> > > > already present. Odds are this is a kernel parameter with the same name
> > > > as a duplicate one within the same module,
>
> I don't think that's an -EEXIST. I think what we have here is
> kobject_add() exiting with -EINVAL. (kobject attempted to be
> registered with no name!)
>
> [ The first trace on that screen shows: kobject_shadow_add+0x5b/0x189.
> That's the WARN_ON(1) at lib/kobject.c:176. If it was a EEXIST case,
> we would've seen an offset in kobject_shadow_add closer to 0x189,
> because the dump_stack() for EEXIST is barely 4 instructions before
> we return from that function. ]
>
> > > > but the trick is going to be
> > > > trying to figure out what module is causing this.
>
> So I'd guess we want to search for a module that's passing a
> kobject * to kobject_add() such that !kobj->k_name is true.

Oh, that's kernel_param_sysfs_setup itself. So we actually need to search for a built-in module in Michal's config that ... has an ... empty "" modname !?

Shouldn't that turn up pretty quickly in a grep? How do I do that, btw?
Re: [RFC, Announce] Unified x86 architecture, arch/x86
* Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> >We are pleased to announce a project we've been working on for some
> >time: the unified x86 architecture tree, or "arch/x86" - and we'd
> >like to solicit feedback about it.
> >
> >What is this about?
> [..]
> >As usual, comments and suggestions are welcome!
>
> I really like this idea - code duplication is a bad thing.
>
> BTW. I don't see any regression here :)

cool - could you tell us a bit more about on what type of box you tried it, and how wide and versatile the .config is?

	Ingo
Re: [RFC, Announce] Unified x86 architecture, arch/x86
Hi,

On 21/07/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> We are pleased to announce a project we've been working on for some
> time: the unified x86 architecture tree, or "arch/x86" - and we'd
> like to solicit feedback about it.
>
> What is this about?

[..]

> As usual, comments and suggestions are welcome!

I really like this idea - code duplication is a bad thing.

BTW. I don't see any regression here :)

> Thomas, Ingo

Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/
Re: [RFC, Announce] Unified x86 architecture, arch/x86
On Fri, 20 Jul 2007 18:38:39 -0400
Jeff Garzik <[EMAIL PROTECTED]> wrote:

> I agree with Andi... it's quite nice to be able to leave some arch/i386
> stuff, and not carry it over to arch/x86-64.

It's easy enough to push that stuff into arch/x86/legacy and have one subdirectory of stuff to pull in for ancient systems.

Alan
Re: [PATCH v3] pcmcia: CompactFlash driver for PA Semi Electra boards
On Thu, 5 Jul 2007 09:49:14 -0500 [EMAIL PROTECTED] (Olof Johansson) wrote:

> Driver for the CompactFlash slot on the PA Semi Electra eval board. It's
> a simple device sitting on localbus, with interrupts and detect/voltage
> control over GPIO.
>
> The driver is implemented as an of_platform driver, and adds localbus
> as a bus being probed by the of_platform framework.
>
>
> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>
>
> ---
>
> On Mon, Jun 25, 2007 at 03:43:41PM -0500, olof wrote:
>
> > The ifdef is needed since for CONFIG_PCMCIA=n builds, the bus notifier
> > isn't available. I wanted to do the bus notifier registration explicitly
> > before the of_platform bus probe to avoid later surprises due to reordered
> > initcalls in case it was split up in its own initcall.
> >
> > I could add the code under ifdef as well, but it didn't seem too
> > critical. Once the second major board comes along I'll probably move it
> > out to a per-board file, there's no real need for it just yet.
>
> Alright, turns out I still need to declare the extern bus type, which would mean
> two #ifdefs in one function. Moving it out instead.
>
> I've addressed Milton's comments as well.
>
> Who's maintaining PCMCIA? MAINTAINERS only lists a mailing list, no person. Seems
> weird for a component that's marked as maintained.

Dominik Brodowski. He's having a bit of downtime at present (exams, I think). He expects to return. Meanwhile, cc'ing me usually has some effect.

>
> ...
>
> +static const char driver_name[] = "electra-cf";
>
> ...
>
> +static struct of_device_id electra_cf_match[] =
> +{
> +	{
> +		.compatible = "electra-cf",
> +	},
> +	{},
> +};

Could have reused driver_name[] here, if that was appropriate.

> +static struct of_platform_driver electra_cf_driver =
> +{
> +	.name = (char *)driver_name,

ug. But it's not your fault - we should have always made it const.

> --- mainline.orig/arch/powerpc/platforms/pasemi/setup.c
> +++ mainline/arch/powerpc/platforms/pasemi/setup.c

I never know who maintains random-scruffy-ppc code like this. From a peek in the git-whatchanged output, it appears to be yourself.

Have a few little fixies:

--- a/drivers/pcmcia/electra_cf.c~pcmcia-compactflash-driver-for-pa-semi-electra-boards-fix
+++ a/drivers/pcmcia/electra_cf.c
@@ -201,9 +201,7 @@ static int __devinit electra_cf_probe(st
 	if (!cf)
 		return -ENOMEM;
 
-	init_timer(&cf->timer);
-	cf->timer.function = electra_cf_timer;
-	cf->timer.data = (unsigned long) cf;
+	setup_timer(&cf->timer, electra_cf_timer, (unsigned long)cf);
 	cf->irq = NO_IRQ;
 
 	cf->ofdev = ofdev;
@@ -340,16 +338,14 @@ static int __devexit electra_cf_remove(s
 	return 0;
 }
 
-static struct of_device_id electra_cf_match[] =
-{
+static struct of_device_id electra_cf_match[] = {
 	{
 		.compatible = "electra-cf",
 	},
 	{},
 };
 
-static struct of_platform_driver electra_cf_driver =
-{
+static struct of_platform_driver electra_cf_driver = {
 	.name = (char *)driver_name,
 	.match_table	= electra_cf_match,
 	.probe		= electra_cf_probe,
@@ -371,4 +367,3 @@ module_exit(electra_cf_exit);
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR ("Olof Johansson <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION("PA Semi Electra CF driver");
-
_
Re: [PATCH] x86: Create clflush() inline, remove hardcoded wbinvd
Glauber de Oliveira Costa wrote:
> On Fri, 2007-07-20 at 14:19 -0700, H. Peter Anvin wrote:
>> Create an inline function for clflush(), with the proper arguments,
>> and use it instead of hard-coding the instruction.
>>
>> This also removes one instance of hard-coded wbinvd, based on a patch
>> by Bauder de Oliveira Costa.
> Hey, Who's that guy that got a name so close to mine? ;-)

That would be Mr. Typo!

>> Cc: Andi Kleen <[EMAIL PROTECTED]>
>> Cc: Glauber de Oliveira Costa <[EMAIL PROTECTED]>

I got it right here at least :-/

	-hpa
Re: [PATCH] compat_ioctl requires CONFIG_BLOCK
On Saturday 21 July 2007, Sebastian Siewior wrote:
>
> Got with randconfig
> include/linux/loop.h:66: error: expected specifier-qualifier-list before
> 'request_queue_t'
> make[1]: *** [fs/compat_ioctl.o] Error 1
>
> parts of compat ioctl require CONFIG_BLOCK to be set.
>
> Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
> Index: b/fs/compat_ioctl.c
> ===
> --- a/fs/compat_ioctl.c
> +++ b/fs/compat_ioctl.c
> @@ -63,7 +63,9 @@
>  #include
>  #include
>  #include
> +#ifdef CONFIG_BLOCK
>  #include
> +#endif

Adding #ifdef around an #include is considered bad style. Better just make loop.h compile without any conditionals. Does the below patch work for you?

	Arnd <><

--- a/include/linux/loop.h
+++ b/include/linux/loop.h
@@ -63,7 +63,7 @@ struct loop_device {
 	struct task_struct	*lo_thread;
 	wait_queue_head_t	lo_event;
 
-	request_queue_t		*lo_queue;
+	struct request_queue	*lo_queue;
 	struct gendisk		*lo_disk;
 	struct list_head	lo_list;
 };
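Why the one-line change above can work without any #ifdef: a typedef name such as request_queue_t only exists once the header that declares it has been included, whereas `struct request_queue *` is a pointer to an incomplete type and needs no definition at all. The following is a minimal userspace sketch of that C rule, not kernel code; `loop_device_sketch` and `loop_sketch_has_queue()` are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * No header that defines struct request_queue is included here. The tag is
 * only forward-declared, which makes it an incomplete type. That is enough
 * as long as the code stores or passes pointers and never dereferences them.
 */
struct request_queue;

/* Hypothetical, cut-down stand-in for struct loop_device (illustration only). */
struct loop_device_sketch {
	int lo_number;
	struct request_queue *lo_queue;	/* fine: pointer to incomplete type */
	/* request_queue_t *lo_queue;	   would fail: typedef name is unknown here */
};

/* Returns nonzero once a queue pointer has been attached to the device. */
static int loop_sketch_has_queue(const struct loop_device_sketch *lo)
{
	assert(lo != NULL);
	return lo->lo_queue != NULL;
}
```

The design point is the one Arnd makes: the header stays free of conditionals, and only code that actually looks inside the queue needs the full definition.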
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
On 7/21/07, Greg KH <[EMAIL PROTECTED]> wrote:
> On Fri, Jul 20, 2007 at 03:59:12PM -0700, Andrew Morton wrote:
> > On Fri, 20 Jul 2007 15:50:47 -0700
> > Greg KH <[EMAIL PROTECTED]> wrote:
> >
> > > On Fri, Jul 20, 2007 at 06:32:21PM +0200, Michal Piotrowski wrote:
> > > > Hi Greg,
> > > >
> > > > This looks like a sysfs bug
> > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/broken-out-2007-07-20-00-22/3.jpg
> > > >
> > > > l *kernel_param_sysfs_setup+0x75
> > > > 0xc13c0894 is in kernel_param_sysfs_setup (kernel/params.c:570).
> > > > 565 mk->mod = THIS_MODULE;
> > > > 566 kobj_set_kset_s(mk, module_subsys);
> > > > 567 kobject_set_name(&mk->kobj, name);
> > > > 568 kobject_init(&mk->kobj);
> > > > 569 ret = kobject_add(&mk->kobj);
> > > > 570 BUG_ON(ret < 0);
> > > > 571 param_sysfs_setup(mk, kparam, num_params, name_skip);
> > > > 572 kobject_uevent(&mk->kobj, KOBJ_ADD);
> > > > 573 }
> > > > 574
> > > >
> > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/broken-out-2007-07-20-00-22/mm-config
> > >
> > > What kernel version is this happening on? The -mm tree? Can you try
> > > Linus's tree instead?
> > >
> > > It looks like there was some needed information right before the first
> > > stack dump, showing exactly what kobject was trying to be added that was
> > > already present. Odds are this is a kernel parameter with the same name
> > > as a duplicate one within the same module,

I don't think that's an -EEXIST. I think what we have here is kobject_add() exiting with -EINVAL. (kobject attempted to be registered with no name!)

[ The first trace on that screen shows: kobject_shadow_add+0x5b/0x189.
That's the WARN_ON(1) at lib/kobject.c:176. If it was a EEXIST case,
we would've seen an offset in kobject_shadow_add closer to 0x189,
because the dump_stack() for EEXIST is barely 4 instructions before
we return from that function. ]

> > > but the trick is going to be
> > > trying to figure out what module is causing this.

So I'd guess we want to search for a module that's passing a kobject * to kobject_add() such that !kobj->k_name is true.

> > > So it's not a sysfs bug, but rather a driver issue that this is
> > > catching.
> >
> > In that case a BUG was way too harsh treatment, and in fact directly
> > contributed to our inability to debug the bug!
> >
> > Can we wind that back a bit? Add some useful printks and then recover
> > in some fashion?

[...]

> So I'm guessing he was trying to catch something specific here.

Considering that:

(1) This isn't a bug that should bring down the kernel that hard, and,
(2) kobject_shadow_add() seems to be dumping enough stacks and
printing printk's on errors already,

I'd suggest to just get rid of the BUG_ON() in kernel_param_sysfs_setup()

Satyam
where is the code for read system call?
My application reads from a socket. I need to change the behavior of the read system call for an experiment. Can someone point me to the code? Thanks.
Re: [PATCH][SELinux] Let us not leak memory in SELinux : security_netlbl_cache_add()
On Sat, 21 Jul 2007, Jesper Juhl wrote:

> Hi,
>
> Leaking memory is a bad idea, so let's not do it, in
> security/selinux/ss/services.c::security_netlbl_cache_add().
>
> Note: The Coverity checker gets credit for spotting this one.
> Note: Patch has only been compile tested.

Thanks!

Verified and applied to:
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6.git#for-linus

- James
--
James Morris
<[EMAIL PROTECTED]>
Re: [PATCH 2/7] console: fix section mismatch warning in vgacon.c
On Fri, 2007-07-20 at 23:27 +0200, Sam Ravnborg wrote:
> Fix following section mismatch warning:
> WARNING: vmlinux.o(.text+0x121e62): Section mismatch: reference to
> .init.text:__alloc_bootmem (between 'vgacon_startup' and 'vgacon_scrolldelta')
>
> Browsing the code it seems that vgacon_scrollback_startup() is only
> called during the init phase so the reference to the .init.text
> section is OK.
> Teach modpost not to warn using __init_refok.
>
> Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>

Acked-by: Antonino Daplas <[EMAIL PROTECTED]>

Tony
Re: [kvm-devel] [GIT PULL][RESEND] Late KVM Updates for the 2.6.23 merge window
On Sat, 21 Jul 2007, S.Çağlar Onur wrote:
>
> With Linus's latest git, shutting down a guest (fired with -smp 2 -m 512)
> sometimes ends up like [1], this occurred as soon as qemu window closed.
>
> [1] http://cekirdek.pardus.org.tr/~caglar/kvm/dmesg.latest

[ 737.460654] Bad page state in process 'qemu-kvm'
[ 737.460656] page:f5e68000 flags:0xea02 mapping: mapcount:2 count:0
[ 737.460657] Trying to fix it up, but a reboot is needed
[ 737.460659] Backtrace:
[ 737.460691] [] bad_page+0x64/0x8e
[ 737.460733] [] free_hot_cold_page+0x68/0x15a

That's the "free_pages_check()", and in particular it seems to be "page_mapcount()" being non-zero that triggered that thing.

So it looks like something in KVM isn't coherent about the mapping vs the usage counters..

		Linus
Re: [PATCH] x86: Create clflush() inline, remove hardcoded wbinvd
On Fri, 2007-07-20 at 14:19 -0700, H. Peter Anvin wrote:
> Create an inline function for clflush(), with the proper arguments,
> and use it instead of hard-coding the instruction.
>
> This also removes one instance of hard-coded wbinvd, based on a patch
> by Bauder de Oliveira Costa.

Hey, Who's that guy that got a name so close to mine? ;-)

>
> Cc: Andi Kleen <[EMAIL PROTECTED]>
> Cc: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
> Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>
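The patch body is not quoted in this thread, so the following is only a rough userspace sketch of what "an inline function for clflush(), with the proper arguments" can look like with GCC-style extended asm. The names `sketch_clflush()` and `sketch_store_and_flush()` are hypothetical, and the non-x86-64 fallback is an assumption added so the example builds elsewhere. The `"+m"` constraint tells the compiler that the instruction touches the addressed memory, which is the main advantage over a hand-written opcode with no operands.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Flush the cache line containing *p. This is an illustration only, not the
 * kernel's helper; on anything other than x86-64 it compiles to a no-op.
 */
static inline void sketch_clflush(volatile void *p)
{
#if defined(__x86_64__)
	/* "+m" makes the memory operand visible to the compiler as read/write. */
	__asm__ __volatile__("clflush %0" : "+m" (*(volatile char *)p));
#else
	(void)p;
#endif
}

/* Store a value, flush its cache line, and read the value back. */
static int sketch_store_and_flush(int *slot, int value)
{
	assert(slot != NULL);
	*slot = value;
	sketch_clflush(slot);
	return *slot;	/* flushing a line never changes the stored data */
}
```

wbinvd is left out of the sketch on purpose: it is a privileged instruction and cannot be run from userspace.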
[PATCH] i2o: defined but not used.
Got with randconfig

drivers/message/i2o/exec-osm.c:539: warning: 'i2o_exec_lct_notify' defined but not used

Signed-off-by: Sebastian Siewior <[EMAIL PROTECTED]>
Index: b/drivers/message/i2o/exec-osm.c
===
--- a/drivers/message/i2o/exec-osm.c
+++ b/drivers/message/i2o/exec-osm.c
@@ -389,9 +389,7 @@ static void i2o_exec_lct_modified(struct
 	if (i2o_device_parse_lct(c) != -EAGAIN)
 		change_ind = c->lct->change_ind + 1;
 
-#ifdef CONFIG_I2O_LCT_NOTIFY_ON_CHANGES
 	i2o_exec_lct_notify(c, change_ind);
-#endif
 };
 
 /**
@@ -525,6 +523,7 @@ int i2o_exec_lct_get(struct i2o_controll
 	return rc;
 }
 
+#ifdef CONFIG_I2O_LCT_NOTIFY_ON_CHANGES
 /**
  * i2o_exec_lct_notify - Send a asynchronus LCT NOTIFY request
  * @c: I2O controller to which the request should be send
@@ -570,6 +569,13 @@ static int i2o_exec_lct_notify(struct i2
 	return 0;
 };
 
+#else
+static int i2o_exec_lct_notify(struct i2o_controller *c, u32 change_ind)
+{
+	return 0;
+}
+#endif
+
 /* Exec OSM driver struct */
 struct i2o_driver i2o_exec_driver = {
 	.name = OSM_NAME,
--
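The pattern in this patch, moving the #ifdef from the call site to the definition and supplying an empty stub for the disabled case, can be shown in a few lines of plain C. This is a hedged userspace sketch, not the driver code: `SKETCH_LCT_NOTIFY_ON_CHANGES`, `sketch_controller`, `sketch_lct_notify()` and `sketch_lct_modified()` are hypothetical stand-ins, and the option is left undefined so the stub branch is the one that gets built.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for CONFIG_I2O_LCT_NOTIFY_ON_CHANGES. */
/* #define SKETCH_LCT_NOTIFY_ON_CHANGES 1 */

/* Hypothetical, cut-down controller state (illustration only). */
struct sketch_controller {
	unsigned int change_ind;
	int notified;
};

#ifdef SKETCH_LCT_NOTIFY_ON_CHANGES
/* Real variant: record that a notify request was issued. */
static int sketch_lct_notify(struct sketch_controller *c, unsigned int change_ind)
{
	c->change_ind = change_ind;
	c->notified = 1;
	return 0;
}
#else
/* Stub variant: same signature, does nothing, so callers need no #ifdef. */
static int sketch_lct_notify(struct sketch_controller *c, unsigned int change_ind)
{
	(void)c;
	(void)change_ind;
	return 0;
}
#endif

/* The caller is identical in both configurations. */
static void sketch_lct_modified(struct sketch_controller *c)
{
	assert(c != NULL);
	sketch_lct_notify(c, c->change_ind + 1);
}
```

Because the function is always defined and always referenced, the compiler has nothing to report as "defined but not used" in either configuration.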
Re: [broken-out-2007-07-20-00-22] kernel bug at kernel/params:570
On Fri, 20 Jul 2007 16:10:52 -0700 Greg KH wrote:

> On Fri, Jul 20, 2007 at 03:59:12PM -0700, Andrew Morton wrote:
> > On Fri, 20 Jul 2007 15:50:47 -0700
> > Greg KH <[EMAIL PROTECTED]> wrote:
> >
> > > On Fri, Jul 20, 2007 at 06:32:21PM +0200, Michal Piotrowski wrote:
> > > > Hi Greg,
> > > >
> > > > This looks like a sysfs bug
> > > >
> > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/broken-out-2007-07-20-00-22/3.jpg
> > > >
> > > > l *kernel_param_sysfs_setup+0x75
> > > > 0xc13c0894 is in kernel_param_sysfs_setup (kernel/params.c:570).
> > > > 565 mk->mod = THIS_MODULE;
> > > > 566 kobj_set_kset_s(mk, module_subsys);
> > > > 567 kobject_set_name(&mk->kobj, name);
> > > > 568 kobject_init(&mk->kobj);
> > > > 569 ret = kobject_add(&mk->kobj);
> > > > 570 BUG_ON(ret < 0);
> > > > 571 param_sysfs_setup(mk, kparam, num_params, name_skip);
> > > > 572 kobject_uevent(&mk->kobj, KOBJ_ADD);
> > > > 573 }
> > > > 574
> > > >
> > > >
> > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/broken-out-2007-07-20-00-22/mm-config
> > >
> > > What kernel version is this happening on? The -mm tree? Can you try
> > > Linus's tree instead?
> > >
> > > It looks like there was some needed information right before the first
> > > stack dump, showing exactly what kobject was trying to be added that was
> > > already present. Odds are this is a kernel parameter with the same name
> > > as a duplicate one within the same module, but the trick is going to be
> > > trying to figure out what module is causing this.
> > >
> > > So it's not a sysfs bug, but rather a driver issue that this is
> > > catching.
> >
> > In that case a BUG was way too harsh treatment, and in fact directly
> > contributed to our inability to debug the bug!
> >
> > Can we wind that back a bit? Add some useful printks and then recover
> > in some fashion?
>
> Sure, I don't mind doing that at all.
>
> Hm, it looks like Randy added this back in September last year with:
> commit d8c7649e99e4b081b624aefe1e77caa30b53cb18
> Author: Randy Dunlap <[EMAIL PROTECTED]>
> Date: Fri Sep 29 01:58:55 2006 -0700
>
> [PATCH] kernel/params: driver layer error checking
>
> Check driver layer return values in kernel/params.c
>
> Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
>
> (wow, I love git and the signed-off-tree for things like this, it's
> trivial to find this information out.)
>
> So I'm guessing he was trying to catch something specific here.
>
> Randy, any objection to changing that BUG_ON to a printk warning instead
> telling the user exactly what needs to be fixed and that the system is
> now going to be unstable when any module is unloaded?

Of course not (no objection). I added a BUG_ON() ? Shame on me.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
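The direction agreed on in this thread (replace the BUG_ON() with a warning that names the offending object, then recover) can be sketched in userspace C. Everything below is hypothetical scaffolding rather than kernel code: `sketch_kobject_add()` merely imitates the "no name gives -EINVAL" behaviour discussed above, and `sketch_param_sysfs_setup()` returns the error to its caller, which is an assumption, since nobody in the thread says how the real function should recover.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for kobject_add(): an unnamed object is rejected. */
static int sketch_kobject_add(const char *name)
{
	if (name == NULL || name[0] == '\0')
		return -EINVAL;
	return 0;
}

/*
 * Instead of BUG_ON(ret < 0): say which object failed and with what error,
 * then hand the error back so the caller can skip this one module.
 */
static int sketch_param_sysfs_setup(const char *name)
{
	int ret = sketch_kobject_add(name);

	/* The stand-in follows the 0 / negative-errno convention. */
	assert(ret <= 0);
	if (ret < 0) {
		fprintf(stderr, "sketch: cannot add kobject '%s' (error %d)\n",
			name ? name : "(null)", ret);
		return ret;
	}
	return 0;
}
```

With a message like this, the empty module name that Satyam suspects would show up directly in the log instead of having to be inferred from stack offsets.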