Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)
Andrew Morton wrote: On Fri, 04 May 2007 10:18:12 +0400 Alex Tomas <[EMAIL PROTECTED]> wrote: Andrew Morton wrote: Yes, there can be issues with needing to allocate journal space within the context of a commit. But no-no, this isn't required. we only need to mark pages/blocks within transaction, otherwise race is possible when we allocate blocks in transaction, then transacton starts to commit, then we mark pages/blocks to be flushed before commit. I don't understand. Can you please describe the race in more detail? if I understood your idea right, then in data=ordered mode, commit thread writes all dirty mapped blocks before real commit. say, we have two threads: t1 is a thread doing flushing and t2 is a commit thread t1 t2 find dirty inode I find some dirty unallocated blocks journal_start() allocate blocks attach them to I journal_stop() going to commit find inode I dirty do NOT find these blocks because they're allocated only, but pages/bhs aren't mapped to them start commit map pages/bhs to just allocate blocks so, either we mark pages/bhs someway within journal_start()--journal_stop() or commit thread should do lookup for all dirty pages. the latter doesn't sound nice, IMHO. thanks, Alex - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc
On Thu, May 03, 2007 at 11:28:15PM -0700, Andrew Morton wrote: > > > The posix spec implies that negative `len' is permitted - presumably > > > "allocate > > > ahead of `offset'". How peculiar. > > > > I just checked the man page for posix_fallocate() and it says: > > > > EINVAL offset or len was less than zero. That describes the current glibc implementation. > > We should probably follow this lead. > > Yes, I think so. I'm suspecting that > http://www.opengroup.org/onlinepubs/009695399/functions/posix_fallocate.html > is just buggy. Or I can't read. > > I mean, if we're going to support negative `len' then is the byte at > `offset' inside or outside the segment? Head spins. > > However it would be neat if someone could test $OTHER_OS and, perhaps more > importantly, the present glibc emulation (which I assume your manpage is > referring to, so this would be a manpage test ;)). int posix_fallocate (int fd, __off_t offset, __off_t len) { struct stat64 st; struct statfs f; /* `off_t' is a signed type. Therefore we can determine whether OFFSET + LEN is too large if it is a negative value. */ if (offset < 0 || len < 0) return EINVAL; if (offset + len < 0) return EFBIG; /* First thing we have to make sure is that this is really a regular file. */ if (__fxstat64 (_STAT_VER, fd, &st) != 0) return EBADF; if (S_ISFIFO (st.st_mode)) return ESPIPE; if (! S_ISREG (st.st_mode)) return ENODEV; if (len == 0) { if (st.st_size < offset) { int ret = __ftruncate (fd, offset); if (ret != 0) ret = errno; return ret; } return 0; } ... is what glibc does ATM. Seems we violate the case where len == 0, as EINVAL in that case is "shall fail". But reading the standard to imply negative len is ok is too much guessing, there is no word what it means when len is negative and "required storage for regular file data starting at offset and continuing for len bytes" doesn't make sense for negative size. And given the general "Implementations may support additional errors not included in this list, may generate errors included in this list under circumstances other than those described here, or may contain extensions or limitations that prevent some errors from occurring." I believe returning EINVAL for len < 0 is not a POSIX violation. That doesn't mean the standard shouldn't be clarified, whether by saying EINVAL must be returned for non-positive len or saying that using negative len has undefined or implementation defined behavior. > The above opengroup page only permits S_ISREG. Preallocating directories > sounds quite useful to me, although it's something which would be pretty > hard to emulate if the FS doesn't support it. And there's a decent case to > be made for emulating it - run-anywhere reasons. Does glibc emulation support > directories? Quite unlikely. No, see above. Jakub - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
On Thu, May 03, 2007 at 06:38:21PM -0700, Paul Jackson wrote: > Adding Christoph Lameter <[EMAIL PROTECTED]> to the cc list, as he knows > more about hugetlb pages than I do. > This patch strikes me as a bit odd. > Granted, it's solving what could be a touchy problem with a fairly > simple solution, which is usually a Good Thing(tm). > However, the idea that different tasks would see different values for > the following fields in /proc/meminfo: > HugePages_Total: 0 > HugePages_Free: 0 > strikes me as odd, and risky. I would have thought that usually, all > tasks in the system should see the same values in the files in /proc > (as opposed to the files in particular task subdirectories /proc/.) > This patch strikes me as a bit of a hack, good for compatibility, but > hiding a booby trap that will bite some user code in the long run. > But I'm not enough of an expert to know what the right tradeoffs are > in this matter. The semantics of the global /proc/meminfo should not change; a separate per-cpuset reporting mechanism should really be used. -- wli - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Bluetooth: postpone hci_dev unregistration
Hi Jiri, > (I sent this a week ago but it seems to have got lost in other noise, > resending) > > From: Jiri Kosina <[EMAIL PROTECTED]> > > Bluetooth: postpone hci_dev unregistration > > Commit b40df57 substituted bh_lock_sock() in hci_sock_dev_event() for > lock_sock() when unregistering HCI device, in order to prevent deadlock > against locking in l2cap_connect_cfm() from softirq context. > > This however introduces another problem - hci_sock_dev_event() for > HCI_DEV_UNREG can also be triggered in atomic context, in which calling > lock_sock() is not safe as it could sleep. Reported by Jeremy Fitzhardinge > at http://lkml.org/lkml/2007/4/23/271 > > This patch moves the detaching of sockets from hci_device into workqueue, > so that lock_sock() can be used safely. This requires movement of > deallocation of hci_dev - deallocating device just after > hci_unregister_dev() would be too soon, as it could happen before the > workqueue has been run. I saw the report on LKML, but I am not really comfortable with this approach. It feels like an ugly hack. This needs more thinking and I think that simplifying the looking between HCI and L2CAP should be the goal. Regards Marcel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)
On Fri, 04 May 2007 10:18:12 +0400 Alex Tomas <[EMAIL PROTECTED]> wrote: > Andrew Morton wrote: > > Yes, there can be issues with needing to allocate journal space within the > > context of a commit. But > > no-no, this isn't required. we only need to mark pages/blocks within > transaction, otherwise race is possible when we allocate blocks in > transaction, > then transacton starts to commit, then we mark pages/blocks to be flushed > before commit. I don't understand. Can you please describe the race in more detail? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Thu, May 03, 2007 at 08:08:41PM -0700, Christoph Lameter wrote: > Performance tests show a slight improvements in netperf (not a > strong case for a performance improvement but removing the > constructor has definitely no negative impact so why keep > this around?). Cache effects are not so easily visible. Cache profile results from more realistic workloads (e.g. major macrobenchmarks) are more appropriate for gauging this. -- wli - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc
On Fri, 4 May 2007 16:07:31 +1000 David Chinner <[EMAIL PROTECTED]> wrote: > On Thu, May 03, 2007 at 09:29:55PM -0700, Andrew Morton wrote: > > On Thu, 26 Apr 2007 23:33:32 +0530 "Amit K. Arora" <[EMAIL PROTECTED]> > > wrote: > > > > > This patch implements the fallocate() system call and adds support for > > > i386, x86_64 and powerpc. > > > > > > ... > > > +{ > > > + struct file *file; > > > + struct inode *inode; > > > + long ret = -EINVAL; > > > + > > > + if (len == 0 || offset < 0) > > > + goto out; > > > > The posix spec implies that negative `len' is permitted - presumably > > "allocate > > ahead of `offset'". How peculiar. > > I just checked the man page for posix_fallocate() and it says: > > EINVAL offset or len was less than zero. > > We should probably follow this lead. Yes, I think so. I'm suspecting that http://www.opengroup.org/onlinepubs/009695399/functions/posix_fallocate.html is just buggy. Or I can't read. I mean, if we're going to support negative `len' then is the byte at `offset' inside or outside the segment? Head spins. However it would be neat if someone could test $OTHER_OS and, perhaps more importantly, the present glibc emulation (which I assume your manpage is referring to, so this would be a manpage test ;)). > > > + > > > + ret = -ENODEV; > > > + if (!S_ISREG(inode->i_mode)) > > > + goto out_fput; > > > > So we return ENODEV against an S_ISBLK fd, as per the posix spec. That > > seems a bit silly of them. > > H - I thought that the intention of sys_fallocate() was to > be generic enough to eventually allow preallocation on directories. > If that is the case, then this check will prevent that The above opengroup page only permits S_ISREG. Preallocating directories sounds quite useful to me, although it's something which would be pretty hard to emulate if the FS doesn't support it. And there's a decent case to be made for emulating it - run-anywhere reasons. Does glibc emulation support directories? Quite unlikely. But yes, sounds like a desirable thing. Would XFS support it easily if the above check was relaxed? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Rewrite the MAJOR() macro as a call to imajor().
On Sat, 28 Apr 2007 06:23:54 -0400 (EDT) "Robert P. J. Day" <[EMAIL PROTECTED]> wrote: > Replace the MAJOR() macro invocation with a call to the inline > imajor() routine. > > Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]> > > --- > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index 6b5b642..08da15b 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -710,7 +710,7 @@ static inline int is_loop_device(struct file *file) > { > struct inode *i = file->f_mapping->host; > > - return i && S_ISBLK(i->i_mode) && MAJOR(i->i_rdev) == LOOP_MAJOR; > + return i && S_ISBLK(i->i_mode) && imajor(i) == LOOP_MAJOR; > } there's no runtime change, and I count a couple hundred MAJORs in the tree. I don't want to receive 200 one-line patches please. If you're going to do this then please do decent-sized per-subsystem patches and see if you can persuade the subsystem maintainers to take them directly. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)
Andrew Morton wrote: Yes, there can be issues with needing to allocate journal space within the context of a commit. But no-no, this isn't required. we only need to mark pages/blocks within transaction, otherwise race is possible when we allocate blocks in transaction, then transacton starts to commit, then we mark pages/blocks to be flushed before commit. a) If the page has newly allocated space on disk then the metadata which refers to that page is already in the journal: no new journal space needed. b) If the page doesn't have space allocated on disk then we don't need to write it out at ordered-mode commit time, because the post-recovery filesystem will not have any references to that page. c) If the page is dirty due to overwrite then no metadata update was required. IOW, under what circumstances would an ordered-mode commit need to allocate space for a delayed-allocate page? no need to allocate space within commit thread, I think. only to take care of the race I described above. in hackish version of data=ordered for delayed allocation I used counter of submitted bio's with newly-allocated blocks and commit thread waits for the counter to reach 0. However b) might lead to the hey-my-file-is-full-of-zeroes problem. thanks, Alex - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm Patch]nbd: check the return value of sysfs_create_file
On Sat, 28 Apr 2007 13:30:23 +0800 WANG Cong <[EMAIL PROTECTED]> wrote: > Since 'sysfs_create_file' is declared with attribute warn_unused_result, we > must always check its return value carefully. > Well that's not really the reason for your patch. warn_unused_result is there to tell us that there are deeper problems in the code which need addressing: the failure to check the sysfs_create_file() return value means that bugs in the kernel can remain undetected, or can be harder to find. > > --- > > --- linux-2.6.21-rc7-mm2/drivers/block/nbd.c.orig 2007-04-27 > 17:27:47.0 +0800 > +++ linux-2.6.21-rc7-mm2/drivers/block/nbd.c 2007-04-27 17:47:32.0 > +0800 > @@ -373,7 +373,10 @@ static void nbd_do_it(struct nbd_device > BUG_ON(lo->magic != LO_MAGIC); > > lo->pid = current->pid; > - sysfs_create_file(&lo->disk->kobj, &pid_attr.attr); > + if (sysfs_create_file(&lo->disk->kobj, &pid_attr.attr)) { > + printk(KERN_ERR "nbd: sysfs_create_file failed!"); > + return; > + } > > while ((req = nbd_read_stat(lo)) != NULL) > nbd_end_request(req); It would better saner to propagate this error back through callers: --- a/drivers/block/nbd.c~nbd-check-the-return-value-of-sysfs_create_file-fix +++ a/drivers/block/nbd.c @@ -366,23 +366,25 @@ static struct disk_attribute pid_attr = .show = pid_show, }; -static void nbd_do_it(struct nbd_device *lo) +static int nbd_do_it(struct nbd_device *lo) { struct request *req; + int ret; BUG_ON(lo->magic != LO_MAGIC); lo->pid = current->pid; - if (sysfs_create_file(&lo->disk->kobj, &pid_attr.attr)) { + ret = sysfs_create_file(&lo->disk->kobj, &pid_attr.attr); + if (ret) { printk(KERN_ERR "nbd: sysfs_create_file failed!"); - return; + return ret; } while ((req = nbd_read_stat(lo)) != NULL) nbd_end_request(req); sysfs_remove_file(&lo->disk->kobj, &pid_attr.attr); - return; + return 0; } static void nbd_clear_que(struct nbd_device *lo) @@ -572,7 +574,9 @@ static int nbd_ioctl(struct inode *inode case NBD_DO_IT: if (!lo->file) return -EINVAL; - nbd_do_it(lo); + error = nbd_do_it(lo); + if (error) + return error; /* on return tidy up in case we have a signal */ /* Forcibly shutdown the socket causing all listeners * to error _ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
On 5/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote: Note, Ken, that if we did that, the calculation of these new Total and Free stats would be a little different than your new code. Instead of looping over the memory nodes in the current tasks mems_allowed mask, we would loop over the memory nodes allowed in the cpuset being queried (the cpuset whose 'hugepages_total' or 'hugepages_free' special file we were reading, not the current tasks cpuset.) This is even more controversial and messy. akpm already dropped the patch and expressed that he doesn't like it. And I won't go down another messy path. I will let this idea RIP. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] revoke: change revoke_table to fileset and revoke_details
On Thu, 3 May 2007, Andrew Morton wrote: > Well that's the "locking" protocol then: each instance of this structure is > only ever touched by a single thread, yes? Yes. Each do_revoke() call creates a new instance. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
David wrote: > This information is already exported to userspace through sysfs. Simply > grab the N-mems allowed to your task from /proc/pid/status, cat > /sys/devices/system/node/nodeN/meminfo for each N, and add. Good point. I don't see how this present patch, to change /proc/meminfo, can be justified, given this. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Correct location for ADC/DAC drivers
On Wednesday 02 May 2007 21:11, Russell King wrote: > > > Is there a maintainer for this "drivers/mfd" directory? > > > > rmk > > I wouldn't go that far. There's no real infrastructure there > to maintain, so I'd actually say that the directory was > maintainerless. However, I'll own up to the UCB/MCP drivers > in there. So perhaps you could answer is you feel that these ADC & DAC chrdev device drivers would fit into this drivers/mfd directory, or are better suited for the drivers/char directory? Thanks. Best regards, Stefan - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc
On Thu, May 03, 2007 at 09:29:55PM -0700, Andrew Morton wrote: > On Thu, 26 Apr 2007 23:33:32 +0530 "Amit K. Arora" <[EMAIL PROTECTED]> wrote: > > > This patch implements the fallocate() system call and adds support for > > i386, x86_64 and powerpc. > > > > ... > > +{ > > + struct file *file; > > + struct inode *inode; > > + long ret = -EINVAL; > > + > > + if (len == 0 || offset < 0) > > + goto out; > > The posix spec implies that negative `len' is permitted - presumably "allocate > ahead of `offset'". How peculiar. I just checked the man page for posix_fallocate() and it says: EINVAL offset or len was less than zero. We should probably follow this lead. > > + > > + ret = -ENODEV; > > + if (!S_ISREG(inode->i_mode)) > > + goto out_fput; > > So we return ENODEV against an S_ISBLK fd, as per the posix spec. That > seems a bit silly of them. H - I thought that the intention of sys_fallocate() was to be generic enough to eventually allow preallocation on directories. If that is the case, then this check will prevent that Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] New firewire stack
On Thu, 03 May 2007, Kristian Høgsberg wrote: > Adrian Bunk wrote: > >> | An advantage of changing the names is that they are now prefixed. > >> > >> Is the opportunity to clean up module names compelling enough, vs. (the > >> wish for) minimized trouble with scripts which refer to module names? > >> ... > > > > How big is the trouble actually? > > Exactly. In Fedora we've just added a fw-sbp2 case to mkinitrd, it's only a > couple of lines of extra shell code: > > elif [ "$modName" = "fw-sbp2" ]; then > findmodule fw-core > findmodule fw-ohci > modName="fw-sbp2" > > and that's the extent of the changes. The sbp2 case for the old drivers is > still in there and in the end mkinitrd works with either stack. > > Kristian I also think both stacks should be provided in the mainline kernel, preferably in their own separate directories. I still need the old stack for dv1394, which isn't available in the new stack. But if the new stack is also there, I might be motivated for example to try out the new sbp2 module, to see how well it works and how it compares in performance to the old sbp2 module. If it's not there, I'm probably not going to go out of my way to download it from the net, since my existing setup is working just fine for me. -Bill - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/2] natsemi: Improve DspCfg workaround
> The natsemi driver contains a workaround for broken hardware which can > on some boards cause more problems than it solves. The following patch > series improves this by making the diagnostic more obvious and allowing > users to disable the workaround if it causes them problems. Works great. Thank You all for help. Thanks Rafał -- NIE KUPUJ!!! ...zanim nie porownasz cen >> http://link.interia.pl/f1a5e - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
Andrew wrote: > If it's per-cpuset information then shouldn't it be presented in > /dev/cpuset/something? Yeah - if huge pages were mainline future, rather than the more controversial sideline they are now, then it would make more sense to put in these stats in each cpuset. Note, Ken, that if we did that, the calculation of these new Total and Free stats would be a little different than your new code. Instead of looping over the memory nodes in the current tasks mems_allowed mask, we would loop over the memory nodes allowed in the cpuset being queried (the cpuset whose 'hugepages_total' or 'hugepages_free' special file we were reading, not the current tasks cpuset.) But I'm reluctant to entertain such cpuset additions until I see more of where my colleague Christoph is going in related work. Clearly as can be seen on one of his posts on the parallel lkml thread: Re: + pretend-cpuset-has-some-form-of-hugetlb-page-reservation.patch added to -mm tree earlier today, Christoph is no great fan of the current implementation of huge pages. And clearly as memory continues to get bigger, we will be putting more stress on these page size related mechanisms. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Detecting process death for anycast named process monitoring
On Wed, May 02, 2007 at 06:12:27PM -0500, David M. Lloyd wrote: > On Wed, 2007-05-02 at 16:30 -0600, Chris Friesen wrote: > > Glen Turner wrote: > > > > > The question is, how can a process with no relationship to another > > > process detect that process unexpectedly dying? If named goes > > > away to a better place, we want to shut down the interface > > > which causes Quagga to inject the anycast route. > > > We did something similar where arbitrary processes can register to be > > sent an arbitrary signal when the state of other processes change. > > What about something like inotify, but for processes? That would be > cool... Or maybe just ignoring the SIGHUP before exec'ing the named process as a child. -- Russell King Linux kernel2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/5] Power Management: use mutexes instead of semaphores
On Fri, 27 Apr 2007 10:43:22 +0200 Matthias Kaehlcke <[EMAIL PROTECTED]> wrote: > the Power Management code uses semaphores as mutexes. use the mutex > API instead of the (binary) semaphores I know it's a little thing, but given a choice between a) changelogs which use capital letters and fullstops and b) changelogs which do not, I think a) gives a better result. I note that none of these patches added a #include . Each C file which uses mutexes should do that, rather than relying upon accidental nested includes. I hope you're checking for that. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] change global zonelist order v4 [0/2]
On Fri, 27 Apr 2007 14:45:30 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > Hi, this is version 4. including Lee Schermerhon's good rework. > and automatic configuration at boot time. hm, this adds rather a lot of code. Have we established that it's worth it? And it's complex - how do poor users know what to do with this new control? This: + * = "[dD]efault | "0" - default, automatic configuration. + * = "[nN]ode"|"1" - order by node locality, + * then zone within node. + * = "[zZ]one"|"2" - order by zone, then by locality within zone seems a bit excessive. I think just the 0/1/2 plus documentation would suffice? I haven't followed this discussion very closely I'm afraid. If we came up with a good reason why Linux needs this feature then could someone please (re)describe it? Thanks. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: kernel/relay.c: a strange usage of delayed_work
On Fri, 2007-05-04 at 01:38 +0400, Oleg Nesterov wrote: > relay_switch_subbuf() does schedule_delayed_work(&buf->wake_readers, 1), > wakeup_readers() only does wake_up_interruptible() and nothing more. > > Why can't we use a plain timer for this? > > In any case, this "wake_up ->read_wait after a minimal possible delay" > looks somewhat strange to me, could you explain? just curious. > The reason it's done that way is that if the event that causes the relay_switch_subbuf() happens to be an event logged from schedule(), and we directly call wake_up_interruptible() at that point, we lock up the machine because it ends up back in schedule(). Deferring it avoids the problem. I don't see any problem with using a plain timer instead - I'll work up a patch to make that change. Tom - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2/6] add config option to vmalloc stacks (was: Re: [-mm patch] i386: enable 4k stacks by default)
On Mon, Apr 30, 2007 at 10:43:10AM -0700, William Lee Irwin III wrote: > + Allocates the stack physically discontiguously and from high > + memory. Furthermore an unmapped guard page follows the stack. > + This is not for end-users. It's intended to trigger fatal > + system errors under various forms of stack abuse. Why is this not for end-users? Will it not trigger anything useful unless set up properly, or is a big performace hit -- and how, or what? All the kernel debug options are underdocumented this way -- I'd like to have as many of them on as I can without absolutely killing performance, (or rather, *you* would) -- but I can never tell without grovelling all over for the info, which... well, I haven't done it yet, anyway. "End-user" is just insufficently defined for anyone compiling their own kernel. Could you add a bit more text here describing what the effect of physically discontiguous high-memory stacks is? An additional frobnitz dereference on every badda-bing badda-bang, likely to double the time it takes to dance the hokey pokey? *shrug* Some of those debug options probably don't get set very often on kernels that are run for more than to see if it boots. -- Joseph Fannin [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
On Thu, 3 May 2007, Paul Jackson wrote: > 2) adding two new values, by such names as: > > Current_Cpuset_HugePages_Total:0 > Current_Cpuset_HugePages_Free: 0 > This information is already exported to userspace through sysfs. Simply grab the N-mems allowed to your task from /proc/pid/status, cat /sys/devices/system/node/nodeN/meminfo for each N, and add. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Routing 600+ vlan's via linux problems (looks like arp problems)
On Fri, May 04, 2007 at 05:48:18AM +0200, Øyvind Vågen Jægtnes wrote: > Hi again :) > > On 5/4/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: > >On Thu, May 03, 2007 at 11:12:09PM +0200, Øyvind Vågen Jægtnes wrote: > >> On 5/3/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote: > >> > > >> >On May 3 2007 22:53, Willy Tarreau wrote: > >> >>> For the rest all we see in the arp cache is (incomplete) > >> >> > >> >>I suspect that your arp cache is full (128 entries by default). > >> >>Check /proc/sys/net/ipv4/neigh/gc_thresh1 (128 for me). You can > >> >>set it as high as gc_thresh2 (512 for me), and I don't know what > >> >>happens above. > >> > > >> >Above, you will perhaps need the not-so-elegant userspace arpd :-/ > >> > >> Yes, i was suspecting that the arp cache got full, but i will try > >> increasing it :) > >> Would there be any huge bugs if i change these lines in arp.c: > >> > >>.gc_thresh1 = 128, > >>.gc_thresh2 = 512, > >> > >> to > >> > >>.gc_thresh1 = 700, > >>.gc_thresh2 = 700, > >> > >> under the definition for struct arp_tbl? > > > >I don't think it could cause a problem, but network people will surely > >correct me if I'm wrong. > > System is up and running perfectly now, it is routing everything at > about 200 mbps now with only 5% load avg with the above changes to > arp.c > > So the real question now is, why is this number so low by default? > It would probably be much better if this could be handled dynamically > in the kernel. I remember I read an argument against this a long time ago, but I don't remember where. I think it was some arbitrary decision that people using more than X ARP entries will need arpd. Most probably the code path in the ARP updates is/was not much optimized to handle large number of entries. Think about cable operators who may have 10-2 entries ! > Its a Juniper M7i > It comes default with a 5400 rpm laptop 2.5" harddrive but now we > bought a more robust "server" 2.5" harddrive. The "server" ones are not necessarily more robust, often they are faster. > It still barfs on the OS > install, so the linux is doing all the job now. Will get a juniper guy > to come and fix :) > > As a side note, i'm starting to wonder if it was worth the $20k when i > could just have a linux machine to do the job with a clone for backup > ;) That's often how linux penetrates the enterprise ;-) Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc
Andrew Morton writes: > On Thu, 26 Apr 2007 23:33:32 +0530 "Amit K. Arora" <[EMAIL PROTECTED]> wrote: > > > This patch implements the fallocate() system call and adds support for > > i386, x86_64 and powerpc. > > > > ... > > > > +asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) > > Please add a comment over this function which specifies its behaviour. > Really it should be enough material from which a full manpage can be > written. This looks like it will have the same problem on s390 as sys_sync_file_range. Maybe the prototype should be: asmlinkage long sys_fallocate(loff_t offset, loff_t len, int fd, int mode) Paul. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-dvb] Re: DST/BT878 module customization (.. was: Critical points about ...)
On Thu, 3 May 2007, Mauro Carvalho Chehab wrote: > Em Qua, 2007-05-02 ??s 04:10 -0700, Trent Piepho escreveu: > > I promise, this time it's right! > > http://linuxtv.org/hg/~tap/dst-new > > Confirmed. Now the patch is properly working. My tests were done with a > board with DST. Those are the results: > > 1) when DST is unselected, on a board with DST, it will print the errors > indicating that the Kconfig items were not selected: > > DVB: registering new adapter (bttv0). > DVB: Unable to find symbol dst_attach() > frontend_init: Could not find a Twinhan DST. > dvb-bt8xx: A frontend driver was not found for device 109e/0878 subsystem > fbfb/f800 > > The only issue is the wrong printk msg, stating that a "frontend driver" > were not found. As this issue also happens with the current driver, due > the usage of dvb_attach() macro, I don't see any regressions. > > It would be nice, however, to have a patch making dvb_attach more > generic, by e.g. having a variant that allows passing another message. Only this message is from dvb_attach(): > DVB: Unable to find symbol dst_attach() Is it saying that it cannot load the module that dst_attach() is in (it doesn't know what module that is, modprobe knows that). If you enabled dst support and deleted the module, it would be the same. If you turn off dvb_attach() and also disable dst, you should instead get this message: dst_attach: driver disabled by Kconfig Maybe that would look nicer with a "DVB: " prefix? That would easier if it wasn't necessary to update the printk in each boilerplate stub function. What if one macro created these stubs > frontend_init: Could not find a Twinhan DST. > dvb-bt8xx: A frontend driver was not found for device 109e/0878 subsystem > fbfb/f800 These two messages are printed by the dvb-bt8xx driver, not by dvb_attach(). It would be trivial to change of course, but I'm not sure what would be pedantically correct for both dst and non-dst based hardware. > There's an argument against the prototype changes on dst_attach and > dst_ca_attach since they aren't frontend. The reason I changed that, is the dst_attach() already did return a dvb_frontend pointer, it was just inside an enclosing structure. i.e. what existed before: { struct dst_state *state; state = dst_attach(...); card->fe = &state->frontend; } /* state goes out of scope */ The frontend is inside the state struct and the state pointer isn't saved anywhere. dvb-bt8xx just saves a frontend pointer from inside the dst state and tosses the state pointer away. So I changed that to: card->fe = dst_attach(...); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
Ken wrote: > If this is odd, do you have any suggestions for alternative? No, I don't. Sorry. It's a touchy problem, and I'm not enough of an expert to know what the right tradeoffs are in this matter. I agree with your point that if you realize what's going on, namely that what cpuset the task reading meminfo is in affects the HugePages values that are read, then one can use the interface easily enough. ... how about: 1) don't change the existing HugePages_* values - keep them system-wide, and 2) adding two new values, by such names as: Current_Cpuset_HugePages_Total:0 Current_Cpuset_HugePages_Free: 0 That's certainly an uglier proposal than yours ;). But at least it seems clearer, and doesn't make incompatible changes to what's there. It does require user level code change to actually benefit from the new values, whereas your patch sort of sneaks them in, on the assumption that the majority of reads of these values would really prefer getting the cpuset relative totals instead. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
On Thu, 3 May 2007 21:49:12 -0700 "Ken Chen" <[EMAIL PROTECTED]> wrote: > On 5/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > > Adding Christoph Lameter <[EMAIL PROTECTED]> to the cc list, as he knows > > more about hugetlb pages than I do. > > > > This patch strikes me as a bit odd. > > > > Granted, it's solving what could be a touchy problem with a fairly > > simple solution, which is usually a Good Thing(tm). > > > > However, the idea that different tasks would see different values for > > the following fields in /proc/meminfo: > > > > HugePages_Total: 0 > > HugePages_Free: 0 > > > > strikes me as odd, and risky. I would have thought that usually, all > > tasks in the system should see the same values in the files in /proc > > (as opposed to the files in particular task subdirectories /proc/.) > > > > This patch strikes me as a bit of a hack, good for compatibility, but > > hiding a booby trap that will bite some user code in the long run. > > > > But I'm not enough of an expert to know what the right tradeoffs are > > in this matter. > > Would annotating the Hugepages_* field with name of cpuset help? There are existing programs which parse /proc/meminfo. If we're going to do any of this then it would need to be via new fields. I don't think we should be altering the meaning of the HugePages fields like this. One can imagine scenarios in which such a change would cause existing userspace scripts to fail. Plus it's Just Weird to use /proc/meminfo in this manner. > I > orginally thought that since cpuset's mems are hirearchical in memory > assignment, it is fairly straightforward to understand what's going > on: parent cpuset stats include its and all of its children. For > example, if root cpuset has two sub job1 and job2 cpusets, each has 20 > and 30 htlb pages, when query at each level, we have: > > [EMAIL PROTECTED] echo $$ > /dev/cpuset/tasks > [EMAIL PROTECTED] grep HugePages_Total /proc/meminfo > HugePages_Total:50 > > [EMAIL PROTECTED] echo $$ > /dev/cpuset/job1/tasks > [EMAIL PROTECTED] grep HugePages_Total /proc/meminfo > HugePages_Total:20 > > [EMAIL PROTECTED] echo $$ > /dev/cpuset/job2/tasks > [EMAIL PROTECTED] grep HugePages_Total /proc/meminfo > HugePages_Total:30 > > If this is odd, do you have any suggestions for alternative? If it's per-cpuset information then shouldn't it be presented in /dev/cpuset/something? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: console font limits
On Thu, 2007-05-03 at 23:58 -0400, Daniel Hazelton wrote: > On Thursday 03 May 2007 20:39:05 H. Peter Anvin wrote: > > Kyle Moffett wrote: > I guess I could start on that work again - shouldn't take me all that long to > recover the stuff I lost when a blackout caused my hard drive to get > corrupted beyond recovery (and the automated journal replay didn't do a > damned thing - I think it actually *added* to the corruption, but I don't > think any filesystem would have survived that) You might want to look at the modesetting-101 branch of DRM. It's goal is similar to yours. They even have a drm framebuffer. I don't know how far they are with their goal, but I can see some progress. Here's their git tree: git://git.freedesktop.org/git/mesa/drm#modesetting-101 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RHEL 3
On Fri, 2007-05-04 at 12:27 +0800, Majumder, Rajib wrote: > Hi, you're offtopic and are better off asking on a RH list > > I am wondering if RHEL 3 (based on 2.4.21 kernel but RH claims they > backported lot of 2.6 kernel's feature into it) supports Multi-Core and > Hyperthreaded CPUs. it'll boot. it'll not work well. > > Is the CPU-scheduler multi-core/ no > hyperthreading aware? yes > Is it aware ccNUMA no - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc
On Thu, 3 May 2007 21:29:55 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote: > > + ret = -EFBIG; > > + if (offset + len > inode->i_sb->s_maxbytes) > > + goto out_fput; > > This code does handle offset+len going negative, but only by accident, I > suspect. But it doesn't handle offset+len wrapping through zero. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/8] Universal power supply class (was: battery class)
On 5/3/07, Anton Vorontsov <[EMAIL PROTECTED]> wrote: This class is result of "external power" and "battery" classes merge, as suggested by David Woodhouse. He also implemented uevent support. Looks great. In particular, the policies you've chosen for the attributes and units are very reasonable. I'll gladly accept patches moving tp_smapi to this interface (or eventually do it myself when I have time). A few minor points: +#define POWER_SUPPLY_TECHNOLOGY_UNKNOWN 0 +#define POWER_SUPPLY_TECHNOLOGY_NIMH1 +#define POWER_SUPPLY_TECHNOLOGY_LION2 +#define POWER_SUPPLY_TECHNOLOGY_LIPO3 Might as well add NiCd (common in UPS). +#define POWER_SUPPLY_CAPACITY_LEVEL_UNKNOWN 0 +#define POWER_SUPPLY_CAPACITY_LEVEL_CRITICAL 1 +#define POWER_SUPPLY_CAPACITY_LEVEL_LOW 2 +#define POWER_SUPPLY_CAPACITY_LEVEL_NORMAL 3 +#define POWER_SUPPLY_CAPACITY_LEVEL_HIGH 4 +#define POWER_SUPPLY_CAPACITY_LEVEL_FULL 5 Should this be synthesized by the driver if the hardware gives only quantitative values? If so, maybe provide some guidelines. +enum power_supply_type { + POWER_SUPPLY_TYPE_BATTERY = 0, + POWER_SUPPLY_TYPE_UPS, + POWER_SUPPLY_TYPE_AC, + POWER_SUPPLY_TYPE_USB, +}; How about dumb (non-USB) DC power? Any reason to distinguish it from AC? Shem - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
On 5/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote: Adding Christoph Lameter <[EMAIL PROTECTED]> to the cc list, as he knows more about hugetlb pages than I do. This patch strikes me as a bit odd. Granted, it's solving what could be a touchy problem with a fairly simple solution, which is usually a Good Thing(tm). However, the idea that different tasks would see different values for the following fields in /proc/meminfo: HugePages_Total: 0 HugePages_Free: 0 strikes me as odd, and risky. I would have thought that usually, all tasks in the system should see the same values in the files in /proc (as opposed to the files in particular task subdirectories /proc/.) This patch strikes me as a bit of a hack, good for compatibility, but hiding a booby trap that will bite some user code in the long run. But I'm not enough of an expert to know what the right tradeoffs are in this matter. Would annotating the Hugepages_* field with name of cpuset help? I orginally thought that since cpuset's mems are hirearchical in memory assignment, it is fairly straightforward to understand what's going on: parent cpuset stats include its and all of its children. For example, if root cpuset has two sub job1 and job2 cpusets, each has 20 and 30 htlb pages, when query at each level, we have: [EMAIL PROTECTED] echo $$ > /dev/cpuset/tasks [EMAIL PROTECTED] grep HugePages_Total /proc/meminfo HugePages_Total:50 [EMAIL PROTECTED] echo $$ > /dev/cpuset/job1/tasks [EMAIL PROTECTED] grep HugePages_Total /proc/meminfo HugePages_Total:20 [EMAIL PROTECTED] echo $$ > /dev/cpuset/job2/tasks [EMAIL PROTECTED] grep HugePages_Total /proc/meminfo HugePages_Total:30 If this is odd, do you have any suggestions for alternative? - Ken - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Thu, 3 May 2007 20:34:48 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Thu, 3 May 2007, Andrew Morton wrote: > > > On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL > > PROTECTED]> wrote: > > > > > Performance tests show a slight improvements in netperf (not a > > > strong case for a performance improvement but removing the > > > constructor has definitely no negative impact so why keep > > > this around?). > > > > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost > > > (127.0.0.1) port 0 AF_INET > > > Recv SendSend > > > Socket Socket Message Elapsed > > > Size SizeSize Time Throughput > > > bytes bytes bytessecs.10^6bits/sec > > > > > > Before: > > > 87380 16384 1638410.016026.04 > > > 87380 16384 1638410.015992.17 > > > 87380 16384 1638410.016071.23 > > > > > > After: > > > 87380 16384 1638410.016090.20 > > > 87380 16384 1638410.016078.3 > > > 87380 16384 1638410.006013.52 > > > > How could a filesystem change affect networking performance? > > > > The change looks nice, but I'd microbenchmark it with a > > write-to-ext2-on-ramdisk > > or something like that. > > H.. I was told in another thread that this is the most frequently used > slab for this benchmark That would be hair-raising ;) I suspect confusion with sk_buff. buffer_heads do get used quite a bit though. A good microbenchmark would be to sit in a tight loop extending and truncating an ext2 file - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Thu, 3 May 2007, Andrew Morton wrote: > The change looks nice, but I'd microbenchmark it with a > write-to-ext2-on-ramdisk > or something like that. Hmmm... How does one benchmark buffer head performance? Guess just by copying files? Not sure if the following will cut it. Two tests. First copying 8M of small files into a 16M ramdisk: for i in 1 2 3 4 5 6 7 8 9; do mke2fs /dev/ram0 >/dev/null mount /dev/ram0 /media >/dev/null time cp -a /etc /media umount /dev/ram0 done; No constructor real0m0.104s user0m0.016s sys 0m0.056s real0m0.090s user0m0.008s sys 0m0.056s real0m0.089s user0m0.016s sys 0m0.048s real0m0.097s user0m0.004s sys 0m0.064s real0m0.091s user0m0.008s sys 0m0.052s real0m0.091s user0m0.004s sys 0m0.060s real0m0.098s user0m0.008s sys 0m0.060s real0m0.091s user0m0.000s sys 0m0.064s real0m0.090s user0m0.012s sys 0m0.052s W/constructor real0m0.099s user0m0.004s sys 0m0.100s real0m0.098s user0m0.008s sys 0m0.096s real0m0.091s user0m0.016s sys 0m0.080s real0m0.091s user0m0.012s sys 0m0.084s real0m0.090s user0m0.012s sys 0m0.080s real0m0.090s user0m0.020s sys 0m0.076s real0m1.269s user0m0.012s sys 0m0.084s real0m0.095s user0m0.016s sys 0m0.084s real0m0.096s user0m0.020s sys 0m0.084s The no constructor numbers are generally lower. Lowest is no constructor with 0.089. Second. Copy vmlinux (52M) to 128M ramdisk: for i in 1 2 3 4 5 6 7 8 9; do mke2fs /dev/ram0 >/dev/null mount /dev/ram0 /media >/dev/null time cp slub/vmlinux /media umount /dev/ram0 done; No constructor: real0m2.095s user0m0.000s sys 0m0.168s real0m0.187s user0m0.008s sys 0m0.124s real0m0.186s user0m0.008s sys 0m0.120s real0m0.195s user0m0.008s sys 0m0.128s real0m0.177s user0m0.004s sys 0m0.120s real0m0.182s user0m0.004s sys 0m0.120s real0m0.186s user0m0.008s sys 0m0.120s real0m0.190s user0m0.004s sys 0m0.128s real0m0.174s user0m0.004s sys 0m0.116s Constructor real0m0.183s user0m0.004s sys 0m0.188s real0m0.183s user0m0.004s sys 0m0.192s real0m0.177s user0m0.012s sys 0m0.176s real0m0.186s user0m0.004s sys 0m0.192s real0m0.187s user0m0.008s sys 0m0.188s real0m0.184s user0m0.004s sys 0m0.192s real0m0.177s user0m0.012s sys 0m0.176s real0m0.183s user0m0.004s sys 0m0.192s real0m0.182s user0m0.004s sys 0m0.188s Same here. Low is 0.174 no constructor. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 5/5] ext4: write support for preallocated blocks/extents
On Thu, 26 Apr 2007 23:46:23 +0530 "Amit K. Arora" <[EMAIL PROTECTED]> wrote: > This patch adds write support for preallocated (using fallocate system > call) blocks/extents. The preallocated extents in ext4 are marked > "uninitialized", hence they need special handling especially while > writing to them. This patch takes care of that. > > ... > > /* > + * ext4_ext_try_to_merge: > + * tries to merge the "ex" extent to the next extent in the tree. > + * It always tries to merge towards right. If you want to merge towards > + * left, pass "ex - 1" as argument instead of "ex". > + * Returns 0 if the extents (ex and ex+1) were _not_ merged and returns > + * 1 if they got merged. OK. > + */ > +int ext4_ext_try_to_merge(struct inode *inode, > + struct ext4_ext_path *path, > + struct ext4_extent *ex) > +{ > + struct ext4_extent_header *eh; > + unsigned int depth, len; > + int merge_done=0, uninitialized = 0; space around "=", please. Many people prefer not to do the multiple-definitions-per-line, btw: int merge_done = 0; int uninitialized = 0; reasons: - If gives you some space for a nice comment - It makes patches much more readable, and it makes rejects easier to fix - standardisation. > + depth = ext_depth(inode); > + BUG_ON(path[depth].p_hdr == NULL); > + eh = path[depth].p_hdr; > + > + while (ex < EXT_LAST_EXTENT(eh)) { > + if (!ext4_can_extents_be_merged(inode, ex, ex + 1)) > + break; > + /* merge with next extent! */ > + if (ext4_ext_is_uninitialized(ex)) > + uninitialized = 1; > + ex->ee_len = cpu_to_le16(ext4_ext_get_actual_len(ex) > + + ext4_ext_get_actual_len(ex + 1)); > + if (uninitialized) > + ext4_ext_mark_uninitialized(ex); > + > + if (ex + 1 < EXT_LAST_EXTENT(eh)) { > + len = (EXT_LAST_EXTENT(eh) - ex - 1) > + * sizeof(struct ext4_extent); > + memmove(ex + 1, ex + 2, len); > + } > + eh->eh_entries = cpu_to_le16(le16_to_cpu(eh->eh_entries)-1); Kenrel convention is to put spaces around "-" > + merge_done = 1; > + BUG_ON(eh->eh_entries == 0); eek, scary BUG_ON. Do we really need to be that severe? Would it be better to warn and run ext4_error() here? > + } > + > + return merge_done; > +} > + > + > > ... > > +/* > + * ext4_ext_convert_to_initialized: > + * this function is called by ext4_ext_get_blocks() if someone tries to write > + * to an uninitialized extent. It may result in splitting the uninitialized > + * extent into multiple extents (upto three). Atleast one initialized extent > + * and atmost two uninitialized extents can result. There are some typos here > + * There are three possibilities: > + * a> No split required: Entire extent should be initialized. > + * b> Split into two extents: Only one end of the extent is being written > to. > + * c> Split into three extents: Somone is writing in middle of the extent. and here > + */ > +int ext4_ext_convert_to_initialized(handle_t *handle, struct inode *inode, > + struct ext4_ext_path *path, > + ext4_fsblk_t iblock, > + unsigned long max_blocks) > +{ > + struct ext4_extent *ex, *ex1 = NULL, *ex2 = NULL, *ex3 = NULL, newex; > + struct ext4_extent_header *eh; > + unsigned int allocated, ee_block, ee_len, depth; > + ext4_fsblk_t newblock; > + int err = 0, ret = 0; > + > + depth = ext_depth(inode); > + eh = path[depth].p_hdr; > + ex = path[depth].p_ext; > + ee_block = le32_to_cpu(ex->ee_block); > + ee_len = ext4_ext_get_actual_len(ex); > + allocated = ee_len - (iblock - ee_block); > + newblock = iblock - ee_block + ext_pblock(ex); > + ex2 = ex; > + > + /* ex1: ee_block to iblock - 1 : uninitialized */ > + if (iblock > ee_block) { > + ex1 = ex; > + ex1->ee_len = cpu_to_le16(iblock - ee_block); > + ext4_ext_mark_uninitialized(ex1); > + ex2 = &newex; > + } > + /* for sanity, update the length of the ex2 extent before > + * we insert ex3, if ex1 is NULL. This is to avoid temporary > + * overlap of blocks. > + */ > + if (!ex1 && allocated > max_blocks) > + ex2->ee_len = cpu_to_le16(max_blocks); > + /* ex3: to ee_block + ee_len : uninitialised */ > + if (allocated > max_blocks) { > + unsigned int newdepth; > + ex3 = &newex; > + ex3->ee_block = cpu_to_le32(iblock + max_blocks); > + ext4_ext_store_pblock(ex3, newblock + max_blocks); > + ex3->ee_len = cpu_to_le16(allocated - max_blocks); > + ext4_ext_mark_uni
Re: [RFC] [PATCH] DRM TTM Memory Manager patch
On Thu, 2007-05-03 at 01:01 +0200, Thomas Hellström wrote: > It might be possible to find schemes that work around this. One way > could possibly be to have a buffer mapping -and validate order for > shared buffers. If mapping never blocks on anything other than the fence, then there isn't any dead lock possibility. What this says is that ordering of rendering between clients is *not DRMs problem*. I think that's a good solution though; I want to let multiple apps work on DRM-able memory with their own CPU without contention. I don't recall if Eric layed out the proposed rules, but: 1) Map never blocks on map. Clients interested in dealing with this are on their own. 2) Submit blocks on map. You must unmap all buffers before submitting them. Doing the relocations in the kernel makes this all possible. 3) Map blocks on the fence from submit. We can play with pending the flush until the app asks for the buffer back, or we can play with figuring out when flushes are useful automatically. Doesn't matter if the policy is in the kernel. I'm interested in making deadlock avoidence trivial and eliminating any map-map contention. -- [EMAIL PROTECTED] signature.asc Description: This is a digitally signed message part
Re: [PATCH 4/5] ext4: fallocate support in ext4
On Thu, 26 Apr 2007 23:43:32 +0530 "Amit K. Arora" <[EMAIL PROTECTED]> wrote: > This patch has the ext4 implemtation of fallocate system call. > > ... > > + /* ext4_can_extents_be_merged should have checked that either > + * both extents are uninitialized, or both aren't. Thus we > + * need to check only one of them here. > + */ Please always format multiline comments like this: /* * ext4_can_extents_be_merged should have checked that either * both extents are uninitialized, or both aren't. Thus we * need to check only one of them here. */ > ... > > +/* > + * ext4_fallocate: > + * preallocate space for a file > + * mode is for future use, e.g. for unallocating preallocated blocks etc. > + */ This description is rather thin. What is the filesystem's actual behaviour here? If the file is using extents then the implementation will do . If the file is using bitmaps then we will do . But what? Here is where it should be described. > +int ext4_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len) > +{ > + handle_t *handle; > + ext4_fsblk_t block, max_blocks; > + int ret, ret2, nblocks = 0, retries = 0; > + struct buffer_head map_bh; > + unsigned int credits, blkbits = inode->i_blkbits; > + > + /* Currently supporting (pre)allocate mode _only_ */ > + if (mode != FA_ALLOCATE) > + return -EOPNOTSUPP; > + > + if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL)) > + return -ENOTTY; So we don't implement fallocate on bitmap-based files! Well that's huge news. The changelog would be an appropriate place to communicate this, along with reasons why, or a description of the plan to fix it. Also, posix says nothing about fallocate() returning ENOTTY. > + block = offset >> blkbits; > + max_blocks = (EXT4_BLOCK_ALIGN(len + offset, blkbits) >> blkbits) > + - block; > + mutex_lock(&EXT4_I(inode)->truncate_mutex); > + credits = ext4_ext_calc_credits_for_insert(inode, NULL); > + mutex_unlock(&EXT4_I(inode)->truncate_mutex); Now I'm mystified. Given that we're allocating an arbitrary amount of disk space, and that this disk space will require an arbitrary amount of metadata, how can we work out how much journal space we'll be needing without at least looking at `len'? > + handle=ext4_journal_start(inode, credits + Please always put spaces around "=" > + EXT4_DATA_TRANS_BLOCKS(inode->i_sb)+1); And around "+" > + if (IS_ERR(handle)) > + return PTR_ERR(handle); > +retry: > + ret = 0; > + while (ret >= 0 && ret < max_blocks) { > + block = block + ret; > + max_blocks = max_blocks - ret; > + ret = ext4_ext_get_blocks(handle, inode, block, > + max_blocks, &map_bh, > + EXT4_CREATE_UNINITIALIZED_EXT, 0); > + BUG_ON(!ret); BUG_ON is vicious. Is it really justified here? Possibly a WARN_ON and ext4_error() would be safer and more useful here. > + if (ret > 0 && test_bit(BH_New, &map_bh.b_state) Use buffer_new() here. A separate patch which fixes the three existing instances of open-coded BH_foo usage would be appreciated. > + && ((block + ret) > (i_size_read(inode) << blkbits))) Check for wrap though the sign bit and through zero please. > + nblocks = nblocks + ret; > + } > + > + if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) > + goto retry; > + > + /* Time to update the file size. > + * Update only when preallocation was requested beyond the file size. > + */ Fix comment layout. > + if ((offset + len) > i_size_read(inode)) { Both the lhs and the rhs here are signed. Please review for possible overflows through the sign bit and through zero. Perhaps a comment explaining why it's correct would be appropriate. > + if (ret > 0) { > + /* if no error, we assume preallocation succeeded completely */ > + mutex_lock(&inode->i_mutex); > + i_size_write(inode, offset + len); > + EXT4_I(inode)->i_disksize = i_size_read(inode); > + mutex_unlock(&inode->i_mutex); > + } else if (ret < 0 && nblocks) { > + /* Handle partial allocation scenario */ The above two comments should be indented one additional tabstop. > + loff_t newsize; > + mutex_lock(&inode->i_mutex); > + newsize = (nblocks << blkbits) + i_size_read(inode); > + i_size_write(inode, EXT4_BLOCK_ALIGN(newsize, blkbits)); > + EXT4_I(inode)->i_disksize = i_size_read(inode); > + mut
Re: [PATCH 3/5] ext4: Extent overlap bugfix
On Thu, 26 Apr 2007 23:41:01 +0530 "Amit K. Arora" <[EMAIL PROTECTED]> wrote: > +unsigned int ext4_ext_check_overlap(struct inode *inode, > + struct ext4_extent *newext, > + struct ext4_ext_path *path) > +{ > + unsigned long b1, b2; > + unsigned int depth, len1; > + > + b1 = le32_to_cpu(newext->ee_block); > + len1 = le16_to_cpu(newext->ee_len); > + depth = ext_depth(inode); > + if (!path[depth].p_ext) > + goto out; > + b2 = le32_to_cpu(path[depth].p_ext->ee_block); > + > + /* get the next allocated block if the extent in the path > + * is before the requested block(s) */ > + if (b2 < b1) { > + b2 = ext4_ext_next_allocated_block(path); > + if (b2 == EXT_MAX_BLOCK) > + goto out; > + } > + > + if (b1 + len1 > b2) { Are we sure that b1+len cannot wrap through zero here? > + newext->ee_len = cpu_to_le16(b2 - b1); > + return 1; > + } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc
On Thu, 26 Apr 2007 23:33:32 +0530 "Amit K. Arora" <[EMAIL PROTECTED]> wrote: > This patch implements the fallocate() system call and adds support for > i386, x86_64 and powerpc. > > ... > > +asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) Please add a comment over this function which specifies its behaviour. Really it should be enough material from which a full manpage can be written. If that's all too much, this material should at least be spelled out in the changelog. Because there's no way in which this change can be fully reviewed unless someone (ie: you) tells us what it is setting out to achieve. If we 100% implement some standard then a URL for what we claim to implement would suffice. Given that we're at least using different types from posix I doubt if such a thing would be sufficient. And given the complexity and potential variability within the filesystem implementations of this, I'd expect that _something_ additional needs to be said? > +{ > + struct file *file; > + struct inode *inode; > + long ret = -EINVAL; > + > + if (len == 0 || offset < 0) > + goto out; The posix spec implies that negative `len' is permitted - presumably "allocate ahead of `offset'". How peculiar. > + ret = -EBADF; > + file = fget(fd); > + if (!file) > + goto out; > + if (!(file->f_mode & FMODE_WRITE)) > + goto out_fput; > + > + inode = file->f_path.dentry->d_inode; > + > + ret = -ESPIPE; > + if (S_ISFIFO(inode->i_mode)) > + goto out_fput; > + > + ret = -ENODEV; > + if (!S_ISREG(inode->i_mode)) > + goto out_fput; So we return ENODEV against an S_ISBLK fd, as per the posix spec. That seems a bit silly of them. > + ret = -EFBIG; > + if (offset + len > inode->i_sb->s_maxbytes) > + goto out_fput; This code does handle offset+len going negative, but only by accident, I suspect. It happens that s_maxbytes has unsigned type. Perhaps a comment here would settle the reader's mind. > + if (inode->i_op && inode->i_op->fallocate) > + ret = inode->i_op->fallocate(inode, mode, offset, len); > + else > + ret = -ENOSYS; If we _are_ going to support negative `len', as posix suggests, I think we should perform the appropriate sanity conversions to `offset' and `len' right here, rather than expecting each filesystem to do it. If we're not going to handle negative `len' then we should check for it. > +out_fput: > + fput(file); > +out: > + return ret; > +} > +EXPORT_SYMBOL(sys_fallocate); I don't believe this needs to be exported to modules? > +/* > + * fallocate() modes > + */ > +#define FA_ALLOCATE 0x1 > +#define FA_DEALLOCATE0x2 Now those aren't in posix. They should be documented, along with their expected semantics. > #ifdef __KERNEL__ > > #include > @@ -1125,6 +1131,7 @@ struct inode_operations { > ssize_t (*listxattr) (struct dentry *, char *, size_t); > int (*removexattr) (struct dentry *, const char *); > void (*truncate_range)(struct inode *, loff_t, loff_t); > + long (*fallocate)(struct inode *, int, loff_t, loff_t); I really do think it's better to put the variable names in definitions such as this. Especially when we have two identically-typed variables next to each other like that. Quick: which one is the offset and which is the length? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RHEL 3
Hi, I am wondering if RHEL 3 (based on 2.4.21 kernel but RH claims they backported lot of 2.6 kernel's feature into it) supports Multi-Core and Hyperthreaded CPUs. Is the CPU-scheduler multi-core/hyperthreading aware? Is it aware ccNUMA multi-core CPU? Any input is appreciated. Thanks Rajib == Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html == - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.22 -mm merge plans -- vm bugfixes
Andrew Morton wrote: On Thu, 03 May 2007 11:32:23 +1000 Nick Piggin <[EMAIL PROTECTED]> wrote: void fastcall unlock_page(struct page *page) { + VM_BUG_ON(!PageLocked(page)); smp_mb__before_clear_bit(); - if (!TestClearPageLocked(page)) - BUG(); - smp_mb__after_clear_bit(); - wake_up_page(page, PG_locked); + ClearPageLocked(page); + if (unlikely(test_bit(PG_waiters, &page->flags))) { + clear_bit(PG_waiters, &page->flags); + wake_up_page(page, PG_locked); + } } Why is that significantly faster than plain old wake_up_page(), which tests waitqueue_active()? Because it needs fewer barriers and doesn't touch random a random hash cacheline in the fastpath. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: console font limits
On Thursday 03 May 2007 20:39:05 H. Peter Anvin wrote: > Kyle Moffett wrote: > > Actually I think the real problem was that "KD_GRAPHICS" got overloaded > > to mean "some userspace program is probably poking at the GPU in very > > direct ways possibly including /dev/mem". As such it really isn't safe > > at all for the kernel to write stuff to the screen in that situation; > > you could turn a panic()+reboot-after-30-secs into an unrecoverable hard > > PCI bus lockup. IIRC there were at least a couple chipsets which had > > that problem with X. If we can implement enough APIs for X to do all of > > its stuff from userspace without iopl() or /dev/mem then we could > > probably bring back the option for dumping oopses to screen in > > KD_GRAPHICS mode, but otherwise it'll just cause more headaches. > > It never meant anything *BUT* that, to the best of my knowledge. That > was certainly the original meaning of KD_GRAPHICS. I started work last year on making the framebuffer layer use the DRM internals for all controls, providing a unified kernel and userspace system for accessing the graphics devices. It never got anywhere because I couldn't figure out a simple system for figuring out which driver (out of the numerous ones that could potentially be compiled into the kernel) to actually give control to. (I know I could have just looped over them all and figured it out that way, but that is far from elegant) I guess I could start on that work again - shouldn't take me all that long to recover the stuff I lost when a blackout caused my hard drive to get corrupted beyond recovery (and the automated journal replay didn't do a damned thing - I think it actually *added* to the corruption, but I don't think any filesystem would have survived that) DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
how can I debug to debug kernel pointer error?
Hi all! I met a issue that some code changing one process preempt_count. preempt_count is changed to a very large number, for instant, 0x300, just before finish_schedule function in schedule. Who can give me some suggestion to debug such problem? Thanks very much! Janboe - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Routing 600+ vlan's via linux problems (looks like arp problems)
Hi again :) On 5/4/07, Willy Tarreau <[EMAIL PROTECTED]> wrote: On Thu, May 03, 2007 at 11:12:09PM +0200, Øyvind Vågen Jægtnes wrote: > On 5/3/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote: > > > >On May 3 2007 22:53, Willy Tarreau wrote: > >>> For the rest all we see in the arp cache is (incomplete) > >> > >>I suspect that your arp cache is full (128 entries by default). > >>Check /proc/sys/net/ipv4/neigh/gc_thresh1 (128 for me). You can > >>set it as high as gc_thresh2 (512 for me), and I don't know what > >>happens above. > > > >Above, you will perhaps need the not-so-elegant userspace arpd :-/ > > Yes, i was suspecting that the arp cache got full, but i will try > increasing it :) > Would there be any huge bugs if i change these lines in arp.c: > >.gc_thresh1 = 128, >.gc_thresh2 = 512, > > to > >.gc_thresh1 = 700, >.gc_thresh2 = 700, > > under the definition for struct arp_tbl? I don't think it could cause a problem, but network people will surely correct me if I'm wrong. System is up and running perfectly now, it is routing everything at about 200 mbps now with only 5% load avg with the above changes to arp.c So the real question now is, why is this number so low by default? It would probably be much better if this could be handled dynamically in the kernel. > This setup will only run for about 1-2 hours while we fix the hardware > router (it is running now, but only on a backup flash card solution. > the harddrive in it died ;) Huhhh! Please tell us exactly what make and model of ROUTER you are using which embeds a HARD DRIVE, so that we recall never to buy that ! Having seen uptimes of 5 years on moderately big access routers, I would have find it awful to see them die multiple times in that timeframe because of a crappy IDE drive inside ! Its a Juniper M7i It comes default with a 5400 rpm laptop 2.5" harddrive but now we bought a more robust "server" 2.5" harddrive. It still barfs on the OS install, so the linux is doing all the job now. Will get a juniper guy to come and fix :) As a side note, i'm starting to wonder if it was worth the $20k when i could just have a linux machine to do the job with a clone for backup ;) regards Øyvind Vågen Jægtnes +47 96 22 03 08 [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Serial 8250: clear the lsr_break_flag at open
Russell King wrote: The backup code is something I never properly reviewed, so no comments there. The tx_empty code I assumed would be a relatively rare event, except when closing the port (at which point you don't particularly care about errors anyway, not even the break flag since chances are you'll miss the following character.) That "if" statement in the backup code does look a little dodgy, more than is perhaps required. I think it's correct, but I need to add a lock there in my patch to protect the LSR check. Given that people might want to poll it for various reasons, I guess saving the status away should be done. However, there's a slight issue with working out which character the error is associated with. Careful locking may be the answer to that though. I think as long as you hold the port lock while you grab the LSR and set the saved flags it will work. As for start_tx, yes, though slightly harder to check. Maybe the code should be modified to reduce the number of potential LSR reads by reading the IIR first, and only if that shows no interrupt pending should the LSR be read (and the error flags remembered.) The version of start_tx in 2.6.21 does check IIR first, and it only checks the LSR if UART_BUG_TXEN is set, so I assume that's not a big deal. I'll sleep on it tonight, look it over tomorrow morning, and resend the patch. Thanks, -corey - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RELEASE] Lguest for 2.6.21
On Thu, 2007-05-03 at 22:20 -0500, Matt Mackall wrote: > I take it both sides of the virtual device drivers are turned on by > the lguest option? Yeah, to quote the code in drivers/lguest/lguest_bus.c: /* At the moment we build all the drivers into the kernel because they're so * simple: 8144 bytes for all three of them as I type this. And as the console * really needs to be built in, it's actually only 3527 bytes for the network * and block drivers. > For the purposes of kernel hacking, I'd want to boot into one build > and repeatedly launch another build as a guest, thereby getting > faster hack/build/test cycles than either qemu or full reboot. > How tightly coupled are things here? I do that all the time, too. The main issue is that we provide no ABI for lguest (at least, not yet), so if you actually change guest/host kernel version, you're on your own... Thanks! Rusty. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Thu, 3 May 2007, Andrew Morton wrote: > On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> > wrote: > > > Performance tests show a slight improvements in netperf (not a > > strong case for a performance improvement but removing the > > constructor has definitely no negative impact so why keep > > this around?). > > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost > > (127.0.0.1) port 0 AF_INET > > Recv SendSend > > Socket Socket Message Elapsed > > Size SizeSize Time Throughput > > bytes bytes bytessecs.10^6bits/sec > > > > Before: > > 87380 16384 1638410.016026.04 > > 87380 16384 1638410.015992.17 > > 87380 16384 1638410.016071.23 > > > > After: > > 87380 16384 1638410.016090.20 > > 87380 16384 1638410.016078.3 > > 87380 16384 1638410.006013.52 > > How could a filesystem change affect networking performance? > > The change looks nice, but I'd microbenchmark it with a > write-to-ext2-on-ramdisk > or something like that. H.. I was told in another thread that this is the most frequently used slab for this benchmark .. Just accepted that as true. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] MM: use DIV_ROUND_UP() in mm/memory.c
On Tue, 24 Apr 2007 16:10:22 +0200 Rolf Eike Beer <[EMAIL PROTECTED]> wrote: > This should make no difference in behaviour. > > Signed-off-by: Rolf Eike Beer <[EMAIL PROTECTED]> > > --- > commit 64aa7c3136258d3abc76354b5f83b9a9575169c0 > tree 8037adc04b57cd6150456399b7caccf99489385a > parent bf0bd376f79cadb4f8cd454db1723eb9be0aabc1 > author Rolf Eike Beer <[EMAIL PROTECTED]> Tue, 24 Apr 2007 16:05:40 +0200 > committer Rolf Eike Beer <[EMAIL PROTECTED]> Tue, 24 Apr 2007 16:05:40 > +0200 > > mm/memory.c |7 +++ > 1 files changed, 3 insertions(+), 4 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index e7066e7..45bba1f 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -1838,12 +1838,11 @@ void unmap_mapping_range(struct address_space > *mapping, > { > struct zap_details details; > pgoff_t hba = holebegin >> PAGE_SHIFT; > - pgoff_t hlen = (holelen + PAGE_SIZE - 1) >> PAGE_SHIFT; > + pgoff_t hlen = DIV_ROUND_UP(holelen, PAGE_SIZE); > > /* Check for overflow. */ > if (sizeof(holelen) > sizeof(hlen)) { > - long long holeend = > - (holebegin + holelen + PAGE_SIZE - 1) >> PAGE_SHIFT; > + long long holeend = DIV_ROUND_UP(holebegin + holelen, > PAGE_SIZE); > if (holeend & ~(long long)ULONG_MAX) > hlen = ULONG_MAX - hba + 1; > } > @@ -2592,7 +2591,7 @@ int make_pages_present(unsigned long addr, unsigned > long > end) > write = (vma->vm_flags & VM_WRITE) != 0; > BUG_ON(addr >= end); > BUG_ON(end > vma->vm_end); > - len = (end+PAGE_SIZE-1)/PAGE_SIZE-addr/PAGE_SIZE; > + len = DIV_ROUND_UP(end, PAGE_SIZE) - addr/PAGE_SIZE; > ret = get_user_pages(current, current->mm, addr, > len, write, 0, NULL, NULL); > if (ret < 0) The patch is wordwrapped. Please fix your MUA. More seriously, on i386: textdata bss dec hex filename 15509 27 28 155643ccc mm/memory.o (before) 15561 27 28 156163d00 mm/memory.o (after) I'm not sure why - some of the quantities which we're dividing by there are 64-bit and perhaps the compiler has decided not to do shifting. Please always check the before-and-after .text size from now on? Now I'm worried about all the other DIV_ROUND_UP() conversions we did. We should get in there and work out why it went bad. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Remove constructor from buffer_head
On Thu, 3 May 2007 20:08:41 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > Performance tests show a slight improvements in netperf (not a > strong case for a performance improvement but removing the > constructor has definitely no negative impact so why keep > this around?). > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost > (127.0.0.1) port 0 AF_INET > Recv SendSend > Socket Socket Message Elapsed > Size SizeSize Time Throughput > bytes bytes bytessecs.10^6bits/sec > > Before: > 87380 16384 1638410.016026.04 > 87380 16384 1638410.015992.17 > 87380 16384 1638410.016071.23 > > After: > 87380 16384 1638410.016090.20 > 87380 16384 1638410.016078.3 > 87380 16384 1638410.006013.52 How could a filesystem change affect networking performance? The change looks nice, but I'd microbenchmark it with a write-to-ext2-on-ramdisk or something like that. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RELEASE] Lguest for 2.6.21
On Fri, May 04, 2007 at 10:43:09AM +1000, Rusty Russell wrote: > On Fri, 2007-05-04 at 10:13 +1000, Rusty Russell wrote: > > On Thu, 2007-05-03 at 11:02 -0500, Matt Mackall wrote: > > > On Thu, May 03, 2007 at 12:43:48AM +1000, Rusty Russell wrote: > > > > http://lguest.ozlabs.org/lguest-2.6.21-254.patch.gz > > > > > > > > See Documentation/lguest/lguest.txt for how to run, > > > > drivers/lguest/README for the draft code documentation journey. > > > > > > Your lguest readme is quite lacking in the area of how to configure a > > > guest kernel as opposed to the host kernel. More hand-holding, please. > > > > Hi Matt! > > > > Ah, that's because they are the same kernel. Turning on CONFIG_LGUEST > > builds-in the parts needed to be a guest as well. Ok, I thought that might be a possibility. > -- You will need to configure your kernel with the following options: > +- Lguest runs the same kernel as guest and host. You can configure > + them differently, but usually it's easiest not to. I take it both sides of the virtual device drivers are turned on by the lguest option? For the purposes of kernel hacking, I'd want to boot into one build and repeatedly launch another build as a guest, thereby getting faster hack/build/test cycles than either qemu or full reboot. How tightly coupled are things here? -- Mathematics is the supreme nostalgia of our time. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Remove constructor from buffer_head
Performance tests show a slight improvements in netperf (not a strong case for a performance improvement but removing the constructor has definitely no negative impact so why keep this around?). TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET Recv SendSend Socket Socket Message Elapsed Size SizeSize Time Throughput bytes bytes bytessecs.10^6bits/sec Before: 87380 16384 1638410.016026.04 87380 16384 1638410.015992.17 87380 16384 1638410.016071.23 After: 87380 16384 1638410.016090.20 87380 16384 1638410.016078.3 87380 16384 1638410.006013.52 Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> --- fs/buffer.c | 22 -- 1 file changed, 4 insertions(+), 18 deletions(-) Index: slub/fs/buffer.c === --- slub.orig/fs/buffer.c 2007-05-03 19:17:09.0 -0700 +++ slub/fs/buffer.c2007-05-03 19:57:30.0 -0700 @@ -2907,9 +2907,10 @@ static void recalc_bh_state(void) struct buffer_head *alloc_buffer_head(gfp_t gfp_flags) { - struct buffer_head *ret = kmem_cache_alloc(bh_cachep, + struct buffer_head *ret = kmem_cache_zalloc(bh_cachep, set_migrateflags(gfp_flags, __GFP_RECLAIMABLE)); if (ret) { + INIT_LIST_HEAD(&ret->b_assoc_buffers); get_cpu_var(bh_accounting).nr++; recalc_bh_state(); put_cpu_var(bh_accounting); @@ -2928,17 +2929,6 @@ void free_buffer_head(struct buffer_head } EXPORT_SYMBOL(free_buffer_head); -static void -init_buffer_head(void *data, struct kmem_cache *cachep, unsigned long flags) -{ - if (flags & SLAB_CTOR_CONSTRUCTOR) { - struct buffer_head * bh = (struct buffer_head *)data; - - memset(bh, 0, sizeof(*bh)); - INIT_LIST_HEAD(&bh->b_assoc_buffers); - } -} - static void buffer_exit_cpu(int cpu) { int i; @@ -2965,12 +2955,8 @@ void __init buffer_init(void) { int nrpages; - bh_cachep = kmem_cache_create("buffer_head", - sizeof(struct buffer_head), 0, - (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC| - SLAB_MEM_SPREAD), - init_buffer_head, - NULL); + bh_cachep = KMEM_CACHE(buffer_head, + SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD); /* * Limit the bh occupancy to 10% of ZONE_NORMAL - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression with SLUB on Netperf and Volanomark
H... I do not see a regression (up to date slub with all outstanding patches applied). This is without any options enabled (but antifrag patches are present so slub_max_order=4 slub_min_objects=16) Could you post a .config? Missing patches against 2.6.21-rc7-mm2 can be found at http://ftp.kernel.org/pub/linux/kernel/peopl/christoph/slub-patches slab TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET Recv SendSend Socket Socket Message Elapsed Size SizeSize Time Throughput bytes bytes bytessecs.10^6bits/sec 87380 16384 1638410.016068.61 87380 16384 1638410.015877.91 87380 16384 1638410.015835.68 87380 16384 1638410.015840.58 slub TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET Recv SendSend Socket Socket Message Elapsed Size SizeSize Time Throughput bytes bytes bytessecs.10^6bits/sec 87380 16384 1638410.535646.53 87380 16384 1638410.016073.09 87380 16384 1638410.016094.68 87380 16384 1638410.016088.50 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Fw: [BUG 2.6.21-rc7] acpi_pm clocksource loses time on x86-64
On Wed, 2007-05-02 at 11:10 -0700, john stultz wrote: > On Sun, 2007-04-29 at 17:24 +0200, Mikael Pettersson wrote: > > On Thu, 26 Apr 2007 15:42:44 -0700, john stultz wrote: > > >Another shot in the dark: > > > > > >I wonder if the ACPI PM counter is halting in idle. Does booting w/ > > >idle=poll change the behavior? (Please do this while your laptop is > > >plugged in, as it will run the cpu at full speed all the time). > > > > Bingo! > > Awesome! Finally, some progress! Thanks again for putting up w/ all my > testing requests. > > > I booted the x86-64 2.6.21 final kernel with idle=poll and let the > > laptop idle for an hour. The ondemand cpufreq governor did reduce > > the CPU's clock frequency, but that shouldn't have affected the > > chipset or the ACPI PM counter. > > > > Anyway, after 60 minutes `date' and `hwclock' were still in perfect > > sync and matched actual time. > > > > Any ideas why this halting in idle doesn't happen with the 32-bit kernel? > > No clue. Time to ask Len. :) > > Hey Len, > So that slow acpi_pm on x86_64 seems to be connected w/ the idle loop. > I'm guessing the chipset halts the ACPI PM in lower C states. Do you > have any guesses as to what might differ between x86_64 and i386 ACPI > idle loops? Or might this be something different in what the BIOS > exports in x86_64 mode or i386 mode? Mikael, Just trying to dig a bit more through the acpi_processor_idle code. Could you run "cat /proc/acpi/processor/CPU1/power" and reply w/ the output? thanks -john - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] Stop ignoring argument in drivers/network/b44.c
This patch uses the phy_id variable in b44_readphy and b44_writephy. Signed-off-by: Matthew Martin <[EMAIL PROTECTED]> --- --- vanilla-linux-2.6.21-git4/drivers/net/b44.c 2007-05-03 11:16:21.0 -0500 +++ linux-2.6.21-git4/drivers/net/b44.c 2007-05-03 17:02:39.0 -0500 @@ -327,45 +327,59 @@ static void b44_enable_ints(struct b44 * bw32(bp, B44_IMASK, bp->imask); } -static int b44_readphy(struct b44 *bp, int reg, u32 *val) +static int b44_readphy(struct b44 *bp, int reg, u32 *val, int phy_addr) { int err; bw32(bp, B44_EMAC_ISTAT, EMAC_INT_MII); - bw32(bp, B44_MDIO_DATA, (MDIO_DATA_SB_START | -(MDIO_OP_READ << MDIO_DATA_OP_SHIFT) | -(bp->phy_addr << MDIO_DATA_PMD_SHIFT) | -(reg << MDIO_DATA_RA_SHIFT) | -(MDIO_TA_VALID << MDIO_DATA_TA_SHIFT))); + + if (!phy_addr) + bw32(bp, B44_MDIO_DATA, (MDIO_DATA_SB_START | +(MDIO_OP_READ << MDIO_DATA_OP_SHIFT) | +(bp->phy_addr << MDIO_DATA_PMD_SHIFT) | +(reg << MDIO_DATA_RA_SHIFT) | +(MDIO_TA_VALID << MDIO_DATA_TA_SHIFT))); + else + bw32(bp, B44_MDIO_DATA, (MDIO_DATA_SB_START | +(MDIO_OP_READ << MDIO_DATA_OP_SHIFT) | +(phy_addr << MDIO_DATA_PMD_SHIFT) | +(reg << MDIO_DATA_RA_SHIFT) | +(MDIO_TA_VALID << MDIO_DATA_TA_SHIFT))); + err = b44_wait_bit(bp, B44_EMAC_ISTAT, EMAC_INT_MII, 100, 0); *val = br32(bp, B44_MDIO_DATA) & MDIO_DATA_DATA; return err; } -static int b44_writephy(struct b44 *bp, int reg, u32 val) +static int b44_writephy(struct b44 *bp, int reg, u32 val, int phy_addr) { bw32(bp, B44_EMAC_ISTAT, EMAC_INT_MII); - bw32(bp, B44_MDIO_DATA, (MDIO_DATA_SB_START | -(MDIO_OP_WRITE << MDIO_DATA_OP_SHIFT) | -(bp->phy_addr << MDIO_DATA_PMD_SHIFT) | -(reg << MDIO_DATA_RA_SHIFT) | -(MDIO_TA_VALID << MDIO_DATA_TA_SHIFT) | -(val & MDIO_DATA_DATA))); + + if (!phy_addr) + bw32(bp, B44_MDIO_DATA, (MDIO_DATA_SB_START | +(MDIO_OP_WRITE << MDIO_DATA_OP_SHIFT) | +(bp->phy_addr << MDIO_DATA_PMD_SHIFT) | +(reg << MDIO_DATA_RA_SHIFT) | +(MDIO_TA_VALID << MDIO_DATA_TA_SHIFT) | +(val & MDIO_DATA_DATA))); + else + bw32(bp, B44_MDIO_DATA, (MDIO_DATA_SB_START | +(MDIO_OP_WRITE << MDIO_DATA_OP_SHIFT) | +(phy_addr << MDIO_DATA_PMD_SHIFT) | +(reg << MDIO_DATA_RA_SHIFT) | +(MDIO_TA_VALID << MDIO_DATA_TA_SHIFT) | +(val & MDIO_DATA_DATA))); + return b44_wait_bit(bp, B44_EMAC_ISTAT, EMAC_INT_MII, 100, 0); } /* miilib interface */ -/* FIXME FIXME: phy_id is ignored, bp->phy_addr use is unconditional - * due to code existing before miilib use was added to this driver. - * Someone should remove this artificial driver limitation in - * b44_{read,write}phy. bp->phy_addr itself is fine (and needed). - */ static int b44_mii_read(struct net_device *dev, int phy_id, int location) { u32 val; struct b44 *bp = netdev_priv(dev); - int rc = b44_readphy(bp, location, &val); + int rc = b44_readphy(bp, location, &val, phy_id); if (rc) return 0x; return val; @@ -375,7 +389,7 @@ static void b44_mii_write(struct net_dev int val) { struct b44 *bp = netdev_priv(dev); - b44_writephy(bp, location, val); + b44_writephy(bp, location, val, phy_id); } static int b44_phy_reset(struct b44 *bp) @@ -383,11 +397,11 @@ static int b44_phy_reset(struct b44 *bp) u32 val; int err; - err = b44_writephy(bp, MII_BMCR, BMCR_RESET); + err = b44_writephy(bp, MII_BMCR, BMCR_RESET, 0); if (err) return err; udelay(100); - err = b44_readphy(bp, MII_BMCR, &val); + err = b44_readphy(bp, MII_BMCR, &val, 0); if (!err) { if (val & BMCR_RESET) { printk(KERN_ERR PFX "%s: PHY Reset would not complete.\n", @@ -446,15 +460,15 @@ static int b44_setup_phy(struct b44 *bp) u32 val; int err; - if ((err = b44_readphy(bp, B44_MII_ALEDCTRL, &val)) != 0) + if ((err = b44_readphy(bp, B44_MII_ALEDCTRL, &val, 0))
Re: [PATCH] tty add compat_ioctl
Paul Fulghum wrote: Arnd Bergmann wrote: - In your driver you don't get the big kernel lock in the compat_ioctl function. I assume that this is correct for the particular driver, but it may be nice if you could consequently also add an unlocked_ioctl function that can be used without the BKL for native ioctls. It would be good to hear an opinon on this from someone who has an insight in tty locking issues though, so I'm Cc:ing some people who have touched that recently. I don't count on higher level locking for synchronization issues specific to the driver. I thought the current compat_ioctl() was already meant to *not* have the BKL just like unlocked_ioctl. My thought was that any driver getting a recent update like compat_ioctl() would need to be reviewed for BKL safety and take the lock manually if necessary. Nevermind. I misread what you wrote (I'm tired). Yes, adding an unlocked_ioctl() makes sense. -- Paul - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Detecting process death for anycast named process monitoring
On Wed, 2007-05-02 at 16:30 -0600, Chris Friesen wrote: > Glen Turner wrote: > > > The question is, how can a process with no relationship to another > > process detect that process unexpectedly dying? If named goes > > away to a better place, we want to shut down the interface > > which causes Quagga to inject the anycast route. > We did something similar where arbitrary processes can register to be > sent an arbitrary signal when the state of other processes change. What about something like inotify, but for processes? That would be cool... - DML - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression with SLUB on Netperf and Volanomark
H.. One potential issues are the complicated way the slab is handled. Could you try this patch and see what impact it has? If it has any then remove the cachline alignment and see how that influences things. Remove constructor from buffer_head Buffer head management uses a constructor which increases overhead for object handling. Remove the constructor. That way SLUB can place the freepointer in an optimal location instead of after the object in potentially another cache line. Also having no constructor makes allocation and disposal of slabs from the page allocator much easier since no pass over the objects allocated to call construtors is necessary. SLUB can directly begin by serving the first object. Plus it simplifies the code and removes a difficult to understand element for buffer handling. Align the buffer heads on cacheline boundaries for best performance. Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> --- fs/buffer.c | 22 -- include/linux/buffer_head.h |2 +- 2 files changed, 5 insertions(+), 19 deletions(-) Index: slub/fs/buffer.c === --- slub.orig/fs/buffer.c 2007-04-30 22:03:21.0 -0700 +++ slub/fs/buffer.c2007-05-03 18:37:47.0 -0700 @@ -2907,9 +2907,10 @@ static void recalc_bh_state(void) struct buffer_head *alloc_buffer_head(gfp_t gfp_flags) { - struct buffer_head *ret = kmem_cache_alloc(bh_cachep, + struct buffer_head *ret = kmem_cache_zalloc(bh_cachep, set_migrateflags(gfp_flags, __GFP_RECLAIMABLE)); if (ret) { + INIT_LIST_HEAD(&ret->b_assoc_buffers); get_cpu_var(bh_accounting).nr++; recalc_bh_state(); put_cpu_var(bh_accounting); @@ -2928,17 +2929,6 @@ void free_buffer_head(struct buffer_head } EXPORT_SYMBOL(free_buffer_head); -static void -init_buffer_head(void *data, struct kmem_cache *cachep, unsigned long flags) -{ - if (flags & SLAB_CTOR_CONSTRUCTOR) { - struct buffer_head * bh = (struct buffer_head *)data; - - memset(bh, 0, sizeof(*bh)); - INIT_LIST_HEAD(&bh->b_assoc_buffers); - } -} - static void buffer_exit_cpu(int cpu) { int i; @@ -2965,12 +2955,8 @@ void __init buffer_init(void) { int nrpages; - bh_cachep = kmem_cache_create("buffer_head", - sizeof(struct buffer_head), 0, - (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC| - SLAB_MEM_SPREAD), - init_buffer_head, - NULL); + bh_cachep = KMEM_CACHE(buffer_head, + SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD); /* * Limit the bh occupancy to 10% of ZONE_NORMAL Index: slub/include/linux/buffer_head.h === --- slub.orig/include/linux/buffer_head.h 2007-05-03 18:40:51.0 -0700 +++ slub/include/linux/buffer_head.h2007-05-03 18:41:07.0 -0700 @@ -73,7 +73,7 @@ struct buffer_head { struct address_space *b_assoc_map; /* mapping this buffer is associated with */ atomic_t b_count; /* users using this buffer_head */ -}; +} cacheline_aligned_in_smp; /* * macro tricks to expand the set_buffer_foo(), clear_buffer_foo() - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [v4l-dvb-maintainer] [PATCH 35/36] Use menuconfig objects II - DVB
On Fri, 4 May 2007, Roman Zippel wrote: > I don't quite understand. With the menuconfig changes more menu entries > should appear on the left side, so I don't understand why you have to > "drill down" to reach it. > The rule for menu to appear on the left side is relatively simple - all > its parents must be of menu type as well. So if a menuconfig is on the > right side it must have a normal config entry as parent. I think that's it. The media tree was done with options to select the core system module, then a menuconfig that depended on that which the drivers were under. > > > menuconfig > > > if > > > [all the other options] > > > endif > > > > > > Into this: > > > > > > menuconfig > > > [all the other options] > > > endmenu > > > > > > The reason is that a frontend would easily be able to understand the > > > coupling > > > between the "menuconfig " and the "if ". It will make it easier > > > for > > > the frontend to see that all the options are inside and controlled by the > > > enclosing menuconfig. > > If the frontend wants to change the behaviour of a menuconfig, it can > already do that, so this doesn't require a syntax change. How about these examples: menuconfig FOO if FOO config A depends on FOO endif config B if FOO config C depends on FOO endif Or this: menu FOO menuconfig BAR config A menuconfig BAZ config B endmenu How does it show the first one, keeping the config entries in the correct order and put them into the menu at the same time? And which of what should the second be show? foo \-bar \-baz or foo |-bar \-baz There is no question with menus, as the menu tree is clearly lexically defined by the matching menu / endmenu pairs. But menuconfig doesn't work that way, and it seems like it would make more sense if it did. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: + per-cpuset-hugetlb-accounting-and-administration.patch added to -mm tree
Adding Christoph Lameter <[EMAIL PROTECTED]> to the cc list, as he knows more about hugetlb pages than I do. This patch strikes me as a bit odd. Granted, it's solving what could be a touchy problem with a fairly simple solution, which is usually a Good Thing(tm). However, the idea that different tasks would see different values for the following fields in /proc/meminfo: HugePages_Total: 0 HugePages_Free: 0 strikes me as odd, and risky. I would have thought that usually, all tasks in the system should see the same values in the files in /proc (as opposed to the files in particular task subdirectories /proc/.) This patch strikes me as a bit of a hack, good for compatibility, but hiding a booby trap that will bite some user code in the long run. But I'm not enough of an expert to know what the right tradeoffs are in this matter. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-dvb] DST/BT878 module customization (.. was: Critical points about ...)
Original-Nachricht Datum: Fri, 04 May 2007 02:31:49 +0400 Von: Manu Abraham <[EMAIL PROTECTED]> An: [EMAIL PROTECTED] CC: linux-kernel@vger.kernel.org Betreff: Re: [linux-dvb] DST/BT878 module customization (.. was: Critical points about ...) > Markus Rechberger wrote: > > > I mean the mail from Helge Hafting (thread [linux-dvb] Critical > > points about kernel 2.6.21 and pseudo-authorities) at the very first > > beginning. > > > > I am replying to this mail, just because someone's spreading lies all > around. > On the mentioned thread, what i wrote (and that was the only mail from > my side): > > There is a saying: "He who lives by the sword, dies by the sword." Hi Manu, The saying that you stated is a very christian one. I perhaps should state that I am 47 years old now, raised in in utmost reactionary region called Bavaria (Western Germany), and also raised by parents of Russian / Polonian origin who shared the Nazi regime with the usual "I-do-not-want-to-talk-about-it-and-I-do-not-want-to-feel-responsible-about-it "-behaviour. And I am very much not only interested in german post-war history, but I simply love to write provocative letters or mails to make my conviction utmost clear that all this capitalist bullshit around us should vanish and shrink and be overcome some day. Basic christian ideals are very close to basic marxist ideas. The one who never does perceive that is a real poor human being in my eyes, if not to say: a complete idiot or a system-conforming hypocrite. BUT: I in fact do not read this "saying" for the first time: In my personal experience (feel very sorry about it, but it's true) it has always truthfully been an excuse for persons being strongly limited on what I would call utmost primitive instincts like greed or rapacity (i. e. the utmost perfect sounding "would-like-to-capitalists", if not to say: the perfect slaves or: the perfect counterrevolutionaries or strike-breakers, if not to say: the utmost perfect asscreepers). Please forgive me for that statement, but I am simply stating my personal experiences very truthfully, without playing any politics, but just telling you my "personal truth" or the sum of all my personal life experience unfortunately bound to that. And if there is discussion needed on that we should do it private or anyway on some other thread, but definitely not on this one. Hints to help you to understand the difference: 1. There is a GPL license written by Richard Stallman whose origin I do not know: Its essence is the philosophy to share and to be highly transparent as far as information level is concerned. 2. There is a saying by Linus in which he states the best choice he ever did was conforming his work to the terms of Richard Stallman, the GPL. 3. Wikipedia says that Linus's father was no christian at all, but simply a communist. See, Manu, there are deeply primitive instinct-driven hypocrites around like hell, but there are also truthful human beings around. But: The Internet does not provide a platform to find out who is who and what is what. The Internet may be necessary, but in the end it's just a drag, isn't it? Sincerely Uwe > > > Original Message > Subject: Re: [linux-dvb] Re: Critical points about kernel 2.6.21 and > pseudo-authorities > Date: Tue, 01 May 2007 04:19:41 +0400 > From: Manu Abraham <[EMAIL PROTECTED]> > To: Uwe Bugla <[EMAIL PROTECTED]> > CC: [EMAIL PROTECTED], [EMAIL PROTECTED], > linux-kernel@vger.kernel.org, [EMAIL PROTECTED], > [EMAIL PROTECTED], [EMAIL PROTECTED] > References: <[EMAIL PROTECTED]> > <[EMAIL PROTECTED]> > <[EMAIL PROTECTED]> > <[EMAIL PROTECTED]> > <[EMAIL PROTECTED]> <[EMAIL PROTECTED] > <[EMAIL PROTECTED]> > <[EMAIL PROTECTED]> > <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> > > Uwe Bugla wrote: > > > 1. You utmost personally are responsible for 4 ununsable kernels, as > far as bt8xx cards are concerned: 2.6.13, 2.6.14, 2.6.15, 2.6.16! > > 2. You did not even want to imply to resolve that issue by incarnating > that "community and synergy principle" that linux community needs to > exist at all, but you just perverted it by flaming capable people - > > You mean like this: > > > Original Message > Subject: kernel patch practice in 2.6.13-mm2 > Date: Tue, 13 Sep 2005 16:46:35 +0200 (MEST) > From: Uwe Bugla <[EMAIL PROTECTED]> > To: [EMAIL PROTECTED] > CC: [EMAIL PROTECTED] > > Hi, > if you continue to send or sign mm-patches for Kernel 2.6.13 as a > consequence of a design change I would appreciate you to stop rubbing out > my > name. > You did that in a file called /Documentation/dvb/bt8xx.txt. > My objective is understandable good documentation, even if it may
Re: [patch] compiler: introduce __used and __maybe_unused
__used is defined to be __attribute__((unused)) for all pre-3.4 gcc compilers to suppress warnings for unused functions because perhaps they are referenced only in inline assembly. It is defined to be __attribute__((used)) for gcc 3.4 and later so that the code is still emitted for such functions. __maybe_unused is defined to be __attribute__((unused)) for both function and variable use if it could possibly be unreferenced due to the evaluation of preprocessor macros. Function prototypes shall be marked with __maybe_unused if the actual definition of the function is dependant on preprocessor macros. No update to compiler-intel.h is necessary because ICC supports both __attribute__((used)) and __attribute__((unused)) as specified by the gcc manual. __attribute_used__ is deprecated and will be removed once all current code is converted to using __used. Cc: Rusty Russell <[EMAIL PROTECTED]> Cc: Andrian Bunk <[EMAIL PROTECTED]> Signed-off-by: David Rientjes <[EMAIL PROTECTED]> --- include/linux/compiler-gcc.h |1 + include/linux/compiler-gcc3.h |6 -- include/linux/compiler-gcc4.h |3 ++- include/linux/compiler.h | 21 ++--- 4 files changed, 25 insertions(+), 6 deletions(-) diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h --- a/include/linux/compiler-gcc.h +++ b/include/linux/compiler-gcc.h @@ -37,3 +37,4 @@ #define noinline __attribute__((noinline)) #define __attribute_pure__ __attribute__((pure)) #define __attribute_const____attribute__((__const__)) +#define __maybe_unused __attribute__((unused)) diff --git a/include/linux/compiler-gcc3.h b/include/linux/compiler-gcc3.h --- a/include/linux/compiler-gcc3.h +++ b/include/linux/compiler-gcc3.h @@ -4,9 +4,11 @@ #include #if __GNUC_MINOR__ >= 3 -# define __attribute_used____attribute__((__used__)) +# define __used__attribute__((__used__)) +# define __attribute_used____used /* deprecated */ #else -# define __attribute_used____attribute__((__unused__)) +# define __used__attribute__((__unused__)) +# define __attribute_used____used /* deprecated */ #endif #if __GNUC_MINOR__ >= 4 diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h --- a/include/linux/compiler-gcc4.h +++ b/include/linux/compiler-gcc4.h @@ -12,7 +12,8 @@ # define __inline __inline__attribute__((always_inline)) #endif -#define __attribute_used__ __attribute__((__used__)) +#define __used __attribute__((__used__)) +#define __attribute_used__ __used /* deprecated */ #define __must_check __attribute__((warn_unused_result)) #define __compiler_offsetof(a,b) __builtin_offsetof(a,b) #define __always_inlineinline __attribute__((always_inline)) diff --git a/include/linux/compiler.h b/include/linux/compiler.h --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -108,15 +108,30 @@ extern void __chk_io_ptr(const void __iomem *); * Allow us to avoid 'defined but not used' warnings on functions and data, * as well as force them to be emitted to the assembly file. * - * As of gcc 3.3, static functions that are not marked with attribute((used)) - * may be elided from the assembly file. As of gcc 3.3, static data not so + * As of gcc 3.4, static functions that are not marked with attribute((used)) + * may be elided from the assembly file. As of gcc 3.4, static data not so * marked will not be elided, but this may change in a future gcc version. * + * NOTE: Because distributions shipped with a backported unit-at-a-time + * compiler in gcc 3.3, we must define __used to be __attribute__((used)) + * for gcc >=3.3 instead of 3.4. + * * In prior versions of gcc, such functions and data would be emitted, but * would be warned about except with attribute((unused)). + * + * Mark functions that are referenced only in inline assembly as __used so + * the code is emitted even though it appears to be unreferenced. */ #ifndef __attribute_used__ -# define __attribute_used__/* unimplemented */ +# define __attribute_used__/* deprecated */ +#endif + +#ifndef __used +# define __used/* unimplemented */ +#endif + +#ifndef __maybe_unused +# define __maybe_unused/* unimplemented */ #endif /* - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] make cancel_rearming_delayed_work() reliable
On Fri, 4 May 2007 00:42:26 +0400 Oleg Nesterov <[EMAIL PROTECTED]> wrote: > Thanks to Jarek Poplawski for the ideas and for spotting the bug in the > initial draft patch. > > cancel_rearming_delayed_work() currently has many limitations, because it > requires that dwork always re-arms itself via queue_delayed_work(). So it > hangs forever if dwork doesn't do this, or cancel_rearming_delayed_work/ > cancel_delayed_work was already called. It uses flush_workqueue() in a loop, > so it can't be used if workqueue was freezed, and it is potentially live- > lockable on busy system if delay is small. > > With this patch cancel_rearming_delayed_work() doesn't make any assumptions > about dwork, it can re-arm itself via queue_delayed_work(), or queue_work(), > or do nothing. > > As a "side effect", cancel_work_sync() was changed to handle re-arming works > as well. > > Disadvantages: > > - this patch adds wmb() to insert_work(). > > - slowdowns the fast path (when del_timer() succeeds on entry) of > cancel_rearming_delayed_work(), because wait_on_work() is called > unconditionally. In that case, compared to the old version, we are > doing "unneeded" lock/unlock for each online CPU. > > On the other hand, this means we don't need to use cancel_work_sync() > after cancel_rearming_delayed_work(). > > - complicates the code (.text grows by 130 bytes). > hm, this is getting complex. > + while (!try_to_grab_pending(work)) > + ; The patch adds a couple of spinloops. Normally we put a cpu_relax() into such loops. It can make a very large difference under some circumstances. > + while (!del_timer(&dwork->timer) && > +!try_to_grab_pending(&dwork->work)) > + ; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.22 -mm merge plans: slub on PowerPC
On Fri, 4 May 2007, Benjamin Herrenschmidt wrote: > > The SLUB allocator relies on struct page fields first_page and slab, > > overwritten by ptl when SPLIT_PTLOCK: so the SLUB allocator cannot then > > be used for the lowest level of pagetable pages. This was obstructing > > SLUB on PowerPC, which uses kmem_caches for its pagetables. So convert > > its pte level to use quicklist pages (whereas pmd, pud and 64k-page pgd > > want partpages, so continue to use kmem_caches for pmd, pud and pgd). > > But to keep up appearances for pgtable_free, we still need PTE_CACHE_NUM. > > Interesting... I'll have a look asap. I would also recommend looking at removing the constructors for the remaining slabs. A constructor requires that SLUB never touch the object (same situation as is resulting from enabling debugging). So it must increase the object size in order to put the free pointer after the object. In case of a order of 2 cache this has a particularly bad effect of doubling object size. If the objects can be overwritten on free (no constructor) then we can use the first word of the object as a freepointer on kfree. Meaning we can use a hot cacheline so no cache miss. On alloc we have already touched the first cacheline which also avoids a cacheline fetch there. This is the optimal way of operation for SLUB. Hmmm We could add an option to allow the use of a constructor while keeping the free pointer at the beginning of the object? Then we would have to zap the first word on alloc. Would work like quicklists. Add SLAB_FREEPOINTER_MAY_OVERLAP? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] synclink_gt add compat_ioctl
On Thu, 03 May 2007 13:01:17 -0500 Paul Fulghum <[EMAIL PROTECTED]> wrote: > Add compat_ioctl handler to synclink_gt driver. > > The one case requiring a separate 32 bit handler could be > removed by redefining the associated structure in > a way compatible with both 32 and 64 bit systems. But that > approach would break existing native 64 bit user applications. A made a few changes here... From: Andrew Morton <[EMAIL PROTECTED]> - Fix i386 build: In file included from drivers/char/synclink_gt.c:85: include/linux/synclink.h:175: error: expected specifier-qualifier-list before 'compat_ulong_t' - We might as well do the same ifdef-avoidery trick around compat_ioctl() too. That required that it be renamed. - It is fishy that apart from one outlier in kexec.h, synclink.h is the only header file which uses compat_ulong_t. Are we doing this right? Cc: Alan Cox <[EMAIL PROTECTED]> Cc: Arnd Bergmann <[EMAIL PROTECTED]> Cc: Paul Fulghum <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- drivers/char/synclink_gt.c | 16 +--- include/linux/synclink.h |5 +++-- 2 files changed, 12 insertions(+), 9 deletions(-) diff -puN drivers/char/synclink_gt.c~synclink_gt-add-compat_ioctl-fix drivers/char/synclink_gt.c --- a/drivers/char/synclink_gt.c~synclink_gt-add-compat_ioctl-fix +++ a/drivers/char/synclink_gt.c @@ -1176,15 +1176,16 @@ static int ioctl(struct tty_struct *tty, } #ifdef CONFIG_COMPAT -static long compat_ioctl(struct tty_struct *tty, struct file *file, +static long synclink_compat_ioctl(struct tty_struct *tty, struct file *file, unsigned int cmd, unsigned long arg) { struct slgt_info *info = tty->driver_data; int rc = -ENOIOCTLCMD; - if (sanity_check(info, tty->name, "compat_ioctl")) + if (sanity_check(info, tty->name, "synclink_compat_ioctl")) return -ENODEV; - DBGINFO(("%s compat_ioctl() cmd=%08X\n", info->device_name, cmd)); + DBGINFO(("%s synclink_compat_ioctl() cmd=%08X\n", + info->device_name, cmd)); switch (cmd) { @@ -1219,9 +1220,12 @@ static long compat_ioctl(struct tty_stru break; } - DBGINFO(("%s compat_ioctl() cmd=%08X rc=%d\n", info->device_name, cmd, rc)); + DBGINFO(("%s synclink_compat_ioctl() cmd=%08X rc=%d\n", + info->device_name, cmd, rc)); return rc; } +#else +#define synclink_compat_ioctl NULL #endif /* @@ -3554,9 +3558,7 @@ static const struct tty_operations ops = .chars_in_buffer = chars_in_buffer, .flush_buffer = flush_buffer, .ioctl = ioctl, -#ifdef CONFIG_COMPAT - .compat_ioctl = compat_ioctl, -#endif + .compat_ioctl = synclink_compat_ioctl, .throttle = throttle, .unthrottle = unthrottle, .send_xchar = send_xchar, diff -puN include/linux/synclink.h~synclink_gt-add-compat_ioctl-fix include/linux/synclink.h --- a/include/linux/synclink.h~synclink_gt-add-compat_ioctl-fix +++ a/include/linux/synclink.h @@ -169,9 +169,9 @@ typedef struct _MGSL_PARAMS } MGSL_PARAMS, *PMGSL_PARAMS; +#ifdef CONFIG_COMPAT /* provide 32 bit ioctl compatibility on 64 bit systems */ -struct MGSL_PARAMS32 -{ +struct MGSL_PARAMS32 { compat_ulong_t mode; unsigned char loopback; unsigned short flags; @@ -186,6 +186,7 @@ struct MGSL_PARAMS32 unsigned char stop_bits; unsigned char parity; }; +#endif #define MICROGATE_VENDOR_ID 0x13c0 #define SYNCLINK_DEVICE_ID 0x0010 _ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression with SLUB on Netperf and Volanomark
On Thu, 3 May 2007, Chen, Tim C wrote: > We are still seeing a 5% regression on TCP streaming with > slub_min_objects set at 16 and a 10% regression for Volanomark, after > increasing slub_min_objects to 16 and setting slub_max_order=4 and using > the 2.6.21-rc7-mm2 kernel. The performance between slub_min_objects=8 > and 16 are similar. Ok. We then need to look at partial list management. It could be that the sequence of partials is reversed. The problem is that I do not really have time to concentrate on performance right now. Stability comes first. We will likely end up putting some probes in there to find out where the overhead comes from. > > Check slabinfo output for the network slabs and see what order is > > used. The number of objects per slab is important for performance. > > The order used is 0 for the buffer_head, which is the most used object. > > I think they are 104 bytes per object. Hmmm Then it was not affected by slab_max_order? Try slab_min_order=1 or 2 to increase that? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] tty add compat_ioctl
Arnd Bergmann wrote: - The return value of the new compat_ioctl methods should probably 'int', not 'long'. We've had the discussion before and then decided not to change the existing compat_ioctl and unlocked_ioctl functions -- even though int is more appropriate, but having the same prototype has the advantage that a driver can use the same function for both ->ioctl and ->compat_ioctl if all calls are compatible. I noticed that but thought the change in return value type had some higher purpose I had not perceived. If it can be int that would be the way to go. - In your driver you don't get the big kernel lock in the compat_ioctl function. I assume that this is correct for the particular driver, but it may be nice if you could consequently also add an unlocked_ioctl function that can be used without the BKL for native ioctls. It would be good to hear an opinon on this from someone who has an insight in tty locking issues though, so I'm Cc:ing some people who have touched that recently. I don't count on higher level locking for synchronization issues specific to the driver. I thought the current compat_ioctl() was already meant to *not* have the BKL just like unlocked_ioctl. My thought was that any driver getting a recent update like compat_ioctl() would need to be reviewed for BKL safety and take the lock manually if necessary. Drivers that are falling behind wont have a compat_ioctl defined at all. -- Paul - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] compiler: introduce __used and __maybe_unused
On Thu, May 03, 2007 at 05:35:57PM -0700, David Rientjes wrote: >... > There was a mistake in the current implementation of __attribute_used__ > whereas it would be defined to be __attribute__((used)) incorrectly for > gcc 3.3 and later. The unit-at-a-time compilation scheme was only > introduced in gcc 3.4 and later versions as specified in > http://www.gnu.org/software/gcc/gcc-3.4/changes.html. >... AFAIR, Suse shipped a release of their distribution with a gcc 3.3 containing a backported unit-at-a-time. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] lib/hexdump
> > Ho hum. Perhaps a middle ground is to implement hexdump-to-memory as the > > core function. hex_dumper() becomes a simple wrapper around that. (but > > how big is its buffer? One line would be OK, I guess) > > Yeah, I almost did it that way. We'll see. > > > > OK, that's one way to do it. I'll wait a bit for other comments. > > > > Good luck ;) next try: From: Randy Dunlap <[EMAIL PROTECTED]> Based on ace_dump_mem() from Grant Likely for the Xilinx SystemACE CompactFlash interface. Add print_hex_dump() & hex_dumper() to lib/hexdump.c and linux/kernel.h. This patch adds the functions print_hex_dump() & hex_dumper(). print_hex_dump() can be used to perform a hex + ASCII dump of data to syslog, in an easily viewable format, thus providing a common text hex dump format. hex_dumper() provides a dump-to-memory function. It converts one "line" of output (16 bytes of input) at a time. Example usages: print_hex_dump(KERN_DEBUG, DUMP_PREFIX_ADDRESS, frame->data, frame->len); hex_dumper(frame->data, frame->len, linebuf, sizeof(linebuf)); Example output using %DUMP_PREFIX_OFFSET: 0009ab42: 40414243 44454647 48494a4b [EMAIL PROTECTED] HIJKLMNO Example output using %DUMP_PREFIX_ADDRESS: 88089af0: 70717273 74757677 78797a7b 7c7d7e7f-pqrstuvw xyz{|}~. Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]> --- include/linux/kernel.h | 10 lib/Makefile |2 lib/hexdump.c | 105 + 3 files changed, 116 insertions(+), 1 deletion(-) --- linux-2.6.21-git4.orig/include/linux/kernel.h +++ linux-2.6.21-git4/include/linux/kernel.h @@ -202,6 +202,16 @@ extern enum system_states { extern void dump_stack(void); +enum { + DUMP_PREFIX_NONE, + DUMP_PREFIX_ADDRESS, + DUMP_PREFIX_OFFSET +}; +extern void hex_dumper(void *buf, size_t len, char *linebuf, size_t linebuflen); +extern void print_hex_dump(const char *level, int prefix_type, + void *buf, size_t len); +#define hextoasc(x)"0123456789abcdef"[x] + #ifdef DEBUG /* If you are writing a driver, please use dev_dbg instead */ #define pr_debug(fmt,arg...) \ --- linux-2.6.21-git4.orig/lib/Makefile +++ linux-2.6.21-git4/lib/Makefile @@ -13,7 +13,7 @@ lib-$(CONFIG_SMP) += cpumask.o lib-y += kobject.o kref.o kobject_uevent.o klist.o obj-y += div64.o sort.o parser.o halfmd4.o debug_locks.o random32.o \ -bust_spinlocks.o +bust_spinlocks.o hexdump.o ifeq ($(CONFIG_DEBUG_KOBJECT),y) CFLAGS_kobject.o += -DDEBUG --- /dev/null +++ linux-2.6.21-git4/lib/hexdump.c @@ -0,0 +1,105 @@ +/* + * lib/hexdump.c + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. See README and COPYING for + * more details. + */ + +#include +#include +#include +#include + +/** + * hex_dumper - convert a blob of data to "hex ASCII" in memory + * @buf: data blob to dump + * @len: number of bytes in the @buf + * @linebuf: where to put the converted data + * @linebuflen: total size of @linebuf, including space for terminating NUL + * + * hex_dumper() works on one "line" of output at a time, i.e., + * 16 bytes of input data converted to hex + ASCII output. + * + * Given a buffer of u8 data, hex_dumper() converts the input data to a + * hex + ASCII dump at the supplied memory location. + * The converted output is always NUL-terminated. + * + * E.g.: + * hex_dumper(frame->data, frame->len, linebuf, sizeof(linebuf)); + * + * Prints the offsets of the block of memory, not addresses: + * 0009ab42: 40414243 44454647 48494a4b [EMAIL PROTECTED] HIJKLMNO + */ +void hex_dumper(void *buf, size_t len, char *linebuf, size_t linebuflen) +{ + const u8 *ptr = buf; + u8 ch; + int j, lx = 0; + + for (j = 0; (j < 16) && (j < len) && (lx + 3) < linebuflen; j++) { + if (j && !(j % 4)) + linebuf[lx++] = ' '; + ch = ptr[j]; + linebuf[lx++] = hextoasc(ch >> 4); + linebuf[lx++] = hextoasc(ch & 0x0f); + } + if (lx < linebuflen) + linebuf[lx++] = '-'; + for (j = 0; (j < 16) && (j < len) && (lx + 2) < linebuflen; j++) { + linebuf[lx++] = isprint(ptr[j]) ? ptr[j] : '.'; + if (j == 7) + linebuf[lx++] = ' '; + } + linebuf[lx++] = '\0'; +} +EXPORT_SYMBOL(hex_dumper); + +/** + * print_hex_dump - print a text hex dump to syslog for a binary blob of data + * @level: kernel log level (e.g. KERN_DEBUG) + * @prefix_type: controls whether prefix of an offset, address, or none + * is printed (%DUMP_PREFIX_OFFSET, %DUMP_PREFIX_ADDRESS, %DUMP_PREFIX_NONE) + * @buf: data blob to dump + * @len: number of bytes in the @buf + * + * Given a buffer of u8 data, print_hex_dump() prints a hex + ASCII dump + * to the ker
Re: [linux-dvb] DST/BT878 module customization (.. was: Critical points about ...)
Am Freitag, den 04.05.2007, 02:31 +0400 schrieb Manu Abraham: > Markus Rechberger wrote: > > > I mean the mail from Helge Hafting (thread [linux-dvb] Critical > > points about kernel 2.6.21 and pseudo-authorities) at the very first > > beginning. > > > > I am replying to this mail, just because someone's spreading lies all > around. > On the mentioned thread, what i wrote (and that was the only mail from > my side): > > There is a saying: "He who lives by the sword, dies by the sword." > Within the last six years there was in the end exactly one, never asked for, private mail with worst *bullshit* about another person, Mauro in this case. It came from you, out of any feasible arguments for me anymore. I'm stupid, but not stupid enough to allow such stuff coming in rule. But I still say you have been first and are waiting longest to get your work in, please try again to get your ACKs and rant about not enough replies. Cheers, Hermann - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RELEASE] Lguest for 2.6.21
On Fri, 2007-05-04 at 10:13 +1000, Rusty Russell wrote: > On Thu, 2007-05-03 at 11:02 -0500, Matt Mackall wrote: > > On Thu, May 03, 2007 at 12:43:48AM +1000, Rusty Russell wrote: > > > http://lguest.ozlabs.org/lguest-2.6.21-254.patch.gz > > > > > > See Documentation/lguest/lguest.txt for how to run, > > > drivers/lguest/README for the draft code documentation journey. > > > > Your lguest readme is quite lacking in the area of how to configure a > > guest kernel as opposed to the host kernel. More hand-holding, please. > > Hi Matt! > > Ah, that's because they are the same kernel. Turning on CONFIG_LGUEST > builds-in the parts needed to be a guest as well. > > Thanks for pointing out that weakness. I will modify lguest.txt to make > that clear. Something like this: diff -r 940ec1c6ac5a Documentation/lguest/lguest.txt --- a/Documentation/lguest/lguest.txt Thu May 03 23:00:19 2007 +1000 +++ b/Documentation/lguest/lguest.txt Fri May 04 10:17:23 2007 +1000 @@ -23,7 +23,10 @@ Developer features: Running Lguest: -- You will need to configure your kernel with the following options: +- Lguest runs the same kernel as guest and host. You can configure + them differently, but usually it's easiest not to. + + You will need to configure your kernel with the following options: CONFIG_HIGHMEM64G=n ("High Memory Support" "64GB")[1] CONFIG_TUN=y/m ("Universal TUN/TAP device driver support") Cheers, Rusty. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: console font limits
Kyle Moffett wrote: > > Actually I think the real problem was that "KD_GRAPHICS" got overloaded > to mean "some userspace program is probably poking at the GPU in very > direct ways possibly including /dev/mem". As such it really isn't safe > at all for the kernel to write stuff to the screen in that situation; > you could turn a panic()+reboot-after-30-secs into an unrecoverable hard > PCI bus lockup. IIRC there were at least a couple chipsets which had > that problem with X. If we can implement enough APIs for X to do all of > its stuff from userspace without iopl() or /dev/mem then we could > probably bring back the option for dumping oopses to screen in > KD_GRAPHICS mode, but otherwise it'll just cause more headaches. > It never meant anything *BUT* that, to the best of my knowledge. That was certainly the original meaning of KD_GRAPHICS. -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch] compiler: introduce __used and __maybe_unused
__used is defined to be __attribute__((unused)) for all pre-3.4 gcc compilers to suppress warnings for unused functions because perhaps they are referenced only in inline assembly. It is defined to be __attribute__((used)) for gcc 3.4 and later so that the code is still emitted for such functions. There was a mistake in the current implementation of __attribute_used__ whereas it would be defined to be __attribute__((used)) incorrectly for gcc 3.3 and later. The unit-at-a-time compilation scheme was only introduced in gcc 3.4 and later versions as specified in http://www.gnu.org/software/gcc/gcc-3.4/changes.html. __maybe_unused is defined to be __attribute__((unused)) for both function and variable use if it could possibly be unreferenced due to the evaluation of preprocessor macros. Function prototypes shall be marked with __maybe_unused if the actual definition of the function is dependant on preprocessor macros. No update to compiler-intel.h is necessary because ICC supports both __attribute__((used)) and __attribute__((unused)) as specified by the gcc manual. __attribute_used__ is deprecated and will be removed once all current code is converted to using __used. Cc: Rusty Russell <[EMAIL PROTECTED]> Cc: Andrian Bunk <[EMAIL PROTECTED]> Signed-off-by: David Rientjes <[EMAIL PROTECTED]> --- include/linux/compiler-gcc.h |1 + include/linux/compiler-gcc3.h | 13 ++--- include/linux/compiler-gcc4.h |3 ++- include/linux/compiler.h | 17 ++--- 4 files changed, 23 insertions(+), 11 deletions(-) diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h --- a/include/linux/compiler-gcc.h +++ b/include/linux/compiler-gcc.h @@ -37,3 +37,4 @@ #define noinline __attribute__((noinline)) #define __attribute_pure__ __attribute__((pure)) #define __attribute_const____attribute__((__const__)) +#define __maybe_unused __attribute__((unused)) diff --git a/include/linux/compiler-gcc3.h b/include/linux/compiler-gcc3.h --- a/include/linux/compiler-gcc3.h +++ b/include/linux/compiler-gcc3.h @@ -3,14 +3,13 @@ /* These definitions are for GCC v3.x. */ #include -#if __GNUC_MINOR__ >= 3 -# define __attribute_used____attribute__((__used__)) -#else -# define __attribute_used____attribute__((__unused__)) -#endif - #if __GNUC_MINOR__ >= 4 -#define __must_check __attribute__((warn_unused_result)) +# define __used__attribute__((__used__)) +# define __attribute_used____used /* deprecated */ +# define __must_check __attribute__((warn_unused_result)) +#else +# define __used__attribute__((__unused__)) +# define __attribute_used____used /* deprecated */ #endif #define __always_inlineinline __attribute__((always_inline)) diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h --- a/include/linux/compiler-gcc4.h +++ b/include/linux/compiler-gcc4.h @@ -12,7 +12,8 @@ # define __inline __inline__attribute__((always_inline)) #endif -#define __attribute_used__ __attribute__((__used__)) +#define __used __attribute__((__used__)) +#define __attribute_used__ __used /* deprecated */ #define __must_check __attribute__((warn_unused_result)) #define __compiler_offsetof(a,b) __builtin_offsetof(a,b) #define __always_inlineinline __attribute__((always_inline)) diff --git a/include/linux/compiler.h b/include/linux/compiler.h --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -108,15 +108,26 @@ extern void __chk_io_ptr(const void __iomem *); * Allow us to avoid 'defined but not used' warnings on functions and data, * as well as force them to be emitted to the assembly file. * - * As of gcc 3.3, static functions that are not marked with attribute((used)) - * may be elided from the assembly file. As of gcc 3.3, static data not so + * As of gcc 3.4, static functions that are not marked with attribute((used)) + * may be elided from the assembly file. As of gcc 3.4, static data not so * marked will not be elided, but this may change in a future gcc version. * * In prior versions of gcc, such functions and data would be emitted, but * would be warned about except with attribute((unused)). + * + * Mark functions that are referenced only in inline assembly as __used so + * the code is emitted even though it appears to be unreferenced. */ #ifndef __attribute_used__ -# define __attribute_used__/* unimplemented */ +# define __attribute_used__/* deprecated */ +#endif + +#ifndef __used +# define __used/* unimplemented */ +#endif + +#ifndef __maybe_unused +# define __maybe_unused/* unimplemented */ #endif /* - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a
Re: [v4l-dvb-maintainer] [PATCH 35/36] Use menuconfig objects II - DVB
Hi, On Thu, 3 May 2007, Sam Ravnborg wrote: > Please include Roman Zippel when you propose kconfig changes. Thanks, the lkml volume lately forces me to skip a lot, so it's quite possible I miss something. :) > > xconfig has the menu tree display in the left panel, where one can see the > > overall layout of the menu tree and jump directly to any menu (even one > > multiple levels deep). All the menuconfigs that used to be menus don't show > > up here anymore. > > > > To turn a menuconfig off, you must go to the top level menu containing the > > menuconfig you want (and you must know which one that is!). Then you have > > to > > drill down through each menu level one by one, by finding that menu in the > > top > > panel (which also has all the config options listed) and clicking on it to > > get > > to the next one. When you get to the menuconfig you want, you must enter it > > and then you finally get the box to turn that menuconfig off. > > > > It looks like your changes are going in, so I suppose the solution is to > > improve the way xconfig handles "menuconfig". I don't quite understand. With the menuconfig changes more menu entries should appear on the left side, so I don't understand why you have to "drill down" to reach it. The rule for menu to appear on the left side is relatively simple - all its parents must be of menu type as well. So if a menuconfig is on the right side it must have a normal config entry as parent. > > I wonder, would it be possible to change the kconfig language so that: > > menuconfig > > boolean "name of menu" > > > > Did the same thing as: > > config > > boolean "name of menu" > > menu "name of menu" > > depends on > > > > This way you could change this: > > > > menuconfig > > if > > [all the other options] > > endif > > > > Into this: > > > > menuconfig > > [all the other options] > > endmenu > > > > The reason is that a frontend would easily be able to understand the > > coupling > > between the "menuconfig " and the "if ". It will make it easier for > > the frontend to see that all the options are inside and controlled by the > > enclosing menuconfig. If the frontend wants to change the behaviour of a menuconfig, it can already do that, so this doesn't require a syntax change. bye, Roman - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.22 -mm merge plans: slub on PowerPC
On Thu, 2007-05-03 at 22:04 +0100, Hugh Dickins wrote: > On Thu, 3 May 2007, Hugh Dickins wrote: > > > > Seems we're all wrong in thinking Christoph's Kconfiggery worked > > as intended: maybe it just works some of the time. I'm not going > > to hazard a guess as to how to fix it up, will resume looking at > > the powerpc's quicklist potential later. > > Here's the patch I've been testing on G5, with 4k and with 64k pages, > with SLAB and with SLUB. But, though it doesn't crash, the pgd > kmem_cache in the 4k-page SLUB case is revealing SLUB's propensity > for using highorder allocations where SLAB would stick to order 0: > under load, exec's mm_init gets page allocation failure on order 4 > - SLUB's calculate_order may need some retuning. (I'd expect it to > be going for order 3 actually, I'm not sure how order 4 comes about.) > > I don't know how offensive Ben and Paulus may find this patch: > the kmem_cache use was nicely done and this messes it up a little. > > > The SLUB allocator relies on struct page fields first_page and slab, > overwritten by ptl when SPLIT_PTLOCK: so the SLUB allocator cannot then > be used for the lowest level of pagetable pages. This was obstructing > SLUB on PowerPC, which uses kmem_caches for its pagetables. So convert > its pte level to use quicklist pages (whereas pmd, pud and 64k-page pgd > want partpages, so continue to use kmem_caches for pmd, pud and pgd). > But to keep up appearances for pgtable_free, we still need PTE_CACHE_NUM. Interesting... I'll have a look asap. Ben. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/8] remove "#if 0" from find_bus function, export it.
On Thu, May 03, 2007 at 04:14:59PM -0700, Greg KH wrote: > On Fri, May 04, 2007 at 01:31:21AM +0400, Anton Vorontsov wrote: > > This function were placed in "#if 0" because nobody was using it. > > We using it now. > > Why? Shouldn't you just export the pointer you need instead? We can do one way or another. We can ask W1 bus maintainer to export bus type. Or we can un-"if 0" generic find_bus/bus_find function. > And if you really want it, and you convince me you really need it, No, I don't want it at all. But ds2760_battery driver need to find w1 bus type. A long time ago in a Galaxy far far away we're used to find_bus function, then it was removed, and somewhere in the thread I gave link to, someone suggested to show real user of that function and post that patch. I've just done that. So, if you're unwilling to revert that function, please say it explicitly, and I'll ping w1 folks to export bus type. I really-really don't care how exactly we should find that bus stuff. > thanks, > > greg k-h Good luck, -- Anton Vorontsov email: [EMAIL PROTECTED] backup email: [EMAIL PROTECTED] irc://irc.freenode.org/bd2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: console font limits
On May 03, 2007, at 16:16:51, Jan Engelhardt wrote: On May 3 2007 13:15, H. Peter Anvin wrote: Jan Engelhardt wrote: Put people didn't like that, and disabled text output when the console is in KD_GRAPHICS mode... at the cost of not getting the kernel oops, heh. I thought the reason we didn't display text in KD_GRAPHICS mode was that KD_GRAPHICS might mean "in a completely different mode that only userspace knows about." Hrm. Maybe we need a distinction into KD_KGRAPHICS and KD_UGRAPHICS then. Actually I think the real problem was that "KD_GRAPHICS" got overloaded to mean "some userspace program is probably poking at the GPU in very direct ways possibly including /dev/mem". As such it really isn't safe at all for the kernel to write stuff to the screen in that situation; you could turn a panic()+reboot-after-30-secs into an unrecoverable hard PCI bus lockup. IIRC there were at least a couple chipsets which had that problem with X. If we can implement enough APIs for X to do all of its stuff from userspace without iopl () or /dev/mem then we could probably bring back the option for dumping oopses to screen in KD_GRAPHICS mode, but otherwise it'll just cause more headaches. Cheers, Kyle Moffett - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] m68knommu: use generic irq framework
Change the m68knommu irq handling to use the generic irq framework. Signed-off-by: Greg Ungerer <[EMAIL PROTECTED]> --- arch/m68knommu/Kconfig |4 arch/m68knommu/kernel/Makefile |4 arch/m68knommu/kernel/asm-offsets.c|5 arch/m68knommu/kernel/irq.c| 82 + arch/m68knommu/kernel/setup.c |6 arch/m68knommu/kernel/traps.c |2 arch/m68knommu/platform/5307/Makefile |2 arch/m68knommu/platform/5307/entry.S | 39 +--- arch/m68knommu/platform/5307/ints.c| 279 - arch/m68knommu/platform/5307/vectors.c | 29 ++- arch/m68knommu/platform/68328/entry.S | 10 - arch/m68knommu/platform/68328/ints.c | 130 ++- arch/m68knommu/platform/68360/entry.S |6 arch/m68knommu/platform/68360/ints.c | 233 +-- include/asm-m68knommu/irq.h| 75 include/asm-m68knommu/irqnode.h| 36 include/asm-m68knommu/m68360.h |8 include/asm-m68knommu/machdep.h| 10 - include/asm-m68knommu/traps.h |4 19 files changed, 171 insertions(+), 793 deletions(-) diff -Naur linux-2.6.21/arch/m68knommu/Kconfig linux-2.6.21-gt/arch/m68knommu/Kconfig --- linux-2.6.21/arch/m68knommu/Kconfig 2007-04-26 13:08:32.0 +1000 +++ linux-2.6.21-gt/arch/m68knommu/Kconfig 2007-05-04 00:20:43.0 +1000 @@ -45,6 +45,10 @@ bool default y +config GENERIC_HARDIRQS + bool + default y + config GENERIC_CALIBRATE_DELAY bool default y diff -Naur linux-2.6.21/arch/m68knommu/kernel/asm-offsets.c linux-2.6.21-gt/arch/m68knommu/kernel/asm-offsets.c --- linux-2.6.21/arch/m68knommu/kernel/asm-offsets.c2007-04-26 13:08:32.0 +1000 +++ linux-2.6.21-gt/arch/m68knommu/kernel/asm-offsets.c 2007-05-04 00:20:44.0 +1000 @@ -15,7 +15,6 @@ #include #include #include -#include #include #define DEFINE(sym, val) \ @@ -72,10 +71,6 @@ #else /* bitfields are a bit difficult */ DEFINE(PT_VECTOR, offsetof(struct pt_regs, pc) + 4); - /* offsets into the irq_handler struct */ - DEFINE(IRQ_HANDLER, offsetof(struct irq_node, handler)); - DEFINE(IRQ_DEVID, offsetof(struct irq_node, dev_id)); - DEFINE(IRQ_NEXT, offsetof(struct irq_node, next)); #endif /* offsets into the kernel_stat struct */ diff -Naur linux-2.6.21/arch/m68knommu/kernel/irq.c linux-2.6.21-gt/arch/m68knommu/kernel/irq.c --- linux-2.6.21/arch/m68knommu/kernel/irq.c1970-01-01 10:00:00.0 +1000 +++ linux-2.6.21-gt/arch/m68knommu/kernel/irq.c 2007-05-04 00:20:44.0 +1000 @@ -0,0 +1,82 @@ +/* + * arch/m68knommu/kernel/irq.c + * + * (C) Copyright 2007, Greg Ungerer <[EMAIL PROTECTED]> + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file COPYING in the main directory of this archive + * for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +asmlinkage void do_IRQ(int irq, struct pt_regs *regs) +{ + struct pt_regs *oldregs = set_irq_regs(regs); + + irq_enter(); + __do_IRQ(irq); + irq_exit(); + + set_irq_regs(oldregs); +} + +void ack_bad_irq(unsigned int irq) +{ + printk("IRQ: unexpected irq=%d\n", irq); +} + +static struct irq_chip m_irq_chip = { + .name = "M68K-INTC", + .enable = enable_vector, + .disable= disable_vector, + .ack= ack_vector, +}; + +void __init init_IRQ(void) +{ + int irq; + + init_vectors(); + + for (irq = 0; (irq < NR_IRQS); irq++) { + irq_desc[irq].status = IRQ_DISABLED; + irq_desc[irq].action = NULL; + irq_desc[irq].depth = 1; + irq_desc[irq].chip = &m_irq_chip; + } +} + +int show_interrupts(struct seq_file *p, void *v) +{ + struct irqaction *ap; + int irq = *((loff_t *) v); + + if (irq == 0) + seq_puts(p, " CPU0\n"); + + if (irq < NR_IRQS) { + ap = irq_desc[irq].action; + if (ap) { + seq_printf(p, "%3d: ", irq); + seq_printf(p, "%10u ", kstat_irqs(irq)); + seq_printf(p, "%14s ", irq_desc[irq].chip->name); + + seq_printf(p, "%s", ap->name); + for (ap = ap->next; ap; ap = ap->next) + seq_printf(p, ", %s", ap->name); + seq_putc(p, '\n'); + } + } + + return 0; +} + diff -Naur linux-2.6.21/arch/m68knommu/kernel/Makefile linux-2.6.21-gt/arch/m68knommu/kernel/Makefile --- linux-2.6.21/arch/m68knommu/kernel/Makefile 2007-04-26 13:08:32.0 +1000 +++ linux-2.6.21-gt/arch/m68knommu/kernel/Makefile 2007-05-04 00:20:43.0 +10
Re: [RELEASE] Lguest for 2.6.21
On Thu, 2007-05-03 at 11:02 -0500, Matt Mackall wrote: > On Thu, May 03, 2007 at 12:43:48AM +1000, Rusty Russell wrote: > > http://lguest.ozlabs.org/lguest-2.6.21-254.patch.gz > > > > See Documentation/lguest/lguest.txt for how to run, > > drivers/lguest/README for the draft code documentation journey. > > Your lguest readme is quite lacking in the area of how to configure a > guest kernel as opposed to the host kernel. More hand-holding, please. Hi Matt! Ah, that's because they are the same kernel. Turning on CONFIG_LGUEST builds-in the parts needed to be a guest as well. Thanks for pointing out that weakness. I will modify lguest.txt to make that clear. Cheers, Rusty. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Routing 600+ vlan's via linux problems (looks like arp problems)
On Fri, May 04, 2007 at 12:50:17AM +0200, Jan Engelhardt wrote: > > On May 4 2007 00:23, Willy Tarreau wrote: > > > >> This setup will only run for about 1-2 hours while we fix the hardware > >> router (it is running now, but only on a backup flash card solution. > >> the harddrive in it died ;) > > > >Huhhh! Please tell us exactly what make and model of ROUTER you are using > >which embeds a HARD DRIVE, so that we recall never to buy that ! Having > >seen uptimes of 5 years on moderately big access routers, I would have > >find it awful to see them die multiple times in that timeframe because > >of a crappy IDE drive inside ! > > Haha. Would you be happy if it ran on a CF card instead? :> Yes, because at least when you design a system to run on a CF card, you ensure never to write on it because you know that would kill it. Then since you never write on it, it does not wear out and has no problem running for years (unless you bought cheap end-user CF of course). But industrial-grade CF *is* reliable for such usages. People having problems with CF are dumb asses who install a full standard system on those (sometimes even with swap) then complain it dies after one year. A hard disk simply fails after some time even if you never use it at all. A head flying 10 microns above a platter passing at 33 m/s obviously likes to caress it sometimes, with a polite "oops sorry" excuse that you hear meters away. That's a pretty bad design to put such a SPOF in some equipment which IMHO has no real justification for embedding one, really. Cheers, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
ExpressCard hotswap support?
I've got a Thinkpad Z60m with an ExpressCard slot, and I got a Belkin F5U250 GigE ExpressCard (Marvell 88E8053 chip using sky2 driver). It appears that Linux only recognizes it if I insert the card with the system powered off. If I hot-insert the card, nothing happens (no messages logged, no PCI device shows up, nothing). Does Linux support hotswapping ExpressCards? This is with Fedora Core 6 with all updates, kernel 2.6.20-1.2948.fc6. -- Chris Adams <[EMAIL PROTECTED]> Systems and Network Administrator - HiWAAY Internet Services I don't speak for anybody but myself - that's enough trouble. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-dvb] DST/BT878 module customization (.. was: Critical points about ...)
Original-Nachricht Datum: Fri, 4 May 2007 00:06:51 +0200 Von: "Markus Rechberger" <[EMAIL PROTECTED]> An: "Manu Abraham" <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED], linux-kernel@vger.kernel.org Betreff: Re: [linux-dvb] DST/BT878 module customization (.. was: Critical points about ...) > On 5/3/07, Manu Abraham <[EMAIL PROTECTED]> wrote: > > Markus Rechberger wrote: > > > On 5/3/07, Manu Abraham <[EMAIL PROTECTED]> wrote: > > >> Mauro Carvalho Chehab wrote: > > >> > Enough. Let's stop arguing non technical issues. > > >> > > > >> > If either one of you have any technical argue against the Trent's > > >> > patches, please point where the fix is wrong. Otherwise, if you > wish, > > >> > you may send an acked-by agreeing with the fix. > > >> > > > >> > > >> Why don't you stop this childish behaviour ? > > >> > > >> After explaining to you the reasons in the previous mail: > > >> being the author and maintainer of dst/dst_ca and maintainer of > > >> dvb-bt8xx, i NACK this change > > >> > > >> (1) You aren't DVB maintainer > > > > > > I've seen that too often already, now we could point to a mail someone > > > sent to Uwe regarding maintainership. > > > > > > FYI, I have never written to Uwe regarding any sort of maintainership. > > You seem to be quite up with an overdose of drugs > > > > I mean the mail from Helge Hafting (thread [linux-dvb] Critical > points about kernel 2.6.21 and pseudo-authorities) at the very first > beginning. > > > From 2005/09/13 - 2007/05/03 (till date) there have been 15 mails from > > my side to Uwe, none of which has a topic whatsoever you say. Only the > > first mail was a private mail and that is CC'd to Johannes as well. > > > > Firstly you seem to play politics by getting Uwe to flame me, then when > > it backfired, you are trying to play tricks with the rest of the > > community as well, by spreading nonsense statements. > > > > I sent several comments to Uwe to stop flaming, Trent was in the CC > sometimes I never wrote that he should flame on anyone. > I can simply forward you all mails I sent to Uwe there's not one bad mail. > > My point is moreover to get that issue sorted out by either accepting > his "proposal" or stating out why not to add it (and there must be a > reason behind it, and no mail which is 2 years old, or explaining what > the device is, again it got explained what's required from you) > > seems like your response is based on that misunderstood sentence, > sorry for not beeing clear enough. > > Markus Hi Markus, fine chap, Please cool down... I guess I understood Manu's response: a. He just changed his priorities to pick up an old project that seemed to have died, but did not die at all - this project is called cx878 project, and it is the most radical approach that I ever have seen - trying to make all BT8xx drivers independent from bttv, which is not horrible, but only consequent, necessary, and good and fine. Please see my previous mails on that issue. Just read the ML to get the appropriate link and please get yourself in it to help developping it. I swear it is the right path, although I am still missing the avoidance of dvb-pll.c. A closer look into that module will quite easily tell you that there aren't any BT8xx based PCI cards needing that module except the ones needing the lgdt330x frontend driver, which is maintained by Mike Krufky. So for all other cards treated by the dvb-bt8xx backend this module is nothing but heavily obsolete and nonsense, if not to say: RAM-Wasting. b. In so far, Manu's statements do not base on any mail that is 2 years old, but he simply changed his mind, after it was necessarily me personally to build up "the golden bridge" for him, Mike and others as well. c. I am deeply thankful for your diplomatic behaviour involving Trent, as this brought up Manu to react in the end instead of crawling back into his snail house. d. But please let us establish peace among each other now, because without peace we will not be able to continue the whole thing... Hi Trent, I want to thank you for all your efforts - as they at least work for my deep satisfaction, but they may not work for other people as well for simply technical reasons (example: treating dst and dst_ca as one simple case does no good at all, does it?), but our primadonna Manuel Abraham simply follows another far more radical path - to get the whole thing independent from bttv, which is the RIGHT path. Your invested energies weren't wasted at all, but they only approach "plan a" while "plan b" goes much more further than "plan a." It is as simple as that. And, as I stated already, I am open for both plans - and if the more radical one gains more mercy I will not disagree, but simply follow it and trying my best to improve it. Hi Mauro, I would deeply appreciate you to pull my "proposal" for the Kconfig in the frontends section as at least the semantic problem gotta be resolved (SPO instead of SO - whoe
Re: [PATCH 2/2] revoke: change revoke_table to fileset and revoke_details
On Thu, 3 May 2007 23:32:28 +0300 (EEST) Pekka J Enberg <[EMAIL PROTECTED]> wrote: > On Thu, 3 May 2007, Andrew Morton wrote: > > > +/** > > > + * fileset - an array of file pointers. > > > + * @files:the array of file pointers > > > + * @nr: number of elements in the array > > > + * @end: index to next unused file pointer > > > + */ > > > +struct fileset { > > > + struct file **files; > > > + unsigned long nr; > > > + unsigned long end; > > > +}; > > > > What's the locking protocol for all this? > > What do you mean? There is no concurrent access going on here. Well that's the "locking" protocol then: each instance of this structure is only ever touched by a single thread, yes? > On Thu, 3 May 2007, Andrew Morton wrote: > > > +static void free_fset(struct fileset *fset) > > > +{ > > > + int i; > > > + > > > + for (i = fset->end; i < fset->nr; i++) > > > + fput(fset->files[i]); > > > + > > > + kfree(fset->files); > > > + kfree(fset); > > > +} > > > > Confused. Shouldn't it be > > > > for (i = 0; i < fset->end; i++) > > No. The fset->end is an index to the first _unused_ file pointer. All > entries before that are in use by revoked file descriptors so we don't > want to fput() them. > OK. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [POWERPC] 8xx: mpc885ads pcmcia support
On Friday 04 May 2007, Vitaly Bordug wrote: > Adds support for PowerQuicc on-chip PCMCIA. The driver is implemented as > of_device, so only arch/powerpc stuff is capable to use it, which now > implies only mpc885ads reference board. > > To cope with the code that should be hooked inside driver, but is really > board specific (like set_voltage), global structure mpc8xx_pcmcia_ops > holds necessary function pointers that are filled in the BSP code. > > Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]> Acked-by: Arnd Bergmann <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/8] Universal power supply class (was: battery class)
On Thu, May 03, 2007 at 03:53:46PM -0700, Greg KH wrote: > On Fri, May 04, 2007 at 01:31:39AM +0400, Anton Vorontsov wrote: > > This class is result of "external power" and "battery" classes merge, > > as suggested by David Woodhouse. He also implemented uevent support. > > > > Here how userspace seeing it now: > > > > # ls /sys/class/power\ supply/ > > ac main-battery usb > > Please don't put a space in a class name. Yes, we can do it, but some > scripts will bomb. If you look all of the other classes use a '_' > instead. Ack. power_supply would be okay? Or power-supply better? > > # cat /sys/class/power\ supply/ > > ac/ main-battery/ usb/ > > Um, shouldn't that be an error? Isn't /sys/class/power\ supply/ a > directory? Actually that class does not work, I just faking output and lying it works. ;-) Just kidding.. It's just shell completion. > > # cat /sys/class/power\ supply/ac/type > > AC > > > > # cat /sys/class/power\ supply/usb/type > > USB > > > > # cat /sys/class/power\ supply/main-battery/type > > Battery > > > > # cat /sys/class/power\ supply/ac/online > > 1 > > > > # cat /sys/class/power\ supply/usb/online > > 0 > > I don't really understand, is the 'usb' and 'ac' directories really a > 'struct device' here? Shouldn't there be some symlinks around here? > > Can you do a 'tree /sys/class/power\ supply/' and show me the output? # tree -bash: tree: command not found # ls -al /sys/class/power\ supply/ total 0 drwxr-xr-x 2 root root 0 Jan 1 00:01 . drwxr-xr-x 20 root root 0 Jan 1 00:01 .. lrwxrwxrwx 1 root root 0 Jan 1 00:01 ac -> ../../devices/platform/pda-power/ac lrwxrwxrwx 1 root root 0 Jan 1 00:01 usb -> ../../devices/platform/pda-power/usb # ls -al /sys/class/power\ supply/ac/ total 0 drwxr-xr-x 3 root root0 Jan 1 00:01 . drwxr-xr-x 5 root root0 Jan 1 00:01 .. -r--r--r-- 1 root root 4096 Jan 1 00:01 online drwxr-xr-x 2 root root0 Jan 1 00:01 power lrwxrwxrwx 1 root root0 Jan 1 00:01 subsystem -> ../../../../class/power supply -r--r--r-- 1 root root 4096 Jan 1 00:01 type --w--- 1 root root 4096 Jan 1 00:01 uevent > > > > # cat /sys/class/power\ supply/main-battery/status > > Charging > > > > # cat /sys/class/leds/h5400\:red-left/trigger > > none h5400-radio timer hwtimer ac-online usb-online > > main-battery-charging-or-full [main-battery-charging] > > main-battery-full > > Huh? What does the led have to do with the battery? Have you read Documentation/power_supply_class.txt? Quoting: "It also integrates with LED framework, for the purpose of providing typically expected feedback of battery charging/fully charged status and AC/USB power supply online status. (Note that specific details of the indication (including whether to use it at all) are fully controllable by user and/or specific machine defaults, per design principles of LED framework)." So, PDA/phones using LEDs to provide feedback of battery charging status. You put PDA into cradle, and LED1 starts to flash... when battery fully charged, LED1 offs, and another LED2 (with different color) starts flashing. > > diff --git a/drivers/power/Makefile b/drivers/power/Makefile > > new file mode 100644 > > index 000..95085ba > > --- /dev/null > > +++ b/drivers/power/Makefile > > @@ -0,0 +1,15 @@ > > +power_supply-objs := power_supply_core.o > > + > > +ifeq ($(CONFIG_SYSFS),y) > > +power_supply-objs += power_supply_sysfs.o > > +endif > > Why would this work at all without sysfs? I don't know, because it can? I didn't tested w/o sysfs, though. But sysfs is just one of interfaces power supply class using to "export" power supply information to the user-space. apm_power is another. And who knows what new intefaces we'll see later. > > + > > +static int __init power_supply_class_init(void) > > +{ > > + power_supply_class = class_create(THIS_MODULE, "power supply"); > > Please use "power_supply" instead as mentioned above. Ack again. > > --- /dev/null > > +++ b/drivers/power/power_supply_sysfs.c > > @@ -0,0 +1,254 @@ > > +/* > > + * Sysfs interface for the universal power supply monitor class > > + * > > + * Copyright ?? 2007 David Woodhouse <[EMAIL PROTECTED]> > > What's with the ?? :-) It's because my locale is utf8 unaware, and mutt destroyed (c) symbol. > > + * Copyright (c) 2007 Anton Vorontsov <[EMAIL PROTECTED]> > > + * Copyright (c) 2004 Szabolcs Gyurko > > + * Copyright (c) 2003 Ian Molton <[EMAIL PROTECTED]> > > + * > > + * Modified: 2004, Oct Szabolcs Gyurko > > + * > > + * You may use this code as per GPL version 2 > > + */ > > + > > +#include > > + > > +/* > > + * This is because the name "current" breaks the device attr macro. > > + * The "current" word resolvs to "(get_current())" so instead
[PATCH] [POWERPC] 8xx: mpc885ads pcmcia support
Adds support for PowerQuicc on-chip PCMCIA. The driver is implemented as of_device, so only arch/powerpc stuff is capable to use it, which now implies only mpc885ads reference board. To cope with the code that should be hooked inside driver, but is really board specific (like set_voltage), global structure mpc8xx_pcmcia_ops holds necessary function pointers that are filled in the BSP code. Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]> --- arch/powerpc/boot/dts/mpc885ads.dts | 12 + arch/powerpc/platforms/8xx/m8xx_setup.c |5 arch/powerpc/platforms/8xx/mpc885ads.h |5 arch/powerpc/platforms/8xx/mpc885ads_setup.c | 77 ++ arch/powerpc/sysdev/fsl_soc.c| 12 + drivers/pcmcia/Kconfig |1 drivers/pcmcia/m8xx_pcmcia.c | 352 -- include/linux/fsl_devices.h |5 8 files changed, 279 insertions(+), 190 deletions(-) diff --git a/arch/powerpc/boot/dts/mpc885ads.dts b/arch/powerpc/boot/dts/mpc885ads.dts index 110bf61..56a9f6a 100644 --- a/arch/powerpc/boot/dts/mpc885ads.dts +++ b/arch/powerpc/boot/dts/mpc885ads.dts @@ -112,6 +112,18 @@ compatible = "CPM"; }; + [EMAIL PROTECTED] { + linux,phandle = <0080>; + #interrupt-cells = <1>; + #size-cells = <2>; + compatible = "fsl,pq-pcmcia"; + device_type = "pcmcia"; + reg = <80 80>; + clock-frequency = <2faf080>; + interrupt-parent = ; + interrupts = ; + }; + [EMAIL PROTECTED] { linux,phandle = ; #address-cells = <1>; diff --git a/arch/powerpc/platforms/8xx/m8xx_setup.c b/arch/powerpc/platforms/8xx/m8xx_setup.c index 0901dba..f169355 100644 --- a/arch/powerpc/platforms/8xx/m8xx_setup.c +++ b/arch/powerpc/platforms/8xx/m8xx_setup.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -49,6 +50,10 @@ #include "sysdev/mpc8xx_pic.h" +#ifdef CONFIG_PCMCIA_M8XX +struct mpc8xx_pcmcia_ops m8xx_pcmcia_ops; +#endif + void m8xx_calibrate_decr(void); extern void m8xx_wdt_handler_install(bd_t *bp); extern int cpm_pic_init(void); diff --git a/arch/powerpc/platforms/8xx/mpc885ads.h b/arch/powerpc/platforms/8xx/mpc885ads.h index 7c31aec..4439346 100644 --- a/arch/powerpc/platforms/8xx/mpc885ads.h +++ b/arch/powerpc/platforms/8xx/mpc885ads.h @@ -91,5 +91,10 @@ #define SICR_ENET_MASK ((uint)0x00ff) #define SICR_ENET_CLKRT((uint)0x002c) +/* Some internal interrupt registers use an 8-bit mask for the interrupt + * level instead of a number. + */ +#define mk_int_int_mask(IL) (1 << (7 - (IL/2))) + #endif /* __ASM_MPC885ADS_H__ */ #endif /* __KERNEL__ */ diff --git a/arch/powerpc/platforms/8xx/mpc885ads_setup.c b/arch/powerpc/platforms/8xx/mpc885ads_setup.c index a57b577..a339026 100644 --- a/arch/powerpc/platforms/8xx/mpc885ads_setup.c +++ b/arch/powerpc/platforms/8xx/mpc885ads_setup.c @@ -22,6 +22,7 @@ #include #include +#include #include #include @@ -51,6 +52,12 @@ static void init_smc1_uart_ioports(struct fs_uart_platform_info* fpi); static void init_smc2_uart_ioports(struct fs_uart_platform_info* fpi); static void init_scc3_ioports(struct fs_platform_info* ptr); +#ifdef CONFIG_PCMCIA_M8XX +extern struct mpc8xx_pcmcia_ops m8xx_pcmcia_ops; +static void pcmcia_hw_setup(int slot, int enable); +static int pcmcia_set_voltage(int slot, int vcc, int vpp); +#endif + void __init mpc885ads_board_setup(void) { cpm8xx_t *cp; @@ -115,6 +122,12 @@ void __init mpc885ads_board_setup(void) immr_unmap(io_port); #endif + +#ifdef CONFIG_PCMCIA_M8XX + /*Set up board specific hook-ups*/ + m8xx_pcmcia_ops.hw_ctrl = pcmcia_hw_setup; + m8xx_pcmcia_ops.voltage_set = pcmcia_set_voltage; +#endif } @@ -322,6 +335,70 @@ void init_smc_ioports(struct fs_uart_platform_info *data) } } +#ifdef CONFIG_PCMCIA_M8XX +static void pcmcia_hw_setup(int slot, int enable) +{ + unsigned *bcsr_io; + + bcsr_io = ioremap(BCSR1, sizeof(unsigned long)); + if (enable) + clrbits32(bcsr_io, BCSR1_PCCEN); + else + setbits32(bcsr_io, BCSR1_PCCEN); + + iounmap(bcsr_io); +} + +static int pcmcia_set_voltage(int slot, int vcc, int vpp) +{ +u32 reg = 0; +unsigned *bcsr_io; + +bcsr_io = ioremap(BCSR1, sizeof(unsigned long)); + +switch(vcc) { +case 0: +break; +case 33: +reg |= BCSR1_PCCVCC0; +break; +case 50: +reg |= BCSR1_PCCVCC1; +
Re: [patch] export hrtimer_forward
On Thu, 03 May 2007 23:10:02 +0400 Stas Sergeev <[EMAIL PROTECTED]> wrote: > Hello. > > Peter Zijlstra wrote: > >> It seems hrtimer_forward was forgotten to > >> export - other symbols of the hrtimers API > > Are there actual in-tree users of this symbol? Without we usually leave > > the symbol unexported, this saves some space. > Do you mean it was really left intentional? > Unbeleivable! But why the other parts of a > hrtimer API are exported nevertheless, and > only this particular function not? It was probably an oversight - generally we take the position that all the formal interface of a subsystem is exported to modules rather than a piecemeal whichever-bits-kernel.org-happens-to-use-today approach. Thomas, is hrtimer_forward() considered part of the hrtimer public API? And are you OK with the patch? > As for the users - I am porting my pcsp driver to > it and I need that function. > It is not exactly in-tree stuff, but it was > in an ALSA tree for years already, so it is a > close one. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)
On Thu, 03 May 2007 21:38:10 +0400 Alex Tomas <[EMAIL PROTECTED]> wrote: > Andrew Morton wrote: > > We can make great improvements here, and I've (twice) previously decribed > > how: hoist the entire ordered-mode data handling out of ext3, and out of > > the buffer_head layer and move it up into the VFS pagecache layer. > > Basically, do ordered-data with a commit-time inode walk, calling > > do_sync_mapping_range(). > > > > Do it in the VFS. Make reiserfs use it, remove reiserfs ordered-mode too. > > Make XFS use it, fix the hey-my-files-are-all-full-of-zeroes problem there. > > I'm not sure it's that easy. > > if we move to pages, then we have to mark pages to be flushed holding > transaction open. now take delayed allocation into account: we need > to allocate number of blocks at once and then mark all pages mapped, > again within context of the same transaction. Yes, there can be issues with needing to allocate journal space within the context of a commit. But a) If the page has newly allocated space on disk then the metadata which refers to that page is already in the journal: no new journal space needed. b) If the page doesn't have space allocated on disk then we don't need to write it out at ordered-mode commit time, because the post-recovery filesystem will not have any references to that page. c) If the page is dirty due to overwrite then no metadata update was required. IOW, under what circumstances would an ordered-mode commit need to allocate space for a delayed-allocate page? However b) might lead to the hey-my-file-is-full-of-zeroes problem. > so, an implementation > would look like the following? > > generic_writepages() { > /* collect set of contig. dirty pages */ > foo_get_blocks() { > foo_journal_start(); > foo_new_blocks(); > foo_attach_blocks_to_inode(); > generic_mark_pages_mapped(); > foo_journal_stop(); > } > } > > another question is will it scale well given number of dirty inodes > can be much larger than number of inodes with dirty mapped blocks > (in delayed allocation case, for example) ? Possibly - zillions of dirty-for-atime inodes might get in the way. A short-term fix would be to create a separate dirty-inode list on the superblock (ug). A long-term fix is to rip all the per-superblock dirty-inode lists and use a radix-tree. Not for lookup purposes, but for the tree's ability to do tagged and restartable searches. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] input: fix aux port detection with some i8042 chips
From: Roland Scheidegger <[EMAIL PROTECTED]> The i8042 driver fails detection of the AUX port with some chips, because they apparently do not change the I8042_CTR_AUXDIS bit immediately. This is known to affect at least HP500 / HP510 notebooks, consequently the built-in touchpad will not work. The patch will simply reread the value until it gets the expected value or a retry limit is hit, without touching other workaround code in the same area. Signed-off-by: Roland Scheidegger <[EMAIL PROTECTED]> --- There is some discussion about non-working touchpads in HP500 notebooks in ubuntu and a (ugly) workaround for this problem here: http://ubuntuforums.org/showthread.php?t=344103. I've got a HP510 and even with 2.6.21 the aux port would get disabled. Works with the patch, for the record the i8042 here needs around 6 tries (sometimes a bit more, sometimes less) until it reads the I8042_CTR_AUXDIS bit correctly, both after disabling and enabling the aux port. (please CC: on any replies) Signed-off-by: Roland Scheidegger <[EMAIL PROTECTED]> --- linux-2.6/drivers/input/serio/i8042.c.orig 2007-05-03 16:32:26.0 +0200 +++ linux-2.6/drivers/input/serio/i8042.c 2007-05-03 16:56:00.0 +0200 @@ -537,6 +537,7 @@ static int __devinit i8042_check_aux(voi int retval = -1; int irq_registered = 0; int aux_loop_broken = 0; + int i = 0; unsigned long flags; unsigned char param; @@ -582,14 +583,27 @@ static int __devinit i8042_check_aux(voi if (i8042_command(¶m, I8042_CMD_AUX_DISABLE)) return -1; - if (i8042_command(¶m, I8042_CMD_CTL_RCTR) || (~param & I8042_CTR_AUXDIS)) { + /* some chips need some time to set the I8042_CTR_AUXDIS bit */ + for (i = 0; i < 100; i++) { + if (!i8042_command(¶m, I8042_CMD_CTL_RCTR) && (param & I8042_CTR_AUXDIS)) + break; + udelay(50); + } + if (i == 100) { printk(KERN_WARNING "Failed to disable AUX port, but continuing anyway... Is this a SiS?\n"); printk(KERN_WARNING "If AUX port is really absent please use the 'i8042.noaux' option.\n"); } if (i8042_command(¶m, I8042_CMD_AUX_ENABLE)) return -1; - if (i8042_command(¶m, I8042_CMD_CTL_RCTR) || (param & I8042_CTR_AUXDIS)) + for (i = 0; i < 100; i++) { + if (i8042_command(¶m, I8042_CMD_CTL_RCTR)) + return -1; + if (~param & I8042_CTR_AUXDIS) + break; + udelay(50); + } + if (i == 100) return -1; /* - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Linux 2.6.16.50
Security fixes since 2.6.16.49: - CVE-2007-1861: [NETLINK]: Infinite recursion in netlink - CVE-2007-2242: [IPV6]: Disallow RH0 by default Location: ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ git tree: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git RSS feed of the git tree: http://www.kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.16.y.git;a=rss Changes since 2.6.16.49: Adrian Bunk (2): Linux 2.6.16.50-rc1 Linux 2.6.16.50 Al Viro (1): mca_nmi_hook() can be called at any point Alexey Kuznetsov (1): [NETLINK]: Infinite recursion in netlink (CVE-2007-1861) Guennadi Liakhovetski (1): IrDA: irttp_dup spin_lock initialisation Jeet Chaudhuri (1): IrDA: Incorrect TTP header reservation Jiri Slaby (1): Char: icom, mark __init as __devinit Shaohua Li (1): x86 microcode: don't check the size YOSHIFUJI Hideaki (1): [IPV6]: Disallow RH0 by default (CVE-2007-2242) Zach Brown (1): aio: remove bare user-triggerable error printk Documentation/networking/ip-sysctl.txt |9 ++ Makefile |2 - arch/i386/kernel/microcode.c |9 +- arch/i386/mach-default/setup.c |2 - drivers/serial/icom.c |4 +- fs/aio.c |1 include/linux/ipv6.h |9 ++ include/linux/sysctl.h |1 net/ipv4/fib_frontend.c| 12 +++- net/ipv6/addrconf.c| 11 +++ net/ipv6/exthdrs.c | 37 - net/irda/irttp.c |5 ++- 12 files changed, 80 insertions(+), 22 deletions(-) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression with SLUB on Netperf and Volanomark
Christoph Lameter wrote: > Try to boot with > > slub_max_order=4 slub_min_objects=8 > > If that does not help increase slub_min_objects to 16. > We are still seeing a 5% regression on TCP streaming with slub_min_objects set at 16 and a 10% regression for Volanomark, after increasing slub_min_objects to 16 and setting slub_max_order=4 and using the 2.6.21-rc7-mm2 kernel. The performance between slub_min_objects=8 and 16 are similar. >> We found that for Netperf's TCP streaming tests in a loop back mode, >> the TCP streaming performance is about 7% worse when SLUB is enabled >> on >> 2.6.21-rc7-mm1 kernel (x86_64). This test have a lot of sk_buff >> allocation/deallocation. > > 2.6.21-rc7-mm2 contains some performance fixes that may or may not be > useful to you. We've switched to 2.6.21-rc7-mm2 in our tests now. >> >> For Volanomark, the performance is 7% worse for Woodcrest and 12% >> worse for Clovertown. > > SLUBs "queueing" is restricted to the number of objects that fit in > page order slab. SLAB can queue more objects since it has true queues. > Increasing the page size that SLUB uses may fix the problem but then > we run into higher page order issues. > > Check slabinfo output for the network slabs and see what order is > used. The number of objects per slab is important for performance. The order used is 0 for the buffer_head, which is the most used object. I think they are 104 bytes per object. Tim - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: regression on quad Xeon: no SCSI-disks
On Thu, May 03, 2007 at 10:41:41AM +0200, Wolfgang Erig wrote: > I am prepared to do tweaks to your small patch, but I need your help. > My own blindly experiments failed miserably. I don't think that patch did anything wrong, most likely it just triggered a bug elsewhere. These two lines from your dmesg look very suspicious: > PCI: Cannot allocate resource region 0 of device :00:04.0 > PCI: Error while updating region :00:04.0/0 (a8008000 != fec08000) Note that the BAR seems to have high address bits hardwired to fec0. And device :00:04.0 is > 00:04.0 System peripheral: Siemens Nixdorf AG FSC Multiprocessor Interrupt > Controller (rev 02) I'd guess that when we try to reassign this resource, PCI interrupts might just stop working. This could explain SCSI timeouts and other weird things. Maybe this patch helps? Ivan. --- 2.6.21/arch/i386/pci/fixup.c2007-02-04 21:44:54.0 +0300 +++ linux/arch/i386/pci/fixup.c 2007-05-04 01:58:32.629654275 +0400 @@ -436,3 +436,14 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_CY pci_early_fixup_cyrix_5530); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_CYRIX, PCI_DEVICE_ID_CYRIX_5530_LEGACY, pci_early_fixup_cyrix_5530); + +/* + * Siemens Nixdorf AG FSC Multiprocessor Interrupt Controller: + * prevent update of the BAR0, which doesn't look like a normal BAR. + */ +static void __devinit pci_siemens_interrupt_controller(struct pci_dev *dev) +{ + dev->resource[0].flags |= IORESOURCE_PCI_FIXED; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SIEMENS, 0x0015, + pci_siemens_interrupt_controller); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/8] remove "#if 0" from find_bus function, export it.
On Fri, May 04, 2007 at 01:31:21AM +0400, Anton Vorontsov wrote: > This function were placed in "#if 0" because nobody was using it. > We using it now. Why? Shouldn't you just export the pointer you need instead? > See http://lwn.net/Articles/210610/ I don't understand the need for this link, it talks about how the api changes all the time, something we all know :) And if you really want it, and you convince me you really need it, can you change it to be "bus_find" to play nicer in the namespace? thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] New firewire stack
> Jonathan Woithe wrote: > >> Olaf Hering wrote: > >>> NACK. > >>> Upgrade the current drivers/ieee1394/ with the new code, and keep all > >>> existing module names. > [...] > > However, as a compromise how about renaming the existing stack's modules and > > then reusing the existing names for the new stack? Messy I know, but this > > way both stacks would still be available without recompilation for those who > > needed them and the sbp2-as-root dilemma raised by Olaf would also be > > covered. > > I.e. new modules: > ieee1394 (was fw-core) > ohci1394 (was fw-ohci) > : > old modules, for example: > ieee1394-old > ohci1394-old > : > > Looks... weird. > > On the other hand, a 1394 module compilation cycle in order to do the > fallback is not such a huge issue, except that it requires the person to > be able to compile modules. That's probably the main issue. True on all counts. I guess it's a question of whether the lack of an easy fallback path will significantly reduce the number of testers. I don't have enough of a feel to answer that. > eth1394 (to be done) --- but that's a bad name anyway, it > implements IP over 1394, not Ethernet So, when eth1394 is ported the name should be something like fw-ip, at least if we are to remain consistent with the other 3 module names. > > Oh yes, it would be nice to have working PCILynx support again (although I > > acknowledge it's unlikely to happen). Some of us do have these cards > > installed for sniffing purposes (using nosy) but it would be nice to be able > > to use them with libraw1394 as well. It would for example save me having to > > swap cards depending on what I needed to do (I have insufficient PCI slots > > to have both the PCILynx and OHCI cards installed simultaneously). > > But then, what is the actual utility of pcilynx? (I mean the current > driver, not the card or a future driver.) Last time I checked, sbp2 was > broken without OHCI's physical DMA, and AFAIK raw1394's newer iso API > and video1394 and dv1394 don't work with pcilynx either. It certainly doesn't support the raw1394 API so its current usefulness is extremely limited. > Porting pcilynx to the new low-level API would be quite resource > demanding --- seen in relation to which resources we have, what the > existing pcilynx driver's state of affairs is, and how rare the hardware > is. (For those who have the hardware, the stand-alone Nosy is > undoubtedly the killer application, not pcilynx.) Precisely. As I said, I've probably got a corner case and it's certainly not worth the effort just for that. It would be nice though. You're right about nosy; so long as nosy (which is independent of the firewire stack) keeps working I'll be happy. :) Regards jonathan - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-dvb] DST/BT878 module customization (.. was: Critical points about ...)
On 5/4/07, Manu Abraham <[EMAIL PROTECTED]> wrote: Markus Rechberger wrote: > I mean the mail from Helge Hafting (thread [linux-dvb] Critical > points about kernel 2.6.21 and pseudo-authorities) at the very first > beginning. > I am replying to this mail, just because someone's spreading lies all around. On the mentioned thread, what i wrote (and that was the only mail from my side): There is a saying: "He who lives by the sword, dies by the sword." And what issues are outstanding of these discussions? I went over it and it just shows up that there have been communication problems in 2005. We now have open issues with several device drivers and that's what we should focus at. Markus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/8] Universal power supply class (was: battery class)
On Thu, May 03, 2007 at 03:53:46PM -0700, Greg KH wrote: > > # cat /sys/class/power\ supply/ > > ac/ main-battery/ usb/ > > Um, shouldn't that be an error? Isn't /sys/class/power\ supply/ a > directory? I think that's more of a case of: cat /sys/class/power\ supply/ -- "To the extent that we overreact, we proffer the terrorists the greatest tribute." - High Court Judge Michael Kirby - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/8] Universal power supply class (was: battery class)
On Fri, May 04, 2007 at 01:31:39AM +0400, Anton Vorontsov wrote: > This class is result of "external power" and "battery" classes merge, > as suggested by David Woodhouse. He also implemented uevent support. > > Here how userspace seeing it now: > > # ls /sys/class/power\ supply/ > ac main-battery usb Please don't put a space in a class name. Yes, we can do it, but some scripts will bomb. If you look all of the other classes use a '_' instead. > # cat /sys/class/power\ supply/ > ac/ main-battery/ usb/ Um, shouldn't that be an error? Isn't /sys/class/power\ supply/ a directory? > # cat /sys/class/power\ supply/ac/type > AC > > # cat /sys/class/power\ supply/usb/type > USB > > # cat /sys/class/power\ supply/main-battery/type > Battery > > # cat /sys/class/power\ supply/ac/online > 1 > > # cat /sys/class/power\ supply/usb/online > 0 I don't really understand, is the 'usb' and 'ac' directories really a 'struct device' here? Shouldn't there be some symlinks around here? Can you do a 'tree /sys/class/power\ supply/' and show me the output? > > # cat /sys/class/power\ supply/main-battery/status > Charging > > # cat /sys/class/leds/h5400\:red-left/trigger > none h5400-radio timer hwtimer ac-online usb-online > main-battery-charging-or-full [main-battery-charging] > main-battery-full Huh? What does the led have to do with the battery? > > Signed-off-by: David Woodhouse <[EMAIL PROTECTED]> > Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]> > --- > Documentation/power_supply_class.txt | 167 ++ > drivers/Kconfig |2 + > drivers/Makefile |1 + > drivers/power/Kconfig| 17 +++ > drivers/power/Makefile | 15 ++ > drivers/power/power_supply.h | 42 ++ > drivers/power/power_supply_core.c| 168 ++ > drivers/power/power_supply_leds.c| 176 +++ > drivers/power/power_supply_sysfs.c | 254 > ++ > include/linux/power_supply.h | 169 ++ > 10 files changed, 1011 insertions(+), 0 deletions(-) > create mode 100644 Documentation/power_supply_class.txt > create mode 100644 drivers/power/Kconfig > create mode 100644 drivers/power/Makefile > create mode 100644 drivers/power/power_supply.h > create mode 100644 drivers/power/power_supply_core.c > create mode 100644 drivers/power/power_supply_leds.c > create mode 100644 drivers/power/power_supply_sysfs.c > create mode 100644 include/linux/power_supply.h > > diff --git a/Documentation/power_supply_class.txt > b/Documentation/power_supply_class.txt > new file mode 100644 > index 000..666941f > --- /dev/null > +++ b/Documentation/power_supply_class.txt > @@ -0,0 +1,167 @@ > +Linux power supply class > + > + > +Synopsis > + > +Power supply class used to represent battery, UPS, AC or DC power supply > +properties to user-space. > + > +It defines core set of attributes, which should be applicable to (almost) > +every power supply out there. Attributes are available via sysfs and uevent > +interfaces. > + > +Each attribute has well defined meaning, up to unit of measure used. While > +the attributes provided are believed to be universally applicable to any > +power supply, specific monitoring hardware may not be able to provide them > +all, so any of them may be skipped. > + > +Power supply class is extensible, and allows to define drivers own > attributes. > +The core attribute set is subject to the standard Linux evolution (i.e. > +if it will be found that some attribute is applicable to many power supply > +types or their drivers, it can be added to the core set). > + > +It also integrates with LED framework, for the purpose of providing > +typically expected feedback of battery charging/fully charged status and > +AC/USB power supply online status. (Note that specific details of the > +indication (including whether to use it at all) are fully controllable by > +user and/or specific machine defaults, per design principles of LED > +framework). > + > + > +Attributes/properties > +~ > +Power supply class has predefined set of attributes, this eliminates code > +duplication across drivers. Power supply class insist on reusing its > +predefined attributes *and* their units. > + > +So, userspace gets predictable set of attributes and their units for any > +kind of power supply, and can process/present them to a user in consistent > +manner. Results for different power supplies and machines are also directly > +comparable. > + > +See drivers/power/ds2760_battery.c and drivers/power/pda_power.c for the > +example how to declare and handle attributes. > + > + > +Units > +~ > +Quoting include/linux/power_supply.h: > + > + All voltages, currents,
Re: [Kernel-discuss] [PATCH 3/8] Universal power supply class (was: battery class)
On Thu, May 03, 2007 at 11:14:26PM +0100, ian wrote: > On Fri, 2007-05-04 at 01:31 +0400, Anton Vorontsov wrote: > > # cat /sys/class/power\ supply/ac/type > > AC > > > > # cat /sys/class/power\ supply/usb/type > > USB > > isnt that a bit redundant? Let me note that "usb"/"ac" is just names pda-power driver gives for these supplies. So, it's not power supply class issue, but pda-power. As for pda-power.. Yes, it can name them "supply0" and "supply1"... Or maybe "pda-supplyX", but I don't see any need to maim name just because it is very similar to its type. ;-) Anyhow I don't care much, i.e. if you or anyone else will insist, I'll change pda-power's supply names with no problems. Thanks, -- Anton Vorontsov email: [EMAIL PROTECTED] backup email: [EMAIL PROTECTED] irc://irc.freenode.org/bd2 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Error using the buffer cache
I've written a module that acts as a cache for fixed size objects but I get a soft lockup trying to use the buffer cache. I've attached the module that reproduces the error. You need to supply the module with a block device, i.e. insmod disk_cache.ko devname="/dev/hda2". /* * An object oriented disk cache. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include //#include "assert.h" //#include "debug.h" #define assert(s) if(!(s)) { printk(KERN_EMERG "assertion failed: %d: ", __LINE__); panic(#s);} #define KERNEL_SECTOR_SIZE 512 MODULE_LICENSE("Dual BSD/GPL"); struct disk_cache { int object_size;/* object size */ sector_t cache_size;/* number of objects to be held in cache */ sector_t disk_size; /* maximum number of objects */ int objects_per_sector; /* number of objects per sector */ sector_t start; /* start sector for data storage */ }; static int major_num = 0; struct block_device *bdev = NULL; sector_t start = 0; static char *devname = NULL; module_param(devname, charp, 0); /* * Create and initialize a disk cache. * @disk_cache - an allocated disk_cache structure * @object_size - the object size to use (in bytes) * @cache_size - total memory size for cache (in objects) * @disk_size - total disk size for objects (in objects) */ int disk_cache_create(struct disk_cache* const disk_cache, const int object_size, const sector_t disk_size) { assert(disk_cache); assert(disk_size > 0); disk_cache->object_size = object_size; disk_cache->disk_size = disk_size; disk_cache->objects_per_sector = bdev_get_queue(bdev)->hardsect_size / object_size; disk_cache->start = start; start += ((unsigned long)disk_size / (unsigned long)disk_cache->objects_per_sector) + 1; return 1; } EXPORT_SYMBOL(disk_cache_create); /* * Returns an address where the requested object is found. */ void * get_object(const struct disk_cache* disk_cache, const sector_t object_number) { int offset = 0; sector_t sector = 0; struct buffer_head *bh = NULL; assert(disk_cache); assert(object_number < disk_cache->disk_size); /* * Compute page number in which requested object resides. */ sector = disk_cache->start + (unsigned long)object_number / (unsigned long)disk_cache->objects_per_sector; //assert(sector < disk_cache->disk_size); //assert(sector < get_capacity(bdev->bd_disk)); offset = ((unsigned long)object_number % (unsigned long)disk_cache->objects_per_sector) * disk_cache->object_size; bh = __bread(bdev, sector, bdev_get_queue(bdev)->hardsect_size); //return bh->b_data + offset; return NULL; } EXPORT_SYMBOL(get_object); /* * Puts given object back to the buffer cache. Flag 'modified' must be set if the object was modified. */ void put_object(const struct disk_cache* const disk_cache, const sector_t object_number, const int modified) { struct buffer_head *bh = NULL; sector_t sector = disk_cache->start + (unsigned long)object_number / (unsigned long)disk_cache->objects_per_sector; bh = __bread(bdev, sector, bdev_get_queue(bdev)->hardsect_size); //if(modified) { // lock_buffer(bh); // set_buffer_uptodate(bh); // mark_buffer_dirty(bh); // unlock_buffer(bh); //} brelse(bh); /* * an additional release, we didn't released it in get_object */ brelse(bh); } EXPORT_SYMBOL(put_object); static void test(void) { struct disk_cache dk; int i; void *p; if(!disk_cache_create(&dk, 6, 100)) { printk(KERN_ERR "create cache error\n"); return; } for(i = 0; i < 100; i++) { get_object(&dk, i); put_object(&dk, i, 0); } } static int __init disk_cache_init(void) { major_num = register_blkdev(major_num, "disk_cache"); if (major_num <= 0) { printk(KERN_ERR "disk_cache: unable to get major number\n"); return -EINVAL; } if(!devname) { printk(KERN_ERR "disk_cache: must supply a valid block device\n"); return -EINVAL; } bdev = open_bdev_excl(devname, 0, NULL); if(IS_ERR(bdev)) { printk(KERN_ERR "disk_cache: cannot open device %s.\n", devname); return -EINVAL; } printk(KERN_INFO "disk_cache: using device %s\n", devname); printk(KERN_INFO "disk_cache: %llu sectors\n", get_capacity(bdev->bd_disk)); test(); return 0; } stati