Re: [PATCH] drivers/base: export gpl (un)register_memory_notifier
On Thu, 2008-02-14 at 09:12 -0800, Dave Hansen wrote:
..
> > - Use currently other not exported functions in kernel/resource.c, like
> >   walk_memory_resource (where we would still need the maximum
> >   possible number of pages NR_MEM_SECTIONS)
>
> It isn't the act of exporting that's the problem. It's making sure that
> the exports won't be prone to abuse and that people are using them
> properly. You should assume that you can export and use
> walk_memory_resource().
>
> > So this seems to come down to a basic question:
> > New hardware seems to have a tendency to get private MMUs,
> > which need private mappings from the kernel address space into a
> > HW defined address space with potentially unique characteristics
> > RDMA in Openfabrics with global MR is the most prominent example
> > heading there
>
> That's not a question. ;)
>
> Please explain to me why walk_memory_resource() is insufficient for your
> needs. I've now pointed it out to you at least 3 times.

I am not sure what you are trying to do with walk_memory_resource(). The
behavior is different on ppc64. Hotplug memory usage assumes that all
the memory resources (all system memory, not just IOMEM) are represented
in /proc/iomem. That is the case on i386 and ia64. But on ppc64 it
contains ONLY iomem-related resources. Paulus didn't want to export all
the system memory into /proc/iomem on ppc64. So I had to work around this
by providing an arch-specific walk_memory_resource() function for ppc64.

Thanks,
Badari

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] drivers/base: export (un)register_memory_notifier
On Fri, 2008-02-01 at 17:16 +0100, Jan-Bernd Themann wrote:
> Drivers like eHEA need memory notifiers in order to update their
> internal DMA memory map when memory is added to or removed from
> the system.
>
> Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
>
> ---
> Comment: eHEA patches that exploit these functions will follow
>
>  drivers/base/memory.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 7ae413f..1e1bd4c 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -52,11 +52,13 @@ int register_memory_notifier(struct notifier_block *nb)
>  {
>  	return blocking_notifier_chain_register(&memory_chain, nb);
>  }
> +EXPORT_SYMBOL(register_memory_notifier);
>
>  void unregister_memory_notifier(struct notifier_block *nb)
>  {
>  	blocking_notifier_chain_unregister(&memory_chain, nb);
>  }
> +EXPORT_SYMBOL(unregister_memory_notifier);
>
>  /*
>   * register_memory - Setup a sysfs device for a memory block

Is there a reason for not making them EXPORT_SYMBOL_GPL() ?
Otherwise, looks good to me.

I have been planning to send this as part of my next update with
ppc64 arch-specific remove support and generic __remove_pages()
support. If this is blocking your work, let's get this in.

Acked-by: Badari Pulavarty [EMAIL PROTECTED]

Thanks,
Badari
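Editor's sketch: the patch above only exports the register/unregister pair, so it may help to see the pattern a driver would follow. What follows is a user-space model of a notifier chain, not kernel code. Every name in it (sketch_register, sketch_notify, SKETCH_MEM_ONLINE, drv_mem_callback and so on) is hypothetical and chosen for illustration; the real kernel interface uses struct notifier_block and a blocking notifier chain, and the eHEA callback is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes, loosely modelled on memory online/offline events. */
enum mem_event { SKETCH_MEM_ONLINE = 1, SKETCH_MEM_OFFLINE = 2 };

/* One entry in the chain: a callback plus a link to the next entry. */
struct sketch_notifier {
	int (*call)(struct sketch_notifier *nb, enum mem_event ev,
		    unsigned long nr_pages);
	struct sketch_notifier *next;
};

static struct sketch_notifier *chain_head;

/* Add a notifier at the head of the chain. */
void sketch_register(struct sketch_notifier *nb)
{
	assert(nb != NULL && nb->call != NULL);
	nb->next = chain_head;
	chain_head = nb;
}

/* Remove a notifier from the chain; does nothing if it is not registered. */
void sketch_unregister(struct sketch_notifier *nb)
{
	struct sketch_notifier **pp = &chain_head;

	while (*pp != NULL) {
		if (*pp == nb) {
			*pp = nb->next;
			nb->next = NULL;
			return;
		}
		pp = &(*pp)->next;
	}
}

/* Deliver an event to every registered callback; returns how many ran. */
int sketch_notify(enum mem_event ev, unsigned long nr_pages)
{
	struct sketch_notifier *nb;
	int called = 0;

	for (nb = chain_head; nb != NULL; nb = nb->next) {
		nb->call(nb, ev, nr_pages);
		called++;
	}
	return called;
}

/* "Driver" side: keep a page count in step with memory add/remove events. */
static unsigned long mapped_pages;

static int drv_mem_callback(struct sketch_notifier *nb, enum mem_event ev,
			    unsigned long nr_pages)
{
	(void)nb;
	if (ev == SKETCH_MEM_ONLINE)
		mapped_pages += nr_pages;
	else if (ev == SKETCH_MEM_OFFLINE)
		mapped_pages -= nr_pages;
	return 0;
}

static struct sketch_notifier drv_nb = { .call = drv_mem_callback };

struct sketch_notifier *drv_notifier(void) { return &drv_nb; }
unsigned long drv_mapped_pages(void) { return mapped_pages; }
```

A caller would register drv_notifier() once at setup, let sketch_notify() drive the updates, and unregister on teardown. A real driver would rebuild its DMA mapping in the callback instead of adjusting a counter.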
[PATCH] ip_frag_reasm() should set err in case of skb_clone() failure
Simple error handling fix (against 2.6.23-git12).

Thanks,
Badari

Need to initialize err in case of skb_clone() failure.

net/ipv4/ip_fragment.c: In function `ip_defrag':
net/ipv4/ip_fragment.c:540: warning: `err' might be used uninitialized in this function

Signed-off-by: Badari Pulavarty [EMAIL PROTECTED]
---
 net/ipv4/ip_fragment.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.23/net/ipv4/ip_fragment.c
===
--- linux-2.6.23.orig/net/ipv4/ip_fragment.c	2007-10-17 15:33:27.0 -0700
+++ linux-2.6.23/net/ipv4/ip_fragment.c	2007-10-17 15:50:51.0 -0700
@@ -544,6 +544,7 @@ static int ip_frag_reasm(struct ipq *qp,
 	/* Make the one we just received the head. */
 	if (prev) {
 		head = prev->next;
+		err = -ENOMEM;
 		fp = skb_clone(head, GFP_ATOMIC);
 
 		if (!fp)
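Editor's sketch: the fix relies on a common goto-cleanup idiom, where the error code is assigned just before the call that can fail, so the shared failure label always returns a defined value. A minimal user-space illustration follows. sketch_clone() and sketch_reasm() are hypothetical stand-ins for this note, not the kernel's skb_clone() or ip_frag_reasm().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a clone call: copies a buffer, or fails on request. */
static void *sketch_clone(const void *src, size_t len, int force_fail)
{
	void *copy;

	if (force_fail)
		return NULL;
	copy = malloc(len);
	if (copy != NULL)
		memcpy(copy, src, len);
	return copy;
}

/* Returns 0 on success or a negative errno value; *out receives the clone. */
int sketch_reasm(const void *src, size_t len, int force_fail, void **out)
{
	int err;	/* not initialised here, as in the code being patched */
	void *fp;

	assert(src != NULL && out != NULL);

	if (len == 0) {
		err = -EINVAL;
		goto out_fail;
	}

	err = -ENOMEM;	/* set before the call that can fail */
	fp = sketch_clone(src, len, force_fail);
	if (fp == NULL)
		goto out_fail;

	*out = fp;
	return 0;

out_fail:
	*out = NULL;
	return err;
}
```

Without the assignment before sketch_clone(), the failure path would return whatever happened to be in err, which is what the compiler warning in the patch description points at.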
[PATCH] vortex_up should initialize err
Simple compile warning fix. (against 2.6.23-git12)

Thanks,
Badari

vortex_up() should initialize 'err' for a successful return.

drivers/net/3c59x.c: In function `vortex_up':
drivers/net/3c59x.c:1494: warning: `err' might be used uninitialized in this function

Signed-off-by: Badari Pulavarty [EMAIL PROTECTED]
---
 drivers/net/3c59x.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23/drivers/net/3c59x.c
===
--- linux-2.6.23.orig/drivers/net/3c59x.c	2007-10-17 15:33:07.0 -0700
+++ linux-2.6.23/drivers/net/3c59x.c	2007-10-17 16:07:10.0 -0700
@@ -1491,7 +1491,7 @@ vortex_up(struct net_device *dev)
 	struct vortex_private *vp = netdev_priv(dev);
 	void __iomem *ioaddr = vp->ioaddr;
 	unsigned int config;
-	int i, mii_reg1, mii_reg5, err;
+	int i, mii_reg1, mii_reg5, err = 0;
 
 	if (VORTEX_PCI(vp)) {
 		pci_set_power_state(VORTEX_PCI(vp), PCI_D0);	/* Go active */
Re: select(0, ..) is valid ?
On Wed, 2007-05-16 at 10:37 -0500, Anton Blanchard wrote:
> Hi Hugh,
>
> > It's interesting that compat_core_sys_select() shows this kmalloc(0)
> > failure but core_sys_select() does not. That's because
> > core_sys_select() avoids kmalloc by using a buffer on the stack for
> > small allocations (and 0 sure is small). Shouldn't
> > compat_core_sys_select() do just the same? Or is SLUB going to be so
> > efficient that doing so is a waste of time?
>
> Nice catch, the original optimisation from Andi is:
>
> http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f
>
> And I think it makes sense for the compat code to do it too.
>
> Anton

Here it is .. Should I do one for poll() also ?

Thanks,
Badari

Optimize select by using stack space for small fd sets.
core_sys_select() already has this optimization. This is for
the compat version.

Signed-off-by: Badari Pulavarty [EMAIL PROTECTED]
---
 fs/compat.c |   17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

Index: linux-2.6.22-rc1/fs/compat.c
===
--- linux-2.6.22-rc1.orig/fs/compat.c	2007-05-12 18:45:56.0 -0700
+++ linux-2.6.22-rc1/fs/compat.c	2007-05-16 17:50:39.0 -0700
@@ -1544,9 +1544,10 @@ int compat_core_sys_select(int n, compat
 	compat_ulong_t __user *outp, compat_ulong_t __user *exp, s64 *timeout)
 {
 	fd_set_bits fds;
-	char *bits;
+	void *bits;
 	int size, max_fds, ret = -EINVAL;
 	struct fdtable *fdt;
+	long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
 
 	if (n < 0)
 		goto out_nofds;
@@ -1564,11 +1565,14 @@ int compat_core_sys_select(int n, compat
 	 * since we used fdset we need to allocate memory in units of
 	 * long-words.
 	 */
-	ret = -ENOMEM;
 	size = FDS_BYTES(n);
-	bits = kmalloc(6 * size, GFP_KERNEL);
-	if (!bits)
-		goto out_nofds;
+	bits = stack_fds;
+	if (size > sizeof(stack_fds) / 6) {
+		bits = kmalloc(6 * size, GFP_KERNEL);
+		ret = -ENOMEM;
+		if (!bits)
+			goto out_nofds;
+	}
 	fds.in = (unsigned long *) bits;
 	fds.out = (unsigned long *) (bits + size);
 	fds.ex = (unsigned long *) (bits + 2*size);
@@ -1600,7 +1604,8 @@ int compat_core_sys_select(int n, compat
 	    compat_set_fd_set(n, exp, fds.res_ex))
 		ret = -EFAULT;
 out:
-	kfree(bits);
+	if (bits != stack_fds)
+		kfree(bits);
 out_nofds:
 	return ret;
 }
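Editor's sketch: the patch uses a small on-stack buffer for the common case and falls back to the allocator only for large fd sets, which also means n == 0 never reaches the allocator. A user-space model of that pattern is below. The names sketch_select_alloc, sketch_fds_bytes, SKETCH_STACK_ALLOC and SKETCH_NSETS are hypothetical; the value 256 is an assumption for illustration and is not taken from the kernel's SELECT_STACK_ALLOC definition.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_STACK_ALLOC 256	/* assumed size of the on-stack buffer, in bytes */
#define SKETCH_NSETS 6		/* in, out, ex plus the three result sets */

/* Bytes needed for one bitmap covering n descriptors, rounded up to whole longs. */
static size_t sketch_fds_bytes(int n)
{
	size_t bits_per_long = 8 * sizeof(long);

	return ((size_t)n + bits_per_long - 1) / bits_per_long * sizeof(long);
}

/*
 * Prepare zeroed storage for six bitmaps of n descriptors each.
 * Returns 0 if the stack buffer was enough, 1 if the heap was used,
 * and -1 on a negative n or an allocation failure.
 */
int sketch_select_alloc(int n)
{
	long stack_fds[SKETCH_STACK_ALLOC / sizeof(long)];
	void *bits = stack_fds;
	size_t size;
	int used_heap = 0;

	if (n < 0)
		return -1;

	size = sketch_fds_bytes(n);
	if (size > sizeof(stack_fds) / SKETCH_NSETS) {
		bits = malloc(SKETCH_NSETS * size);
		if (bits == NULL)
			return -1;
		used_heap = 1;
	}

	memset(bits, 0, SKETCH_NSETS * size);
	/* A real implementation would carve bits into the six sets here. */

	if (bits != stack_fds)	/* only free what came from the allocator */
		free(bits);
	return used_heap;
}
```

With these assumed sizes, sketch_select_alloc(64) stays on the stack and sketch_select_alloc(1024) goes to the heap. The comparison against stack_fds before freeing mirrors the last hunk of the patch.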
select(0, ..) is valid ?
Hi,

Is select(0, ..) a valid operation ?

I see that there is no check to prevent this or to return success early,
without doing any work. Do we need one ?

slub code is complaining that we are doing kmalloc(0).

Thanks,
Badari

[ cut here ]
Badness at include/linux/slub_def.h:88
Call Trace:
[c001e4eb7640] [c000e650] .show_stack+0x68/0x1b0 (unreliable)
[c001e4eb76e0] [c029b854] .report_bug+0x94/0xe8
[c001e4eb7770] [c00219f0] .program_check_exception+0x12c/0x568
[c001e4eb77f0] [c0004a84] program_check_common+0x104/0x180
--- Exception: 700 at .get_slab+0x4c/0x234
    LR = .__kmalloc+0x24/0xc4
[c001e4eb7ae0] [c001e4eb7b80] 0xc001e4eb7b80 (unreliable)
[c001e4eb7b80] [c00a7ff0] .__kmalloc+0x24/0xc4
[c001e4eb7c10] [c00ea720] .compat_core_sys_select+0x90/0x240
[c001e4eb7d00] [c00ec3a4] .compat_sys_select+0xb0/0x190
[c001e4eb7dc0] [c0014944] .ppc32_select+0x14/0x28
[c001e4eb7e30] [c000872c] syscall_exit+0x0/0x40
Re: select(0, ..) is valid ?
On Tue, 2007-05-15 at 10:44 -0700, Andrew Morton wrote:
> On Tue, 15 May 2007 10:29:18 -0700 Badari Pulavarty [EMAIL PROTECTED] wrote:
>
> > Hi,
> >
> > Is select(0, ..) is a valid operation ?
>
> Probably - it becomes an elaborate way of doing a sleep. Whatever - we used
> to permit it without error, so we should continue to do so.

Okay.

> > I see that there is no check to prevent this or return
> > success early, without doing any work. Do we need one ?
> >
> > slub code is complaining that we are doing kmalloc(0).
> >
> > [ cut here ]
> > Badness at include/linux/slub_def.h:88
> > Call Trace:
> > [c001e4eb7640] [c000e650] .show_stack+0x68/0x1b0 (unreliable)
> > [c001e4eb76e0] [c029b854] .report_bug+0x94/0xe8
> > [c001e4eb7770] [c00219f0] .program_check_exception+0x12c/0x568
> > [c001e4eb77f0] [c0004a84] program_check_common+0x104/0x180
> > --- Exception: 700 at .get_slab+0x4c/0x234
> >     LR = .__kmalloc+0x24/0xc4
> > [c001e4eb7ae0] [c001e4eb7b80] 0xc001e4eb7b80 (unreliable)
> > [c001e4eb7b80] [c00a7ff0] .__kmalloc+0x24/0xc4
> > [c001e4eb7c10] [c00ea720] .compat_core_sys_select+0x90/0x240
> > [c001e4eb7d00] [c00ec3a4] .compat_sys_select+0xb0/0x190
> > [c001e4eb7dc0] [c0014944] .ppc32_select+0x14/0x28
> > [c001e4eb7e30] [c000872c] syscall_exit+0x0/0x40
>
> I _think_ we can just do
>
> --- a/fs/compat.c~a
> +++ a/fs/compat.c
> @@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat
>  	 */
>  	ret = -ENOMEM;
>  	size = FDS_BYTES(n);
> -	bits = kmalloc(6 * size, GFP_KERNEL);
> -	if (!bits)
> -		goto out_nofds;
> +	if (likely(size)) {
> +		bits = kmalloc(6 * size, GFP_KERNEL);
> +		if (!bits)
> +			goto out_nofds;
> +	} else {
> +		bits = NULL;
> +	}
>  	fds.in = (unsigned long *) bits;
>  	fds.out = (unsigned long *) (bits + size);
>  	fds.ex = (unsigned long *) (bits + 2*size);
> _

Yes. This is what I did earlier, but then I was wondering if I could
skip the whole operation and bail out early (if n == 0). I guess not.

> I mean, if that oopses then I'd be very interested in finding out why.
>
> But I'm starting to suspect that it would be better to permit kmalloc(0) in
> slub. It depends on how many more of these things need fixing.
>
> otoh, a kmalloc(0) could be a sign of some buggy/inefficient/weird code, so
> there's some value in forcing us to go look at all the callsites.

So far, I haven't found any other. Let's leave the check.

Thanks,
Badari
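Editor's sketch: the "elaborate way of doing a sleep" mentioned above is select() called with nfds == 0, all three fd sets NULL and only a timeout. The helper below shows that usage from user space. The name sketch_select_sleep is hypothetical; select() and struct timeval are the standard POSIX interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Sleep for roughly ms milliseconds by calling select() with no
 * descriptors. Returns select()'s result: 0 when the timeout expires,
 * -1 if the call fails (for example when interrupted by a signal).
 */
int sketch_select_sleep(long ms)
{
	struct timeval tv;

	assert(ms >= 0);
	tv.tv_sec = ms / 1000;
	tv.tv_usec = (ms % 1000) * 1000;

	return select(0, NULL, NULL, NULL, &tv);
}
```

Because existing programs may depend on this working, a kernel-side path for n == 0 has to keep succeeding, which is the point Andrew makes in the reply.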
Re: 2.6.18-mm3 oops in xfrm_register_mode
On Wed, 2006-10-04 at 16:02 -0500, Steve Fox wrote:
> On Wed, 2006-10-04 at 09:57 -0700, Andrew Morton wrote:
> > You might well find this bisection lands you on origin.patch. ie: a
> > mainline bug. I note that David merged a few more xfrm fixes this
> > morning.
> >
> > So to confirm that, first test just origin.patch and if that fails, test
> > git-of-the-moment. If that doesn't fail, they fixed it.
>
> origin.patch from -mm3 failed. Unfortunately so did a fresh clone of
> Linus's git tree.

I am not an expert in that area, but your stack trace made me curious.
Looking at the disassembly, the line of code in question is:

	if (likely(modemap[mode->encap] == NULL)) {

Register contents indicate that it is called as

	xfrm_register_mode(&xfrm4_tunnel_mode, AF_INET);
or
	xfrm_register_mode(&xfrm4_transport_mode, AF_INET);

(family is AF_INET). The invalid deref is due to modemap = 0x7ff
(RAX: 07ff)

Since it's so easy to reproduce, can you add a printk before this
check to dump mode->encap, modemap, afinfo, family etc ?

Just curious ..

Thanks,
Badari
Re: 2.6.18-mm2 boot failure on x86-64
On Thu, 2006-10-05 at 09:53 -0500, Steve Fox wrote:
> On Wed, 2006-10-04 at 18:08 -0700, Martin Bligh wrote:
> > Andi Kleen wrote:
> > > > I think most likely it would crash on 2.6.18. Keith mannthey had
> > > > reported a different crash on 2.6.18-rc4-mm2 when this patch was
> > > > introduced first time. Following is the link to the thread.
> > >
> > > Then maybe trying 2.6.17 + the patch and then bisect between that
> > > and -rc4?
> >
> > I think it's fixed already in -git22, or at least it is for the IBM box
> > reporting to test.kernel.org. You might want to try that one ...
>
> -git22 also panics for me.

Steve,

Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ?
Last time I couldn't match your instruction dump to any code segment
in the routine. And also, can you post your .config file.

I have an amd64 and em64t machine and both work fine...

Thanks,
Badari
Re: 2.6.18-mm2 boot failure on x86-64 II
keith mannthey wrote:
> On Fri, 2006-10-06 at 01:35 +0200, Andi Kleen wrote:
> As of yet I haven't been able to recreate the hang. I am running
> similar HW to Steve.
>
> I ran into this with -mm3
>
> Memory: 24150368k/26738688k available (1933k kernel code, 490260k
> reserved, 978k data, 308k init)
> [ cut here ]
> kernel BUG in init_list at mm/slab.c:1334!
> invalid opcode: [1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.18-mm3-smp #1
> RIP: 0010:[8027f8fa] [8027f8fa] init_list+0x1d/0xfd
> RSP: 0018:80577f48 EFLAGS: 00010212
> RAX: 0040 RBX: 0001 RCX:
> RDX: 0001 RSI: 805ba848 RDI: 810460700040
> RBP: 0001 R08: 0001 R09: 0003
> R10: R11: 805bc268 R12: 810460700040
> R13: 805ba848 R14: R15:
> FS: () GS:804d8000() knlGS:
> CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
> CR2: CR3: 00201000 CR4: 06a0
> Process swapper (pid: 0, threadinfo 80576000, task 80455840)
> Stack: 0001 0001 805ba848 80593aa8
>        02c0 00010001 0008ef00 0008c000
> Call Trace:
> [80593aa8] kmem_cache_init+0x344/0x406
> [805805ef] start_kernel+0x180/0x21b
> [8058016a] _sinittext+0x16a/0x16e
>
> Code: 0f 0b 48 8b 3d 15 ab 1e 00 be d0 00 00 00 e8 c0 f5 ff ff 48
> RIP [8027f8fa] init_list+0x1d/0xfd
> RSP 80577f48
> Kernel panic - not syncing: Attempted to kill the idle task!
>
> I am going to revert the patch and see if it works. I ran -git22 just
> fine.
>
> Thanks,
> Keith

Keith,

I fixed this already. Can you look for it on lkml (look for
2.6.18-mm3 in the subject line). It was one typo in mm/slab.c.

Thanks,
Badari
Re: Network problem with 2.6.18-mm1 ?
On Fri, 2006-09-29 at 17:30 -0600, Eric W. Biederman wrote:
> So it looks like the kernel moved the ioapics.
>
> The following patch in 2.6.18-mm1 is known to have that effect.
> x86_64-mm-insert-ioapics-and-local-apic-into-resource-map
>
> Can you please try reverting that one patch?
>
> There is a fix, an updated version of that patch, I think in -mm2, but I
> haven't had a chance to see if it fixes the problem yet.

Bingo !! Reverting this patch fixed my networking problem on
2.6.18-mm2.

Thanks,
Badari
Re: [take3 2/4] kevent: AIO, aio_sendfile() implementation.
Evgeniy Polyakov wrote:
> AIO, aio_sendfile() implementation.
>
> This patch includes asynchronous propagation of file's data into VFS
> cache and aio_sendfile() implementation.
> Network aio_sendfile() works lazily - it asynchronously populates pages
> into the VFS cache (which can be used for various tricks with adaptive
> readahead) and then uses usual ->sendfile() callback.
>
> ...
> --- /dev/null
> +++ b/kernel/kevent/kevent_aio.c
> @@ -0,0 +1,584 @@
> +/*
> + * kevent_aio.c
> + *

Since this is *almost* the same as the mpage.c code, I wonder if it's
possible to make common generic/helper routines in mpage.c and use
them here ?

Thanks,
Badari
Re: [3/4] kevent: AIO, aio_sendfile() implementation.
Sébastien Dugué wrote:
> On Wed, 2006-07-26 at 09:22 -0700, Badari Pulavarty wrote:
> > Ulrich Drepper wrote:
> > > Christoph Hellwig wrote:
> > > > > My personal opinion on existing AIO is that it is not the right
> > > > > design. Benjamin LaHaise agrees with me (if I understood him right),
> > > >
> > > > I completely agree with that as well.
> > >
> > > I agree, too, but the current code is not the last of the line.
> > > Suparna has a set of patches which make the current kernel aio code
> > > work much better and especially make it really usable to implement
> > > POSIX AIO.
> > >
> > > In Ottawa we were talking about submitting it and Suparna will. We
> > > just thought about a little longer timeframe. I guess it could be
> > > accelerated since she mostly has the patch done. But I don't know
> > > her schedule.
> > >
> > > Important here is, don't base any decision on the current aio
> > > implementation.
> >
> > Ulrich,
> >
> > Suparna mentioned your interest in making POSIX glibc aio work with
> > kernel-aio at OLS. We thought taking a re-look at the (kernel side)
> > work BULL did would be a nice starting point. I re-based those patches
> > to 2.6.18-rc2 and sent them to Zach Brown for review before sending
> > them out to the list.
> >
> > These patches do NOT make AIO any cleaner. All they do is add
> > functionality to support POSIX AIO more easily. These are
> >
> > [ PATCH 1/3 ] Adding signal notification for event completion
> > [ PATCH 2/3 ] lio (listio) completion semantics
> > [ PATCH 3/3 ] cancel_fd support
>
> Badari,
>
> Thanks for refreshing those patches, they have been sitting here
> for quite some time now and collected dust.
>
> I also think Suparna's patchset for doing buffered AIO would be
> a real plus here.
>
> > Suparna explained these in the following article:
> >
> > http://lwn.net/Articles/148755/
> >
> > If you think this is a reasonable direction/approach for the kernel
> > and you would take care of the glibc side of things - I can spend
> > time on these patches, getting them into reasonable shape and pushing
> > for inclusion.
>
> Ulrich, if you want to have a look at how those patches are put to
> use in libposix-aio, have a look at
> http://sourceforge.net/projects/paiol.
>
> It could be a starting point for glibc.
>
> Thanks,
>
> Sébastien.

Sebastien,

Suparna mentioned that Ulrich wants us to concentrate on kernel-side
support, so that he can look at the glibc side of things (along with
other work he is already doing). So, if we can get an agreement on
what kind of kernel support is needed - we can focus our efforts on
the kernel side first and leave glibc enablement to the capable hands
of Uli :)

Thanks,
Badari
Re: [3/4] kevent: AIO, aio_sendfile() implementation.
On Thu, 2006-07-27 at 11:14 -0700, Zach Brown wrote:
> > Suparna mentioned that Ulrich wants us to concentrate on kernel-side
> > support, so that he can look at the glibc side of things (along with
> > other work he is already doing). So, if we can get an agreement on
> > what kind of kernel support is needed - we can focus our efforts on
> > the kernel side first and leave glibc enablement to the capable hands
> > of Uli :)
>
> Yeah, and the existing patches still need some cleanup. Badari, did you
> still want me to look into that?
>
> We need someone to claim ultimate responsibility for getting these
> patches suitable for merging :). I'm happy to do that if Suparna isn't
> already on it.

Zach,

Thanks for volunteering !! Sebastien and I should be able to help you.

Before we spend too much time cleaning up and merging into mainline -
I would like an agreement that what we add is good enough for glibc
POSIX AIO. I hate to waste everyone's time and add complexity to the
kernel - if the glibc side is not going to happen :(

Thanks,
Badari
Re: [3/4] kevent: AIO, aio_sendfile() implementation.
On Thu, 2006-07-27 at 11:44 -0700, Ulrich Drepper wrote:
> Badari Pulavarty wrote:
> > Before we spend too much time cleaning up and merging into mainline -
> > I would like an agreement that what we add is good enough for glibc
> > POSIX AIO.
>
> I haven't seen a description of the interface so far. Would be good if
> it existed. But I briefly mentioned one quirk in the interface about
> which Suparna wasn't sure whether it's implemented/implementable in the
> current interface.

Sebastien, could you provide a description of the interfaces you are
adding ? Since you did all the work, it would be appropriate for
you to do it :)

> If a lio_listio call is made the individual requests are handled just as
> if they were issued separately. I.e., the notification specified in the
> individual aiocb is performed when the specific request is done. Then,
> once all requests are done, another notification is made, this time
> controlled by the sigevent parameter of lio_listio.
>
> Another feature which I always wanted: the current lio_listio call
> returns in blocking mode only if all requests are done. In non-blocking
> mode it returns immediately and the program needs to poll the aiocbs.
> What is needed is something in the middle. For instance, if multiple
> read requests are issued the program might be able to start working as
> soon as one request is satisfied. I.e., a call similar to lio_listio
> would be nice which also takes another parameter specifying how many of
> the NENT aiocbs have to finish before the call returns.

Looks reasonable.

Thanks,
Badari
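Editor's sketch: the "something in the middle" that Ulrich describes can be approximated in user space today with the existing POSIX calls: submit the batch with LIO_NOWAIT, then block in aio_suspend() until enough requests have left the EINPROGRESS state. The function name sketch_lio_wait_some, its min_done parameter and the SKETCH_MAX_NENT cap are hypothetical and are not part of POSIX, glibc or the patches under discussion. On older glibc versions the AIO functions may require linking with -lrt.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define SKETCH_MAX_NENT 64	/* assumed cap so the pending list fits on the stack */

/*
 * Submit nent requests without blocking, then wait until at least
 * min_done of them have finished. Returns the number finished so far,
 * or -1 if submission or waiting fails. All list entries must be non-NULL.
 */
int sketch_lio_wait_some(struct aiocb *const list[], int nent, int min_done)
{
	const struct aiocb *pending[SKETCH_MAX_NENT];
	int i, done;

	assert(list != NULL && nent > 0 && nent <= SKETCH_MAX_NENT);
	assert(min_done >= 0 && min_done <= nent);

	if (lio_listio(LIO_NOWAIT, list, nent, NULL) != 0) {
		perror("lio_listio");
		return -1;
	}

	for (;;) {
		done = 0;
		for (i = 0; i < nent; i++) {
			if (aio_error(list[i]) == EINPROGRESS) {
				pending[i] = list[i];
			} else {
				pending[i] = NULL;	/* aio_suspend() ignores NULL entries */
				done++;
			}
		}
		if (done >= min_done)
			return done;

		/* Block until one of the still-pending requests completes. */
		if (aio_suspend(pending, nent, NULL) != 0 && errno != EINTR) {
			perror("aio_suspend");
			return -1;
		}
	}
}
```

Only the still-pending requests are handed to aio_suspend(), because it returns at once if any listed request has already completed. The caller still has to collect each result with aio_return() and must wait for the remaining requests before reusing their buffers.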