Re: RFC: drop support for gcc < 4.0
On Tue, 21 Aug 2007, Adrian Bunk wrote:

> It is an option to say "gcc >= 4.0 on i386 and >= 3.4 on all other
> architectures is required".

if you're going to do something like that, you might as well take the
extra step and start keeping track of which versions of gcc work with
which architectures, along the lines of what dan kegel did with the
results matrix of crosstool:

  http://www.kegel.com/crosstool/crosstool-0.43/buildlogs/

i'm being only moderately facetious, of course but, on the other
hand, if there's all this anecdotal information regarding which
combinations work and which don't, maybe it's worth codifying that
into a compilation check somewhere in the build process.

after all, at the moment in init/main.c, any gcc < 3.2 is rejected
outright, while gcc-4.1.0 generates a warning. that's incredibly ad
hoc and certainly incomplete. might as well just write a script for
the scripts/ directory which accepts an architecture and a version of
gcc and tells you what the current situation is and what you can do
about it.

rday
--
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
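The per-architecture check suggested above could be sketched roughly as follows. This is only an illustration in C: the table contents simply restate the numbers quoted in this thread (gcc >= 4.0 on i386, >= 3.4 elsewhere, a warning for gcc 4.1.0), and every name in it (gcc_rule, gcc_check and so on) is hypothetical, not something that exists in the kernel tree.

```c
#include <string.h>

/* Verdict for one architecture/compiler combination. */
enum gcc_verdict { GCC_OK, GCC_WARN, GCC_REJECT };

/* One row of a hypothetical per-architecture policy table. */
struct gcc_rule {
	const char *arch;	/* architecture name, or "*" for any */
	int min_version;	/* oldest accepted gcc, as major*10000 + minor*100 + patch */
};

/*
 * Illustrative numbers only: they mirror the proposal quoted in this
 * thread, not any tested compatibility matrix.
 */
static const struct gcc_rule gcc_rules[] = {
	{ "i386", 40000 },
	{ "*",    30400 },
};

int gcc_encode(int major, int minor, int patch)
{
	return major * 10000 + minor * 100 + patch;
}

/* First matching row wins; the "*" row acts as the default. */
enum gcc_verdict gcc_check(const char *arch, int major, int minor, int patch)
{
	int v = gcc_encode(major, minor, patch);
	size_t i;

	for (i = 0; i < sizeof(gcc_rules) / sizeof(gcc_rules[0]); i++) {
		const struct gcc_rule *r = &gcc_rules[i];

		if (strcmp(r->arch, "*") != 0 && strcmp(r->arch, arch) != 0)
			continue;
		if (v < r->min_version)
			return GCC_REJECT;
		/* the thread mentions that gcc-4.1.0 only draws a warning */
		if (v == gcc_encode(4, 1, 0))
			return GCC_WARN;
		return GCC_OK;
	}
	return GCC_REJECT;	/* not reached while the "*" row exists */
}
```

With such a table, gcc_check("i386", 3, 4, 6) would be rejected while the same compiler on another architecture would pass; a real tool would need the table to be filled from actual build results.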
Re: [PATCH] Add I/O hypercalls for i386 paravirt
Avi Kivity wrote:
> Zachary Amsden wrote:
>> In general, I/O in a virtual guest is subject to performance
>> problems. The I/O can not be completed physically, but must be
>> virtualized. This means trapping and decoding port I/O instructions
>> from the guest OS. Not only is the trap for a #GP heavyweight, both
>> in the processor and the hypervisor (which usually has a complex #GP
>> path), but this forces the hypervisor to decode the individual
>> instruction which has faulted. Worse, even with hardware assist such
>> as VT, the exit reason alone is not sufficient to determine the true
>> nature of the faulting instruction, requiring a complex and costly
>> instruction decode and simulation.
>>
>> This patch provides hypercalls for the i386 port I/O instructions,
>> which vastly helps guests which use native-style drivers. For certain
>> VMI workloads, this provides a performance boost of up to 30%. We
>> expect KVM and lguest to be able to achieve similar gains on I/O
>> intensive workloads.
>
> Won't these workloads be better off using paravirtualized drivers?
> i.e., do the native drivers with paravirt I/O instructions get anywhere
> near the performance of paravirt drivers?

Yes, in general, this is true (better off with paravirt drivers).
However, we have "paravirt" drivers which run in both
fully-paravirtualized and fully traditionally virtualized environments.
As a result, they use native port I/O operations to interact with
virtual hardware.

Since not all hypervisors have paravirtualized driver infrastructures
and guest O/S support yet, these hypercalls can be advantageous in a
wide range of scenarios. Using I/O hypercalls as such gives exactly the
same performance as paravirt drivers for us, by eliminating the costly
decode path, and the simplicity of using the same driver code makes this
a huge win in code complexity.
Zach
Re: [PATCH] Console events and accessibility
On Tue, Aug 21, 2007 at 11:29:39PM +0200, Samuel Thibault wrote:
> Some external modules like Speakup need to monitor console output.
>
> This adds a VT notifier that such modules can use to get console output
> events:
> allocation, deallocation, writes, other updates (cursor position, switch,
> etc.)
>
> Signed-off-by: Samuel Thibault <[EMAIL PROTECTED]>

Will speakup work with this kind of change?

thanks,

greg k-h
Re: Restricting CDC-ACM devices
On Tue, Aug 21, 2007 at 05:03:54PM -0500, Nate wrote:
> I would like to use the cdc-acm driver in the Linux kernel (2.6.22-rc1),
> but restrict the access to only my VID/PID devices. Is there an easy way
> to do this without modifying cdc-acm.c?

Why do you not want to modify the driver?

> In a past prototype I made a simple wrapper driver for usb serial by
> adding my VID/PID numbers to the wrapper driver's id_table. Then when
> that usb driver was accessed on connection, the driver just pointed to the
> usb_serial_* functions (probe, disconnect, etc). I tried to do the same
> with the cdc-acm driver, but the cdc-acm driver's probe function was
> called before my driver's probe. I noticed that the cdc-acm driver will
> attach when it detects the two CDC-ACM interfaces, so I removed the
> cdc-acm driver with "make menuconfig". This didn't work because the
> cdc-acm functions I was attempting to call from my driver do not exist.

You can disconnect the device from the driver from userspace for any
device you just don't want to have connected by using the sysfs
bind/unbind files. That doesn't require any kernel changes at all.

Why do you want to do this, what are you expecting to achieve with such
a change?

thanks,

greg k-h
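For readers unfamiliar with the sysfs bind/unbind files mentioned above, a minimal user-space sketch might look like the following. The helper names are made up for illustration, and the driver directory name ("cdc_acm") and interface ID ("1-1:1.0") in the usage note are assumptions that depend on the system; check /sys/bus/usb/drivers/ on the machine in question.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a short string to a sysfs-style attribute file.
 * Returns 0 on success or a negative errno value.
 */
int sysfs_write(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t n;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	n = write(fd, value, len);
	if (n < 0) {
		int err = errno;

		close(fd);
		return -err;
	}
	close(fd);
	return (size_t)n == len ? 0 : -EIO;
}

/*
 * Detach one USB interface from a driver by writing its bus ID to the
 * driver's "unbind" file. Both arguments are supplied by the caller;
 * nothing here is specific to cdc-acm.
 */
int usb_unbind(const char *driver, const char *interface)
{
	char path[256];
	int n;

	n = snprintf(path, sizeof(path), "/sys/bus/usb/drivers/%s/unbind", driver);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;
	return sysfs_write(path, interface);
}
```

A call such as usb_unbind("cdc_acm", "1-1:1.0") would need root privileges; writing the same ID to the neighbouring "bind" file reattaches the interface.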
Re: [PATCH] Add I/O hypercalls for i386 paravirt
Zachary Amsden wrote:
> In general, I/O in a virtual guest is subject to performance
> problems. The I/O can not be completed physically, but must be
> virtualized. This means trapping and decoding port I/O instructions
> from the guest OS. Not only is the trap for a #GP heavyweight, both
> in the processor and the hypervisor (which usually has a complex #GP
> path), but this forces the hypervisor to decode the individual
> instruction which has faulted. Worse, even with hardware assist such
> as VT, the exit reason alone is not sufficient to determine the true
> nature of the faulting instruction, requiring a complex and costly
> instruction decode and simulation.
>
> This patch provides hypercalls for the i386 port I/O instructions,
> which vastly helps guests which use native-style drivers. For certain
> VMI workloads, this provides a performance boost of up to 30%. We
> expect KVM and lguest to be able to achieve similar gains on I/O
> intensive workloads.
>

Won't these workloads be better off using paravirtualized drivers?
i.e., do the native drivers with paravirt I/O instructions get anywhere
near the performance of paravirt drivers?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
[PATCH] Add I/O hypercalls for i386 paravirt
In general, I/O in a virtual guest is subject to performance problems.
The I/O can not be completed physically, but must be virtualized. This
means trapping and decoding port I/O instructions from the guest OS.
Not only is the trap for a #GP heavyweight, both in the processor and
the hypervisor (which usually has a complex #GP path), but this forces
the hypervisor to decode the individual instruction which has faulted.
Worse, even with hardware assist such as VT, the exit reason alone is
not sufficient to determine the true nature of the faulting instruction,
requiring a complex and costly instruction decode and simulation.

This patch provides hypercalls for the i386 port I/O instructions, which
vastly helps guests which use native-style drivers. For certain VMI
workloads, this provides a performance boost of up to 30%. We expect
KVM and lguest to be able to achieve similar gains on I/O intensive
workloads.

This patch is against 2.6.23-rc2-mm2, and should be targeted for 2.6.24.

Zach

Virtualized guests in general benefit from having I/O hypercalls. This
patch adds support for port I/O hypercalls to VMI and provides the
infrastructure for other backends to make use of this feature.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/paravirt.c b/arch/i386/kernel/paravirt.c
index ea962c0..4d0d150 100644
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -329,6 +329,18 @@ struct paravirt_ops paravirt_ops = {
 	.set_iopl_mask = native_set_iopl_mask,
 	.io_delay = native_io_delay,
+	.outb = native_outb,
+	.outw = native_outw,
+	.outl = native_outl,
+	.inb = native_inb,
+	.inw = native_inw,
+	.inl = native_inl,
+	.outsb = native_outsb,
+	.outsw = native_outsw,
+	.outsl = native_outsl,
+	.insb = native_insb,
+	.insw = native_insw,
+	.insl = native_insl,
 #ifdef CONFIG_X86_LOCAL_APIC
 	.apic_write = native_apic_write,
diff --git a/arch/i386/kernel/vmi.c b/arch/i386/kernel/vmi.c
index 44feb34..5ecd85b 100644
--- a/arch/i386/kernel/vmi.c
+++ b/arch/i386/kernel/vmi.c
@@ -56,6 +56,7 @@ static int disable_tsc;
 static int disable_mtrr;
 static int disable_noidle;
 static int disable_vmi_timer;
+static int disable_io_ops;
 /* Cached VMI operations */
 static struct {
@@ -72,6 +73,18 @@ static struct {
 	void (*set_initial_ap_state)(int, int);
 	void (*halt)(void);
 	void (*set_lazy_mode)(int mode);
+	void (*outb)(u8 value, u16 port);
+	void (*outw)(u16 value, u16 port);
+	void (*outl)(u32 value, u16 port);
+	u8 (*inb)(u16 port);
+	u16 (*inw)(u16 port);
+	u32 (*inl)(u16 port);
+	void (*outsb)(const void *addr, u16 port, u32 count);
+	void (*outsw)(const void *addr, u16 port, u32 count);
+	void (*outsl)(const void *addr, u16 port, u32 count);
+	void (*insb)(void *addr, u16 port, u32 count);
+	void (*insw)(void *addr, u16 port, u32 count);
+	void (*insl)(void *addr, u16 port, u32 count);
 } vmi_ops;
 /* Cached VMI operations */
@@ -565,6 +578,33 @@ static void vmi_set_lazy_mode(enum paravirt_lazy_mode mode)
 	}
 }
+#define BUILDIO(bwl,type) \
+static void vmi_out##bwl(type value, int port) { \
+	__asm__ __volatile__("call *%0" : : \
+		"r"(vmi_ops.out##bwl), "a"(value), "d"(port)); \
+} \
+static type vmi_in##bwl(int port) { \
+	type value; \
+	__asm__ __volatile__("call *%1" : \
+		"=a"(value) : \
+		"r"(vmi_ops.in##bwl), "d"(port)); \
+	return value; \
+} \
+static void vmi_outs##bwl(int port, const void *addr, unsigned long count) { \
+	__asm__ __volatile__("call *%2" : \
+		"+S"(addr), "+c"(count) : \
+		"r"(vmi_ops.outs##bwl), "d"(port)); \
+} \
+static void vmi_ins##bwl(int port, void *addr, unsigned long count) { \
+	__asm__ __volatile__("call *%2" : \
+		"+D"(addr), "+c"(count) : \
+		"r"(vmi_ops.ins##bwl), "d"(port)); \
+}
+
+BUILDIO(b,unsigned char)
+BUILDIO(w,unsigned short)
+BUILDIO(l,unsigned int)
+
 static inline int __init check_vmi_rom(struct vrom_header *rom)
 {
 	struct pci_header *pci;
@@ -791,6 +831,21 @@ static inline int __init activate_vmi(void)
 	para_wrap(load_esp0, vmi_load_esp0, set_kernel_stack, UpdateKernelStack);
 	para_fill(set_iopl_mask, SetIOPLMask);
 	para_fill(io_delay, IODelay);
+	if (!disable_io_ops) {
+		para_wrap(inb, vmi_inb, inb, INB);
+		para_wrap(inw, vmi_inw, inw, INW);
+		para_wrap(inl, vmi_inl, inl, INL);
+		para_wrap(outb, vmi_outb, outb, OUTB);
+		para_wrap(outw, vmi_outw, outw, OUTW);
+		para_wrap(outl, vmi_outl, outl, OUTL);
+		para_wrap(insb, vmi_insb, insb, INSB);
+		para_wrap(insw, vmi_insw, insw, INSW);
+		para_wrap(insl, vmi_insl, insl, INSL);
+		para_wrap(outsb, vmi_outsb, outsb, OUTSB);
+		para_wrap(outsw, vmi_outsw, outsw, OUTSW);
+		para_wrap(outsl, vmi_outsl, outsl, OUTSL);
+	}
+
 	para_wrap(set_lazy_mode, vmi_set_lazy_mode, set_lazy_mode, SetLazyMode);
 	/* user and kernel flush are just handled with different flags to FlushTLB */
@@ -968,6 +1023,8 @@ static int __init parse_vmi(char *arg)
 		disable_noidle = 1;
 	} else if (!strcmp(arg, "disable_noidle"))
 		disable_noidle
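The dispatch idea behind this patch (drivers keep calling the same port I/O helpers, and the backend behind them is swapped at activation time) can be sketched in user space as follows. This is an assumption-laden model, not the kernel code: the struct, the fake port array and the two counters are all invented for the illustration, and only two of the twelve operations are shown.

```c
#include <stdint.h>

/* Table of port I/O operations, loosely modelled on the .outb/.inb
 * entries the patch adds to paravirt_ops. */
struct io_ops {
	void    (*outb)(uint8_t value, uint16_t port);
	uint8_t (*inb)(uint16_t port);
};

/* Fake device state so the sketch runs in user space. */
static uint8_t fake_ports[65536];
unsigned long trap_count;	/* models costly trap-and-decode exits */
unsigned long hypercall_count;	/* models direct calls into the hypervisor */

/* "Native" backend: stands in for a real OUT/IN instruction that the
 * hypervisor would have to trap and decode. */
static void native_outb(uint8_t value, uint16_t port)
{
	trap_count++;
	fake_ports[port] = value;
}

static uint8_t native_inb(uint16_t port)
{
	trap_count++;
	return fake_ports[port];
}

/* Hypercall backend: same effect, but no instruction decode is needed. */
static void hv_outb(uint8_t value, uint16_t port)
{
	hypercall_count++;
	fake_ports[port] = value;
}

static uint8_t hv_inb(uint16_t port)
{
	hypercall_count++;
	return fake_ports[port];
}

static struct io_ops io_ops = { native_outb, native_inb };

/* Analogous in spirit to the para_wrap() calls guarded by
 * disable_io_ops in activate_vmi(). */
void io_ops_activate(int disable_io_ops)
{
	if (disable_io_ops)
		return;
	io_ops.outb = hv_outb;
	io_ops.inb = hv_inb;
}

/* Drivers call these and never see which backend is active. */
void port_outb(uint8_t value, uint16_t port) { io_ops.outb(value, port); }
uint8_t port_inb(uint16_t port) { return io_ops.inb(port); }
```

The point of the indirection is that driver code stays unchanged; whether the call ends in a trapped instruction or a hypercall is decided once, when the table is filled in.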
Re: [PATCH 11/23] make atomic_read() and atomic_set() behavior consistent on m32r
Hi, Chris,

From: Hirokazu Takata <[EMAIL PROTECTED]>
Date: Wed, 22 Aug 2007 10:56:54 +0900
> From: Chris Snook <[EMAIL PROTECTED]>
> Date: Mon, 13 Aug 2007 07:24:52 -0400
> > From: Chris Snook <[EMAIL PROTECTED]>
> >
> > Use volatile consistently in atomic.h on m32r.
> >
> > Signed-off-by: Chris Snook <[EMAIL PROTECTED]>
>
> Thanks,
>
> Acked-by: Hirokazu Takata <[EMAIL PROTECTED]>

Hmmm.. It seems my reply was overhasty.

Applying the above patch, I have many warning messages like this:

<-- snip -->
...
  CC kernel/sched.o
In file included from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/netlink.h:139,
  from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/genetlink.h:4,
  from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/net/genetlink.h:4,
  from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/taskstats_kern.h:12,
  from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/delayacct.h:21,
  from /project/m32r-linux/kernel/work/linux-2.6_dev.git/kernel/sched.c:61:
/project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/skbuff.h: In function 'skb_shared':
/project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/skbuff.h:521: warning: passing argument 1 of 'atomic_read' discards qualifiers from pointer target type
...
<-- snip -->

In this case, it is because skb_shared() is defined with a parameter
with "const" qualifier, in include/linux/skbuff.h.

  static inline int skb_shared(const struct sk_buff *skb)
  {
          return atomic_read(&skb->users) != 1;
  }

I think the parameter of atomic_read() should have "const"
qualifier to avoid these warnings, and IMHO this modification might be
worth applying on other archs.

Here is an additional patch to revise the previous one for m32r.
I also tried to rewrite it with inline asm code, but the kernel text size
became roughly 2kB larger. So, I prefer C version.

Thanks,

-- Takata

[PATCH] m32r: Add "const" qualifier to the parameter of atomic_read()

Update atomic_read() to avoid the following warning of gcc-4.1.x:
  warning: passing argument 1 of 'atomic_read' discards qualifiers from pointer target type

Signed-off-by: Hirokazu Takata <[EMAIL PROTECTED]>
Cc: Chris Snook <[EMAIL PROTECTED]>
---
 include/asm-m32r/atomic.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/asm-m32r/atomic.h b/include/asm-m32r/atomic.h
index ba19689..9d46f86 100644
--- a/include/asm-m32r/atomic.h
+++ b/include/asm-m32r/atomic.h
@@ -32,7 +32,7 @@ typedef struct { int counter; } atomic_t;
  *
  * Atomically reads the value of @v.
  */
-static __inline__ int atomic_read(atomic_t *v)
+static __inline__ int atomic_read(const atomic_t *v)
 {
 	return *(volatile int *)&v->counter;
 }
--
1.5.2.4

--
Hirokazu Takata <[EMAIL PROTECTED]>
Linux/M32R Project: http://www.linux-m32r.org/
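The qualifier issue can be reproduced outside the kernel with a small sketch like the one below. The sk_buff_like type and skb_shared_like() are hypothetical stand-ins for the real structures, and the cast uses "const volatile" (a slight variation on the posted patch) so that no qualifier is dropped anywhere.

```c
typedef struct { int counter; } atomic_t;

/*
 * Taking a pointer-to-const lets callers that only hold a const
 * pointer read the counter without a "discards qualifiers" warning.
 * The volatile cast still forces a fresh load on every call.
 */
static inline int atomic_read(const atomic_t *v)
{
	return *(const volatile int *)&v->counter;
}

/* Hypothetical stand-in for struct sk_buff; only the field we need. */
struct sk_buff_like {
	atomic_t users;
};

/* Same shape as skb_shared(): the parameter is const, so the call
 * below compiles cleanly only because atomic_read() accepts const. */
static inline int skb_shared_like(const struct sk_buff_like *skb)
{
	return atomic_read(&skb->users) != 1;
}
```

Changing the atomic_read() parameter back to a plain "atomic_t *v" makes gcc emit the warning quoted in the message above for skb_shared_like().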
[RFC][PATCH][BUGFIX] fix rcu_read_lock in page migration
This is a patch for the problem Shaohua reported. It is just an idea for
fixing the problem. What do you think? Is a dummy vma better? (I don't
like the dummy vma.)

-Kame
==
In the migration fallback path, write_page() or lock_page() will be
called. This causes a sleep while holding rcu_read_lock().

To avoid that, take rcu_read_lock() only if the page is Anon (this is
enough).

Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>
---
 mm/migrate.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

Index: linux-2.6.23-rc2-mm2/mm/migrate.c
===
--- linux-2.6.23-rc2-mm2.orig/mm/migrate.c
+++ linux-2.6.23-rc2-mm2/mm/migrate.c
@@ -611,6 +611,7 @@ static int unmap_and_move(new_page_t get
 	int rc = 0;
 	int *result = NULL;
 	struct page *newpage = get_new_page(page, private, &result);
+	int rcu_locked = 0;
 	if (!newpage)
 		return -ENOMEM;
@@ -636,8 +637,13 @@ static int unmap_and_move(new_page_t get
 	 * we cannot notice that anon_vma is freed while we migrates a page.
 	 * This rcu_read_lock() delays freeing anon_vma pointer until the end
 	 * of migration. File cache pages are no problem because of page_lock()
+	 * File Caches may use write_page() or lock_page() in migration, then,
+	 * just care Anon page here.
 	 */
-	rcu_read_lock();
+	if (PageAnon(page)) {
+		rcu_read_lock();
+		rcu_locked = 1;
+	}
 	/*
 	 * This is a corner case handling.
 	 * When a new swap-cache is read into, it is linked to LRU
@@ -656,7 +662,8 @@ static int unmap_and_move(new_page_t get
 	if (rc)
 		remove_migration_ptes(page, page);
 rcu_unlock:
-	rcu_read_unlock();
+	if (rcu_locked)
+		rcu_read_unlock();
 unlock:
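The locking pattern in this patch (take the read-side lock only for anonymous pages, remember that in a flag, and unlock only if the flag is set) can be modelled in user space roughly like this. All fake_* names and the two counters are invented for the sketch; they are not kernel APIs, and the real migration path is far more involved.

```c
#include <stdbool.h>

/* User-space stand-ins; none of these are kernel APIs. */
int rcu_depth;		/* > 0 while inside a fake read-side section */
int sleeps_under_rcu;	/* counts the bug the patch wants to avoid */

static void fake_rcu_read_lock(void)   { rcu_depth++; }
static void fake_rcu_read_unlock(void) { rcu_depth--; }

/* Models write_page()/lock_page(): may sleep, so record any call
 * made while the fake read-side lock is held. */
static void fake_sleeping_fallback(void)
{
	if (rcu_depth > 0)
		sleeps_under_rcu++;
}

struct fake_page {
	bool anon;	/* models PageAnon(page) */
};

int fake_unmap_and_move(const struct fake_page *page)
{
	int rcu_locked = 0;

	/* Only anonymous pages need the read-side lock. */
	if (page->anon) {
		fake_rcu_read_lock();
		rcu_locked = 1;
	} else {
		/* File-backed pages may take the sleeping fallback path. */
		fake_sleeping_fallback();
	}

	/* Unlock exactly when we locked, so the nesting stays balanced. */
	if (rcu_locked)
		fake_rcu_read_unlock();
	return 0;
}
```

Keeping the decision in a local flag rather than re-testing the page at unlock time means the unlock cannot go out of step with the lock even if the page state changes in between.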
Re: [2.6 patch] ppc .gitignore update
Adrian Bunk writes:

> From: Grant Likely <[EMAIL PROTECTED]>
>
> arch/ppc/.gitignore shouldn't exclude arch/ppc/boot/include

Already in my for-2.6.24 and master branches.

Paul.
Re: [PATCH] Smack: Simplified Mandatory Access Control Kernel
--- Kyle Moffett <[EMAIL PROTECTED]> wrote:

> On Aug 21, 2007, at 11:50:48, Casey Schaufler wrote:
> > --- Kyle Moffett <[EMAIL PROTECTED]> wrote:
> >> Well, in this case the "box" I want to secure will eventually be
> >> running multi-user X on a multi-level-with-IPsec network. For
> >> that kind of protection profile, there is presently no substitute
> >> for SELinux with some X11 patches. AppArmor certainly doesn't
> >> meet the confidentiality requirements (no data labelling), and
> >> SMACK has no way of doing the very tight per-syscall security
> >> requirements we have to meet.
> >
> > And what requirements would those be? Seriously, I've done Common
> > Criteria and TCSEC evaluations on systems with less flexibility and
> > granularity than Smack that included X, NFSv3, NIS, clusters, and
> > all sorts of spiffy stuff.
>
> These are requirements more of the "give the client warm fuzzies".

OK, that's perfectly reasonable. If the client has been sold on the
concept of SELinux the client will get warm fuzzies only from SELinux.
Security is how you feel about it, after all.

> On the other hand, when designing a box that could theoretically be
> run on a semi-public unclassified network and yet still be safe
> enough to run classified data over IPsec links, you want to give the
> client all the warm fuzzies they ask for and more.

Yes. Of course, a little hard technology behind it doesn't hurt, either.

> > I mean, if the requirement is anything short of "runs SELinux" I
> > have good reason to believe that a Smack based system is up to it.
>
> "up to it", yes, but I think you'll find that beyond the simplest
> policies, an SELinux policy that properly uses the SELinux
> infrastructure will be much shorter than the equivalent SMACK policy,

Well, I find that hard to believe. Maybe I'm only thinking of what
you would consider the simplest policies.

> not even including all the things that SELinux does and SMACK doesn't.

Of course.

> >> I didn't make this clear initially but that is the kind of system
> >> I'm talking about wanting to secure some 50 million lines of code on.
> >
> > Cool. SELinux provides one approach to dealing with that, and the
> > huge multiuser general purpose machine chock full of legacy
> > software hits the SELinux sweet spot.
>
> Well, given that 99.9% of the systems people are really concerned
> about security on are multi-user general-purpose machines chock full
> of legacy software, that seems to work just fine.

Err, no. By unit count such systems are extremely rare. There is
tremendous concern for security in your cell phone, your DVR, your
PDA, and even your toaster.

> If it's a single-user box then you don't even need MAC, just a
> firewall, a good locked rack/case/keyboard/etc, and decent physical
> security.

Your cell phone has really lousy physical security.

> If it's
> entirely custom-controlled software then you can just implement the
> "MAC" entirely in your own software. "General-purpose" vs
> "special-purpose" is debatable, so I'll just leave that one lie.

Indeed. Total control over the software on your phone is not a
competitive option for a provider.

> Replying to another email:
> >> but you written it in wrong language. You written it in C, while
> >> you should have written it in SELinux policy language (and your
> >> favourite scripting language as frontend).
> >
> > I have often marvelled at the notion of a simplification layer. I
> > believe that you build complex things on top of simple things, not
> > the other way around.
>
> There is no "one answer" to this question in software development.

You're correct. Can I quote you on that?

> Generally you prioritize things based on maximizing maintainability
> and speed and minimizing code, bugs, and complexity. Those are often
> both conflicting and in agreement. Here are a few common examples of
> simple-thing-on-complex-thing:
> ...
>
> Look at the SELinux model again; it has the following things:
>   (A) Labels on almost-all user-visible kernel objects
>   (B) Individual access rules for almost every operation on those
>       objects
>   (C) "Transition" rules to set the label on newly created objects.
>   (D) Fundamental "constraints" which enforce hard limits on what
>       may be permitted with "allow" rules
>
> From a fundamental standpoint it's harder to get much simpler than
> that.

It's easy to get simpler than that:

  (A) Labels on all objects and subjects
  (B) Access rules for subjects and objects

No transformations. Operations in terms of rwx. lots simpler.

> On top of that model, we also have a bit of additional
> *flexibility* for MLS/RBAC, although that flexibility may be ignored
> completely.
>   (1) You can define "users" which may only assume some "roles"
>   (2) You can define "roles" may only run in some "types"
>   (3) There's a simple way of declaring multiple "levels" and
[PATCH 10/11] cxgb3 - Firmware update
From: Divy Le Ray <[EMAIL PROTECTED]>

Update firmware version
Allow the driver to be up and running with older FW image

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/common.h     |2 +-
 drivers/net/cxgb3/cxgb3_main.c |9 +
 drivers/net/cxgb3/t3_hw.c      | 20 +++-
 drivers/net/cxgb3/version.h    |2 +-
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index b665b20..ff867c2 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -691,7 +691,7 @@ int t3_read_flash(struct adapter *adapter, unsigned int addr,
 		  unsigned int nwords, u32 *data, int byte_oriented);
 int t3_load_fw(struct adapter *adapter, const u8 * fw_data, unsigned int size);
 int t3_get_fw_version(struct adapter *adapter, u32 *vers);
-int t3_check_fw_version(struct adapter *adapter);
+int t3_check_fw_version(struct adapter *adapter, int *must_load);
 int t3_init_hw(struct adapter *adapter, u32 fw_params);
 void mac_prep(struct cmac *mac, struct adapter *adapter, int index);
 void early_hw_init(struct adapter *adapter, const struct adapter_info *ai);
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index 65ded16..eaebd7f 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -814,11 +814,12 @@ static int cxgb_up(struct adapter *adap)
 	int must_load;
 	if (!(adap->flags & FULL_INIT_DONE)) {
-		err = t3_check_fw_version(adap);
-		if (err == -EINVAL)
+		err = t3_check_fw_version(adap, &must_load);
+		if (err == -EINVAL) {
 			err = upgrade_fw(adap);
-		if (err)
-			goto out;
+			if (err && must_load)
+				goto out;
+		}
 		err = t3_check_tpsram_version(adap, &must_load);
 		if (err == -EINVAL) {
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 63032e8..3d47627 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -957,16 +957,18 @@ int t3_get_fw_version(struct adapter *adapter, u32 *vers)
 /**
  *	t3_check_fw_version - check if the FW is compatible with this driver
  *	@adapter: the adapter
- *
+ *	@must_load: set to 1 if loading a new FW image is required
+
  *	Checks if an adapter's FW is compatible with the driver. Returns 0
  *	if the versions are compatible, a negative error otherwise.
  */
-int t3_check_fw_version(struct adapter *adapter)
+int t3_check_fw_version(struct adapter *adapter, int *must_load)
 {
 	int ret;
 	u32 vers;
 	unsigned int type, major, minor;
+	*must_load = 1;
 	ret = t3_get_fw_version(adapter, &vers);
 	if (ret)
 		return ret;
@@ -979,9 +981,17 @@ int t3_check_fw_version(struct adapter *adapter)
 	    minor == FW_VERSION_MINOR)
 		return 0;
-	CH_ERR(adapter, "found wrong FW version(%u.%u), "
-	       "driver needs version %u.%u\n", major, minor,
-	       FW_VERSION_MAJOR, FW_VERSION_MINOR);
+	if (major != FW_VERSION_MAJOR)
+		CH_ERR(adapter, "found wrong FW version(%u.%u), "
+		       "driver needs version %u.%u\n", major, minor,
+		       FW_VERSION_MAJOR, FW_VERSION_MINOR);
+	else {
+		*must_load = 0;
+		CH_WARN(adapter, "found wrong FW minor version(%u.%u), "
+			"driver compiled for version %u.%u\n", major, minor,
+			FW_VERSION_MAJOR, FW_VERSION_MINOR);
+	}
+
 	return -EINVAL;
 }
diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
index eb508bf..ef1c633 100644
--- a/drivers/net/cxgb3/version.h
+++ b/drivers/net/cxgb3/version.h
@@ -39,6 +39,6 @@
 /* Firmware version */
 #define FW_VERSION_MAJOR 4
-#define FW_VERSION_MINOR 3
+#define FW_VERSION_MINOR 6
 #define FW_VERSION_MICRO 0
 #endif				/* __CHELSIO_VERSION_H */
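The decision logic this patch introduces (a major mismatch makes a firmware load mandatory, a minor mismatch only makes it desirable) can be isolated in a small user-space model. The function names below and the upgrade_result parameter, which stands in for the return value of upgrade_fw(), are assumptions made for the sketch; the version constants are the ones from the patch.

```c
#include <errno.h>

enum { FW_VERSION_MAJOR = 4, FW_VERSION_MINOR = 6 };

/*
 * Returns 0 when the running firmware matches what the driver was
 * built for, -EINVAL otherwise. *must_load tells the caller whether a
 * failed upgrade is fatal (major mismatch) or tolerable (minor only).
 */
int check_fw_version(unsigned int major, unsigned int minor, int *must_load)
{
	*must_load = 1;
	if (major == FW_VERSION_MAJOR && minor == FW_VERSION_MINOR)
		return 0;
	if (major == FW_VERSION_MAJOR)
		*must_load = 0;	/* older minor: warn, but keep going */
	return -EINVAL;
}

/*
 * Caller-side logic in the style of cxgb_up(). upgrade_result models
 * what a firmware upgrade attempt would return.
 * Returns 0 if bring-up may continue, a negative error otherwise.
 */
int bring_up(unsigned int major, unsigned int minor, int upgrade_result)
{
	int must_load;
	int err = check_fw_version(major, minor, &must_load);

	if (err == -EINVAL) {
		err = upgrade_result;
		if (err && must_load)
			return err;
	}
	return 0;
}
```

For example, bring_up(4, 3, -ENOENT) continues despite the failed upgrade because only the minor number differs, while bring_up(3, 0, -ENOENT) stops with the upgrade error.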
[PATCH 11/11] cxgb3 - log and clear PEX errors
From: Divy Le Ray <[EMAIL PROTECTED]>

Clear pciE PEX errors late at module load time.
Log details when PEX errors occur.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/t3_hw.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 3d47627..538b254 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -1355,6 +1355,10 @@ static void pcie_intr_handler(struct adapter *adapter)
 		{0}
 	};
+	if (t3_read_reg(adapter, A_PCIE_INT_CAUSE) & F_PEXERR)
+		CH_ALERT(adapter, "PEX error code 0x%x\n",
+			 t3_read_reg(adapter, A_PCIE_PEX_ERR));
+
 	if (t3_handle_intr_status(adapter, A_PCIE_INT_CAUSE, PCIE_INTR_MASK,
 				  pcie_intr_info, adapter->irq_stats))
 		t3_fatal_err(adapter);
@@ -1806,6 +1810,8 @@ void t3_intr_clear(struct adapter *adapter)
 	for (i = 0; i < ARRAY_SIZE(cause_reg_addr); ++i)
 		t3_write_reg(adapter, cause_reg_addr[i], 0xffffffff);
+	if (is_pcie(adapter))
+		t3_write_reg(adapter, A_PCIE_PEX_ERR, 0xffffffff);
 	t3_write_reg(adapter, A_PL_INT_CAUSE0, 0xffffffff);
 	t3_read_reg(adapter, A_PL_INT_CAUSE0);	/* flush */
 }
[PATCH 8/11] cxgb3 - Update internal memory management
From: Divy Le Ray <[EMAIL PROTECTED]>

Set PM1 internal memory to round robin mode
It balances access to this internal memory for multiport adapters.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/regs.h  |2 ++
 drivers/net/cxgb3/t3_hw.c |2 ++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index 2824278..5e1bc0d 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -1326,6 +1326,7 @@
 #define V_D0_WEIGHT(x) ((x) << S_D0_WEIGHT)
 #define A_PM1_RX_CFG 0x5c0
+#define A_PM1_RX_MODE 0x5c4
 #define A_PM1_RX_INT_ENABLE 0x5d8
@@ -1394,6 +1395,7 @@
 #define A_PM1_RX_INT_CAUSE 0x5dc
 #define A_PM1_TX_CFG 0x5e0
+#define A_PM1_TX_MODE 0x5e4
 #define A_PM1_TX_INT_ENABLE 0x5f8
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 23b1a16..13bfbec 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -3189,6 +3189,8 @@ int t3_init_hw(struct adapter *adapter, u32 fw_params)
 		t3_set_reg_field(adapter, A_PCIX_CFG, 0, F_CLIDECEN);
 	t3_write_reg(adapter, A_PM1_RX_CFG, 0xffffffff);
+	t3_write_reg(adapter, A_PM1_RX_MODE, 0);
+	t3_write_reg(adapter, A_PM1_TX_MODE, 0);
 	init_hw_for_avail_ports(adapter, adapter->params.nports);
 	t3_sge_init(adapter, &adapter->params.sge);
[PATCH 9/11] cxgb3 - engine microcode update
From: Divy Le Ray <[EMAIL PROTECTED]>

Load microcode engine when the interface is configured up.
Bump up version to 1.1.0.
Allow the driver to be up and running with older microcode images.
Allow ethtool to log the microcode version.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/common.h     |8 ++-
 drivers/net/cxgb3/cxgb3_main.c | 116
 drivers/net/cxgb3/t3_hw.c      | 43 +--
 3 files changed, 113 insertions(+), 54 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index d54446f..b665b20 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -127,8 +127,8 @@ enum { /* adapter interrupt-maintained statistics */
 enum {
 	TP_VERSION_MAJOR= 1,
-	TP_VERSION_MINOR= 0,
-	TP_VERSION_MICRO= 44
+	TP_VERSION_MINOR= 1,
+	TP_VERSION_MICRO= 0
 };
 #define S_TP_VERSION_MAJOR 16
@@ -438,6 +438,7 @@ enum { /* chip revisions */
 	T3_REV_A = 0,
 	T3_REV_B = 2,
 	T3_REV_B2 = 3,
+	T3_REV_C = 4,
 };
 struct trace_params {
@@ -682,7 +683,8 @@ const struct adapter_info *t3_get_adapter_info(unsigned int board_id);
 int t3_seeprom_read(struct adapter *adapter, u32 addr, u32 *data);
 int t3_seeprom_write(struct adapter *adapter, u32 addr, u32 data);
 int t3_seeprom_wp(struct adapter *adapter, int enable);
-int t3_check_tpsram_version(struct adapter *adapter);
+int t3_get_tp_version(struct adapter *adapter, u32 *vers);
+int t3_check_tpsram_version(struct adapter *adapter, int *must_load);
 int t3_check_tpsram(struct adapter *adapter, u8 *tp_ram, unsigned int size);
 int t3_set_proto_sram(struct adapter *adap, u8 *data);
 int t3_read_flash(struct adapter *adapter, unsigned int addr,
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index e5744e7..65ded16 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -721,6 +721,7 @@ static void bind_qsets(struct adapter *adap)
 }
 #define FW_FNAME "t3fw-%d.%d.%d.bin"
+#define TPSRAM_NAME "t3%c_protocol_sram-%d.%d.%d.bin"
 static int upgrade_fw(struct adapter *adap)
 {
@@ -742,6 +743,61 @@ static int upgrade_fw(struct adapter *adap)
 	return ret;
 }
+static inline char t3rev2char(struct adapter *adapter)
+{
+	char rev = 0;
+
+	switch(adapter->params.rev) {
+	case T3_REV_A:
+		rev = 'a';
+		break;
+	case T3_REV_B:
+	case T3_REV_B2:
+		rev = 'b';
+		break;
+	case T3_REV_C:
+		rev = 'c';
+		break;
+	}
+	return rev;
+}
+
+int update_tpsram(struct adapter *adap)
+{
+	const struct firmware *tpsram;
+	char buf[64];
+	struct device *dev = &adap->pdev->dev;
+	int ret;
+	char rev;
+
+	rev = t3rev2char(adap);
+	if (!rev)
+		return 0;
+
+	snprintf(buf, sizeof(buf), TPSRAM_NAME, rev,
+		 TP_VERSION_MAJOR, TP_VERSION_MINOR, TP_VERSION_MICRO);
+
+	ret = request_firmware(&tpsram, buf, dev);
+	if (ret < 0) {
+		dev_err(dev, "could not load TP SRAM: unable to load %s\n",
+			buf);
+		return ret;
+	}
+
+	ret = t3_check_tpsram(adap, tpsram->data, tpsram->size);
+	if (ret)
+		goto release_tpsram;
+
+	ret = t3_set_proto_sram(adap, tpsram->data);
+	if (ret)
+		dev_err(dev, "loading protocol SRAM failed\n");
+
+release_tpsram:
+	release_firmware(tpsram);
+
+	return ret;
+}
+
 /**
  *	cxgb_up - enable the adapter
  *	@adapter: adapter being enabled
@@ -755,6 +811,7 @@ static int upgrade_fw(struct adapter *adap)
 static int cxgb_up(struct adapter *adap)
 {
 	int err = 0;
+	int must_load;
 	if (!(adap->flags & FULL_INIT_DONE)) {
 		err = t3_check_fw_version(adap);
@@ -763,6 +820,13 @@ static int cxgb_up(struct adapter *adap)
 		if (err)
 			goto out;
+		err = t3_check_tpsram_version(adap, &must_load);
+		if (err == -EINVAL) {
+			err = update_tpsram(adap);
+			if (err && must_load)
+				goto out;
+		}
+
 		err = init_dummy_netdevs(adap);
 		if (err)
 			goto out;
@@ -1097,9 +1161,11 @@ static int get_eeprom_len(struct net_device *dev)
 static void get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 {
 	u32 fw_vers = 0;
+	u32 tp_vers = 0;
 	struct adapter *adapter = dev->priv;
 	t3_get_fw_version(adapter, &fw_vers);
+	t3_get_tp_version(adapter, &tp_vers);
 	strcpy(info->driver, DRV_NAME);
 	strcpy(info->version, DRV_VERSION);
@@
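How the microcode image name is derived from the chip revision can be tried out in user space with a sketch like the one below. The enum values and the format string are taken from the patch; tpsram_file_name() and the plain unsigned int revision argument are hypothetical simplifications, since the driver works on a struct adapter instead.

```c
#include <stdio.h>

/* Values as they appear in the patch. */
enum { T3_REV_A = 0, T3_REV_B = 2, T3_REV_B2 = 3, T3_REV_C = 4 };
enum { TP_VERSION_MAJOR = 1, TP_VERSION_MINOR = 1, TP_VERSION_MICRO = 0 };

#define TPSRAM_NAME "t3%c_protocol_sram-%d.%d.%d.bin"

/* Map a chip revision to the letter used in the image name; 0 means
 * the revision is unknown and no image should be requested. */
char t3rev2char(unsigned int rev)
{
	switch (rev) {
	case T3_REV_A:
		return 'a';
	case T3_REV_B:
	case T3_REV_B2:
		return 'b';
	case T3_REV_C:
		return 'c';
	}
	return 0;
}

/* Build the image file name into buf.
 * Returns 0 on success, -1 for an unknown revision or a short buffer. */
int tpsram_file_name(unsigned int rev, char *buf, size_t len)
{
	char c = t3rev2char(rev);
	int n;

	if (!c)
		return -1;
	n = snprintf(buf, len, TPSRAM_NAME, c,
		     TP_VERSION_MAJOR, TP_VERSION_MINOR, TP_VERSION_MICRO);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Both B revisions map to the same letter, so a T3_REV_B2 part requests the same image as a T3_REV_B part.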
[PATCH 6/11 RESEND] cxgb3 - Fatal error update
From: Divy Le Ray <[EMAIL PROTECTED]>

Stop the MAC when a fatal error is detected.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/cxgb3_main.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index dc5d269..a1f94cf 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -2270,6 +2270,10 @@ void t3_fatal_err(struct adapter *adapter)
 	if (adapter->flags & FULL_INIT_DONE) {
 		t3_sge_stop(adapter);
+		t3_write_reg(adapter, A_XGM_TX_CTRL, 0);
+		t3_write_reg(adapter, A_XGM_RX_CTRL, 0);
+		t3_write_reg(adapter, XGM_REG(A_XGM_TX_CTRL, 1), 0);
+		t3_write_reg(adapter, XGM_REG(A_XGM_RX_CTRL, 1), 0);
 		t3_intr_disable(adapter);
 	}
 	CH_ALERT(adapter, "encountered fatal error, operation suspended\n");
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 7/11 RESEND] cxgb3 - log adapter serial number
From: Divy Le Ray <[EMAIL PROTECTED]>

Log HW serial number when cxgb3 module is loaded.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/common.h     |    2 ++
 drivers/net/cxgb3/cxgb3_main.c |    6 ++++--
 drivers/net/cxgb3/t3_hw.c      |    3 ++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index 55922ed..d54446f 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -97,6 +97,7 @@ enum {
 	MAX_NPORTS = 2,		/* max # of ports */
 	MAX_FRAME_SIZE = 10240,	/* max MAC frame size, including header + FCS */
 	EEPROMSIZE = 8192,	/* Serial EEPROM size */
+	SERNUM_LEN = 16,	/* Serial # length */
 	RSS_TABLE_SIZE = 64,	/* size of RSS lookup and mapping tables */
 	TCB_SIZE = 128,		/* TCB size */
 	NMTUS = 16,		/* size of MTU table */
@@ -391,6 +392,7 @@ struct vpd_params {
 	unsigned int uclk;
 	unsigned int mdc;
 	unsigned int mem_timing;
+	u8 sn[SERNUM_LEN + 1];
 	u8 eth_base[6];
 	u8 port_type[MAX_NPORTS];
 	unsigned short xauicfg[2];
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index a1f94cf..e5744e7 100644
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -2333,10 +2333,12 @@ static void __devinit print_port_info(struct adapter *adap,
 		       (adap->flags & USING_MSIX) ? " MSI-X" :
 		       (adap->flags & USING_MSI) ? " MSI" : "");
 		if (adap->name == dev->name && adap->params.vpd.mclk)
-			printk(KERN_INFO "%s: %uMB CM, %uMB PMTX, %uMB PMRX\n",
+			printk(KERN_INFO
+			       "%s: %uMB CM, %uMB PMTX, %uMB PMRX, S/N: %s\n",
 			       adap->name, t3_mc7_size(&adap->cm) >> 20,
 			       t3_mc7_size(&adap->pmtx) >> 20,
-			       t3_mc7_size(&adap->pmrx) >> 20);
+			       t3_mc7_size(&adap->pmrx) >> 20,
+			       adap->params.vpd.sn);
 	}
 }
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index dd3149d..23b1a16 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -505,7 +505,7 @@ struct t3_vpd {
 	u8 vpdr_len[2];
 	VPD_ENTRY(pn, 16);	/* part number */
 	VPD_ENTRY(ec, 16);	/* EC level */
-	VPD_ENTRY(sn, 16);	/* serial number */
+	VPD_ENTRY(sn, SERNUM_LEN); /* serial number */
 	VPD_ENTRY(na, 12);	/* MAC address base */
 	VPD_ENTRY(cclk, 6);	/* core clock */
 	VPD_ENTRY(mclk, 6);	/* mem clock */
@@ -648,6 +648,7 @@ static int get_vpd_params(struct adapter *adapter, struct vpd_params *p)
 	p->uclk = simple_strtoul(vpd.uclk_data, NULL, 10);
 	p->mdc = simple_strtoul(vpd.mdc_data, NULL, 10);
 	p->mem_timing = simple_strtoul(vpd.mt_data, NULL, 10);
+	memcpy(p->sn, vpd.sn_data, SERNUM_LEN);
 
 	/* Old eeproms didn't have port information */
 	if (adapter->params.rev == 0 && !vpd.port0_data[0]) {
[PATCH 3/11 RESEND] cxgb3 - use immediate data for offload Tx
From: Divy Le Ray <[EMAIL PROTECTED]>

Send small TX_DATA work requests as immediate data even when there are fragments. This avoids doing multiple DMAs for small fragmented packets. The driver already implements this optimization for small contiguous packets.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/sge.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 9213cda..dca2716 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -1182,8 +1182,8 @@ int t3_eth_xmit(struct sk_buff *skb, struct net_device *dev)
  *
  * Writes a packet as immediate data into a Tx descriptor. The packet
  * contains a work request at its beginning. We must write the packet
- * carefully so the SGE doesn't read accidentally before it's written in
- * its entirety.
+ * carefully so the SGE doesn't read it accidentally before it's written
+ * in its entirety.
  */
 static inline void write_imm(struct tx_desc *d, struct sk_buff *skb,
 			     unsigned int len, unsigned int gen)
@@ -1191,7 +1191,11 @@ static inline void write_imm(struct tx_desc *d, struct sk_buff *skb,
 	struct work_request_hdr *from = (struct work_request_hdr *)skb->data;
 	struct work_request_hdr *to = (struct work_request_hdr *)d;
 
-	memcpy(&to[1], &from[1], len - sizeof(*from));
+	if (likely(!skb->data_len))
+		memcpy(&to[1], &from[1], len - sizeof(*from));
+	else
+		skb_copy_bits(skb, sizeof(*from), &to[1], len - sizeof(*from));
+
 	to->wr_hi = from->wr_hi | htonl(F_WR_SOP | F_WR_EOP |
 					V_WR_BCNTLFLT(len & 7));
 	wmb();
@@ -1261,7 +1265,7 @@ static inline void reclaim_completed_tx_imm(struct sge_txq *q)
 static inline int immediate(const struct sk_buff *skb)
 {
-	return skb->len <= WR_LEN && !skb->data_len;
+	return skb->len <= WR_LEN;
 }
 
 /**
@@ -1467,12 +1471,13 @@ static void write_ofld_wr(struct adapter *adap, struct sk_buff *skb,
  */
 static inline unsigned int calc_tx_descs_ofld(const struct sk_buff *skb)
 {
-	unsigned int flits, cnt = skb_shinfo(skb)->nr_frags;
+	unsigned int flits, cnt;
 
-	if (skb->len <= WR_LEN && cnt == 0)
+	if (skb->len <= WR_LEN)
 		return 1;	/* packet fits as immediate data */
 
 	flits = skb_transport_offset(skb) / 8;	/* headers */
+	cnt = skb_shinfo(skb)->nr_frags;
 	if (skb->tail != skb->transport_header)
 		cnt++;
 	return flits_to_desc(flits + sgl_len(cnt));
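The idea in this patch (decide on total length alone, then gather the bytes whether or not the packet is fragmented) can be sketched in user space. This is only an illustration: `struct pkt`, `struct frag`, `SKETCH_WR_LEN` and the helper names are hypothetical stand-ins, not the driver's types, and the 128-byte limit is an assumed value rather than the real `WR_LEN`.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_WR_LEN    128	/* assumed payload limit, not the driver's WR_LEN */
#define SKETCH_MAX_FRAGS 4

/* Hypothetical fragment: a pointer/length pair outside the linear area. */
struct frag {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical packet: a linear head followed by optional fragments. */
struct pkt {
	const unsigned char *head;
	size_t head_len;
	struct frag frags[SKETCH_MAX_FRAGS];
	size_t nr_frags;
};

/* Total length, linear part plus all fragments. */
static size_t pkt_len(const struct pkt *p)
{
	size_t n = p->head_len;

	for (size_t i = 0; i < p->nr_frags; i++)
		n += p->frags[i].len;
	return n;
}

/* As in the patched immediate(): only the total length matters. */
static int fits_immediate(const struct pkt *p)
{
	return pkt_len(p) <= SKETCH_WR_LEN;
}

/*
 * Gather the packet into dst (at least SKETCH_WR_LEN bytes).
 * Returns the number of bytes written, or 0 if the packet is too large.
 */
static size_t write_immediate(unsigned char *dst, const struct pkt *p)
{
	size_t off;

	if (!fits_immediate(p))
		return 0;

	memcpy(dst, p->head, p->head_len);
	off = p->head_len;
	for (size_t i = 0; i < p->nr_frags; i++) {
		memcpy(dst + off, p->frags[i].data, p->frags[i].len);
		off += p->frags[i].len;
	}
	return off;
}
```

In the driver the gather step is done by `skb_copy_bits()`, and the plain `memcpy()` is kept as the fast path for packets with no paged data.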
[PATCH 4/11 RESEND] cxgb3 - Expose HW memory page info
From: Divy Le Ray <[EMAIL PROTECTED]>

A HW issue requires limiting the receive window size to 23 pages of internal memory. These pages can be configured to different sizes, thus the RDMA driver needs to know the page size to enforce the upper limit.
Also assign explicit enum values.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/cxgb3_ctl_defs.h |   52 ++++++++++++++++++++++---------------
 drivers/net/cxgb3/cxgb3_offload.c  |    7 +++++++
 2 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_ctl_defs.h b/drivers/net/cxgb3/cxgb3_ctl_defs.h
index 2095dda..6c4f320 100644
--- a/drivers/net/cxgb3/cxgb3_ctl_defs.h
+++ b/drivers/net/cxgb3/cxgb3_ctl_defs.h
@@ -33,27 +33,29 @@
 #define _CXGB3_OFFLOAD_CTL_DEFS_H
 
 enum {
-	GET_MAX_OUTSTANDING_WR,
-	GET_TX_MAX_CHUNK,
-	GET_TID_RANGE,
-	GET_STID_RANGE,
-	GET_RTBL_RANGE,
-	GET_L2T_CAPACITY,
-	GET_MTUS,
-	GET_WR_LEN,
-	GET_IFF_FROM_MAC,
-	GET_DDP_PARAMS,
-	GET_PORTS,
-
-	ULP_ISCSI_GET_PARAMS,
-	ULP_ISCSI_SET_PARAMS,
-
-	RDMA_GET_PARAMS,
-	RDMA_CQ_OP,
-	RDMA_CQ_SETUP,
-	RDMA_CQ_DISABLE,
-	RDMA_CTRL_QP_SETUP,
-	RDMA_GET_MEM,
+	GET_MAX_OUTSTANDING_WR	= 0,
+	GET_TX_MAX_CHUNK	= 1,
+	GET_TID_RANGE		= 2,
+	GET_STID_RANGE		= 3,
+	GET_RTBL_RANGE		= 4,
+	GET_L2T_CAPACITY	= 5,
+	GET_MTUS		= 6,
+	GET_WR_LEN		= 7,
+	GET_IFF_FROM_MAC	= 8,
+	GET_DDP_PARAMS		= 9,
+	GET_PORTS		= 10,
+
+	ULP_ISCSI_GET_PARAMS	= 11,
+	ULP_ISCSI_SET_PARAMS	= 12,
+
+	RDMA_GET_PARAMS		= 13,
+	RDMA_CQ_OP		= 14,
+	RDMA_CQ_SETUP		= 15,
+	RDMA_CQ_DISABLE		= 16,
+	RDMA_CTRL_QP_SETUP	= 17,
+	RDMA_GET_MEM		= 18,
+
+	GET_RX_PAGE_INFO	= 50,
 };
 
 /*
@@ -161,4 +163,12 @@ struct rdma_ctrlqp_setup {
 	unsigned long long base_addr;
 	unsigned int size;
 };
+
+/*
+ * Offload TX/RX page information.
+ */
+struct ofld_page_info {
+	unsigned int page_size;	/* Page size, should be a power of 2 */
+	unsigned int num;	/* Number of pages */
+};
 #endif /* _CXGB3_OFFLOAD_CTL_DEFS_H */
diff --git a/drivers/net/cxgb3/cxgb3_offload.c b/drivers/net/cxgb3/cxgb3_offload.c
index e620ed4..522c1be 100644
--- a/drivers/net/cxgb3/cxgb3_offload.c
+++ b/drivers/net/cxgb3/cxgb3_offload.c
@@ -317,6 +317,8 @@ static int cxgb_offload_ctl(struct t3cdev *tdev, unsigned int req, void *data)
 	struct iff_mac *iffmacp;
 	struct ddp_params *ddpp;
 	struct adap_ports *ports;
+	struct ofld_page_info *rx_page_info;
+	struct tp_params *tp = &adapter->params.tp;
 	int i;
 
 	switch (req) {
@@ -382,6 +384,11 @@ static int cxgb_offload_ctl(struct t3cdev *tdev, unsigned int req, void *data)
 		if (!offload_running(adapter))
 			return -EAGAIN;
 		return cxgb_rdma_ctl(adapter, req, data);
+	case GET_RX_PAGE_INFO:
+		rx_page_info = data;
+		rx_page_info->page_size = tp->rx_pg_size;
+		rx_page_info->num = tp->rx_num_pgs;
+		break;
 	default:
 		return -EOPNOTSUPP;
 	}
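How a consumer might turn the page information into the 23-page receive window ceiling can be sketched as follows. The patch only exports the page size and count; the clamping helper, its name and the `_sketch` types are assumptions made for illustration and do not come from the RDMA driver.

```c
/* Mirrors the two fields of struct ofld_page_info from the patch. */
struct page_info_sketch {
	unsigned int page_size;	/* page size in bytes, a power of 2 */
	unsigned int num;	/* number of pages */
};

/* The 23-page ceiling is taken from the patch description. */
#define SKETCH_MAX_RCV_WND_PAGES 23U

/* Largest receive window, in bytes, allowed for this page size. */
static unsigned long long max_rcv_wnd(const struct page_info_sketch *info)
{
	return (unsigned long long)SKETCH_MAX_RCV_WND_PAGES * info->page_size;
}

/* Clamp a requested window to the hardware ceiling (hypothetical helper). */
static unsigned long long clamp_rcv_wnd(unsigned long long wanted,
					const struct page_info_sketch *info)
{
	unsigned long long limit = max_rcv_wnd(info);

	return wanted < limit ? wanted : limit;
}
```

With 4096-byte pages the ceiling works out to 23 * 4096 = 94208 bytes. The multiplication is done in a 64-bit type so that large page sizes cannot overflow.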
[PATCH 5/11 RESEND] cxgb3 - tighten checks on TID values
From: Divy Le Ray <[EMAIL PROTECTED]>

Enforce validity checks on connection ids

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/cxgb3_defs.h    |   20 ++++++++++++++++++--
 drivers/net/cxgb3/cxgb3_offload.c |   28 +++++++++++++++++++++++-----
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_defs.h b/drivers/net/cxgb3/cxgb3_defs.h
index 483a594..45e9216 100644
--- a/drivers/net/cxgb3/cxgb3_defs.h
+++ b/drivers/net/cxgb3/cxgb3_defs.h
@@ -79,9 +79,17 @@ static inline struct t3c_tid_entry *lookup_tid(const struct tid_info *t,
 static inline struct t3c_tid_entry *lookup_stid(const struct tid_info *t,
 						unsigned int tid)
 {
+	union listen_entry *e;
+
 	if (tid < t->stid_base || tid >= t->stid_base + t->nstids)
 		return NULL;
-	return &(stid2entry(t, tid)->t3c_tid);
+
+	e = stid2entry(t, tid);
+	if ((void *)e->next >= (void *)t->tid_tab &&
+	    (void *)e->next < (void *)&t->atid_tab[t->natids])
+		return NULL;
+
+	return &e->t3c_tid;
 }
 
 /*
@@ -90,9 +98,17 @@ static inline struct t3c_tid_entry *lookup_stid(const struct tid_info *t,
 static inline struct t3c_tid_entry *lookup_atid(const struct tid_info *t,
 						unsigned int tid)
 {
+	union active_open_entry *e;
+
 	if (tid < t->atid_base || tid >= t->atid_base + t->natids)
 		return NULL;
-	return &(atid2entry(t, tid)->t3c_tid);
+
+	e = atid2entry(t, tid);
+	if ((void *)e->next >= (void *)t->tid_tab &&
+	    (void *)e->next < (void *)&t->atid_tab[t->natids])
+		return NULL;
+
+	return &e->t3c_tid;
 }
 
 int process_rx(struct t3cdev *dev, struct sk_buff **skbs, int n);
diff --git a/drivers/net/cxgb3/cxgb3_offload.c b/drivers/net/cxgb3/cxgb3_offload.c
index 522c1be..7fb526a 100644
--- a/drivers/net/cxgb3/cxgb3_offload.c
+++ b/drivers/net/cxgb3/cxgb3_offload.c
@@ -57,7 +57,7 @@ static DEFINE_RWLOCK(adapter_list_lock);
 static LIST_HEAD(adapter_list);
 
 static const unsigned int MAX_ATIDS = 64 * 1024;
-static const unsigned int ATID_BASE = 0x10;
+static const unsigned int ATID_BASE = 0x1;
 
 static inline int offload_activated(struct t3cdev *tdev)
 {
@@ -684,10 +684,19 @@ static int do_cr(struct t3cdev *dev, struct sk_buff *skb)
 {
 	struct cpl_pass_accept_req *req = cplhdr(skb);
 	unsigned int stid = G_PASS_OPEN_TID(ntohl(req->tos_tid));
+	struct tid_info *t = &(T3C_DATA(dev))->tid_maps;
 	struct t3c_tid_entry *t3c_tid;
+	unsigned int tid = GET_TID(req);
 
-	t3c_tid = lookup_stid(&(T3C_DATA(dev))->tid_maps, stid);
-	if (t3c_tid->ctx && t3c_tid->client->handlers &&
+	if (unlikely(tid >= t->ntids)) {
+		printk("%s: passive open TID %u too large\n",
+		       dev->name, tid);
+		t3_fatal_err(tdev2adap(dev));
+		return CPL_RET_BUF_DONE;
+	}
+
+	t3c_tid = lookup_stid(t, stid);
+	if (t3c_tid && t3c_tid->ctx && t3c_tid->client->handlers &&
 	    t3c_tid->client->handlers[CPL_PASS_ACCEPT_REQ]) {
 		return t3c_tid->client->handlers[CPL_PASS_ACCEPT_REQ]
 		    (dev, skb, t3c_tid->ctx);
@@ -769,16 +778,25 @@ static int do_act_establish(struct t3cdev *dev, struct sk_buff *skb)
 {
 	struct cpl_act_establish *req = cplhdr(skb);
 	unsigned int atid = G_PASS_OPEN_TID(ntohl(req->tos_tid));
+	struct tid_info *t = &(T3C_DATA(dev))->tid_maps;
 	struct t3c_tid_entry *t3c_tid;
+	unsigned int tid = GET_TID(req);
 
-	t3c_tid = lookup_atid(&(T3C_DATA(dev))->tid_maps, atid);
+	if (unlikely(tid >= t->ntids)) {
+		printk("%s: active establish TID %u too large\n",
+		       dev->name, tid);
+		t3_fatal_err(tdev2adap(dev));
+		return CPL_RET_BUF_DONE;
+	}
+
+	t3c_tid = lookup_atid(t, atid);
 	if (t3c_tid && t3c_tid->ctx && t3c_tid->client->handlers &&
 	    t3c_tid->client->handlers[CPL_ACT_ESTABLISH]) {
 		return t3c_tid->client->handlers[CPL_ACT_ESTABLISH]
 		    (dev, skb, t3c_tid->ctx);
 	} else {
 		printk(KERN_ERR "%s: received clientless CPL command 0x%x\n",
-		       dev->name, CPL_PASS_ACCEPT_REQ);
+		       dev->name, CPL_ACT_ESTABLISH);
 		return CPL_RET_BUF_DONE | CPL_RET_BAD_MSG;
 	}
 }
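The two checks added to `lookup_stid()` and `lookup_atid()` (range check, then "is this slot still on the free list") can be modelled in plain C. This is a simplified sketch under assumptions: `struct tid_table`, `union slot` and the function names are hypothetical, and the table here is a single array rather than the driver's combined TID tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the in-use view of a table slot. */
struct tid_entry {
	void *ctx;
};

/* A slot is either an allocated entry or a link in the free list. */
union slot {
	struct tid_entry used;
	union slot *next;
};

struct tid_table {
	unsigned int base;	/* first valid id */
	unsigned int count;	/* number of slots */
	union slot *slots;
};

/* True if p points somewhere inside the slot array. */
static int points_into_table(const struct tid_table *t, const void *p)
{
	uintptr_t lo = (uintptr_t)t->slots;
	uintptr_t hi = (uintptr_t)(t->slots + t->count);
	uintptr_t v = (uintptr_t)p;

	return v >= lo && v < hi;
}

/*
 * Return the entry for tid, or NULL if tid is out of range or the slot
 * is a free-list link. The last free slot has next == NULL and is not
 * caught here, so callers still have to check ctx.
 */
static struct tid_entry *lookup(const struct tid_table *t, unsigned int tid)
{
	union slot *s;

	if (tid < t->base || tid >= t->base + t->count)
		return NULL;

	s = &t->slots[tid - t->base];
	if (points_into_table(t, s->next))
		return NULL;

	return &s->used;
}
```

This also shows why the patch adds `t3c_tid &&` to the condition in `do_cr()`: once the lookup can return NULL, every caller has to test the result before dereferencing it.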
[PATCH 2/11 RESEND] cxgb3 - SGE doorbell overflow warning
From: Divy Le Ray <[EMAIL PROTECTED]>

Log doorbell Fifo overflow

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/regs.h |    8 ++++++++
 drivers/net/cxgb3/sge.c  |    4 ++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index aa80313..2824278 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -172,6 +172,14 @@
 #define A_SG_INT_CAUSE 0x5c
 
+#define S_HIPIODRBDROPERR    11
+#define V_HIPIODRBDROPERR(x) ((x) << S_HIPIODRBDROPERR)
+#define F_HIPIODRBDROPERR    V_HIPIODRBDROPERR(1U)
+
+#define S_LOPIODRBDROPERR    10
+#define V_LOPIODRBDROPERR(x) ((x) << S_LOPIODRBDROPERR)
+#define F_LOPIODRBDROPERR    V_LOPIODRBDROPERR(1U)
+
 #define S_RSPQDISABLED    3
 #define V_RSPQDISABLED(x) ((x) << S_RSPQDISABLED)
 #define F_RSPQDISABLED    V_RSPQDISABLED(1U)
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index a2cfd68..9213cda 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -2476,6 +2476,10 @@ void t3_sge_err_intr_handler(struct adapter *adapter)
 			 "(0x%x)\n", (v >> S_RSPQ0DISABLED) & 0xff);
 	}
 
+	if (status & (F_HIPIODRBDROPERR | F_LOPIODRBDROPERR))
+		CH_ALERT(adapter, "SGE dropped %s priority doorbell\n",
+			 status & F_HIPIODRBDROPERR ? "high" : "lo");
+
 	t3_write_reg(adapter, A_SG_INT_CAUSE, status);
 	if (status & (F_RSPQCREDITOVERFOW | F_RSPQDISABLED))
 		t3_fatal_err(adapter);
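The `S_`/`V_`/`F_` macros follow a shift/value/flag naming pattern for register bit fields. A small stand-alone sketch of how the new flags decode a status word is below; the macro definitions are copied from the patch, while `dropped_doorbell_prio()` is a hypothetical helper that mirrors the ternary in the interrupt handler.

```c
#include <stddef.h>

/* Bit definitions as added to regs.h by the patch. */
#define S_HIPIODRBDROPERR    11
#define V_HIPIODRBDROPERR(x) ((x) << S_HIPIODRBDROPERR)
#define F_HIPIODRBDROPERR    V_HIPIODRBDROPERR(1U)

#define S_LOPIODRBDROPERR    10
#define V_LOPIODRBDROPERR(x) ((x) << S_LOPIODRBDROPERR)
#define F_LOPIODRBDROPERR    V_LOPIODRBDROPERR(1U)

/*
 * Return the priority word the handler would log, or NULL if neither
 * doorbell-drop bit is set in status.
 */
static const char *dropped_doorbell_prio(unsigned int status)
{
	if (!(status & (F_HIPIODRBDROPERR | F_LOPIODRBDROPERR)))
		return NULL;
	return (status & F_HIPIODRBDROPERR) ? "high" : "lo";
}
```

One consequence of the single ternary is that when both bits are set at once, only the high priority drop is reported.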
[PATCH 1/11 RESEND] cxgb3 - Update rx coalescing length
From: Divy Le Ray <[EMAIL PROTECTED]>

Reduce Rx coalescing length to 12288

Large bursts from the adapter to the host create back pressure on the chip. Reducing the burst size avoids the issue.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/common.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/cxgb3/common.h b/drivers/net/cxgb3/common.h
index c46c249..55922ed 100644
--- a/drivers/net/cxgb3/common.h
+++ b/drivers/net/cxgb3/common.h
@@ -104,7 +104,7 @@ enum {
 	PROTO_SRAM_LINES = 128, /* size of TP sram */
 };
 
-#define MAX_RX_COALESCING_LEN 16224U
+#define MAX_RX_COALESCING_LEN 12288U
 
 enum {
 	PAUSE_RX = 1 << 0,
[PATCH 0/11] cxgb3 - driver updates
Jeff,

I'm resubmitting the last cxgb3 patch series against netdev-2.6#upstream, minus the first patch that you already applied and the last patch.

Here is a brief description:
- Modify max HW Rx coalescing size
- Log SGE doorbell Fifo overflow
- Use Tx immediate data for offload packets whenever possible
- RDMA can get internal mem info to workaround HW issues
- More validity checks on connection ids
- Stop MAC when a fatal error is detected
- Log HW serial number
- Update internal mem operating mode
- Update engine microcode management, version is now 1.1.0
- Update FW management, version is now 4.6.0
- Ignore some HW errors until the HW is initialized

Cheers,
Divy
Re: [PATCH] Smack: Simplified Mandatory Access Control Kernel
On Aug 21, 2007, at 11:50:48, Casey Schaufler wrote: --- Kyle Moffett <[EMAIL PROTECTED]> wrote: Well, in this case the "box" I want to secure will eventually be running multi-user X on a multi-level-with-IPsec network. For that kind of protection profile, there is presently no substitute for SELinux with some X11 patches. AppArmor certainly doesn't meet the confidentiality requirements (no data labelling), and SMACK has no way of doing the very tight per-syscall security requirements we have to meet. And what requirements would those be? Seriously, I've done Common Criteria and TCSEC evaluations on systems with less flexibility and granularity than Smack that included X, NFSv3, NIS, clusters, and all sorts of spiffy stuff. These are requirements more of the "give the client warm fuzzies". On the other hand, when designing a box that could theoretically be run on a semi-public unclassified network and yet still be safe enough to run classified data over IPsec links, you want to give the client all the warm fuzzies they ask for and more. I mean, if the requirement is anything short of "runs SELinux" I have good reason to believe that a Smack based system is up to it. "up to it", yes, but I think you'll find that beyond the simplest policies, an SELinux policy that properly uses the SELinux infrastructure will be much shorter than the equivalent SMACK policy, not even including all the things that SELinux does and SMACK doesn't. I didn't make this clear initially but that is the kind of system I'm talking about wanting to secure some 50 million lines of code on. Cool. SELinux provides one approach to dealing with that, and the huge multiuser general purpose machine chuck full of legacy software hits the SELinux sweet spot. Well, given that 99.9% of the systems people are really concerned about security on are multi-user general-purpose machines chuck full of legacy software, that seems to work just fine. 
If it's a single-user box then you don't even need MAC, just a firewall, a good locked rack/case/keyboard/etc, and decent physical security. If it's entirely custom-controlled software then you can just implement the "MAC" entirely in your own software. "General-purpose" vs "special-purpose" is debatable, so I'll just leave that one lie.

Replying to another email:

but you written it in wrong language. You written it in C, while you should have written it in SELinux policy language (and your favourite scripting language as frontend).

I have often marvelled at the notion of a simplification layer. I believe that you build complex things on top of simple things, not the other way around.

There is no "one answer" to this question in software development. Generally you prioritize things based on maximizing maintainability and speed and minimizing code, bugs, and complexity. Those are often both conflicting and in agreement. Here are a few common examples of simple-thing-on-complex-thing:
* pthreads on top of clone()
* open(some_string) on top of all the complex VFS machinery
* "netcat" on top of the vast Linux network stack including support for arbitrary packet filtering and transformation.

In addition, "simple" is undesirable if it makes the implementation less generic for no good reason. Would you want to use the "simple" MS Windows disk-drive model under Linux? Every disk is its own letter and has its files under it. Oh, you wanted to mount a filesystem over C:\tmp? Sorry, we don't support that, too bad. Under Linux we have a very flexible and powerful VFS which lets you do very crazy things, and then for the user's convenience we have various "simple" interfaces (like Gnome/KDE/XFCE).

Software development is very much about finding the Right Model(TM) to underlie the system, and then building any simplifications-to-the-user on top of the very simple model.
Look at the SELinux model again; it has the following things:
(A) Labels on almost-all user-visible kernel objects
(B) Individual access rules for almost every operation on those objects
(C) "Transition" rules to set the label on newly created objects.
(D) Fundamental "constraints" which enforce hard limits on what may be permitted with "allow" rules

From a fundamental standpoint it's hard to get much simpler than that. On top of that model, we also have a bit of additional *flexibility* for MLS/RBAC, although that flexibility may be ignored completely.
(1) You can define "users" which may only assume some "roles"
(2) You can define "roles" which may only run in some "types"
(3) There's a simple way of declaring multiple "levels" and "dominance".

So you see, SELinux is a pretty fundamental description of the degrees of flexibility needed to secure everything. That kind of FUNDAMENTAL description is what belongs in the kernel. Anything else can and should be built on top with
Re: [linux-usb-devel] [4/4] 2.6.23-rc3: known regressions
On Tue, 21 Aug 2007, Linus Torvalds wrote:
>
> Side note: after reverting 196705c9bb I can't get the mouse to skip any
> more on that mac mini. But since the bad behaviour wasn't 100% reliable to
> begin with, that's not really a guarantee of anything. Two out of three
> kids are off on camp this week, so that machine probably won't be getting
> a lot of testing ;/

Well, my one remaining child said today that "I got so much time on webkinz today - yesterday the mouse locked up after five minutes". Apparently it hadn't had the mouse lock up at all today.

So I really do believe that that 196705c9bb commit caused problems on intel-only USB machines too ("ondemand" cpufreq governor, switching between 1.0-1.66 GHz using acpi-cpufreq: totally bog-standard in all respects, in other words).

Linus
Re: [autofs] [PATCH] autofs4: reinstate negatitive timeout of mount fails
On Wed, 2007-08-22 at 10:56 +0800, Ian Kent wrote:
> On Tue, 2007-08-21 at 13:15 -0700, Andrew Morton wrote:
> >
> > It seems to use a lot of list_for_each[_safe] which could
> > have been coded as list_for_each_entry[_safe], btw.
>
> Mmm .. good point. I've not noticed the list_for_each_entry* macros.

A good idea but that change would cover more than just this patch so I'd rather leave the patch as is and submit a cleanup patch to cover this later.

Ian
Re: How to learn Linux Kernel Programming
On 8/21/07, Noud Aldenhoven <[EMAIL PROTECTED]> wrote:
> I'm a simple Math/Computer Science student and would like to learn
> more about linux and it's kernel.
> To be more precise, I'd to learn how to program in the linux kernel
> and maybe become a developer,
> if everything goes fine.
> But where do I start? Almost all information I found on the Internet
> if from before 2005 and I think that
> means it's out-of-date. Are there up-to-date documentations that are
> use full to read and explain how
> the kernel is build. (for example, is /usr/src/linux/Documentation a
> use full dir?)

Besides the sources already mentioned, there are a couple of quite good books. I know at least Robert Love's Linux Kernel Development, by O'Reilly, Rubini et al. Linux Device Drivers, and Mel Gorman's about Virtual Memory, whose exact name I can't recall.

You can also try to start following LKML's flow. Maybe you won't understand much in the beginning, but your comprehension of the discussions will improve in the future. (Maybe reading a subsystem mailing list - less traffic - is a good idea, if you have some specific interests)

--
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."
[PATCH 1/1] fix - ensure we don't use bootconsoles after init has been released
From: Robin Getz <[EMAIL PROTECTED]>

Gerd Hoffmann pointed out that my patch from yesterday can lead to a null pointer dereference if the kernel is booted with no console, and no earlyprintk defined. This fixes that issue.

 printk.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Signed-off-by: Robin Getz <[EMAIL PROTECTED]>

---

Index: linux-2.6.x/kernel/printk.c
===================================================================
--- linux-2.6.x/kernel/printk.c
+++ linux-2.6.x/kernel/printk.c
@@ -1106,10 +1106,12 @@
 static int __init disable_boot_consoles(void)
 {
-	if (console_drivers->flags & CON_BOOT) {
-		printk(KERN_INFO "turn off boot console %s%d\n",
-			console_drivers->name, console_drivers->index);
-		return unregister_console(console_drivers);
+	if (console_drivers != NULL) {
+		if (console_drivers->flags & CON_BOOT) {
+			printk(KERN_INFO "turn off boot console %s%d\n",
+				console_drivers->name, console_drivers->index);
+			return unregister_console(console_drivers);
+		}
 	}
 	return 0;
 }
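The shape of this fix (test the list head before looking at its flags) can be tried outside the kernel with a small model. Everything here is a hypothetical stand-in: `struct console_sketch`, `console_list` and `SKETCH_CON_BOOT` only imitate `struct console`, `console_drivers` and `CON_BOOT`, and removing the head from the list stands in for `unregister_console()`.

```c
#include <stddef.h>

#define SKETCH_CON_BOOT 0x1	/* assumed flag value, for illustration only */

struct console_sketch {
	const char *name;
	int flags;
	struct console_sketch *next;
};

/* Head of the registered-console list; NULL when nothing is registered. */
static struct console_sketch *console_list;

/*
 * Drop the head console if it is a boot console.
 * Returns 1 if one was removed, 0 otherwise. The NULL test must come
 * first, since an empty list has no flags to look at.
 */
static int disable_boot_console_sketch(void)
{
	struct console_sketch *head = console_list;

	if (head != NULL && (head->flags & SKETCH_CON_BOOT)) {
		console_list = head->next;
		return 1;
	}
	return 0;
}
```

The patch nests two `if` statements; the sketch folds them into one condition, which is equivalent because `&&` does not evaluate its right-hand side when the pointer is NULL.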
Re: [patch 09/14] Convert from class_device to device for SPI
On Tue, Aug 21, 2007 at 11:28:28AM -0700, David Brownell wrote: > Can you update the Documentation/spi/spi-summary text which is > invalidated by this change? That's part of why I rejected an > earlier version of this patch: since it broke the documentation, > it was incomplete. I believe this is the necessary documentation changes. Alas I can't write verbiage you are necessarily happy with, only you can do that, if there is a factual error, I'll be happy to correct but feel free to edit for personal style. I'll be gone thru Sunday so if it needs more adjustment I'll do it then. Tony -- Convert from class_device to device for drivers/spi. This is part of the work to eliminate struct class_device. Signed-off-by: Tony Jones <[EMAIL PROTECTED]> --- Documentation/spi/spi-summary | 13 +++-- drivers/spi/spi.c | 36 ++-- drivers/spi/spi_bitbang.c |2 +- drivers/spi/spi_lm70llp.c |2 +- include/linux/spi/spi.h | 12 ++-- 5 files changed, 33 insertions(+), 32 deletions(-) --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -207,7 +207,7 @@ struct spi_device *spi_new_device(struct struct spi_board_info *chip) { struct spi_device *proxy; - struct device *dev = master->cdev.dev; + struct device *dev = master->dev.parent; int status; /* NOTE: caller did any chip->bus_num checks necessary. 
@@ -242,7 +242,7 @@ struct spi_device *spi_new_device(struct proxy->modalias = chip->modalias; snprintf(proxy->dev.bus_id, sizeof proxy->dev.bus_id, - "%s.%u", master->cdev.class_id, + "%s.%u", master->dev.bus_id, chip->chip_select); proxy->dev.parent = dev; proxy->dev.bus = _bus_type; @@ -341,18 +341,18 @@ static void scan_boardinfo(struct spi_ma /*-*/ -static void spi_master_release(struct class_device *cdev) +static void spi_master_release(struct device *dev) { struct spi_master *master; - master = container_of(cdev, struct spi_master, cdev); + master = container_of(dev, struct spi_master, dev); kfree(master); } static struct class spi_master_class = { .name = "spi_master", .owner = THIS_MODULE, - .release= spi_master_release, + .dev_release= spi_master_release, }; @@ -360,7 +360,7 @@ static struct class spi_master_class = { * spi_alloc_master - allocate SPI master controller * @dev: the controller, possibly using the platform_bus * @size: how much zeroed driver-private data to allocate; the pointer to this - * memory is in the class_data field of the returned class_device, + * memory is in the driver_data field of the returned device, * accessible with spi_master_get_devdata(). * Context: can sleep * @@ -386,9 +386,9 @@ struct spi_master *spi_alloc_master(stru if (!master) return NULL; - class_device_initialize(>cdev); - master->cdev.class = _master_class; - master->cdev.dev = get_device(dev); + device_initialize(>dev); + master->dev.class = _master_class; + master->dev.parent = get_device(dev); spi_master_set_devdata(master, [1]); return master; @@ -418,7 +418,7 @@ EXPORT_SYMBOL_GPL(spi_alloc_master); int spi_register_master(struct spi_master *master) { static atomic_t dyn_bus_id = ATOMIC_INIT((1<<15) - 1); - struct device *dev = master->cdev.dev; + struct device *dev = master->dev.parent; int status = -ENODEV; int dynamic = 0; @@ -443,12 +443,12 @@ int spi_register_master(struct spi_maste /* register the device, then userspace will see it. 
* registration fails if the bus ID is in use. */ - snprintf(master->cdev.class_id, sizeof master->cdev.class_id, + snprintf(master->dev.bus_id, sizeof master->dev.bus_id, "spi%u", master->bus_num); - status = class_device_add(>cdev); + status = device_add(>dev); if (status < 0) goto done; - dev_dbg(dev, "registered master %s%s\n", master->cdev.class_id, + dev_dbg(dev, "registered master %s%s\n", master->dev.bus_id, dynamic ? " (dynamic)" : ""); /* populate children from any spi device tables */ @@ -481,8 +481,8 @@ void spi_unregister_master(struct spi_ma { int dummy; - dummy = device_for_each_child(master->cdev.dev, NULL, __unregister); - class_device_unregister(>cdev); + dummy = device_for_each_child(master->dev.parent, NULL, __unregister); +
Re: bug in migrate page
On Wed, 22 Aug 2007 10:50:53 +0800 Shaohua Li <[EMAIL PROTECTED]> wrote:
> > At quick glance, above path has no writepage() ops.
> > just replace swap's radix tree entry.
> I missed swap has .migratepage and thought fallback_migrate_page is
> used, then I thought doing rcu lock in PageAnon case is ok.
>

Thank you, I'll write a patch.

Regards,
-Kame
Re: bug in migrate page
On Wed, 2007-08-22 at 11:52 +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 22 Aug 2007 10:08:09 +0800
> Shaohua Li <[EMAIL PROTECTED]> wrote:
>
> > commit dc386d4d1e98bb39fb967ee156cd456c802fc692 adds rcu_read_lock, but
> > some routines in the lock range might sleep (like lock_buffer,
> > aops->writepage), I saw a 'sleep in atomic' warning. It appears the
> > patch has several versions before. Doing rcu_read_lock in PageAnon
> > sounds break the case of PageAnon(page) && PageSwapCache(page),
> > as .writepage might be called. The dummy anon patch maybe is ok.
> >
>
> Thank you for catching.
>
> Maybe you're correct.
>
> BTW, in PageAnon(page) && PageSwapCache(page) case, I can't find when
> .writepage is called. Could you explain ?
>
> In my understanding,
>
> rcu_read_lock()
> -> try_to_unmap()
> -> move_to_new_page()
> -> migrate_page() // swap has .migratepage member.
> -> migrate_page_move_mapping().
> -> migrate_page_copy().
> -> remove_migration_ptes().
>
>
> At quick glance, above path has no writepage() ops.
> just replace swap's radix tree entry.

I missed swap has .migratepage and thought fallback_migrate_page is used, then I thought doing rcu lock in PageAnon case is ok.

Thanks,
Shaohua
What must I do to get HPET
I have an Acer Ferrari 5000 laptop with an AMD64 TL-60 processor and an RS480 host bridge. I'm running 2.6.22 and I get no HPET.

From looking at the kernel source, the HPET driver is looking for PNP0103. I see no PNP0103 entry on my machine, just PNP0100.

Searching around, it looks like kernels running on other RS480 systems are finding HPETs.

http://lists.openwall.net/linux-kernel/2007/03/26/215

Do I need to get Acer to update the BIOS to include a PNP0103 entry? Is there some way I can force this?
Re: [PATCH] autofs4: reinstate negatitive timeout of mount fails
On Tue, 2007-08-21 at 13:15 -0700, Andrew Morton wrote:
> On Tue, 21 Aug 2007 17:26:09 +0800
> Ian Kent <[EMAIL PROTECTED]> wrote:
>
> > Due to a change to fs/dcache.c:d_lookup() in the 2.6 kernel whereby only
> > hashed dentrys are returned the negative caching of mount failures
> > stopped working in the autofs4 module for nobrowse mount (ie. directory
> > created at mount time and removed at umount or following a mount
> > failure).
> >
> > This patch keeps track of the dentrys from mount fails in order to be
> > able check the timeout since the last fail and return the appropriate
> > status. In addition the timeout value is settable at load time as a
> > module option and via sysfs using the module
> > parameter /sys/module/autofs4/parameters/negative_timeout.
>
> Boy, that's a complex-looking patch. I think I'll sit on this one
> for 2.6.24 ;)

Yes, that's fine .. the principle isn't that complex.

>
> It seems to use a lot of list_for_each[_safe] which could
> have been coded as list_for_each_entry[_safe], btw.

Mmm .. good point. I've not noticed the list_for_each_entry* macros.

Ian
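For readers who have not met the list_for_each_entry* macros either, the difference Andrew points at can be sketched in userspace. The macros below are simplified stand-ins written for this illustration, not copies of include/linux/list.h, and the names `struct item`, `sum_open_coded` and `sum_entry` are hypothetical.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Recover the containing structure from a pointer to its list_head. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Iterate over the raw list_head nodes. */
#define list_for_each(pos, head) \
	for (pos = (head)->next; pos != (head); pos = pos->next)

/* Iterate over the containing entries; relies on GNU C __typeof__. */
#define list_for_each_entry(pos, head, member)				\
	for (pos = list_entry((head)->next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Hypothetical payload type, standing in for struct autofs_info. */
struct item {
	int value;
	struct list_head link;
};

/* Open-coded style: walk list_heads, then convert each one by hand. */
static int sum_open_coded(struct list_head *head)
{
	struct list_head *p;
	int sum = 0;

	list_for_each(p, head) {
		struct item *it = list_entry(p, struct item, link);
		sum += it->value;
	}
	return sum;
}

/* Same loop with list_for_each_entry: the cursor is the entry itself. */
static int sum_entry(struct list_head *head)
{
	struct item *it;
	int sum = 0;

	list_for_each_entry(it, head, link)
		sum += it->value;
	return sum;
}
```

Both functions visit the same entries; the second form just drops the separate cursor and the explicit list_entry() call, which is the cleanup being suggested for the patch.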
Re: bug in migrate page
On Wed, 22 Aug 2007 10:08:09 +0800 Shaohua Li <[EMAIL PROTECTED]> wrote:
> commit dc386d4d1e98bb39fb967ee156cd456c802fc692 adds rcu_read_lock, but
> some routines in the lock range might sleep (like lock_buffer,
> aops->writepage), I saw a 'sleep in atomic' warning. It appears the
> patch has several versions before. Doing rcu_read_lock in PageAnon
> sounds break the case of PageAnon(page) && PageSwapCache(page),
> as .writepage might be called. The dummy anon patch maybe is ok.
>

Thank you for catching.

Maybe you're correct.

BTW, in PageAnon(page) && PageSwapCache(page) case, I can't find when .writepage is called. Could you explain ?

In my understanding,

rcu_read_lock()
-> try_to_unmap()
-> move_to_new_page()
-> migrate_page() // swap has .migratepage member.
-> migrate_page_move_mapping().
-> migrate_page_copy().
-> remove_migration_ptes().

At quick glance, above path has no writepage() ops.
just replace swap's radix tree entry.

Thanks,
-Kame
Re: [PATCH] autofs4: reinstate negatitive timeout of mount fails
On Tue, 2007-08-21 at 10:19 -0400, Peter Staubach wrote: > Ian Kent wrote: > > Hi, > > > > Due to a change to fs/dcache.c:d_lookup() in the 2.6 kernel whereby only > > hashed dentrys are returned the negative caching of mount failures > > stopped working in the autofs4 module for nobrowse mount (ie. directory > > created at mount time and removed at umount or following a mount > > failure). > > > > This patch keeps track of the dentrys from mount fails in order to be > > able check the timeout since the last fail and return the appropriate > > status. In addition the timeout value is settable at load time as a > > module option and via sysfs using the module > > parameter /sys/module/autofs4/parameters/negative_timeout. > > > > Signed-off-by: Ian Kent <[EMAIL PROTECTED]> > > > > --- > > --- linux-2.6.23-rc2-mm2/fs/autofs4/init.c.negative-timeout 2007-07-09 > > 07:32:17.0 +0800 > > +++ linux-2.6.23-rc2-mm2/fs/autofs4/init.c 2007-08-21 15:44:34.0 > > +0800 > > @@ -14,6 +14,10 @@ > > #include > > #include "autofs_i.h" > > > > +unsigned int negative_timeout = AUTOFS_NEGATIVE_TIMEOUT; > > +module_param(negative_timeout, uint, S_IRUGO | S_IWUSR); > > +MODULE_PARM_DESC(negative_timeout, "Cache mount fails negatively for this > > many seconds"); > > + > > static int autofs_get_sb(struct file_system_type *fs_type, > > int flags, const char *dev_name, void *data, struct vfsmount *mnt) > > { > > --- linux-2.6.23-rc2-mm2/fs/autofs4/inode.c.negative-timeout > > 2007-08-17 11:52:33.0 +0800 > > +++ linux-2.6.23-rc2-mm2/fs/autofs4/inode.c 2007-08-21 15:44:34.0 > > +0800 > > @@ -46,6 +46,7 @@ struct autofs_info *autofs4_init_ino(str > > ino->inode = NULL; > > ino->dentry = NULL; > > ino->size = 0; > > + ino->negative_timeout = negative_timeout; > > > > INIT_LIST_HEAD(>rehash); > > > > @@ -98,11 +99,24 @@ void autofs4_free_ino(struct autofs_info > > static void autofs4_force_release(struct autofs_sb_info *sbi) > > { > > struct dentry *this_parent = sbi->sb->s_root; > > - struct 
list_head *next; > > + struct list_head *p, *next; > > > > if (!sbi->sb->s_root) > > return; > > > > + /* Cleanup the negative dentry cache */ > > + spin_lock(>rehash_lock); > > + list_for_each_safe(p, next, >rehash_list) { > > + struct autofs_info *ino; > > + struct dentry *dentry; > > + ino = list_entry(p, struct autofs_info, rehash); > > + dentry = ino->dentry; > > + spin_unlock(>rehash_lock); > > + dput(ino->dentry); > > > > Should this be dput(dentry);? It could be since they're the same or maybe I should get rid of the assignment. Maybe that would save a couple of cpu cycles. > > Thanx... > >ps > > > > + spin_lock(>rehash_lock); > > + } > > + spin_unlock(>rehash_lock); > > + > > spin_lock(_lock); > > repeat: > > next = this_parent->d_subdirs.next; > > --- linux-2.6.23-rc2-mm2/fs/autofs4/autofs_i.h.negative-timeout > > 2007-08-17 11:52:33.0 +0800 > > +++ linux-2.6.23-rc2-mm2/fs/autofs4/autofs_i.h 2007-08-21 > > 15:44:34.0 +0800 > > @@ -40,6 +40,14 @@ > > #define DPRINTK(fmt,args...) do {} while(0) > > #endif > > > > +/* > > + * If the daemon returns a negative response (AUTOFS_IOC_FAIL) then we keep > > + * the negative response cached for up to the time given here, although > > + * the time can be shorter if the kernel throws the dcache entry away. > > + */ > > +#define AUTOFS_NEGATIVE_TIMEOUT60 /* default 1 minute */ > > +extern unsigned int negative_timeout; > > + > > /* Unified info structure. This is pointed to by both the dentry and > > inode structures. Each file in the filesystem has an instance of this > > structure. It holds a reference to the dentry, so dentries are never > > @@ -52,8 +60,16 @@ struct autofs_info { > > > > int flags; > > > > + /* > > +* Two types of unhashed dentry can exist on this list. > > +* Negative dentrys from failed mounts and positive dentrys > > +* resulting from a race between expire and mount. This > > +* fact is used when looking for dentrys in the list. 
> > +*/ > > struct list_head rehash; > > > > + unsigned int negative_timeout; > > + > > struct autofs_sb_info *sbi; > > unsigned long last_used; > > atomic_t count; > > --- linux-2.6.23-rc2-mm2/fs/autofs4/root.c.negative-timeout 2007-08-17 > > 11:53:38.0 +0800 > > +++ linux-2.6.23-rc2-mm2/fs/autofs4/root.c 2007-08-21 15:44:34.0 > > +0800 > > @@ -238,6 +238,125 @@ out: > > return dcache_readdir(file, dirent, filldir); > > } > > > > +static int autofs4_compare_dentry(struct dentry *parent, struct dentry > > *dentry, struct qstr *name) > > +{ > > + unsigned int len = name->len; > > + unsigned int hash = name->hash; > > + const unsigned char *str = name->name; > > + struct qstr *qstr = >d_name; >
Re: Problems with IDE on linux 2.6.22.X
On 08/22/2007 03:39 AM, José Luis Patiño Andrés wrote: You have a SATA harddrive (Hitachi Travelstar 5K100 100GB SATA/2.5") and an IDE (also known as PATA) DVD drive (LG GMA-4082N). That is, your disk should be driven by the: "Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support" under the "Serial ATA (prod) and Parallel ATA (experimental) drivers" menu, and it seems this driver should also take care of your DVD. Not sure from your report what you are using -- first try with only that driver, and nothing from the old "ATA/ATAPI/MFM/RLL support" menu selected. In that situation, your harddrive works, but your DVD does not? Okay, now it's tested as you said. In fact, in this way with only the SATA drivers activated and ATA/ATAPI support completely unselected, my HDD works but my DVD not. Okay. Jeff, Alan -- 2.6.20.15 apparently working. A few weeks ago there was another report of a DVD drive failing detection on pata_amd (my CD and DVD drives work fine on pata_amd). Did some ATAPI timeouts change or something? He's using: 00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) Serial ATA Storage Controller IDE (rev 02) (prog-if 80 [Master]) And so... If so, this should be fixed in the driver, but to get things working I believe you may try with both the above driver for your harddisk and the old IDE driver for the DVD: <*> Enhanced IDE/MFM/RLL disk/cdrom/tape/floppy support <*> Include IDE/ATAPI CDROM support (NEW) [*] PCI IDE chipset support [*] Generic PCI bus-master DMA support <*> Intel PIIXn chipsets support Checked. (do not select IDE/ATA-2 disk support) Unselected. 
Now, I have this kernel panic: ### #VFS: cannot open root device "sda3" or unknown-block (0,0) #Please, append a correct "root=" boot option; here are the available #partitions: #1600 4194302 hdc driver: ide-cdrom Okay, makes sense, seems the new driver simply can't grab the SATA part anymore when the old driver already's got the IDE part -- I wasn't sure about that (not a SATA user myself -- just noticed your report due to noticing that previous one due to pata_amd...). The old SATA driver available from the IDE menu also does not support your chip, so I don't believe there are any workarounds -- you'll need the issue fixed. Rene. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][2.6.23-rc2-mm2] small fix for ia64 icache sync patch
This is updated version. Andrew, could you repleace ? -Kame == Fixing 2 small issues pointed by Tony Luck. Changelog v1 -> v2 * add pte_present_exec_user() * remove pte_user * fixed comments. v1. * removing redundant BUG_ON in __ia64_sync_icache_dcache(). * check pte_present() first. Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> --- arch/ia64/mm/init.c|2 -- include/asm-ia64/pgtable.h | 17 +++-- 2 files changed, 11 insertions(+), 8 deletions(-) Index: linux-2.6.23-rc2-mm2/include/asm-ia64/pgtable.h === --- linux-2.6.23-rc2-mm2.orig/include/asm-ia64/pgtable.h +++ linux-2.6.23-rc2-mm2/include/asm-ia64/pgtable.h @@ -297,7 +297,6 @@ ia64_phys_addr_valid (unsigned long addr /* * The following have defined behavior only work if pte_present() is true. */ -#define pte_user(pte) ((pte_val(pte) & _PAGE_PL_MASK) == _PAGE_PL_3) #define pte_write(pte) ((unsigned) (((pte_val(pte) & _PAGE_AR_MASK) >> _PAGE_AR_SHIFT) - 2) <= 4) #define pte_exec(pte) ((pte_val(pte) & _PAGE_AR_RX) != 0) #define pte_dirty(pte) ((pte_val(pte) & _PAGE_D) != 0) @@ -324,14 +323,20 @@ ia64_phys_addr_valid (unsigned long addr * set_pte() is also called by the kernel, but we can expect that the kernel * flushes icache explicitly if necessary. */ +#define pte_present_exec_user(pte)\ + ((pte_val(pte) & (_PAGE_P | _PAGE_PL_MASK | _PAGE_AR_RX)) == \ + (_PAGE_P | _PAGE_PL_3 | _PAGE_AR_RX)) + extern void __ia64_sync_icache_dcache(pte_t pteval); static inline void set_pte(pte_t *ptep, pte_t pteval) { - if (pte_exec(pteval) &&// flush only new executable page. - pte_present(pteval) && // swap out ? - pte_user(pteval) &&// ignore kernel page - (!pte_present(*ptep) ||// do_no_page or swap in, migration, - pte_pfn(*ptep) != pte_pfn(pteval))) // do_wp_page(), page copy + /* page is present && page is user && page is executable +* && (page swapin or new page or page migraton +* || copy_on_write with page copying.) 
+*/ + if (pte_present_exec_user(pteval) && + (!pte_present(*ptep) || + pte_pfn(*ptep) != pte_pfn(pteval))) /* load_module() calles flush_icache_range() explicitly*/ __ia64_sync_icache_dcache(pteval); *ptep = pteval; Index: linux-2.6.23-rc2-mm2/arch/ia64/mm/init.c === --- linux-2.6.23-rc2-mm2.orig/arch/ia64/mm/init.c +++ linux-2.6.23-rc2-mm2/arch/ia64/mm/init.c @@ -60,8 +60,6 @@ __ia64_sync_icache_dcache (pte_t pte) struct page *page; unsigned long order; - BUG_ON(!pte_exec(pte)); - page = pte_page(pte); addr = (unsigned long) page_address(page); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
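The v2 change folds three tests into one mask-and-compare. The sketch below checks that equivalence in userspace. The bit positions are assumed from the ia64 headers of that era and renamed (PTE_* instead of _PAGE_*) to avoid reserved identifiers; the kernel's pte_present() may test more than the present bit, which this sketch does not model.

```c
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define PTE_P        (UINT64_C(1) << 0)  /* present (mirrors _PAGE_P) */
#define PTE_PL_3     (UINT64_C(3) << 7)  /* privilege level 3, i.e. user */
#define PTE_PL_MASK  (UINT64_C(3) << 7)
#define PTE_AR_RX    (UINT64_C(1) << 9)  /* execute bit of the access rights */

/* The three separate tests that the old set_pte() chained together. */
static int separate_checks(uint64_t pte)
{
	int present = (pte & PTE_P) != 0;
	int user = (pte & PTE_PL_MASK) == PTE_PL_3;
	int exec = (pte & PTE_AR_RX) != 0;

	return present && user && exec;
}

/* The single mask-and-compare used by pte_present_exec_user(). */
static int combined_check(uint64_t pte)
{
	const uint64_t mask = PTE_P | PTE_PL_MASK | PTE_AR_RX;
	const uint64_t want = PTE_P | PTE_PL_3 | PTE_AR_RX;

	return (pte & mask) == want;
}

/* Returns 1 if both forms agree for every combination of the low 12 bits. */
static int checks_agree(void)
{
	uint64_t pte;

	for (pte = 0; pte < (UINT64_C(1) << 12); pte++)
		if (separate_checks(pte) != combined_check(pte))
			return 0;
	return 1;
}
```

The combined form works because every tested field must match one exact bit pattern, so the three comparisons collapse into a single AND followed by a single compare.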
[PATCH] get_nodes should ignore invalid node
get_nodes doesn't check whether the nodes in the node mask are valid, which causes a kernel oops when an invalid node is used.

Signed-off-by: Shaohua Li <[EMAIL PROTECTED]>

Index: linux/mm/mempolicy.c
===
--- linux.orig/mm/mempolicy.c 2007-07-25 09:14:33.0 +0800
+++ linux/mm/mempolicy.c 2007-08-21 13:15:41.0 +0800
@@ -850,6 +850,8 @@ static int get_nodes(nodemask_t *nodes,
 	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
 		return -EFAULT;
 	nodes_addr(*nodes)[nlongs-1] &= endmask;
+
+	nodes_and(*nodes, *nodes, node_online_map);
 	return 0;
 }
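The effect of the added nodes_and() call can be sketched with plain integers. This is only an illustration under the assumption that the whole node mask fits in one 64-bit word; `clamp_to_maxnode` and `filter_online` are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

/* Drop bits at or above maxnode, like the endmask step in get_nodes(). */
static uint64_t clamp_to_maxnode(uint64_t mask, unsigned int maxnode)
{
	if (maxnode >= 64)
		return mask;
	return mask & ((UINT64_C(1) << maxnode) - 1);
}

/* Keep only nodes that are online, the idea behind the added nodes_and(). */
static uint64_t filter_online(uint64_t requested, uint64_t online)
{
	return requested & online;
}
```

With nodes 0 and 1 online, a request for nodes 0, 1 and 3 is reduced to nodes 0 and 1 rather than being passed on with an invalid node still set.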
bug in migrate page
commit dc386d4d1e98bb39fb967ee156cd456c802fc692 adds rcu_read_lock, but some routines in the locked range might sleep (like lock_buffer, aops->writepage), and I saw a 'sleep in atomic' warning. It appears the patch went through several versions. Doing rcu_read_lock in the PageAnon case sounds like it breaks the case of PageAnon(page) && PageSwapCache(page), as .writepage might be called. The dummy anon patch may be ok.

Thanks,
Shaohua
Re: [patch] add some Blackfin specific checks to checkpatch.pl
Check for a few common errors in Blackfin-specific code wrt MMR loading in assembly and doing core/system syncs. Restrict the Blackfin MMR checks to actual Blackfin assembly files as pointed out by Joe Perches. Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]> CC: Bryan Wu <[EMAIL PROTECTED]> CC: Andy Whitcroft <[EMAIL PROTECTED]> --- scripts/checkpatch.pl | 22 1 files changed, 22 insertions(+), 0 deletions(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index dae7d30..ead9675 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -486,9 +486,31 @@ sub process { WARN("line over 80 characters\n" . $herecurr); } +# Blackfin: use hi/lo macros + if ($realfile =~ [EMAIL PROTECTED]/blackfin/.*\.S$@) { + if ($line =~ /\.[lL][[:space:]]*=.*&[[:space:]]*0x[fF][fF][fF][fF]/) { + my $herevet = "$here\n" . cat_vet($line) . "\n"; + ERROR("use the LO() macro, not (... & 0x)\n" . $herevet); + } + if ($line =~ /\.[hH][[:space:]]*=.*>>[[:space:]]*16/) { + my $herevet = "$here\n" . cat_vet($line) . "\n"; + ERROR("use the HI() macro, not (... >> 16)\n" . $herevet); + } + } + # check we are in a valid source file *.[hc] if not then ignore this hunk next if ($realfile !~ /\.[hc]$/); +# Blackfin: don't use __builtin_bfin_[cs]sync + if ($line =~ /__builtin_bfin_csync/) { + my $herevet = "$here\n" . cat_vet($line) . "\n"; + ERROR("use the CSYNC() macro in asm/blackfin.h\n" . $herevet); + } + if ($line =~ /__builtin_bfin_ssync/) { + my $herevet = "$here\n" . cat_vet($line) . "\n"; + ERROR("use the SSYNC() macro in asm/blackfin.h\n" . $herevet); + } + # at the beginning of a line any tabs must come first and anything # more than 8 must use tabs. if ($line=~/^\+\s* \t\s*\S/ or $line=~/^\+\s*\s*/) { -- 1.5.3.rc5 signature.asc Description: This is a digitally signed message part.
Re: [PATCH 11/23] make atomic_read() and atomic_set() behavior consistent on m32r
From: Chris Snook <[EMAIL PROTECTED]>
Date: Mon, 13 Aug 2007 07:24:52 -0400

> From: Chris Snook <[EMAIL PROTECTED]>
>
> Use volatile consistently in atomic.h on m32r.
>
> Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

Thanks,

Acked-by: Hirokazu Takata <[EMAIL PROTECTED]>
Re: [linux-usb-devel] on the system with companion host controller, error -71 returns
Alan

Thank you for your comment. I'll try to change the load order.

--
Kiyoshi Sasaki <[EMAIL PROTECTED]>

On Tue, 21 Aug 2007, Kiyoshi Sasaki wrote:

Hello,

I see below errors in dmesg on ICH6/ICH7 machine:

usb 1-1: device not accepting address 2, error -71
or
usb 1-1: device descriptor read/all, error -71

I'm trying to debug it, but by now I can't make it. Can you give me your help ?

There's nothing to debug; these messages are perfectly normal. If you want to prevent them from occurring, you should change the load order of your modules: Load ehci-hcd before uhci-hcd.

Alan Stern
Re: Problems with IDE on linux 2.6.22.X
El Miércoles, 22 de Agosto de 2007 00:08, Rene Herman escribió: > You have a SATA harddrive (Hitachi Travelstar 5K100 100GB SATA/2.5") and an > IDE (also known as PATA) DVD drive (LG GMA-4082N). That is, your disk > should be driven by the: > > "Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support" > > under the "Serial ATA (prod) and Parallel ATA (experimental) drivers" menu, > and it seems this driver should also take care of your DVD. Not sure from > your report what you are using -- first try with only that driver, and > nothing from the old "ATA/ATAPI/MFM/RLL support" menu selected. > > In that situation, your harddrive works, but your DVD does not? Okay, now it's tested as you said. In fact, in this way with only the SATA drivers activated and ATA/ATAPI support completely unselected, my HDD works but my DVD not. And so... > If so, this should be fixed in the driver, but to get things working I > believe you may try with both the above driver for your harddisk and the > old IDE driver for the DVD: > > <*> Enhanced IDE/MFM/RLL disk/cdrom/tape/floppy support > <*> Include IDE/ATAPI CDROM support (NEW) > [*] PCI IDE chipset support > [*] Generic PCI bus-master DMA support > <*> Intel PIIXn chipsets support Checked. > (do not select IDE/ATA-2 disk support) Unselected. Now, I have this kernel panic: ### #VFS: cannot open root device "sda3" or unknown-block (0,0) #Please, append a correct "root=" boot option; here are the available #partitions: #1600 4194302 hdc driver: ide-cdrom #Kernel panic - not syncing: VFS: Unable to mount root fs on #unknown-block(0,0) ### > where you may need to boot with a "libata.atapi_enabled=0" kernel > parameter. This parameter has no effect. I have the same kernel panic with or without it. José Luis Patiño. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] fix the max path calculation in radix-tree.c
On Tue, Aug 21, 2007 at 03:48:42PM -0400, Jeff Moyer wrote: > Hi, > > A while back, Nick Piggin introduced a patch to reduce the node memory > usage for small files (commit cfd9b7df4abd3257c9e381b0e445817b26a51c0c): > > -#define RADIX_TREE_MAP_SHIFT 6 > +#define RADIX_TREE_MAP_SHIFT (CONFIG_BASE_SMALL ? 4 : 6) > > Unfortunately, he didn't take into account the fact that the > calculation of the maximum path was based on an assumption of having > to round up: > > #define RADIX_TREE_MAX_PATH (RADIX_TREE_INDEX_BITS/RADIX_TREE_MAP_SHIFT + 2) > > So, if CONFIG_BASE_SMALL is set, you will end up with a > RADIX_TREE_MAX_PATH that is one greater than necessary. The practical > upshot of this is just a bit of wasted memory (one long in the > height_to_maxindex array, an extra pre-allocated radix tree node per > cpu, and extra stack usage in a couple of functions), but it seems > worth getting right. > > It's also worth noting that I never build with CONFIG_BASE_SMALL. > What I did to test this was duplicate the code in a small user-space > program and check the results of the calculations for max path and the > contents of the height_to_maxindex array. > > Cheers. > > Signed-off-by: Jeff Moyer <[EMAIL PROTECTED]> > > diff --git a/lib/radix-tree.c b/lib/radix-tree.c > index 514efb2..67c908f 100644 > --- a/lib/radix-tree.c > +++ b/lib/radix-tree.c > @@ -60,7 +60,8 @@ struct radix_tree_path { > }; > > #define RADIX_TREE_INDEX_BITS (8 /* CHAR_BIT */ * sizeof(unsigned long)) > -#define RADIX_TREE_MAX_PATH (RADIX_TREE_INDEX_BITS/RADIX_TREE_MAP_SHIFT + 2) > +#define RADIX_TREE_MAX_PATH (DIV_ROUND_UP(RADIX_TREE_INDEX_BITS, \ > + RADIX_TREE_MAP_SHIFT) + 1) > > static unsigned long height_to_maxindex[RADIX_TREE_MAX_PATH] __read_mostly; > OK, after you DIV_ROUND_UP, what is the extra 1 for? For paths, it is because they are NULL terminated paths I guess (without remembering too hard), and for height_to_maxindex array it is needed for 0-height trees I think. 
So it would be kinda cleaner to have the _real_ MAX_PATH, and two other constants for this array and the paths arrays (that just happen to be identical due to implementation). Don't you think?

But that's not to nack this patch. On the contrary I think your logic is correct, and it should be fixed. I didn't check the maths myself but I trust you :)
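The arithmetic under discussion is easy to check in userspace, much as Jeff describes doing. The sketch below is not his program; `max_path_old` and `max_path_new` are hypothetical names that simply evaluate the two macro definitions for a given word size and map shift.

```c
/* Round-up division, written out as the kernel macro of the same name. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Old definition: truncating division plus 2. */
static unsigned int max_path_old(unsigned int index_bits, unsigned int map_shift)
{
	return index_bits / map_shift + 2;
}

/* Definition from the patch: round-up division plus 1. */
static unsigned int max_path_new(unsigned int index_bits, unsigned int map_shift)
{
	return DIV_ROUND_UP(index_bits, map_shift) + 1;
}
```

With a map shift of 6 the two agree (7 for 32-bit indexes, 12 for 64-bit), because neither 32 nor 64 is a multiple of 6 and the old "+ 2" happened to absorb the rounding. With a map shift of 4 the division is exact, so the old formula gives 10 and 18 where 9 and 17 are enough, which is the one extra slot the patch removes.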
[BUG?] 2.6.23-rc3 on alpha
Thanks to Richard for the "aboot" fixes.

I'm seeing something new and strange with 2.6.23-rc3 that I wasn't seeing in the 2.6.22+ kernels. I've got the bootlogo code enabled, and at the point during system initialization where the logo disappears, the console switches from tty1 to tty2. I can switch back to tty1, so other than the unexpected console tty switch, there doesn't seem to be anything "unfortunate" happening.

Any ideas/explanations? It's completely repeatable. I don't think it's related to the "aboot" patches :-).

--
Bob Tracy | "Eagles may soar, but weasels don't get
[EMAIL PROTECTED] | sucked into jet engines." --Anon
[PATCH] Fix lazy mode vmalloc synchronization for paravirt
Found this looping Ubuntu installs with VMI.

If unlucky enough to hit a vmalloc sync fault during a lazy mode operation (from an IRQ handler for a module which was not yet populated in current page directory, or from inside copy_one_pte, which touches swap_map, and hit in an unused 4M region), the required PDE update would never get flushed, causing an infinite page fault loop.

This bug affects any paravirt-ops backend which uses lazy updates, I believe that makes it a bug in Xen, VMI and lguest. It only happens on LOWMEM kernels.

Currently for 2.6.23, but we'll want to backport to -stable as well.

Zach

Touching vmalloc memory in the middle of a lazy mode update can generate a kernel PDE update, which must be flushed immediately. The fix is to leave lazy mode when doing a vmalloc sync.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
index 01ffdd4..fcb38e7 100644
--- a/arch/i386/mm/fault.c
+++ b/arch/i386/mm/fault.c
@@ -249,9 +249,10 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
 	pmd_k = pmd_offset(pud_k, address);
 	if (!pmd_present(*pmd_k))
 		return NULL;
-	if (!pmd_present(*pmd))
+	if (!pmd_present(*pmd)) {
 		set_pmd(pmd, *pmd_k);
-	else
+		arch_flush_lazy_mmu_mode();
+	} else
 		BUG_ON(pmd_page(*pmd) != pmd_page(*pmd_k));
 	return pmd_k;
 }
Re: [PATCH] SLUB use cmpxchg_local
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
> On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:
>
> > As I am going back through the initial cmpxchg_local implementation, it
> > seems like it was executing __slab_alloc() with preemption disabled,
> > which is wrong. new_slab() is not designed for that.
>
> The version I send you did not use preemption.
>
> We need to make a decision if we want to go without preemption and cmpxchg
> or with preemption and cmpxchg_local.
>

I don't expect any performance improvements with cmpxchg() over irq disable/restore. I think we'll have to use cmpxchg_local.

Also, we may argue that locked cmpxchg will have more scalability impact than cmpxchg_local. Actually, I expect the LOCK prefix to have a bigger scalability impact than the irq save/restore pair.

> If we really want to do this then the implementation of all of these
> components need to result in competitive performance on all platforms.
>

The minor issue I see here is on architectures where we have to simulate cmpxchg_local with irq save/restore. Depending on how we implement the code, it may result in two irq save/restore pairs instead of one, which could make the code slower. However, if we are clever enough in our low-level primitive usage, I think we could make the code use cmpxchg_local when available and fall back on only _one_ irq disabled section surrounding the whole code for other architectures.

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
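For context, the fast path being argued about is a compare-and-exchange loop over a freelist head. The following is a userspace analogy using C11 atomics, not SLUB code: `struct object`, `struct freelist`, `freelist_push` and `freelist_pop` are hypothetical names. C11 offers no CPU-local variant, so on x86 this corresponds to the LOCK-prefixed cmpxchg(), not to cmpxchg_local. The sketch also ignores the ABA problem, which a real allocator has to deal with.

```c
#include <stdatomic.h>
#include <stddef.h>

/* A free object stores the pointer to the next free object inside itself. */
struct object {
	struct object *next;
};

struct freelist {
	_Atomic(struct object *) head;
};

/* Push obj on the list; retries if another thread changed head meanwhile. */
static void freelist_push(struct freelist *fl, struct object *obj)
{
	struct object *old = atomic_load(&fl->head);

	do {
		obj->next = old;
	} while (!atomic_compare_exchange_weak(&fl->head, &old, obj));
}

/* Pop the first object, or return NULL if the list is empty. */
static struct object *freelist_pop(struct freelist *fl)
{
	struct object *old = atomic_load(&fl->head);

	/* A failed exchange reloads old, so old->next is re-read on retry. */
	while (old != NULL &&
	       !atomic_compare_exchange_weak(&fl->head, &old, old->next))
		;
	return old;
}
```

The alternative Mathieu mentions for architectures without a cheap local cmpxchg is to wrap the whole read-modify-write in a single irq-disabled section instead of a retry loop.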
Re: [PATCH 0/6] writeback time order/delay fixes take 3
On Tue, Aug 21, 2007 at 08:23:14PM -0400, Chris Mason wrote: > On Sun, 12 Aug 2007 17:11:20 +0800 > Fengguang Wu <[EMAIL PROTECTED]> wrote: > > > Andrew and Ken, > > > > Here are some more experiments on the writeback stuff. > > Comments are highly welcome~ > > I've been doing benchmarks lately to try and trigger fragmentation, and > one of them is a simulation of make -j N. It takes a list of all > the .o files in the kernel tree, randomly sorts them and then > creates bogus files with the same names and sizes in clean kernel trees. > > This is basically creating a whole bunch of files in random order in a > whole bunch of subdirectories. > > The results aren't pretty: > > http://oss.oracle.com/~mason/compilebench/makej/compare-compile-dirs-0.png > > The top graph shows one dot for each write over time. It shows that > ext3 is basically writing all over the place the whole time. But, ext3 > actually wins the read phase, so the layout isn't horrible. My guess > is that if we introduce some write clustering by sending a group of > inodes down at the same time, it'll go much much better. > > Andrew has mentioned bringing a few radix trees into the writeback paths > before, it seems like file servers and other general uses will benefit > from better clustering here. > > I'm hoping to talk you into trying it out ;) Thank you for the description of problem. So far I have a similar one in mind: if we are to delay writeback of atime-dirty-only inodes to above 1 hour, some grouping/piggy-backing scenario would be beneficial. (Which I guess does not deserve the complexity now that we have Ingo's make-reltime-default patch.) My vague idea is to - keep the s_io/s_more_io as a FIFO/cyclic writeback dispatching queue. - convert s_dirty to some radix-tree/rbtree based data structure. It would have dual functions: delayed-writeback and clustered-writeback. clustered-writeback: - Use inode number as clue of locality, hence the key for the sorted tree. 
- Drain some more s_dirty inodes into s_io on every kupdate wakeup, but do it in the ascending order of inode number instead of ->dirtied_when.

delayed-writeback:
- Make sure that a full scan of the s_dirty tree takes <=30s, i.e. dirty_expire_interval.

Notes:
(1) I'm not sure inode number is correlated to disk location in filesystems other than ext2/3/4. Or parent dir?
(2) It duplicates some function of elevators. Why is it necessary? Maybe we have no clue on the exact data location at this time?

Fengguang
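The clustered-writeback idea amounts to choosing a different sort key for the batch handed to s_io. A minimal userspace sketch follows, assuming a flat array instead of the proposed radix-tree or rbtree; `struct dirty_inode` and `order_for_clustered_writeback` are hypothetical names, not kernel structures.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the per-inode state kept on s_dirty. */
struct dirty_inode {
	unsigned long ino;           /* inode number, used as a locality hint */
	unsigned long dirtied_when;  /* time the inode was first dirtied */
};

/* Order by inode number; written to avoid overflow from subtraction. */
static int cmp_by_ino(const void *a, const void *b)
{
	const struct dirty_inode *x = a;
	const struct dirty_inode *y = b;

	return (x->ino > y->ino) - (x->ino < y->ino);
}

/* Sort a batch in ascending inode-number order rather than by dirty time. */
static void order_for_clustered_writeback(struct dirty_inode *batch, size_t n)
{
	qsort(batch, n, sizeof(*batch), cmp_by_ino);
}
```

Whether inode number really tracks disk location is exactly the open question in note (1), so the key choice here is an assumption carried over from the proposal.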
Re: [PATCH] ver_linux is [censored]
Fix ver_linux glibc version printing (for real this time)

Alexey Dobriyan reported that commit 4a645d5ea65baaa5736bcb566673bf4a351b2ad8 broke ver_linux when glibc has a 3 digit version number, and proposed a patch. Al Viro then suggested a simpler way to solve the problem which I've then simply put into patch form.

Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
Signed-off-by: Al Viro <[EMAIL PROTECTED]>
Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>
---
 scripts/ver_linux |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/scripts/ver_linux b/scripts/ver_linux
index 8f8df93..5a16bad 100755
--- a/scripts/ver_linux
+++ b/scripts/ver_linux
@@ -65,9 +65,8 @@ isdnctrl 2>&1 | grep version | awk \
 showmount --version 2>&1 | grep nfs-utils | awk \
 'NR==1{print "nfs-utils ", $NF}'
 
-ls -l `ldd /bin/sh | awk '/libc/{print $3}'` | sed \
--e 's/\.so$//' | sed -e 's/>//' | \
-awk -F'[.-]' '{print "Linux C Library"$(NF-1)"."$NF}'
+echo -n "Linux C Library"
+sed -n -e '/^.*\/libc-\([^/]*\)\.so$/{s//\1/;p;q}' < /proc/self/maps
 
 ldd -v > /dev/null 2>&1 && ldd -v || ldd --version |head -n 1 | awk \
 'NR==1{print "Dynamic linker (ldd) ", $NF}'
Re: [PATCH 2/3] dma: override "dma_flags_set_dmaflush" for sn-ia64
On Tue, 2007-08-21 at 17:34 -0700, [EMAIL PROTECTED] wrote: > On Tue, Aug 21, 2007 at 03:55:29PM -0500, James Bottomley wrote: > > > . > > Almost every platform supports posted DMA ... its a property of most PCI > > bridge chips. > > > > The term "posted DMA" is used to describe this behavior in the Altix > Device Driver Writer's Guide, but it may be confusing things here. > Maybe a better term will suggest itself if I can clarify OK, but posted DMA has a pretty specific meaning in terms of PCI, hence the confusion. > On Altix, DMA from a device isn't guaranteed to arrive in host memory > in the order it was sent from the device. This reordering can happen > in the NUMA interconnect (it's specifically not a PCI reordering.) This is mmiowb and read_relaxed() again, isn't it? > > .. > > This isn't possible on most platforms. PCI write posting can only be > > flushed by a read transaction on the device (or sometimes any device on > > the bridge). Either this interface is misnamed and misdescribed, or it > > can't work for most systems. > > > > Clearly it wasn't described adequately... > > A read transaction on the device will flush pending writes to the > device. But I'm worried about DMA from the device to host memory. > On Altix, there are two mechanisms that flush all in-flight DMA > to host memory: 1) an interrupt, and 2) a write to a memory region > which has a "barrier" attribute set. Obviously option 1 isn't > viable for performance reasons. This new interface is about making > "option 2" generally available. (As it is now, the only way to get > memory with the "barrier" attribute is to allocate it with > dma_alloc_coherent().) Which sounds exactly what mmiowb does ... is there a need for a new API; can't you just use mmiowb()? 
James
Re: [PATCH] SLUB use cmpxchg_local
* Christoph Lameter ([EMAIL PROTECTED]) wrote: > Ok. Measurements vs. simple cmpxchg on a Intel(R) Pentium(R) 4 CPU 3.20GHz > (hyperthreading enabled). Test run with your module show only minor > performance improvements and lots of regressions. So we must have > cmpxchg_local to see any improvements? Some kind of a recent optimization > of cmpxchg performance that we do not see on older cpus? > I did not expect the cmpxchg with LOCK prefix to be faster than irq save/restore. You will need to run these tests using cmpxchg_local to see an improvement. Mathieu > > Code of kmem_cache_alloc (to show you that there are no debug options on): > > Dump of assembler code for function kmem_cache_alloc: > 0x4015cfa9 :push %ebp > 0x4015cfaa :mov%esp,%ebp > 0x4015cfac :push %edi > 0x4015cfad :push %esi > 0x4015cfae :push %ebx > 0x4015cfaf :sub$0x10,%esp > 0x4015cfb2 :mov%eax,%esi > 0x4015cfb4 : mov%edx,0xffe8(%ebp) > 0x4015cfb7 : mov0x4(%ebp),%eax > 0x4015cfba : mov%eax,0xfff0(%ebp) > 0x4015cfbd : mov%fs:0x404af008,%eax > 0x4015cfc3 : mov0x90(%esi,%eax,4),%edi > 0x4015cfca : mov(%edi),%ecx > 0x4015cfcc : test %ecx,%ecx > 0x4015cfce : je 0x4015d00a > > 0x4015cfd0 : mov0xc(%edi),%eax > 0x4015cfd3 : mov(%ecx,%eax,4),%eax > 0x4015cfd6 : mov%eax,%edx > 0x4015cfd8 : mov%ecx,%eax > 0x4015cfda : lock cmpxchg %edx,(%edi) > 0x4015cfde : mov%eax,%ebx > 0x4015cfe0 : cmp%ecx,%eax > 0x4015cfe2 : jne0x4015cfbd > > 0x4015cfe4 : cmpw $0x0,0xffe8(%ebp) > 0x4015cfe9 : jns0x4015d006 > > 0x4015cfeb : mov0x10(%edi),%edx > 0x4015cfee : xor%eax,%eax > 0x4015cff0 : mov%edx,%ecx > 0x4015cff2 : shr$0x2,%ecx > 0x4015cff5 : mov%ebx,%edi > > Base > > 1. 
Kmalloc: Repeatedly allocate then free test > 1 times kmalloc(8) -> 332 cycles kfree -> 422 cycles > 1 times kmalloc(16) -> 218 cycles kfree -> 360 cycles > 1 times kmalloc(32) -> 214 cycles kfree -> 368 cycles > 1 times kmalloc(64) -> 244 cycles kfree -> 390 cycles > 1 times kmalloc(128) -> 320 cycles kfree -> 417 cycles > 1 times kmalloc(256) -> 438 cycles kfree -> 550 cycles > 1 times kmalloc(512) -> 527 cycles kfree -> 626 cycles > 1 times kmalloc(1024) -> 678 cycles kfree -> 775 cycles > 1 times kmalloc(2048) -> 748 cycles kfree -> 822 cycles > 1 times kmalloc(4096) -> 641 cycles kfree -> 650 cycles > 1 times kmalloc(8192) -> 741 cycles kfree -> 817 cycles > 1 times kmalloc(16384) -> 872 cycles kfree -> 927 cycles > 2. Kmalloc: alloc/free test > 1 times kmalloc(8)/kfree -> 332 cycles > 1 times kmalloc(16)/kfree -> 327 cycles > 1 times kmalloc(32)/kfree -> 323 cycles > 1 times kmalloc(64)/kfree -> 320 cycles > 1 times kmalloc(128)/kfree -> 320 cycles > 1 times kmalloc(256)/kfree -> 333 cycles > 1 times kmalloc(512)/kfree -> 332 cycles > 1 times kmalloc(1024)/kfree -> 330 cycles > 1 times kmalloc(2048)/kfree -> 334 cycles > 1 times kmalloc(4096)/kfree -> 674 cycles > 1 times kmalloc(8192)/kfree -> 1155 cycles > 1 times kmalloc(16384)/kfree -> 1226 cycles > > Slub cmpxchg. > > 1. 
Kmalloc: Repeatedly allocate then free test > 1 times kmalloc(8) -> 296 cycles kfree -> 515 cycles > 1 times kmalloc(16) -> 193 cycles kfree -> 412 cycles > 1 times kmalloc(32) -> 188 cycles kfree -> 422 cycles > 1 times kmalloc(64) -> 222 cycles kfree -> 441 cycles > 1 times kmalloc(128) -> 292 cycles kfree -> 476 cycles > 1 times kmalloc(256) -> 414 cycles kfree -> 589 cycles > 1 times kmalloc(512) -> 513 cycles kfree -> 673 cycles > 1 times kmalloc(1024) -> 694 cycles kfree -> 825 cycles > 1 times kmalloc(2048) -> 739 cycles kfree -> 878 cycles > 1 times kmalloc(4096) -> 636 cycles kfree -> 653 cycles > 1 times kmalloc(8192) -> 715 cycles kfree -> 799 cycles > 1 times kmalloc(16384) -> 855 cycles kfree -> 927 cycles > 2. Kmalloc: alloc/free test > 1 times kmalloc(8)/kfree -> 354 cycles > 1 times kmalloc(16)/kfree -> 336 cycles > 1 times kmalloc(32)/kfree -> 335 cycles > 1 times kmalloc(64)/kfree -> 337 cycles > 1 times kmalloc(128)/kfree -> 337 cycles > 1 times kmalloc(256)/kfree -> 355 cycles > 1 times kmalloc(512)/kfree -> 354 cycles > 1 times kmalloc(1024)/kfree -> 337 cycles > 1 times kmalloc(2048)/kfree -> 339 cycles > 1 times kmalloc(4096)/kfree -> 674 cycles > 1 times kmalloc(8192)/kfree -> 1128 cycles > 1 times kmalloc(16384)/kfree -> 1240 cycles > > -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the
Re: [PATCH] SLUB use cmpxchg_local
Ok. Measurements vs. simple cmpxchg on a Intel(R) Pentium(R) 4 CPU 3.20GHz
(hyperthreading enabled). Test run with your module show only minor
performance improvements and lots of regressions. So we must have
cmpxchg_local to see any improvements? Some kind of a recent optimization
of cmpxchg performance that we do not see on older cpus?

Code of kmem_cache_alloc (to show you that there are no debug options on):

Dump of assembler code for function kmem_cache_alloc:
0x4015cfa9:  push   %ebp
0x4015cfaa:  mov    %esp,%ebp
0x4015cfac:  push   %edi
0x4015cfad:  push   %esi
0x4015cfae:  push   %ebx
0x4015cfaf:  sub    $0x10,%esp
0x4015cfb2:  mov    %eax,%esi
0x4015cfb4:  mov    %edx,0xffe8(%ebp)
0x4015cfb7:  mov    0x4(%ebp),%eax
0x4015cfba:  mov    %eax,0xfff0(%ebp)
0x4015cfbd:  mov    %fs:0x404af008,%eax
0x4015cfc3:  mov    0x90(%esi,%eax,4),%edi
0x4015cfca:  mov    (%edi),%ecx
0x4015cfcc:  test   %ecx,%ecx
0x4015cfce:  je     0x4015d00a
0x4015cfd0:  mov    0xc(%edi),%eax
0x4015cfd3:  mov    (%ecx,%eax,4),%eax
0x4015cfd6:  mov    %eax,%edx
0x4015cfd8:  mov    %ecx,%eax
0x4015cfda:  lock cmpxchg %edx,(%edi)
0x4015cfde:  mov    %eax,%ebx
0x4015cfe0:  cmp    %ecx,%eax
0x4015cfe2:  jne    0x4015cfbd
0x4015cfe4:  cmpw   $0x0,0xffe8(%ebp)
0x4015cfe9:  jns    0x4015d006
0x4015cfeb:  mov    0x10(%edi),%edx
0x4015cfee:  xor    %eax,%eax
0x4015cff0:  mov    %edx,%ecx
0x4015cff2:  shr    $0x2,%ecx
0x4015cff5:  mov    %ebx,%edi

Base

1. Kmalloc: Repeatedly allocate then free test
1 times kmalloc(8) -> 332 cycles kfree -> 422 cycles
1 times kmalloc(16) -> 218 cycles kfree -> 360 cycles
1 times kmalloc(32) -> 214 cycles kfree -> 368 cycles
1 times kmalloc(64) -> 244 cycles kfree -> 390 cycles
1 times kmalloc(128) -> 320 cycles kfree -> 417 cycles
1 times kmalloc(256) -> 438 cycles kfree -> 550 cycles
1 times kmalloc(512) -> 527 cycles kfree -> 626 cycles
1 times kmalloc(1024) -> 678 cycles kfree -> 775 cycles
1 times kmalloc(2048) -> 748 cycles kfree -> 822 cycles
1 times kmalloc(4096) -> 641 cycles kfree -> 650 cycles
1 times kmalloc(8192) -> 741 cycles kfree -> 817 cycles
1 times kmalloc(16384) -> 872 cycles kfree -> 927 cycles
2. Kmalloc: alloc/free test
1 times kmalloc(8)/kfree -> 332 cycles
1 times kmalloc(16)/kfree -> 327 cycles
1 times kmalloc(32)/kfree -> 323 cycles
1 times kmalloc(64)/kfree -> 320 cycles
1 times kmalloc(128)/kfree -> 320 cycles
1 times kmalloc(256)/kfree -> 333 cycles
1 times kmalloc(512)/kfree -> 332 cycles
1 times kmalloc(1024)/kfree -> 330 cycles
1 times kmalloc(2048)/kfree -> 334 cycles
1 times kmalloc(4096)/kfree -> 674 cycles
1 times kmalloc(8192)/kfree -> 1155 cycles
1 times kmalloc(16384)/kfree -> 1226 cycles

Slub cmpxchg.

1. Kmalloc: Repeatedly allocate then free test
1 times kmalloc(8) -> 296 cycles kfree -> 515 cycles
1 times kmalloc(16) -> 193 cycles kfree -> 412 cycles
1 times kmalloc(32) -> 188 cycles kfree -> 422 cycles
1 times kmalloc(64) -> 222 cycles kfree -> 441 cycles
1 times kmalloc(128) -> 292 cycles kfree -> 476 cycles
1 times kmalloc(256) -> 414 cycles kfree -> 589 cycles
1 times kmalloc(512) -> 513 cycles kfree -> 673 cycles
1 times kmalloc(1024) -> 694 cycles kfree -> 825 cycles
1 times kmalloc(2048) -> 739 cycles kfree -> 878 cycles
1 times kmalloc(4096) -> 636 cycles kfree -> 653 cycles
1 times kmalloc(8192) -> 715 cycles kfree -> 799 cycles
1 times kmalloc(16384) -> 855 cycles kfree -> 927 cycles
2. Kmalloc: alloc/free test
1 times kmalloc(8)/kfree -> 354 cycles
1 times kmalloc(16)/kfree -> 336 cycles
1 times kmalloc(32)/kfree -> 335 cycles
1 times kmalloc(64)/kfree -> 337 cycles
1 times kmalloc(128)/kfree -> 337 cycles
1 times kmalloc(256)/kfree -> 355 cycles
1 times kmalloc(512)/kfree -> 354 cycles
1 times kmalloc(1024)/kfree -> 337 cycles
1 times kmalloc(2048)/kfree -> 339 cycles
1 times kmalloc(4096)/kfree -> 674 cycles
1 times kmalloc(8192)/kfree -> 1128 cycles
1 times kmalloc(16384)/kfree -> 1240 cycles
Re: Trouble booting with new 2.6.22.3 kernel
I resolved the issue like so:

1) Download the kernel source archive from kernel.org.
2) Move the archive to wherever you keep source, /home/user/linux for
   example (create it first with mkdir /home/user/linux, of course!):
   mv /home/user/Desktop/archive /dest/archive
3) Extract the archive:
   if .tar.bz2: tar -jxvf archive.tar.bz2
   else:        tar -zxvf archive.tar.gz
4) cd into the resulting directory.
5) Run sudo make defconfig, then sudo make xconfig (or menuconfig if you
   are in a terminal) and customize the options for your system.
   Filesystem and controller support must be built in!
6) sudo make
7) sudo make modules_install
8) sudo make install
9) sudo update-initramfs -k kernelversion -c -v
10) update-grub
11) Reboot and enjoy!
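As a small illustration of step 3 above, the choice between the two tar invocations can be made from the file extension. This is only a sketch: the function name extract_cmd and the archive name linux-2.6.22.3.tar.bz2 are assumptions, not something given in the message, and the helper prints the command instead of running it so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not from the original message): choose the tar
# invocation from step 3 based on the archive's file extension.
extract_cmd() {
    case "$1" in
        *.tar.bz2) echo "tar -jxvf $1" ;;   # bzip2-compressed archive
        *.tar.gz)  echo "tar -zxvf $1" ;;   # gzip-compressed archive
        *)         echo "unsupported archive: $1" >&2; return 1 ;;
    esac
}

# Print the command rather than running it; the file name is an
# assumed example.
extract_cmd linux-2.6.22.3.tar.bz2
```

For the assumed file name this prints tar -jxvf linux-2.6.22.3.tar.bz2. Any other extension produces an error on stderr and a non-zero return status, which makes the helper safe to use in a script that stops on failure.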
Re: [PATCH] ver_linux is [censored]
On Wed, Aug 22, 2007 at 02:02:44AM +0200, Jesper Juhl wrote:
> > How about simply doing
> > sh -c 'cat /proc/$$/maps'|sed -n -e '/^.*\/libc-\([^/]*\)\.so$/{s//\1/;p;q}'
> > and to hell with parsing ls -l output?
> >
> Works for me.

or, simpler yet,
sed -n -e '/^.*\/libc-\([^/]*\)\.so$/{s//\1/;p;q}' < /proc/self/maps
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Tue, Aug 21, 2007 at 06:51:16PM -0400, [EMAIL PROTECTED] wrote:
> On Tue, 21 Aug 2007 09:16:43 PDT, "Paul E. McKenney" said:
>
> > I agree that instant gratification is hard to come by when synching
> > up compiler and kernel versions. Nonetheless, it should be possible
> > to create APIs that are conditioned on the compiler version.
>
> We've tried that, sort of. See the mess surrounding the whole
> extern/static/inline/__whatever boondoggle, which seems to have
> changed semantics in every single gcc release since 2.95 or so.
>
> And recently mention was made that gcc4.4 will have *new* semantics
> in this area. Yee. Hah.

;-)

Thanx, Paul
Re: How to learn Linux Kernel Programming
On 21/08/07, Noud Aldenhoven <[EMAIL PROTECTED]> wrote:
> Hello Kernel Develop mailing list,
> ...
>
> I'm a simple Math/Computer Science student and would like to learn
> more about linux and its kernel.
> To be more precise, I'd like to learn how to program in the linux kernel
> and maybe become a developer,
> if everything goes fine.
> But where do I start?

Start by reading Documentation/HOWTO from a recent copy of the kernel source.

> Almost all information I found on the Internet
> is from before 2005

There's lots of good kernel related material to be found online.
See for example:
http://kernelnewbies.org/
http://janitor.kernelnewbies.org/
http://lwn.net/Kernel/LDD3/
http://lwn.net/Kernel/
http://kerneltrap.org/
http://kerneltraffic.org/

> and I think that
> means it's out-of-date.

That's not always true.

> Are there up-to-date documentations that are
> useful to read and explain how
> the kernel is built. (for example, is /usr/src/linux/Documentation a
> useful dir?)

Yes it is useful. Not everything in there is 100% up-to-date, but there
is still a *LOT* of useful documentation to be found there.

> Another question I'd like to ask is how and where did you start? I'd
> like to know how you managed to become
> linux kernel developers.
>

Most people start out fixing small bugs, cleanups etc or by implementing
some small feature or driver that they need. There's no fixed way.

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
Re: [PATCH] SLUB use cmpxchg_local
* Andi Kleen ([EMAIL PROTECTED]) wrote: > Mathieu Desnoyers <[EMAIL PROTECTED]> writes: > > > > The measurements I get (in cycles): > > enable interrupts (STI) disable interrupts (CLI) local > > CMPXCHG > > IA32 (P4)11282 26 > > x86_64 AMD64 125 102 19 > > What exactly did you benchmark here? On K8 CLI/STI are only supposed > to be a few cycles. pushf/popf might me more expensive, but not that much. > Hi Andi, I benchmarked cmpxchg_local vs local_irq_save/local_irq_restore. Details, and code, follow. * cpuinfo: processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 35 model name : AMD Athlon(tm)64 X2 Dual Core Processor 3800+ stepping: 2 cpu MHz : 2009.204 cache size : 512 KB physical id : 0 siblings: 2 core id : 0 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy bogomips: 4023.38 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp processor : 1 vendor_id : AuthenticAMD cpu family : 15 model : 35 model name : AMD Athlon(tm)64 X2 Dual Core Processor 3800+ stepping: 2 cpu MHz : 2009.204 cache size : 512 KB physical id : 0 siblings: 2 core id : 1 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy bogomips: 4018.49 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp * Test ran: /* test-cmpxchg-nolock.c * * Compare local cmpxchg with irq disable / enable. 
 */

#include <linux/jiffies.h>
#include <linux/compiler.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/calc64.h>
#include <asm/timex.h>
#include <asm/system.h>

#define NR_LOOPS 2

int test_val;

/* Time NR_LOOPS cmpxchg_local() operations with interrupts disabled. */
static void do_test_cmpxchg(void)
{
	int ret;
	unsigned long flags;	/* local_irq_save() expects unsigned long */
	unsigned int i;
	cycles_t time1, time2, time;
	long rem;

	local_irq_save(flags);
	preempt_disable();
	time1 = get_cycles();
	for (i = 0; i < NR_LOOPS; i++) {
		ret = cmpxchg_local(&test_val, 0, 0);
	}
	time2 = get_cycles();
	local_irq_restore(flags);
	preempt_enable();
	time = time2 - time1;

	printk(KERN_ALERT "test results: time for non locked cmpxchg\n");
	printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
	printk(KERN_ALERT "total time: %llu\n", time);
	/* Average per iteration; the remainder is not used. */
	time = div_long_long_rem(time, NR_LOOPS, &rem);
	printk(KERN_ALERT "-> non locked cmpxchg takes %llu cycles\n", time);
	printk(KERN_ALERT "test end\n");
}

/*
 * This test will have a higher standard deviation due to incoming interrupts.
 */
static void do_test_enable_int(void)
{
	unsigned long flags;	/* local_irq_save() expects unsigned long */
	unsigned int i;
	cycles_t time1, time2, time;
	long rem;

	local_irq_save(flags);
	preempt_disable();
	time1 = get_cycles();
	for (i = 0; i < NR_LOOPS; i++) {
		local_irq_restore(flags);
	}
	time2 = get_cycles();
	local_irq_restore(flags);
	preempt_enable();
	time = time2 - time1;

	printk(KERN_ALERT "test results: time for enabling interrupts (STI)\n");
	printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
	printk(KERN_ALERT "total time: %llu\n", time);
	time = div_long_long_rem(time, NR_LOOPS, &rem);
	printk(KERN_ALERT "-> enabling interrupts (STI) takes %llu cycles\n", time);
	printk(KERN_ALERT "test end\n");
}

/* Time NR_LOOPS nested local_irq_save() calls. */
static void do_test_disable_int(void)
{
	unsigned long flags, flags2;
	unsigned int i;
	cycles_t time1, time2, time;
	long rem;

	local_irq_save(flags);
	preempt_disable();
	time1 = get_cycles();
	for (i = 0; i < NR_LOOPS; i++) {
		local_irq_save(flags2);
	}
	time2 = get_cycles();
	local_irq_restore(flags);
	preempt_enable();
	time = time2 - time1;

	printk(KERN_ALERT "test results: time for disabling interrupts (CLI)\n");
	printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
	printk(KERN_ALERT "total time: %llu\n", time);
	time = div_long_long_rem(time, NR_LOOPS, &rem);
	printk(KERN_ALERT "-> disabling interrupts (CLI) takes %llu cycles\n", time);
Re: [PATCH 2/3] dma: override "dma_flags_set_dmaflush" for sn-ia64
On Tue, Aug 21, 2007 at 03:55:29PM -0500, James Bottomley wrote:
> .
> Almost every platform supports posted DMA ... its a property of most PCI
> bridge chips.
>

The term "posted DMA" is used to describe this behavior in the Altix
Device Driver Writer's Guide, but it may be confusing things here.
Maybe a better term will suggest itself if I can clarify

On Altix, DMA from a device isn't guaranteed to arrive in host memory
in the order it was sent from the device. This reordering can happen
in the NUMA interconnect (it's specifically not a PCI reordering.)

> ..
> This isn't possible on most platforms. PCI write posting can only be
> flushed by a read transaction on the device (or sometimes any device on
> the bridge). Either this interface is misnamed and misdescribed, or it
> can't work for most systems.
>

Clearly it wasn't described adequately...

A read transaction on the device will flush pending writes to the
device. But I'm worried about DMA from the device to host memory.
On Altix, there are two mechanisms that flush all in-flight DMA
to host memory: 1) an interrupt, and 2) a write to a memory region
which has a "barrier" attribute set. Obviously option 1 isn't
viable for performance reasons. This new interface is about making
"option 2" generally available. (As it is now, the only way to get
memory with the "barrier" attribute is to allocate it with
dma_alloc_coherent().)

--
Arthur
Re: [PATCH] SLUB use cmpxchg_local
On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:
> As I am going back through the initial cmpxchg_local implementation, it
> seems like it was executing __slab_alloc() with preemption disabled,
> which is wrong. new_slab() is not designed for that.

The version I sent you did not use preemption.

We need to make a decision if we want to go without preemption and
cmpxchg or with preemption and cmpxchg_local.

If we really want to do this then the implementation of all of these
components needs to result in competitive performance on all platforms.
Re: [PATCH] SLUB use cmpxchg_local
Mathieu Desnoyers <[EMAIL PROTECTED]> writes:
>
> The measurements I get (in cycles):
>               enable interrupts (STI)   disable interrupts (CLI)   local CMPXCHG
> IA32 (P4)     112                        82                         26
> x86_64 AMD64  125                       102                         19

What exactly did you benchmark here? On K8 CLI/STI are only supposed
to be a few cycles. pushf/popf might be more expensive, but not that much.

-Andi
Re: [PATCH 0/3] x86_64 EFI runtime service support
> current LinuxBIOS's path: the elfboot in LinuxBIOS will prepare the > e820 table, and jump to startup_32 in kernel. is that not good and > simple? The problem is that the zero page cannot be changed at all in this setup. Or rather it can be only changed by breaking LinuxBios. -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[LOCKDEP][2.6.23-rc2-rt]
Hi Ingo, here is a lockdep trace I just encountered in the latest rt patch series. (which has gotten a bit stale, btw.) Enjoy, Sven = [ BUG: bad unlock balance detected! ] - swapper/1 is trying to release lock (per_cpu_lock__slab_irq_locks_locked) at: [] kmem_cache_alloc+0xb4/0x150 but there are no more locks to release! other info that might help us debug this: 1 lock held by swapper/1: #0: (per_cpu_lock__slab_irq_locks_locked#7){--..}, at: [] c0 stack backtrace: Call Trace: [] print_unlock_inbalance_bug+0xf7/0x100 [] lock_release_non_nested+0x111/0x1a0 [] kmem_cache_alloc+0xb4/0x150 [] lock_release+0xd2/0x1f0 [] rt_spin_unlock+0x26/0x40 [] kmem_cache_alloc+0xb4/0x150 [] kobject_uevent_env+0x13c/0x520 [] trace_hardirqs_on+0xd/0x10 [] rt_mutex_slowunlock+0x54/0x90 [] get_bus+0x9/0x40 [] kobject_uevent+0x10/0x20 [] device_add+0x516/0x680 [] device_register+0x1e/0x30 [] device_create+0xec/0x130 [] sprintf+0x6d/0x70 [] add_preempt_count+0x2b/0x150 [] put_lock_stats+0x13/0x40 [] lock_release_holdtime+0x6b/0x90 [] mark_held_locks+0x10/0x90 [] trace_hardirqs_on+0xd/0x10 [] tty_register_device+0x74/0x100 [] rt_mutex_slowunlock+0x54/0x90 [] tty_register_driver+0x16c/0x2a0 [] pty_init+0x22e/0x570 [] kernel_init+0x194/0x490 [] trace_hardirqs_on+0xd/0x10 [] mark_held_locks+0x10/0x90 [] trace_hardirqs_on_thunk+0x3a/0x3c [] trace_hardirqs_on_caller+0xd7/0x170 [] child_rip+0xa/0x12 [] restore_args+0x0/0x30 [] kernel_init+0x0/0x490 [] child_rip+0x0/0x12 INFO: lockdep is turned off. --- | preempt count: ] | 0-level deep critical section nesting: [ cut here ] kernel BUG at kernel/rtmutex.c:682! 
invalid opcode: [1] PREEMPT SMP CPU 6 Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.23-rc2-rt1-debug #1 RIP: 0010:[] [] rt_spin_lock_slowlock+0x1b0 RSP: 0018:81041d837a10 EFLAGS: 00010046 RAX: 81031d836040 RBX: 81032c1542a0 RCX: RDX: 81031d836040 RSI: 81032c1542b8 RDI: 81032c1542a0 RBP: 81041d837ad0 R08: 0002 R09: 0001 R10: R11: R12: 0246 R13: 80d0 R14: 81011b9009c0 R15: 805606b4 FS:
Re: [PATCH] SLUB use cmpxchg_local
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
> On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:
>
> > - Rounding error.. you seem to round at 0.1ms, but I keep the values in
> >   cycles. The times that you get (1.1ms) seems strangely higher than
> >   mine, which are under 1000 cycles on a 3GHz system (less than 333ns).
> >   I guess there is both a ms - ns error there and/or not enough
> >   precision in your numbers.
>
> Nope the rounding for output is depending on the amount. Rounds to one
> digit after whatever unit we figured out is best to display.
>
> And multiplications (cyc2ns) do not result in rounding errors.
>

Ok, I see now that the 1.1ms was for the 10000 iterations, which makes
it about 230 ns/iteration for the 10000 times kmalloc(8) = 2.3ms test.

As I am going back through the initial cmpxchg_local implementation, it
seems like it was executing __slab_alloc() with preemption disabled,
which is wrong. new_slab() is not designed for that.

I'll try to run my tests on AMD64.

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
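The unit conversions argued over in this message can be checked with a few lines of integer arithmetic. This is a rough sketch only: the function names are made up for illustration, and the iteration count of 10000 is an assumption inferred from the figures quoted above (2.3 ms total at about 230 ns per iteration).

```shell
#!/bin/sh
# Hypothetical helpers for the arithmetic in this thread.

# cycles_to_ns CYCLES MHZ: convert a cycle count to whole nanoseconds
# (integer division, so the result is rounded down).
cycles_to_ns() {
    echo $(( $1 * 1000 / $2 ))
}

# per_iteration_ns TOTAL_NS ITERATIONS: average cost of one iteration.
per_iteration_ns() {
    echo $(( $1 / $2 ))
}

cycles_to_ns 1000 3000            # 1000 cycles on a 3 GHz (3000 MHz) CPU
per_iteration_ns 2300000 10000    # 2.3 ms spread over an assumed 10000 iterations
```

The first call prints 333 and the second prints 230, matching the "less than 333ns" and "about 230 ns/iteration" figures in the exchange. Integer division truncates, so treat the results as lower bounds when the inputs do not divide evenly.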
Re: [PATCH 0/6] writeback time order/delay fixes take 3
On Sun, 12 Aug 2007 17:11:20 +0800 Fengguang Wu <[EMAIL PROTECTED]> wrote:
> Andrew and Ken,
>
> Here are some more experiments on the writeback stuff.
> Comments are highly welcome~

I've been doing benchmarks lately to try and trigger fragmentation, and
one of them is a simulation of make -j N. It takes a list of all the .o
files in the kernel tree, randomly sorts them and then creates bogus
files with the same names and sizes in clean kernel trees.

This is basically creating a whole bunch of files in random order in a
whole bunch of subdirectories.

The results aren't pretty:
http://oss.oracle.com/~mason/compilebench/makej/compare-compile-dirs-0.png

The top graph shows one dot for each write over time. It shows that
ext3 is basically writing all over the place the whole time. But, ext3
actually wins the read phase, so the layout isn't horrible. My guess is
that if we introduce some write clustering by sending a group of inodes
down at the same time, it'll go much much better.

Andrew has mentioned bringing a few radix trees into the writeback paths
before, it seems like file servers and other general uses will benefit
from better clustering here.

I'm hoping to talk you into trying it out ;)

-chris
Please pull from 'fixes-2.6.23' branch
Please pull from 'fixes-2.6.23' branch of

	master.kernel.org:/pub/scm/linux/kernel/git/galak/powerpc.git fixes-2.6.23

to receive the following updates:

 arch/powerpc/sysdev/fsl_pci.c |    2 ++
 include/linux/pci_ids.h       |    6 ++++--
 2 files changed, 6 insertions(+), 2 deletions(-)

Kumar Gala (1):
      [POWERPC] Fix PCI Device ID for MPC8544/8533 processors

commit 15f6ddc7d9cf96f2ee88897c7164198ed6e45a77
Author: Kumar Gala <[EMAIL PROTECTED]>
Date:   Tue Aug 21 19:15:31 2007 -0500

    [POWERPC] Fix PCI Device ID for MPC8544/8533 processors

    The initial user manuals for MPC8544/8533 had some issues with properly
    documenting the device IDs for MPC8544/8533. These processors are almost
    identical and both show up on the reference boards.

    Fix up the quirks for PCIe support to handle MPC8533/E.

    Signed-off-by: Kumar Gala <[EMAIL PROTECTED]>

diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 9fb0ce5..114c90f 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -251,6 +251,8 @@ DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8568E, quirk_fsl_pcie_transpare
 DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8568, quirk_fsl_pcie_transparent);
 DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8567E, quirk_fsl_pcie_transparent);
 DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8567, quirk_fsl_pcie_transparent);
+DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8533E, quirk_fsl_pcie_transparent);
+DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8533, quirk_fsl_pcie_transparent);
 DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8544E, quirk_fsl_pcie_transparent);
 DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8544, quirk_fsl_pcie_transparent);
 DECLARE_PCI_FIXUP_EARLY(0x1957, PCI_DEVICE_ID_MPC8641, quirk_fsl_pcie_transparent);
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 07fc574..8938d59 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2092,8 +2092,10 @@
 #define PCI_DEVICE_ID_MPC8568	0x0021
 #define PCI_DEVICE_ID_MPC8567E	0x0022
 #define PCI_DEVICE_ID_MPC8567	0x0023
-#define PCI_DEVICE_ID_MPC8544E	0x0030
-#define PCI_DEVICE_ID_MPC8544	0x0031
+#define PCI_DEVICE_ID_MPC8533E	0x0030
+#define PCI_DEVICE_ID_MPC8533	0x0031
+#define PCI_DEVICE_ID_MPC8544E	0x0032
+#define PCI_DEVICE_ID_MPC8544	0x0033
 #define PCI_DEVICE_ID_MPC8641	0x7010
 #define PCI_DEVICE_ID_MPC8641D	0x7011
 
Re: [PATCH] SLUB use cmpxchg_local
On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:
> - Rounding error.. you seem to round at 0.1ms, but I keep the values in
>   cycles. The times that you get (1.1ms) seems strangely higher than
>   mine, which are under 1000 cycles on a 3GHz system (less than 333ns).
>   I guess there is both a ms - ns error there and/or not enough
>   precision in your numbers.

Nope the rounding for output is depending on the amount. Rounds to one
digit after whatever unit we figured out is best to display.

And multiplications (cyc2ns) do not result in rounding errors.
Re: RFC: drop support for gcc < 4.0
> How many people e.g. test -rc kernels compiled with gcc 3.2?

Why would that matter? It either works or not. If it doesn't work, it
can either be fixed, or support for that old compiler version can be
removed.

> One bug report "kernel doesn't work / crash / ... when compiled with
> gcc 3.2, but works when compiled with gcc 4.2" will most likely be
> lost in the big pile of unhandled bugs, not cause the removal of
> gcc 3.2 support...

While that might be true, it's a separate problem.

The only other policy than "only remove support if things are badly
broken" would be "only support what the GCC team supports", which would
be >= 4.1 now; and there are very good arguments for supporting more
than that with the Linux kernel.

> No, it's not about bugs in gcc, it's about kernel+gcc combinations
> that are mostly untested but officially supported.

What does "officially supported" mean? Especially the "officially"
part. Is this documented somewhere?

> E.g. how many kernel developers use kernels compiled without
> unit-at-a-time? And unit-at-a-time does paper over some bugs, e.g. at
> about half a dozen section mismatch bugs I've fixed recently are not
> present with it.
>
> If any developer is interested in supporting some certain old compiler
> version, he should be testing regularly with it.

Sounds like that's you ;-)

> If no developer is interested, we shouldn't claim to support using
> that compiler version. But as the discussions have shown gcc 4.0 is
> currently too high for making a cut, and it is not yet the right time
> for raising the minimum required gcc version.

Agreed.

Segher
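Whatever minimum version ends up being chosen, checking a compiler against it is mechanical. The following is a minimal sketch, assuming plain dotted numeric versions such as 3.4.6; all function names are hypothetical, and the per-architecture minimums are the ones floated earlier in this thread (4.0 on i386, 3.4 elsewhere), used purely as illustrative values and not as an agreed policy.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the kernel tree.

# version_to_int "4.1.2" -> 40102 (missing components count as 0).
# Runs in a subshell so the IFS change does not leak to the caller.
version_to_int() (
    IFS=.
    set -- $1
    echo $(( ${1:-0} * 10000 + ${2:-0} * 100 + ${3:-0} ))
)

# gcc_at_least HAVE WANT: succeed if version HAVE is >= version WANT.
gcc_at_least() {
    [ "$(version_to_int "$1")" -ge "$(version_to_int "$2")" ]
}

# Illustrative thresholds only, taken from the proposal in this thread.
min_gcc_for_arch() {
    case "$1" in
        i386) echo 4.0 ;;
        *)    echo 3.4 ;;
    esac
}

# check_gcc ARCH GCC_VERSION: report whether the version is acceptable.
check_gcc() {
    min=$(min_gcc_for_arch "$1")
    if gcc_at_least "$2" "$min"; then
        echo "ok: gcc $2 meets the minimum $min for $1"
    else
        echo "too old: gcc $2 is below the minimum $min for $1"
    fi
}

check_gcc i386 3.4.6
check_gcc ia64 3.4.6
```

With these assumed thresholds the first call reports gcc 3.4.6 as too old for i386 and the second accepts it for ia64. Version strings with non-numeric suffixes (for example a "-prerelease" tag) are not handled and would need to be stripped first.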
Re: [PATCH][2.6.23-rc2-mm2] small fix for ia64 icache sync patch
On Tue, 21 Aug 2007 14:12:02 -0700 "Luck, Tony" <[EMAIL PROTECTED]> wrote:
> > + if (pte_present(pteval) && // swap out ?
> > + pte_exec(pteval) && // flush only new executable page.
> > pte_user(pteval) && // ignore kernel page
> > (!pte_present(*ptep) || // do_no_page or swap in, migration,
> > pte_pfn(*ptep) != pte_pfn(pteval))) // do_wp_page(), page copy
>
> David Mosberger was concerned about the increase in code
> size from this inline function. We can reduce the bloat
> a bit by defining a macro that tests for "present &&
> executable && user-mode" in one go:
>
> #define pte_pux(pte) ((pte_val(pte) &
> (_PAGE_P|_PAGE_PL_MASK|_PAGE_AR_RX)) == \
> (_PAGE_P|_PAGE_PL_3|_PAGE_AR_RX))
>

Hmm, ok.

> Perhaps there is a better name than "pte_pux"? I don't know whether
> the code that this generates is faster, but it is smaller (bloat
> is only 3k instead of 4k).
>
> One last cleanup needed ... don't use C-99/C++ style comments.
>

ok.

-Kame
[PATCH] Add missing PCI capability IDs
These IDs are in pciutils, but haven't been added to the kernel yet.

Signed-off-by: Alex Chiang <[EMAIL PROTECTED]>
Signed-off-by: Matthew Wilcox <[EMAIL PROTECTED]>
---
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index 495d368..1ef8712 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -202,8 +202,12 @@
 #define  PCI_CAP_ID_CHSWP	0x06	/* CompactPCI HotSwap */
 #define  PCI_CAP_ID_PCIX	0x07	/* PCI-X */
 #define  PCI_CAP_ID_HT		0x08	/* HyperTransport */
-#define  PCI_CAP_ID_VNDR	0x09	/* Vendor specific capability */
+#define  PCI_CAP_ID_VNDR	0x09	/* Vendor specific */
+#define  PCI_CAP_ID_DBG		0x0A	/* Debug port */
+#define  PCI_CAP_ID_CCRC	0x0B	/* CompactPCI Central Resource Control */
 #define  PCI_CAP_ID_SHPC 	0x0C	/* PCI Standard Hot-Plug Controller */
+#define  PCI_CAP_ID_SSVID	0x0D	/* Bridge subsystem vendor/device ID */
+#define  PCI_CAP_ID_AGP3	0x0E	/* AGP Target PCI-PCI bridge */
 #define  PCI_CAP_ID_EXP 	0x10	/* PCI Express */
 #define  PCI_CAP_ID_MSIX	0x11	/* MSI-X */
 #define PCI_CAP_LIST_NEXT	1	/* Next capability in the list */
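For readers unfamiliar with how these IDs get used: each capability in PCI configuration space starts with an ID byte followed by a "next" pointer, and code looking for a capability walks that linked list. The following is only a userspace sketch over a simulated 256-byte config space, not the kernel's pci_find_capability(); find_cap() and make_demo_config() are hypothetical names invented for this illustration, and the register offsets are the standard ones as I understand them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCI_STATUS           0x06  /* low byte of the status register */
#define PCI_STATUS_CAP_LIST  0x10  /* device implements a capability list */
#define PCI_CAPABILITY_LIST  0x34  /* offset of the first capability pointer */
#define PCI_CAP_LIST_ID      0     /* capability ID byte */
#define PCI_CAP_LIST_NEXT    1     /* next capability in the list */
#define PCI_CAP_ID_SSVID     0x0D  /* one of the IDs added by the patch */
#define PCI_CAP_ID_EXP       0x10

/*
 * Hypothetical helper: return the config-space offset of capability
 * cap_id, or 0 if it is absent. The walk is bounded so that a
 * malformed (looping) list cannot spin forever.
 */
uint8_t find_cap(const uint8_t cfg[256], uint8_t cap_id)
{
	int ttl = 48;	/* upper bound on list entries visited */
	uint8_t pos;

	assert(cfg != NULL);
	if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
		return 0;

	/* The two low bits of capability pointers are reserved. */
	pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
	while (pos >= 0x40 && ttl-- > 0) {
		uint8_t id = cfg[pos + PCI_CAP_LIST_ID];

		if (id == 0xFF)
			break;
		if (id == cap_id)
			return pos;
		pos = cfg[pos + PCI_CAP_LIST_NEXT] & 0xFC;
	}
	return 0;
}

/* Hypothetical test fixture: SSVID at 0x40 chained to PCI Express at 0x50. */
void make_demo_config(uint8_t cfg[256])
{
	memset(cfg, 0, 256);
	cfg[PCI_STATUS] = PCI_STATUS_CAP_LIST;
	cfg[PCI_CAPABILITY_LIST] = 0x40;
	cfg[0x40 + PCI_CAP_LIST_ID] = PCI_CAP_ID_SSVID;
	cfg[0x40 + PCI_CAP_LIST_NEXT] = 0x50;
	cfg[0x50 + PCI_CAP_LIST_ID] = PCI_CAP_ID_EXP;
	cfg[0x50 + PCI_CAP_LIST_NEXT] = 0x00;
}
```

With the fixture above, looking up PCI_CAP_ID_SSVID should land on offset 0x40 and PCI_CAP_ID_EXP on 0x50, while an ID that is not in the list yields 0.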
Re: [PATCH] ver_linux is [censored]
On 22/08/07, Al Viro <[EMAIL PROTECTED]> wrote: > On Tue, Aug 21, 2007 at 11:56:32AM +0200, Jesper Juhl wrote: > > On 21/08/07, Alexey Dobriyan <[EMAIL PROTECTED]> wrote: > > > Commit 4a645d5ea65baaa5736bcb566673bf4a351b2ad8 broke ver_linux > > > on etch which glibc has 3-digit version number. > > > > Whoops, sorry about that. > > > > > Patch replaces awk > > > wanking with more robust sed wanking. > > > > > > Tested on gentoo, etch, centos 4.2. > > > > > I tested your patch on Slackware 12.0, Debian 3.1 & Gentoo Base System > > release 1.12.9 and it works fine on those as well. > > How about simply doing > sh -c 'cat /proc/$$/maps'|sed -n -e '/^.*\/libc-\([^/]*\)\.so$/{s//\1/;p;q}' > and to hell with parsing ls -l output? > Works for me. -- Jesper Juhl <[EMAIL PROTECTED]> Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] SLUB use cmpxchg_local
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
> On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:
>
> > Are you running a UP or SMP kernel ? If you run a UP kernel, the
> > cmpxchg_local and cmpxchg are identical.
>
> UP.
>
> > Oh, and if you run your tests at boot time, the alternatives code may
> > have removed the lock prefix, therefore making cmpxchg and cmpxchg_local
> > exactly the same.
>
> Tests were run at boot time.
>
> That still does not explain kmalloc not showing improvements.
>

Hrm, weird.. because it should. Here are the numbers I posted previously:

The measurements I get (in cycles):

               enable interrupts (STI)   disable interrupts (CLI)   local CMPXCHG
IA32 (P4)      112                       82                         26
x86_64 AMD64   125                       102                        19

So both AMD64 and IA32 should be improved.

So why those improvements are not shown in your test ? A few possible causes:

- Do you have any CONFIG_DEBUG_* options activated ? smp_processor_id() may end up being more expensive in these cases.
- Rounding error.. you seem to round at 0.1ms, but I keep the values in cycles.

The times that you get (1.1ms) seems strangely higher than mine, which are under 1000 cycles on a 3GHz system (less than 333ns). I guess there is both a ms - ns error there and/or not enough precision in your numbers.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
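To make the unit mismatch above concrete, here is a small sketch of the cycle/time arithmetic. It is plain userspace C; cycles_to_ns() and ns_to_cycles() are hypothetical helper names, and the 3 GHz clock is simply the figure quoted in the mail.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert a cycle count to nanoseconds at clock rate hz (truncating). */
uint64_t cycles_to_ns(uint64_t cycles, uint64_t hz)
{
	assert(hz != 0);
	return cycles * NSEC_PER_SEC / hz;
}

/* Convert nanoseconds to cycles at clock rate hz (truncating). */
uint64_t ns_to_cycles(uint64_t ns, uint64_t hz)
{
	assert(hz != 0);
	return ns * hz / NSEC_PER_SEC;
}
```

At 3 GHz, cycles_to_ns(1000, 3000000000ULL) gives 333, matching the "less than 333ns" bound, while a reported 1.1 ms corresponds to ns_to_cycles(1100000, 3000000000ULL), or 3300000 cycles. That is more than three orders of magnitude above the per-call figures, which is why a unit or precision problem looks like the likelier explanation. Both helpers multiply before dividing, so they are only meant for small inputs like these.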
Re: [PATCH 0/3] x86_64 EFI runtime service support
On 8/21/07, Andi Kleen <[EMAIL PROTECTED]> wrote: > On Tue, Aug 21, 2007 at 03:41:44AM -0700, H. Peter Anvin wrote: > > Andi Kleen wrote: > > >> - "struct boot_params" (the zeropage) is kept as a legacy interface. > > > > > > Legacy interface for what? Just for kexec utils which never should > > > have been using it anyways keeping backwards cruft around seems > > > misplac.ed > > > > Worse. LinuxBIOS. :( > > Sigh. Perhaps it should be renamed AntiLinuxBios: it seems to be actively > adverse. current LinuxBIOS's path: the elfboot in LinuxBIOS will prepare the e820 table, and jump to startup_32 in kernel. is that not good and simple? kernel is not supposed to switch back and forth to get such memmap... Why not using ACPI mean AntiLinux? YH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] SLUB use cmpxchg_local
On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:

> Are you running a UP or SMP kernel ? If you run a UP kernel, the
> cmpxchg_local and cmpxchg are identical.

UP.

> Oh, and if you run your tests at boot time, the alternatives code may
> have removed the lock prefix, therefore making cmpxchg and cmpxchg_local
> exactly the same.

Tests were run at boot time.

That still does not explain kmalloc not showing improvements.
Re: [PATCH] SLUB use cmpxchg_local
* Christoph Lameter ([EMAIL PROTECTED]) wrote: > On Tue, 21 Aug 2007, Mathieu Desnoyers wrote: > > > Using cmpxchg_local vs cmpxchg has a clear impact on the fast paths, as > > shown below: it saves about 60 to 70 cycles for kmalloc and 200 cycles > > for the kmalloc/kfree pair (test 2). > > H.. I wonder if the AMD processors simply do the same in either > version. No supposed to. I remember having posted numbers that show a difference. Are you running a UP or SMP kernel ? If you run a UP kernel, the cmpxchg_local and cmpxchg are identical. Oh, and if you run your tests at boot time, the alternatives code may have removed the lock prefix, therefore making cmpxchg and cmpxchg_local exactly the same. Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] SLUB use cmpxchg_local
* Mathieu Desnoyers ([EMAIL PROTECTED]) wrote: > * Christoph Lameter ([EMAIL PROTECTED]) wrote: > > On Tue, 21 Aug 2007, Mathieu Desnoyers wrote: > > > > > - Changed smp_rmb() for barrier(). We are not interested in read order > > > across cpus, what we want is to be ordered wrt local interrupts only. > > > barrier() is much cheaper than a rmb(). > > > > But this means a preempt disable is required. RT users do not want that. > > Without preemption the processor can be moved after c has been determined. > > That is why the smp_rmb() is there. > > preemption is required if we want to use cmpxchg_local anyway. > > We may have to find a way to use preemption while being able to give an > upper bound on the preempt disabled execution time. I think I got a way > to do this yesterday.. I'll dig in my patches. > Yeah, I remember having done so : moving the preempt disable nearer to the cmpxchg, checking if the cpuid has changed between the raw_smp_processor_id() read and the preempt_disable done later, redo if it is different. It makes the slow path faster, but makes the fast path more complex, therefore I finally dropped the patch. And we talk about ~10 cycles for the slow path here, I doubt it's worth the complexity added to the fast path. -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] SLUB use cmpxchg_local
On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:

> Using cmpxchg_local vs cmpxchg has a clear impact on the fast paths, as
> shown below: it saves about 60 to 70 cycles for kmalloc and 200 cycles
> for the kmalloc/kfree pair (test 2).

Hmmm.. I wonder if the AMD processors simply do the same in either version.
Re: [PATCH] SLUB use cmpxchg_local
On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:

> kmalloc(8)/kfree = 112 cycles
> kmalloc(16)/kfree = 103 cycles
> kmalloc(32)/kfree = 103 cycles
> kmalloc(64)/kfree = 103 cycles
> kmalloc(128)/kfree = 112 cycles
> kmalloc(256)/kfree = 111 cycles
> kmalloc(512)/kfree = 111 cycles
> kmalloc(1024)/kfree = 111 cycles
> kmalloc(2048)/kfree = 121 cycles

Looks good. This improves handling for short lived objects about threefold.

> kmalloc(4096)/kfree = 650 cycles
> kmalloc(8192)/kfree = 1042 cycles
> kmalloc(16384)/kfree = 1149 cycles

Hmmm... The page allocator is really bad here.

Could we use the cmpxchg_local approach for the per cpu queues in the page_allocator? May have an even greater influence on overall system performance than the SLUB changes.
Re: [PATCH] SLUB use cmpxchg_local
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
> On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:
>
> > SLUB Use cmpxchg() everywhere.
> >
> > It applies to "SLUB: Single atomic instruction alloc/free using
> > cmpxchg".
>
> > +++ slab/mm/slub.c 2007-08-20 18:42:28.0 -0400
> > @@ -1682,7 +1682,7 @@ redo:
> >
> >  	object[c->offset] = freelist;
> >
> > -	if (unlikely(cmpxchg_local(&c->freelist, freelist, object) != freelist))
> > +	if (unlikely(cmpxchg(&c->freelist, freelist, object) != freelist))
> >  		goto redo;
> >  	return;
> >  slow:
>
> Ok so regular cmpxchg, no cmpxchg_local. cmpxchg_local does not bring
> anything more? My measurements did not show any difference. I measured on
> Athlon64. What processor is being used?
>

This patch only cleans up the tree before proposing my cmpxchg_local changes. There was an inconsistent use of cmpxchg/cmpxchg_local there.

Using cmpxchg_local vs cmpxchg has a clear impact on the fast paths, as shown below: it saves about 60 to 70 cycles for kmalloc and 200 cycles for the kmalloc/kfree pair (test 2).

Pros :
- we can use barrier() instead of rmb()
- cmpxchg_local is faster

Con :
- we must disable preemption

I use a 3GHz Pentium 4 for my tests.

Results (compared to cmpxchg_local numbers) :

SLUB Performance testing

1. Kmalloc: Repeatedly allocate then free test (kfree here is slow path)

(all values in cycles)
            * cmpxchg            * cmpxchg_local
size        kmalloc   kfree      kmalloc   kfree
8           271       645        83        363
16          158       428        85        372
32          153       446        92        377
64          178       459        115       397
128         247       481        179       438
256         363       605        314       564
512         449       677        398       615
1024        626       810        573       745
2048        681       869        629       816
4096        471       575        473       548
8192        666       747        659       745
16384       736       853        724       843

2. Kmalloc: alloc/free test

(kmalloc(size)/kfree pair, in cycles)
size        * cmpxchg   * cmpxchg_local
8           321         112
16          308         103
32          311         103
64          310         103
128         306         112
256         325         111
512         324         111
1024        322         111
2048        309         121
4096        678         650
8192        1027        1042
16384       1204        1149

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
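For anyone following the thread without the SLUB source at hand, the "store the old head in the object, then cmpxchg the head, redo on failure" pattern in the quoted hunk can be sketched in userspace with C11 atomics. This is an analogy only and not the SLUB code: C11 has no equivalent of cmpxchg_local (an unlocked cmpxchg that is only safe against the local CPU), so the sketch uses the fully atomic atomic_compare_exchange_weak(), and struct object, struct freelist and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical free object: the link to the next free object lives inside it. */
struct object {
	struct object *next;
};

struct freelist {
	_Atomic(struct object *) head;
};

void freelist_init(struct freelist *fl)
{
	assert(fl != NULL);
	atomic_init(&fl->head, NULL);
}

/*
 * Push obj onto the list. Mirrors the shape of the quoted hunk:
 * link the object to the head we observed, then try to swing the
 * head to the object; if the head changed in the meantime, redo.
 */
void freelist_push(struct freelist *fl, struct object *obj)
{
	struct object *old;

	assert(fl != NULL && obj != NULL);
	old = atomic_load(&fl->head);
	do {
		obj->next = old;	/* cf. object[c->offset] = freelist */
		/* On failure, old is reloaded with the current head. */
	} while (!atomic_compare_exchange_weak(&fl->head, &old, obj));
}
```

Only the push side is shown. A matching lock-free pop needs extra care (the classic ABA problem), which is outside what this sketch tries to illustrate.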
Re: 2.6.23-rc2-mm2
On Sun, 19 Aug 2007 15:56:07 + (UTC) richard kennedy <[EMAIL PROTECTED]> wrote: > On Thu, 09 Aug 2007 22:42:54 -0700, Andrew Morton wrote: > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23- > rc2/2.6.23-rc2-mm2/ > > > > - Various problems from 2.6.23-rc2-mm1 were fixed > > > > > > > > Boilerplate: > > > > - See the `hot-fixes' directory for any important updates to this > > patchset. > > > > - To fetch an -mm tree using git, use (for example) > > > > git-fetch > > git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git > > tag v2.6.16-rc2-mm1 git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1 > > > Hi Andrew, Please always do reply-to-all. Otherwise you end up thinking that you're being ignored ;) > the git tree you mentioned in the boilerplate doesn't seem to have been > updated in about 7 weeks. > 2.6.22-rc6-mm1 is the last tag I can see on the summary page. Is > something broken ? Yes, the software which auto-imports -mm into git appears to have broken a few weeks ago. Matthias has been informed, but I guess he is busy. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] SLUB use cmpxchg_local
On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:

> * cmpxchg_local Slub test
> kmalloc(8) = 83 cycles kfree = 363 cycles
> kmalloc(16) = 85 cycles kfree = 372 cycles
> kmalloc(32) = 92 cycles kfree = 377 cycles
> kmalloc(64) = 115 cycles kfree = 397 cycles
> kmalloc(128) = 179 cycles kfree = 438 cycles

So for consecutive allocs of small slabs up to 128 bytes this effectively doubles the speed of kmalloc.

> kmalloc(256) = 314 cycles kfree = 564 cycles
> kmalloc(512) = 398 cycles kfree = 615 cycles
> kmalloc(1024) = 573 cycles kfree = 745 cycles

Less of a benefit.

> kmalloc(2048) = 629 cycles kfree = 816 cycles

Almost as before.

> kmalloc(4096) = 473 cycles kfree = 548 cycles
> kmalloc(8192) = 659 cycles kfree = 745 cycles
> kmalloc(16384) = 724 cycles kfree = 843 cycles

Page allocator pass through measurements.
Re: [PATCH] SLUB use cmpxchg_local
* Christoph Lameter ([EMAIL PROTECTED]) wrote: > On Tue, 21 Aug 2007, Mathieu Desnoyers wrote: > > > - Changed smp_rmb() for barrier(). We are not interested in read order > > across cpus, what we want is to be ordered wrt local interrupts only. > > barrier() is much cheaper than a rmb(). > > But this means a preempt disable is required. RT users do not want that. > Without preemption the processor can be moved after c has been determined. > That is why the smp_rmb() is there. preemption is required if we want to use cmpxchg_local anyway. We may have to find a way to use preemption while being able to give an upper bound on the preempt disabled execution time. I think I got a way to do this yesterday.. I'll dig in my patches. -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] SLUB use cmpxchg_local
Reformatting...

* Mathieu Desnoyers ([EMAIL PROTECTED]) wrote:
> Hi Christoph,
>
> If you are interested in the raw numbers:
>
> The (very basic) test module follows. Make sure you change get_cycles()
> for get_cycles_sync() if you plan to run this on x86_64.
>
> (tests taken on a 3GHz Pentium 4)
> (Note: test 1 uses the kfree slow path, as figured out by instrumentation)

SLUB Performance testing

1. Kmalloc: Repeatedly allocate then free test

(all values in cycles)
          * slub HEAD, test 1   * Slub HEAD, test 2   * cmpxchg_local Slub test
size      kmalloc   kfree       kmalloc   kfree       kmalloc   kfree
8         201       351         190       351         83        363
16        198       359         195       360         85        372
32        200       381         201       370         92        377
64        224       394         245       389         115       397
128       285       424         283       413         179       438
256       411       546         409       547         314       564
512       480       619         476       616         398       615
1024      623       750         628       753         573       745
2048      686       811         684       811         629       816
4096      482       538         480       539         473       548
8192      680       734         661       746         659       745
16384     713       843         741       856         724       843

2. Kmalloc: alloc/free test

(kmalloc(size)/kfree pair, in cycles)
size      * slub HEAD, test 1   * Slub HEAD, test 2   * cmpxchg_local Slub test
8         322                   323                   112
16        318                   318                   103
32        318                   318                   103
64        325                   318                   103
128       318                   318                   112
256       328                   328                   111
512       328                   328                   111
1024      328                   328                   111
2048      328                   328                   121
4096      678                   648                   650
8192      1013                  1009                  1042
16384     1157                  1105                  1149

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
Re: + git-net-fix-export.patch added to -mm tree
From: [EMAIL PROTECTED]
Date: Tue, 21 Aug 2007 16:03:06 -0700

> Subject: git-net fix export
> From: Andrew Morton <[EMAIL PROTECTED]>
>
> Must be silly season or something.
>
> Cc: "David S. Miller" <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>

I've applied this to net-2.6.24, thanks Andrew.
Re: [PATCH] SLUB use cmpxchg_local
On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:

> SLUB Use cmpxchg() everywhere.
>
> It applies to "SLUB: Single atomic instruction alloc/free using
> cmpxchg".

> +++ slab/mm/slub.c 2007-08-20 18:42:28.0 -0400
> @@ -1682,7 +1682,7 @@ redo:
>
>  	object[c->offset] = freelist;
>
> -	if (unlikely(cmpxchg_local(&c->freelist, freelist, object) != freelist))
> +	if (unlikely(cmpxchg(&c->freelist, freelist, object) != freelist))
>  		goto redo;
>  	return;
>  slow:

Ok so regular cmpxchg, no cmpxchg_local. cmpxchg_local does not bring anything more? My measurements did not show any difference. I measured on Athlon64. What processor is being used?
Re: [PATCH] ver_linux is [censored]
On Tue, Aug 21, 2007 at 11:56:32AM +0200, Jesper Juhl wrote: > On 21/08/07, Alexey Dobriyan <[EMAIL PROTECTED]> wrote: > > Commit 4a645d5ea65baaa5736bcb566673bf4a351b2ad8 broke ver_linux > > on etch which glibc has 3-digit version number. > > Whoops, sorry about that. > > > Patch replaces awk > > wanking with more robust sed wanking. > > > > Tested on gentoo, etch, centos 4.2. > > > I tested your patch on Slackware 12.0, Debian 3.1 & Gentoo Base System > release 1.12.9 and it works fine on those as well. How about simply doing sh -c 'cat /proc/$$/maps'|sed -n -e '/^.*\/libc-\([^/]*\)\.so$/{s//\1/;p;q}' and to hell with parsing ls -l output? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] SLUB use cmpxchg_local
On Tue, 21 Aug 2007, Mathieu Desnoyers wrote:

> - Changed smp_rmb() for barrier(). We are not interested in read order
> across cpus, what we want is to be ordered wrt local interrupts only.
> barrier() is much cheaper than a rmb().

But this means a preempt disable is required. RT users do not want that. Without preemption the processor can be moved after c has been determined. That is why the smp_rmb() is there.
Re: Problems with IDE on linux 2.6.22.X
On 08/22/2007 01:00 AM, Alan Cox wrote:

> > "Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support"
>
> Not for the newer chips. You want ATA/SATA (PIIX and possibly AHCI)
> support from the new drivers, SCSI disk and SCSI cd.

That _is_ the *config description for the new (CONFIG_ATA_PIIX) driver (in 2.6.22.x).

> > where you may need to boot with a "libata.atapi_enabled=0" kernel parameter.
>
> Why deliberately disable atapi when you need atapi ?

Because he described the problem that if he got his (SATA) disk supported he lost his (PATA) DVD drive. Although I'm, as said, not completely sure it would actually work, I suggested compiling in both ATA_PIIX (yes, and sd) for his drive and the IDE PIIX/ICH driver and ide-cd for his DVD, where if it works at all, passing the above option may or may not be useful.

But his report, although expansive, was a little unclear. Let's wait for what happens with only ATA_PIIX.

Rene.
Re: [PATCH] SLUB use cmpxchg_local
* Christoph Lameter ([EMAIL PROTECTED]) wrote: > On Tue, 21 Aug 2007, Mathieu Desnoyers wrote: > > > - Fixed an erroneous test in slab_free() (logic was flipped from the > > original code when testing for slow path. It explains the wrong > > numbers you have with big free). > > If you look at the numbers that I posted earlier then you will see that > even the measurements without free were not up to par. > I seem to get a clear performance improvement in the kmalloc fast path. > > It applies on top of the > > "SLUB Use cmpxchg() everywhere" patch. > > Which one is that? > This one: SLUB Use cmpxchg() everywhere. It applies to "SLUB: Single atomic instruction alloc/free using cmpxchg". Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> --- mm/slub.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: slab/mm/slub.c === --- slab.orig/mm/slub.c 2007-08-20 18:42:16.0 -0400 +++ slab/mm/slub.c 2007-08-20 18:42:28.0 -0400 @@ -1682,7 +1682,7 @@ redo: object[c->offset] = freelist; - if (unlikely(cmpxchg_local(>freelist, freelist, object) != freelist)) + if (unlikely(cmpxchg(>freelist, freelist, object) != freelist)) goto redo; return; slow: > > | slab.git HEAD slub (min-max)| cmpxchg_local slub > > kmalloc(8) | 190 - 201 | 83 > > kfree(8) | 351 - 351 |363 > > kmalloc(64) | 224 - 245 |115 > > kfree(64)| 389 - 394 |397 > > kmalloc(16384)|713 - 741 |724 > > kfree(16384) | 843 - 856 |843 > > > > Therefore, there seems to be a repeatable gain on the kmalloc fast path > > (more than twice faster). No significant performance hit for the kfree > > case, but no gain neither, same for large kmalloc, as expected. > > There is a consistent loss on slab_free it seems. The 16k numbers are > irrelevant since we do not use slab_alloc/slab_free due to the direct pass > through patch but call the page allocator directly. That also explains > that there is no loss there. > Yes. 
slab_free in these tests falls mostly into __slab_free() slow path (I instrumented the number of slow and fast path to get this). The small performance hit (~10 cycles) can be explained by the added preempt_disable()/preempt_enable(). > The kmalloc numbers look encouraging. I will check to see if I can > reproduce it once I sort out the patches. Ok. -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: NFS hang + umount -f: better behaviour requested.
On Tue, 21 Aug 2007 14:50:42 EDT, John Stoffel said:

> Now maybe those issues are raised when you have a Linux NFS server
> with Solaris clients. But in my book, reliable NFS servers are key,
> and if they are reliable, 'soft,intr' works just fine.

And you don't need all that ext3 journal overhead if your disk drives are reliable too. Gotcha. :)
input: limit memory allocated by uinput ff drivers
input: limit memory allocated by uinput ff drivers

Don't let force feedback drivers allocate more than 256K of kernel memory. On kernel 2.6.22 this causes a kernel OOPS with the SLUB memory allocator; on later kernels the drivers may allocate large amounts of memory.

Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>
---
 drivers/input/ff-core.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- linux-2.6.22.noarch.orig/drivers/input/ff-core.c
+++ linux-2.6.22.noarch/drivers/input/ff-core.c
@@ -306,6 +306,7 @@ int input_ff_create(struct input_dev *de
 {
 	struct ff_device *ff;
 	int i;
+	int needed_mem;

 	if (!max_effects) {
 		printk(KERN_ERR
@@ -313,8 +314,11 @@ int input_ff_create(struct input_dev *de
 		return -EINVAL;
 	}

-	ff = kzalloc(sizeof(struct ff_device) +
-		     max_effects * sizeof(struct file *), GFP_KERNEL);
+	needed_mem = sizeof(struct ff_device) + max_effects * sizeof(struct file *);
+	if (needed_mem > 256 * 1024)
+		return -ENOMEM;
+
+	ff = kzalloc(needed_mem, GFP_KERNEL);
 	if (!ff)
 		return -ENOMEM;

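One possible refinement, offered as a sketch rather than as a correction of the patch: since the computed size is stored in an int, a very large max_effects could in principle wrap or truncate before it is compared against the limit. Bounding the count by division first avoids doing the multiplication on an untrusted value. The code below is plain userspace C; ff_alloc_size(), FF_MEM_LIMIT and the header/slot parameters (standing in for sizeof(struct ff_device) and sizeof(struct file *)) are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define FF_MEM_LIMIT ((size_t)256 * 1024)	/* the 256K cap from the patch */

/*
 * Return header + max_effects * slot if that total is within
 * FF_MEM_LIMIT, or 0 if it would exceed the limit. The count is
 * checked by division, so the multiplication below cannot overflow.
 */
size_t ff_alloc_size(size_t header, size_t max_effects, size_t slot)
{
	assert(slot != 0);
	if (header > FF_MEM_LIMIT)
		return 0;
	if (max_effects > (FF_MEM_LIMIT - header) / slot)
		return 0;
	return header + max_effects * slot;
}
```

A caller would treat a return of 0 the way the patch treats an oversized request, that is, fail with -ENOMEM before ever calling the allocator.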
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Tue, 21 Aug 2007 09:16:43 PDT, "Paul E. McKenney" said:

> I agree that instant gratification is hard to come by when synching
> up compiler and kernel versions. Nonetheless, it should be possible
> to create APIs that are conditioned on the compiler version.

We've tried that, sort of. See the mess surrounding the whole extern/static/inline/__whatever boondoggle, which seems to have changed semantics in every single gcc release since 2.95 or so. And recently mention was made that gcc4.4 will have *new* semantics in this area.

Yee. Hah.
Re: Problems with IDE on linux 2.6.22.X
> "Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support" Not for the newer chips. You want ATA/SATA (PIIX and possibly AHCI) support from the new drivers, SCSI disk and SCSI cd. > where you may need to boot with a "libata.atapi_enabled=0" kernel parameter. Why deliberately disable atapi when you need atapi ? Alan - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 0/7] Postphone reclaim laundry to write at high water marks
On Wed, 22 Aug 2007, Peter Zijlstra wrote:

> Also, all I want is for slab to honour gfp flags like page allocation
> does, nothing more, nothing less.
>
> (well, actually slightly less, since I'm only really interested in the
> ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER -> ALLOC_NO_WATERMARKS transition and
> not all higher ones)

I am still not sure what that brings you. There may be multiple PF_MEMALLOC allocations going on at the same time. On a large system with N cpus there may be more than N of these that can steal objects from one another. A NUMA system will be shot anyways if memory gets that problematic to handle, since the OS cannot effectively place memory if all zones are overallocated so that only a few pages are left.

> I want slab to fail when a similar page alloc would fail, no magic.

Yes I know. I do not want allocations to fail; I want reclaim to occur so that no allocation has to fail. We need provisions that make sure that we never get into such a bad memory situation, one that would cause severe slowness and usually end up in a livelock anyways.

> > > Anonymous pages are there to stay, and we cannot tell people how to
> > > use them. So we need some free or freeable pages in order to avoid the
> > > vm deadlock that arises from all memory dirty.
> >
> > No one is trying to abolish Anonymous pages. Free memory is readily
> > available on demand if one calls reclaim. Your scheme introduces complex
> > negotiations over a few scraps of memory when large amounts of memory
> > would still be readily available if one would do the right thing and call
> > into reclaim.
>
> This is the thing I contend, there need not be large amounts of memory
> around. In my test prog the hot code path fits into a single page, the
> rest can be anonymous.

That's a bit extreme. We need to make sure that there are larger amounts of memory around. Pages are used for all sorts of short term uses (like slab shrinking etc etc.). If memory is that low that a single page matters then we are in very bad shape anyways.

> > Sounds like you would like to change the way we handle memory in general
> > in the VM? Reclaim (and thus finding freeable pages) is basic to Linux
> > memory management.
>
> Not quite, currently we have free pages in the reserves, if you want to
> replace some (or all) of that by freeable pages then that is a change.

We have free pages primarily to optimize the allocation, meaning we do not have to run reclaim on every call. We want to use all of memory. The reserves are there for the case that we cannot call into reclaim. The easy solution if that is problematic is to enhance the reclaim to work in the critical situations that we care about.

> > Sorry I just got into this a short time ago and I may need a few cycles
> > to get this all straight. An approach that uses memory instead of
> > ignoring available memory is certainly better.
>
> Sure if and when possible. There will always be need to fall back to the
> reserves.

Maybe. But we can certainly avoid that as much as possible, which would also increase our ability to use all available memory instead of leaving some of it unused.

> A bit off-topic, re that reclaim from atomic context:
> Currently we try to hold spinlocks only for short periods of time so
> that reclaim can be preempted, if you run all of reclaim from a
> non-preemptible context you get very large preemption latencies and if
> done from int context it'd also generate large int latencies.

If you call into the page allocator from an interrupt context then you are already in bad shape since we may check pcps lists and then potentially have to traverse the zonelists and check all sorts of things.

If we would implement atomic reclaim then the reserves may become a latency optimization. At least we will not fail anymore if the reserves are out.
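For readers trying to follow the ALLOC_* transition being argued about, the sketch below models how the page allocator's watermark test loosens as flags are added. It is a simplified Python model of the 2.6-era zone_watermark_ok() logic for an order-0 request; higher orders and lowmem_reserve are left out, and the flag values and page counts are made up for illustration rather than taken from the kernel headers.

```python
# Toy model of the watermark test. Flag values are illustrative only.
ALLOC_HIGH = 0x1            # caller may dip further into the reserve
ALLOC_HARDER = 0x2          # caller may dip further still
ALLOC_NO_WATERMARKS = 0x4   # e.g. PF_MEMALLOC: watermarks are ignored


def watermark_ok(free_pages, mark, alloc_flags):
    """Return True if an order-0 allocation may proceed in this model."""
    if alloc_flags & ALLOC_NO_WATERMARKS:
        # No watermark check at all; only a free page is needed.
        return free_pages > 0
    threshold = mark
    if alloc_flags & ALLOC_HIGH:
        threshold -= threshold // 2
    if alloc_flags & ALLOC_HARDER:
        threshold -= threshold // 4
    return free_pages > threshold
```

With a mark of 128 pages, a plain request needs more than 128 free pages, an ALLOC_HIGH request more than 64, and ALLOC_HIGH|ALLOC_HARDER more than 48; the jump to ALLOC_NO_WATERMARKS is the step where only the reserves themselves remain.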
Re: [RFC 0/7] Postphone reclaim laundry to write at high water marks
On Tue, 21 Aug 2007, Rik van Riel wrote:

> Christoph Lameter wrote:
>
> > I want general improvements to reclaim to address the issues that you see
> > and other issues related to reclaim instead of the strange code that makes
> > PF_MEMALLOC allocs compete for allocations from a single slab and putting
> > logic into the kernel to decide which allocs to fail. We can reclaim after
> > all. Its just a matter of finding the right way to do this.
>
> The simplest way of achieving that would be to allow
> recursion of the page reclaim code, under the condition
> that the second level call can only reclaim clean pages,
> while the "outer" call does what the VM does today.

Yes that is what the precursor to this patchset does. See

http://marc.info/?l=linux-mm=118710207203449=2

This one did not even come up to the level of the earlier one. Sigh.

The way forward may be:

1. Like in the earlier patchset allow reentry to reclaim under PF_MEMALLOC if we are out of all memory.

2. Do the laundry as here but do not write out laundry directly. Instead move laundry to a new lru style list in the zone structure. This will allow the recursive reclaim to also trigger writeout of pages (what this patchset was supposed to accomplish).

3. Perform writeback only from kswapd. Make other threads wait on kswapd if memory is low, we can wait and writeback still has to progress.

4. Then allow reclaim of GFP_ATOMIC allocs (see http://marc.info/?l=linux-kernel=118710595617696=2). Atomic reclaim can then also put pages onto the zone laundry lists from where it is going to be picked up and written out by kswapd ASAP. This one may be tricky so maybe keep this separate.
Re: CFS review
Ingo Molnar wrote:
> * Al Boldi <[EMAIL PROTECTED]> wrote:
> > There is one workload that still isn't performing well; it's a
> > web-server workload that spawns 1K+ client procs. It can be emulated
> > by using this:
> >
> > for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done
>
> on bash i did this as:
>
> for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done
>
> and this quickly creates a monster-runqueue with tons of ping tasks
> pending. (i replaced 10.1 with the IP of another box on the same LAN as
> the testbox) Is this what should happen?

Yes, sometimes they start pending and sometimes they run immediately.

> > The problem is that consecutive runs don't give consistent results and
> > sometimes stalls. You may want to try that.
>
> well, there's a natural saturation point after a few hundred tasks
> (depending on your CPU's speed), at which point there's no idle time
> left. From that point on things get slower progressively (and the
> ability of the shell to start new ping tasks is impacted as well), but
> that's expected on an overloaded system, isnt it?

Of course, things should get slower with higher load, but it should be consistent without stalls.

To see this problem, make sure you boot into /bin/sh with the normal VGA console (ie. not fb-console). Then try each loop a few times to show different behaviour; loops like:

# for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done

# for ((i=0; i<; i++)); do nice -99 ping 10.1 -A > /dev/null & done

# { for ((i=0; i<; i++)); do ping 10.1 -A > /dev/null & done } > /dev/null 2>&1

Especially the last one sometimes causes a complete console lock-up, while the other two sometimes stall then surge periodically.

BTW, I am also wondering how one might test threading behaviour wrt to startup and sync-on-exit with parent thread. This may not show any problems with small number of threads, but how does it scale with 1K+?

Thanks!
--
Al
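The loop counts in the commands above did not survive in this archive copy, so the following is only a rough sketch of the same experiment with the count made an explicit parameter. It is written in Python rather than shell; the function names, `count`, `trials`, and the stand-in command in the usage line are illustrative assumptions and not part of the original report.

```python
import subprocess
import sys
import time


def spawn_batch(count, argv):
    """Start `count` copies of argv with output discarded, much as the
    shell loops do with `> /dev/null &`.

    Returns (seconds spent spawning, list of Popen objects).
    """
    start = time.monotonic()
    procs = [
        subprocess.Popen(argv, stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
        for _ in range(count)
    ]
    return time.monotonic() - start, procs


def run_trials(trials, count, argv):
    """Repeat the batch several times and collect the spawn times, so
    run-to-run variation (the inconsistency described above) shows up."""
    times = []
    for _ in range(trials):
        elapsed, procs = spawn_batch(count, argv)
        for proc in procs:
            proc.terminate()  # commands like ping never exit on their own
            proc.wait()       # reap the child before the next trial
        times.append(elapsed)
    return times


if __name__ == "__main__":
    # Harmless stand-in command; substitute the ping invocation to
    # approximate the loops quoted above.
    print(run_trials(3, 5, [sys.executable, "-c", "pass"]))
```

Comparing the numbers across trials gives a crude view of whether spawn time is consistent under load; it does not reproduce the console lock-up itself.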
[patch] add some Blackfin specific checks to checkpatch.pl
Check for a few common errors in Blackfin-specific code wrt MMR loading in assembly and doing core/system syncs.

Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]>
CC: Bryan Wu <[EMAIL PROTECTED]>
CC: Andy Whitcroft <[EMAIL PROTECTED]>
---
 scripts/checkpatch.pl |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index dae7d30..ead9675 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -486,9 +486,29 @@ sub process {
 			WARN("line over 80 characters\n" . $herecurr);
 		}
 
+# Blackfin: use hi/lo macros
+		if ($line =~ /\.[lL][[:space:]]*=.*&[[:space:]]*0x[fF][fF][fF][fF]/) {
+			my $herevet = "$here\n" . cat_vet($line) . "\n";
+			ERROR("use the LO() macro, not (... & 0x)\n" . $herevet);
+		}
+		if ($line =~ /\.[hH][[:space:]]*=.*>>[[:space:]]*16/) {
+			my $herevet = "$here\n" . cat_vet($line) . "\n";
+			ERROR("use the HI() macro, not (... >> 16)\n" . $herevet);
+		}
+
 # check we are in a valid source file *.[hc] if not then ignore this hunk
 		next if ($realfile !~ /\.[hc]$/);
 
+# Blackfin: don't use __builtin_bfin_[cs]sync
+		if ($line =~ /__builtin_bfin_csync/) {
+			my $herevet = "$here\n" . cat_vet($line) . "\n";
+			ERROR("use the CSYNC() macro in asm/blackfin.h\n" . $herevet);
+		}
+		if ($line =~ /__builtin_bfin_ssync/) {
+			my $herevet = "$here\n" . cat_vet($line) . "\n";
+			ERROR("use the SSYNC() macro in asm/blackfin.h\n" . $herevet);
+		}
+
 # at the beginning of a line any tabs must come first and anything
 # more than 8 must use tabs.
 		if ($line=~/^\+\s* \t\s*\S/ or $line=~/^\+\s*\s*/) {
--
1.5.3.rc5
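As a rough illustration of what the new patterns catch, here is a small sketch. It is in Python rather than Perl, so the POSIX class `[[:space:]]` is approximated with `\s`; the function name, the `is_c_source` parameter, and the sample lines in the usage note are invented for the example and do not come from the patch.

```python
import re

# Python approximations of the patterns added by the patch.
LO_MASK = re.compile(r"\.[lL]\s*=.*&\s*0x[fF][fF][fF][fF]")
HI_SHIFT = re.compile(r"\.[hH]\s*=.*>>\s*16")
BUILTIN_SYNC = re.compile(r"__builtin_bfin_([cs])sync")


def blackfin_complaints(line, is_c_source=True):
    """Return the complaints the new checks would raise for one line.

    The hi/lo checks sit before the "valid source file" test in the
    patch, so they apply to every file; the sync checks come after it
    and so only apply to *.c and *.h files.
    """
    complaints = []
    if LO_MASK.search(line):
        complaints.append("use the LO() macro")
    if HI_SHIFT.search(line):
        complaints.append("use the HI() macro")
    if is_c_source:
        match = BUILTIN_SYNC.search(line)
        if match:
            complaints.append("use the %sSYNC() macro" % match.group(1).upper())
    return complaints
```

For instance, a hypothetical line such as `p0.l = (SOME_MMR & 0xFFFF);` would trip the LO() check, while `p0.l = LO(SOME_MMR);` would pass.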
Restricting CDC-ACM devices
I would like to use the cdc-acm driver in the Linux kernel (2.6.22-rc1), but restrict access to only my VID/PID devices. Is there an easy way to do this without modifying cdc-acm.c?

In a past prototype I made a simple wrapper driver for usb serial by adding my VID/PID numbers to the wrapper driver's id_table. Then when that usb driver was accessed on connection, the driver just pointed to the usb_serial_* functions (probe, disconnect, etc). I tried to do the same with the cdc-acm driver, but the cdc-acm driver's probe function was called before my driver's probe. I noticed that the cdc-acm driver will attach when it detects the two CDC-ACM interfaces, so I removed the cdc-acm driver with "make menuconfig". This didn't work because the cdc-acm functions I was attempting to call from my driver then no longer exist.

Thanks for the help,
-Nate
[PATCH] Add stack checking for Blackfin
Simply fill out the bits in checkstack.pl for Blackfin.

I thought I already sent this, but I don't see it in -mm anywhere ...

Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]>
CC: Bryan Wu <[EMAIL PROTECTED]>
---
 scripts/checkstack.pl |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/scripts/checkstack.pl b/scripts/checkstack.pl
index f7844f6..9226381 100755
--- a/scripts/checkstack.pl
+++ b/scripts/checkstack.pl
@@ -73,6 +73,9 @@ my (@stack, $re, $x, $xs);
 		# pair for larger users. -- PFM.
 		#a00048e0: d4fc40f0 addi.l r15,-240,r15
 		$re = qr/.*addi\.l.*r15,-(([0-9]{2}|[3-9])[0-9]{2}),r15/o;
+	} elsif ($arch =~ /^blackfin$/) {
+		# 0: 00 e8 38 01 LINK 0x4e0;
+		$re = qr/.*[[:space:]]LINK[[:space:]]*(0x$x{1,8})/o;
 	} else {
 		print("wrong or unknown architecture\n");
 		exit
--
1.5.3.rc5
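To see what the new pattern extracts, here is a small sketch in Python rather than Perl. It assumes that `$x` in checkstack.pl stands for a single lowercase hex digit (`[0-9a-f]`), which this message does not show; the function name is made up, and the sample disassembly line is the one from the comment in the patch.

```python
import re

# Assumption: $x in checkstack.pl is a single lowercase hex digit.
HEX_DIGIT = "[0-9a-f]"

# Python approximation of the Blackfin pattern ([[:space:]] becomes \s).
LINK_RE = re.compile(r".*\sLINK\s*(0x" + HEX_DIGIT + r"{1,8})")


def link_frame_size(disasm_line):
    """Return the stack frame size in bytes taken from a Blackfin LINK
    instruction, or None if the line does not contain one."""
    match = LINK_RE.match(disasm_line)
    if match is None:
        return None
    return int(match.group(1), 16)
```

Applied to the line from the patch comment, `0: 00 e8 38 01 LINK 0x4e0;`, this yields 0x4e0, a 1248-byte frame.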
Re: [PATCH 2/4] Fix mainline filesystems to handle ATTR_KILL_ bits correctly
On Tue, 21 Aug 2007 17:21:28 -0400
Josef Sipek <[EMAIL PROTECTED]> wrote:

> On Tue, Aug 21, 2007 at 07:35:51AM -0400, Jeff Layton wrote:
> > On Tue, 21 Aug 2007 15:35:08 +1000
> > Timothy Shimmin <[EMAIL PROTECTED]> wrote:
> >
> > > Jeff Layton wrote:
> > > > This should fix all of the filesystems in the mainline kernels to handle
> > > > ATTR_KILL_SUID and ATTR_KILL_SGID correctly. For most of them, this is
> > > > just a matter of making sure that they call generic_attrkill early in
> > > > the setattr inode op.
> > > >
> > > > Signed-off-by: Jeff Layton <[EMAIL PROTECTED]>
> > > > ---
> > > >  fs/xfs/linux-2.6/xfs_iops.c |    5 ++++-
> > > > --- a/fs/xfs/linux-2.6/xfs_iops.c
> > > > +++ b/fs/xfs/linux-2.6/xfs_iops.c
> > > > @@ -651,12 +651,15 @@ xfs_vn_setattr(
> > > >  	struct iattr	*attr)
> > > >  {
> > > >  	struct inode	*inode = dentry->d_inode;
> > > > -	unsigned int	ia_valid = attr->ia_valid;
> > > > +	unsigned int	ia_valid;
> > > >  	bhv_vnode_t	*vp = vn_from_inode(inode);
> > > >  	bhv_vattr_t	vattr = { 0 };
> > > >  	int		flags = 0;
> > > >  	int		error;
> > > >
> > > > +	generic_attrkill(inode->i_mode, attr);
> > > > +	ia_valid = attr->ia_valid;
> > > > +
> > > >  	if (ia_valid & ATTR_UID) {
> > > >  		vattr.va_mask |= XFS_AT_UID;
> > > >  		vattr.va_uid = attr->ia_uid;
> > >
> > > Looks reasonable to me for XFS.
> > > Acked-by: Tim Shimmin <[EMAIL PROTECTED]>
> > >
> > > So before, this clearing would happen directly in notify_change()
> > > and now this won't happen until notify_change() calls i_op->setattr
> > > which for a particular fs it can call generic_attrkill() to do it.
> > > So I guess for the cases where i_op->setattr is called outside of
> > > via notify_change, we don't normally have ATTR_KILL_SUID/SGID
> > > set so that nothing will happen there?
> >
> > Right. If neither ATTR_KILL bit is set then generic_attrkill is a
> > noop.
> >
> > > I guess just wondering the effect with having the code on all
> > > setattr's. (I'm not familiar with the code path)
> > >
> >
> > These bits are referenced in very few places in the current kernel
> > tree -- mostly in the VFS layer. The *only* place I see that they
> > actually get interpreted into a mode change is in notify_change. So
> > places that call setattr ops w/o going through notify_change are
> > not likely to have those bits set.
> >
> > But hypothetically, if a fs did set ATTR_KILL_* and call setattr
> > directly, then the setattr would now include a mode change that
> > clears setuid or setgid bits where it may not have before.
>

I should probably clarify -- in the hypothetical situation above, the setattr function would have to call generic_attrkill (as most filesystems should do with this change).

> It almost sounds like an argument for a new inode op (NULL would use
> generic_attr_kill).
>

That's not a bad idea at all. I suppose that would be easier than modifying every fs like this, and it does seem like it might be cleaner. I need to mull it over, but that might be the best solution.

--
Jeff Layton <[EMAIL PROTECTED]>
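The body of generic_attrkill is not shown in this thread. Going only by the description above (the ATTR_KILL_* bits are turned into a mode change that clears setuid/setgid, and the call is a no-op when neither bit is set), a Python model might look like the sketch below. The flag values and the dict standing in for struct iattr are assumptions for illustration, and so is the rule that setgid is only cleared when group-execute is also set, which is borrowed from how notify_change is understood to behave rather than confirmed by the patch.

```python
import stat

# Illustrative flag values; treat them as placeholders, not kernel ABI.
ATTR_MODE = 1 << 0
ATTR_KILL_SUID = 1 << 11
ATTR_KILL_SGID = 1 << 12


def generic_attrkill(mode, attr):
    """Model of the helper: fold ATTR_KILL_* requests into a mode change.

    `mode` is the inode's current mode; `attr` is a dict with an
    'ia_valid' bitmask and, when ATTR_MODE is set, an 'ia_mode' value.
    The dict is updated in place and returned.
    """
    kill = attr["ia_valid"] & (ATTR_KILL_SUID | ATTR_KILL_SGID)
    if not kill:
        return attr  # neither bit set: no-op, as stated in the thread

    # The request bits are consumed either way.
    attr["ia_valid"] &= ~(ATTR_KILL_SUID | ATTR_KILL_SGID)

    clear = 0
    if kill & ATTR_KILL_SUID and mode & stat.S_ISUID:
        clear |= stat.S_ISUID
    sgid_exec = stat.S_ISGID | stat.S_IXGRP
    if kill & ATTR_KILL_SGID and (mode & sgid_exec) == sgid_exec:
        clear |= stat.S_ISGID

    if clear:
        if not attr["ia_valid"] & ATTR_MODE:
            # No explicit mode change was requested: start from the
            # inode's current mode.
            attr["ia_valid"] |= ATTR_MODE
            attr["ia_mode"] = mode
        attr["ia_mode"] &= ~clear
    return attr
```

In this model a filesystem's setattr would call the helper first and then read `ia_valid` back, which mirrors the ordering in the XFS hunk quoted above.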
Re: Problems with IDE on linux 2.6.22.X
On 08/21/2007 09:49 PM, José Luis Patiño Andrés wrote:

> Somebody told me that I can solve this problem by unchecking the
> IDE_GENERIC option in the kernel configuration. It's true, but when I do
> this the DVD device is not recognized by the kernel. It does not exist.

The OpenSuSE Live CD thing not booting may mean you have a deeper problem, but please note that you shouldn't be using ide-disk:

> In my working 2.6.20.15 kernel, the 'cat /proc/ide/drivers' command
> outputs this:
>
> ###
> #ide-disk version 1.18
> #ide-cdrom version 4.61
> ###
>
> But in 2.6.22.X, the output is only:
>
> ###
> #ide-disk version 1.18
> ###

You have a SATA harddrive (Hitachi Travelstar 5K100 100GB SATA/2.5") and an IDE (also known as PATA) DVD drive (LG GMA-4082N). That is, your disk should be driven by the:

"Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support"

under the "Serial ATA (prod) and Parallel ATA (experimental) drivers" menu, and it seems this driver should also take care of your DVD.

I'm not sure from your report what you are using. First try with only that driver, and nothing from the old "ATA/ATAPI/MFM/RLL support" menu selected. In that situation, does your harddrive work but your DVD not?

If so, this should be fixed in the driver, but to get things working I believe you may try with both the above driver for your harddisk and the old IDE driver for the DVD:

<*> Enhanced IDE/MFM/RLL disk/cdrom/tape/floppy support
<*> Include IDE/ATAPI CDROM support (NEW)
[*] PCI IDE chipset support
[*] Generic PCI bus-master DMA support
<*> Intel PIIXn chipsets support

(do not select IDE/ATA-2 disk support) where you may need to boot with a "libata.atapi_enabled=0" kernel parameter.

I'm not actually sure that works, given that it seems to be the same chip, but anyway, please first verify results with only that SATA driver.

Rene.
Re: RFC: drop support for gcc < 4.0
On Tue, Aug 21, 2007 at 04:49:38PM -0500, James Bottomley wrote:
> On Tue, 2007-08-21 at 23:21 +0200, Adrian Bunk wrote:
> > On Tue, Aug 21, 2007 at 10:49:49PM +0200, Segher Boessenkool wrote:
> > >> How many people e.g. test -rc kernels compiled with gcc 3.2?
> > >
> > > Why would that matter? It either works or not. If it doesn't
> > > work, it can either be fixed, or support for that old compiler
> > > version can be removed.
> >
> > One bug report "kernel doesn't work / crash / ... when compiled with
> > gcc 3.2, but works when compiled with gcc 4.2" will most likely be lost
> > in the big pile of unhandled bugs, not cause the removal of gcc 3.2
> > support...
>
> What's the bugzilla or pointer to this report please? Those of us who
> use gcc-3 as the default kernel compiler will take it seriously (if it
> looks to have an impact to our kernel builds) otherwise we can tell you
> it's unreproducible/not a problem etc.

This was an example in response to Segher's point that we would remove support for a gcc version in such a case.

I remember we had such issues, but I can't find a pointer to a specific one at the moment. I'll keep you informed when bug reports come in that only occur with older gcc versions and that aren't easily fixable.

> James

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed