Re: [PATCH] sched: staircase deadline misc fixes
Oh my, I'm on a roll here... somebody stop me ;-) Some emphasis:

On Thu, 2007-03-29 at 08:29 +0200, Mike Galbraith wrote:
> On Thu, 2007-03-29 at 07:50 +0200, Mike Galbraith wrote:
>
> > Opinion polls are nice, but I'm more interested in gathering numbers
> > which either validate or invalidate the claims of the design documents.
>
> Suggestion: try the testcase that Satoru Takeuch posted. The numbers I
> got with latest SD were no better than the numbers I got with the patch
> I posted to try to solve it. Seems to me the numbers with SD should
> have been much better, but they in fact were not.
>
> Running that thing, mainline's GUI was not usable, even with my patch,
> but neither was it usable with SD. What's the difference between
> horrible with mainline and merely terrible with SD? In both, the GUI
> ends up doing round-robin with a slew of hogs. In mainline, this
> happens because the history logic can and does get it wrong sometimes,
> which this exploit deliberately triggers. With SD, it's by design.

The much maligned history mechanism in mainline didn't start its life as an interactivity estimator; that's a name it acquired later. What it was first put there for was to ensure fairness for sleeping tasks.

I found it most ironic that the numbers I posted showed that mechanism working perfectly, with an exploit that was designed specifically to expose its weakness, despite the deliberate tweaks that have gone in, pushing it very heavily in the unfair direction, and this went uncommented. If I had run more of them, it would have shown that weakness very well. We all know that weakness exists.

What the numbers clearly showed was that sleeping tasks did not get the fairness RSDL advertised with the particular test I ran, yet it went uncommented/uncontested. Anyone could have tested with the trivial proggy of their choice... but nobody did.

The history mechanism is not only about interactivity, and never was.
-Mike I'm gonna go piddle around with code now, much more fun than yacking :) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched: staircase deadline misc fixes
On Thursday 29 March 2007 02:37, Con Kolivas wrote:
> I'm cautiously optimistic that we're at the thin edge of the bugfix wedge
> now.

My neck condition got a lot worse today. I'm forced offline for a week and will be uncontactable.

--
-ck
Re: [PATCH] sched: staircase deadline misc fixes
On Thu, 2007-03-29 at 07:50 +0200, Mike Galbraith wrote:
> Opinion polls are nice, but I'm more interested in gathering numbers
> which either validate or invalidate the claims of the design documents.

Suggestion: try the testcase that Satoru Takeuch posted. The numbers I got with latest SD were no better than the numbers I got with the patch I posted to try to solve it. Seems to me the numbers with SD should have been much better, but they in fact were not.

Running that thing, mainline's GUI was not usable, even with my patch, but neither was it usable with SD. What's the difference between horrible with mainline and merely terrible with SD? In both, the GUI ends up doing round-robin with a slew of hogs. In mainline, this happens because the history logic can and does get it wrong sometimes, which this exploit deliberately triggers. With SD, it's by design.

-Mike
Re: Software RAID (non-preempt) server blocking question. (2.6.20.4)
On Tuesday March 27, [EMAIL PROTECTED] wrote:
> I ran a check on my SW RAID devices this morning. However, when I did so,
> I had a few lftp sessions open pulling files. After I executed the check,
> the lftp processes entered 'D' state and I could do 'nothing' in the
> process until the check finished. Is this normal? Should a check block
> all I/O to the device and put the processes writing to a particular device
> in 'D' state until it is finished?

No, that shouldn't happen. The 'check' should notice any other disk activity and slow down if anything else is happening on the device.

Did the check run to completion? And if so, did the 'lftp' start working normally again?

Did you look at "cat /proc/mdstat"? What sort of speed was the check running at?

NeilBrown
Re: [PATCH] sched: staircase deadline misc fixes
On Thu, 2007-03-29 at 09:44 +1000, Con Kolivas wrote:
> On Thursday 29 March 2007 04:48, Ingo Molnar wrote:
> > hm, how about the questions Mike raised (there were a couple of cases of
> > friction between 'the design as documented and announced' and 'the code
> > as implemented')? As far as i saw they were still largely unanswered -
> > but let me know if they are all answered and addressed:
>
> I spent less time emailing and more time coding. I have been working on
> addressing whatever people brought up.
>
> > http://marc.info/?l=linux-kernel=117465220309006=2
>
> Attended to.
>
> > http://marc.info/?l=linux-kernel=117489673929124=2
>
> Attended to.
>
> > http://marc.info/?l=linux-kernel=117489831930240=2
>
> Checked fine.

That one's not fine.

+static void recalc_task_prio(struct task_struct *p, struct rq *rq)
+{
+	struct prio_array *array = rq->active;
+	int queue_prio;
+
+	update_if_moved(p, rq);
+	if (p->rotation == rq->prio_rotation) {
+		if (p->array == array) {
+			if (p->time_slice > 0)
+				return;
+			p->time_slice = p->quota;
+		} else if (p->array == rq->expired) {

You implemented nanosecond accounting, but here you give a task which has either missed the tick often enough, or accumulated enough cross cpu clock drift to have an I.O.U. in its wallet, a shiny new $8 bill.

WRT clock drift/timewarps, your latest code cedes that these do occur, but where these timewarps can be anywhere between minuscule with Intel same package processors, up to a tick elsewhere, it charges a tick.

-	/* cpu scheduler quota accounting is performed here */
+	if (tick) {
+		/*
+		 * Called from scheduler_tick() there should be less than two
+		 * jiffies worth, and not negative/overflow.
+		 */
+		if (time_diff > JIFFIES_TO_NS(2) || time_diff < min_diff)
+			time_diff = JIFFIES_TO_NS(1);

> > and the numbers he posted:
> >
> > http://marc.info/?l=linux-kernel=117448900626028=2
>
> Attended to.

Hm. How, where?
I'm getting inconsistent results with current, but sleeping tasks still don't _appear_ to be able to compete with hogs on an equal footing, and I don't see how they really can. What happens if a sleeper sleeps after using say half of its slice, and the hog it's sharing the CPU with then sleeps briefly after using most of its slice? That's the end of the rotation. They are put back on an equal footing, but what just happened to the differential in cpu usage?

> > his test conclusion was that under CPU load, RSDL (SD) generally does
> > not hold up to mainline's interactivity.
>
> There have been improvements since the earlier iterations but it's still a
> fairness based design. Mike's "sticking point" test case should be improved
> as well.

The behavior is different, and is less ragged, but I wouldn't say it's really been improved. The below was added as a workaround.

+ * This contains a bitmap for each dynamic priority level with empty slots
+ * for the valid priorities each different nice level can have. It allows
+ * us to stagger the slots where differing priorities run in a way that
+ * keeps latency differences between different nice levels at a minimum.
+ * ie, where 0 means a slot for that priority, priority running from left to
+ * right:
+ * nice -20
+ * nice -10 1001000100100010001001000100010010001000
+ * nice   0 0101010101010101010101010101010101010101
+ * nice   5 1101011010110101101011010110101101011011
+ * nice  10 0110111011011101110110111011101101110111
+ * nice  15 0101101101011011
+ * nice  19 1110

I don't really know what to say about this. I think it explains reduced context switching, but I don't see how this could be a good thing. Consider a nice -20 fast/light task trying to get CPU with nice 0 tasks being constantly spawned. How can this latency bound fast mover perform if it can't preempt? What am I missing?

> My call based on my own testing and feedback from users is:
>
> Under niced loads it is 99% in favour of SD.
>
> Under light loads it is 95% in favour of SD.
>
> Under Heavy loads it becomes proportionately in favour of mainline. The
> crossover is somewhere around a load of 4.

Opinion polls are nice, but I'm more interested in gathering numbers which either validate or invalidate the claims of the design documents.

WRT this subjective opinion thing, I see regressions with all loads, and I don't see what a < 95% load really means. If CPU isn't contended, dishing it out is dirt simple. Just give everybody frequent and fairly short chunks, and everybody is fairly happy. The only time scheduling becomes interesting is when there IS contention, and mainline seems to do much better at this, with the caveat that the history
Re: [ PATCH] Add suspend/resume for HPET was: Re: [3/6] 2.6.21-rc4: known regressions
On Thursday 29 March 2007 07:08:58 Linus Torvalds wrote:
>
> On Thu, 29 Mar 2007, Maxim wrote:
> >
> > I am sending here a patch that as was discussed here adds hpet to list
> > of system devices
> > and adds suspend/resume hooks this way.
> > I tested it and it works fine.
>
> Ok, it certainly looks better, but it *also* looks like it just assumes
> the HPET is there. Which would work in testing _with_ a HPET, but would
> likely break on hardware without one, no?
>
> Shouldn't there be at least something like a
>
>	if (!is_hpet_capable())
>		return 0;
>
> at the top of that init routine? I'd also expect that you'd need to check
> that "hpet_virt_address" is valid or something?
>
> (Or, better yet, shouldn't we set "boot_hpet_disable" when we decide not
> to use the HPET, and set hpet_virt_address to NULL?)

This is done here

out_nohpet:
	iounmap(hpet_virt_address);
	hpet_virt_address = NULL;

>
> Linus
>

Hi,

Of course, I forgot. I was planning to put the sysdev code in hpet_enable(), but that is not possible because this function is called too early. Thus I put the sysdev initialization in a separate function but forgot to test for HPET.

Thanks a lot.
Best regards
Maxim Levitsky

---
This adds support of suspend/resume on i386 for HPET

Signed-off-by: Maxim Levitsky <[EMAIL PROTECTED]>

---
 arch/i386/kernel/hpet.c |   68 +++
 1 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..7c67780 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include
 #include
@@ -310,6 +312,7 @@ int __init hpet_enable(void)
 out_nohpet:
 	iounmap(hpet_virt_address);
 	hpet_virt_address = NULL;
+	boot_hpet_disable = 1;
 	return 0;
 }
 
@@ -524,3 +527,68 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int hpet_register_sysfs(void)
+{
+	int err;
+
+	if (!is_hpet_capable())
+		return 0;
+
+	err = sysdev_class_register(&hpet_class);
+
+	if (!err) {
+		sysdev_register(&hpet_device);
+		if (err)
+			sysdev_class_unregister(&hpet_class);
+	}
+
+	return err;
+}
+
+device_initcall(hpet_register_sysfs);
+
+#endif
-- 
1.4.4.2
[PATCH] pid: Properly detect orphaned process groups in exit_notify
In commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1, when converting the orphaned process group handling to use struct pid, I made a small mistake. I accidentally replaced an == with a !=. Besides just being a dumb thing to do, apparently this has a bad side effect.

The improper orphaned process group detection causes kwin to die after a suspend/resume cycle.

I'm amazed this patch has been around as long as it has without anyone else noticing something funny going on.

And the following people deserve credit for spotting and helping to reproduce this.

Thanks to: Sid Boyce <[EMAIL PROTECTED]>
Thanks to: "Michael Wu"
Signed-off-by: "Eric W. Biederman" <[EMAIL PROTECTED]>
---
diff --git a/kernel/exit.c b/kernel/exit.c
index f132349..b55ed4c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -790,7 +790,7 @@ static void exit_notify(struct task_struct *tsk)
 
 	pgrp = task_pgrp(tsk);
 	if ((task_pgrp(t) != pgrp) &&
-	    (task_session(t) != task_session(tsk)) &&
+	    (task_session(t) == task_session(tsk)) &&
 	    will_become_orphaned_pgrp(pgrp, tsk) &&
 	    has_stopped_jobs(pgrp)) {
 		__kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp);
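
For readers following along, here is a minimal user-space sketch of the first two comparisons in the corrected condition. The struct and the function name are hypothetical stand-ins, not kernel types, and the remaining checks (will_become_orphaned_pgrp() and has_stopped_jobs()) are left out. It only shows why == is the right operator: the test is about a task in a different process group but the same session.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the ids that exit_notify() compares. */
struct task_ids {
	int pgrp;	/* process group id */
	int session;	/* session id */
};

/*
 * True when t is in a different process group than the exiting task
 * but in the same session. The buggy version compared the sessions
 * with != instead of ==.
 */
static bool other_pgrp_same_session(const struct task_ids *t,
				    const struct task_ids *tsk)
{
	return t->pgrp != tsk->pgrp && t->session == tsk->session;
}
```

With the != version, a task in another session would have passed this part of the test, and a task in the same session would never have.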
Re: [linux-usb-devel] [RFC] HID bus design overview.
Jiri Kosina wrote:
> JFYI the preliminary version of the hidraw interface is now in the
> hid/usbhid git tree, and has also been in a few recent -mm kernels
> already.
>

The shadow driver support works now. The biggest problem is that HID/Bluetooth does not work yet, and I have no Bluetooth input device to test with, so ...

I think I should port the current implementation to 2.6.21-rc5-mm2, add hiddev support, and then release it.

One last question: what is the future of hiddev? Will it be merged into hidraw later? I think so, but I can't be sure.
Re: [PATCH 14/21] MSI: Use a list instead of the custom link structure
Michael Ellerman <[EMAIL PROTECTED]> writes:
>
> I thought about doing it in the MSI enable methods, but I think it
> really belongs in the (nonexistent) routine that allocs and sets up a
> pci_dev.

I agree that would be a good place for it as well.

> I think it's pretty dicey to be passing around a pci_dev with an
> uninitialised msi_list. Even if currently no code outside the MSI enable
> methods looks at it, I think we're asking for bugs in the future.

Reasonable.

> So I'll do a patch which adds alloc_pci_dev(), update the callers, and
> then put the msi_list initialisation in there.

Sounds good. That will allow us to initialize all of the fields in struct pci_dev to a default value in one place.

Eric
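
To illustrate the idea of a single allocation routine that also sets up msi_list, here is a rough user-space sketch. Everything in it is an assumption for illustration: the reduced struct, the `_sketch` names, and the small list_head stand-in are not the kernel's definitions, and the real alloc_pci_dev() had not been written at the time of this thread.

```c
#include <stdlib.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list points at itself in both directions. */
static void init_list_head(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static int list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical, heavily reduced version of struct pci_dev. */
struct pci_dev_sketch {
	int msi_enabled;
	struct list_head msi_list;
};

/* Allocate a zeroed device and initialise msi_list in one place. */
static struct pci_dev_sketch *alloc_pci_dev_sketch(void)
{
	struct pci_dev_sketch *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	init_list_head(&dev->msi_list);
	return dev;
}
```

The point of the design is that a zeroed list_head is not a valid empty list, so every path that creates a device has to go through the one routine that initialises it.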
Re: [RFT] e100 driver on ARM
Kok, Auke wrote:
Lennert Buytenhek wrote:
On Mon, Sep 04, 2006 at 06:39:29AM -0400, Jeff Garzik wrote:
1) Does e100 driver work on ARM?

FWIW, e100 seems to work okay for me on an intel ixp2400 (xscale based) board, an ixp2850 (xscale based) board and an ixp2350 (xscale3 based) board. ixp2350 works both with hardware coherency turned on (cpu snoops bus) and turned off (manual dma cache clean/invalidate as usual.)

As for the other ARM platforms that I'm interested in / have hardware for / maintain, the at91/ep93xx/pxa270 don't have PCI, and the other two (iop32x/iop33x) I can't test because I don't have such systems with e100 NICs, but I expect those would work, since they're both xscale based like the ixp2400, and the ixp2400 works.

I just got an iop342 board dropped on my lap. Once it's running, I'll make sure to make this the first thing to test.

I have a pxa255 based system with PCI added to it. The e100 would have memory corruption in its receive buffers detected by slab debugging unless I put in the patch to use the S-bit.

Here is a link to the patch posting:
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc3/2.6.20-rc3-mm1/broken-out/git-netdev-all.patch
Search for e100.c.

http://www-gatago.com/linux/kernel/15457063.html - This discussion seems to hit the issue.

There appears to be a race on the cache line where the EL bit and the next packet info live. In my case the hardware appeared to write to a free packet. The S-bit seems to make the hardware stop and spin on the bit, while the EL bit seems to let the hardware try to use that packet.

This race would occur less often when the receive buffer chain is always refilled before the hardware can use them up. On our 400 MHz XScale, we can use up all 256 buffers if the PCI bus has another busy device on it. In our case it is an 802.11g miniPCI card and our software was routing all ethernet packets to the wireless interface and vice versa while TCP streams were running across these connections.
-Ack
[RFC:PATCH]regster memory init functions into white list of section mismatch.
> > > > WARNING: mm/built-in.o - Section mismatch: reference to
> > > > .init.text:__alloc_bootmem_node from .text between 'sparse_init' (at
> > > > offset 0x15c8f) and '__section_nr'
> > > I took a look at this one.
> > > You have SPARSEMEM enabled in your config.
> > > And then I see that in sparse.c we call alloc_bootmem_node()
> > > from a function I thought should be marked __devinit (it
> > > is used by memory_hotplug.c).
> > > But I am not familiar enough to judge if __alloc_bootmen_node
> > > are marked correct with __init or __devinit (to say
> > > this is used in the HOTPLUG case) is more correct.
> > > Anyone?
> > >
> > > > WARNING: mm/built-in.o - Section mismatch: reference to
> > > > .init.text:__alloc_bootmem_node from .text between 'sparse_init' (at
> > > > offset 0x15d02) and '__section_nr'
> > > Same as above
> >
> > Memory hotplug code has __meminit for its purpose.
> > But, I suspect that many other places of memory hotplug code may have
> > same issue. I will chase them.

Hello.

I chased the section mismatches in the memory hotplug code. Many of the functions involved should be defined as __meminit. (This check was a great help in finding them. Thanks!)

But I would like to add a new pattern to the white list for some of them. (I'll post another patch for the others.)

sparse.c (sparse_index_alloc()) calls alloc_bootmem_node() as you mentioned, and zone_wait_table_init() calls it too. These functions call it only at boot time, and call vmalloc()/kmalloc() at hotplug time. The two cases are distinguished by the system_state value or by slab_is_available(). After boot, only the references remain in these functions.

The bootmem allocation functions are called by many functions and must be used only at boot time. I think they should keep their __init marking so that the section mismatch check still covers them. So I would like to register sparse_index_alloc() and zone_wait_table_init() in the white list.

Please comment. If there is a better way, please let me know...

Thanks.

P.S.
Pattern 10 is for ia64 (not for memory hotplug).
ia64's .machvec section is a mixed table of .init functions and normal text. It is defined for platform dependent functions. This is also a cause of warnings. I think it should be registered too.

Signed-off-by: Yasunori Goto <[EMAIL PROTECTED]>

---
 mm/page_alloc.c       |    2 +-
 mm/sparse.c           |    2 +-
 scripts/mod/modpost.c |   29 +
 3 files changed, 31 insertions(+), 2 deletions(-)

Index: current_test/scripts/mod/modpost.c
===
--- current_test.orig/scripts/mod/modpost.c	2007-03-27 20:21:20.0 +0900
+++ current_test/scripts/mod/modpost.c	2007-03-29 14:16:05.0 +0900
@@ -643,6 +643,17 @@ static int strrcmp(const char *s, const
  * The pattern is:
  *   tosec   = .init.text
  *   fromsec = __ksymtab*
+ *
+ * Pattern 9:
+ *  Some of functions are common code between boot time and hotplug
+ *  time. The bootmem allocater is called only boot time in its
+ *  functions. So it's ok to reference.
+ *  tosec   = .init.text
+ *
+ * Pattern 10:
+ *  ia64 has machvec table for each platform. It is mixture of function
+ *  pointer of .init.text and .text.
+ *  fromsec = .machvec
  **/
 static int secref_whitelist(const char *modname, const char *tosec,
 			    const char *fromsec, const char *atsym,
@@ -669,6 +680,12 @@ static int secref_whitelist(const char *
 		NULL
 	};
 
+	const char *pat4sym[] = {
+		"sparse_index_alloc",
+		"zone_wait_table_init",
+		NULL
+	};
+
 	/* Check for pattern 1 */
 	if (strcmp(tosec, ".init.data") != 0)
 		f1 = 0;
@@ -725,6 +742,18 @@ static int secref_whitelist(const char *
 	if ((strcmp(tosec, ".init.text") == 0) &&
 	    (strncmp(fromsec, "__ksymtab", strlen("__ksymtab")) == 0))
 		return 1;
+
+	/* Check for pattern 9 */
+	if ((strcmp(tosec, ".init.text") == 0) &&
+	    (strcmp(fromsec, ".text") == 0))
+		for (s = pat4sym; *s; s++)
+			if (strcmp(atsym, *s) == 0)
+				return 1;
+
+	/* Check for pattern 10 */
+	if (strcmp(fromsec, ".machvec") == 0)
+		return 1;
+
 	return 0;
 }
 
Index: current_test/mm/page_alloc.c
===
--- current_test.orig/mm/page_alloc.c	2007-03-27 16:04:41.0 +0900
+++ current_test/mm/page_alloc.c	2007-03-29 14:14:42.0 +0900
@@ -2673,7 +2673,7 @@ void __init setup_per_cpu_pageset(void)
 
 #endif
 
-static __meminit
+static __meminit noinline
 int zone_wait_table_init(struct zone *zone, unsigned long zone_size_pages)
 {
 	int i;
Index: current_test/mm/sparse.c
===
--- current_test.orig/mm/sparse.c	2007-03-27 16:04:41.0
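
As a standalone illustration of patterns 9 and 10, the two checks can be pulled out of modpost into a small function that compiles on its own. This is only a sketch: the function name `whitelisted_ref` and the array name are made up here, and the real secref_whitelist() takes more arguments and checks eight earlier patterns first.

```c
#include <string.h>

/* Functions allowed to reference .init.text from .text (pattern 9). */
static const char *const boot_only_callers[] = {
	"sparse_index_alloc",
	"zone_wait_table_init",
	NULL
};

/* Returns 1 if the section reference is whitelisted, 0 otherwise. */
static int whitelisted_ref(const char *tosec, const char *fromsec,
			   const char *atsym)
{
	const char *const *s;

	/* Pattern 9: .text -> .init.text, only from the listed symbols */
	if (strcmp(tosec, ".init.text") == 0 &&
	    strcmp(fromsec, ".text") == 0) {
		for (s = boot_only_callers; *s; s++)
			if (strcmp(atsym, *s) == 0)
				return 1;
	}

	/* Pattern 10: anything referenced from the ia64 .machvec table */
	if (strcmp(fromsec, ".machvec") == 0)
		return 1;

	return 0;
}
```

Note that pattern 9 is keyed on the symbol name, so a new caller of the bootmem allocator from .text would still trigger the warning until it is reviewed and listed.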
Re: [ PATCH] Add suspend/resume for HPET was: Re: [3/6] 2.6.21-rc4: known regressions
On Thu, 29 Mar 2007, Maxim wrote:
>
> I am sending here a patch that as was discussed here adds hpet to list
> of system devices
> and adds suspend/resume hooks this way.
> I tested it and it works fine.

Ok, it certainly looks better, but it *also* looks like it just assumes the HPET is there. Which would work in testing _with_ a HPET, but would likely break on hardware without one, no?

Shouldn't there be at least something like a

	if (!is_hpet_capable())
		return 0;

at the top of that init routine? I'd also expect that you'd need to check that "hpet_virt_address" is valid or something?

(Or, better yet, shouldn't we set "boot_hpet_disable" when we decide not to use the HPET, and set hpet_virt_address to NULL?)

		Linus
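
The guard being asked for can be sketched in user space as follows. All names with a `_sketch` suffix and the `registered` counter are hypothetical, and the capability check shown is only an assumption about what such a test might combine (the disable flag and the mapped address named in the message); it is not the kernel's is_hpet_capable().

```c
#include <stddef.h>

/* Hypothetical stand-ins for the state discussed in the thread. */
static int boot_hpet_disable;		/* set when the HPET is not used */
static void *hpet_virt_address;		/* NULL when the HPET is not mapped */
static int registered;			/* counts successful registrations */

/* Assumed capability test: not disabled and actually mapped. */
static int is_hpet_capable_sketch(void)
{
	return !boot_hpet_disable && hpet_virt_address != NULL;
}

/* Init routine with the early return at the top. */
static int hpet_register_sketch(void)
{
	if (!is_hpet_capable_sketch())
		return 0;	/* no HPET: nothing to do, and not an error */

	registered++;		/* stands in for the sysdev registration */
	return 0;
}
```

Returning 0 rather than an error code matters for an initcall: a machine without an HPET is a normal configuration, not a failure.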
Re: [PATCH] x86_64 irq: keep consistent for changing IRQ0_VECTOR from 0x20 to 0x30
"Yinghai Lu" <[EMAIL PROTECTED]> writes:

> On 3/7/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> The comment fixes or some variation on them are needed.
>
> Please check the patch about comment.
>
> YH
>

Looks good to me. I've cleaned up the description and placed the patch inline for easier consumption.

Everything this patch touches is a comment, so it is as safe as they come. And the patch appears to apply to Linus's latest tree.

---
From: Yinghai Lu <[EMAIL PROTECTED]>
Subject: x86_64 irq: Fix comments after changing IRQ0_VECTOR from 0x20 to 0x30

Signed-off-by: "Eric W. Biederman" <[EMAIL PROTECTED]>
Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/x86_64/kernel/i8259.c b/arch/x86_64/kernel/i8259.c
index 21d95b7..4894266 100644
--- a/arch/x86_64/kernel/i8259.c
+++ b/arch/x86_64/kernel/i8259.c
@@ -45,7 +45,7 @@
 
 /*
  * ISA PIC or low IO-APIC triggered (INTA-cycle or APIC) interrupts:
- * (these are usually mapped to vectors 0x20-0x2f)
+ * (these are usually mapped to vectors 0x30-0x3f)
  */
 
 /*
@@ -299,7 +299,7 @@ void init_8259A(int auto_eoi)
 	 * outb_p - this has to work on a wide range of PC hardware.
 	 */
 	outb_p(0x11, 0x20);	/* ICW1: select 8259A-1 init */
-	outb_p(IRQ0_VECTOR, 0x21);	/* ICW2: 8259A-1 IR0-7 mapped to 0x20-0x27 */
+	outb_p(IRQ0_VECTOR, 0x21);	/* ICW2: 8259A-1 IR0-7 mapped to 0x30-0x37 */
 	outb_p(0x04, 0x21);	/* 8259A-1 (the master) has a slave on IR2 */
 	if (auto_eoi)
 		outb_p(0x03, 0x21);	/* master does Auto EOI */
@@ -307,7 +307,7 @@ void init_8259A(int auto_eoi)
 		outb_p(0x01, 0x21);	/* master expects normal EOI */
 
 	outb_p(0x11, 0xA0);	/* ICW1: select 8259A-2 init */
-	outb_p(IRQ8_VECTOR, 0xA1);	/* ICW2: 8259A-2 IR0-7 mapped to 0x28-0x2f */
+	outb_p(IRQ8_VECTOR, 0xA1);	/* ICW2: 8259A-2 IR0-7 mapped to 0x38-0x3f */
 	outb_p(0x02, 0xA1);	/* 8259A-2 is a slave on master's IR2 */
 	outb_p(0x01, 0xA1);	/* (slave's support for AEOI in flat mode
 				    is to be investigated) */
diff --git a/include/asm-x86_64/hw_irq.h b/include/asm-x86_64/hw_irq.h
index 2e4b7a5..6153ae5 100644
--- a/include/asm-x86_64/hw_irq.h
+++ b/include/asm-x86_64/hw_irq.h
@@ -38,7 +38,7 @@
 #define IRQ_MOVE_CLEANUP_VECTOR	FIRST_EXTERNAL_VECTOR
 
 /*
- * Vectors 0x20-0x2f are used for ISA interrupts.
+ * Vectors 0x30-0x3f are used for ISA interrupts.
  */
 #define IRQ0_VECTOR		FIRST_EXTERNAL_VECTOR + 0x10
 #define IRQ1_VECTOR		IRQ0_VECTOR + 1
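
The arithmetic behind the corrected comments can be checked with a few lines of C. This sketch assumes FIRST_EXTERNAL_VECTOR is 0x20, which is what the hw_irq.h hunk implies (IRQ0_VECTOR is defined as FIRST_EXTERNAL_VECTOR + 0x10 and the subject says it is now 0x30). The helper `isa_irq_to_vector` is made up for illustration.

```c
/* Assumed value, inferred from IRQ0_VECTOR being 0x30 after the change. */
#define FIRST_EXTERNAL_VECTOR	0x20
#define IRQ0_VECTOR		(FIRST_EXTERNAL_VECTOR + 0x10)

/* Vector for ISA irq 0..15, or -1 if irq is out of range. */
static int isa_irq_to_vector(int irq)
{
	if (irq < 0 || irq > 15)
		return -1;
	return IRQ0_VECTOR + irq;
}
```

IRQs 0-7 (the master 8259A) land on 0x30-0x37 and IRQs 8-15 (the slave) on 0x38-0x3f, matching the updated ICW2 comments.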
Re: [2.6 patch] the scheduled eepro100 removal
On 3/28/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
Kok, Auke wrote:
Sounds sane to me. My overall opinion on eepro100 removal is that we're not there yet. Rare problem cases remain where e100 fails but eepro100 works, and it's older drivers so its low priority for everybody. Needs to happen, though...

It seems that several Tyan Opteron based systems use an IPMI add-on card that shares the onboard Intel 100Mhz NIC. On those you need to use eepro100 instead of e100; otherwise e100 will shut down the OOB (out of band) connection for IPMI when the OS is shut down.

YH
Re: [patch] MSI-X: fix resume crash
Len Brown <[EMAIL PROTECTED]> writes:

>> Tony, Len the way pci_disable_device is being used in a suspend/resume
>> path by a few drivers is completely incompatible with the way irqs are
>> allocated on ia64. In particular the following sequence occurs
>> in several drivers.
>>
>> probe:
>>	pci_enable_device(pdev);
>>	request_irq(pdev->irq);
>> suspend:
>>	pci_disable_device(pdev);
>> resume:
>>	pci_enable_device(pdev);
>> remove:
>>	free_irq(pdev->irq);
>>	pci_disable_device(pdev);
>
> There are no IA64 machines that support system suspend/resume today --
> so you have 0 chance of breaking the IA64 suspend/resume installed base.

Ok. So that is why the inconsistency persists...

> My understanding is that Luming Yu has cobbled IA64 S4 support
> together for a future release though.
>
>> What I'm proposing we do is move the irq allocation code out of
>> pci_enable_device and the irq freeing code out of pci_disable_device in
>> the future. If we move ia64 to a model where the irq number equal the
>> gsi like we have for x86_64 and are in the middle of for i386 that
>> should be pretty straight forward. It would even be relatively simple
>> to delay vector allocation in that context until request_irq, if we
>> needed the delayed allocation benefit. Do you two have any problems
>> with moving in that direction?
>
> I think consistency here would be _wonderful_.
> Of course the beauty of having identity GSI=IRQ and a /proc/interrupts
> that tells you what IOAPIC pin you are using become moot with MSI --
> but hey, showing the IRQ number rather than the vector number
> is consistent and makes sense.

Yes. It also allows for bigger machines. And I can get a consistent number out of MSI if we allocate irq numbers in a sufficiently non-sparse way. Something like bus|device|func|irq which is 8+5+3+12 or 28 bits...

I'll never get there though if I keep unearthing these long standing bugs.

>> If fixing the arch code is unacceptable for some reason I'm not aware of
>> we need to audit the 10-20 drivers that call pci_disable_device in their
>> suspend/resume processing and ensure that they have freed all of the
>> irqs before that point. Given that I have bug reports on the msi path I
>> know that isn't true.
>
> I think the suspend/resume interrupt logic needs some serious attention.
> We've had several schemes for suspend/resume of interrupts, several
> changes in strategy, and right now I think we are inconsistent,
> and frankly, I'm amazed it works at all.

What I have been doing lately is to aim at consistency in how a function is called (and thus how it is expected to be used) and how it is actually implemented. When I have a choice I try to pick a forgiving implementation so that driver writers don't have to follow a magic correct path for things to work correctly.

Removing the irq assignment from pci_enable_device is something that matches implementation with use.

As for the rest it seems reasonable to me to allow an irq to be held requested over suspend/resume and to save and restore apic and msi capability state. Especially since irq numbers are a kernel abstraction we should be able to do with them what we need to.

Honestly the whole suspend/resume thing is beyond me at this point; I'm laptop free... But I do know how to make code consistent with itself.

Eric
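
The bus|device|func|irq numbering mentioned in this message (8+5+3+12 = 28 bits) could be packed roughly as below. This is only a sketch of the bit layout under the field widths given in the message; the function name `pack_msi_irq` and the choice of putting the bus in the top bits are assumptions, not an existing kernel interface.

```c
#include <stdint.h>

/* Field widths from the message: bus 8, device 5, function 3, irq 12. */
#define BUS_BITS	8
#define DEV_BITS	5
#define FUNC_BITS	3
#define IRQ_BITS	12

/* Mask with the low 'bits' bits set. */
#define FIELD_MASK(bits)	((1u << (bits)) - 1)

/* Pack the four fields into one 28-bit number, bus in the top bits. */
static uint32_t pack_msi_irq(uint32_t bus, uint32_t dev,
			     uint32_t func, uint32_t irq)
{
	return ((bus  & FIELD_MASK(BUS_BITS))  << (DEV_BITS + FUNC_BITS + IRQ_BITS)) |
	       ((dev  & FIELD_MASK(DEV_BITS))  << (FUNC_BITS + IRQ_BITS)) |
	       ((func & FIELD_MASK(FUNC_BITS)) << IRQ_BITS) |
	        (irq  & FIELD_MASK(IRQ_BITS));
}
```

Because each field is masked to its width, two different (bus, device, function, irq) tuples within range can never collide, which is what makes the resulting number consistent across boots.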
Re: [PATCH 14/21] MSI: Use a list instead of the custom link structure
On Wed, 2007-03-28 at 00:29 -0600, Eric W. Biederman wrote: > Michael Ellerman <[EMAIL PROTECTED]> writes: > > > The msi descriptors are linked together with what looks a lot like > > a linked list, but isn't a struct list_head list. Make it one. > > > > The only complication is that previously we walked a list of irqs, and > > got the descriptor for each with get_irq_msi(). Now we have a list of > > descriptors and need to get the irq out of it, so it needs to be in the > > actual struct msi_desc. We use 0 to indicate no irq is setup. > > > > At some point after a pci_dev is created we need to initialise its > > msi_list. pci_device_add() looks like the right place to do that, although > > I'm not convinced it's 100% safe. In drivers/char/agp/alpha-agp.c we create > > a pci_dev and I don't see that it ever gets passed to pci_device_add(), but > > we probably don't care. > > Well that one appears to be a dummy place holder and probably should at > least have a kzalloc to initialize all of the fields to know values. > > Regardless the normal pci device allocation does use kzalloc so we will > have well defined if not beautiful behavior if we try and use it. > > Until we have a case where we need to use the msi_list outside of > where we enable and disable msi we should be perfectly fine initializing > the list somewhere inside of pci_enable_msi, and pci_enable_msix. > With dev->msi_enabled and dev->msix_enabled serving as flags to the > rest of the world that it is safe to look at the list. > > It certainly sounds safer to me then becoming to closely coupled with > code that doesn't really care about how msi works. Heck even though > we repeat the call twice I bet it will even be less code. I thought about doing it in the MSI enable methods, but I think it really belongs in the (nonexistant) routine that allocs and sets up a pci_dev. I think it's pretty dicy to be passing around a pci_dev with an uninitialised msi_list. 
Even if currently no code outside the MSI enable methods looks at it, I think we're asking for bugs in the future. So I'll do a patch which adds alloc_pci_dev(), update the callers, and then put the msi_list initialisation in there. > > --- msi-new.orig/include/linux/msi.h > > +++ msi-new/include/linux/msi.h > > @@ -1,6 +1,8 @@ > > #ifndef LINUX_MSI_H > > #define LINUX_MSI_H > > > > +#include > > + > > struct msi_msg { > > u32 address_lo; /* low 32 bits of msi message address */ > > u32 address_hi; /* high 32 bits of msi message address */ > > @@ -24,10 +26,8 @@ struct msi_desc { > > unsigned default_irq; /* default pre-assigned irq */ > > }msi_attrib; > > > > - struct { > > - __u16 head; > > - __u16 tail; > > - }link; > > + int irq; > This should be "unsigned int irq" Oops, I'll fix that. cheers -- Michael Ellerman OzLabs, IBM Australia Development Lab wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person signature.asc Description: This is a digitally signed message part
Re: [patch resend v4] update ctime and mtime for mmaped write
[EMAIL PROTECTED] wrote: But if you didn't notice until now, then the current implementation must be pretty reasonable for your use as well. Oh, I definitely noticed. As soon as I tried to port my application to 2.6, it broke - as evidenced by my complaints last year. The current solution is simple - since it's running on dedicated boxes, leave them on 2.4. Well I didn't know that was a change in behaviour vs 2.4 (or maybe I did and forgot). That was probably a bit silly, unless there was a good reason for it. -- SUSE Labs, Novell Inc.
Re: [PATCH] x86_64 irq: keep consistent for changing IRQ0_VECTOR from 0x20 to 0x30
On 3/7/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
> The comment fixes or some variation on them are needed.

Please check the patch about comment.

YH

[PATCH] x86_64 irq: keep consistent for changing IRQ0_VECTOR from 0x20 to 0x30

FIRST_EXTERNAL_VECTOR is used for IRQ_MOVE_CLEANUP_VECTOR, and IRQ0 starting
from FIRST_EXTERNAL_VECTOR + 0x10.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/x86_64/kernel/i8259.c b/arch/x86_64/kernel/i8259.c
index 21d95b7..4894266 100644
--- a/arch/x86_64/kernel/i8259.c
+++ b/arch/x86_64/kernel/i8259.c
@@ -45,7 +45,7 @@
 /*
  * ISA PIC or low IO-APIC triggered (INTA-cycle or APIC) interrupts:
- * (these are usually mapped to vectors 0x20-0x2f)
+ * (these are usually mapped to vectors 0x30-0x3f)
  */
 /*
@@ -299,7 +299,7 @@ void init_8259A(int auto_eoi)
 	 * outb_p - this has to work on a wide range of PC hardware.
 	 */
 	outb_p(0x11, 0x20);	/* ICW1: select 8259A-1 init */
-	outb_p(IRQ0_VECTOR, 0x21);	/* ICW2: 8259A-1 IR0-7 mapped to 0x20-0x27 */
+	outb_p(IRQ0_VECTOR, 0x21);	/* ICW2: 8259A-1 IR0-7 mapped to 0x30-0x37 */
 	outb_p(0x04, 0x21);	/* 8259A-1 (the master) has a slave on IR2 */
 	if (auto_eoi)
 		outb_p(0x03, 0x21);	/* master does Auto EOI */
@@ -307,7 +307,7 @@ void init_8259A(int auto_eoi)
 		outb_p(0x01, 0x21);	/* master expects normal EOI */
 	outb_p(0x11, 0xA0);	/* ICW1: select 8259A-2 init */
-	outb_p(IRQ8_VECTOR, 0xA1);	/* ICW2: 8259A-2 IR0-7 mapped to 0x28-0x2f */
+	outb_p(IRQ8_VECTOR, 0xA1);	/* ICW2: 8259A-2 IR0-7 mapped to 0x38-0x3f */
 	outb_p(0x02, 0xA1);	/* 8259A-2 is a slave on master's IR2 */
 	outb_p(0x01, 0xA1);	/* (slave's support for AEOI in flat mode is to be investigated) */
diff --git a/include/asm-x86_64/hw_irq.h b/include/asm-x86_64/hw_irq.h
index 2e4b7a5..6153ae5 100644
--- a/include/asm-x86_64/hw_irq.h
+++ b/include/asm-x86_64/hw_irq.h
@@ -38,7 +38,7 @@
 #define IRQ_MOVE_CLEANUP_VECTOR	FIRST_EXTERNAL_VECTOR
 /*
- * Vectors 0x20-0x2f are used for ISA interrupts.
+ * Vectors 0x30-0x3f are used for ISA interrupts.
  */
 #define IRQ0_VECTOR	FIRST_EXTERNAL_VECTOR + 0x10
 #define IRQ1_VECTOR	IRQ0_VECTOR + 1
Re: [PATCH 10/21] MSI: Add an arch_msi_supported()
Michael Ellerman <[EMAIL PROTECTED]> writes: > I agree with most of that. I thought of doing that change, but didn't > want to have the powerpc code stuck behind a huge pile of driver > changes. > > My only other worry is that at some point we'll get a driver that does > want to choose the entries it's allocated, and at that point we'll have > to put back the msix_entry code (or something similar). I don't have any > idea of when/if that sort of hardware/driver requirement is likely to > surface though, if it's "not for a while" it might be worth ripping out > the complexity until we really need it. Yes. Allocating everything and just requesting the irqs you really want is works as well. So drivers like that would need to be common and the savings significant before it would really be worthwhile to change the API back the way it is now. >> I was tempted to drop nvec as well since our irq numbers are virtual, >> we could always delay the failure into request_irq. But there are >> a few embedded architectures like the arm where the number irqs >> numbers may stay limited for a long time and if the driver will never >> use all of the irqs we get to save some resources and some work. So >> that makes sense. > > I think nvec should stay. Agreed. >> So can we please at least move this patch down to the end with the >> rest of the RTAS arch support? >> >> Moving it towards the end will allow it to be reviewed in the context >> where it will be used and it will give us a chance to simplify >> pci_enable_msix before we get there. > > I'm happy to move it to the end of the series. I'm also happy to stop > passing the msix_entry into the arch. > > But I don't want to predicate the merge of our powerpc stuff on the > removal of msix_entry entirely, there's too much risk that we'll slip to > v23. Sure. But if we can kill msix_entry in the same time frame it would be a good thing. 
Eric
[ PATCH] Add suspend/resume for HPET was: Re: [3/6] 2.6.21-rc4: known regressions
On Wednesday 28 March 2007 18:38:48 Linus Torvalds wrote: > > On Wed, 28 Mar 2007, Maxim wrote: > > > > Now I don't have a clue how to set those bits if only HPET is used as > > clock source because now clocksources > > don't have _any_ resume hook. > > One thing that drives me wild about that "clocksource resume" thing is > that it seems to think that clocksources are somehow different from any > other system devices.. > > Why isn't the HPET considered a "device", and has it's own *device* > "suspend" and "resume"? Why do we seem to think that only "set_mode()" > etc should wake up clock sources? > > It's a *device*, dammit. It should save and resume like one (probably as a > system device). The "set_mode()" etc stuff is at a completely different > (higher) conceptual level. > > Thomas? It does seem like Maxim has hit the nail on the head (at least > partly) on the HPET timer resume problems.. > > Linus > Hi, I am sending here a patch that as was discussed here adds hpet to list of system devices and adds suspend/resume hooks this way. I tested it and it works fine. 
---
Add suspend/resume support for HPET

Signed-off-by: Maxim Levitsky <[EMAIL PROTECTED]>
---
 arch/i386/kernel/hpet.c |   64 +++
 1 files changed, 64 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..ac41476 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
@@ -524,3 +526,65 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int hpet_register_sysfs(void)
+{
+	int err;
+
+	err = sysdev_class_register(&hpet_class);
+
+	if (!err) {
+		sysdev_register(&hpet_device);
+		if (err)
+			sysdev_class_unregister(&hpet_class);
+	}
+
+	return err;
+}
+
+device_initcall(hpet_register_sysfs);
+
+#endif
-- 
1.4.4.2
Re: [patch] MSI-X: fix resume crash
> Tony, Len the way pci_disable_device is being used in a suspend/resume > path by a few drivers is completely incompatible with the way irqs are > allocated on ia64. In particular people the following sequence occurs > in several drivers. > > probe: > pci_enable_device(pdev); > request_irq(pdev->irq); > suspend: > pci_disable_device(pdev); > resume: > pci_enable_device(pdev); > remove: > free_irq(pdev->irq); > pci_disable_device(pdev); There are no IA64 machines that support system suspend/resume today -- so you have 0 chance of breaking the IA64 suspend/resume installed base. My understanding is that Luming Yu has cobbled IA64 S4 support together for a future release though. > What I'm proposing we do is move the irq allocation code out of > pci_enable_device and the irq freeing code out of pci_disable_device in > the future. If we move ia64 to a model where the irq number equal the > gsi like we have for x86_64 and are in the middle of for i386 that > should be pretty straight forward. It would even be relatively simple > to delay vector allocation in that context until request_irq, if we > needed the delayed allocation benefit. Do you two have any problems > with moving in that direction? I think consistency here would be _wonderful_. Of course the beauty of having identity GSI=IRQ and a /proc/interrupts that tells you what IOAPIC pin you are using become moot with MSI -- but hey, showing the IRQ number rather than the vector number is consistent and makes sense. > If fixing the arch code is unacceptable for some reason I'm not aware of > we need to audit the 10-20 drivers that call pci_disable_device in their > suspend/resume processing and ensure that they have freed all of the > irqs before that point. Given that I have bug reports on the msi path I > know that isn't true. I think the suspend/resume interrupt logic needs some serious attention. 
We've had several schemes for suspend/resume of interrupts, several changes in strategy, and right now I think we are inconsistent, and frankly, I'm amazed it works at all.

-Len

> From: Eric W. Biederman <[EMAIL PROTECTED]>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  arch/cris/arch-v32/drivers/pci/bios.c |    4 +++-
>  arch/frv/mb93090-mb00/pci-vdk.c       |    3 ++-
>  arch/i386/pci/common.c                |    6 --
>  arch/ia64/pci/pci.c                   |    8 ++--
>  4 files changed, 15 insertions(+), 6 deletions(-)
>
> Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
> ===
> --- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
> +++ linux/arch/cris/arch-v32/drivers/pci/bios.c
> @@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
>  	if ((err = pcibios_enable_resources(dev, mask)) < 0)
>  		return err;
>
> -	return pcibios_enable_irq(dev);
> +	if (!dev->msi_enabled)
> +		pcibios_enable_irq(dev);
> +	return 0;
>  }
>
>  int pcibios_assign_resources(void)
> Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
> ===
> --- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
> +++ linux/arch/frv/mb93090-mb00/pci-vdk.c
> @@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
>
>  	if ((err = pcibios_enable_resources(dev, mask)) < 0)
>  		return err;
> -	pcibios_enable_irq(dev);
> +	if (!dev->msi_enabled)
> +		pcibios_enable_irq(dev);
>  	return 0;
>  }
> Index: linux/arch/i386/pci/common.c
> ===
> --- linux.orig/arch/i386/pci/common.c
> +++ linux/arch/i386/pci/common.c
> @@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
>  	if ((err = pcibios_enable_resources(dev, mask)) < 0)
>  		return err;
>
> -	return pcibios_enable_irq(dev);
> +	if (!dev->msi_enabled)
> +		return pcibios_enable_irq(dev);
> +	return 0;
>  }
>
>  void pcibios_disable_device (struct pci_dev *dev)
>  {
> -	if (pcibios_disable_irq)
> +	if (!dev->msi_enabled && pcibios_disable_irq)
>  		pcibios_disable_irq(dev);
>  }
> Index: linux/arch/ia64/pci/pci.c
> ===
> --- linux.orig/arch/ia64/pci/pci.c
> +++ linux/arch/ia64/pci/pci.c
> @@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
>  	if (ret < 0)
>  		return ret;
>
> -	return acpi_pci_irq_enable(dev);
> +	if (!dev->msi_enabled)
> +		return acpi_pci_irq_enable(dev);
> +	return 0;
>  }
>
>  void
>  pcibios_disable_device (struct pci_dev *dev)
>  {
>  	BUG_ON(atomic_read(&dev->enable_cnt));
> -	acpi_pci_irq_disable(dev);
> +	if (!dev->msi_enabled)
> +		acpi_pci_irq_disable(dev);
> +	return 0;
>  }
>
>  void
Re: [PATCH] max_loop limit, t2
On Mar 25 2007 10:40, Tomas M wrote: >On ??, Jan Engelhardt wrote: > >> here's one. Allocates all the fluff dynamically. It does not >> create any dev nodes by itself, so you need to do it (à la mdadm) > > I'm afraid that this would break a lot of things, for example mount > -o loop will not work anymore unless you create /dev/loop* manually > first, am I correct? In this case, this is unusable for many as it > is not backward compatible with old loop.c, am I correct? So here's another try. Use the max_auto_loop= module parameter to define how many device nodes should be created (defaults to 8, like original loop.c) in advance. (More specifically, how many disks you want uevents have generated.) This is because creating all 1048576 possible loop disks in /dev (tmpfs!!) would be really overkill and seldom good for memory usage. On Mar 28 2007 23:54, Kyle Moffett wrote: > Maybe an rbtree would work better here? Maximum number of nodes > traversed to get to the bottom of the tree given 2^(20) loop > devices is 19 as opposed to the 2^(20) for a linked list. Also, to > preserve compatibility with existing userspace loop tools you > should probably always allocate one extra loop device. Keep a > "highest used loopdev" number and create the one after that so that > udev will autocreate a dev node for it. Yeah I already have a ... hack that creates /dev/loop[0-7] but it segfaults ^_^ Perhaps someone knows why. The oops trace I get has kobject_uevent() in it, but I don't think I missed something in the _init function wrt. uevent generation, did I? 
Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

Name: dynamic-loop-jengelh2.diff

Index: linux-2.6.21-rc5/drivers/block/Makefile
===
--- linux-2.6.21-rc5.orig/drivers/block/Makefile
+++ linux-2.6.21-rc5/drivers/block/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_VIODASD) += viodasd.o
 obj-$(CONFIG_BLK_DEV_SX8) += sx8.o
 obj-$(CONFIG_BLK_DEV_UB) += ub.o
+CFLAGS_loop.o += -O0
Index: linux-2.6.21-rc5/drivers/block/loop.c
===
--- linux-2.6.21-rc5.orig/drivers/block/loop.c
+++ linux-2.6.21-rc5/drivers/block/loop.c
@@ -77,9 +77,9 @@
 #include
-static int max_loop = 8;
-static struct loop_device *loop_dev;
-static struct gendisk **disks;
+static unsigned int max_auto_loop = 8;
+static LIST_HEAD(loop_devices);
+static DEFINE_SPINLOCK(loop_devices_lock);
 /*
  * Transfer functions
@@ -183,7 +183,7 @@ figure_loop_size(struct loop_device *lo)
 	if (unlikely((loff_t)x != size))
 		return -EFBIG;
-	set_capacity(disks[lo->lo_number], x);
+	set_capacity(lo->lo_disk, x);
 	return 0;
 }
@@ -812,7 +812,7 @@ static int loop_set_fd(struct loop_devic
 	lo->lo_queue->queuedata = lo;
 	lo->lo_queue->unplug_fn = loop_unplug;
-	set_capacity(disks[lo->lo_number], size);
+	set_capacity(lo->lo_disk, size);
 	bd_set_size(bdev, size << 9);
 	set_blocksize(bdev, lo_blocksize);
@@ -832,7 +832,7 @@ out_clr:
 	lo->lo_device = NULL;
 	lo->lo_backing_file = NULL;
 	lo->lo_flags = 0;
-	set_capacity(disks[lo->lo_number], 0);
+	set_capacity(lo->lo_disk, 0);
 	invalidate_bdev(bdev, 0);
 	bd_set_size(bdev, 0);
 	mapping_set_gfp_mask(mapping, lo->old_gfp_mask);
@@ -918,7 +918,7 @@ static int loop_clr_fd(struct loop_devic
 	memset(lo->lo_crypt_name, 0, LO_NAME_SIZE);
 	memset(lo->lo_file_name, 0, LO_NAME_SIZE);
 	invalidate_bdev(bdev, 0);
-	set_capacity(disks[lo->lo_number], 0);
+	set_capacity(lo->lo_disk, 0);
 	bd_set_size(bdev, 0);
 	mapping_set_gfp_mask(filp->f_mapping, gfp);
 	lo->lo_state = Lo_unbound;
@@ -1357,8 +1357,9 @@ static struct block_device_operations lo
 /*
  * And now the modules code and kernel interface.
  */
-module_param(max_loop, int, 0);
-MODULE_PARM_DESC(max_loop, "Maximum number of loop devices (1-256)");
+module_param(max_auto_loop, uint, S_IRUGO);
+MODULE_PARM_DESC(max_auto_loop, "Maximum number of auto-generated loop device "
+	"nodes (0-1048576)");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS_BLOCKDEV_MAJOR(LOOP_MAJOR);
@@ -1383,7 +1384,7 @@ int loop_unregister_transfer(int number)
 	xfer_funcs[n] = NULL;
-	for (lo = &loop_dev[0]; lo < &loop_dev[max_loop]; lo++) {
+	list_for_each_entry(lo, &loop_devices, lo_list) {
 		mutex_lock(&lo->lo_ctl_mutex);
 		if (lo->lo_encryption == xfer)
@@ -1398,102 +1399,120 @@ int loop_unregister_transfer(int number)
 EXPORT_SYMBOL(loop_register_transfer);
 EXPORT_SYMBOL(loop_unregister_transfer);
-static int __init loop_init(void)
+static struct loop_device *loop_find_dev(unsigned int number)
+{
+	struct loop_device *lo;
+	list_for_each_entry(lo, &loop_devices, lo_list)
+		if (lo->lo_number == number)
+
Re: [PATCH 10/21] MSI: Add an arch_msi_supported()
On Tue, 2007-03-27 at 23:54 -0600, Eric W. Biederman wrote: > Michael Ellerman <[EMAIL PROTECTED]> writes: > > > Add an arch_msi_supported(), which gives archs a chance to check the input > > to pci_enable_msi/x. For MSI-X this routine might need the entry array, so > > pass it in. For plain MSI, NULL is passed, the arch routine needs to cope > > with that. Propagate the error value returned from the arch routine out to > > the caller. > > Ugh. I'm not very comfortable with passing struct msix_entry into > the architectures right now. > > There are a couple of reasons. > - It's irq field is to small (so we need to change it at some point) > - No a single driver that calls pci_enable_msix uses the scatter gather > feature (so the entry member is redundant). > > So this struct msix_entry needs to change and we need to change the drivers > along with it. Having to change a couple of architectures as well sounds > painful. So we might as well fix that at the same time as we are > adding the RTAS support so architectures don't have to deal with this > nasty unused concept. > > I'm thinking the same thing to do is to completely remove struct msix_entry > and just let drivers walk the linked list you introduce a few patches > later down. All they need is to get their irq numbers anyway. I agree with most of that. I thought of doing that change, but didn't want to have the powerpc code stuck behind a huge pile of driver changes. My only other worry is that at some point we'll get a driver that does want to choose the entries it's allocated, and at that point we'll have to put back the msix_entry code (or something similar). I don't have any idea of when/if that sort of hardware/driver requirement is likely to surface though, if it's "not for a while" it might be worth ripping out the complexity until we really need it. > I was tempted to drop nvec as well since our irq numbers are virtual, > we could always delay the failure into request_irq. 
But there are > a few embedded architectures like the arm where the number irqs > numbers may stay limited for a long time and if the driver will never > use all of the irqs we get to save some resources and some work. So > that makes sense. I think nvec should stay. > So can we please at least move this patch down to the end with the > rest of the RTAS arch support? > > Moving it towards the end will allow it to be reviewed in the context > where it will be used and it will give us a chance to simplify > pci_enable_msix before we get there. I'm happy to move it to the end of the series. I'm also happy to stop passing the msix_entry into the arch. But I don't want to predicate the merge of our powerpc stuff on the removal of msix_entry entirely, there's too much risk that we'll slip to v23. cheers -- Michael Ellerman OzLabs, IBM Australia Development Lab wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person signature.asc Description: This is a digitally signed message part
Re: 2.6.21-rc1 and 2.6.21-rc2 kwin dies silently
Sid I think I have found the problem. Could you try the following patch.
I believe I accidentally switched the sense of a test

diff --git a/kernel/exit.c b/kernel/exit.c
index f132349..b55ed4c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -790,7 +790,7 @@ static void exit_notify(struct task_struct *tsk)
 	pgrp = task_pgrp(tsk);
 	if ((task_pgrp(t) != pgrp) &&
-	    (task_session(t) != task_session(tsk)) &&
+	    (task_session(t) == task_session(tsk)) &&
 	    will_become_orphaned_pgrp(pgrp, tsk) &&
 	    has_stopped_jobs(pgrp)) {
 		__kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp);
Re: [PATCH] max_loop limit
On Mar 23, 2007, at 19:26:34, Jan Engelhardt wrote:
> here's one. Allocates all the fluff dynamically. It does not create any dev nodes by itself, so you need to do it (à la mdadm), but you'll get all 1048576 available minors.
>
> +static LIST_HEAD(loop_devices);

Maybe an rbtree would work better here? Maximum number of nodes traversed to get to the bottom of the tree given 2^(20) loop devices is 19 as opposed to the 2^(20) for a linked list.

Also, to preserve compatibility with existing userspace loop tools you should probably always allocate one extra loop device. Keep a "highest used loopdev" number and create the one after that so that udev will autocreate a dev node for it.

Cheers,
Kyle Moffett
Re: [KJ][PATCH] BIT macro cleanup
Milind Arun Choudhary wrote:
> BIT macro cleanup,now in bitops.h
>
> Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>
> ---
> diff --git a/drivers/net/s2io.h b/drivers/net/s2io.h
> index 0de0c65..5aa3be5 100644
> --- a/drivers/net/s2io.h
> +++ b/drivers/net/s2io.h
> @@ -14,6 +14,7 @@
>  #define _S2IO_H
>  #define TBD 0
> +#undef BIT
>  #define BIT(loc)		(0x8000000000000000ULL >> (loc))
>  #define vBIT(val, loc, sz)	(((u64)val) << (64-loc-sz))
>  #define INV(d)  ((d&0xff)<<24) | (((d>>8)&0xff)<<16) | (((d>>16)&0xff)<<8)| ((d>>24)&0xff)

Why not use "LLBIT(63 - loc)" instead?

Richard Knutsson
Re: [PATCH] kdump/kexec: calculate note size at compile time
On Thu, Mar 29, 2007 at 09:14:21AM +0530, Vivek Goyal wrote: > On Thu, Mar 29, 2007 at 12:30:59PM +0900, Simon Horman wrote: > > Hi, > > > > this is a(nother) minor update to this patch. > > Explanation below. > > > > -- > > Horms > > H: http://www.vergenet.net/~horms/ > > W: http://www.valinux.co.jp/en/ > > > > [PATCH] kdump/kexec: calculate note size at compile time > > > > Currently the size of the per-cpu region reserved to save crash > > notes is set by the per-architecture value MAX_NOTE_BYTES. Which > > in turn is currently set to 1024 on all supported architectures. > > > > While testing ia64 I recently discovered that this value is > > in fact too small. The particular setup I was using actually > > needs 1172 bytes. This lead to very tedious failure mode > > where the tail of one elf note would overwrite the head of > > another if they ended up being alocated sequentially by kmalloc, > > which was often the case. > > > > It seems to me that a far better approach is to caclculate the size > > that the area needs to be. This patch does just that. > > > > If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) > > is needed then this should be as easy as making MAX_NOTE_BYTES > > larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice. > > However, I think that the approach in this patch is a much more robust > > idea. > > > > Update I: > > > > Changed KEXEC_NOTE_HEAD_BYTES to KEXEC_NOTE_DESC_BYTES in line > > with the name of the relevant field in struct elf_note > > > > Update II: > > > > * Use KEXEC_NOTE_NAME instead of "CORE" in kernel/kexec.c and > > arch/ia64/kernel/crash.c just to be extra sure that the data > > used to calculate the size, and the data stuffed into the reserved > > area is the same. > > > > Incidently, the ia64 code really ought to use the generic code. > > I am working on a patch for this. But it is not urgent. > > > > Looks good. 
Another patch to make ia64 also use generic kexec code > for note generation would be nice. Thanks, I will make it so :-) -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/
Re: [PATCH] kdump/kexec: calculate note size at compile time
On Thu, Mar 29, 2007 at 12:30:59PM +0900, Simon Horman wrote: > Hi, > > this is a(nother) minor update to this patch. > Explanation below. > > -- > Horms > H: http://www.vergenet.net/~horms/ > W: http://www.valinux.co.jp/en/ > > [PATCH] kdump/kexec: calculate note size at compile time > > Currently the size of the per-cpu region reserved to save crash > notes is set by the per-architecture value MAX_NOTE_BYTES. Which > in turn is currently set to 1024 on all supported architectures. > > While testing ia64 I recently discovered that this value is > in fact too small. The particular setup I was using actually > needs 1172 bytes. This lead to very tedious failure mode > where the tail of one elf note would overwrite the head of > another if they ended up being alocated sequentially by kmalloc, > which was often the case. > > It seems to me that a far better approach is to caclculate the size > that the area needs to be. This patch does just that. > > If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) > is needed then this should be as easy as making MAX_NOTE_BYTES > larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice. > However, I think that the approach in this patch is a much more robust > idea. > > Update I: > > Changed KEXEC_NOTE_HEAD_BYTES to KEXEC_NOTE_DESC_BYTES in line > with the name of the relevant field in struct elf_note > > Update II: > > * Use KEXEC_NOTE_NAME instead of "CORE" in kernel/kexec.c and > arch/ia64/kernel/crash.c just to be extra sure that the data > used to calculate the size, and the data stuffed into the reserved > area is the same. > > Incidently, the ia64 code really ought to use the generic code. > I am working on a patch for this. But it is not urgent. > Looks good. Another patch to make ia64 also use generic kexec code for note generation would be nice. 
Thanks Vivek
Re: [PATCH] kdump/kexec: calculate note size at compile time
Hi, this is a(nother) minor update to this patch. Explanation below. -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ [PATCH] kdump/kexec: calculate note size at compile time Currently the size of the per-cpu region reserved to save crash notes is set by the per-architecture value MAX_NOTE_BYTES. Which in turn is currently set to 1024 on all supported architectures. While testing ia64 I recently discovered that this value is in fact too small. The particular setup I was using actually needs 1172 bytes. This lead to very tedious failure mode where the tail of one elf note would overwrite the head of another if they ended up being alocated sequentially by kmalloc, which was often the case. It seems to me that a far better approach is to caclculate the size that the area needs to be. This patch does just that. If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) is needed then this should be as easy as making MAX_NOTE_BYTES larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice. However, I think that the approach in this patch is a much more robust idea. Update I: Changed KEXEC_NOTE_HEAD_BYTES to KEXEC_NOTE_DESC_BYTES in line with the name of the relevant field in struct elf_note Update II: * Use KEXEC_NOTE_NAME instead of "CORE" in kernel/kexec.c and arch/ia64/kernel/crash.c just to be extra sure that the data used to calculate the size, and the data stuffed into the reserved area is the same. Incidently, the ia64 code really ought to use the generic code. I am working on a patch for this. But it is not urgent. * Added Ack from Vivek, which was actually for the update I version of the patch. If this is wrong, please tell me. 
Acked-by: Vivek Goyal <[EMAIL PROTECTED]> Signed-off-by: Simon Horman <[EMAIL PROTECTED]> arch/ia64/kernel/crash.c|2 +- include/asm-arm/kexec.h |2 -- include/asm-i386/kexec.h|2 -- include/asm-ia64/kexec.h|2 -- include/asm-mips/kexec.h|2 -- include/asm-powerpc/kexec.h |2 -- include/asm-s390/kexec.h|2 -- include/asm-sh/kexec.h |2 -- include/asm-x86_64/kexec.h |2 -- include/linux/kexec.h | 11 ++- kernel/kexec.c |2 +- 11 files changed, 12 insertions(+), 19 deletions(-) Index: linux-2.6/include/asm-ia64/kexec.h === --- linux-2.6.orig/include/asm-ia64/kexec.h 2007-03-28 18:50:25.0 +0900 +++ linux-2.6/include/asm-ia64/kexec.h 2007-03-29 12:19:10.0 +0900 @@ -14,8 +14,6 @@ /* The native architecture */ #define KEXEC_ARCH KEXEC_ARCH_IA_64 -#define MAX_NOTE_BYTES 1024 - #define kexec_flush_icache_page(page) do { \ unsigned long page_addr = (unsigned long)page_address(page); \ flush_icache_range(page_addr, page_addr + PAGE_SIZE); \ Index: linux-2.6/include/linux/kexec.h === --- linux-2.6.orig/include/linux/kexec.h2007-03-28 18:50:25.0 +0900 +++ linux-2.6/include/linux/kexec.h 2007-03-29 12:19:10.0 +0900 @@ -7,6 +7,8 @@ #include #include #include +#include +#include #include /* Verify architecture specific macros are defined */ @@ -31,6 +33,13 @@ #error KEXEC_ARCH not defined #endif +#define KEXEC_NOTE_NAME "CORE" +#define KEXEC_NOTE_HEAD_BYTES ALIGN(sizeof(struct elf_note), 4) +#define KEXEC_NOTE_NAME_BYTES ALIGN(strlen(KEXEC_NOTE_NAME) + 1, 4) +#define KEXEC_NOTE_DESC_BYTES ALIGN(sizeof(struct elf_prstatus), 4) +#define KEXEC_NOTE_BYTES ( (KEXEC_NOTE_HEAD_BYTES * 2) + \ + KEXEC_NOTE_NAME_BYTES + KEXEC_NOTE_DESC_BYTES ) + /* * This structure is used to hold the arguments that are used when loading * kernel binaries. @@ -136,7 +145,7 @@ /* Location of a reserved region to hold the crash kernel. 
*/ extern struct resource crashk_res; -typedef u32 note_buf_t[MAX_NOTE_BYTES/4]; +typedef u32 note_buf_t[KEXEC_NOTE_BYTES/4]; extern note_buf_t *crash_notes; Index: linux-2.6/include/asm-arm/kexec.h === --- linux-2.6.orig/include/asm-arm/kexec.h 2007-03-28 18:50:25.0 +0900 +++ linux-2.6/include/asm-arm/kexec.h 2007-03-29 12:19:10.0 +0900 @@ -16,8 +16,6 @@ #ifndef __ASSEMBLY__ -#define MAX_NOTE_BYTES 1024 - struct kimage; /* Provide a dummy definition to avoid build failures. */ static inline void crash_setup_regs(struct pt_regs *newregs, Index: linux-2.6/include/asm-i386/kexec.h === --- linux-2.6.orig/include/asm-i386/kexec.h 2007-03-28 18:50:25.0 +0900 +++ linux-2.6/include/asm-i386/kexec.h 2007-03-29 12:19:10.0 +0900 @@ -47,8 +47,6 @@ /* The native architecture */ #define KEXEC_ARCH KEXEC_ARCH_386 -#define MAX_NOTE_BYTES 1024 - /* CPU
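For readers following the arithmetic behind KEXEC_NOTE_BYTES, here is a minimal user-space sketch of the same calculation. It is not the kernel code: `struct note_hdr` is a stand-in for `struct elf_note`, `ALIGN_UP` stands in for the kernel's `ALIGN`, and the descriptor size is passed in as a parameter because `sizeof(struct elf_prstatus)` differs between architectures. The header is counted twice as in the patch, presumably once for the note itself and once for the empty note that terminates the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct elf_note: three 32-bit words. */
struct note_hdr {
	uint32_t n_namesz;	/* length of the name, including the NUL */
	uint32_t n_descsz;	/* length of the descriptor */
	uint32_t n_type;	/* note type */
};

/* Round x up to a multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

#define NOTE_NAME "CORE"

/*
 * Bytes needed for one note with a descriptor of desc_size bytes,
 * plus a second, empty header, mirroring the patch's formula:
 * (HEAD * 2) + NAME + DESC, each rounded up to 4 bytes.
 */
static size_t note_bytes(size_t desc_size)
{
	size_t head = ALIGN_UP(sizeof(struct note_hdr), 4);
	size_t name = ALIGN_UP(strlen(NOTE_NAME) + 1, 4);
	size_t desc = ALIGN_UP(desc_size, 4);

	return head * 2 + name + desc;
}
```

With a 12-byte header and the 5-byte name "CORE\0" rounded up to 8, the fixed overhead is 32 bytes; everything else scales with the descriptor, which is why a fixed 1024 can be too small on an architecture with a large register set.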
Re: [KJ][PATCH] BIT macro cleanup
Alexey Dobriyan wrote:
> On Wed, Mar 28, 2007 at 09:03:09AM +0530, Milind Arun Choudhary wrote:
>> --- a/include/linux/bitops.h
>> +++ b/include/linux/bitops.h
>> @@ -8,6 +8,9 @@
>>  */
>>  #include
>> +#define BIT(nr) (1UL << ((nr) % BITS_PER_LONG))
>
> I think this would be a disaster because something like BIT(123) would
> not even generate a warning.

There was a discussion on this at KJ when BIT was first used with a modulo operation. I said the same thing as you do now, but a big user of BIT is the input subsystem, which defines its BIT as above. It was also mentioned that the compiler can only find static errors; a variable input can still break it at runtime.

Plus, if we _really_ want to check the tree for such warnings, it is easy to remove the modulo operation temporarily (and keep away from input/).

I'm not saying I like this, just that it is a choice between possible errors.

Richard Knutsson
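To make the concern concrete, here is a small user-space sketch of the proposed macro. `BIT` is copied from the patch; `BIT_WORD_INDEX` and `bit_of` are hypothetical helpers added for illustration (the former in the style of the word-index companion that bitmap users pair with such a macro). It shows that an out-of-range bit number silently wraps instead of producing a diagnostic.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* BIT as proposed in the patch: the bit number wraps modulo the word size. */
#define BIT(nr) (1UL << ((nr) % BITS_PER_LONG))

/* Hypothetical companion: which word of a bitmap holds bit nr. */
#define BIT_WORD_INDEX(nr) ((nr) / BITS_PER_LONG)

/* Function wrapper so the macro can also be exercised with runtime values. */
static unsigned long bit_of(unsigned int nr)
{
	return BIT(nr);
}
```

With the modulo in place, `BIT(123)` compiles cleanly and yields the same mask as `BIT(123 % BITS_PER_LONG)`. That is what a multi-word bitmap user wants when it also indexes the array with the word number, and exactly what hides a mistake when the caller meant a single word.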
Re: [patch 2/3] libata: expose AN support to user space via sysfs
Jeff Garzik wrote: AN is a generic concept that I feel will propagate elsewhere. I think SCSI already has it or am I imagining things again? :-) Though perhaps it should be in a 'capability_flags' file rather than a 'media_change_event' file. IMHO, if it's genhd.capability_flags then the flag should be MEDIA_CHANGE_NOTIFY not ASYNC_NOTIFICATION because AN itself doesn't imply any specific event. It's just a notification mechanism, for ATAPI devices, it means media change, for PMP it has a different meaning, so I think we need to export the processed meaning not the specific mechanism to userland. Thanks. -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Corrupt XFS -Filesystems on new Hardware and Kernel
Oliver Joa wrote:
[...] reason or another, xfs has detected a corrupted on-disk inode format which it cannot recognize, and shuts down.

Oh, one other thing that may not apply in your case, but may. Does your SATA disk support write caching? Does it support something called a barrier function? (I'm not real clear on all the ways this can go wrong, but I believe barriers are supposed to guarantee that previous data has been fixed on disk, not just in the write cache.)

If the SATA controller issues a reset, it may very well purge the write cache. Theoretically, I can think of a _possibility_ that the reset disk would purge the write cache and the barrier indicator would tell xfs to resume writing. From a recent thread on the xfs list, it would appear this could be a "bad" thing (like crossing the streams ala "ghostbusters", but in a data-integrity context).

Just a "shot in the dark" -- absent knowing anything specific about your hardware or situation...

If that's the case, you might want to turn off write caching, since when xfs thinks "barriers" work, it turns off some "protection", which can enable some significant speedup in some situations.

As an aside, some disks, I gather, may "claim" to support barriers, but really don't. Xfs tries to verify the barrier claim, but I don't know that a reset issued to the disk will have deterministic behavior across all manufacturers' disks.

A bunch of "coulds" and "maybes", but just thinking off the top of my head...

Linda
Re: sky2 PHY setup
On Fri, Mar 16, 2007 at 02:16:48PM -0700, Stephen Hemminger wrote: > On Fri, 16 Mar 2007 14:36:45 -0600 > Rob Sims <[EMAIL PROTECTED]> wrote: > > Are there some debug hooks that can be activated? My sky2 stops > > responding (very light load) about twice a day. The netdev watchdog > > notices after a while and is able to reactivate the interface: > > Mar 15 13:28:12 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out > > Mar 15 13:28:12 btd kernel: sky2 eth0: tx timeout > > Mar 15 13:28:12 btd kernel: sky2 eth0: transmit ring 458 .. 435 report=458 > > done=458 > > Mar 15 13:28:12 btd kernel: sky2 eth0: disabling interface > > Mar 15 13:28:12 btd kernel: sky2 eth0: enabling interface > > Mar 15 13:28:12 btd kernel: sky2 eth0: ram buffer 48K > > Mar 15 13:28:15 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full > > duplex, flow control both > > Use ethtool -S to if there are any pause frames, etc. See if frames are > still making it into PHY statistics but not being received. > > Use ethtool -d to dump registers. Need current version of ethtool with decode > logic. > > Then look for things like is Ram buffer read/write pointer changing? > > Is GMAC stuck in pause: > > Normal is: > GMAC 1 > Status 0x5010 (see GM_GPSR_XXX in sky2.h) > Control 0x1800 > > Stuck is > GMAC 1 > Status 0x5810 (or 0x5A10) First, here's the described hang in action, on the Core2 Duo on a 1Gb hub: GMAC 1 Status/Control remains at 0x5010/0x1800 until module is removed. Read/write buffer pointers are changing. Full ethtool output in http://www.robsims.com/sky2.netmon.log.gz This machine was also having major throughput problems - 17 kB/s. Rebooting brought it to ~ 20 MB/s. Booting into a kernel with the proprietary sk98lin kernel module showed ~ 80MB/s. Finally, returning to sky2 gave 117 MB/s. Tests run using netcat, dd, /dev/zero, and /dev/null, transmitting from the problem box to an e1000 via a Netgear GS108. No hangs were observed during the "load test." 
I also had a hang on a Pentium 4 w/sky2, 100Mb/s hub. I neglected to try removing and re-inserting the module before rebooting.

GMAC 1
Status 0xF004
Control 0x1800

RAMbuffer pointers not moving, Read buffer Read pointer != Write pointer.

http://www.robsims.com/sky2.ethtooldumps.tgz

Thanks for looking at this.
-- 
Rob
Re: [patch 2/3] libata: expose AN support to user space via sysfs
Tejun Heo wrote: Jeff Garzik wrote: Kristen Carlson Accardi wrote: Allow user space to determine if an ATAPI device supports async notification (AN) of media changes. This is done by adding a new sysfs file "async_notification" to genhd. If the file reads 1, then the device supports async notification. If the file reads 0, it does not. A flag is set in the generic disk to indicate whether or not AN is supported. This flag is set by the SCSI subsystem when it registers with add_disk. The SCSI system gets information from libata on whether the device supports AN during dev_configure. Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]> 3) I would make the contents of 'media_change_events' be a list of flags, rather than a boolean. Thus, when AN is present, media_change_events would return "AN\n". It would return "\n" (no flags) when AN is absent. This permits future expansion of this capabilities reporting variable. I'm not sure about this. AN is kind of specific term for ATA while media change event is generic. So, I think the original approach is okay. No matter how the actual thing is implemented, it's the same media change event and as long as event delivery interface is the same, upper layer shouldn't care about how it's done. AN is a generic concept that I feel will propagate elsewhere. Though perhaps it should be in a 'capability_flags' file rather than a 'media_change_event' file. Jeff - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 3/3] libata: handle AN interrupt
Kristen Carlson Accardi wrote:
> When we get an SDB FIS with the 'N' bit set, we should send
> an event to user space to indicate that there has been a media
> change. The ahci host controller will send the event via
> KOBJ_CHANGE uevent.
>
> Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
>
> +static void async_notify_thread(struct work_struct *work)
> +{
> +	struct ata_device *atadev =
> +		container_of(work, struct ata_device, async_notify);
> +
> +	/*
> +	 * TBD - who should send this event? I couldn't find an
> +	 * easy way to map an ata_device to a genhd device, so
> +	 * decided maybe the ata host should send the event and
> +	 * allow user space to figure out what happened?
> +	 */
> +	kobject_uevent(&atadev->ap->host->dev->kobj, KOBJ_CHANGE);
> +}

I don't think this is right. If you're gonna make media_change_event capability generic, you gotta make event delivery generic too. You can make it a genhd event and make genhd supply the interface function, say, genhd_notify_media_change(), which is then forwarded by the SCSI layer.

Thanks.

-- 
tejun
Re: [patch 2/3] libata: expose AN support to user space via sysfs
Jeff Garzik wrote: Kristen Carlson Accardi wrote: Allow user space to determine if an ATAPI device supports async notification (AN) of media changes. This is done by adding a new sysfs file "async_notification" to genhd. If the file reads 1, then the device supports async notification. If the file reads 0, it does not. A flag is set in the generic disk to indicate whether or not AN is supported. This flag is set by the SCSI subsystem when it registers with add_disk. The SCSI system gets information from libata on whether the device supports AN during dev_configure. Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]> 3) I would make the contents of 'media_change_events' be a list of flags, rather than a boolean. Thus, when AN is present, media_change_events would return "AN\n". It would return "\n" (no flags) when AN is absent. This permits future expansion of this capabilities reporting variable. I'm not sure about this. AN is kind of specific term for ATA while media change event is generic. So, I think the original approach is okay. No matter how the actual thing is implemented, it's the same media change event and as long as event delivery interface is the same, upper layer shouldn't care about how it's done. Thanks. -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
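As an illustration of the flag-list format suggested in the quoted review ("AN\n" when AN is present, "\n" when there are no flags), here is a minimal user-space sketch of how a consumer might test for one flag. Everything in it is an assumption made for illustration: `has_flag` is a hypothetical helper, and the file format is a proposal from this thread, not an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `flag` appears as a whole whitespace-separated token
 * in `contents`, e.g. the text read from a capability file.
 */
static bool has_flag(const char *contents, const char *flag)
{
	const char *sep = " \t\n";
	size_t flen = strlen(flag);
	const char *p = contents;

	for (;;) {
		p += strspn(p, sep);		/* skip separators */
		if (*p == '\0')
			return false;		/* no more tokens */

		size_t tlen = strcspn(p, sep);	/* length of this token */
		if (tlen == flen && strncmp(p, flag, flen) == 0)
			return true;
		p += tlen;
	}
}
```

Matching whole tokens rather than substrings is the point of a flag list: a later flag that merely starts with "AN" would not be mistaken for it, which is the kind of future expansion the proposal is meant to allow.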
Re: [patch 2/3] libata: expose AN support to user space via sysfs
Kristen Carlson Accardi wrote: Allow user space to determine if an ATAPI device supports async notification (AN) of media changes. This is done by adding a new sysfs file "async_notification" to genhd. If the file reads 1, then the device supports async notification. If the file reads 0, it does not. A flag is set in the generic disk to indicate whether or not AN is supported. This flag is set by the SCSI subsystem when it registers with add_disk. The SCSI system gets information from libata on whether the device supports AN during dev_configure. I'm not sure whether this should be in generic block layer or in libata proper. libata sysfs hierarchy isn't there yet but is scheduled to be added soon. Async notification of media change is generic event for any block device with removable media, so I guess it can belong to generic layer. BTW, I think you also need to forward the flag in sd - disk device can be removable too. And please cc linux-scsi@vger.kernel.org to get SCSI part reviewed. Thanks. -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/3] libata: check for AN support
Tejun Heo wrote: Kristen Carlson Accardi wrote: Check to see if an ATAPI device supports Asynchronous Notification. If so, enable it. As supporting AN needs host interrupt handler change. I think we need host-supports-AN flag; otherwise, we might end up with screaming interrupts in the worst case. Quite so. Lacking a host flag, we need to know how each and every controller behaves when AN is activated (and supported by the device). I'm willing to bet some of the first-gen SATA controllers' ASIC state machines croak when AN is activated. Jeff - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/3] libata: check for AN support
Kristen Carlson Accardi wrote: Check to see if an ATAPI device supports Asynchronous Notification. If so, enable it. As supporting AN needs host interrupt handler change. I think we need host-supports-AN flag; otherwise, we might end up with screaming interrupts in the worst case. -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [rfc][patch] queued spinlocks (i386)
On Wed, Mar 28, 2007 at 03:00:21PM -0700, Davide Libenzi wrote:
> On Wed, 28 Mar 2007, Davide Libenzi wrote:
>
> > The method you propose is otherwise called "Ticket Lock":
> >
> > http://en.wikipedia.org/wiki/Ticket_lock
> > http://www.cs.rochester.edu/research/synchronization/pseudocode/ss.html#ticket
>
> Note that prior art for this work dates to 1991:
>
> http://www.cs.rochester.edu/u/scott/papers/1991_TOCS_synch.pdf
>
> So I would not worry too much about patents here. At least W2K MS ones ;)
> What I would worry about, though, is adding another class of locks. There's no

No, as you see from my patch I just change the spinlock implementation to a queued one. I agree it doesn't make sense to add a new type of lock.

> reason why Ticket Locks would perform worse than our spinlock, in both
> contended and not-contended case, AFAICS. And they have a nice FIFO
> behaviour.

In most cases, no. For the uncontended case they should be about the same. They have the same spinning behaviour. However there is a little window where they might be a bit slower I think... actually perhaps I'm wrong!

Currently if you have 4 CPUs spinning and the lock is released, all 4 CPU cachelines will be invalidated, then they will be loaded again, and found to be 0, so they all try to atomic_dec_return the counter, each one invalidating the others' cachelines. 1 gets through.

With my queued locks, all 4 cachelines are invalidated and loaded, but only one will be allowed to proceed, and there are 0 atomic operations or stores of any kind.

So I take that back: our current spinlocks have a worse thundering herd behaviour under contention than my queued ones. So I'll definitely push the patch through.
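For readers unfamiliar with the technique under discussion, here is a minimal user-space sketch of a ticket lock using C11 atomics. It is not the posted i386 patch; the type and function names are made up for illustration, and a real implementation would add a CPU relax hint in the spin loop.

```c
#include <assert.h>
#include <stdatomic.h>

struct ticket_lock {
	atomic_uint next_ticket;	/* next ticket to hand out */
	atomic_uint now_serving;	/* ticket currently allowed in */
};

static void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next_ticket, 0);
	atomic_init(&l->now_serving, 0);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
	/* One atomic read-modify-write to take a ticket... */
	unsigned int me = atomic_fetch_add_explicit(&l->next_ticket, 1,
						    memory_order_relaxed);

	/* ...then spin with plain loads until our number comes up. */
	while (atomic_load_explicit(&l->now_serving,
				    memory_order_acquire) != me)
		;
}

static void ticket_lock_release(struct ticket_lock *l)
{
	/* Only the holder writes now_serving, so load + store is enough. */
	unsigned int next = atomic_load_explicit(&l->now_serving,
						 memory_order_relaxed) + 1;

	atomic_store_explicit(&l->now_serving, next, memory_order_release);
}
```

This mirrors the behaviour described in the message: waiters only read while spinning, and when the lock is released exactly one of them sees its ticket and proceeds, without any further atomic operation by the others. Tickets are handed out in order, which gives the FIFO property.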
Re: Inlining can be _very_bad...
On Thu, Mar 29, 2007 at 01:18:38AM +0200, J.A. Magallón wrote: > It looks like is updating the stack on each iteration...This is -march=opteron > code, the -march=pentium4 is similar. Same behaviour with gcc3 and gcc4. > > tst.c and Makefile attached. > > Nice, isn't it ? Please, probe where is my fault... Yes, gcc sucks in its handling of large return values, news at 11. I have several outstanding bugs on cases where gcc could keep things in registers but doesn't. That said, it tends to do much better on plain integer code, as that is what it gets tuned for. Do NOT propagate the blanket myth that inlining is a bad thing. It is very useful for small functions where the overhead associated with call/ret sequences and register clobbers overshadows the work being done. The call/ret updates alone can make a big difference when there are lots of other (more useful) memory transactions to complete. Take a look at things like the notifier hooks for an example of something that does far too little work per function call and should really be inlined. -ben -- "Time is of no importance, Mr. President, only life is important." Don't Email: <[EMAIL PROTECTED]>. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [rfc][patch] queued spinlocks (i386)
On Wed, Mar 28, 2007 at 12:26:57PM -0700, Davide Libenzi wrote:
> On Wed, 28 Mar 2007, Nick Piggin wrote:
>
> > On Sat, Mar 24, 2007 at 06:29:59PM +0100, Ingo Molnar wrote:
> > >
> > > * Nikita Danilov <[EMAIL PROTECTED]> wrote:
> > >
> > > > Indeed, this technique is very well known. E.g.,
> > > > http://citeseer.ist.psu.edu/anderson01sharedmemory.html has a whole
> > > > section (3. Local-spin Algorithms) on them, citing papers from the
> > > > 1990 onward.
> > >
> > > that is a cool reference! So i'd suggest to do (redo?) the patch based
> > > on those concepts and that terminology and not use 'queued spinlocks'
> > > that are commonly associated with MS's stuff. And as a result the
> > > contended case would be optimized some more via local-spin algorithms.
> > > (which is not a key thing for us, but which would be nice to have
> > > nevertheless)
> >
> > Firstly, the terminology in that paper _is_ "queue lock", which isn't
> > really surprising. I don't really know or care about what MS calls their
> > locks, but I'd suggest that their queued spinlock is probably named in
> > reference to its queueing property rather than its local spin property.
>
> The method you propose is otherwise called "Ticket Lock":
>
> http://en.wikipedia.org/wiki/Ticket_lock
> http://www.cs.rochester.edu/research/synchronization/pseudocode/ss.html#ticket

Yes, a ticket based FIFO queue isn't new... I think we have a lot of examples already in the kernel. Using them to implement queue locks obviously isn't new either.

I don't think we'd have to worry about patents.
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
On Wed, Mar 28, 2007 at 02:05:54PM -0700, Christoph Lameter wrote: > Tried this also on x86_64 with an enhanced quicklist patch that also deals > with ptes (at the price of not guaranteeing the free after the tlb flush): ... > Seems that there is a slight benefit but its also barely above noise > level. You're not running a test that will show any benefit in this area. Run some heavy shell scripts or lmbench's fork() and exec() latency tests to get real numbers. -ben -- "Time is of no importance, Mr. President, only life is important." Don't Email: <[EMAIL PROTECTED]>. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[ANNOUNCE] GIT 1.5.0.6
The latest maintenance release GIT 1.5.0.6 is available at the usual places:

  http://www.kernel.org/pub/software/scm/git/

  git-1.5.0.6.tar.{gz,bz2}			(tarball)
  git-htmldocs-1.5.0.6.tar.{gz,bz2}		(preformatted docs)
  git-manpages-1.5.0.6.tar.{gz,bz2}		(preformatted docs)
  RPMS/$arch/git-*-1.5.0.6-1.$arch.rpm		(RPM)

GIT v1.5.0.6 Release Notes
==========================

Fixes since v1.5.0.5

* Bugfixes

  - a handful of small fixes to gitweb.

  - build procedure for user-manual is fixed not to require locally installed stylesheets.

  - "git commit $paths" on paths whose earlier contents were already updated in the index were failing out.

* Documentation

  - user-manual has better cross references.

  - gitweb installation/deployment procedure is now documented.

Changes since v1.5.0.5 are as follows:

J. Bruce Fields (5):
      user-manual: run xsltproc without --nonet option
      user-manual: Use def_ instead of ref_ for glossary references.
      glossary: stop generating automatically
      glossary: clean up cross-references
      user-manual: introduce "branch" and "branch head" differently

Jakub Narebski (4):
      gitweb: Fix "next" link in commit view
      gitweb: Don't escape attributes in CGI.pm HTML methods
      gitweb: Fix not marking signoff lines in "log" view
      gitweb: Add some installation notes in gitweb/INSTALL

Jeff King (1):
      commit: fix pretty-printing of messages with "\nencoding "

Jim Meyering (1):
      user-manual.txt: fix a tiny typo.

Johannes Schindelin (1):
      t4118: be nice to non-GNU sed

Junio C Hamano (2):
      git-commit: "read-tree -m HEAD" is not the right way to read-tree quickly
      GIT 1.5.0.6

Li Yang (1):
      gitweb: Change to use explicitly function call cgi->escapHTML()

Michael S. Tsirkin (1):
      fix typo in git-am manpage

Peter Eriksen (1):
      Documentation/pack-format.txt: Clear up description of types.
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
On Wed, 28 Mar 2007, William Lee Irwin III wrote: > NIH == "Not Invented Here." Basically a sort of idea theft, often used > to grab credit for patches. You're not the one involved there. That was > a digression. One could say, though, that a solution to the slab issues > is to NIH slab allocators e.g. via quicklist.h/quicklist.c without the > negative connotation. Oh. The quicklist were actually taken from existing IA64 code. Not my idea either. I am not wedded to any solution and I was certainly not intending to abscond with your idea. Would have been difficult given that there was a signoff line with your name on it. > On Wed, Mar 28, 2007 at 04:44:01PM -0700, Christoph Lameter wrote: > > We certainly see even from the rudimentary tests that I have done > > that the limited pgd, pmd caching has some effect. Could we please > > see your local patches? And I guess that you must have some sort of > > benchmark that you run to test these? > > Short answer: No. > > Long answer: Most of the local patches are not likely to be of interest > to the world at large. The ones I probably don't mind mentioning so > much are things like ports of ipt_TARPIT.c to -CURRENT, support for > mmap() of /proc/profile, things to make the notsc boot parameter > actually do what you'd expect it to do instead of the kernel ignoring > the option when you actually need it and mucking with the TSC behind > your back, and so on. There are also things I'd rather keep under wraps > so they don't mysteriously appear on lkml a few years later posted by > someone else without any attribution to me (i.e. the NIH's that bother > me). I've not got any of them ported to current mainline anyway, and > some data loss from fried disks seems to have eaten most/all of the > post-2.6.0 revisions of these patches anyway, though I've got compiled > kernels with them on various kernel versions between then and 2.6.10 > (not to say that's any impediment to my hammering out fresh ports). Ummm. 
So nothing concrete on the performance issues that we are considering here? We are talking about something that was lost a couple of years ago? There are certainly other people who will have the same ideas given enough time. Software and patches age like groceries. Hiding them will just make them wither away. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use
On giovedì 29 marzo 2007, Blaisorblade wrote: > On mercoledì 28 marzo 2007, Jeff Dike wrote: > > [ This patch needs to get into 2.6.21, as it fixes a serious bug > > introduced soon after 2.6.20 ] > > > > Commit 62f96cb01e8de7a5daee472e540f726db2801499 introduced per-devices > > queues and locks, which was fine as far as it went, but left in place > > a global which controlled access to submitting requests to the host. > > This should have been made per-device as well, since it causes I/O > > hangs when multiple block devices are in use. > > > > This patch fixes that by replacing the global with an activity flag in > > the device structure in order to tell whether the queue is currently > > being run. > > Finally that variable has a understandable name. However in a mail from > Jens Axboe, titled: > "Re: [uml-devel] [PATCH 06/11] uml ubd driver: ubd_io_lock usage fixup" , > with Date: Mon, 30 Oct 2006 09:26:48 +0100, he suggested removing this flag > > altogether, so we may explore this for the future: > > > Add some comments about requirements for ubd_io_lock and expand its > > > use. > > > > > > When an irq signals that the "controller" (i.e. another thread on the > > > host, which does the actual requests and is the only one blocked on I/O > > > on the host) has done some work, we call again the request function > > > ourselves (do_ubd_request). > > > We now do that with ubd_io_lock held - that's useful to protect against > > > concurrent calls to elv_next_request and so on. > > Not only useful, required, as I think I complained about a year or more > > ago :-) > > > XXX: Maybe we shouldn't call at all the request function. Input needed > > > on this. Are we supposed to plug and unplug the queue? That code > > > "indirectly" does that by setting a flag, called do_ubd, which makes > > > the request function return (it's a residual of 2.4 block layer > > > interface). > > Sometimes you need to. 
I'd probably just remove the do_ubd check and > > always recall the request function when handling completions, it's > > easier and safe. > Anyway, the main speedups to do on the UBD driver are: > * implement write barriers (so much less fsync) - this is performance > killer n.1 > * possibly to use the new 2.6 request layout with scatter/gather I/O, and > vectorized I/O on the host > * while at vectorizing I/O using async I/O > * to avoid passing requests on pipes (n.2) - on fast disk I/O becomes > cpu-bound. > To make a different but related example, with a SpeedScale laptop, it's > interesting to double CPU frequency and observe tuntap speed double too. > (with 1GHz I get on TCP numbers like 150 Mbit/s - 100 Mbit/s, depending > whether UML trasmits or receives data; with 2GHz double rates). > Update: I now get 150Mbit / 200Mbit (Uml receives/Uml sends) at 1GHz, and > still the double at 2Ghz. > This is a different UML though. > * using futexes instead of pipes for synchronization (required for previous > one). I forgot one thing: remember ubd=mmap? Something like that could have been done using MAP_PRIVATE, so that write had still to be called explicitly but unchanged data was shared with the host. Once a page gets dirty but is then cleaned, sharing it back is difficult - but even without that good savings could be achievable. That's to explore for the very future though. -- Inform me of my mistakes, so I can add them to my list! Paolo Giarrusso, aka Blaisorblade http://www.user-mode-linux.org/~blaisorblade - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add support for deferrable timers (respun-Mar28)
On Wed, Mar 28, 2007 at 05:01:59PM -0700, Andrew Morton wrote:
> On Wed, 28 Mar 2007 16:00:21 -0700
> Venki Pallipadi <[EMAIL PROTECTED]> wrote:
>
> > Please drop the patch you included yesterday and two incremental patches and
> > use the patch below.
>
> As you saw, I went and turned it into an incremental patch again. It makes
> it easier to see what changed, but harder to see the whole thing.
>
> > Introduce a new flag for timers - deferrable:
>
> OK, but there's nothing in-kernel which actually uses this.
>
> It would be good to identify some timer users which can be switched over (as
> many as possible, really) so this thing actually gets some runtime testing.

ondemand is the biggest offender and the patch below reduces the number of interrupts by 50% or more (depending on HZ) on different test systems here.

Yes. There are quite a few other timers inside the kernel that can be migrated. I will use timer_stats to track down others and send in the patches soon.

Thanks,
Venki

--

Add a new deferrable delayed work init. This can be used to schedule work that is 'unimportant' when the CPU is idle and can be called later, when the CPU eventually comes out of idle.

Use this init in the cpufreq ondemand governor.
Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: new/drivers/cpufreq/cpufreq_ondemand.c
===
--- new.orig/drivers/cpufreq/cpufreq_ondemand.c 2007-03-28 10:03:21.0 -0800
+++ new/drivers/cpufreq/cpufreq_ondemand.c 2007-03-28 10:05:44.0 -0800
@@ -470,7 +470,7 @@
 	dbs_info->enable = 1;
 	ondemand_powersave_bias_init();
 	dbs_info->sample_type = DBS_NORMAL_SAMPLE;
-	INIT_DELAYED_WORK(&dbs_info->work, do_dbs_timer);
+	INIT_DELAYED_WORK_DEFERRABLE(&dbs_info->work, do_dbs_timer);
 	queue_delayed_work_on(dbs_info->cpu, kondemand_wq, &dbs_info->work,
 			      delay);
 }
Index: new/include/linux/workqueue.h
===
--- new.orig/include/linux/workqueue.h 2007-03-28 10:03:21.0 -0800
+++ new/include/linux/workqueue.h 2007-03-28 10:05:44.0 -0800
@@ -89,6 +89,12 @@
 		init_timer(&(_work)->timer);			\
 	} while (0)
 
+#define INIT_DELAYED_WORK_DEFERRABLE(_work, _func)		\
+	do {							\
+		INIT_WORK(&(_work)->work, (_func));		\
+		init_timer_deferrable(&(_work)->timer);		\
+	} while (0)
+
 /**
  * work_pending - Find out whether a work item is currently pending
  * @work: The work item in question
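The effect described in the changelog above can be modelled in a few lines of user-space C. This is only a toy model under stated assumptions, not the kernel timer code: `struct sim_timer`, `NO_WAKEUP` and `next_idle_wakeup` are hypothetical names, and the model captures just one idea, namely that an idle CPU picks its next wakeup from the non-deferrable timers only, so deferrable work such as ondemand sampling runs whenever the CPU next wakes for another reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_timer {
	unsigned long expires;	/* absolute expiry time, in ticks */
	bool deferrable;	/* may be delayed while the CPU is idle */
};

#define NO_WAKEUP (~0UL)

/*
 * Earliest tick at which an idle CPU must wake up.
 * Deferrable timers are ignored; if only deferrable timers are
 * pending, the CPU has no timer-driven reason to wake at all.
 */
static unsigned long next_idle_wakeup(const struct sim_timer *t, size_t n)
{
	unsigned long next = NO_WAKEUP;

	for (size_t i = 0; i < n; i++) {
		if (t[i].deferrable)
			continue;
		if (t[i].expires < next)
			next = t[i].expires;
	}
	return next;
}
```

In this model a deferrable timer due at tick 100 does not pull the wakeup forward from a regular timer due at tick 250, which is where the reduction in interrupts on an idle system comes from.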
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
On Wed, Mar 28, 2007 at 02:38:55PM -0700, Christoph Lameter wrote:
>>> No that was described in the patch. Quote:
>>> "i386 only provides support for caching constructed pgd and pmds. These
>>> are comparatively rare to ptes so it is no surprise that the current
>>> approach has only minimal effect. "

On Thu, Mar 29, 2007 at 01:28:59AM +0100, Alan Cox wrote:
> Whatever it was originally for and public or not, the above isn't true
> for some non Intel products...

Sorry if the descriptions here are misleading. This is basically an attempt to have the kernel keep preconstructed pagetables around so that the bitblitting hits need not be repetitively taken during fork() and faults, where counterarguments revolve around whether this is actually a hit at all and whether it's significant. It's not related to the transparent quasi-ASID/ASN affairs AMD has based on %cr3 contents.

-- wli
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
On Wed, 28 Mar 2007, William Lee Irwin III wrote: >> As far as kernel compiles being relevant to anything besides >> potentially optimizing a particular major benchmark using gcc as one >> of its components... yeah, right. It's too macro to be a microbenchmark >> of anything and too micro to be pertinent to any meaningful >> macrobenchmark such as those from major benchmark publishers (who can't >> be named for trademark/etc. reasons). Hasn't it been at least 5 years >> since people figured out kernel compiles were complete bulls**t as >> benchmarks along with dbench for other reasons and several others? If >> not, I don't know why I bother with this kernel at all. On Wed, Mar 28, 2007 at 04:44:01PM -0700, Christoph Lameter wrote: > All benchmarks have their specific drawbacks. I personally like to do > code review and see what cachelines are touched but that is basically > imagining what a cpu does. Ones thinking may be led astray. Well, the kernel compiles are just terrible at everything they could plausibly be used to measure. I could, in principle, develop a benchmark that simulates a forking server that does many things similar to what a kernel compile is meant to measure without a number of its stupidities, but I've got enough to do already. On Wed, 28 Mar 2007, William Lee Irwin III wrote: >> Even so, I already did this and am done with it. It's not like I'm >> not carrying around numerous patches I know will never be merged all >> the time anyway. If you want to back it all out so badly, just do it >> and stop bothering me about it, and I'll merely continue maintaining my >> local patches without ever posting them as I have been for years. I'm >> not at all happy with the NIH situation, either, not that I'm at such a >> loss for ideas to need to contest every petty NIH that flies past. On Wed, Mar 28, 2007 at 04:44:01PM -0700, Christoph Lameter wrote: > What is NIH? My main concern is to get the use of page struct fields of > the slab removed. 
We have to do special things to page sized allocs > because of these page struct uses. F.e. the private field is used by > compound pages and if any of the slabs allocate a higher order page that > field will be in use. NIH == "Not Invented Here." Basically a sort of idea theft, often used to grab credit for patches. You're not the one involved there. That was a digression. One could say, though, that a solution to the slab issues is to NIH slab allocators e.g. via quicklist.h/quicklist.c without the negative connotation. On Wed, Mar 28, 2007 at 04:44:01PM -0700, Christoph Lameter wrote: > We certainly see even from the rudimentary tests that I have done > that the limited pgd, pmd caching has some effect. Could we please > see your local patches? And I guess that you must have some sort of > benchmark that you run to test these? Short answer: No. Long answer: Most of the local patches are not likely to be of interest to the world at large. The ones I probably don't mind mentioning so much are things like ports of ipt_TARPIT.c to -CURRENT, support for mmap() of /proc/profile, things to make the notsc boot parameter actually do what you'd expect it to do instead of the kernel ignoring the option when you actually need it and mucking with the TSC behind your back, and so on. There are also things I'd rather keep under wraps so they don't mysteriously appear on lkml a few years later posted by someone else without any attribution to me (i.e. the NIH's that bother me). I've not got any of them ported to current mainline anyway, and some data loss from fried disks seems to have eaten most/all of the post-2.6.0 revisions of these patches anyway, though I've got compiled kernels with them on various kernel versions between then and 2.6.10 (not to say that's any impediment to my hammering out fresh ports). 
-- wli
Re: new sysfs layout and ethernet device names
Greg KH ([EMAIL PROTECTED]) said:
> If you follow the rules in Documentation/ABI/testing/sysfs-class your
> program will not have any problems.

Oh, of *course*. We add interfaces and then claim years later, after code has been written, "Oh, you shouldn't be using that!" in documentation. Meanwhile, such code using the old interface will still a) continue to compile b) continue to run without any sort of warnings.

If interfaces have to change, so be it. But changing the rules for using them years after it's implemented and then claiming "you didn't read the instructions" is pretty lame.

Bill
Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use
On mercoledì 28 marzo 2007, Jeff Dike wrote: > [ This patch needs to get into 2.6.21, as it fixes a serious bug > introduced soon after 2.6.20 ] > > Commit 62f96cb01e8de7a5daee472e540f726db2801499 introduced per-devices > queues and locks, which was fine as far as it went, but left in place > a global which controlled access to submitting requests to the host. > This should have been made per-device as well, since it causes I/O > hangs when multiple block devices are in use. > > This patch fixes that by replacing the global with an activity flag in > the device structure in order to tell whether the queue is currently > being run. Finally that variable has a understandable name. However in a mail from Jens Axboe, titled: "Re: [uml-devel] [PATCH 06/11] uml ubd driver: ubd_io_lock usage fixup" , with Date: Mon, 30 Oct 2006 09:26:48 +0100, he suggested removing this flag altogether, so we may explore this for the future: > > Add some comments about requirements for ubd_io_lock and expand its use. > > > > When an irq signals that the "controller" (i.e. another thread on the > > host, which does the actual requests and is the only one blocked on I/O > > on the host) has done some work, we call again the request function > > ourselves (do_ubd_request). > > > > We now do that with ubd_io_lock held - that's useful to protect against > > concurrent calls to elv_next_request and so on. > > Not only useful, required, as I think I complained about a year or more > ago :-) > > > XXX: Maybe we shouldn't call at all the request function. Input needed on > > this. Are we supposed to plug and unplug the queue? That code > > "indirectly" does that by setting a flag, called do_ubd, which makes the > > request function return (it's a residual of 2.4 block layer interface). > > Sometimes you need to. I'd probably just remove the do_ubd check and > always recall the request function when handling completions, it's > easier and safe. 
Anyway, the main speedups to do on the UBD driver are:

* implement write barriers (so much less fsync) - this is performance killer n.1
* possibly use the new 2.6 request layout with scatter/gather I/O, and vectorized I/O on the host
* while vectorizing I/O, use async I/O
* avoid passing requests on pipes (n.2) - on fast disks, I/O becomes cpu-bound. To make a different but related example, with a SpeedScale laptop, it's interesting to double CPU frequency and observe tuntap speed double too (with 1GHz I get on TCP numbers like 150 Mbit/s - 100 Mbit/s, depending on whether UML transmits or receives data; with 2GHz, double rates). Update: I now get 150Mbit / 200Mbit (UML receives/UML sends) at 1GHz, and still double that at 2GHz. This is a different UML though.
* use futexes instead of pipes for synchronization (required for the previous one).

-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
Re: Corrupt XFS -Filesystems on new Hardware and Kernel
Oliver Joa wrote: eason or another, xfs has detected a corrupted on-disk inode format which it cannot recognize, and shuts down. It is likely the result of something which has gone wrong previously. xfs_repair should fix it. Are there other non-xfs messages in your logs indicating other problems prior to this? i sent already the dmesg output to the list. there is nothing else. I made a xfs_repair. Now I have some Files in lost+found. So I tried it again with a new cable: --- I doubt it has changed significantly, but xfs was designed for stable hardware. That doesn't mean you can't pull the plug, but if you are getting SATA resets, you may be getting some writes aborted, with subsequent writes going through (speculation). I know when I had a flakey SCSI disk problem (was cable or connector in my case), I'd get a rare XFS corruption (out of ~10 years of XFS use, maybe 2-3 corruptions, all caused by loose connections, cables, etc). I'd strongly suggest you get to the bottom of the SATA reset problem. After that is fixed, then try to clean up your XFS disks (or restore from backups). Sometimes, after some intermittent hardware problems, my xfs file system was too corrupt for me to repair (at least with default xfs_repair options). Doesn't mean it was irreparable, just, I didn't know how to proceed and it was easier to restore from a daily backup than attempt to manually repair the damage. The above is based solely on my own experience. I use xfs with max(8?) logbuffs, and noatime/nodiratime, and find it to have among the best performance characteristics of any file system (overall; lowest performance aspect was file delete). XFS has a low fragmentation rate, due to how it allocates space and can delay writes. Even so, it is also one of the few file systems (only?) that comes with a "defragmenter" (xfs_fsr (file system reorganizer)). 
Sgi used to ship systems with xfs_fsr configured to run weekly to "watch out for" rare, degenerate cases (important for some real-time video apps). My cron runs it nightly, but often it will pass through all file systems making no changes. Fix the flakey hw -- then see if your xfs probs don't "magically" go away...however, YMMV... Linda
Re: [patch 2/3] libata: expose AN support to user space via sysfs
Kristen Carlson Accardi wrote: Allow user space to determine if an ATAPI device supports async notification (AN) of media changes. This is done by adding a new sysfs file "async_notification" to genhd. If the file reads 1, then the device supports async notification. If the file reads 0, it does not. A flag is set in the generic disk to indicate whether or not AN is supported. This flag is set by the SCSI subsystem when it registers with add_disk. The SCSI system gets information from libata on whether the device supports AN during dev_configure. Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]> Index: 2.6-mm/block/genhd.c === --- 2.6-mm.orig/block/genhd.c +++ 2.6-mm/block/genhd.c @@ -372,6 +372,11 @@ static ssize_t disk_size_read(struct gen { return sprintf(page, "%llu\n", (unsigned long long)get_capacity(disk)); } +static ssize_t disk_AN_read(struct gendisk * disk, char *page) +{ + return sprintf(page, "%d\n", + (disk->flags & GENHD_FL_ASYNC_NOTIFICATION ? 1 : 0)); +} static ssize_t disk_stats_read(struct gendisk * disk, char *page) { @@ -419,6 +424,10 @@ static struct disk_attribute disk_attr_s .attr = {.name = "stat", .mode = S_IRUGO }, .show = disk_stats_read }; +static struct disk_attribute disk_attr_AN = { + .attr = {.name = "media_change_events", .mode = S_IRUGO }, + .show = disk_AN_read +}; #ifdef CONFIG_FAIL_MAKE_REQUEST @@ -455,6 +464,7 @@ static struct attribute * default_attrs[ _attr_removable.attr, _attr_size.attr, _attr_stat.attr, + _attr_AN.attr, #ifdef CONFIG_FAIL_MAKE_REQUEST _attr_fail.attr, #endif Index: 2.6-mm/include/linux/genhd.h === --- 2.6-mm.orig/include/linux/genhd.h +++ 2.6-mm/include/linux/genhd.h @@ -94,6 +94,7 @@ struct hd_struct { #define GENHD_FL_REMOVABLE 1 #define GENHD_FL_DRIVERFS 2 +#define GENHD_FL_ASYNC_NOTIFICATION4 #define GENHD_FL_CD8 #define GENHD_FL_UP16 #define GENHD_FL_SUPPRESS_PARTITION_INFO 32 Index: 2.6-mm/include/scsi/scsi_device.h === --- 2.6-mm.orig/include/scsi/scsi_device.h +++ 
2.6-mm/include/scsi/scsi_device.h @@ -126,7 +126,7 @@ struct scsi_device { unsigned fix_capacity:1;/* READ_CAPACITY is too high by 1 */ unsigned guess_capacity:1; /* READ_CAPACITY might be too high by 1 */ unsigned retry_hwerror:1; /* Retry HARDWARE_ERROR */ - + unsigned async_notification:1; /* device supports async notification */ unsigned int device_blocked;/* Device returned QUEUE_FULL. */ unsigned int max_device_blocked; /* what device_blocked counts down from */ Index: 2.6-mm/drivers/ata/libata-scsi.c === --- 2.6-mm.orig/drivers/ata/libata-scsi.c +++ 2.6-mm/drivers/ata/libata-scsi.c @@ -899,6 +899,9 @@ static void ata_scsi_dev_config(struct s blk_queue_max_hw_segments(q, q->max_hw_segments - 1); } + if (dev->flags & ATA_DFLAG_AN) + sdev->async_notification = 1; + if (dev->flags & ATA_DFLAG_NCQ) { int depth; Index: 2.6-mm/drivers/scsi/sr.c === --- 2.6-mm.orig/drivers/scsi/sr.c +++ 2.6-mm/drivers/scsi/sr.c @@ -603,6 +603,8 @@ static int sr_probe(struct device *dev) dev_set_drvdata(dev, cd); disk->flags |= GENHD_FL_REMOVABLE; + if (sdev->async_notification) + disk->flags |= GENHD_FL_ASYNC_NOTIFICATION; add_disk(disk); (added linux-scsi to CC) Comments: 1) From a procedural standpoint, you'll want to separate this patch into three patches: generic block layer stuff, SCSI stuff, and libata stuff. 2) I don't claim to be a sysfs expert, but this seems like a reasonable approach for reporting async-notification capabilities 3) I would make the contents of 'media_change_events' be a list of flags, rather than a boolean. Thus, when AN is present, media_change_events would return "AN\n". It would return "\n" (no flags) when AN is absent. This permits future expansion of this capabilities reporting variable. 4) Figure out some place to document 'media_change_events', in Documentation/* 5) I think the method of delivery probably needs discussing, and some work. Presumably the normal hotplug paths should be traversed for this sort of thing. 
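Comment 3 above (report a list of flags instead of a boolean) could look roughly like the following user-space sketch. Nothing here is from the patch: `TOY_EVT_AN` and `toy_media_change_events_show()` are hypothetical names, and `snprintf()` stands in for the kernel's show-method formatting.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical capability bit, named for this sketch only. */
#define TOY_EVT_AN	(1u << 0)	/* async notification */

/*
 * Format the capability flags as one line of text.  With a single
 * known flag the result is either "AN\n" or just "\n"; further flags
 * would be added to the same line.  Returns the snprintf() result.
 */
static int toy_media_change_events_show(unsigned int caps, char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", (caps & TOY_EVT_AN) ? "AN" : "");
}
```

The point of the flag-list form is that a reader which only knows "AN" keeps working if another event type is added later, which a 0/1 file cannot express.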
Re: [PATCH] Add support for deferrable timers (respun-Mar28)
On Wed, 28 Mar 2007 16:00:21 -0700 Venki Pallipadi <[EMAIL PROTECTED]> wrote:
> Please drop the patch you included yesterday and two incremental patches and
> use the patch below.

As you saw, I went and turned it into an incremental patch again. It makes it easier to see what changed, but harder to see the whole thing.

> Introduce a new flag for timers - deferrable:

OK, but there's nothing in-kernel which actually uses this.

It would be good to identify some timer users which can be switched over (as many as possible, really) so this thing actually gets some runtime testing.
Re: [PATCH 0/4] UML - Code cleanups for 2.6.21
On Wed, 28 Mar 2007 11:28:45 -0400 Jeff Dike <[EMAIL PROTECTED]> wrote: > These are tidying patches from Blaisorblade - 2.6.21 material. The three net_kern.c patches invoked a reject storm against mainline, presumably because of uml-network-interface-hotplug-error-handling.patch. So I bumped those three patches into 2.6.22.
Re: [RFC] Virtual methods for devices and generalized GPIO support using it
Paul Sokolovsky wrote: By this criteria I happened to choose macros syntax. But it's still merely a syntax, and I don't pledge for it. If there's more movement towards using explicit low-level forms like 1) or 2) instead of introducing new syntactic pattern, then macro syntax can be considered to have fulfilled its introductory role and can be dropped. "Movement towards?!" That's been a fundamental part of Linux design since the very beginning. -hpa
Re: [RFC] Virtual methods for devices and generalized GPIO support using it
Hello H., Wednesday, March 28, 2007, 7:32:57 PM, you wrote: > Paul Sokolovsky wrote: >> >> In this respect, VTABLE(), METHOD() macros serve the same purpose as >> container_of() and list_for_each() - they are besides offering (more) >> convenient syntax, also carry important annotattion and educational >> messages, like "it's ok, and encouraged to embed one structure into >> another - use it!" or "list manipulation is a trivial operation for kernel, >> and we want you to treat it as such and use in standard, easily >> distinguishable way". >> > You realize, right, that the Linux kernel already have a much cleaner > way to do vtables in the kernel, without this kind of macro crappage? > It's called an _ops table, and is used in a patternized way: foo->x_ops->func(foo, ...); > ... all over the kernel. We like it that way. Sure! I wrote it's nothing really new. And I hope it's clear why those macros appeared in the first place: with the type of structures the device virtual methods are intended to be used, there're always pretty comprehensive member selection and typecasting is required. In this regard, there were 3 choices: 1. Use long but explicit expressions, like ((struct dev_pdata*)pdev.dev->platform_device)->x_ops->func(dev) 2. Use temporary variables: struct dev_pdata *tmp = (struct dev_pdata*)pdev.dev->platform_device; tmp->x_ops->func(dev); 3. Introduce macros which would hide guts and would provide syntax more resembling usual function call (especially for folks who remember that preprocessor is unalienable part of C ;-) ). As I also noted in the original mail, macros are also nice device for in-place annotation - to emphasize the fact that this is not just a mundane case of pointer manipulation, but paradigmatic thing. By this criteria I happened to choose macros syntax. But it's still merely a syntax, and I don't pledge for it. 
If there's more movement towards using explicit low-level forms like 1) or 2) instead of introducing a new syntactic pattern, then the macro syntax can be considered to have fulfilled its introductory role and can be dropped.

> -hpa

-- 
Best regards,
 Paul mailto:[EMAIL PROTECTED]
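The `_ops` idiom hpa refers to, combined with a plain temporary or member access as in option 2, fits in a few lines of ordinary C. This is a user-space sketch with invented names (`struct gpio_ops`, `struct gpio_dev`, `toy_get`, `toy_set`); it is not taken from any patch in this thread and assumes pin numbers below 32.

```c
struct gpio_dev;

/* Table of "virtual methods"; every name here is hypothetical. */
struct gpio_ops {
	int  (*get)(struct gpio_dev *dev, unsigned int pin);
	void (*set)(struct gpio_dev *dev, unsigned int pin, int value);
};

struct gpio_dev {
	const struct gpio_ops *ops;	/* shared, read-only method table */
	unsigned int state;		/* one bit per pin, for this toy only */
};

/* Read back the stored level of one pin (0 or 1). */
static int toy_get(struct gpio_dev *dev, unsigned int pin)
{
	return (dev->state >> pin) & 1u;
}

/* Store a new level for one pin. */
static void toy_set(struct gpio_dev *dev, unsigned int pin, int value)
{
	if (value)
		dev->state |= 1u << pin;
	else
		dev->state &= ~(1u << pin);
}

static const struct gpio_ops toy_gpio_ops = {
	.get = toy_get,
	.set = toy_set,
};
```

A caller then writes `dev->ops->set(dev, 3, 1);`, which is the `foo->x_ops->func(foo, ...)` shape quoted above, with no macro layer in between.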
Re: [patch] cache pipe buf page address for non-highmem arch
Andrew Morton wrote: > On Wed, 28 Mar 2007 16:21:04 -0700 > Zach Brown <[EMAIL PROTECTED]> wrote: > > >>> Does this look OK? >>> >> Almost... >> >> >>> #ifdef CONFIG_HIGHMEM >>> static inline void pipe_kunmap_atomic(void *addr, enum km_type type) >>> #else /* CONFIG_HIGHMEM */ >>> static inline void pipe_kunmap_atomic(struct page *page, enum >>> km_type type) >>> > > OK, I give up. What are you telling me here? > Also void *addr vs struct page *page. J
[patch 1/3] libata: check for AN support
Check to see if an ATAPI device supports Asynchronous Notification.
If so, enable it.

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-mm/drivers/ata/libata-core.c
===
--- 2.6-mm.orig/drivers/ata/libata-core.c
+++ 2.6-mm/drivers/ata/libata-core.c
@@ -71,6 +71,7 @@ const unsigned long sata_deb_timing_long
 static unsigned int ata_dev_init_params(struct ata_device *dev,
 					u16 heads, u16 sectors);
 static unsigned int ata_dev_set_xfermode(struct ata_device *dev);
+static unsigned int ata_dev_set_AN(struct ata_device *dev);
 static void ata_dev_xfermask(struct ata_device *dev);
 
 static unsigned int ata_print_id = 1;
@@ -1745,6 +1746,23 @@ int ata_dev_configure(struct ata_device
 		}
 		dev->cdb_len = (unsigned int) rc;
 
+		/*
+		 * check to see if this ATAPI device supports
+		 * Asynchronous Notification
+		 */
+		if (ata_id_has_AN(id))
+		{
+			/* issue SET feature command to turn this on */
+			rc = ata_dev_set_AN(dev);
+			if (rc) {
+				ata_dev_printk(dev, KERN_ERR,
+					"unable to set AN\n");
+				rc = -EINVAL;
+				goto err_out_nosup;
+			}
+			dev->flags |= ATA_DFLAG_AN;
+		}
+
 		if (ata_id_cdb_intr(dev->id)) {
 			dev->flags |= ATA_DFLAG_CDB_INTR;
 			cdb_intr_string = ", CDB intr";
@@ -3642,6 +3660,42 @@ static unsigned int ata_dev_set_xfermode
 }
 
 /**
+ *	ata_dev_set_AN - Issue SET FEATURES - SATA FEATURES
+ *	with sector count set to indicate
+ *	Asynchronous Notification feature
+ *	@dev: Device to which command will be sent
+ *
+ *	Issue SET FEATURES - SATA FEATURES command to device @dev
+ *	on port @ap.
+ *
+ *	LOCKING:
+ *	PCI/etc. bus probe sem.
+ *
+ *	RETURNS:
+ *	0 on success, AC_ERR_* mask otherwise.
+ */
+static unsigned int ata_dev_set_AN(struct ata_device *dev)
+{
+	struct ata_taskfile tf;
+	unsigned int err_mask;
+
+	/* set up set-features taskfile */
+	DPRINTK("set features - SATA features\n");
+
+	ata_tf_init(dev, &tf);
+	tf.command = ATA_CMD_SET_FEATURES;
+	tf.feature = SETFEATURES_SATA_ENABLE;
+	tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
+	tf.protocol = ATA_PROT_NODATA;
+	tf.nsect = SATA_AN;
+
+	err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0);
+
+	DPRINTK("EXIT, err_mask=%x\n", err_mask);
+	return err_mask;
+}
+
+/**
  *	ata_dev_init_params - Issue INIT DEV PARAMS command
  *	@dev: Device to which command will be sent
  *	@heads: Number of heads (taskfile parameter)
Index: 2.6-mm/include/linux/ata.h
===
--- 2.6-mm.orig/include/linux/ata.h
+++ 2.6-mm/include/linux/ata.h
@@ -193,6 +193,12 @@ enum {
 	SETFEATURES_WC_ON	= 0x02, /* Enable write cache */
 	SETFEATURES_WC_OFF	= 0x82, /* Disable write cache */
 
+	SETFEATURES_SATA_ENABLE = 0x10, /* Enable use of SATA feature */
+	SETFEATURES_SATA_DISABLE = 0x90, /* Disable use of SATA feature */
+
+	/* SETFEATURE Sector counts for SATA features */
+	SATA_AN			= 0x05, /* Asynchronous Notification */
+
 	/* ATAPI stuff */
 	ATAPI_PKT_DMA		= (1 << 0),
 	ATAPI_DMADIR		= (1 << 2),	/* ATAPI data dir:
@@ -298,6 +304,8 @@ struct ata_taskfile {
 #define ata_id_queue_depth(id)	(((id)[75] & 0x1f) + 1)
 #define ata_id_removeable(id)	((id)[0] & (1 << 7))
 #define ata_id_has_dword_io(id)	((id)[50] & (1 << 0))
+#define ata_id_has_AN(id)	\
+	((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))
 #define ata_id_iordy_disable(id) ((id)[49] & (1 << 10))
 #define ata_id_has_iordy(id) ((id)[49] & (1 << 9))
 #define ata_id_u32(id,n)	\
Index: 2.6-mm/include/linux/libata.h
===
--- 2.6-mm.orig/include/linux/libata.h
+++ 2.6-mm/include/linux/libata.h
@@ -136,6 +136,7 @@ enum {
 	ATA_DFLAG_CDB_INTR	= (1 << 2), /* device asserts INTRQ when ready for CDB */
 	ATA_DFLAG_NCQ		= (1 << 3), /* device supports NCQ */
 	ATA_DFLAG_FLUSH_EXT	= (1 << 4), /* do FLUSH_EXT instead of FLUSH */
+	ATA_DFLAG_AN		= (1 << 5), /* device supports Async notification */
 	ATA_DFLAG_CFG_MASK	= (1 << 8) - 1,
 
 	ATA_DFLAG_PIO		= (1 << 8), /* device limited to PIO mode */
-- 
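A note on the `ata_id_has_AN()` macro in the ata.h hunk, offered as a reading of the C and not as a tested kernel result: as far as I can tell the expression as posted can never be non-zero. `~id[76]` is computed after integer promotion, so it is non-zero for every 16-bit value; the `&&` therefore yields 0 or 1; and `1 & (1 << 5)` is 0. The sketch below puts the posted macro next to one possible explicit form. The helper name `toy_id_has_AN()` and the "word 76 is neither 0x0000 nor 0xffff" reading of the intent are my assumptions.

```c
#include <stdint.h>

/* The macro exactly as posted in the patch above. */
#define ata_id_has_AN(id) \
	((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))

/*
 * One possible explicit form of the same test (an assumption about
 * the intent): word 76 is neither 0x0000 nor 0xffff, and bit 5 of
 * word 78 is set.  The name is made up for this sketch.
 */
static int toy_id_has_AN(const uint16_t *id)
{
	return id[76] != 0x0000 && id[76] != 0xffff &&
	       (id[78] & (1 << 5)) != 0;
}
```

Feeding both an IDENTIFY buffer with a plausible word 76 and bit 5 of word 78 set shows the difference: the explicit form reports support, the posted macro evaluates to 0.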
[patch 3/3] libata: handle AN interrupt
When we get an SDB FIS with the 'N' bit set, we should send an event to user space to indicate that there has been a media change. The ahci host controller will send the event via KOBJ_CHANGE uevent.

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-mm/drivers/ata/ahci.c
===
--- 2.6-mm.orig/drivers/ata/ahci.c
+++ 2.6-mm/drivers/ata/ahci.c
@@ -1164,6 +1164,26 @@ static void ahci_host_intr(struct ata_po
 		return;
 	}
 
+	if (status & PORT_IRQ_SDB_FIS) {
+		/*
+		 * if this is an ATAPI device with AN turned on,
+		 * then we should interrogate the device to
+		 * determine the cause of the interrupt
+		 *
+		 * for AN - this we should check the SDB FIS
+		 * and find the I and N bits set
+		 */
+		const u32 *f = pp->rx_fis + RX_FIS_SDB;
+
+		/* check the 'N' bit in word 0 of the FIS */
+		if (f[0] & (1 << 15)) {
+			int port_addr = ((f[0] & 0x0f00) >> 8);
+			struct ata_device *adev = &ap->device[port_addr];
+			ata_port_printk(ap, KERN_INFO, "N bit set on SDB FIS!\n");
+			if (adev->flags & ATA_DFLAG_AN)
+				ata_async_notify(adev);
+		}
+	}
 	if (ap->sactive)
 		qc_active = readl(port_mmio + PORT_SCR_ACT);
 	else
Index: 2.6-mm/include/linux/libata.h
===
--- 2.6-mm.orig/include/linux/libata.h
+++ 2.6-mm/include/linux/libata.h
@@ -492,6 +492,7 @@ struct ata_device {
 	/* ACPI objects info */
 	acpi_handle		obj_handle;
 #endif
+	struct work_struct async_notify;
 };
 
 /* Offset into struct ata_device. Fields above it are maintained
@@ -826,6 +827,7 @@ extern void ata_scsi_slave_destroy(struc
 extern int ata_scsi_change_queue_depth(struct scsi_device *sdev,
 				       int queue_depth);
 extern struct ata_device *ata_dev_pair(struct ata_device *adev);
+extern void ata_async_notify(struct ata_device *atadev);
 extern int ata_do_set_mode(struct ata_port *ap, struct ata_device **r_failed_dev);
 extern u8 ata_irq_on(struct ata_port *ap);
 extern u8 ata_dummy_irq_on(struct ata_port *ap);
Index: 2.6-mm/drivers/ata/libata-core.c
===
--- 2.6-mm.orig/drivers/ata/libata-core.c
+++ 2.6-mm/drivers/ata/libata-core.c
@@ -1576,6 +1576,26 @@ static void ata_dev_config_ncq(struct at
 	snprintf(desc, desc_sz, "NCQ (depth %d/%d)", hdepth, ddepth);
 }
 
+static void async_notify_thread(struct work_struct *work)
+{
+	struct ata_device *atadev =
+		container_of(work, struct ata_device, async_notify);
+
+	/*
+	 * TBD - who should send this event? I couldn't find an
+	 * easy way to map an ata_device to a genhd device, so
+	 * decided maybe the ata host should send the event and
+	 * allow user space to figure out what happened?
+	 */
+	kobject_uevent(&atadev->ap->host->dev->kobj, KOBJ_CHANGE);
+}
+
+void ata_async_notify(struct ata_device *atadev)
+{
+	schedule_work(&atadev->async_notify);
+}
+
+
 /**
  *	ata_dev_configure - Configure the specified ATA/ATAPI device
  *	@dev: Target device to configure
@@ -1761,6 +1781,7 @@ int ata_dev_configure(struct ata_device
 				goto err_out_nosup;
 			}
 			dev->flags |= ATA_DFLAG_AN;
+			INIT_WORK(&dev->async_notify, async_notify_thread);
 		}
 
 		if (ata_id_cdb_intr(dev->id)) {
@@ -6650,6 +6671,7 @@ EXPORT_SYMBOL_GPL(ata_dummy_irq_on);
 EXPORT_SYMBOL_GPL(ata_irq_ack);
 EXPORT_SYMBOL_GPL(ata_dummy_irq_ack);
 EXPORT_SYMBOL_GPL(ata_dev_try_classify);
+EXPORT_SYMBOL_GPL(ata_async_notify);
 
 EXPORT_SYMBOL_GPL(ata_cable_40wire);
 EXPORT_SYMBOL_GPL(ata_cable_80wire);
-- 
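The bit-twiddling in the ahci.c hunk can be tried outside the kernel. The sketch below only mirrors the two masks the patch itself uses on the first dword of the received FIS (bit 15 for 'N', bits 11:8 for the port address); the `TOY_` names and `toy_sdb_notify_port()` are hypothetical, and I have not checked the masks against the SATA or AHCI specifications.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions copied from the ahci.c hunk above. */
#define TOY_SDB_N_BIT		(1u << 15)
#define TOY_SDB_PORT_MASK	0x0f00u
#define TOY_SDB_PORT_SHIFT	8

/*
 * Decode dword 0 of a Set Device Bits FIS the way the hunk does.
 * Returns true and stores the port address if the N bit is set;
 * otherwise returns false and leaves *port untouched.
 */
static bool toy_sdb_notify_port(uint32_t fis_dword0, unsigned int *port)
{
	if (!(fis_dword0 & TOY_SDB_N_BIT))
		return false;
	*port = (fis_dword0 & TOY_SDB_PORT_MASK) >> TOY_SDB_PORT_SHIFT;
	return true;
}
```

For example, a dword of 0x000083a1 has bit 15 set and decodes to port address 3, while 0x000003a1 is ignored.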
[patch 2/3] libata: expose AN support to user space via sysfs
Allow user space to determine if an ATAPI device supports async notification (AN) of media changes. This is done by adding a new sysfs file "async_notification" to genhd. If the file reads 1, then the device supports async notification. If the file reads 0, it does not. A flag is set in the generic disk to indicate whether or not AN is supported. This flag is set by the SCSI subsystem when it registers with add_disk. The SCSI system gets information from libata on whether the device supports AN during dev_configure.

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-mm/block/genhd.c
===
--- 2.6-mm.orig/block/genhd.c
+++ 2.6-mm/block/genhd.c
@@ -372,6 +372,11 @@ static ssize_t disk_size_read(struct gen
 {
 	return sprintf(page, "%llu\n", (unsigned long long)get_capacity(disk));
 }
+static ssize_t disk_AN_read(struct gendisk * disk, char *page)
+{
+	return sprintf(page, "%d\n",
+		(disk->flags & GENHD_FL_ASYNC_NOTIFICATION ? 1 : 0));
+}
 
 static ssize_t disk_stats_read(struct gendisk * disk, char *page)
 {
@@ -419,6 +424,10 @@ static struct disk_attribute disk_attr_s
 	.attr = {.name = "stat", .mode = S_IRUGO },
 	.show	= disk_stats_read
 };
+static struct disk_attribute disk_attr_AN = {
+	.attr = {.name = "media_change_events", .mode = S_IRUGO },
+	.show	= disk_AN_read
+};
 
 #ifdef CONFIG_FAIL_MAKE_REQUEST
 
@@ -455,6 +464,7 @@ static struct attribute * default_attrs[
 	&disk_attr_removable.attr,
 	&disk_attr_size.attr,
 	&disk_attr_stat.attr,
+	&disk_attr_AN.attr,
 #ifdef CONFIG_FAIL_MAKE_REQUEST
 	&disk_attr_fail.attr,
 #endif
Index: 2.6-mm/include/linux/genhd.h
===
--- 2.6-mm.orig/include/linux/genhd.h
+++ 2.6-mm/include/linux/genhd.h
@@ -94,6 +94,7 @@ struct hd_struct {
 
 #define GENHD_FL_REMOVABLE			1
 #define GENHD_FL_DRIVERFS			2
+#define GENHD_FL_ASYNC_NOTIFICATION		4
 #define GENHD_FL_CD				8
 #define GENHD_FL_UP				16
 #define GENHD_FL_SUPPRESS_PARTITION_INFO	32
Index: 2.6-mm/include/scsi/scsi_device.h
===
--- 2.6-mm.orig/include/scsi/scsi_device.h
+++ 2.6-mm/include/scsi/scsi_device.h
@@ -126,7 +126,7 @@ struct scsi_device {
 	unsigned fix_capacity:1;	/* READ_CAPACITY is too high by 1 */
 	unsigned guess_capacity:1;	/* READ_CAPACITY might be too high by 1 */
 	unsigned retry_hwerror:1;	/* Retry HARDWARE_ERROR */
-
+	unsigned async_notification:1;	/* device supports async notification */
 	unsigned int device_blocked;	/* Device returned QUEUE_FULL. */
 
 	unsigned int max_device_blocked; /* what device_blocked counts down from */
Index: 2.6-mm/drivers/ata/libata-scsi.c
===
--- 2.6-mm.orig/drivers/ata/libata-scsi.c
+++ 2.6-mm/drivers/ata/libata-scsi.c
@@ -899,6 +899,9 @@ static void ata_scsi_dev_config(struct s
 		blk_queue_max_hw_segments(q, q->max_hw_segments - 1);
 	}
 
+	if (dev->flags & ATA_DFLAG_AN)
+		sdev->async_notification = 1;
+
 	if (dev->flags & ATA_DFLAG_NCQ) {
 		int depth;
 
Index: 2.6-mm/drivers/scsi/sr.c
===
--- 2.6-mm.orig/drivers/scsi/sr.c
+++ 2.6-mm/drivers/scsi/sr.c
@@ -603,6 +603,8 @@ static int sr_probe(struct device *dev)
 
 	dev_set_drvdata(dev, cd);
 	disk->flags |= GENHD_FL_REMOVABLE;
+	if (sdev->async_notification)
+		disk->flags |= GENHD_FL_ASYNC_NOTIFICATION;
 	add_disk(disk);
 
 	sdev_printk(KERN_DEBUG, sdev,
-- 
Re: Corrupt XFS -Filesystems on new Hardware and Kernel
On Wed, Mar 28, 2007 at 02:42:00PM +0200, Oliver Joa wrote:
> Hi,
>
> David Chinner wrote:
>
> [...]
>
> >What is the corruption message in the log from XFS?
> >Can you please post that? Without it we really can't help you.
> >
> >Also, please check to see if there are any I/O errors
> >in the log around the time the corruption message appears.
>
> Ok, here is a test:
>
> test:/# find / -xdev | cpio -padm /test/
> cpio: /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt:
> Structure needs cleaning
> 3648371 blocks
> test:/#
>
> test:/home/olli# uname -a
> Linux test 2.6.20.4-majestix-1 #1 SMP PREEMPT Tue Mar 27 12:15:41 CEST
> 2007 i686 GNU/Linux
>
> dmesg gives the following:
> [15442.935941] Filesystem "sda3": XFS internal error xfs_iformat(6) at
> line 492 of file fs/xfs/xfs_inode.c. Caller 0xc0211f94
> [15442.936003] [] xfs_iread+0x4ee/0x6e8
> [15442.936039] [] xfs_iget+0x2e4/0x714
> [15442.936071] [] xfs_iget+0x2e4/0x714
> [15442.936101] [] xfs_dir_lookup_int+0x7d/0xd4

So we have a corrupt inode. The error tells me that the corrupted
inode is either a regular file, directory or link. Unfortunately
it doesn't tell us the inode number that is corrupted.

> test:/# rm /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt
> rm: cannot remove
> `/usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt':
> Structure needs cleaning
> test:/#

Once the filesystem shuts down this will happen to every operation.

Next time you get a shutdown, can you unmount the filesystems and
run xfs_check and then "xfs_repair -n" on the filesystem? These
will tell you the inode numbers that are bad. Can you post the
errors reported by these tools?

Once you have the bad inode numbers, can you run the following
on the bad inodes:

# xfs_db -r -c "inode " -c "p"

E.g.:

# xfs_db -r -c "inode 128" -c p /dev/sdb8
core.magic = 0x494e
core.mode = 040755
core.version = 2
core.format = 2 (extents)
..

and post the output for us? That will enable us to see exactly what
the corruption is on the inode.

Cheers,

Dave.

>
> I got:
>
> [18359.750604] Filesystem "sda3": XFS internal error xfs_iformat(6) at
> line 492 of file fs/xfs/xfs_inode.c. Caller 0xc0211f94
> [18359.750701] [] xfs_iread+0x4ee/0x6e8
> [18359.750755] [] xfs_iget+0x2e4/0x714
> [18359.750802] [] xfs_iget+0x2e4/0x714
> [18359.750849] [] xfs_dir_lookup_int+0x7d/0xd4
> [18359.750897] [] xfs_lookup+0x52/0x78
> [18359.750943] [] xfs_vn_lookup+0x3b/0x70
> [18359.750990] [] do_lookup+0xa3/0x140
> [18359.751036] [] __link_path_walk+0x73d/0xb5e
> [18359.751086] [] link_path_walk+0x44/0xb3
> [18359.751133] [] rb_insert_color+0x4c/0xad
> [18359.751180] [] vma_link+0x54/0xcd
> [18359.751226] [] do_path_lookup+0x176/0x191
> [18359.751273] [] getname+0x59/0x8f
> [18359.751318] [] __user_walk_fd+0x2f/0x45
> [18359.751364] [] vfs_lstat_fd+0x16/0x3d
> [18359.751410] [] rb_insert_color+0x4c/0xad
> [18359.751457] [] vma_link+0x54/0xcd
> [18359.751501] [] sys_lstat64+0xf/0x23
> [18359.751546] [] do_page_fault+0x277/0x526
> [18359.751595] [] do_page_fault+0x0/0x526
> [18359.751640] [] syscall_call+0x7/0xb
> [18359.751686] [] rsc_parse+0x6f/0x37f
> [18359.751732] ===
> [18359.751784] Filesystem "sda3": XFS internal error xfs_iformat(6) at
> line 492 of file fs/xfs/xfs_inode.c. Caller 0xc0211f94
> [18359.751859] [] xfs_iread+0x4ee/0x6e8
> [18359.751906] [] xfs_iget+0x2e4/0x714
> [18359.751952] [] xfs_iget+0x2e4/0x714
> [18359.751998] [] xfs_dir_lookup_int+0x7d/0xd4
> [18359.752047] [] xfs_lookup+0x52/0x78
> [18359.752094] [] xfs_vn_lookup+0x3b/0x70
> [18359.752140] [] __lookup_hash+0xb1/0xe1
> [18359.752191] [] do_unlinkat+0x5f/0x126
> [18359.752237] [] do_page_fault+0x277/0x526
> [18359.752285] [] syscall_call+0x7/0xb
> [18359.752331] [] rsc_parse+0x6f/0x37f
> [18359.752376] ===
>
> Thanks a Lot
>
> Oliver

--
Dave Chinner
Principal Engineer
SGI Australian Software Group
[patch 0/3] Asynchronous Notification for SATA ATAPI devices
This patch series implements Asynchronous Notification (AN) for SATA
ATAPI devices as defined in SATA 2.5 and AHCI 1.1. Drives which support
this feature will send a notification when new media is inserted into
the drive, preventing the need for user space to poll for new media.

This support is exposed to user space via a file in sysfs
(/sys/block/sr*) called "media_change_events". If the drive supports AN,
this file will read 1, otherwise 0. User space can disable polling for
new media if this file reads 1. When new media is inserted into the
ATAPI drive, the ahci driver will send a KOBJ_CHANGE event.

I would really like feedback on the user interface - both the location
of the sysfs file which indicates AN support, as well as the type of
uevent etc. I have not yet tested AN on eject (I assume it doesn't
require anything special) as my test drive which supports AN is a bit
"quirky" in this respect.

Please take a look and let me know what you think.

Thanks,
Kristen

--
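A minimal user-space sketch of the check described in the cover letter above: read the sysfs file and decide whether polling can be turned off. The helper name read_an_flag() is hypothetical, and the device name sr0 in the usage note is an assumption; only the "media_change_events" file name and its 1/0 contents come from the posting.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if the file reads 1 (AN supported),
 * 0 if it reads 0 (no AN, keep polling), and -1 if the file is missing
 * or does not contain 0 or 1.
 */
static int read_an_flag(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);

	return (value == 0 || value == 1) ? value : -1;
}
```

A caller would pass something like "/sys/block/sr0/media_change_events" and keep polling for media unless the result is 1.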
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
On Wed, 28 Mar 2007, William Lee Irwin III wrote:

> As far as kernel compiles being relevant to anything besides
> potentially optimizing a particular major benchmark using gcc as one
> of its components... yeah, right. It's too macro to be a microbenchmark
> of anything and too micro to be pertinent to any meaningful
> macrobenchmark such as those from major benchmark publishers (who can't
> be named for trademark/etc. reasons). Hasn't it been at least 5 years
> since people figured out kernel compiles were complete bulls**t as
> benchmarks along with dbench for other reasons and several others? If
> not, I don't know why I bother with this kernel at all.

All benchmarks have their specific drawbacks. I personally like to do
code review and see what cachelines are touched, but that is basically
imagining what a cpu does. One's thinking may be led astray.

> Even so, I already did this and am done with it. It's not like I'm
> not carrying around numerous patches I know will never be merged all
> the time anyway. If you want to back it all out so badly, just do it
> and stop bothering me about it, and I'll merely continue maintaining my
> local patches without ever posting them as I have been for years. I'm
> not at all happy with the NIH situation, either, not that I'm at such a
> loss for ideas to need to contest every petty NIH that flies past.

What is NIH? My main concern is to get the use of page struct fields of
the slab removed. We have to do special things to page sized allocs
because of these page struct uses. For example, the private field is used
by compound pages, and if any of the slabs allocate a higher order page
that field will be in use.

We certainly see even from the rudimentary tests that I have done that
the limited pgd, pmd caching has some effect.

Could we please see your local patches? And I guess that you must have
some sort of benchmark that you run to test these?
Re: RSDL v0.31
Linus Torvalds wrote:
> On Tue, 20 Mar 2007, Willy Tarreau wrote:
>> Linus, you're unfair with Con. He initially was on this position, and
>> lately worked with Mike by proposing changes to try to improve his X
>> responsiveness.
>
> I was not actually so much speaking about Con, as about a lot of the
> tone in general here. And yes, it's not been entirely black and white. I
> was very happy to see the "try this patch" email from Al Boldi - not
> because I think that patch per se was necessarily the right fix (I have
> no idea), but simply because I think that's the kind of mindset we need
> to have.
>
> Not a lot of people really *like* the old scheduler, but it's been
> tweaked over the years to try to avoid some nasty behaviour. I'm really
> hoping that RSDL would be a lot better (and by all accounts it has the
> potential for that), but I think it's totally naïve to expect that it
> won't need some tweaking too.
>
> So I'll happily still merge RSDL right after 2.6.21 (and it won't even
> be a config option - if we want to make it good, we need to make sure
> *everybody* tests it), but what I want to see is that "can do" spirit
> wrt tweaking for issues that come up.

May I suggest that if you want proper testing that it not only should be
a config option but a boot time option as well? Otherwise people will be
comparing an old scheduler with an RSDL kernel, and they will diverge as
time goes on.

More people would be willing to reboot and test on a similar load than
will keep two versions of the kernel around. And if you get people
testing RSDL against a vendor kernel which might be hacked, it will be
even less meaningful.

Please consider the benefits of making RSDL the default scheduler, and
leaving people with the old scheduler with an otherwise identical kernel
as a fair and meaningful comparison.

There, that's a technical argument ;-)

> Because let's face it - nothing is ever perfect. Even a really nice
> conceptual idea always ends up hitting the "but in real life, things are
> ugly and complex, and we've depended on behaviour X in the past and
> can't change it, so we need some tweaking for problem Y".
>
> And everything is totally fixable - at least as long as people are
> willing to!
>
> Linus

--
Bill Davidsen <[EMAIL PROTECTED]>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
Re: [PATCH] max_loop limit
On Sun, Mar 25, 2007 at 10:40:10AM +0200, Tomas M wrote:
> >here's one. Allocates all the fluff dynamically. It does not create any
> >dev nodes by itself, so you need to do it (à la mdadm)
>
> I'm afraid that this would break a lot of things, for example mount -o
> loop will not work anymore unless you create /dev/loop* manually first,

Yes, "losetup" and "mount -o loop" call stat( /dev/loopN ) when they
look for an (un)used loop device.

> am I correct? In this case, this is unusable for many as it is not
> backward compatible with old loop.c, am I correct?

udev ?

Karel

--
Karel Zak <[EMAIL PROTECTED]>
Re: RSDL v0.31
David Schwartz wrote:
>> there were multiple attempts with renicing X under the vanilla
>> scheduler, and they were utter failures most of the time. _More_ people
>> complained about interactivity issues _after_ X has been reniced to -5
>> (or -10) than people complained about "nice 0" interactivity issues to
>> begin with.
>
> Unfortunately, nicing X is not going to work. It causes X to pre-empt
> any local process that tries to batch requests to it, defeating the
> batching. What you really want is X to get scheduled after the client
> pauses in sending data to it or has sent more than a certain amount. It
> seems kind of crazy to put such logic in a scheduler.
>
> Perhaps when one process unblocks another, you put that other process at
> the head of the run queue but don't pre-empt the currently running
> process. That way, the process can continue to batch requests, but X's
> maximum latency delay will be the quantum of the client program.

In general I think that's the right idea. See below for more...

>> The vanilla scheduler's auto-nice feature rewards _behavior_, so it
>> gets X right most of the time. The fundamental issue is that sometimes
>> X is very interactive - we boost it then, there's lots of scheduling
>> but nice low latencies. Sometimes it's a hog - we penalize it then and
>> things start to batch up more and we get out of the overload situation
>> faster. That's the case even if all you care about is desktop
>> performance.
>>
>> no doubt it's hard to get the auto-nice thing right, but one thing is
>> clear: currently RSDL causes problems in areas that worked well in the
>> vanilla scheduler for a long time, so RSDL needs to improve. RSDL
>> should not lure itself into the false promise of 'just renice X
>> statically'. It wont work. (You might want to rewrite X's request
>> scheduling - but if so then i'd like to see that being done _first_,
>> because i just dont trust such 10-mile-distance problem analysis.)
>
> I am hopeful that there exists a heuristic that both improves this
> problem and is also inherently fair. If that's true, then such a
> heuristic can be added to RSDL without damaging its properties and
> without requiring any special settings. Perhaps longer-term latency
> benefits to processes that have yielded in the past?
>
> I think there are certain circumstances, however, where it is inherently
> reasonable to insist that 'nice' be used. If you want a CPU-starved task
> to get more than 1/X of the CPU, where X is the number of CPU-starved
> tasks, you should have to ask for that. If you want one CPU-starved task
> to get better latency than other CPU-starved tasks, you should have to
> ask for that.

I agree for giving a process more than a fair share, but I don't think
"latency" is the best term for what you describe later. If you think of
latency as the time between a process unblocking and the time when it
gets CPU, that is a more traditional interpretation. I'm not really sure
latency and CPU-starved are compatible.

I would like to see processes at the head of the queue (for latency)
which were blocked for long term events, keyboard input, network input,
mouse input, etc. Then processes blocked for short term events like
disk, then processes which exhausted their time slice. This helps
latency and responsiveness, while keeping all processes running.

A variation is to give those processes at the head of the queue short

> Fundamentally, the scheduler cannot do it by itself. You can create
> cases where the load is precisely identical and one person wants X and
> another person wants Y. The scheduler cannot know what's important to
> you.
>
> DS

--
Bill Davidsen <[EMAIL PROTECTED]>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
Re: max_loop limit
On Thu, Mar 22, 2007 at 04:09:13PM +, Pádraig Brady wrote:
> William Lee Irwin III wrote:
> > Any chance we can get some kind of devices set up for partitions of
> > loop devices if we're going to redo loopdev setup? That's been a thorn
> > in my side for some time.
>
> This script might be of use:
> http://www.pixelbeat.org/scripts/lomount.sh

Ah, lomount... very popular name ;-) Xen guys have lomount too.

Unfortunately, these solutions are useless with LVM volumes. kpartx is
more usable:

http://fedoraproject.org/wiki/FedoraXenQuickstartFC6?highlight=%28Xen%29#head-9c5408e750e8184aece3efe822be0ef6dd1871cd

Karel

--
Karel Zak <[EMAIL PROTECTED]>
Re: [patch] cache pipe buf page address for non-highmem arch
On Wed, 28 Mar 2007 16:21:04 -0700 Zach Brown <[EMAIL PROTECTED]> wrote:

> > Does this look OK?
>
> Almost...
>
> > #ifdef CONFIG_HIGHMEM
> > static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
> > #else /* CONFIG_HIGHMEM */
> > static inline void pipe_kunmap_atomic(struct page *page, enum km_type type)
>

OK, I give up. What are you telling me here?

argh, enum km_type isn't defined if !CONFIG_HIGHMEM, which is
extravagantly dumb.


From: Andrew Morton <[EMAIL PROTECTED]>

Cc: "Ken Chen" <[EMAIL PROTECTED]>
Cc: Zach Brown <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/pipe.c | 31 +--
 1 files changed, 25 insertions(+), 6 deletions(-)

diff -puN fs/pipe.c~cache-pipe-buf-page-address-for-non-highmem-arch-fix-tidy fs/pipe.c
--- a/fs/pipe.c~cache-pipe-buf-page-address-for-non-highmem-arch-fix-tidy
+++ a/fs/pipe.c
@@ -22,17 +22,36 @@
 #include
 
 #ifdef CONFIG_HIGHMEM
-#define pipe_kmap		kmap
-#define pipe_kmap_atomic	kmap_atomic
-#define pipe_kunmap		kunmap
-#define pipe_kunmap_atomic	kunmap_atomic
+static inline void *pipe_kmap(struct page *page)
+{
+	return kmap(page);
+}
+
+static inline void pipe_kunmap(struct page *page)
+{
+	kunmap(page);
+}
+
+static inline void *pipe_kmap_atomic(struct page *page, enum km_type type)
+{
+	return kmap_atomic(page, type);
+}
+
+static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
+{
+	kunmap_atomic(addr, type);
+}
 #else /* CONFIG_HIGHMEM */
 static inline void *pipe_kmap(struct page *page)
 {
-	return (void *) page->private;
+	return (void *)page->private;
 }
+
+static inline void pipe_kunmap(struct page *page)
+{
+}
+
 #define pipe_kmap_atomic(page, type)	pipe_kmap(page)
-#define pipe_kunmap(page)		do { } while (0)
 #define pipe_kunmap_atomic(page, type)	do { } while (0)
 #endif
 
_
[PATCH][2.6.21] uml: fix unreasonably long udelay
Currently we have a confused udelay implementation.

* __const_udelay does not accept usecs but xloops in i386 and x86_64
* our implementation requires usecs as arg
* it gets a xloops count when called by asm/arch/delay.h

Bugs related to this (extremely long shutdown times) were reported by
some x86_64 users, especially using Device Mapper.

To hit this bug, a compile-time constant time parameter must be passed -
that's why UML seems to work most times. Fix this with a simple udelay
implementation.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]>
---

 arch/um/sys-i386/delay.c | 11 ---
 arch/um/sys-x86_64/delay.c | 11 ---
 include/asm-um/delay.h | 17 ++---
 3 files changed, 14 insertions(+), 25 deletions(-)

diff --git a/arch/um/sys-i386/delay.c b/arch/um/sys-i386/delay.c
index 2c11b97..d623e07 100644
--- a/arch/um/sys-i386/delay.c
+++ b/arch/um/sys-i386/delay.c
@@ -27,14 +27,3 @@ void __udelay(unsigned long usecs)
 }
 
 EXPORT_SYMBOL(__udelay);
-
-void __const_udelay(unsigned long usecs)
-{
-	int i, n;
-
-	n = (loops_per_jiffy * HZ * usecs) / MILLION;
-for(i=0;i
[...]
 2) ? \
+	__bad_udelay() : __udelay(n))
+
+/* It appears that ndelay is not used at all for UML, and has never been
+ * implemented. */
+extern void __unimplemented_ndelay(void);
+#define ndelay(n) __unimplemented_ndelay()
+
 #endif
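The unit mix-up described in the udelay posting above can be sketched in user space. This is an illustration under stated assumptions, not the kernel code: the scale factor 0x10c7 (2^32 / 10^6, rounded up) is what the i386 udelay() macro is recalled to use when converting usecs to xloops, and the function names and the loops_per_sec parameter are hypothetical.

```c
#include <stdint.h>

/* Assumed scale factor: xloops per microsecond, i.e. 2^32 / 10^6 rounded up. */
#define XLOOPS_PER_USEC 0x10c7ULL

/* What the arch header does before calling __const_udelay() (assumption). */
static uint64_t usecs_to_xloops(uint64_t usecs)
{
	return usecs * XLOOPS_PER_USEC;
}

/* Correct interpretation: xloops is a 32.32 fixed-point fraction of a second. */
static uint64_t loops_from_xloops(uint64_t xloops, uint64_t loops_per_sec)
{
	return (xloops * loops_per_sec) >> 32;
}

/* The removed UML behaviour: the same argument treated as plain usecs. */
static uint64_t loops_treating_arg_as_usecs(uint64_t arg, uint64_t loops_per_sec)
{
	return (loops_per_sec * arg) / 1000000ULL;
}
```

Under these assumptions, a 10 usec delay at a hypothetical 10^9 loops per second should spin about 10000 loops, while the mismatched version spins 42950000, roughly 4295 times too long, which is consistent with the reported extremely long shutdown times.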
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
On Wed, 28 Mar 2007 15:01:31 -0700 William Lee Irwin III <[EMAIL PROTECTED]> wrote:

> On Wed, Mar 28, 2007 at 02:38:55PM -0700, Christoph Lameter wrote:
> > No that was described in the patch. Quote:
> > "i386 only provides support for caching constructed pgd and pmds. These
> > are comparatively rare to ptes so it is no surprise that the current
> > approach has only minimal effect. "

Whatever it was originally for and public or not, the above isn't true
for some non Intel products...
Re: Odd log message associated with NFS
On Wed, Mar 28, 2007 at 07:05:36PM +, Thorsten Kranzkowski wrote:
> I'll let a tcpdump run this evening and see if I can correlate the message
> with anything.
>
> If you have a printk or other patch for me to try, just let me know.

Well, just for fun, you could try something like this--should dump some
data the first time it hits the "bad direction" error.

--b.

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index 9e340fa..c2349d2 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -35,6 +35,45 @@ struct xdr_netobj {
  */
 typedef int	(*kxdrproc_t)(void *rqstp, __be32 *data, void *obj);
 
+/* dump the buffer in `emacs-hexl' style */
+#define isprintable(c) ((c > 0x1f) && (c < 0x7f))
+
+static inline void dump_hex(void *p, u_int length)
+{
+	u_int i, j, jm;
+	u8 c, *cp;
+
+	printk("RPC: print_hexl: length %d\n",length);
+	cp = p;
+
+	for (i = 0; i < length; i += 0x10) {
+		printk("  %04x: ", (u_int)i);
+		jm = length - i;
+		jm = jm > 16 ? 16 : jm;
+
+		for (j = 0; j < jm; j++) {
+			if ((j % 2) == 1)
+				printk("%02x ", (u_int)cp[i+j]);
+			else
+				printk("%02x", (u_int)cp[i+j]);
+		}
+		for (; j < 16; j++) {
+			if ((j % 2) == 1)
+				printk("   ");
+			else
+				printk("  ");
+		}
+		printk(" ");
+
+		for (j = 0; j < jm; j++) {
+			c = cp[i+j];
+			c = isprintable(c) ? c : '.';
+			printk("%c", c);
+		}
+		printk("\n");
+	}
+}
+
 /*
  * Basic structure for transmission/reception of a client XDR message.
  * Features a header (for a linear buffer containing RPC headers
@@ -61,6 +100,18 @@ struct xdr_buf {
 
 };
 
+static inline void dump_xdr_buf(struct xdr_buf *buf)
+{
+	printk("buf->head[0].iov_base = %p, buf->head[0].iov_len = %d\n",
+		buf->head[0].iov_base, buf->head[0].iov_len);
+	printk("buf->tail[0].iov_base = %p, buf->tail[0].iov_len = %d\n",
+		buf->tail[0].iov_base, buf->tail[0].iov_len);
+	printk("pages = %p, page_base = %d, page_len = %d\n",
+		buf->pages, buf->page_base, buf->page_len);
+	printk("buflen = %d, len = %d\n", buf->buflen, buf->len);
+	return;
+}
+
 /*
  * pre-xdr'ed macros.
  */
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index b4db53f..977056e 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -776,6 +776,26 @@ svc_register(struct svc_serv *serv, int proto, unsigned short port)
 	return error;
 }
 
+static void
+dump_once(struct svc_rqst *rqstp, __be32 *orig_start)
+{
+	static int done = 0;
+	struct kvec *argv = &rqstp->rq_arg.head[0];
+	char buf[RPC_MAX_ADDRBUFLEN];
+
+	if (done)
+		return;
+	done++;
+
+	printk("dumping request; rq_addr = %s, rq_deferred = %p, rq_arg:\n",
+		svc_print_addr(rqstp, buf, sizeof(buf)), rqstp->rq_deferred);
+	dump_xdr_buf(&rqstp->rq_arg);
+
+	printk("head data (from %p):\n", orig_start);
+	dump_hex(orig_start, (argv->iov_base + argv->iov_len)
+				- (void *)orig_start);
+}
+
 /*
  * Process the RPC request.
  */
@@ -794,6 +814,7 @@ svc_process(struct svc_rqst *rqstp)
 	__be32			auth_stat, rpc_stat;
 	int			auth_res;
 	__be32			*reply_statp;
+	__be32			*start;
 
 	rpc_stat = rpc_success;
 
@@ -819,6 +840,7 @@ svc_process(struct svc_rqst *rqstp)
 	if (rqstp->rq_prot == IPPROTO_TCP)
 		svc_putnl(resv, 0);
 
+	start = argv->iov_base;
 	rqstp->rq_xid = svc_getu32(argv);
 	svc_putu32(resv, rqstp->rq_xid);
 
@@ -971,6 +993,7 @@ err_short_len:
 err_bad_dir:
 	if (net_ratelimit())
 		printk("svc: bad direction %d, dropping request\n", dir);
+	dump_once(rqstp, start);
 
 	serv->sv_stats->rpcbadfmt++;
 	goto dropit;			/* drop request */
Re: [patch] cache pipe buf page address for non-highmem arch
> Does this look OK?

Almost...

> #ifdef CONFIG_HIGHMEM
> static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
> #else /* CONFIG_HIGHMEM */
> static inline void pipe_kunmap_atomic(struct page *page, enum km_type type)

- z
Inlining can be _very_bad...
Hi all...

I post this here as it can be of direct interest for kernel development
(as I recall many discussions about inlining yes or no...).

Testing other problems, I finally hit this issue: the same short and
stupid loop took 3 to 5 times longer when it was in main() than when it
was in an out-of-line function. The same (bad thing) happens if the
function is inlined.

The basic code is like this:

float data[];

[inline]
double one()
{
    double sum;
    sum = 0;
    for (i=0; i
[...]
 tst
T0: 1145.12 ms
S0: 268435456.00
T1: 457.19 ms
S1: 268435456.00

With one() inlined:
apolo:~/e4> tst
T0: 1200.52 ms
S0: 268435456.00
T1: 1200.14 ms
S1: 268435456.00

Looking at the assembler, the non-inlined version does:

.L2:
    cvtss2sd (%rdx,%rax,4), %xmm0
    incq     %rax
    cmpq     $268435456, %rax
    addsd    %xmm0, %xmm1
    jne      .L2

and the inlined

.L13:
    cvtss2sd (%rdx,%rax,4), %xmm0
    incq     %rax
    cmpq     $268435456, %rax
    addsd    8(%rsp), %xmm0
    movsd    %xmm0, 8(%rsp)
    jne      .L13

It looks like it is updating the stack on each iteration... This is
-march=opteron code, the -march=pentium4 is similar. Same behaviour with
gcc3 and gcc4.

tst.c and Makefile attached.

Nice, isn't it ? Please, show me where my fault is...

--
J.A. Magallon \ Software is like sex:
               \ It's better when it's free
Mandriva Linux release 2007.1 (Cooker) for i586
Linux 2.6.20-jam06 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #1 SMP PREEMPT

Makefile
Description: Binary data

#include
#include
#include

#define SIZE 256*1024*1024
#define elap(t0,t1) \
	((1000*t1.tv_sec+0.001*t1.tv_usec) - (1000*t0.tv_sec+0.001*t0.tv_usec))

double one();

float *data;

#ifdef INLINE
inline
#endif
double one()
{
	int i;
	double sum;

	sum = 0;
	asm("#FBGN");
	for (i=0; i
Re: [patch] cache pipe buf page address for non-highmem arch
On Tue, 27 Mar 2007 15:57:53 -0700 Zach Brown <[EMAIL PROTECTED]> wrote:

> > +#define pipe_kmap_atomic(page, type) pipe_kmap(page)
> > +#define pipe_kunmap(page) do { } while (0)
> > +#define pipe_kunmap_atomic(page, type) do { } while (0)
>
> Please don't drop arguments in stubs. It can let completely broken
> code compile, like:
>
> pipe_kunmap(SOME_COMPLETE_NONSENSE);
>
> Static inlines with empty bodies are the gold standard.
>

yup. Does this look OK?

#ifdef CONFIG_HIGHMEM
static inline void *pipe_kmap(struct page *page)
{
	return kmap(page);
}

static inline void pipe_kunmap(struct page *page)
{
	kunmap(page);
}

static inline void *pipe_kmap_atomic(struct page *page, enum km_type type)
{
	return kmap_atomic(page, type);
}

static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
{
	kunmap_atomic(addr, type);
}
#else /* CONFIG_HIGHMEM */
static inline void *pipe_kmap(struct page *page)
{
	return (void *)page->private;
}

static inline void pipe_kunmap(struct page *page)
{
}

static inline void *pipe_kmap_atomic(struct page *page, enum km_type type)
{
	return (void *)page->private;
}

static inline void pipe_kunmap_atomic(struct page *page, enum km_type type)
{
}
#endif
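The point about stubs that drop their arguments can be shown with a small stand-alone sketch. Everything here is hypothetical and only mimics the shape of the pipe helpers: struct page is a stand-in, and macro_kunmap(), inline_kunmap() and stub_demo() are made-up names.

```c
/* Stand-in type for the sketch; not the kernel's struct page. */
struct page {
	unsigned long private;
};

/* Macro stub: the preprocessor discards the argument before the
 * compiler ever sees it, so nothing about it is checked. */
#define macro_kunmap(page)	do { } while (0)

/* Inline stub: the empty body costs nothing, but the argument must
 * still exist and be convertible to struct page *. */
static inline void inline_kunmap(struct page *page)
{
	(void)page;
}

static int stub_demo(void)
{
	struct page pg = { 0 };

	/* Compiles even though the identifier is declared nowhere. */
	macro_kunmap(this_identifier_does_not_exist);

	/* Passing the undeclared identifier here would be a compile error. */
	inline_kunmap(&pg);

	return 0;
}
```

With the inline form, a call such as pipe_kunmap(SOME_COMPLETE_NONSENSE) fails to build in both configurations rather than only when CONFIG_HIGHMEM is set.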
Re: possible mistake in linux kernel header file -- kernel: 2.6.16.29 file: mod_devicetable.h
smitchel wrote:
> I am not sure where to post this, maybe you can direct me what to do, if
> anything.
> We have two computers running slackware for amd64 version 11.0.
> Tonight we compiled mplayer on each of the systems.
> On the first, everything compiled fine--it has a core 2 duo cpu and is
> running a stock kernel off the install DVD for slackware-amd64. it is
> kernel 2.6.16.29.
> On the second it would not compile, and it has dual opteron 250 cpus and
> is running a kernel that we compiled to add some things to for sound,
> etc. This was from a kernel source that we downloaded a few days ago.
> it is kernel 2.6.16.29--same as first machine.
> The error is stopping in the file /usr/include/linux/mod_devicetable.h.
> It appears that there are 4 extra lines that have been added to the
> mod_devicetable.h that was part of the kernel source that we downloaded.
> They are in the first screenful of the file:
>
> #ifdef __KERNEL__
> #include
> typedef unsigned long kernel_ulong_t;
> #endif
>
> They are not in the same file in the kernel source from the slackware
> amd-64 install DVD. ( included somewhere else?)
> Googling we found: __KERNEL__ is defined for programs that run in kernel
> mode instead of user programs (whatever that means).
> A few lines later in mod_devicetable.h it uses the type kernel_ulong_t
> (in the same file--what if the ifdef path is not taken?)

These compile errors are from compiling mplayer? Something is not right
here, it shouldn't be including that header file at all - and I'm not
sure how anything in /usr/include could be ending up trying to do so.

__KERNEL__ is only supposed to be defined when building the kernel
itself. Current kernels (not sure if 2.6.16 had this though) have a
process which generates header files suitable for userspace from the
kernel's header files and strips out everything inside #ifdef __KERNEL__.

--
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: [PATCH] sched: staircase deadline misc fixes
On Thursday 29 March 2007 04:48, Ingo Molnar wrote:
> hm, how about the questions Mike raised (there were a couple of cases of
> friction between 'the design as documented and announced' and 'the code
> as implemented')? As far as i saw they were still largely unanswered -
> but let me know if they are all answered and addressed:

I spent less time emailing and more time coding. I have been working on
addressing whatever people brought up.

> http://marc.info/?l=linux-kernel&m=117465220309006&w=2

Attended to.

> http://marc.info/?l=linux-kernel&m=117489673929124&w=2

Attended to.

> http://marc.info/?l=linux-kernel&m=117489831930240&w=2

Checked fine.

> and the numbers he posted:
>
> http://marc.info/?l=linux-kernel&m=117448900626028&w=2

Attended to.

> his test conclusion was that under CPU load, RSDL (SD) generally does
> not hold up to mainline's interactivity.

There have been improvements since the earlier iterations but it's still
a fairness based design. Mike's "sticking point" test case should be
improved as well.

My call based on my own testing and feedback from users is:

Under niced loads it is 99% in favour of SD.

Under light loads it is 95% in favour of SD.

Under Heavy loads it becomes proportionately in favour of mainline. The
crossover is somewhere around a load of 4.

If the reluctance to renice X goes away I'd say it was 99% across the
board and to much higher loads.

> Ingo

--
-ck
Re: [RFT] e100 driver on ARM
Lennert Buytenhek wrote:
> On Mon, Sep 04, 2006 at 06:39:29AM -0400, Jeff Garzik wrote:
>
>> 1) Does e100 driver work on ARM?
>
> FWIW, e100 seems to work okay for me on an intel ixp2400 (xscale based)
> board, an ixp2850 (xscale based) board and an ixp2350 (xscale3 based)
> board. ixp2350 works both with hardware coherency turned on (cpu snoops
> bus) and turned off (manual dma cache clean/invalidate as usual.)
>
> As for the other ARM platforms that I'm interested in / have hardware
> for / maintain, the at91/ep93xx/pxa270 don't have PCI, and the other
> two (iop32x/iop33x) I can't test because I don't have such systems with
> e100 NICs, but I expect those would work, since they're both xscale
> based like the ixp2400, and the ixp2400 works.

I just got an iop342 board dropped on my lap. Once it's running, I'll
make sure to make this the first thing to test.

Cheers,

Auke
Re: [PATCH] Add support for deferrable timers (respun-Mar28)
Andrew,

Please drop the patch you included yesterday and two incremental patches
and use the patch below. This patch is - yesterday's patch + Your tidy
cleanup + minor changes based on comments from Oleg and Andi.

This is a lot cleaner (and smaller) than earlier patches.

Thanks,
Venki

Introduce a new flag for timers - deferrable:
Timers that work normally when system is busy. But, will not cause CPU to
come out of idle (just to service this timer), when CPU is idle. Instead,
this timer will be serviced when CPU eventually wakes up with a subsequent
non-deferrable timer.

The main advantage of this is to avoid unnecessary timer interrupts when
CPU is idle. If the routine currently called by a timer can wait until next
event without any issues, this new timer can be used to setup timer event
for that routine. This, with dynticks, allows CPUs to be lazy, allowing them
to stay in idle for extended period of time by reducing unnecessary wakeup
and thereby reducing the power consumption.

This patch:
Builds this new timer on top of existing timer infrastructure. It uses
last bit in 'base' pointer of timer_list structure to store this deferrable
timer flag. __next_timer_interrupt() function skips over these deferrable
timers when CPU looks for next timer event for which it has to wake up.

This is exported by a new interface init_timer_deferrable() that can be
called in place of regular init_timer().

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: new/kernel/timer.c
===================================================================
--- new.orig/kernel/timer.c	2007-03-22 16:27:44.0 -0800
+++ new/kernel/timer.c	2007-03-28 10:05:38.0 -0800
@@ -74,7 +74,7 @@
 	tvec_t tv3;
 	tvec_t tv4;
 	tvec_t tv5;
-} cacheline_aligned_in_smp;
+} cacheline_aligned;
 
 typedef struct tvec_t_base_s tvec_base_t;
 
@@ -82,6 +82,37 @@
 EXPORT_SYMBOL(boot_tvec_bases);
 static DEFINE_PER_CPU(tvec_base_t *, tvec_bases) = &boot_tvec_bases;
 
+/*
+ * Note that all tvec_bases is 2 byte aligned and lower bit of
+ * base in timer_list is guaranteed to be zero. Use the LSB for
+ * the new flag to indicate whether the timer is deferrable
+ */
+#define TBASE_DEFERRABLE_FLAG		(0x1)
+
+/* Functions below help us manage 'deferrable' flag */
+static inline unsigned int tbase_get_deferrable(tvec_base_t *base)
+{
+	return ((unsigned int)(unsigned long)base & TBASE_DEFERRABLE_FLAG);
+}
+
+static inline tvec_base_t *tbase_get_base(tvec_base_t *base)
+{
+	return ((tvec_base_t *)((unsigned long)base & ~TBASE_DEFERRABLE_FLAG));
+}
+
+static inline void timer_set_deferrable(struct timer_list *timer)
+{
+	timer->base = ((tvec_base_t *)((unsigned long)(timer->base) |
+				       TBASE_DEFERRABLE_FLAG));
+}
+
+static inline void
+timer_set_base(struct timer_list *timer, tvec_base_t *new_base)
+{
+	timer->base = (tvec_base_t *)((unsigned long)(new_base) |
+				      tbase_get_deferrable(timer->base));
+}
+
 /**
  * __round_jiffies - function to round jiffies to a full second
  * @j: the time in (absolute) jiffies that should be rounded
@@ -295,6 +326,13 @@
 }
 EXPORT_SYMBOL(init_timer);
 
+void fastcall init_timer_deferrable(struct timer_list *timer)
+{
+	init_timer(timer);
+	timer_set_deferrable(timer);
+}
+EXPORT_SYMBOL(init_timer_deferrable);
+
 static inline void detach_timer(struct timer_list *timer,
 				int clear_pending)
 {
@@ -325,10 +363,11 @@
 	tvec_base_t *base;
 
 	for (;;) {
-		base = timer->base;
+		tvec_base_t *prelock_base = timer->base;
+		base = tbase_get_base(prelock_base);
 		if (likely(base != NULL)) {
 			spin_lock_irqsave(&base->lock, *flags);
-			if (likely(base == timer->base))
+			if (likely(prelock_base == timer->base))
 				return base;
 			/* The timer has migrated to another CPU */
 			spin_unlock_irqrestore(&base->lock, *flags);
@@ -365,11 +404,11 @@
 		 */
 		if (likely(base->running_timer != timer)) {
 			/* See the comment in lock_timer_base() */
-			timer->base = NULL;
+			timer_set_base(timer, NULL);
 			spin_unlock(&base->lock);
 			base = new_base;
 			spin_lock(&base->lock);
-			timer->base = base;
+			timer_set_base(timer, base);
 		}
 	}
 
@@ -397,7 +436,7 @@
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(timer_pending(timer) || !timer->function);
 	spin_lock_irqsave(&base->lock, flags);
-	timer->base = base;
+	timer_set_base(timer, base);
 	internal_add_timer(base, timer);
 	spin_unlock_irqrestore(&base->lock, flags);
 }
@@ -548,7
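For anyone skimming the patch above, here is a minimal userspace sketch of the low-bit flag trick it describes: a flag is kept in the least significant bit of an aligned pointer and masked off before the pointer is used. This is only an illustration under stated assumptions, not the kernel code; every demo_* name is made up, and the structures are simplified stand-ins for tvec_base_t and struct timer_list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for tvec_base_t and struct timer_list. */
struct demo_base { int id; };
struct demo_timer { struct demo_base *base; };

/* The flag lives in bit 0 of the pointer value. */
#define DEMO_DEFERRABLE_FLAG ((uintptr_t)0x1)

/* Extract only the flag bit from a tagged pointer. */
static inline unsigned int demo_get_deferrable(const struct demo_base *tagged)
{
	return (unsigned int)((uintptr_t)tagged & DEMO_DEFERRABLE_FLAG);
}

/* Strip the flag bit to recover the real pointer. */
static inline struct demo_base *demo_get_base(const struct demo_base *tagged)
{
	return (struct demo_base *)((uintptr_t)tagged & ~DEMO_DEFERRABLE_FLAG);
}

/* Mark a timer as deferrable without touching the pointer part. */
static inline void demo_set_deferrable(struct demo_timer *t)
{
	t->base = (struct demo_base *)((uintptr_t)t->base | DEMO_DEFERRABLE_FLAG);
}

/* Replace the pointer part and keep whatever flag bit was already set. */
static inline void demo_set_base(struct demo_timer *t, struct demo_base *new_base)
{
	/* The scheme only works if real pointers have a zero low bit. */
	assert(((uintptr_t)new_base & DEMO_DEFERRABLE_FLAG) == 0);
	t->base = (struct demo_base *)((uintptr_t)new_base |
				       demo_get_deferrable(t->base));
}

/* Returns 0 if the flag survives two base changes, -1 otherwise. */
static int demo_tagged_ptr_selftest(void)
{
	static struct demo_base cpu0 = { 0 }, cpu1 = { 1 };
	struct demo_timer t = { &cpu0 };

	demo_set_deferrable(&t);
	if (!demo_get_deferrable(t.base) || demo_get_base(t.base) != &cpu0)
		return -1;

	/* Base temporarily cleared, as during migration; flag must be kept. */
	demo_set_base(&t, NULL);
	if (demo_get_base(t.base) != NULL || !demo_get_deferrable(t.base))
		return -1;

	demo_set_base(&t, &cpu1);
	if (demo_get_base(t.base) != &cpu1 || !demo_get_deferrable(t.base))
		return -1;
	return 0;
}
```

The point of the sketch is the same one the lock_timer_base() hunk makes: once a flag shares storage with the pointer, every comparison and dereference has to decide whether it wants the raw tagged value or the stripped one.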
[Repost][PATCH] Remove "obsolete" label from ISDN4Linux
From: Tilman Schmidt <[EMAIL PROTECTED]>

Remove incorrect "obsolete" label from ISDN4Linux.

Signed-off-by: Tilman Schmidt <[EMAIL PROTECTED]>
---

--- a/drivers/isdn/Kconfig	2006-11-29 22:57:37.0 +0100
+++ b/drivers/isdn/Kconfig	2007-02-21 01:19:19.0 +0100
@@ -25,7 +25,7 @@ menu "Old ISDN4Linux"
 	depends on NET && ISDN
 
 config ISDN_I4L
-	tristate "Old ISDN4Linux (obsolete)"
+	tristate "Old ISDN4Linux subsystem"
 	---help---
 	  This driver allows you to use an ISDN-card for networking
 	  connections and as dialin/out device. The isdn-tty's have a built
@@ -38,8 +38,8 @@ config ISDN_I4L
 	  ISDN support in the linux kernel is moving towards a new API,
 	  called CAPI (Common ISDN Application Programming Interface).
-	  Therefore the old ISDN4Linux layer is becoming obsolete. It is
-	  still usable, though, if you select this option.
+	  The old ISDN4Linux layer is still available for use with cards
+	  that are not supported by the new CAPI subsystem yet.
 
 if ISDN_I4L
 
 source "drivers/isdn/i4l/Kconfig"

--
Tilman Schmidt E-Mail: [EMAIL PROTECTED]
Bonn, Germany
In theory, there is no difference between theory and practice.
In practice, there is.
[PATCH -rt] Fix build on MIPS
Extra #endif got into atomic.h

Signed-off-by: Deepak Saxena <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5/include/asm-mips/atomic.h
===================================================================
--- linux-2.6.21-rc5.orig/include/asm-mips/atomic.h
+++ linux-2.6.21-rc5/include/asm-mips/atomic.h
@@ -566,7 +566,6 @@ static __inline__ long atomic64_add_retu
 		raw_local_irq_restore(flags);
 	}
 #endif
-#endif
 
 	smp_mb();
 

--
Deepak Saxena - [EMAIL PROTECTED] - http://www.plexity.net

In the end, they will not say, "those were dark times," they will ask
"why were their poets silent?" - Bertolt Brecht
Linux 2.6.16.46-rc1
Location:
ftp://ftp.kernel.org/pub/linux/kernel/people/bunk/linux-2.6.16.y/testing/

git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git

RSS feed of the git tree:
http://www.kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.16.y.git;a=rss


Changes since 2.6.16.45:

Adrian Bunk (1):
      Linux 2.6.16.46-rc1

Akinbou Mita (1):
      md: fix /proc/mdstat refcounting

Alan Stern (10):
      usb-storage: unusual_devs entry for Nikon DSC D70s
      USB: unusual_devs entry for Nokia N80
      USB: unusual_devs entry for Nokia N91
      USB: unusual_devs entry for Nokia E61
      USB: unusual_devs entry for Lacie DVD+-RW
      USB: unusual-devs entry for Nokia E60
      USB: unusual_devs entry for Nokia 6131
      USB: unusual_devs entry for Nokia 6234
      unusual_devs update for UCR-61S2B
      USB: unusual_devs update for Sony P990i phone

Amol Lad (1):
      sound/pci/au88x0/au88x0.c: ioremap balanced with iounmap

Andrew Nayenko (1):
      USB storage: Nokia 6288 unusual_devs entry

Andy Isaacson (1):
      fix read past end of array in md/linear.c

Clemens Ladisch (1):
      usb-audio: work around wrong frequency in CM6501 descriptors

David Kuehling (1):
      USB: unusual_devs entry for A-VOX WSX-300ER MP3 player

Davide Perini (1):
      usb-storage: unusual_devs entry for Motorola RAZR V3x

Dylan Taft (1):
      USB Storage: US_FL_IGNORE_RESIDUE needed for Aiptek MP3 Player

Eric Sesterhenn (1):
      [ALSA] fix NULL pointer dereference in sound/synth/emux/soundfont.c

Ernis (1):
      USB: unusual_devs entry for Samsung MP3 player

Florin Malita (1):
      [ALSA] Dereference after free in snd_hwdep_release()

Guennadi Liakhovetski (1):
      [PPP]: Don't leak an sk_buff on interface destruction.

Jaco Kroon (1):
      USB: add Digitech USB-Storage to unusual_devs.h

Jürgen Mell (1):
      USB floppy drive SAMSUNG SFD-321U/EP was detected 8 times

Lars Ellenberg (1):
      md: pass down BIO_RW_SYNC in raid{1,10}

Lars Jacob (1):
      USB: unusual_devs entry for Sony DSC-H5

Luiz Fernando N. Capitulino (1):
      USB: unusual_devs.h for Sony floppy

Manuel Osdoba (1):
      USB: unusual_devs.h entry for nokia 6233

Mario Rettig (1):
      USB: unusual_devs entry for Nokia 3250

Mikko Honkala (1):
      USB: Nokia E70 is an unusual device

Neil Brown (2):
      MD: Fix problem where hot-added drives are not resynced.
      md: Fix bug where spares don't always get rebuilt properly when they become live

Nick Piggin (1):
      mm: fix madvise infinine loop

Olivier Blondeau (1):
      USB: storage: atmel unusual dev update

Patrick McHardy (3):
      [NET_SCHED]: Fix endless loops caused by inaccurate qlen counters
      [NET_SCHED]: cls_basic: fix NULL pointer dereference
      [NET_SCHED]: Fix ingress locking

Pete Zaitcev (3):
      USB storage: fix ipod ejecting issue
      USB: unusual_devs.h for 0x046b:ff40
      USB: RAZR v3i unusual_devs

Phil Dibowitz (11):
      USB: storage: sandisk unusual_devices entry
      USB: storage: another unusual_devs.h entry
      USB: storage: unusual_devs.h entry 0420:0001
      USB: Storage: unusual devs update
      USB Storage: US_FL_MAX_SECTORS_64 flag
      USB: another unusual device
      USB Storage: unusual_devs.h for Sony Ericsson M600i
      USB: unusual_dev entry for Sony P990i
      USB: usb-storage: Unusual_dev update
      USB Storage: unusual_devs: add supertop drives
      USB: Fix UCR-61S2B unusual_dev entry

Rodolfo Quesada (1):
      USB: storage: new unusual_devs.h entry: Mitsumi 7in1 Card Reader

Russell King (1):
      [SERIAL] Fix oops when removing suspended serial port

Stefan Richter (1):
      ieee1394: dv1394: fix CardBus card ejection

Takashi Iwai (6):
      [ALSA] hda-codec - Don't return error at initialization of modem codec
      [ALSA] hda-intel - Don't try to probe invalid codecs
      [ALSA] Fix invalid assignment of PCI revision
      [ALSA] cmipci - Fix a typo in 'PC Speaker Playback Switch' control
      [ALSA] cs4281 - Fix the check of right channel
      [ALSA] ca0106 - Add missing sysfs device assignment

Tobias Lorenz (1):
      USB: Mitsumi USB FDD 061M: UNUSUAL_DEV multilun fix

YOSHIFUJI Hideaki (1):
      [IPV6] HASHTABLES: Use appropriate seed for caluculating ehash index.


 Makefile                           |    2
 drivers/ieee1394/dv1394.c          |   17 -
 drivers/md/linear.c                |    2
 drivers/md/md.c                    |    3
 drivers/md/raid1.c                 |   13 -
 drivers/md/raid10.c                |   11 -
 drivers/net/ppp_generic.c          |    3
 drivers/serial/serial_core.c       |    9
 drivers/usb/storage/scsiglue.c     |   12 -
 drivers/usb/storage/unusual_devs.h |  285 +++--
 drivers/usb/storage/usb.h          |    4
 include/linux/serial_core.h        |    1
 include/linux/usb_usual.h          |    2
 include/net/sch_generic.h          |    4
 include/sound/ymfpci.h             |    2
 mm/madvise.c
[PATCH] scsi: megaraid_sas - intercepts cmd timeout and throttle io
The eh_timed_out callback (megasas_reset_timer) is used to throttle io to
the adapter when it is called the first time for a scmd.
The MEGASAS_FW_BUSY flag is set and can_queue is reduced to 16.
The can_queue is restored from the completion routine once both of the
following conditions hold: 5 seconds have elapsed and the # of outstanding
cmds in FW is < 17.

Signed-off-by: Sumant Patro <[EMAIL PROTECTED]>
---
 drivers/scsi/megaraid/megaraid_sas.c |   65 +++--
 drivers/scsi/megaraid/megaraid_sas.h |   13 +++--
 2 files changed, 70 insertions(+), 8 deletions(-)

This patch requires the patch submitted by James with subject line :
[PATCH] expose eh_timed_out to the host template

diff -uprN linux-2.6.orig/drivers/scsi/megaraid/megaraid_sas.c linux-2.6.new/drivers/scsi/megaraid/megaraid_sas.c
--- linux-2.6.orig/drivers/scsi/megaraid/megaraid_sas.c	2007-03-28 08:41:49.0 -0700
+++ linux-2.6.new/drivers/scsi/megaraid/megaraid_sas.c	2007-03-28 08:36:38.0 -0700
@@ -10,7 +10,7 @@
  *	   2 of the License, or (at your option) any later version.
  *
  * FILE: megaraid_sas.c
- * Version : v00.00.03.10-rc1
+ * Version : v00.00.03.10-rc3
  *
  * Authors:
  * 	(email-id : [EMAIL PROTECTED])
@@ -886,6 +886,7 @@ megasas_queue_command(struct scsi_cmnd *
 		goto out_return_cmd;
 
 	cmd->scmd = scmd;
+	scmd->SCp.ptr = (char *)cmd;
 
 	/*
 	 * Issue the command to the FW
@@ -981,8 +982,8 @@ static int megasas_generic_reset(struct
 
 	instance = (struct megasas_instance *)scmd->device->host->hostdata;
 
-	scmd_printk(KERN_NOTICE, scmd, "megasas: RESET -%ld cmd=%x\n",
-	       scmd->serial_number, scmd->cmnd[0]);
+	scmd_printk(KERN_NOTICE, scmd, "megasas: RESET -%ld cmd=%x retries=%x\n",
+		 scmd->serial_number, scmd->cmnd[0], scmd->retries);
 
 	if (instance->hw_crit_error) {
 		printk(KERN_ERR "megasas: cannot recover from previous reset "
@@ -1000,6 +1001,40 @@ static int megasas_generic_reset(struct
 }
 
 /**
+ * megasas_reset_timer - quiesce the adapter if required
+ * @scmd: scsi cmnd
+ *
+ * Sets the FW busy flag and reduces the host->can_queue if the
+ * cmd has not been completed within the timeout period.
+ */
+static enum
+scsi_eh_timer_return megasas_reset_timer(struct scsi_cmnd *scmd)
+{
+	struct megasas_cmd *cmd = (struct megasas_cmd *)scmd->SCp.ptr;
+	struct megasas_instance *instance;
+	unsigned long flags;
+
+	if (cmd) {
+		if (time_after(jiffies, scmd->jiffies_at_alloc + 170 * HZ))
+			return EH_NOT_HANDLED;
+
+		instance = cmd->instance;
+		if (!(instance->flag & MEGASAS_FW_BUSY)) {
+			/* FW is busy, throttle IO */
+			spin_lock_irqsave(&instance->throttle_io_lock, flags);
+
+			instance->host->can_queue = 16;
+			instance->last_time = jiffies;
+			instance->flag |= MEGASAS_FW_BUSY;
+
+			spin_unlock_irqrestore(&instance->throttle_io_lock, flags);
+		}
+		return EH_RESET_TIMER;
+	}
+	return EH_HANDLED;
+}
+
+/**
  * megasas_reset_device - Device reset handler entry point
  */
 static int megasas_reset_device(struct scsi_cmnd *scmd)
@@ -1112,6 +1147,7 @@ static struct scsi_host_template megasas
 	.eh_device_reset_handler = megasas_reset_device,
 	.eh_bus_reset_handler = megasas_reset_bus_host,
 	.eh_host_reset_handler = megasas_reset_bus_host,
+	.eh_timed_out = megasas_reset_timer,
 	.bios_param = megasas_bios_param,
 	.use_clustering = ENABLE_CLUSTERING,
 };
@@ -1215,9 +1251,8 @@ megasas_complete_cmd(struct megasas_inst
 	int exception = 0;
 	struct megasas_header *hdr = &cmd->frame->hdr;
 
-	if (cmd->scmd) {
+	if (cmd->scmd)
 		cmd->scmd->SCp.ptr = (char *)0;
-	}
 
 	switch (hdr->cmd) {
 
@@ -1806,6 +1841,7 @@ static void megasas_complete_cmd_dpc(uns
 	u32 context;
 	struct megasas_cmd *cmd;
 	struct megasas_instance *instance = (struct megasas_instance *)instance_addr;
+	unsigned long flags;
 
 	/* If we have already declared adapter dead, donot complete cmds */
 	if (instance->hw_crit_error)
@@ -1828,6 +1864,22 @@ static void megasas_complete_cmd_dpc(uns
 	}
 
 	*instance->consumer = producer;
+
+	/*
+	 * Check if we can restore can_queue
+	 */
+	if (instance->flag & MEGASAS_FW_BUSY
+		&& time_after(jiffies, instance->last_time + 5 * HZ)
+		&& atomic_read(&instance->fw_outstanding) < 17) {
+
+		spin_lock_irqsave(&instance->throttle_io_lock, flags);
+
+		instance->flag &= ~MEGASAS_FW_BUSY;
+		instance->host->can_queue =
+				instance->max_fw_cmds - MEGASAS_INT_CMDS;
+
+
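The throttle/restore logic in the patch above can be sketched as a small state machine in plain userspace C. This is a simplified model under stated assumptions, not the driver: all demo_* names are hypothetical, time is an ordinary counter in seconds (so the wraparound handling that time_after() provides is ignored), and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_THROTTLED_DEPTH      16 /* can_queue while the FW is busy */
#define DEMO_RESTORE_DELAY         5 /* seconds, mirrors "5 * HZ" */
#define DEMO_RESTORE_OUTSTANDING  17 /* restore only below this many cmds */

/* Hypothetical, simplified stand-in for the adapter instance. */
struct demo_adapter {
	int can_queue;    /* queue depth currently offered to the mid-layer */
	int full_depth;   /* depth to return to once the FW has recovered */
	int outstanding;  /* commands currently owned by the firmware */
	long last_time;   /* when throttling started, in seconds */
	bool fw_busy;
};

/* First command timeout: cut the queue depth and remember when. */
static void demo_on_timeout(struct demo_adapter *a, long now)
{
	if (!a->fw_busy) {
		a->can_queue = DEMO_THROTTLED_DEPTH;
		a->last_time = now;
		a->fw_busy = true;
	}
}

/* Completion path: restore only when BOTH conditions are met. */
static void demo_on_completion(struct demo_adapter *a, long now)
{
	if (a->fw_busy &&
	    now > a->last_time + DEMO_RESTORE_DELAY &&
	    a->outstanding < DEMO_RESTORE_OUTSTANDING) {
		a->fw_busy = false;
		a->can_queue = a->full_depth;
	}
}

/* Returns 0 if throttling and restoring behave as described, else -1. */
static int demo_throttle_selftest(void)
{
	struct demo_adapter a = {
		.can_queue = 128, .full_depth = 128, .outstanding = 40,
		.last_time = 0, .fw_busy = false,
	};

	demo_on_timeout(&a, 100);
	if (a.can_queue != DEMO_THROTTLED_DEPTH || !a.fw_busy)
		return -1;

	demo_on_completion(&a, 103);	/* too early */
	if (a.can_queue != DEMO_THROTTLED_DEPTH)
		return -1;

	demo_on_completion(&a, 106);	/* late enough, but 40 cmds outstanding */
	if (a.can_queue != DEMO_THROTTLED_DEPTH)
		return -1;

	a.outstanding = 10;
	demo_on_completion(&a, 106);	/* both conditions now hold */
	if (a.can_queue != 128 || a.fw_busy)
		return -1;
	return 0;
}
```

Requiring both conditions means a burst of completions right after the timeout cannot immediately reopen the full queue depth.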
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
* Zachary Amsden ([EMAIL PROTECTED]) wrote:
> William Lee Irwin III wrote:
> >>clone_pgd_range() for consistency? and it seems we lost a
> >>paravirt_alloc_pd_clone() in there somewhere.
> >>
> >
> >Yes, another reason why it shouldn't have been posted as-is. It was not
> >intended for anything more than comparative benchmarking on systems
> >without graphics running on the bare metal as opposed to Xen/etc. guests.
> >
>
> So clone_pgd_range is mostly useless now. Originally, I intended it to
> take the part of paravirt_alloc_pd_clone. We should probably merge the
> two into just one function, unless someone thinks clone_pgd_range is
> actually useful for something.

No, I was going to suggest just that. It was originally introduced as
the place holder for that IIRC.

thanks,
-chris
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
* William Lee Irwin III ([EMAIL PROTECTED]) wrote:
> * Christoph Lameter ([EMAIL PROTECTED]) wrote:
> >> +#ifdef CONFIG_HIGHMEM64G
> >> +#define __pgd_alloc()	kmem_cache_alloc(pgd_cache, GFP_KERNEL|__GFP_REPEAT)
> >> +#define __pgd_free(pgd)	kmem_cache_free(pgd_cache, pgd)
>
> On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> > I must've glazed over something, I thought this was removal of slabs?
>
> The pgd slab is not fully removable in the PAE case because a dedicated
> slab is the only way to enforce alignment for allocations as small as
> PAE PGD's.

Heh, yeah "page sized" is the part i glazed over, my fault.

> On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> > BTW, this will interact with the shared_kernel_pmd patch that Jeremy's
> > posted a few times (I know at least wli has looked over that one). We
> > need to make sure that PAE under at least Xen hypervisor has a
> > page-sized pgd, although the mmlist chaining looks nice to me.
>
> That, not to mention the total lack of verification of the pageattr.c
> code, are among the reasons I didn't want it posted.
>
>
> * Christoph Lameter ([EMAIL PROTECTED]) wrote:
> >> +	memcpy(&pgd[USER_PTRS_PER_PGD], &swapper_pg_dir[USER_PTRS_PER_PGD],
> >> +			KERNEL_PGD_PTRS*sizeof(pgd_t));
>
> On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> > clone_pgd_range() for consistency? and it seems we lost a
> > paravirt_alloc_pd_clone() in there somewhere.
>
> Yes, another reason why it shouldn't have been posted as-is. It was not
> intended for anything more than comparative benchmarking on systems
> without graphics running on the bare metal as opposed to Xen/etc. guests.

OK, all good here. Just wanted to make sure things didn't collide too
badly.

thanks,
-chris
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
William Lee Irwin III wrote:
>> clone_pgd_range() for consistency? and it seems we lost a
>> paravirt_alloc_pd_clone() in there somewhere.
>
> Yes, another reason why it shouldn't have been posted as-is. It was not
> intended for anything more than comparative benchmarking on systems
> without graphics running on the bare metal as opposed to Xen/etc. guests.

So clone_pgd_range is mostly useless now. Originally, I intended it to
take the part of paravirt_alloc_pd_clone. We should probably merge the
two into just one function, unless someone thinks clone_pgd_range is
actually useful for something.

Zach
Re: [2.6 patch] the scheduled eepro100 removal
Kok, Auke wrote:
> Bill Davidsen wrote:
>> Adrian Bunk wrote:
>>> This patch contains the scheduled removal of the eepro100 driver.
>>>
>>> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>>
>> This keeps coming around, but I haven't seen an answer to the
>> questions raised by Eric Piel or Kiszka. I do know that e100 didn't
>> work on some IBM rackmount servers and eepro100 did, but since I'm no
>> longer responsible for those machines I can't retest. Perhaps someone
>> will be able to provide data points. IBM current offerings as of about
>> three years ago, I had a few dozen of them at one time.
>
> We have provided a (test) driver which allows e100 to use IO to
> communicate with the device, which seems to have helped for one person.
> I think we need to work with those changes and see if it helps the
> other people resolve their e100 issues.
>
> Unfortunately it keeps slipping off to the low priority list for us. I
> suggest that we should push this code into -mm for people to test or
> something. It's fairly low risk as by default the patch won't enable IO
> and thus use the old method of writing to the adapter.

Sounds sane to me.

My overall opinion on eepro100 removal is that we're not there yet.
Rare problem cases remain where e100 fails but eepro100 works, and it's
older drivers so it's low priority for everybody.

Needs to happen, though...

	Jeff
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
>> +#ifdef CONFIG_HIGHMEM64G
>> +#define __pgd_alloc()	kmem_cache_alloc(pgd_cache, GFP_KERNEL|__GFP_REPEAT)
>> +#define __pgd_free(pgd)	kmem_cache_free(pgd_cache, pgd)

On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> I must've glazed over something, I thought this was removal of slabs?

The pgd slab is not fully removable in the PAE case because a dedicated
slab is the only way to enforce alignment for allocations as small as
PAE PGD's.


On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> BTW, this will interact with the shared_kernel_pmd patch that Jeremy's
> posted a few times (I know at least wli has looked over that one). We
> need to make sure that PAE under at least Xen hypervisor has a
> page-sized pgd, although the mmlist chaining looks nice to me.

That, not to mention the total lack of verification of the pageattr.c
code, are among the reasons I didn't want it posted.


* Christoph Lameter ([EMAIL PROTECTED]) wrote:
>> +	memcpy(&pgd[USER_PTRS_PER_PGD], &swapper_pg_dir[USER_PTRS_PER_PGD],
>> +			KERNEL_PGD_PTRS*sizeof(pgd_t));

On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> clone_pgd_range() for consistency? and it seems we lost a
> paravirt_alloc_pd_clone() in there somewhere.

Yes, another reason why it shouldn't have been posted as-is. It was not
intended for anything more than comparative benchmarking on systems
without graphics running on the bare metal as opposed to Xen/etc. guests.


-- wli
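To put numbers on the alignment point above: a PAE top-level table has 4 entries of 8 bytes each, so it is only 32 bytes long, yet it has to sit on a 32-byte boundary. A rough userspace sketch of "allocate a small object whose alignment equals its size" follows. It assumes a C11 libc that provides aligned_alloc(); the demo_* names are made up, and nothing here is kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PTRS_PER_PGD 4               /* PAE: four top-level entries */
typedef uint64_t demo_pgd_t;              /* PAE entries are 64 bits wide */
#define DEMO_PGD_SIZE (DEMO_PTRS_PER_PGD * sizeof(demo_pgd_t)) /* 32 bytes */

/* Allocate a zeroed 32-byte table aligned to 32 bytes (size == alignment). */
static demo_pgd_t *demo_pgd_alloc(void)
{
	demo_pgd_t *pgd = aligned_alloc(DEMO_PGD_SIZE, DEMO_PGD_SIZE);

	if (pgd)
		memset(pgd, 0, DEMO_PGD_SIZE);
	return pgd;
}

/* Returns 0 if several allocations all honour the alignment, else -1. */
static int demo_pgd_selftest(void)
{
	demo_pgd_t *tables[8];
	int i, rc = 0;

	for (i = 0; i < 8; i++) {
		tables[i] = demo_pgd_alloc();
		if (!tables[i] || ((uintptr_t)tables[i] % DEMO_PGD_SIZE) != 0)
			rc = -1;
	}
	for (i = 0; i < 8; i++)
		free(tables[i]);	/* free(NULL) is a no-op */
	return rc;
}
```

A general-purpose allocator is free to hand back 32-byte objects at weaker alignment, which is why the allocation has to state the alignment explicitly rather than rely on the size alone.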
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
On Wed, 28 Mar 2007, William Lee Irwin III wrote:
>> I already went over the methodological issues with kernel compiles.
>> You may be able to prove this, but not this way.

On Wed, Mar 28, 2007 at 02:20:20PM -0700, Christoph Lameter wrote:
> But this way is an established kernel way of doing things. Seems that my
> AIM9 stuff was not convincing and I am not sure what other tests would be
> acceptable. Could you post some of the data regarding the improvements
> possible through your patches?

What I did, I did a number of years ago. Even if I could find the
results (and I don't even recall order-of-magnitude estimates) they
would be effectively irrelevant to modern kernels.

The disaster in all this was that the PTE caching never got merged.
It's not much of an observation to note that the primary bottleneck
is still there when the patch to resolve it never got merged.

As far as kernel compiles being relevant to anything besides potentially
optimizing a particular major benchmark using gcc as one of its
components... yeah, right. It's too macro to be a microbenchmark of
anything and too micro to be pertinent to any meaningful macrobenchmark
such as those from major benchmark publishers (who can't be named for
trademark/etc. reasons).

Hasn't it been at least 5 years since people figured out kernel compiles
were complete bulls**t as benchmarks along with dbench for other reasons
and several others? If not, I don't know why I bother with this kernel
at all.

Even so, I already did this and am done with it. It's not like I'm not
carrying around numerous patches I know will never be merged all the
time anyway. If you want to back it all out so badly, just do it and
stop bothering me about it, and I'll merely continue maintaining my
local patches without ever posting them as I have been for years.

I'm not at all happy with the NIH situation, either, not that I'm at
such a loss for ideas to need to contest every petty NIH that flies past.


-- wli
Re: Linux 2.6.21-rc5
On 27.03.2007 08:17, Andrew Morton wrote:
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
[...]
> Maintainers are cc'ed. Please promptly ack, nack or otherwise quack, else
> I'll be making my own decisions ;)

[CC list trimmed]

It's not on that list, but would you mind slipping

drivers-isdn-gigaset-mark-some-static-data-as-const-v2.patch

into 2.6.21 too? It's largely trivial but I'd like to get it out of
the door.

Thanks,
Tilman

--
Tilman Schmidt E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
> +#ifdef CONFIG_HIGHMEM64G
> +#define __pgd_alloc()	kmem_cache_alloc(pgd_cache, GFP_KERNEL|__GFP_REPEAT)
> +#define __pgd_free(pgd)	kmem_cache_free(pgd_cache, pgd)

I must've glazed over something, I thought this was removal of slabs?

BTW, this will interact with the shared_kernel_pmd patch that Jeremy's
posted a few times (I know at least wli has looked over that one). We
need to make sure that PAE under at least Xen hypervisor has a page-sized
pgd, although the mmlist chaining looks nice to me.

> +static struct kmem_cache *pgd_cache;
> +
> +void __init pgtable_cache_init(void)
> +{
> +	pgd_cache = kmem_cache_create("pgd",
> +				PTRS_PER_PGD*sizeof(pgd_t),
> +				PTRS_PER_PGD*sizeof(pgd_t),
> +				SLAB_PANIC,
> +				NULL,
> +				NULL);
> +}
> +#else /* !CONFIG_HIGHMEM64G */
> +#define __pgd_alloc()	((pgd_t *)get_zeroed_page(GFP_KERNEL|__GFP_REPEAT))
> +#define __pgd_free(pgd)	free_page((unsigned long)(pgd))
> +#endif /* !CONFIG_HIGHMEM64G */
>
>  pgd_t *pgd_alloc(struct mm_struct *mm)
>  {
>  	int i;
> -	pgd_t *pgd = kmem_cache_alloc(pgd_cache, GFP_KERNEL);
> +	pgd_t *pgd = __pgd_alloc();
>
> -	if (PTRS_PER_PMD == 1 || !pgd)
> +	if (!pgd)
> +		return NULL;
> +	memcpy(&pgd[USER_PTRS_PER_PGD], &swapper_pg_dir[USER_PTRS_PER_PGD],
> +			KERNEL_PGD_PTRS*sizeof(pgd_t));

clone_pgd_range() for consistency? and it seems we lost a
paravirt_alloc_pd_clone() in there somewhere.

> +	if (PTRS_PER_PMD == 1)
>  		return pgd;
Re: [linux-pm] [3/6] 2.6.21-rc4: known regressions
On Wednesday 28 March 2007 22:42:00 Linus Torvalds wrote:
>
> On Wed, 28 Mar 2007, David Brownell wrote:
> >
> > On Wednesday 28 March 2007 9:38 am, Linus Torvalds wrote:
> >
> > > It's a *device*, dammit. It should save and resume like one (probably as a
> > > system device). The "set_mode()" etc stuff is at a completely different
> > > (higher) conceptual level.
> >
> > Agreed, except about "probably as a system device".
> >
> > Last I checked, there was no good reason to use sysdev suspend()/resume()
> > rather than platform_device suspend_late()/early_resume(). Which more
> > or less means no good reason to use sysdev in new code...
>
> I won't disagree - it might well be much nicer to just show it in the
> "real" device tree. I'm not 100% sure where in the tree it would go,
> though. It should probably be "inside" the root entry, before any of the
> PCI buses. It's generally what we've used those "system device" things
> for, but I agree that it would be better to just make system devices show
> up early on the regular device list than it is to have them be special
> cases.
>
> But I think that's a separate (and fairly small) issue compared to the
> "don't use the clocksource infrastructure as a make-believe suspend/resume
> mechanism" problem that Maxim's patch had.
>
> (Maxim, don't take that the wrong way - I think your analysis and patch
> were great, I just think another organization would be better)

Exactly, I agree completely. I said that my patch was a temporary fix, and
I agree that the best way is to create a new system device and use its
suspend/resume hooks to bring HPET back to life on resume.

>
> > Also, making HPET use the legacy mode seems like a step backwards.
>
> I don't think that's actually "legacy" in any sense but the interrupt
> delivery, where the "legacy mode" bit is not so much that the HPET itself
> is "legacy" but that it *replaces* legacy devices.
>
> But I may have misunderstood the thing. I'm an old fart, so I know the old
> timers much better than I know the new ones ;). Somebody feel free to hit
> me with the clue-2x4.
>
> Linus
>

Best regards,
Maxim Levitsky
Re: [2.6 patch] the scheduled eepro100 removal
Bill Davidsen wrote:
> Adrian Bunk wrote:
>> This patch contains the scheduled removal of the eepro100 driver.
>>
>> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>
> This keeps coming around, but I haven't seen an answer to the questions
> raised by Eric Piel or Kiszka. I do know that e100 didn't work on some
> IBM rackmount servers and eepro100 did, but since I'm no longer
> responsible for those machines I can't retest. Perhaps someone will be
> able to provide data points. IBM current offerings as of about three
> years ago, I had a few dozen of them at one time.

We have provided a (test) driver which allows e100 to use IO to
communicate with the device, which seems to have helped for one person.
I think we need to work with those changes and see if it helps the other
people resolve their e100 issues.

Unfortunately it keeps slipping off to the low priority list for us. I
suggest that we should push this code into -mm for people to test or
something. It's fairly low risk as by default the patch won't enable IO
and thus use the old method of writing to the adapter.

Auke
Re: [RFC] i386: Remove page sized slabs for pgds and pmds
On Wed, 28 Mar 2007, William Lee Irwin III wrote:

> On Wed, Mar 28, 2007 at 02:38:55PM -0700, Christoph Lameter wrote:
> > No that was described in the patch. Quote:
> > "i386 only provides support for caching constructed pgd and pmds. These
> > are comparatively rare to ptes so it is no surprise that the current
> > approach has only minimal effect. "
>
> And where was the mention of this being a patch I sent you in a private
> reply verbatim, and furthermore asked you not to post?

Yes it was private and you told me to be careful about "waving this patch
around". No mention of not posting it. And I repeat that I am sorry to
have removed the paragraph that mentioned you being the author during
rewrites of the text. Signoff line is there. This is an RFC so we can
still add this if we want to apply it at all.

I think we need to discuss this openly. It seems that I am getting into
unknown minefields of an ancient discussion between you and Andrew.
Re: FF layer restrictions [Was: [PATCH 1/1] Input: add sensable phantom driver]
Jiri Slaby wrote:
> Dmitry Torokhov wrote:
>> On Tuesday 27 March 2007 17:34, johann deneux wrote:
>>> What about adding a member to ff_effect which would be the number of the
>>> motor?
>>> We can't change the layout of ff_effect too much though, so we have to
>>> find unused bits and put them to work.
>>>
>>> For instance, we could replace
>>>
>>> __u16 type;
>>>
>>> by
>>>
>>> __u8 motor;
>>> __u8 type;
>>>
>> Splitting type field seems to be a good idea.
>
> Maybe stupid question, but what about endianness + backward compatibility?
> If we split it into motor,type sequence, it would break LE (untouched BE),
> if we do type,motor, it is OK for LE (broken BE).

Aha, and the question is: do
#ifdef __BIG_ENDIAN
#else
#endif
?

regards,
--
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E

Hnus <[EMAIL PROTECTED]> is an alias for /dev/null
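The #ifdef idea asked about above could look roughly like this in user space. This is only a sketch, not code from the thread: `struct ff_type_split` and `ff_view_legacy_type` are hypothetical names, and the GCC/Clang predefined macros `__BYTE_ORDER__` and `__ORDER_BIG_ENDIAN__` stand in for the kernel's `__BIG_ENDIAN`. If the compiler does not define them, the sketch falls back to the little-endian member order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical replacement for "__u16 type"; NOT the real ff_effect layout.
 * The member order is chosen so that "type" always overlays the low byte
 * of the old 16-bit field and "motor" overlays the high byte.
 */
struct ff_type_split {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	uint8_t motor;	/* big endian: high byte is stored first */
	uint8_t type;
#else
	uint8_t type;	/* little endian: low byte is stored first */
	uint8_t motor;
#endif
};

/* The split must not change the size of the field it replaces. */
static_assert(sizeof(struct ff_type_split) == sizeof(uint16_t),
	      "split layout must keep the 16-bit footprint");

/* Reinterpret a legacy 16-bit type value through the split layout. */
struct ff_type_split ff_view_legacy_type(uint16_t legacy_type)
{
	struct ff_type_split s;

	memcpy(&s, &legacy_type, sizeof(s));
	return s;
}
```

With this ordering, a legacy 16-bit `type` value below 256 reads back with the same `type` and `motor == 0` on either byte order, which is the backward-compatible case the question is about.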
Re: [PATCH] Fix "Section mismatch" compile warning
On Mon, 26 Mar 2007 17:19:33 +0200 Bernhard Walle <[EMAIL PROTECTED]> wrote:

> Fix "Section mismatch" warnings in arch/x86_64/kernel/time.c
>

Please always quote the warnings in the changelog when fixing them, thanks.
Re: [2.6 patch] the scheduled eepro100 removal
Adrian Bunk wrote:
> This patch contains the scheduled removal of the eepro100 driver.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

This keeps coming around, but I haven't seen an answer to the questions raised by Eric Piel or Kiszka. I do know that e100 didn't work on some IBM rackmount servers and eepro100 did, but since I'm no longer responsible for those machines I can't retest. Perhaps someone will be able to provide data points. IBM current offerings as of about three years ago, I had a few dozen of them at one time.

--
Bill Davidsen <[EMAIL PROTECTED]>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
Re: FF layer restrictions [Was: [PATCH 1/1] Input: add sensable phantom driver]
Dmitry Torokhov wrote:
> On Tuesday 27 March 2007 17:34, johann deneux wrote:
>> What about adding a member to ff_effect which would be the number of the
>> motor?
>> We can't change the layout of ff_effect too much though, so we have to
>> find unused bits and put them to work.
>>
>> For instance, we could replace
>>
>> __u16 type;
>>
>> by
>>
>> __u8 motor;
>> __u8 type;
>>
>
> Splitting type field seems to be a good idea.

Maybe stupid question, but what about endianness + backward compatibility? If we split it into motor,type sequence, it would break LE (untouched BE), if we do type,motor, it is OK for LE (broken BE).

regards,
--
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E

Hnus <[EMAIL PROTECTED]> is an alias for /dev/null
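To make the endianness concern concrete, here is a minimal user-space sketch (not code from this thread). It assumes an old binary stores a 16-bit `type`, and new code reads the same two bytes through one of the two proposed split layouts. The struct names, function names, and the sample value 0x0050 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts for illustration only; NOT the real struct ff_effect. */
struct ff_split_motor_first {
	uint8_t motor;
	uint8_t type;
};

struct ff_split_type_first {
	uint8_t type;
	uint8_t motor;
};

static_assert(sizeof(struct ff_split_motor_first) == sizeof(uint16_t),
	      "motor,type split must keep the 16-bit footprint");
static_assert(sizeof(struct ff_split_type_first) == sizeof(uint16_t),
	      "type,motor split must keep the 16-bit footprint");

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first_byte;

	memcpy(&first_byte, &probe, 1);
	return first_byte == 1;
}

/* The "type" read back when the legacy 16-bit value is viewed as motor,type. */
uint8_t type_seen_motor_first(uint16_t legacy_type)
{
	struct ff_split_motor_first s;

	memcpy(&s, &legacy_type, sizeof(s));
	return s.type;
}

/* The "type" read back when the legacy 16-bit value is viewed as type,motor. */
uint8_t type_seen_type_first(uint16_t legacy_type)
{
	struct ff_split_type_first s;

	memcpy(&s, &legacy_type, sizeof(s));
	return s.type;
}
```

On a little-endian host, `type_seen_motor_first(0x0050)` reads back 0 (the old value lands in `motor`), while `type_seen_type_first(0x0050)` reads back 0x50. On a big-endian host the two results swap, which is the trade-off described above.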