Re: rcu-refcount stacker performance
Quoting Paul E. McKenney ([EMAIL PROTECTED]):
> On Thu, Jul 14, 2005 at 08:44:50AM -0500, [EMAIL PROTECTED] wrote:
> > Quoting Paul E. McKenney ([EMAIL PROTECTED]):
> > > My guess is that the reference count is indeed costing you quite a
> > > bit. I glance quickly at the patch, and most of the uses seem to
> > > be of the form:
> > >
> > >     increment ref count
> > >     rcu_read_lock()
> > >     do something
> > >     rcu_read_unlock()
> > >     decrement ref count
> > >
> > > Can't these cases rely solely on rcu_read_lock()? Why do you also
> > > need to increment the reference count in these cases?
> >
> > The problem is on module unload: is it possible for CPU1 to be
> > on "do something", and sleep, and, while it sleeps, CPU2 does
> > rmmod(lsm), so that by the time CPU1 stops sleeping, the code it
> > is executing has been freed?
>
> OK, but in the above case, "do something" cannot be sleeping, since
> it is under rcu_read_lock().

Oh, but that's not quite what the code is doing, rather it is doing:

    rcu_read_lock
    while get next element from list
        inc element.refcount
        rcu_read_unlock
        do something
        rcu_read_lock
        dec refcount
    rcu_read_unlock

What I plan to try next is:

    rcu_read_lock
    while get next element from list
        if (element->owning_module->state != LIVE)
            continue
        rcu_read_unlock
        do something
        rcu_read_lock
    rcu_read_unlock

> > Because stacker won't remove the lsm from the list of modules
> > until mod->exit() is executed, and module_free(mod) happens
> > immediately after that, the above scenario seems possible.
>
> Right, if you have some other code path that sleeps (outside of
> rcu_read_lock(), right?), then you need the reference count for that
> code path. But the code paths that do not sleep should be able to
> dispense with the reference count, reducing the cache-line traffic.

Most if not all of the codepaths can sleep, however. So unfortunately
that doesn't seem a feasible solution. That's why I'm hoping there is
something inherent in the module unload code that I can take advantage
of to forego my own refcounting.

thanks,
-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] split PCI probing code [1/9]
Adam Belay <[EMAIL PROTECTED]> :
[...]

Some nits + a suspect error branch. It seems nice otherwise.

> --- a/drivers/pci/bus/bus.c 1969-12-31 19:00:00.0 -0500
> +++ b/drivers/pci/bus/bus.c 2005-07-10 22:32:53.0 -0400
[...]
> +struct pci_bus * pci_alloc_bus(void)
> +{
> +    struct pci_bus *b;
> +
> +    b = kmalloc(sizeof(*b), GFP_KERNEL);
> +    if (b) {
> +        memset(b, 0, sizeof(*b));

mm/slab.c provides kcalloc.

[...]
> --- a/drivers/pci/bus/config.c 1969-12-31 19:00:00.0 -0500
> +++ b/drivers/pci/bus/config.c 2005-07-12 00:52:35.147664368 -0400
[...]
> +static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
> +{
> +    unsigned int pos, reg, next;
> +    u32 l, sz;
> +    struct resource *res;
> +
> +    for(pos=0; pos
[...]
> +static struct pci_dev * __devinit
> +pci_scan_device(struct pci_bus *bus, int devfn)
> +{
[...]
> +    dev = kmalloc(sizeof(struct pci_dev), GFP_KERNEL);
> +    if (!dev)
> +        return NULL;
> +
> +    memset(dev, 0, sizeof(struct pci_dev));

kcalloc

[...]
> +    /* Assume 32-bit PCI; let 64-bit PCI cards (which are far rarer)
> +       set this higher, assuming the system even supports it. */
> +    dev->dma_mask = 0x;

DMA_32BIT_MASK

> +    if (pci_setup_device(dev) < 0) {
> +        kfree(dev);
> +        return NULL;
> +    }
> +    device_initialize(&dev->dev);
> +    dev->dev.release = pci_release_dev;
> +    pci_dev_get(dev);
> +
> +    pci_name_device(dev);
> +
> +    dev->dev.dma_mask = &dev->dma_mask;
> +    dev->dev.coherent_dma_mask = 0xull;

DMA_32BIT_MASK

[...]
> +struct pci_dev * __devinit
> +pci_scan_single_device(struct pci_bus *bus, int devfn)
> +{
> +    struct pci_dev *dev;
> +
> +    dev = pci_scan_device(bus, devfn);
> +    pci_scan_msi_device(dev);
> +
> +    if (!dev)
> +        return NULL;

Why not do the test immediately ?

[...]
> --- a/drivers/pci/bus/probe.c 1969-12-31 19:00:00.0 -0500
> +++ b/drivers/pci/bus/probe.c 2005-07-12 00:55:50.580953992 -0400
[...]
> +int __devinit pci_scan_bridge(struct pci_bus *bus, struct pci_dev * dev, int max, int pass)
[...]
> +
> +    /* Prevent assigning a bus number that already exists.
> +     * This can happen when a bridge is hot-plugged */
> +    if (pci_find_bus(pci_domain_nr(bus), max+1))

     if (pci_find_bus(pci_domain_nr(bus), max + 1))

[...]
> +    /*
> +     * For CardBus bridges, we leave 4 bus numbers
> +     * as cards with a PCI-to-PCI bridge can be
> +     * inserted later.
> +     */
> +    for (i=0; i
[...]
> +int __devinit pci_scan_slot(struct pci_bus *bus, int devfn)
> +{
> +    int func, nr = 0;
> +    int scan_all_fns;
> +
> +    scan_all_fns = pcibios_scan_all_fns(bus, devfn);
> +
> +    for (func = 0; func < 8; func++, devfn++) {
> +        struct pci_dev *dev;
> +
> +        dev = pci_scan_single_device(bus, devfn);
> +        if (dev) {
> +            nr++;
> +
> +            /*
> +             * If this is a single function device,
> +             * don't scan past the first function.
> +             */
> +            if (!dev->multifunction) {
> +                if (func > 0) {
> +                    dev->multifunction = 1;
> +                } else {
> +                    break;
> +                }

                 if (func == 0)
                     break;
                 dev->multifunction = 1;

[...]
> +unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
> +{
[...]
> +    pcibios_fixup_bus(bus);
> +    for (pass=0; pass < 2; pass++)

     for (pass = 0; pass < 2; pass++)

[...]
> +struct pci_bus * __devinit pci_scan_bus_parented(struct device *parent, int bus, struct pci_ops *ops, void *sysdata)
> +{
> +    int error;
> +    struct pci_bus *b;
> +    struct device *dev;
> +
> +    b = pci_alloc_bus();
> +    if (!b)
> +        return NULL;
> +
> +    dev = kmalloc(sizeof(*dev), GFP_KERNEL);
> +    if (!dev){
> +        kfree(b);
> +        return NULL;
> +    }

The code below uses goto. Why not here ?

> +
> +    b->sysdata = sysdata;
> +    b->ops = ops;
> +
> +    if (pci_find_bus(pci_domain_nr(b), bus)) {
> +        /* If we already got to this bus through a different bridge, ignore it */
> +        pr_debug("PCI: Bus %04x:%02x already known\n", pci_domain_nr(b), bus);
> +        goto err_out;
> +    }
> +    spin_lock(&pci_bus_lock);
> +    list_add_tail(&b->node, &pci_root_buses);
> +    spin_unlock(&pci_bus_lock);
> +
> +    memset(dev, 0, sizeof(*dev));

kcalloc

> +    dev->parent = parent;
> +    dev->release = pci_release_bus_bridge_dev;
> +
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Linus Torvalds wrote:
> There's absolutely nothing wrong with "jiffies", and anybody who thinks
> that
>
>     msleep(20);
>
> is fundamentally better than
>
>     timeout = jiffies + HZ/50;
>
> just doesn't realize that the latter is a bit more complicated exactly
> because the latter is a hell of a lot more POWERFUL.

But if all I really want is to sleep for 20ms, what does the additional
power actually buy me?

Chris
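To make the two forms in the quote concrete, here is a minimal userspace sketch of the arithmetic involved. It is not kernel code: `ms_to_ticks()`, `open_coded_20ms()` and `buggy_ms_to_ticks()` are hypothetical helpers written for this illustration, and the tick rate is passed as a parameter rather than taken from the kernel's `HZ`. It only shows why an open-coded `HZ/50` matches 20 ms for some tick rates and not others, and what the "divide instead of multiply" mistake does.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert milliseconds to timer ticks for a given
 * tick rate (ticks per second), rounding up so the resulting delay is
 * never shorter than requested.
 */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    assert(hz > 0);
    return (ms * hz + 999UL) / 1000UL;
}

/*
 * The open-coded form from the thread, "HZ/50", as a function of the
 * tick rate. Integer division truncates, so this is exactly 20 ms only
 * when hz is a multiple of 50.
 */
static unsigned long open_coded_20ms(unsigned long hz)
{
    return hz / 50UL;
}

/*
 * One way to get the conversion wrong: dividing by the tick rate
 * instead of multiplying by it. For small delays this collapses to 0.
 */
static unsigned long buggy_ms_to_ticks(unsigned long ms, unsigned long hz)
{
    assert(hz > 0);
    return ms / hz;
}
```

With a tick rate of 100, 250 or 1000 both correct forms agree on 2, 5 and 20 ticks. At 1024 ticks per second the open-coded form truncates to 20 ticks (slightly under 20 ms) while the rounding helper returns 21, and the buggy form returns 0 for 20 ms at 250 ticks per second.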
Re: moving DRM header files
> When you start merging DRM and fbdev you will be able to use relative
> paths that are closer together. For example #include
> "../char/drm/drmP.h" versus "#include "drm/drmP.h" for internal
> headers.

No. Using relative include paths is not good.
It will most probably not work with make O=.

    Sam
Kernel Bug Report
System:
Motherboard = Tyan K8WE
Processor = 2x Opteron 250
Memory = 8GB ECC Registered

On all of the recent release candidates except for 2.6.13-rc2-git2 the
kernel panics while booting. These versions include 2.6.13-rc2-git*
(* != 2) and 2.6.13-rc3.

I also want to mention that I am using gcc 3.3.5 on Debian and that
during compilation there are 3 messages at the end that say an assertion
has failed, i.e. (LD: assertion failed).

It looks like it panics during a memcpy, but I know it's difficult to
tell just by the output. I get a code:

f3 a4 c3 66 66 66 90 66 66 66 90 66 66 66 90 66

The problem appears very reproducible, so I can provide more information
upon request. My .config is available upon request.

-Paul
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 09:37 -0700, Linus Torvalds wrote:

> There should be an _absolute_ interface

I'm not arguing there shouldn't be an absolute interface. I'm arguing
that *most* uses are relative, and as such a relative interface makes
sense for those cases.

> Btw, this is exactly why the jiffy-based thing is _good_. The kernel
> timers _are_ absolute, and you make them relative by adding "jiffies".

Again, there is absolutely nothing wrong with having absolute timers and
a general notion of absolute time. Jiffies is one way of achieving that,
and it's the current Linux way. I see the "absolute timers are good"
argument as separate from the "jiffies / HZ are good" argument; there is
no fundamental reason why such an interface couldn't be in, say, usec.

> There's absolutely nothing wrong with "jiffies", and anybody who thinks
> that
>
>     msleep(20);
>
> is fundamentally better than
>
>     timeout = jiffies + HZ/50;

I *will* argue that for relative delays in drivers, msleep() is better.
The reason is different from the one you have in mind: I consider
msleep() the better interface for relative delays in drivers because it
is simpler, and therefore harder for a driver writer to get wrong.
jiffies and HZ conversion is one of those areas that driver writers very
often get wrong (multiplying by HZ instead of dividing, for example, but
there are a few dozen ways it can and does go wrong). A relative
msec-based interface is a LOT harder to get wrong, and is also often
closer to what the datasheet of the hardware says.

I'm not going to say "all driver writers are stupid", because they're
not; however, too many of them act like they are too much of the time.

That doesn't mean that there is no room for a "powerful interface" next
to a simple one, and I hope you're not fully against adding a simple
interface on top of a more powerful one if that simple interface is a
way to reduce mistakes and thus bugs in drivers.

> just doesn't realize that the latter is a bit more complicated exactly
> because the latter is a hell of a lot more POWERFUL. Trying to get rid of
> jiffies for some religious reason is _stupid_.

I have nothing religious against jiffies per se. My argument, however,
is that with a few simple, relative interfaces *in addition* to an
absolute interface, almost all drivers are suddenly isolated from
jiffies and HZ, because they simply don't care. They really DON'T care
about absolute time. At all. Doing this will in turn open up flexibility
in experimenting with how one implements the timer stuff; there's
suddenly a lot less code to touch in doing so. Such a relative interface
can also match the intent a lot better, and it is separated from the
actual implementation.
Re: LKM function call on kernel function call?
You can also look into some methods of "function redirection hooks":
add some opcodes at the start of the hooked function (for example a CALL
or JMP pointing to the address of your function). There are docs about
this subject, but unfortunately I couldn't find anything right now
(http://www.ouah.org/p59-0x08.txt is not exactly what I'm talking about;
it covers ELF redirection).

It's a dirty thing to do, and it's not intended to be done in any
production thing (in fact, it's a *hack*).

On 7/5/05, S <[EMAIL PROTECTED]> wrote:
> Is it possible to code a loadable module having function1(), which
> would be called, everytime a particular function of the kernel is
> called? If not, atleast a way this could be done without re-compiling
> the whole kernel and rebooting the system?
>
> Example:
>
> My LKM:
> -
>
> init_module() {
> ...
> }
>
> function1() {
> ...
> }
>
> cleanup_module() {
> ...
> }
>
>
> I want function1() to be called, everytime the function
> ide_do_rw_disk() of ide-disk.c is called. I do not want to re-compile
> the complete kernel to do this.
>
> Thanks in advance,
>
> Regards,
> S
> -
> To unsubscribe from this list: send the line "unsubscribe
> linux-c-programming" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

--
# (perl -e "while (1) { print "\x90"; }") | dd of=/dev/evil
Re: rcu-refcount stacker performance
On Thu, Jul 14, 2005 at 08:44:50AM -0500, [EMAIL PROTECTED] wrote:
> Quoting Paul E. McKenney ([EMAIL PROTECTED]):
> > My guess is that the reference count is indeed costing you quite a
> > bit. I glance quickly at the patch, and most of the uses seem to
> > be of the form:
> >
> >     increment ref count
> >     rcu_read_lock()
> >     do something
> >     rcu_read_unlock()
> >     decrement ref count
> >
> > Can't these cases rely solely on rcu_read_lock()? Why do you also
> > need to increment the reference count in these cases?
>
> The problem is on module unload: is it possible for CPU1 to be
> on "do something", and sleep, and, while it sleeps, CPU2 does
> rmmod(lsm), so that by the time CPU1 stops sleeping, the code it
> is executing has been freed?

OK, but in the above case, "do something" cannot be sleeping, since
it is under rcu_read_lock().

> Because stacker won't remove the lsm from the list of modules
> until mod->exit() is executed, and module_free(mod) happens
> immediately after that, the above scenario seems possible.

Right, if you have some other code path that sleeps (outside of
rcu_read_lock(), right?), then you need the reference count for that
code path. But the code paths that do not sleep should be able to
dispense with the reference count, reducing the cache-line traffic.

    Thanx, Paul
Serial core: 8250_pci could not register serial port for UART chip EXAR XR17D152
Hi all,

I have come across a problem with my serial port EXAR chip XR17D152
when I try to use the 8250_pci driver. I am using kernel-2.6.12.1 on
RHEL4.0-U1 on a pSeries box with 4 CPUs.

During boot, after detecting the EXAR chip (I checked the pci_dev
structure and the pci_device_id structure for the info), 8250_pci is
unable to get through the port registration (the procedure
"static int __devinit pciserial_init_one(struct pci_dev *dev, const
struct pci_device_id *ent)" in 8250_pci.c).

I debugged the problem and traced it up to the routine
"static int uart_match_port(struct uart_port *port1, struct uart_port
*port2)" in 8250.c, where the UPIO_MEM case does not satisfy the
condition port1->membase==port2->membase and hence returns 0. If I use
printk to dump the port->membase value, the system hangs during boot
with a blank screen (on the serial terminal).

I have yet to try kernel-2.6.12.2. Please let me know how to proceed in
this case.

Thanks,
V.Ananda Krishnan
Re: [11/11] x86_64: TASK_SIZE fixes for compatibility mode processes
On Wed, Jul 13, 2005 at 08:49:47PM +0200, Andi Kleen wrote:
> On Wed, Jul 13, 2005 at 11:44:26AM -0700, Greg KH wrote:
> > -stable review patch. If anyone has any objections, please let us know.
>
> I think the patch is too risky for stable. I had even my doubts
> for mainline.

hmm.. The main reason why Andrew posted this for the stable series is the
memory leak issue mentioned in the patch changeset comments...

We have not seen any stability issues because of this patch so far (it's
been there for more than a month in the -mm series). Lack of this patch is
actually causing us more trouble (DoS/app failures/..).

thanks,
suresh
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 14 Jul 2005, Vojtech Pavlik wrote:
>
> A note on the relaive timer API: There needs to be a way to say
> "x milliseconds from the time this timer should have triggered" instead
> of "x milliseconds from now", to avoid skew in timers that try to be
> strictly periodic.

I disagree. There should be an _absolute_ interface, and a driver that
wants that should just have calculated when in time the timeout finishes -
and then keep on using the absolute value.

Btw, this is exactly why the jiffy-based thing is _good_. The kernel
timers _are_ absolute, and you make them relative by adding "jiffies".

The fact is, the current timers are better than people give them credit
for, and converting them away from a jiffies-based interface (to a
usleep-like one) is STUPID.

There's absolutely nothing wrong with "jiffies", and anybody who thinks
that

    msleep(20);

is fundamentally better than

    timeout = jiffies + HZ/50;

just doesn't realize that the latter is a bit more complicated exactly
because the latter is a hell of a lot more POWERFUL. Trying to get rid of
jiffies for some religious reason is _stupid_.

I have to say, this whole thread has been pretty damn worthless in general
in my not-so-humble opinion.

    Linus
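The skew discussed above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it runs in userspace, time is a plain tick counter, the handler is assumed to run a fixed `latency` ticks after each expiry, and both function names are hypothetical. It compares re-arming a periodic timer relative to "now" with re-arming from the previous absolute expiry.

```c
#include <assert.h>

/*
 * Re-arm relative to "now": each time the handler runs (latency ticks
 * after the programmed expiry) it schedules the next expiry at
 * now + period, so the lateness accumulates.
 * Returns the tick at which the n-th expiry (n >= 1) is programmed.
 */
static unsigned long nth_expiry_relative(unsigned long start,
                                         unsigned long period,
                                         unsigned long latency,
                                         unsigned int n)
{
    assert(n >= 1);
    unsigned long expires = start + period;
    for (unsigned int i = 1; i < n; i++) {
        unsigned long now = expires + latency; /* handler runs late */
        expires = now + period;                /* skew is carried forward */
    }
    return expires;
}

/*
 * Re-arm from the previous absolute expiry: the handler still runs
 * late, but the next expiry is computed from when the timer should
 * have fired, so the schedule stays on the original grid.
 */
static unsigned long nth_expiry_absolute(unsigned long start,
                                         unsigned long period,
                                         unsigned int n)
{
    assert(n >= 1);
    unsigned long expires = start + period;
    for (unsigned int i = 1; i < n; i++)
        expires += period;                     /* latency never enters */
    return expires;
}
```

With a start tick of 1000, a period of 5 and a latency of 1, the tenth expiry lands on tick 1059 in the relative scheme and on tick 1050 in the absolute one, which is the drift both posters are describing from different angles.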
Re: rcu-refcount stacker performance
Quoting Paul E. McKenney ([EMAIL PROTECTED]):
> On Thu, Jul 14, 2005 at 09:21:07AM -0500, [EMAIL PROTECTED] wrote:
> > On July 8 I sent out a patch which re-implemented the rcu-refcounting
> > of the LSM list in stacker for the sake of supporting safe security
> > module unloading. (patch reattached here for convenience) Here are
> > some performance results with and without that patch. Tests were run
> > on a 16-way ppc64 machine. Dbench was run 50 times, and kernbench
> > and reaim were run 10 times, and intervals are 95% confidence half-
> > intervals.
> >
> > These results seem pretty poor. I'm now wondering whether this is
> > really necessary. David Wheeler's original stacker had an option
> > of simply waiting a while after a module was taken out of the list
> > of active modules before freeing the modules. Something like that
> > is of course one option. I'm hoping we can also take advantage of
> > some already known module state info to be a little less coarse
> > about it. For instance, sys_delete_module() sets m->state to
> > MODULE_STATE_GOING before calling mod->exit(). If in place of
> > doing atomic_inc(&m->use), stacker skipped the m->hook() if
> > m->state!=MODULE_STATE_LIVE, then it may be safe to assume that
> > any m->hook() should be finished before sys_delete_module() gets
> > to free_module(mod). This seems to require adding a struct
> > module argument to security/security:mod_reg_security() so an LSM
> > can pass itself along.
> >
> > So I'll try that next. Hopefully by avoiding the potential cache
> > line bounces which atomic_inc(&m->use) bring, this should provide
> > far better performance.
>
> My guess is that the reference count is indeed costing you quite a
> bit. I glance quickly at the patch, and most of the uses seem to
> be of the form:
>
>     increment ref count
>     rcu_read_lock()
>     do something
>     rcu_read_unlock()
>     decrement ref count
>
> Can't these cases rely solely on rcu_read_lock()? Why do you also
> need to increment the reference count in these cases?

The problem is on module unload: is it possible for CPU1 to be
on "do something", and sleep, and, while it sleeps, CPU2 does
rmmod(lsm), so that by the time CPU1 stops sleeping, the code it
is executing has been freed?

Because stacker won't remove the lsm from the list of modules
until mod->exit() is executed, and module_free(mod) happens
immediately after that, the above scenario seems possible.

thanks,
-serge
[PATCH] Fix the recent C-state with FADT regression
Attached patch fixes the recent C-state based on FADT regression reported
by Kevin.

Please apply.

Thanks,
Venki

Fix the regression with c1_default_handler on some systems where C-states
come from FADT.

Thanks to Kevin Radloff for identifying the issue and also root causing on
exact line of code that is causing the issue.

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

diff -purN linux-2.6.13-rc1-mm1//drivers/acpi/processor_idle.c.org linux-2.6.13-rc1-mm1//drivers/acpi/processor_idle.c
--- linux-2.6.13-rc1-mm1//drivers/acpi/processor_idle.c.org 2005-07-14 23:19:45.038854688 -0700
+++ linux-2.6.13-rc1-mm1//drivers/acpi/processor_idle.c 2005-07-14 23:21:47.292269344 -0700
@@ -881,7 +881,7 @@ static int acpi_processor_get_power_info
     result = acpi_processor_get_power_info_cst(pr);
     if ((result) || (acpi_processor_power_verify(pr) < 2)) {
         result = acpi_processor_get_power_info_fadt(pr);
-        if (result)
+        if ((result) || (acpi_processor_power_verify(pr) < 2))
             result = acpi_processor_get_power_info_default_c1(pr);
     }
Why is 2.6.12.2 less stable on my laptop than 2.6.10?
I know this is a broken record, but the development process within the
LKML isn't resulting in more stable and better code. Some process change
could be a good thing.

Why does my alps mouse pad have to stop working every time I test a new
"STABLE" kernel? Why does swsusp have to start hanging on shutdown and
startup randomly?

I rolled back my home box to 2.6.10 because I want some stability
(2.6.10 has problems with swsusp from time to time, but it's livable for
me, for now.)

The process is broken if on a stable series we cannot at least make sure
obvious regressions don't smack users between the eyes.

I see the problem as too much code flux coming from people without the
resources, or discipline, to effectively regression test for side
effects of their changes.

I know there is a lot of back patting on how well the dot-dot stability
release process is working, but that process is a solution for a
different and simpler problem, and we still have breakage.

Stability and deliberate feature design and development, along with
disciplined regression testing and validation, is what is needed.

Why can't there be more targeted and planned development? Are we in a
race to see how many changes we can push into a "stable" tree? Shouldn't
changes be regression tested, formally, before they are allowed to go
into a tree? Why can't I expect swsusp to work better and more reliably
from release to release?

I know there is a point where software goes from fun to work, but without
more deliberate and disciplined WORK I see the 2.6 tree spinning out of
control.

The problem is the process, not the code.
* The issues are too much ad-hoc code flux without enough
  disciplined/formal regression testing and review.
* Small regressions are accepted and expected to be caught later.
* Ad-hoc validation before changes are accepted.

Some possible things that could help:
* Adopt a no-regressions-allowed policy, where everything stops until
  any identified regression (in performance, functionality or stability)
  is fixed or the changes are all rolled back. This works really well if,
  in addition, organized pre-flight testing is done before calling a new
  version number. You simply cannot rely on ad-hoc regression testing and
  reporting. It's got too much latency.
* Assign validation folks that the developers need to appease before
  changes are allowed to be accepted into the tree.
* Have all changes to the kernel submitted not by the developers but by
  designated subsystem validation owners. If too many bugs continue to
  sneak by, address the problem by adding validation help to that
  subsystem or get a new owner for the problem subsystem. (<-- I like
  this one a lot.)
* start 2.7
* all of the above (<-- this one is good too)

--mgross

BTW: This may or may not be the opinion of my employer, more likely not.
pc_keyb: controller jammed (0xA7)
Hello,

I didn't find any useful answer anywhere so far, hope it's ok to ask
here.

I'm currently trying to get a 2.4.31 kernel up and running on an IBM
BladeCenter HS20/8843. (The base system is a stripped-down RH9.)

When booting the kernel the console is spammed with:

pc_keyb: controller jammed (0xA7)

But it seems there are no further consequences and the keyboard is
working.

The only answer I've found is "disable usb legacy" in the BIOS, but
that's no solution for me: there is no option to disable USB legacy
support, and it wouldn't make any sense anyway because the keyboard is a
USB device, so I really do need support for USB.

Is there a workaround? Is this an already known bug? Anything wrong on
my side?

Thanks,
Thoralf
Re: About a change to the implementation of spin lock in 2.6.12 kernel.
On Thu, 2005-07-14 at 09:21 -0700, [EMAIL PROTECTED] wrote:
> Hi Willy,
>
> I think at least I can remove the LOCK instruction when the lock is already
> held by someone else and enter the spinning wait directly, right?

If the lock is already held by someone else, the cpu is just going to
burn cycles until it's not. So why do you care?

--
Brandon Niemczyk
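The idea quoted above (look at the lock with a plain read first, and only issue the bus-locking operation once it appears free) is usually called a test-and-test-and-set lock. Below is a minimal userspace sketch using C11 atomics. It is not the kernel's `spin_lock_string`: it swaps the `lock; decb` scheme for a compare-and-exchange, and the `demo_spin_*` names and the 1 = free / 0 = held encoding are assumptions made for this illustration. The point it shows is that the final acquisition step still has to be a single atomic operation; only the waiting loop can use unlocked reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 1 means free, 0 means held. */
typedef struct {
    atomic_int slock;
} demo_spinlock_t;

static void demo_spin_init(demo_spinlock_t *l)
{
    atomic_init(&l->slock, 1);
}

/* Single atomic attempt: succeeds only if the lock was free. */
static bool demo_spin_trylock(demo_spinlock_t *l)
{
    int expected = 1;
    return atomic_compare_exchange_strong_explicit(
        &l->slock, &expected, 0,
        memory_order_acquire, memory_order_relaxed);
}

static void demo_spin_lock(demo_spinlock_t *l)
{
    for (;;) {
        /* "Test": spin on a plain load, no locked bus operation. */
        while (atomic_load_explicit(&l->slock, memory_order_relaxed) != 1)
            ;
        /* "Test-and-set": the acquisition itself must be atomic,
         * because another CPU may take the lock between the load
         * above and this point. */
        if (demo_spin_trylock(l))
            return;
    }
}

/* The holder is the only writer here, so a plain release store suffices. */
static void demo_spin_unlock(demo_spinlock_t *l)
{
    atomic_store_explicit(&l->slock, 1, memory_order_release);
}
```

Whether the extra read helps depends on contention: when almost every acquisition succeeds on the first try, as the Lockmeter figure in this thread suggests, the pre-check adds an instruction to the common path and saves little.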
[PATCH] RealTimeSync Patch
Hello,

I would like to get some feedback on this patch for the kernel. Its sole
purpose is to help reduce boot time by not waiting to synchronize the
clock edge with the hardware clock. When combined with other boot
reduction patches, this can bring the kernel boot time to well under 10
seconds, in most cases two or three seconds. In a desktop system this
patch is probably insignificant; however, several patches like this in a
set top box or cell phone will be significant.

I understand that there may be some concerns with patches like these, so
I would like to start a discussion so that I can better understand what
the issues are. The members of the CELinux Forum have quite a bit we
would like to contribute. Looking at the archives I see that an Intel
patch was submitted back in October, but I am unable to determine what
the resolution was.

The patch included here is for PPC, but other architectures are available
on the patch web site below.

Detailed information on the patch can be found here:
http://tree.celinuxforum.org/CelfPubWiki/RTCNoSync

In addition, other patches for boot time reduction can be found here:
http://tree.celinuxforum.org/CelfPubWiki/PatchArchive

Elias Kesh
[EMAIL PROTECTED]

*
* Fast boot options
*
Fast boot options (FASTBOOT) [N/y/?] (NEW) y
  Disable synch on read of Real Time Clock (RTC_NO_SYNC) [N/y/?] (NEW) y

diff -u -pruN -X ../dontdiff linux-2.6.12/arch/ppc/kernel/time.c linux-2.6.12_rtc_patch/arch/ppc/kernel/time.c
--- linux-2.6.12/arch/ppc/kernel/time.c 2005-06-17 21:48:29.0 +0200
+++ linux-2.6.12_rtc_patch/arch/ppc/kernel/time.c 2005-07-02 00:27:37.0 +0200
@@ -282,8 +282,12 @@ EXPORT_SYMBOL(do_settimeofday);
 /* This function is only called on the boot processor */
 void __init time_init(void)
 {
-    time_t sec, old_sec;
-    unsigned old_stamp, stamp, elapsed;
+    time_t sec;
+    unsigned stamp;
+#ifndef CONFIG_RTC_NO_SYNC
+    time_t old_sec;
+    unsigned old_stamp, elapsed;
+#endif
     if (ppc_md.time_init != NULL)
         time_offset = ppc_md.time_init();
@@ -308,6 +312,7 @@ void __init time_init(void)
     stamp = get_native_tbl();
     if (ppc_md.get_rtc_time) {
         sec = ppc_md.get_rtc_time();
+#ifndef CONFIG_RTC_NO_SYNC
         elapsed = 0;
         do {
             old_stamp = stamp;
@@ -320,6 +325,7 @@ void __init time_init(void)
         } while ( sec == old_sec && elapsed < 2*HZ*tb_ticks_per_jiffy);
         if (sec==old_sec)
             printk("Warning: real time clock seems stuck!\n");
+#endif
         xtime.tv_sec = sec;
         xtime.tv_nsec = 0;
         /* No update now, we just read the time from the RTC ! */
diff -u -pruN -X ../dontdiff linux-2.6.12/init/Kconfig linux-2.6.12_rtc_patch/init/Kconfig
--- linux-2.6.12/init/Kconfig 2005-06-17 21:48:29.0 +0200
+++ linux-2.6.12_rtc_patch/init/Kconfig 2005-07-02 00:27:37.0 +0200
@@ -275,6 +275,33 @@ config KALLSYMS_EXTRA_PASS
        reported. KALLSYMS_EXTRA_PASS is only a temporary workaround while
        you wait for kallsyms to be fixed.
+menuconfig FASTBOOT
+    bool "Fast boot options"
+    help
+      Say Y here to select among various options that can decrease
+      kernel boot time. These options may involve providing
+      hardcoded values for some parameters that the kernel usually
+      determines automatically.
+
+      This option is useful primarily on embedded systems.
+
+      If unsure, say N.
+
+config RTC_NO_SYNC
+    bool "Disable synch on read of Real Time Clock" if FASTBOOT
+    default n
+    help
+      The Real Time Clock is read aligned by default. That means a
+      series of reads of the RTC are done until it's verified that
+      the RTC's state has just changed. If you enable this feature,
+      this synchronization will not be performed. The result is that
+      the machine will boot up to 1 second faster.
+
+      A drawback is that, with this option enabled, your system
+      clock may drift from the correct value over the course
+      of several boot cycles (under certain circumstances).
+
+      If unsure, say N.
 config PRINTK
     default y
Re: About a change to the implementation of spin lock in 2.6.12 kernel.
Hi Willy,

I think at least I can remove the LOCK instruction when the lock is
already held by someone else and enter the spinning wait directly, right?

0:  cmpb $0, slp
    jle 2f          # lock is not available, then spinning directly without locking the bus
1:  lock; decb slp  # lock the bus and atomically decrement
    jns 3f          # if clear sign bit jump forward to 3
2:  pause           # spin - wait
    cmpb $0,slp     # spin - compare to 0
    jle 2b          # spin - go back to 2 if <= 0 (locked)
    jmp 1b          # unlocked; go back to 1 to try to lock again
3:                  # we have acquired the lock

But based on the Lockmeter report, the lock success is dominant 99.8%,
so maybe this will not make much change.

Thanks,
Liang

----- Original Message -----
From: "Willy Tarreau" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc:
Sent: Wednesday, July 13, 2005 10:16 PM
Subject: Re: About a change to the implementation of spin lock in 2.6.12 kernel.

Hi,

On Wed, Jul 13, 2005 at 07:20:06PM -0700, [EMAIL PROTECTED] wrote:
> Hi,
> I found _spin_lock used a LOCK instruction to make the following
> operation "decb %0" atomic. As you know, LOCK instruction alone takes
> almost 70 clock cycles to finish and this add lots of cost to the
> _spin_lock. However _spin_unlock does not use this LOCK instruction and
> it uses "movb $1,%0" instead since 4-byte writes on 4-byte aligned
> addresses are atomic.

_spin_unlock does not need locked operations because when it is run, the
code is already known to be the only one to hold the lock, so it can
release it without checking what others do.

> So I want rewrite the _spin_lock defined spinlock.h
> (/linux/include/asm-i386) as follows to reduce the overhead of
> _spin_lock and make it more efficient.

It does not work. You cannot write an inter-cpu atomic test-and-set with
several unlocked instructions.

> #define spin_lock_string \
>     "\n1:\t" \
>     "cmpb $0,%0\n\t" \
>     "jle 2f\n\t" \

==> here, another thread or CPU can get the lock simultaneously.

>     "movb $0, %0\n\t" \
>     "jmp 3f\n" \
>     "2:\t" \
>     "rep;nop\n\t" \
>     "cmpb $0, %0\n\t" \
>     "jle 2b\n\t" \
>     "jmp 1b\n" \
>     "3:\n\t"
>
> Compared with the original version as follows, LOCK instruction is
> removed. I rebuilt the Intel e1000 Gigabit driver with this _spin_lock.
> There is about 2% throughput improvement.
>
> #define spin_lock_string \
>     "\n1:\t" \
>     "lock ; decb %0\n\t" \
>     "jns 3f\n" \
>     "2:\t" \
>     "rep;nop\n\t" \
>     "cmpb $0,%0\n\t" \
>     "jle 2b\n\t" \
>     "jmp 1b\n" \
>     "3:\n\t"
>
> Do you think I can get a better performance if I dig further? Any ideas
> will be greatly appreciated,

well, of course with those methods you can improve performance, but you
lose the warranty that you're alone to get a lock, and that's bad.

another similar method to get a lock in some very controlled environment
is as follows :

1:  cmp $0, %0
    jne 1b
    mov $CPUID, %0
    membar
    cmp $CPUID, %0
    jne 1b

This only works with same speed CPUs and interrupts disabled. But in
todays environments, this is very risky (hyperthreaded CPUs, etc...).
However, this is often OK for more deterministic CPUs such as
microcontrollers.

Regards,
Willy
Re: fdisk: What do plus signs after "Blocks" mean?
Hi kernel.

* kernel <[EMAIL PROTECTED]> dixit:
> First 446 bytes are boot code and all
> Next 64 bytes are for 4 partition records, 16 bytes each
> Last 2 bytes are signature

And that's right, but only for the MBR. If you set up an extended partition in the MBR, the partition table for that extended partition is on the boot record of the extended partition. If you just back up the MBR, you only back up the *declaration* of the extended partition (where it starts, where it ends, etc.) but NOT the partition table of the extended partition (that is, the partitions within the extended partition).

To store that, you have to back up the first sector of the extended partition itself. And you have to do it recursively if you want to back up any partition setup, no matter how strange.

I hope I've made this clear; it is a bit difficult to explain without a couple of diagrams O:)

Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736 | http://www.dervishd.net
http://www.pleyades.net & http://www.gotesdelluna.net
It's my PC and I'll cry if I want to...
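For reference, the 446 + 64 + 2 byte layout quoted above can be written down as a C struct. This is a sketch under assumptions: the struct and field names are invented for the example, the per-record field order follows the conventional PC partition record, and the extended-partition type codes (0x05, 0x0f, 0x85) are the commonly used ones, not something stated in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* One 16-byte partition record (field names are illustrative). */
struct mbr_part_entry {
    uint8_t status;          /* 0x80 = bootable */
    uint8_t chs_first[3];
    uint8_t type;            /* partition type code */
    uint8_t chs_last[3];
    uint8_t lba_first[4];    /* little-endian; bytes avoid padding issues */
    uint8_t sector_count[4]; /* little-endian */
};

/* The whole 512-byte sector: boot code, 4 records, signature. */
struct mbr_sector {
    uint8_t boot_code[446];
    struct mbr_part_entry part[4];
    uint8_t signature[2];    /* 0x55, 0xAA */
};

_Static_assert(sizeof(struct mbr_part_entry) == 16, "record is 16 bytes");
_Static_assert(offsetof(struct mbr_sector, part) == 446, "table at 446");
_Static_assert(sizeof(struct mbr_sector) == 512, "sector is 512 bytes");

/* Decode a little-endian 32-bit field. */
static uint32_t le32(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Nonzero if the type code marks an extended partition, i.e. a record
 * whose first sector holds a further partition table to be saved too. */
static int is_extended(uint8_t type)
{
    return type == 0x05 || type == 0x0f || type == 0x85;
}
```

A backup tool along these lines would save the MBR, then for every record where is_extended() is true also save the sector at le32(lba_first), and repeat from there.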
[RFC,PATCH] RCU and CONFIG_PREEMPT_RT semi-sane patch
Hello!

The attached patch passed about 36 hours of torture test on each of two 4-CPU x86 machines (about 100 passes through the torture-test script), so am officially declaring it to be semi-sane.

That said, on eight runs of kernbench+LTP (also on 4-CPU x86 machines), only six passed, and the other two hung in LTP (but both did make it through five rounds of kernbench). So there are still some problems in there somewhere. The hangs were such that I got no debug info of any sort :-/ , so will be continuing testing. But this patch might be acceptable to courageous users of the CONFIG_PREEMPT_RT patch who aren't too concerned about SMP performance and scalability. ;-)

The following caveats still apply:

o	Still have heavyweight operations in rcu_read_lock() and rcu_read_unlock(). Will work on removing the atomic_inc() and atomic_dec() first, with the memory barriers later.

o	Global callback queues result in poor SMP performance. On the list to fix. Will likely require handling CPU hotplug (current code is too stupid to need to care about CPU hotplug).

o	Grace-period-detection code is probably too aggressive, but will worry about that later. This will interact with OOM on systems with small memories.

o	There are likely still bugs. Which might cause occasional hangs. ;-)

o	Applies against V0.7.51-27 of Ingo's patch. However, the code (but not the patch) should work in a stock kernel as well as in the CONFIG_PREEMPT_RT environment.

Thanks to Steve Rostedt and Bill Huey for their help with this!

Thoughts?

						Thanx, Paul

PS. Will be on travel next week, so response time may be a bit slow.
Signed-off-by: <[EMAIL PROTECTED]> diff -urpN -X dontdiff linux-2.6.12-realtime-preempt-V0.7.51-27/fs/proc/proc_misc.c linux-2.6.12-realtime-preempt-V0.7.51-27-ctrRCU/fs/proc/proc_misc.c --- linux-2.6.12-realtime-preempt-V0.7.51-27/fs/proc/proc_misc.c 2005-07-13 14:52:43.0 -0700 +++ linux-2.6.12-realtime-preempt-V0.7.51-27-ctrRCU/fs/proc/proc_misc.c 2005-07-13 14:54:10.0 -0700 @@ -599,6 +599,38 @@ void create_seq_entry(char *name, mode_t entry->proc_fops = f; } +#ifdef CONFIG_RCU_STATS +int rcu_read_proc(char *page, char **start, off_t off, + int count, int *eof, void *data) +{ + int len; + extern int rcu_read_proc_data(char *page); + + len = rcu_read_proc_data(page); + return proc_calc_metrics(page, start, off, count, eof, len); +} + +int rcu_read_proc_gp(char *page, char **start, off_t off, +int count, int *eof, void *data) +{ + int len; + extern int rcu_read_proc_gp_data(char *page); + + len = rcu_read_proc_gp_data(page); + return proc_calc_metrics(page, start, off, count, eof, len); +} + +int rcu_read_proc_ptrs(char *page, char **start, off_t off, + int count, int *eof, void *data) +{ + int len; + extern int rcu_read_proc_ptrs_data(char *page); + + len = rcu_read_proc_ptrs_data(page); + return proc_calc_metrics(page, start, off, count, eof, len); +} +#endif /* #ifdef CONFIG_RCU_STATS */ + void __init proc_misc_init(void) { struct proc_dir_entry *entry; @@ -621,6 +653,11 @@ void __init proc_misc_init(void) {"cmdline", cmdline_read_proc}, {"locks", locks_read_proc}, {"execdomains", execdomains_read_proc}, +#ifdef CONFIG_RCU_STATS + {"rcustats",rcu_read_proc}, + {"rcugp", rcu_read_proc_gp}, + {"rcuptrs", rcu_read_proc_ptrs}, +#endif /* #ifdef CONFIG_RCU_STATS */ {NULL,} }; for (p = simple_ones; p->name; p++) diff -urpN -X dontdiff linux-2.6.12-realtime-preempt-V0.7.51-27/include/linux/rcupdate.h linux-2.6.12-realtime-preempt-V0.7.51-27-ctrRCU/include/linux/rcupdate.h --- linux-2.6.12-realtime-preempt-V0.7.51-27/include/linux/rcupdate.h 2005-07-13 14:52:43.0 -0700 
+++ linux-2.6.12-realtime-preempt-V0.7.51-27-ctrRCU/include/linux/rcupdate.h 2005-07-13 14:54:10.0 -0700 @@ -59,6 +59,7 @@ struct rcu_head { } while (0) +#ifndef CONFIG_PREEMPT_RCU /* Global control variables for rcupdate callback mechanism. */ struct rcu_ctrlblk { @@ -209,6 +210,18 @@ static inline int rcu_pending(int cpu) # define rcu_read_unlock preempt_enable #endif +#else /* #ifndef CONFIG_PREEMPT_RCU */ + +#define rcu_qsctr_inc(cpu) +#define rcu_bh_qsctr_inc(cpu) +#define call_rcu_bh(head, rcu) call_rcu(head, rcu) + +extern void rcu_read_lock(void); +extern void rcu_read_unlock(void); +extern int rcu_pending(int cpu); + +#endif /* #else #ifndef CONFIG_PREEMPT_RCU */ + /* * So where is rcu_write_lock()? It does not exist, as there is no * way for writers to lock out RCU readers. This is a feature, not @@ -230,16 +243,22 @@ static inline int
Re: RT and XFS
On Thu, Jul 14, 2005 at 08:56:58AM -0700, Daniel Walker wrote:
> On Thu, 2005-07-14 at 07:23 +0200, Ingo Molnar wrote:
> > * Daniel Walker <[EMAIL PROTECTED]> wrote:
> >
> > > > The whole point of using a semaphore in the pagebuf is because there
> > > > is no tracking of who "owns" the lock so we can actually release it
> > > > in a different context. Semaphores were invented for this purpose,
> > > > and we use them in the way they were intended. ;)
> > >
> > > Where is the that semaphore spec, is that posix ? There is a new
> > > construct called "complete" that is good for this type of stuff too.
> > > No owner needed , just something running, and something waiting till
> > > it completes.
> >
> > wrt. posix, we dont really care about that for kernel-internal
> > primitives like struct semaphore. So whether it's posix or not has no
> > relevance.
>
> This reminds me of Documentation/stable_api_nonsense.txt . That no one
> should really be dependent on a particular kernel API doing a particular
> thing. The kernel is play dough for the kernel hacker (as it should be),
> including kernel semaphores.
>
> So we can change whatever we want, and make no excuses, as long as we
> fix the rest of the kernel to work with our change. That seems pretty
> sensible , because Linux should be an evolution.
>
> Daniel
---end quoted text---
Re: RT and XFS
On Thu, Jul 14, 2005 at 08:56:58AM -0700, Daniel Walker wrote:
> This reminds me of Documentation/stable_api_nonsense.txt . That no one
> should really be dependent on a particular kernel API doing a particular
> thing. The kernel is play dough for the kernel hacker (as it should be),
> including kernel semaphores.
>
> So we can change whatever we want, and make no excuses, as long as we
> fix the rest of the kernel to work with our change. That seems pretty
> sensible , because Linux should be an evolution.

Daniel, get a fucking clue. Read some CS 101 literature on what a semaphore is defined to be.

If you want PI singing dancing blinking christmas tree locking primitives call them a mutex, but not a semaphore.
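The property this thread keeps coming back to, that a semaphore has no owner and may be released from a context other than the one waiting on it, can be shown with a small userspace analogue. This is a sketch using POSIX semaphores and threads, not the kernel's struct semaphore; the names io_done, completer and wait_for_other_context are invented for the example.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t io_done;   /* hypothetical: signals "the I/O finished" */

/* Runs in a second thread and releases a semaphore it never acquired. */
static void *completer(void *arg)
{
    (void)arg;
    sem_post(&io_done);
    return NULL;
}

/* Returns 0 once the other thread has released the semaphore. */
static int wait_for_other_context(void)
{
    pthread_t t;

    if (sem_init(&io_done, 0, 0) != 0)      /* starts "held" */
        return -1;
    if (pthread_create(&t, NULL, completer, NULL) != 0) {
        sem_destroy(&io_done);
        return -1;
    }

    /* Block until someone else posts; retry if interrupted by a signal. */
    while (sem_wait(&io_done) == -1 && errno == EINTR)
        continue;

    pthread_join(t, NULL);
    sem_destroy(&io_done);
    return 0;
}
```

A mutex with owner tracking (or priority inheritance) cannot be used this way, since the unlocking thread would not be the owner.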
Re: pci_size() error condition
> It was always effectual for IO where the mask is 0x.

Okay, point taken :)

So for cases of base == maxbase, why would we ever want to return a nonzero value? What is the intended purpose of the second part of that conditional?
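To make the question concrete, here is a sketch of BAR sizing logic of the kind being discussed. It is an assumption on my part that the conditional has the shape `base == maxbase && ((base | size) & mask) != mask`; the function name bar_size_sketch is made up, and this should not be read as a quotation of the kernel's pci_size().

```c
#include <stdint.h>

/* base: value read before probing; maxbase: value read back after
 * writing all ones; mask: which bits of the register are address bits. */
static uint32_t bar_size_sketch(uint32_t base, uint32_t maxbase, uint32_t mask)
{
    uint32_t size = mask & maxbase;     /* keep only the address bits */

    if (!size)
        return 0;                       /* nothing is decoded */

    /* The lowest set bit gives the decode granularity; subtracting one
     * turns it into the extent (size - 1). */
    size = (size & ~(size - 1)) - 1;

    /* base == maxbase is believable only if the register already held
     * all ones in its writable bits. Otherwise it simply did not react
     * to the probe, and there is no size to report. */
    if (base == maxbase && ((base | size) & mask) != mask)
        return 0;

    return size;
}
```

Under this reading, the second half of the conditional is what lets a BAR that was already programmed to its maximum value still report a nonzero size, while a register that ignores writes reports zero.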
Re: [rfc patch 2/2] direct-io: remove address alignment check
On Thu, 2005-07-14 at 06:18, Andi Kleen wrote:
> Daniel McNeil <[EMAIL PROTECTED]> writes:
>
> > This patch relaxes the direct i/o alignment check so that user addresses
> > do not have to be a multiple of the device block size.
>
> The original reason for this limit was that lots of drivers
> (not only IDE) explode when you give them odd sizes. Sometimes
> it is even worse.
>
> I doubt all of them have been fixed.
>
> Very risky change.
>

That is exactly why I made this a separate patch, so that we can test and find out where the problems are and work to fix them.

Are there problems only with odd sizes, or do drivers have problems with non-512 sizes?

Allowing 4-byte aligned user addresses would be a good step forward, since it looks like malloc() returns 4-byte aligned addresses.

Daniel
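The kind of check being relaxed can be illustrated in a few lines of C. This is a generic sketch, not the direct-io code itself: is_aligned is a made-up helper, and the 512-byte block size in the usage note is only an example value.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if addr is a multiple of align. align must be a power of two,
 * which is what makes the mask trick valid. */
static int is_aligned(const void *addr, size_t align)
{
    return ((uintptr_t)addr & (align - 1)) == 0;
}
```

With a 512-byte block size, an address such as 4100 fails is_aligned(addr, 512) but passes is_aligned(addr, 4), which is the case the relaxed check would start to accept.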
Re: Linux On-Demand Network Access (LODNA)
Take a look at FUSE, it should be able to do all you need
Re: RT and XFS
On Thu, 2005-07-14 at 07:23 +0200, Ingo Molnar wrote:
> * Daniel Walker <[EMAIL PROTECTED]> wrote:
>
> > > The whole point of using a semaphore in the pagebuf is because there
> > > is no tracking of who "owns" the lock so we can actually release it
> > > in a different context. Semaphores were invented for this purpose,
> > > and we use them in the way they were intended. ;)
> >
> > Where is the that semaphore spec, is that posix ? There is a new
> > construct called "complete" that is good for this type of stuff too.
> > No owner needed , just something running, and something waiting till
> > it completes.
>
> wrt. posix, we dont really care about that for kernel-internal
> primitives like struct semaphore. So whether it's posix or not has no
> relevance.

This reminds me of Documentation/stable_api_nonsense.txt . That no one should really be dependent on a particular kernel API doing a particular thing. The kernel is play dough for the kernel hacker (as it should be), including kernel semaphores.

So we can change whatever we want, and make no excuses, as long as we fix the rest of the kernel to work with our change. That seems pretty sensible , because Linux should be an evolution.

Daniel
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 08:02 -0700, Christoph Lameter wrote:
> I doubt that increasing the timer frequency is the way to go to solve
> these issues. HZ should be as low as possible and we should strive for
> a tickless system.

Agreed. Most of those applications are driven by their own interrupt source anyway.

I do think Linus' proposal, or even copying what Windows does, would be a big improvement over the fixed tick rate.

Lee
Re: rcu-refcount stacker performance
On Thu, Jul 14, 2005 at 09:21:07AM -0500, [EMAIL PROTECTED] wrote:
> On July 8 I sent out a patch which re-implemented the rcu-refcounting
> of the LSM list in stacker for the sake of supporting safe security
> module unloading. (patch reattached here for convenience) Here are
> some performance results with and without that patch. Tests were run
> on a 16-way ppc64 machine. Dbench was run 50 times, and kernbench
> and reaim were run 10 times, and intervals are 95% confidence half-
> intervals.
>
> These results seem pretty poor. I'm now wondering whether this is
> really necessary. David Wheeler's original stacker had an option
> of simply waiting a while after a module was taken out of the list
> of active modules before freeing the modules. Something like that
> is of course one option. I'm hoping we can also take advantage of
> some already known module state info to be a little less coarse
> about it. For instance, sys_delete_module() sets m->state to
> MODULE_STATE_GOING before calling mod->exit(). If in place of
> doing atomic_inc(&m->use), stacker skipped the m->hook() if
> m->state!=MODULE_STATE_LIVE, then it may be safe to assume that
> any m->hook() should be finished before sys_delete_module() gets
> to free_module(mod). This seems to require adding a struct
> module argument to security/security:mod_reg_security() so an LSM
> can pass itself along.
>
> So I'll try that next. Hopefully by avoiding the potential cache
> line bounces which atomic_inc(&m->use) bring, this should provide
> far better performance.

My guess is that the reference count is indeed costing you quite a bit. I glanced quickly at the patch, and most of the uses seem to be of the form:

	increment ref count
	rcu_read_lock()
	do something
	rcu_read_unlock()
	decrement ref count

Can't these cases rely solely on rcu_read_lock()? Why do you also need to increment the reference count in these cases?
						Thanx, Paul

> thanks,
> -serge
>
> Dbench (throughput, larger is better)
>
> plain stacker:    1531.448400 +/- 15.791116
> stacker with rcu: 1408.056200 +/- 12.597277
>
> Kernbench (runtime, smaller is better)
>
> plain stacker:    52.341000 +/- 0.184995
> stacker with rcu: 53.722000 +/- 0.161473
>
> Reaim (numjobs, larger is better) (gnuplot-friendly format)
> plain stacker:
> --
> Numforked   jobs/minute       95% CI
> 1           106662.857000     5354.267865
> 3           301628.571000     6297.121934
> 5           488142.858000     16031.685536
> 7           673200.00         23994.030784
> 9           852428.57         31485.607271
> 11          961714.29         0.00
> 13          1108157.144000    27287.525982
> 15          1171178.571000    49790.796869
>
> Reaim (numjobs, larger is better) (gnuplot-friendly format)
> plain stacker:
> --
> Numforked   jobs/minute       95% CI
> 1           100542.857000     2099.040645
> 3           266657.139000     6297.121934
> 5           398892.858000     12023.765252
> 7           467670.00         14911.383385
> 9           418648.352000     11665.751441
> 11          396825.00         8700.115252
> 13          357480.912000     7567.947838
> 15          337571.428000     2332.267703
>
> Patch:
>
> Index: linux-2.6.12/security/stacker.c
> ===
> --- linux-2.6.12.orig/security/stacker.c	2005-07-08 13:43:15.0 -0500
> +++ linux-2.6.12/security/stacker.c	2005-07-08 16:21:54.0 -0500
> @@ -33,13 +33,13 @@
>
>  struct module_entry {
>  	struct list_head lsm_list;	/* list of active lsms */
> -	struct list_head all_lsms;	/* list of all lsms */
>  	char *module_name;
>  	int namelen;
>  	struct security_operations module_operations;
> +	struct rcu_head m_rcu;
> +	atomic_t use;
>  };
>  static struct list_head stacked_modules; /* list of stacked modules */
> -static struct list_head all_modules; /* list of all modules, including freed */
>
>  static short sysfsfiles_registered;
>
> @@ -84,6 +84,32 @@ MODULE_PARM_DESC(debug, "Debug enabled o
>   * We return as soon as an error is returned.
>   */
>
> +static inline void stacker_free_module(struct module_entry *m)
> +{
> +	kfree(m->module_name);
> +	kfree(m);
> +}
> +
> +/*
> + * Version of stacker_free_module called from call_rcu
> + */
> +static void free_mod_fromrcu(struct rcu_head *head)
> +{
> +	struct module_entry *m;
> +
> +	m = container_of(head, struct module_entry, m_rcu);
> +	stacker_free_module(m);
> +}
> +
> +static void stacker_del_module(struct rcu_head *head)
> +{
> +	struct module_entry *m;
> +
> +	m = container_of(head, struct module_entry, m_rcu);
> +	if
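The alternative floated in the quoted text above, skipping a module's hook when its state is not LIVE instead of taking a per-call reference, can be sketched as a plain list walk. This is a userspace toy under stated assumptions: the types and names (mod_state, lsm_entry, call_live_hooks) are invented, there is no RCU or concurrency here, and it does not answer the open question of whether the state can change right after the check.

```c
#include <stddef.h>

/* Loosely modelled on the MODULE_STATE_* idea; names are illustrative. */
enum mod_state { MOD_LIVE, MOD_COMING, MOD_GOING };

struct lsm_entry {
    enum mod_state state;
    int (*hook)(void);
    struct lsm_entry *next;
};

/* Example hook that reports success. */
static int hook_ok(void)
{
    return 0;
}

/* Calls the hook of every live entry and returns as soon as one of them
 * reports an error. *called counts how many hooks actually ran. */
static int call_live_hooks(const struct lsm_entry *head, int *called)
{
    int ret = 0;

    for (const struct lsm_entry *m = head; m != NULL; m = m->next) {
        if (m->state != MOD_LIVE)
            continue;   /* being loaded or unloaded: skip, no refcount */
        (*called)++;
        ret = m->hook();
        if (ret)
            break;
    }
    return ret;
}
```

The point of the sketch is only that the fast path touches no shared counter, which is where the cache-line bouncing in the measured version comes from.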
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 14 Jul 2005, Lee Revell wrote:
> On Thu, 2005-07-14 at 10:38 +0200, Ingo Molnar wrote:
> > - there are real-time applications (robotic environments: fast rotating
> >   tools, media and mobile/phone applications, etc.) that want 10
> >   usecs precision. If such users increased HZ to 100,000 or even
> >   1000,000, the current timer implementation would start to creek: e.g.
> >   jiffies on 32-bit systems would wrap around in 11 hours or 1.1 hours.
> >   (To solve this cleanly, pretty much the only solution seems to be to
> >   increase the timeout to a 64 bit value. A non-issue for 64-bit
> >   systems, that's why i think we could eventually look at this
> >   possibility, once all the other problems are hashed out.)
> >
>
> Those types of systems will not be 64 bit for many, many years, if
> ever...

Linux can already provide a response time within < 3 usecs from user space using f.e. the Altix RTC driver which can generate an interrupt that then sends a signal to an application. The Altix RTC clock is supported via POSIX timer syscalls and can be accessed using CLOCK_SGI_CYCLE. This has been available in Linux since last fall and events can be specified with 50 nanoseconds accuracy.

Other clock sources like HPET could do the same if someone would be willing to provide the hookup to the posix layer.

I doubt that increasing the timer frequency is the way to go to solve these issues. HZ should be as low as possible and we should strive for a tickless system.
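The wraparound figures quoted above follow from simple arithmetic: a 32-bit counter incremented HZ times per second wraps after 2^32 / HZ seconds. A small helper makes that easy to check; wrap_seconds_32 is a made-up name for this illustration.

```c
/* Seconds until a 32-bit tick counter that advances hz times per
 * second wraps around. */
static double wrap_seconds_32(unsigned long hz)
{
    return 4294967296.0 / (double)hz;   /* 2^32 ticks */
}
```

At HZ = 100,000 this gives roughly 42,950 seconds, a little under 12 hours, and at HZ = 1,000,000 roughly 4,295 seconds, a little under 1.2 hours, which matches the order of magnitude given in the quoted message.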
Re: Merging relayfs?
Roman Zippel writes:
> Hi,
>
> On Mon, 11 Jul 2005, Andrew Morton wrote:
>
> > > > Hi Andrew, can you please merge relayfs? It provides a low-overhead
> > > > logging and buffering capability, which does not currently exist in
> > > > the kernel.
> > >
> > > While the code is pretty nicely in shape it seems rather pointless to
> > > merge until an actual user goes with it.
> >
> > Ordinarily I'd agree. But this is a bit like kprobes - it's a funny thing
> > which other kernel features rely upon, but those features are often ad-hoc
> > and aren't intended for merging.
>
> I agree with Christoph, I'd like to see a small (and useful) example
> included, which can be used as reference. relayfs client still need some
> code of their own to communicate with user space. If I look at the example
> code I'm not really sure netlink is a good way to go as control channel.
> kprobes has a rather simple interface, relayfs is more complex and I think
> it's a good idea to provide some sane and complete example code to copy
> from.
>

The netlink control channel seems to work very well, but I can certainly change the examples to use something different. Could you suggest something?

> Looking through the patch there are still a few areas I'm concerned about:
> - the usage of atomic_t look a little silly, there is only a single
>   writer and probably needs some cache line optimisations

The only things that are atomic are the counts of produced and consumed buffers and these are only ever updated or read in the slow buffer-switch path. They're atomic because if they weren't, wouldn't it be possible for the client to read an unfinished value if the producer was in the middle of updating it?

> - I would prefer "unsigned int" over just "unsigned"
> - the padding/commit arrays can be easily managed by the client

Yes, I can move them out and update the examples to reflect that, but I thought that if this was something that most clients would need to do, it made some sense to keep it in relayfs and avoid duplication in the clients.

> - overwrite mode can be implemented via the buffer switch callback

The buffer switch callback is already where this is handled, unless you're thinking of something else - one of the first checks in the buffer switch is relay_buf_full(), which always returns 0 if the buffer is in overwrite mode.

Tom
Re: resuming swsusp twice
Andy Isaacson wrote:
> Yesterday I booted my laptop to 2.6.13-rc2-mm1, suspended to swsusp, and
> then resumed. It ran fine overnight, including a fair amount of IO
> (running firefox, rsyncing ~/Mail/archive from my mail server, hg pull,
> etc). This morning I did a swsusp:
>
> echo shutdown > /sys/power/disk
> echo disk > /sys/power/state
>
> and got a panic along the lines of "Unable to find swap space, try

a panic? it should only be an error message, but the machine should still be alive.

> swapon -a". Unfortunately I was in a hurry and didn't record the error
> messages. I powered off, then a few minutes later powered on again.

Powered off hard or "shutdown -h now"?

> At this point, it resumed *to the swsusp state from yesterday*!
> As soon as I realized what had happened, I powered off (not
> shutdown) and rebooted.

Good.

> On the next boot it did not find a swsusp signature and booted normally;
> ext3 did a normal recovery and seemed OK, but I was suspicious and did a
> fsck -f, which revealed a lot of damage; most of the damage seemed to be

this is expected in this case, unfortunately.

> in the hg repo which had been pulled from www.kernel.org/hg/.
>
> It's extremely unfortunate that there is *any* failure mode in swsusp
> that can result in this behavior.

I of course won't say that this cannot happen, but by design, the swsusp signature is invalidated even before reading the image, so theoretically it should not happen.

> I will try to reproduce, but I'm curious if anyone else has seen this.

i have not seen anything like that, but i am not always running the latest & greatest kernel.

-- 
Stefan Seyfried                  \ "I didn't want to write for pay. I
QA / R Team Mobile Devices        \ wanted to be paid for what I write."
SUSE LINUX Products GmbH, Nürnberg \                    -- Leonard Cohen
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 10:38 +0200, Ingo Molnar wrote:
> - there are real-time applications (robotic environments: fast rotating
>   tools, media and mobile/phone applications, etc.) that want 10
>   usecs precision. If such users increased HZ to 100,000 or even
>   1000,000, the current timer implementation would start to creek: e.g.
>   jiffies on 32-bit systems would wrap around in 11 hours or 1.1 hours.
>   (To solve this cleanly, pretty much the only solution seems to be to
>   increase the timeout to a 64 bit value. A non-issue for 64-bit
>   systems, that's why i think we could eventually look at this
>   possibility, once all the other problems are hashed out.)
>

Those types of systems will not be 64 bit for many, many years, if ever...

Lee
[PATCH] visws: reexport pm_power_off
On Wednesday 13 July 2005 17:38, James Bottomley wrote:
> [PATCH] Remove i386_ksyms.c, almost
>
> made files like smp.c do their own EXPORT_SYMBOLS. This means that all
> subarchitectures that override these symbols now have to do the exports
> themselves. This patch adds the exports for voyager (which is the most
> affected since it has a separate smp harness). However, someone should
> audit all the other subarchitectures to see if any others got broken.

Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
---

 arch/i386/mach-visws/reboot.c |    1 +
 1 files changed, 1 insertion(+)

--- linux-vanilla/arch/i386/mach-visws/reboot.c	2005-07-13 19:45:59.0 +0400
+++ linux-visws/arch/i386/mach-visws/reboot.c	2005-07-14 18:53:23.0 +0400
@@ -7,6 +7,7 @@
 #include "piix4.h"
 
 void (*pm_power_off)(void);
+EXPORT_SYMBOL(pm_power_off);
 
 void machine_restart(char * __unused)
 {
Re: XFS corruption on move from xscale to i686
On Thu, Jul 14, 2005 at 05:45:15PM +0300, Yura Pakhuchiy wrote:
> Yes, but a lot of people use older versions of compilers and suffer
> from this bug.
> I personally was very unhappy when I lost my data.

then host the patch somewhere and make sure to apply it.
Re: XFS corruption on move from xscale to i686
2005/7/14, Christoph Hellwig <[EMAIL PROTECTED]>:
> On Thu, Jul 14, 2005 at 04:50:01PM +0300, Yura Pakhuchiy wrote:
> > 2005/7/14, Nathan Scott <[EMAIL PROTECTED]>:
> > > On Wed, Jul 13, 2005 at 06:22:28PM +0300, Yura Pakhuchiy wrote:
> > > > I found patch by Greg Ungreger to fix this problem, but why it's still
> > > > not in mainline? Or it's a gcc problem and should be fixed by gcc folks?
> > >
> > > Yes, IIRC the patch was incorrect for other platforms, and it sure
> > > looked like an arm-specific gcc problem (this was ages back, so
> > > perhaps its fixed by now).
> >
> > AFAIR gcc-3.4.3 was released after this conversation take place at
> > linux-xfs,
> > maybe add something like this:
> >
> > #ifdef XSCALE
> > /* We need this because some gcc versions for xscale are broken. */
> > [patched version here]
> > #else
> > [original version here]
> > #endif
>
> no, just fix your compiler or let the gcc folks do it. Did anyone of
> the arm folks ever open a PR at the gcc bugzilla with a reproduced
> testcase? You'll never get your compiler fixed with that attitude.

Yes, but a lot of people use older versions of compilers and suffer from this bug.
I personally was very unhappy when I lost my data.

Best regards,
Yura
Re: serial: 8250 fails to detect Exar XR16L2551 correctly
Alex Williamson wrote:
>
> David, would you mind
> trying this on the XR16L255x part? (ie. don't use console=ttyS, use
> console=uart,...)

Thanks, I wasn't even aware you could do this...

These are the serial ports I have:

ttyS0 at MMIO 0xc800 (irq = 15) is a XScale       IXP425 internal
ttyS1 at MMIO 0xc8001000 (irq = 13) is a XScale     "       "
ttyS2 at MMIO 0x5300 (irq = 21) is a XR16550      XR16L2551
ttyS3 at MMIO 0x5308 (irq = 21) is a XR16550         "

I tried console=uart,mmio,0x5300,115200 and my board didn't print anything to the console and the boot failed somewhere before starting network (I don't know exactly where or why since I couldn't see any messages). Using console=ttyS2,115200 works fine.

What's 8250_early.c for anyway? console=ttyS... has always worked fine for me.

David Vrabel
-- 
David Vrabel, Design Engineer

Arcom, Clifton Road           Tel: +44 (0)1223 411200 ext. 3233
Cambridge CB1 7EA, UK         Web: http://www.arcom.com/
Re: XFS corruption on move from xscale to i686
On Thu, Jul 14, 2005 at 04:50:01PM +0300, Yura Pakhuchiy wrote:
> 2005/7/14, Nathan Scott <[EMAIL PROTECTED]>:
> > On Wed, Jul 13, 2005 at 06:22:28PM +0300, Yura Pakhuchiy wrote:
> > > I found patch by Greg Ungreger to fix this problem, but why it's still
> > > not in mainline? Or it's a gcc problem and should be fixed by gcc folks?
> >
> > Yes, IIRC the patch was incorrect for other platforms, and it sure
> > looked like an arm-specific gcc problem (this was ages back, so
> > perhaps its fixed by now).
>
> AFAIR gcc-3.4.3 was released after this conversation take place at linux-xfs,
> maybe add something like this:
>
> #ifdef XSCALE
> /* We need this because some gcc versions for xscale are broken. */
> [patched version here]
> #else
> [original version here]
> #endif

no, just fix your compiler or let the gcc folks do it. Did anyone of the arm folks ever open a PR at the gcc bugzilla with a reproduced testcase? You'll never get your compiler fixed with that attitude.
[PATCH] rocket.c: Fix ldisc ref count handling
If bailing out because there is nothing to receive in rp_do_receive(), tty_ldisc_deref is not called. Failure to do so leaves the ref count elevated and causes release_dev() to hang since it can't get the ref count to 0.

---
Signed-off-by: Michal Ostrowski <[EMAIL PROTECTED]>

 drivers/char/rocket.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/char/rocket.c b/drivers/char/rocket.c
--- a/drivers/char/rocket.c
+++ b/drivers/char/rocket.c
@@ -355,7 +355,7 @@ static void rp_do_receive(struct r_port
 		ToRecv = space;
 
 	if (ToRecv <= 0)
-		return;
+		goto done;
 
 	/*
 	 * if status indicates there are errored characters in the
@@ -437,6 +437,7 @@ static void rp_do_receive(struct r_port
 	}
 	/* Push the data up to the tty layer */
 	ld->receive_buf(tty, tty->flip.char_buf, tty->flip.flag_buf, count);
+ done:
 	tty_ldisc_deref(ld);
 }
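The shape of this fix, routing every early exit through one label that drops the reference, can be shown in isolation. This is a standalone sketch with a toy counter, not the tty layer: struct ref, ref_get, ref_put and receive_sketch are invented names standing in for the ldisc reference calls.

```c
/* Toy reference counter standing in for the line discipline reference. */
struct ref {
    int count;
};

static struct ref *ref_get(struct ref *r)
{
    r->count++;
    return r;
}

static void ref_put(struct ref *r)
{
    r->count--;
}

/* Takes a reference, optionally does work, and always releases the
 * reference on the way out. Returns how many items were "pushed". */
static int receive_sketch(struct ref *r, int to_recv)
{
    struct ref *ld = ref_get(r);
    int pushed = 0;

    if (to_recv <= 0)
        goto done;      /* a bare return here would leak the reference */

    pushed = to_recv;   /* stands in for the real receive work */

done:
    ref_put(ld);
    return pushed;
}
```

With a bare return in the early-exit branch, each empty receive would leave count one higher, which is the leak the patch description refers to.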
Re: Thread_Id
RVK wrote:
> Ian Campbell wrote:
> > On Thu, 2005-07-14 at 15:36 +0530, RVK wrote:
> > > bits/pthreadtypes.h:150:typedef unsigned long int pthread_t;
> >
> > That's an implementation detail which you cannot determine any
> > information from. What Arjan is saying is that pthread_t is a cookie --
> > this means that you cannot interpret it in any way, it is just a "thing"
> > which you can pass back to the API, that pthread_t happens to be
> > typedef'd to unsigned long int is irrelevant.
>
> Do you want to say for both 2.6.x and 2.4.x I should interpret that way ?
>
> rvk

Indeed, for ANY OS using pthreads it should be interpreted that way.
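A minimal sketch of treating pthread_t as a cookie: the only portable operations on it are handing it back to the pthreads API and comparing two handles with pthread_equal(). The `struct probe` type and the helper names below are hypothetical, made up for this illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper state: parent handle in, comparison result out. */
struct probe {
	pthread_t parent;	/* handle of the creating thread */
	int same_as_parent;	/* filled in by the child */
};

/* Runs in the child: compare handles through the API, never with ==. */
static void *compare_with_parent(void *arg)
{
	struct probe *p = arg;

	p->same_as_parent = pthread_equal(p->parent, pthread_self()) != 0;
	return NULL;
}

/* Returns 1 if the child saw a different handle than the parent's, -1 on error. */
static int child_differs_from_parent(void)
{
	struct probe p;
	pthread_t child;

	p.parent = pthread_self();
	p.same_as_parent = -1;
	if (pthread_create(&child, NULL, compare_with_parent, &p) != 0)
		return -1;
	pthread_join(child, NULL);
	return p.same_as_parent == 0;
}
```

Nothing here prints or does arithmetic on the handle, so the code does not care whether pthread_t is an integer, a pointer or a struct on a given implementation.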
Re: High irq load (Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt)
Linus Torvalds <[EMAIL PROTECTED]> writes: > On Wed, 13 Jul 2005, Jan Engelhardt wrote: > > > > No, some kernel code causes a triple-fault-and-reboot when the HZ is >= > > 10KHz. Maybe the highest possible value is 8192 Hz, not sure. > > Can you post the triple-fault message? It really shouldn't triple-fault, > although it _will_ obviously spend all time just doing timer interrupts, > so it shouldn't get much (if any) real work done either. ... > There should be no conceptual "highest possible HZ", although there are > certainly obvious practical limits to it (both on the timer hw itself, and > just the fact that at some point we'll spend all time on the timer > interrupt and won't get anything done..) HZ=1 appears to work fine here after some hacks to avoid over/underflows in integer arithmetics. gkrellm reports about 3-4% CPU usage when the system is idle, on a 3.07 GHz P4. --- Makefile|2 +- arch/i386/kernel/cpu/proc.c |6 ++ fs/nfsd/nfssvc.c|2 +- include/linux/jiffies.h |6 ++ include/linux/nfsd/stats.h |4 include/linux/timex.h |2 +- include/net/tcp.h | 12 +--- init/calibrate.c| 21 + kernel/Kconfig.hz |6 ++ kernel/timer.c |4 ++-- net/ipv4/netfilter/ip_conntrack_proto_tcp.c |2 +- 11 files changed, 58 insertions(+), 9 deletions(-) diff --git a/Makefile b/Makefile --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ VERSION = 2 PATCHLEVEL = 6 SUBLEVEL = 13 -EXTRAVERSION =-rc3 +EXTRAVERSION =-rc3-test NAME=Woozy Numbat # *DOCUMENTATION* diff --git a/arch/i386/kernel/cpu/proc.c b/arch/i386/kernel/cpu/proc.c --- a/arch/i386/kernel/cpu/proc.c +++ b/arch/i386/kernel/cpu/proc.c @@ -128,9 +128,15 @@ static int show_cpuinfo(struct seq_file x86_cap_flags[i] != NULL ) seq_printf(m, " %s", x86_cap_flags[i]); +#if HZ <= 5000 seq_printf(m, "\nbogomips\t: %lu.%02lu\n\n", c->loops_per_jiffy/(50/HZ), (c->loops_per_jiffy/(5000/HZ)) % 100); +#else + seq_printf(m, "\nbogomips\t: %lu.%02lu\n\n", +c->loops_per_jiffy/(50/HZ), +(c->loops_per_jiffy*(HZ/5000)) % 100); +#endif return 0; } diff --git 
a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -160,7 +160,7 @@ update_thread_usage(int busy_threads) decile = busy_threads*10/nfsdstats.th_cnt; if (decile>0 && decile <= 10) { diff = nfsd_last_call - prev_call; - if ( (nfsdstats.th_usage[decile-1] += diff) >= NFSD_USAGE_WRAP) + if ( (nfsdstats.th_usage[decile-1] += diff) >= NFSD_USAGE_WRAP) nfsdstats.th_usage[decile-1] -= NFSD_USAGE_WRAP; if (decile == 10) nfsdstats.th_fullcnt++; diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -38,6 +38,12 @@ # define SHIFT_HZ 9 #elif HZ >= 768 && HZ < 1536 # define SHIFT_HZ 10 +#elif HZ >= 1536 && HZ < 3072 +# define SHIFT_HZ 11 +#elif HZ >= 3072 && HZ < 6144 +# define SHIFT_HZ 12 +#elif HZ >= 6144 && HZ < 12288 +# define SHIFT_HZ 13 #else # error You lose. #endif diff --git a/include/linux/nfsd/stats.h b/include/linux/nfsd/stats.h --- a/include/linux/nfsd/stats.h +++ b/include/linux/nfsd/stats.h @@ -30,7 +30,11 @@ struct nfsd_stats { }; /* thread usage wraps very million seconds (approx one fortnight) */ +#if HZ < 2048 #defineNFSD_USAGE_WRAP (HZ*100) +#else +#defineNFSD_USAGE_WRAP (2048*100) +#endif #ifdef __KERNEL__ diff --git a/include/linux/timex.h b/include/linux/timex.h --- a/include/linux/timex.h +++ b/include/linux/timex.h @@ -90,7 +90,7 @@ * * FINENSEC is 1 ns in SHIFT_UPDATE units of the time_phase variable. */ -#define SHIFT_SCALE 22 /* phase scale (shift) */ +#define SHIFT_SCALE 25 /* phase scale (shift) */ #define SHIFT_UPDATE (SHIFT_KG + MAXTC) /* time offset scale (shift) */ #define SHIFT_USEC 16 /* frequency offset scale (shift) */ #define FINENSEC (1L << (SHIFT_SCALE - 10)) /* ~1 ns in phase units */ diff --git a/include/net/tcp.h b/include/net/tcp.h --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -486,8 +486,8 @@ static __inline__ int tcp_sk_listen_hash so that we select tick to get range about 4 seconds. 
*/ -#if HZ <= 16 || HZ > 4096 -# error Unsupported: HZ <= 16 or HZ > 4096 +#if HZ <= 16 +# error Unsupported: HZ <= 16 #elif HZ <= 32 # define TCP_TW_RECYCLE_TICK (5+2-TCP_TW_RECYCLE_SLOTS_LOG) #elif HZ <= 64 @@ -502,8 +502,14 @@ static __inline__ int tcp_sk_listen_hash
rcu-refcount stacker performance
On July 8 I sent out a patch which re-implemented the rcu-refcounting
of the LSM list in stacker for the sake of supporting safe security
module unloading. (patch reattached here for convenience) Here are
some performance results with and without that patch. Tests were run
on a 16-way ppc64 machine. Dbench was run 50 times, and kernbench
and reaim were run 10 times, and intervals are 95% confidence
half-intervals.

These results seem pretty poor. I'm now wondering whether this is
really necessary. David Wheeler's original stacker had an option of
simply waiting a while after a module was taken out of the list of
active modules before freeing the modules. Something like that is of
course one option. I'm hoping we can also take advantage of some
already known module state info to be a little less coarse about it.
For instance, sys_delete_module() sets m->state to MODULE_STATE_GOING
before calling mod->exit(). If in place of doing atomic_inc(&m->use),
stacker skipped the m->hook() if m->state!=MODULE_STATE_LIVE, then it
may be safe to assume that any m->hook() should be finished before
sys_delete_module() gets to free_module(mod). This seems to require
adding a struct module argument to security/security:mod_reg_security()
so an LSM can pass itself along.

So I'll try that next. Hopefully by avoiding the potential cache line
bounces which atomic_inc(&m->use) bring, this should provide far better
performance.
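The "skip the hook unless the owner is live" idea can be sketched in plain userspace C. This is only a model of the control flow under loud assumptions: `enum mod_state`, `struct lsm_entry` and `call_live_hooks()` are hypothetical names invented here, the list is a simple singly linked list rather than the kernel's RCU-protected list, and the sketch says nothing about whether the check is race-free against a concurrent unload.

```c
#include <stddef.h>

/* Hypothetical mirror of the two module states named in the mail. */
enum mod_state { STATE_LIVE, STATE_GOING };

/* Hypothetical stacked-LSM entry: owner state plus a single hook. */
struct lsm_entry {
	enum mod_state owner_state;
	int (*hook)(void);
	struct lsm_entry *next;
};

/*
 * Walk the list and call each hook whose owner is still live.
 * Entries whose owner is going away are skipped instead of being
 * pinned with a per-entry atomic counter. Stops at the first
 * non-zero hook result and returns it; *called counts hooks run.
 */
static int call_live_hooks(struct lsm_entry *head, int *called)
{
	struct lsm_entry *m;
	int ret;

	*called = 0;
	for (m = head; m != NULL; m = m->next) {
		if (m->owner_state != STATE_LIVE)
			continue;
		(*called)++;
		ret = m->hook();
		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Two trivial hooks for exercising the walker. */
static int allow(void) { return 0; }
static int deny(void) { return -1; }
```

The point of the shape is that the read side only loads a state word that is rarely written, rather than writing a shared counter on every hook call.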
thanks,
-serge

Dbench (throughput, larger is better)
plain stacker:     1531.448400 +/- 15.791116
stacker with rcu:  1408.056200 +/- 12.597277

Kernbench (runtime, smaller is better)
plain stacker:     52.341000 +/- 0.184995
stacker with rcu:  53.722000 +/- 0.161473

Reaim (numjobs, larger is better) (gnuplot-friendly format)
plain stacker:
--
Numforked   jobs/minute      95% CI
1           106662.857000    5354.267865
3           301628.571000    6297.121934
5           488142.858000    16031.685536
7           673200.00        23994.030784
9           852428.57        31485.607271
11          961714.29        0.00
13          1108157.144000   27287.525982
15          1171178.571000   49790.796869

Reaim (numjobs, larger is better) (gnuplot-friendly format)
stacker with rcu:
--
Numforked   jobs/minute      95% CI
1           100542.857000    2099.040645
3           266657.139000    6297.121934
5           398892.858000    12023.765252
7           467670.00        14911.383385
9           418648.352000    11665.751441
11          396825.00        8700.115252
13          357480.912000    7567.947838
15          337571.428000    2332.267703

Patch:

Index: linux-2.6.12/security/stacker.c
===
--- linux-2.6.12.orig/security/stacker.c	2005-07-08 13:43:15.0 -0500
+++ linux-2.6.12/security/stacker.c	2005-07-08 16:21:54.0 -0500
@@ -33,13 +33,13 @@
 
 struct module_entry {
 	struct list_head lsm_list;  /* list of active lsms */
-	struct list_head all_lsms; /* list of all lsms */
 	char *module_name;
 	int namelen;
 	struct security_operations module_operations;
+	struct rcu_head m_rcu;
+	atomic_t use;
 };
 static struct list_head stacked_modules;  /* list of stacked modules */
-static struct list_head all_modules;  /* list of all modules, including freed */
 
 static short sysfsfiles_registered;
 
@@ -84,6 +84,32 @@ MODULE_PARM_DESC(debug, "Debug enabled o
  * We return as soon as an error is returned.
  */
 
+static inline void stacker_free_module(struct module_entry *m)
+{
+	kfree(m->module_name);
+	kfree(m);
+}
+
+/*
+ * Version of stacker_free_module called from call_rcu
+ */
+static void free_mod_fromrcu(struct rcu_head *head)
+{
+	struct module_entry *m;
+
+	m = container_of(head, struct module_entry, m_rcu);
+	stacker_free_module(m);
+}
+
+static void stacker_del_module(struct rcu_head *head)
+{
+	struct module_entry *m;
+
+	m = container_of(head, struct module_entry, m_rcu);
+	if (atomic_dec_and_test(&m->use))
+		stacker_free_module(m);
+}
+
 #define stack_for_each_entry(pos, head, member) \
 	for (pos = list_entry((head)->next, typeof(*pos), member); \
 		&pos->member != (head); \
@@ -93,16 +119,27 @@ MODULE_PARM_DESC(debug, "Debug enabled o
 /* to make this safe for module deletion, we would need to
  * add a reference count to m as we had before
  */
+/*
+ * XXX We can't quite do this - we delete the module before we grab
+ * m->next?
+ * We could just do a call_rcu. Then the call_rcu happens in same
+ * rcu cycle has dereference, so module won't be deleted until the
+ * next cycle.
+ * That's
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 11:24 +0200, Jan Engelhardt wrote:
> "My expectation is if we want to beat the competition, we'll want
> the ability to go *under* 100Hz."
> >>>
> >>> What does Windows do here?
> >>
> >> windows xp base rate is 100Hz... but multimedia apps can ask for almost
> >
> > 83Hz
>
> Well, Windows 98 (vmmon) shows very different ones:

Wow. Windows has been doing this since *98*?

So that's what Paul meant by "the stupidity of a fixed HZ, which is so
early '90s that it's embarrassing".

Lee
Re: Realtime Preemption, 2.6.12, Beginners Guide?
K.R. Foley wrote:
K.R. Foley wrote:
Karsten Wiese wrote:
On Wednesday, 13 July 2005 16:01, K.R. Foley wrote:
Ingo Molnar wrote:
* Chuck Harding <[EMAIL PROTECTED]> wrote:

CC [M] sound/oss/emu10k1/midi.o
sound/oss/emu10k1/midi.c:48: error: syntax error before '__attribute__'
sound/oss/emu10k1/midi.c:48: error: syntax error before ')' token

Here's the offending line:

48 static DEFINE_SPINLOCK(midi_spinlock __attribute((unused)));

Lee

I got it to compile but it won't boot - it hangs right after the
'Uncompressing Linux... OK, booting the kernel' - I'm using .config
from 51-27 (attached) and -51-27 worked just fine?

I've uploaded -29 with the -28 io-apic changes undone (will re-apply
them once Karsten has figured out what's wrong).

Ingo

I too had the same problem booting -51-28 on my older SMP system at
home. -51-29 just booted fine.

Have I corrected the other path of ioapic early initialization, which
had lacked virtual-address setup before ioapic_data[ioapic] was to be
filled in -51-28? Please test attached patch on top of -51-29 or later.
Also on Systems that liked -51-28.

thanks, Karsten

Karsten,

Just booted on my 2.6 dual Xeon w/HT and thus far all is well. I am
still building on the older SMP system that didn't like -51-28. Will
report after I try booting that one.

Just booted on my older SMP box that barfed on -51-28. It would appear
that the init problem is resolved.

DOH! All of the above is on -51-30 with Karsten's patch applied.

-- 
kr
Re: Realtime Preemption, 2.6.12, Beginners Guide?
K.R. Foley wrote:
Karsten Wiese wrote:
On Wednesday, 13 July 2005 16:01, K.R. Foley wrote:
Ingo Molnar wrote:
* Chuck Harding <[EMAIL PROTECTED]> wrote:

CC [M] sound/oss/emu10k1/midi.o
sound/oss/emu10k1/midi.c:48: error: syntax error before '__attribute__'
sound/oss/emu10k1/midi.c:48: error: syntax error before ')' token

Here's the offending line:

48 static DEFINE_SPINLOCK(midi_spinlock __attribute((unused)));

Lee

I got it to compile but it won't boot - it hangs right after the
'Uncompressing Linux... OK, booting the kernel' - I'm using .config
from 51-27 (attached) and -51-27 worked just fine?

I've uploaded -29 with the -28 io-apic changes undone (will re-apply
them once Karsten has figured out what's wrong).

Ingo

I too had the same problem booting -51-28 on my older SMP system at
home. -51-29 just booted fine.

Have I corrected the other path of ioapic early initialization, which
had lacked virtual-address setup before ioapic_data[ioapic] was to be
filled in -51-28? Please test attached patch on top of -51-29 or later.
Also on Systems that liked -51-28.

thanks, Karsten

Karsten,

Just booted on my 2.6 dual Xeon w/HT and thus far all is well. I am
still building on the older SMP system that didn't like -51-28. Will
report after I try booting that one.

Just booted on my older SMP box that barfed on -51-28. It would appear
that the init problem is resolved.

-- 
kr
Re: [patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c
On 7/14/05, Russell King <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 14, 2005 at 03:53:44PM +0400, Ivan Kokshaysky wrote:
> > The setup-bus code doesn't work correctly for configurations
> > with more than one display adapter in the same PCI domain.
> > This stuff actually is a leftover of an early 2.4 PCI setup code
> > and apparently it stopped working after some "bridge_ctl" changes.
> > So the best thing we can do is just to remove it and rely on the fact
> > that any firmware *has* to configure VGA port forwarding for the boot
> > display device properly.
>
> What happens when there is no firmware?
>
> I'm sure this code would not have been added had there not been a reason
> for it. Do we know why it was added?

I don't think it has ever been working in the 2.6 series. If you are
getting rid of it, get rid of the #define PCI_BRIDGE_CTL_VGA in pci.h
too, since this code was the only user.

Looking at the code as written, I don't think it would work on my
machine with multiple VGA devices on different buses. I use the system
BIOS to enable the one I want and it sets up the bridges.

This code is part of VGA arbitration, which BenH is addressing with a
more globally comprehensive patch. Ben's code will probably replace it.

-- 
Jon Smirl
[EMAIL PROTECTED]
Re: Open source firewalls
Helge Hafting wrote:
RVK wrote:

Proxies can be a good way of filtering but it can't avoid buffer overflows.

Yes they can - did you read and understand my previous post at all? A proxy _can_ avoid a buffer overflow by noticing the anomalously large data item and simply refusing to pass it on to the real server! The proxy can terminate the tcp connection and throw away the data.

Some of the validations can be done at the proxy end. But there are more invisible scenarios than the simple visible ones. And it's definitely much preferable to use Apache-like stuff than using our own. I hope you agree with me...

I don't disagree on the proxy doing the filtering and validations; what I mean to say is it can't guarantee avoiding buffer overflows, as it itself can be a source of them. It can only increase them. More code, more bugs.

Of course the proxy can be buggy too, but it is easier to avoid problems there:

1. The server was written to perform a service, perhaps with security thrown in later. (Yes, that's bad design.) A firewall proxy is written for security, so buffer overflows are usually avoided in the firewall proxy itself, because this is exactly what the firewall writer is thinking about.
2. The proxy may be much smaller and simpler than the server it protects; it is therefore much easier to audit for security problems.
3. Fixing the server is indeed best, but not necessarily an option. It could be proprietary, or written in an unknown language.

No. As you are the only user of your program, resources are limited to visualize all scenarios for all protocols.

No one would like to keep on adding the proxies for the sake of buffer overflow. It is basically taken as a facility for filtering. If it is running on a hardware firewall as a service then it's more

"Hardware firewall" ???

Yes, embedded firewall. When your gateway is protected by a firewall device. Another one is a software firewall solution.

dangerous as once it is compromised then IDS signatures also can be deleted :-). No use of IDS then, right?
A compromised firewall is of no use - sure. So what? That applies to any firewall, any IDS, or any server for that matter.

No, that's not true: once your firewall is compromised, it can affect other services also. But if any one of the servers is compromised, only that server is affected.

So the best way is either make your code free of buffer overflows or

Yes, but the server may not be "my code" at all. Can't you see that problem? It may very well be someone else's code. I may not have the source, or the source may be useless for a number of reasons, such as:

1. Being written in a language I don't understand
2. Having a licence that forbids change
3. Needing compilers/tools I don't have
4. Being such a nasty mess that writing a proxy is much easier than fixing the bloated, ill-designed server code one unfortunately depends on for the time being.

In these cases, I can still protect my server with a proxy firewall, although I can't fix the server itself.

Again, it will be your own code, with the limitation of taking care of all scenarios. Take an example: if we are trying to add a web proxy and are using Apache as our server, do you say that code written by us will be safer than Apache? :-)

use some library which controls the attack during any buffer overflow or use Stack Randomisation and Canary based approaches.

A library that controls any buffer overflow doesn't exist at all.

It's there and available. Just need to search.

Stack randomization helps but doesn't solve all cases; the attacker simply needs code to search for the randomly moved parts he needs, pad with a few megabytes of NOPs and things like that. Of course, a proxy can easily detect megabytes of NOPs and kill that connection . . .

It's not easy to mount an attack against Stack Randomization. Like TCP SYN randomization.
Regards rvk Helge Hafting
Re: Realtime Preemption, 2.6.12, Beginners Guide?
Karsten Wiese wrote:
On Wednesday, 13 July 2005 16:01, K.R. Foley wrote:
Ingo Molnar wrote:
* Chuck Harding <[EMAIL PROTECTED]> wrote:

CC [M] sound/oss/emu10k1/midi.o
sound/oss/emu10k1/midi.c:48: error: syntax error before '__attribute__'
sound/oss/emu10k1/midi.c:48: error: syntax error before ')' token

Here's the offending line:

48 static DEFINE_SPINLOCK(midi_spinlock __attribute((unused)));

Lee

I got it to compile but it won't boot - it hangs right after the
'Uncompressing Linux... OK, booting the kernel' - I'm using .config
from 51-27 (attached) and -51-27 worked just fine?

I've uploaded -29 with the -28 io-apic changes undone (will re-apply
them once Karsten has figured out what's wrong).

Ingo

I too had the same problem booting -51-28 on my older SMP system at
home. -51-29 just booted fine.

Have I corrected the other path of ioapic early initialization, which
had lacked virtual-address setup before ioapic_data[ioapic] was to be
filled in -51-28? Please test attached patch on top of -51-29 or later.
Also on Systems that liked -51-28.

thanks, Karsten

Karsten,

Just booted on my 2.6 dual Xeon w/HT and thus far all is well. I am
still building on the older SMP system that didn't like -51-28. Will
report after I try booting that one.

-- 
kr
Re: [patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c
On Thu, Jul 14, 2005 at 03:53:44PM +0400, Ivan Kokshaysky wrote: > The setup-bus code doesn't work correctly for configurations > with more than one display adapter in the same PCI domain. > This stuff actually is a leftover of an early 2.4 PCI setup code > and apparently it stopped working after some "bridge_ctl" changes. > So the best thing we can do is just to remove it and rely on the fact > that any firmware *has* to configure VGA port forwarding for the boot > display device properly. What happens when there is no firmware? I'm sure this code would not have been added had there not been a reason for it. Do we know why it was added? -- Russell King Linux kernel2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Thread_Id
Jakub Jelinek wrote:
On Thu, Jul 14, 2005 at 02:25:43PM +0200, Arjan van de Ven wrote:

pure luck. NPTL threading uses it to store a pointer to per thread info
structure; other threading (linuxthreads) may have stored a pid there to
identify the internal thread. nptl is 2.6 only so you might have
switched implementation of threading when you switched kernels.

Actually, in linuxthreads what pthread_self () returned has the first
slot in its internal threads array (up to max number of supported
threads) that was unused at thread creation time in the low order bits
and sequence number of thread creation in its high order bits. So unless
you are using yet another threading library (I thought NGPT is dead for
years...), the claim that you get the same numbers from gettid() syscall
under NPTL as pthread_self () gives you under LinuxThreads is simply not
true.

And you certainly shouldn't be using gettid () syscall in NPTL, as it is
just an implementation detail that there is a 1:1 mapping between NPTL
threads and kernel threads. It can change at any time.

Whichever the implementation is, it's expected to be backward
compatible, especially thread libraries, as a lot of applications are
using them.

rvk

Jakub
Re: fdisk: What do plus signs after "Blocks" mean?
I always thought:

  First 446 bytes are boot code and all
  Next 64 bytes are for 4 partition records, 16 bytes each
  Last 2 bytes are signature

?

-fd

On Wed, 2005-07-13 at 06:24, Jan Engelhardt wrote:
> > Guys, thanks a lot for the explanations!
> >
> > Actually, it seems like one can backup information on ALL partitions
> >by using the command "sfdisk -dx /dev/hdX". Supposedly, it reads not
> >only primary but also extended partitions. "sfdisk -x /dev/hdX" should
> >be then able to write whatever is known back to the disk.
>
> MBR size is 448 bytes, the rest is "the partition table", with space for four
> entries. If one wants more, then s/he creates a [primary] partition, tagging
> it "extended", and the "extended partiton table" is within that primary
> partition. So yes, by dd'ing /dev/hdX, you get everything. Including "lost
> sectors" if you dd it back to a bigger HD.
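The 446 + 64 + 2 breakdown can be written down as constants and checked against a 512-byte sector. This is a small sketch of the classic MBR layout; the constant and function names are made up for illustration and are not taken from fdisk, sfdisk or the kernel.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	MBR_SECTOR_SIZE = 512,
	MBR_BOOT_CODE   = 446,	/* bytes 0..445: boot code */
	MBR_ENTRY_SIZE  = 16,	/* one primary partition record */
	MBR_ENTRY_COUNT = 4,	/* bytes 446..509: four records */
	MBR_SIG_OFFSET  = 510	/* bytes 510..511: signature */
};

/* Byte offset of primary entry idx (0..3), or -1 when out of range. */
static int mbr_entry_offset(int idx)
{
	if (idx < 0 || idx >= MBR_ENTRY_COUNT)
		return -1;
	return MBR_BOOT_CODE + idx * MBR_ENTRY_SIZE;
}

/* True when the sector ends in the 0x55 0xAA boot signature. */
static int mbr_has_signature(const uint8_t *sector)
{
	return sector[MBR_SIG_OFFSET] == 0x55 &&
	       sector[MBR_SIG_OFFSET + 1] == 0xAA;
}

/* Partition type byte of entry idx (offset 4 within the record), or -1. */
static int mbr_entry_type(const uint8_t *sector, int idx)
{
	int off = mbr_entry_offset(idx);

	return off < 0 ? -1 : sector[off + 4];
}
```

A record whose type byte is 0x05 or 0x0F marks an extended partition, which is where the additional (logical) partition records live, outside the first sector.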
Re: XFS corruption on move from xscale to i686
2005/7/14, Nathan Scott <[EMAIL PROTECTED]>: > On Wed, Jul 13, 2005 at 06:22:28PM +0300, Yura Pakhuchiy wrote: > > I found patch by Greg Ungreger to fix this problem, but why it's still > > not in mainline? Or it's a gcc problem and should be fixed by gcc folks? > > Yes, IIRC the patch was incorrect for other platforms, and it sure > looked like an arm-specific gcc problem (this was ages back, so > perhaps its fixed by now). AFAIR gcc-3.4.3 was released after this conversation take place at linux-xfs, maybe add something like this: #ifdef XSCALE /* We need this because some gcc versions for xscale are broken. */ [patched version here] #else [original version here] #endif Best regards, Yura - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial: 8250 fails to detect Exar XR16L2551 correctly
On Wed, Jul 13, 2005 at 11:04:56AM -0600, Alex Williamson wrote: > On Mon, 2005-07-11 at 15:17 -0600, Alex Williamson wrote: > >No, I think this is a problem with the broken A2 UARTs getting > > confused in serial8250_set_sleep(). If I remove either UART_CAP_SLEEP > > or UART_CAP_EFR from the capabilities list for this UART, it behaves > > normally. Also, just commenting out the UART_CAP_EFR chunks of > > set_sleep make it behave. I'll ping Exar for more data. Thanks, > > Hi Russell, > >I don't know enough about the extended UART programming model, but I > notice that when UART_CAP_EFR and UART_CAP_SLEEP are set on a UART, we > set the UART_IERX_SLEEP bit in the UART_IER immediately after it's found > and configured. Ah, I see what's happening. We're detecting the port and doing the autoconfig. Then we're checking to see if it's the console, and if not putting it into low power mode. Then we try to register the console, which may result in this UART becoming a console. So now we have a console which is in low power mode. Bad bad bad. No cookie for the serial layer today. > Are there known working configs where a UART w/ EFR and SLEEP are > able to be used as a serial console? No idea - I'm completely reliant on other folk to report problems with the 8250 driver with their random versions of UARTs which are out in the field. I only have 16450, 16550A and 16750 UARTs here. Hmm, I need to consider killing register_serial() and the associated code in serial_core.c earlier so I can sanely fix this problem. I think it's time to give the remaining register_serial() users an extra push... I haven't seen _any_ activity from the remaining users, so I might have to take the attitude that "if they don't care, I don't care about breaking their code" which would be rather shameful as far as the users go. (but hey, user pressure might wake up the maintainers.) 
-- Russell King Linux kernel2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core
Re: Realtime Preemption, 2.6.12, Beginners Guide?
Ingo Molnar wrote: * Chuck Harding <[EMAIL PROTECTED]> wrote: I missed getting -51-29 but just booted up -51-30 and all is well. Thanks. Just out of curiosity, what was changed between -51-28, 29, and 30? -51-29 had new IO-APIC optimizations - and i reverted them in -51-30. Ingo Ingo, I just noticed that the keyboard repeat problem is back in a bad way in -51-30. I was not seeing this before I left this PC about 16 hours ago. And the uptime is: 08:34:10 up 18:46, 7 users, load average: 3.32, 3.24, 2.53 -- kr - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c
On 7/14/05, Ivan Kokshaysky <[EMAIL PROTECTED]> wrote: > The setup-bus code doesn't work correctly for configurations > with more than one display adapter in the same PCI domain. > This stuff actually is a leftover of an early 2.4 PCI setup code > and apparently it stopped working after some "bridge_ctl" changes. > So the best thing we can do is just to remove it and rely on the fact > that any firmware *has* to configure VGA port forwarding for the boot > display device properly. This fixes my system where the VGA display device is on the second bus. -- Jon Smirl [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Merging relayfs?
Hi,

On Mon, 11 Jul 2005, Andrew Morton wrote:
> > > Hi Andrew, can you please merge relayfs? It provides a low-overhead
> > > logging and buffering capability, which does not currently exist in
> > > the kernel.
> >
> > While the code is pretty nicely in shape it seems rather pointless to
> > merge until an actual user goes with it.
>
> Ordinarily I'd agree. But this is a bit like kprobes - it's a funny thing
> which other kernel features rely upon, but those features are often ad-hoc
> and aren't intended for merging.

I agree with Christoph, I'd like to see a small (and useful) example
included, which can be used as reference. relayfs clients still need
some code of their own to communicate with user space. If I look at the
example code I'm not really sure netlink is a good way to go as control
channel.
kprobes has a rather simple interface, relayfs is more complex and I
think it's a good idea to provide some sane and complete example code to
copy from.

Looking through the patch there are still a few areas I'm concerned
about:
- the usage of atomic_t looks a little silly, there is only a single
  writer and it probably needs some cache line optimisations
- I would prefer "unsigned int" over just "unsigned"
- the padding/commit arrays can be easily managed by the client
- overwrite mode can be implemented via the buffer switch callback

In general I'm not against merging, but I have a few ideas for further
cleanups/optimisations and it really would help to have some useful
example code (e.g. a _simple_ event tracer).

bye, Roman
Re: moving DRM header files
On 7/14/05, Dave Airlie <[EMAIL PROTECTED]> wrote:
> > > I'm thinking include/linux/drm/
> > > but include/linux would also be possible.
> > >
> > > Any suggestions or ideas?
> >
> > If you're in a mood to move things, how about moving drivers/char/drm
> > to drivers/video/drm.
>
> But that has little point beyond aesthetics... moving the header files
> is for a reason that I want them to start appearing in userspace
> includeable places.. as part of the cleanup for libdrm..
>
> Moving c files internally in the kernel provides no real benefit over
> not moving them..

When you start merging DRM and fbdev you will be able to use relative
paths that are closer together. For example, #include
"../char/drm/drmP.h" versus #include "drm/drmP.h" for internal headers.

DRM and fbdev need to be moved next to each other in kconfig too if they
start depending on each other. It is hard to figure out that a video
option might not be visible because the char/drm/option is not turned
on.

-- 
Jon Smirl
[EMAIL PROTECTED]
Re: [rfc patch 2/2] direct-io: remove address alignment check
Daniel McNeil <[EMAIL PROTECTED]> writes:

> This patch relaxes the direct i/o alignment check so that user addresses
> do not have to be a multiple of the device block size.

The original reason for this limit was that lots of drivers (not only
IDE) explode when you give them odd sizes. Sometimes it is even worse.

I doubt all of them have been fixed. Very risky change.

-Andi
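For readers without the patch at hand, the kind of check under discussion can be sketched as a power-of-two mask test. This is only an illustration: the helper name dio_aligned() is made up here, it is not the code in fs/direct-io.c, and it assumes the block size is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the kernel): returns true when both the
 * user address and the transfer length are multiples of blksz.
 * Assumes blksz is a power of two, so "blksz - 1" is a valid bit mask.
 */
static bool dio_aligned(uintptr_t addr, size_t len, size_t blksz)
{
    size_t mask = blksz - 1;

    /* Any low bit set in either value means it is not block aligned. */
    return ((addr | len) & mask) == 0;
}
```

Relaxing the address part of such a test means a driver can be handed buffers that start at odd offsets, which is the case the reply above is worried about.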
[PATCH] mb_cache_shrink() frees unexpected caches
mb_cache_shrink() tries to free all sorts of mbcache entries in the LRU
list, regardless of which cache they belong to.
All users of mb_cache_shrink() are ext2/ext3 xattr.

Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>

--- 2.6-rc/fs/mbcache.c.orig	2005-07-14 20:40:34.0 +0900
+++ 2.6-rc/fs/mbcache.c	2005-07-14 20:43:42.0 +0900
@@ -329,7 +329,7 @@ mb_cache_shrink(struct mb_cache *cache,
 	list_for_each_safe(l, ltmp, &mb_cache_lru_list) {
 		struct mb_cache_entry *ce =
 			list_entry(l, struct mb_cache_entry, e_lru_list);
-		if (ce->e_bdev == bdev) {
+		if (ce->e_cache == cache && ce->e_bdev == bdev) {
 			list_move_tail(&ce->e_lru_list, &free_list);
 			__mb_cache_entry_unhash(ce);
 		}
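The effect of the one-line change can be shown with a small userspace sketch. The struct and function names below are stand-ins invented for illustration (a flat array instead of the kernel's LRU list); only the two-part match condition mirrors the patch.

```c
#include <stddef.h>

/* Simplified stand-in for struct mb_cache_entry; not the kernel layout. */
struct entry_sketch {
    const void *cache;  /* owning cache, plays the role of ce->e_cache */
    int bdev;           /* device id, plays the role of ce->e_bdev */
    int freed;          /* set when the entry would be moved to free_list */
};

/*
 * Mark entries for freeing only when they belong to the given cache AND
 * the given device. Returns the number of entries marked.
 */
static size_t shrink_sketch(struct entry_sketch *lru, size_t n,
                            const void *cache, int bdev)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        if (lru[i].cache == cache && lru[i].bdev == bdev) {
            lru[i].freed = 1;
            freed++;
        }
    }
    return freed;
}
```

With the old condition (device match only), an entry owned by a different cache on the same device would have been freed as well.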
Re: Thread_Id
> And you certainly shouldn't be using gettid () syscall in NPTL, as it
> is just an implementation detail that there is a 1:1 mapping between
> NPTL threads and kernel threads. It can change at any time.

Maybe I missed the point, but I thought the 1:1 mapping between NPTL
threads and kernel threads is one of the advantages of NPTL and the idea
of a userland scheduler is quite dead.

So please let gettid do what man gettid assures:

    gettid returns the thread ID of the current process. This is equal
    to the process ID (as returned by getpid(2)), unless the process is
    part of a thread group (created by specifying the CLONE_THREAD flag
    to the clone(2) system call). All processes in the same thread
    group have the same PID, but each one has a unique TID.

Bene
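The behaviour quoted from the man page can be checked from userspace with a direct system call. This is a minimal sketch for Linux only; the wrapper name my_gettid() is made up here, and it goes through syscall(2) because a libc wrapper for gettid cannot be assumed.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: ask the kernel for the caller's thread ID.
 * In the initial thread of a process this equals getpid(); in any other
 * thread of the same thread group it is a different, unique value.
 */
static pid_t my_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

Whether a threading library keeps one kernel thread per pthread is a separate question, which is the point the quoted warning is making.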
Re: Open source firewalls
RVK wrote:

> Proxies can be a good way of filtering but it can't avoid buffer
> overflows.

Yes they can - did you read and understand my previous post at all?
A proxy _can_ avoid a buffer overflow by noticing the anomalously
large data item and simply refusing to pass it on to the real server!
The proxy can terminate the tcp connection and throw away the data.

> It can only increase it. More code more bugs.

Of course the proxy can be buggy too, but it is easier to avoid
problems there:

1. The server was written to perform a service, perhaps with security
   thrown in later. (Yes, that's bad design.) A firewall proxy is
   written for security, so buffer overflows are usually avoided in
   the firewall proxy itself, because this is exactly what the
   firewall writer is thinking about.
2. The proxy may be much smaller and simpler than the server it
   protects; it is therefore much easier to audit for security
   problems.
3. Fixing the server is indeed best, but not necessarily an option.
   It could be proprietary, or written in an unknown language.

> If it is running on a hardware firewall as a service then its more

"Hardware firewall" ???

> dangerous as once it is compramised then IDS signatures also can be
> deleated :-). No use of IDS the right ?

A compromised firewall is of no use - sure. So what? That applies to
any firewall, any IDS, or any server for that matter.

> So the best way is either make your code free of buffer overflows or

Yes, but the server may not be "my code" at all. Can't you see that
problem? It may very well be someone else's code. I may not have the
source, or the source may be useless for a number of reasons, such as:

1. Being written in a language I don't understand
2. Having a licence that forbids change
3. Needing compilers/tools I don't have
4. Being such a nasty mess that writing a proxy is much easier than
   fixing the bloated ill-designed server code one unfortunately
   depends on for the time being.

In these cases, I can still protect my server with a proxy firewall,
although I can't fix the server itself.

> use some library which controls the attack during any buffer overflow or
> use Stack Randomisation and Canary based approaches.

A library that controls any buffer overflow doesn't exist at all.
Stack randomization helps but doesn't solve all cases; the attacker
simply needs code to search for the randomly moved parts he needs, pad
with a few megabytes of NOPs and things like that. Of course, a proxy
can easily detect megabytes of NOPs and kill that connection . . .

Helge Hafting
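The "refuse anomalously large items" idea can be sketched in a few lines. This is an illustration only, not code from any real proxy: the function name, the 2048-byte limit and the GET-only handling are all assumptions made for the example.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_URL_LEN 2048  /* hypothetical limit, pick one per protected server */

/*
 * Hypothetical check a web proxy might run on the first request line
 * before forwarding anything. Returns true only for "GET <url> ..."
 * where the URL is non-empty and no longer than MAX_URL_LEN bytes.
 * Anything else is refused in this sketch.
 */
static bool request_line_ok(const char *line)
{
    const char *url;
    size_t len;

    if (strncmp(line, "GET ", 4) != 0)
        return false;                 /* sketch only understands GET */

    url = line + 4;
    len = strcspn(url, " \r\n");      /* URL ends at space or end of line */

    return len > 0 && len <= MAX_URL_LEN;
}
```

The length is measured before the data is copied anywhere, so the proxy itself never needs a buffer sized by the client. When the check fails, the proxy would close the connection instead of passing the request on.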
[RFC/RFF][PATCH] rm -rf linux/arch/i386/boot
Hello, I am not sure the "rm -rf linux/arch/i386/boot" is an acceptable delta for the current code management system, but anyways it is only the second step - after the attached patch has been discussed, modified and hopefully accepted. Unfortunately this second step will break compatibility with LILO and GRUB - the kernel would only boot with Gujin version 1.2 or more ( http://gujin.org ), so we have some time before this cleanning begins, and have to stay compatible in between. In this mean time, the current patch is a complete rewrite of all the code executed in real mode - and so a complete replacement of all the directory linux/arch/i386/boot into one C file named arch/i386/kernel/realmode.c and its include include/asm-i386/realmode.h The mapping of the BIOS information reported to the kernel is the same, the one described in Linux/Documentation/i386/zero-page.txt - but is now expressed in the form of C structures. The kernel file becomes a lot simpler to generate, it is just an ELF file (the usual file linux/vmlinux you already get during the build process) transformed into binary by objcopy and gzip'ed. A small part is added during the link of vmlinux file: the content of realmode.c which contains a C function that the kernel need to get information from the BIOS (compiled with GCC / executed in real mode). Most of this function is written in C - have a look for yourself. To generate a kernel: make /boot/linux-2.6.13.kgz# the root filesystem will be autodetected make /boot/linux-2.6.13.kgz ROOT=/dev/hda3 # root filesystem forced You will need to install Gujin, either on a floppy, on your hard disk into a partition or at the end of your hard disk, or to a CDROM. No Configuration of Gujin is needed - because the configuration file does not (and will never) exists. Have a look at Gujin FAQ in the Documentation Manager of sourceforge before asking, please. 
The attached patch is made based on linux-2.6.13-rc2 , to apply to linux-2.6.12 you need to modify the patch replacing "phys_startup_32" by "startup_32" and removing " - LOAD_OFFSET" before applying. I will learn GIT soon - but for now... The generation process need to insert comments into the GZIP file, in the field reserved for this purpose, for instance to describe kernel characteristics like which processor is supported, and so the patch contains a BSD-licenced tool I wrote named gzcopy for the job. Note that there is no more limit in the size of the kernel using Gujin - but an "all yes config" without any modules will only boot up to some point: it seems that some (audio card?) driver need DMA able memory (i.e. below 16 Mbytes) else they crash the kernel. Unfortunately by default this "all yes" kernel is loaded at address 1 Mbyte and it is far bigger than 15 Mbytes. A modification of the load address need some change in the kernel - Gujin is already ready to load anywhere you want. Note also that the x86_64 architecture will need some cleanning too, but it will only be the third step. Signed-off-by: Etienne Lorrain <[EMAIL PROTECTED]> Have fun [you can check by yourself, it works], Etienne. patch-gujin-2613rc2.gz Description: application/gzip-compressed
Re: Realtime Preemption, 2.6.12, Beginners Guide?
Am Mittwoch, 13. Juli 2005 16:01 schrieb K.R. Foley: > Ingo Molnar wrote: > > * Chuck Harding <[EMAIL PROTECTED]> wrote: > > > > > >>>CC [M] sound/oss/emu10k1/midi.o > >>>sound/oss/emu10k1/midi.c:48: error: syntax error before '__attribute__' > >>>sound/oss/emu10k1/midi.c:48: error: syntax error before ')' token > >>> > >>>Here's the offending line: > >>> > >>> 48 static DEFINE_SPINLOCK(midi_spinlock __attribute((unused))); > >>> > >>>Lee > >>> > >> > >>I got it to compile but it won't boot - it hangs right after the > >>'Uncompressing Linux... OK, booting the kernel' - I'm using .config > >>from 51-27 (attached) > > > > > > and -51-27 worked just fine? I've uploaded -29 with the -28 io-apic > > changes undone (will re-apply them once Karsten has figured out what's > > wrong). > > > > Ingo > > I too had the same problem booting -51-28 on my older SMP system at > home. -51-29 just booted fine. > Have I corrected the other path of ioapic early initialization, which had lacked virtual-address setup before ioapic_data[ioapic] was to be filled in -51-28? Please test attached patch on top of -51-29 or later. Also on Systems that liked -51-28. 
thanks, Karsten diff -ur ../linux-2.6.12-RT-51-23/arch/i386/kernel/apic.c ./arch/i386/kernel/apic.c --- ../linux-2.6.12-RT-51-23/arch/i386/kernel/apic.c 2005-07-14 12:31:33.0 +0200 +++ linux-2.6.12-RT/arch/i386/kernel/apic.c 2005-07-14 12:34:53.0 +0200 @@ -832,10 +832,10 @@ ioapic_phys = (unsigned long) alloc_bootmem_pages(PAGE_SIZE); ioapic_phys = __pa(ioapic_phys); +set_fixmap_nocache(idx, ioapic_phys); +printk(KERN_DEBUG "faked IOAPIC to %08lx (%08lx)\n", + __fix_to_virt(idx), ioapic_phys); } - set_fixmap_nocache(idx, ioapic_phys); - printk(KERN_DEBUG "mapped IOAPIC to %08lx (%08lx)\n", - __fix_to_virt(idx), ioapic_phys); idx++; } } diff -ur ../linux-2.6.12-RT-51-23/arch/i386/kernel/io_apic.c ./arch/i386/kernel/io_apic.c --- ../linux-2.6.12-RT-51-23/arch/i386/kernel/io_apic.c 2005-07-09 23:49:21.0 +0200 +++ linux-2.6.12-RT/arch/i386/kernel/io_apic.c 2005-07-14 12:34:54.0 +0200 @@ -31,6 +31,7 @@ #include #include #include +#include #include #include @@ -55,11 +56,6 @@ int sis_apic_bug = -1; /* - * # of IRQ routing registers - */ -int nr_ioapic_registers[MAX_IO_APICS]; - -/* * Rough estimation of how many shared IRQs there are, can * be changed anytime. 
*/ @@ -132,88 +128,74 @@ # define IOAPIC_CACHE #endif -#ifdef IOAPIC_CACHE -# define MAX_IOAPIC_CACHE 512 -/* - * Cache register values: - */ -static struct { - unsigned int reg; - unsigned int val[MAX_IOAPIC_CACHE]; -} io_apic_cache[MAX_IO_APICS] - cacheline_aligned_in_smp; + +struct ioapic_data_struct { + struct sys_device dev; + int nr_registers; // # of IRQ routing registers + volatile unsigned int *base; + struct IO_APIC_route_entry *entry; +#ifdef IOAPIC_CACHE + unsigned int reg_set; + u32 cached_val[0]; #endif +}; -volatile unsigned int *io_apic_base[MAX_IO_APICS]; +static struct ioapic_data_struct *ioapic_data[MAX_IO_APICS]; -static inline unsigned int __raw_io_apic_read(unsigned int apic, unsigned int reg) + +static inline unsigned int __raw_io_apic_read(struct ioapic_data_struct *ioapic, unsigned int reg) { - volatile unsigned int *io_apic; -#ifdef IOAPIC_CACHE - io_apic_cache[apic].reg = reg; -#endif - io_apic = io_apic_base[apic]; - io_apic[0] = reg; - return io_apic[4]; +# ifdef IOAPIC_CACHE + ioapic->reg_set = reg; +# endif + ioapic->base[0] = reg; + return ioapic->base[4]; } -unsigned int raw_io_apic_read(unsigned int apic, unsigned int reg) + +# ifdef IOAPIC_CACHE +static void __init ioapic_cache_init(struct ioapic_data_struct *ioapic) { - unsigned int val = __raw_io_apic_read(apic, reg); + int reg; + for (reg = 0; reg < (ioapic->nr_registers + 10); reg++) + ioapic->cached_val[reg] = __raw_io_apic_read(ioapic, reg); +} +# endif -#ifdef IOAPIC_CACHE - io_apic_cache[apic].val[reg] = val; -#endif + +static unsigned int raw_io_apic_read(struct ioapic_data_struct *ioapic, unsigned int reg) +{ + unsigned int val = __raw_io_apic_read(ioapic, reg); + +# ifdef IOAPIC_CACHE + ioapic->cached_val[reg] = val; +# endif return val; } -unsigned int io_apic_read(unsigned int apic, unsigned int reg) +static unsigned int io_apic_read(struct ioapic_data_struct *ioapic, unsigned int reg) { -#ifdef IOAPIC_CACHE - if (unlikely(reg >= MAX_IOAPIC_CACHE)) { - static int once 
= 1; - - if (once) { - once = 0; - printk("WARNING: ioapic register cache overflow: %d.\n", -reg); - dump_stack(); - } - return __raw_io_apic_read(apic, reg); - } - if (io_apic_cache[apic].val[reg] && !sis_apic_bug) { - io_apic_cache[apic].reg = -1; - return io_apic_cache[apic].val[reg]; +# ifdef IOAPIC_CACHE + if (likely(!sis_apic_bug)) { + ioapic->reg_set = -1; + return ioapic->cached_val[reg]; } -#endif - return raw_io_apic_read(apic, reg); +# endif + return
Re: Thread_Id
On Thu, Jul 14, 2005 at 02:25:43PM +0200, Arjan van de Ven wrote:
> pure luck. NPTL threading uses it to store a pointer to per thread info
> structure; other threading (linuxthreads) may have stored a pid there to
> identify the internal thread. nptl is 2.6 only so you might have
> switched implementation of threading when you switched kernels.

Actually, in linuxthreads what pthread_self () returned has the first
slot in its internal threads array (up to max number of supported
threads) that was unused at thread creation time in the low order bits
and sequence number of thread creation in its high order bits.
So unless you are using yet another threading library (I thought NGPT
is dead for years...), the claim that you get the same numbers
from gettid() syscall under NPTL as pthread_self () gives you under
LinuxThreads is simply not true.

And you certainly shouldn't be using gettid () syscall in NPTL, as it
is just an implementation detail that there is a 1:1 mapping between
NPTL threads and kernel threads. It can change at any time.

	Jakub
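The "slot in the low bits, creation sequence in the high bits" layout described above can be pictured with a small packing sketch. The bit width and all names are assumptions chosen for the example; they are not taken from the LinuxThreads source.

```c
/* Assumed for illustration: 10 low bits, i.e. room for 1024 slots. */
#define SLOT_BITS 10u
#define SLOT_MASK ((1ul << SLOT_BITS) - 1)

/* Build a handle from a creation sequence number and an array slot. */
static unsigned long make_handle(unsigned long seq, unsigned long slot)
{
    return (seq << SLOT_BITS) | (slot & SLOT_MASK);
}

/* Recover the array slot from the low-order bits. */
static unsigned long handle_slot(unsigned long h)
{
    return h & SLOT_MASK;
}

/* Recover the creation sequence number from the high-order bits. */
static unsigned long handle_seq(unsigned long h)
{
    return h >> SLOT_BITS;
}
```

A value built this way has no relation to a kernel thread ID, which is why comparing it with gettid() output is not meaningful.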
Re: Re-routing packets via netfilter (ip_rt_bug)
Patrick, Hebert, This issues stills seems to be in the latest trees - is this patch or a variation on it still bumping around? Thanks! Yair Itzhaki wrote: Can anyone propose a patch that I can start checking? I have come up with the following: --- net/core/netfilter.c.orig 2005-04-18 21:55:30.0 +0300 +++ net/core/netfilter.c2005-05-02 17:35:20.0 +0300 @@ -622,9 +622,10 @@ /* some non-standard hacks like ipt_REJECT.c:send_reset() can cause * packets with foreign saddr to appear on the NF_IP_LOCAL_OUT hook. */ - if (inet_addr_type(iph->saddr) == RTN_LOCAL) { + if ((inet_addr_type(iph->saddr) == RTN_LOCAL) || + (inet_addr_type(iph->daddr) == RTN_LOCAL)) { fl.nl_u.ip4_u.daddr = iph->daddr; - fl.nl_u.ip4_u.saddr = iph->saddr; + fl.nl_u.ip4_u.saddr = 0; fl.nl_u.ip4_u.tos = RT_TOS(iph->tos); fl.oif = (*pskb)->sk ? (*pskb)->sk->sk_bound_dev_if : 0; #ifdef CONFIG_IP_ROUTE_FWMARK Please advise, Yair -Original Message- From: Patrick McHardy [mailto:[EMAIL PROTECTED] Sent: Wednesday, April 27, 2005 14:05 To: Herbert Xu Cc: Jozsef Kadlecsik; [EMAIL PROTECTED]; [EMAIL PROTECTED]; Yair Itzhaki; linux-kernel@vger.kernel.org Subject: Re: Re-routing packets via netfilter (ip_rt_bug) Herbert Xu wrote: Here is another reason why these packets should go through FORWARD. They were generated in response to packets in INPUT/FORWARD/OUTPUT. The original packet has not undergone SNAT in any of these cases. However, if we feed the response packet through LOCAL_OUT it will be subject to DNAT. This creates a NAT asymmetry and we may end up with the wrong destination address. By pushing it through FORWARD it will only undergo SNAT which is correct since the original packet would have undergone DNAT. This is only a problem since the recent NAT changes, but I agree that we should fix it by moving these packets to FORWARD. 
Regards Patrick - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Thread_Id
> >
> So then what is the meaning of that typedef and why its still there ?

the typedef means that the *IMPLEMENTATION* uses an unsigned long to
store its cookie in.

> >
> >Other implementations are allowed to use different types for this. In
> >fact, I'd be surprised if NPTL and LinuxThreads would have the same
> >type... (they'll have the same size for ABI compat reasons of course,
> >but type... not so sure).
> >
> I haven't faced the same returns with 2.4.18. So why is it so with 2.6.x
> kernels ? pthread_self() on 2.4.18 was returning the same as gettid()
> with 2.6.x.

pure luck. NPTL threading uses it to store a pointer to per thread info
structure; other threading (linuxthreads) may have stored a pid there to
identify the internal thread. nptl is 2.6 only so you might have
switched implementation of threading when you switched kernels.
Re: Open source firewalls
Proxies can be a good way of filtering but it can't avoid buffer overflows. It can only increase it. More code more bugs. If it is running on a hardware firewall as a service then its more dangerous as once it is compramised then IDS signatures also can be deleated :-). No use of IDS the right ? So the best way is either make your code free of buffer overflows or use some library which controls the attack during any buffer overflow or use Stack Randomisation and Canary based approaches. rvk Helge Hafting wrote: RVK wrote: I don't think buffer overflow has anything to do with transparent proxy. Transparent proxying is just doing some protocol filtering. A transparent proxy is a protocol filter, which is why it is an ideal way of detecting protocol-dependent buffer overflow attacks. The detection code have to be built into the proxy, of course. Examples: A web proxy can check for anomalous long "get" request, there have been web servers with buffer overflows when the URL was too long. The proxy can terminate such connections, protecting the possibly vulnerable webserver. An ftp proxy can check for (and remove) anomalous long filenames, as well as funnies like "ls */*/*/*/*/*" Similiar for many other services. The proxy approach is useful because knowledge of the protocol is necessary. After all, it is ok to up/download a huge file via ftp, while a 2M filename is suspicious. Size alone is not enough. Still the proxy code may have some buffer overflows. A proxy (or any other attempt at a firewall) may have its own holes of course, but avoiding making them isn't that hard. The best way is first to try avoiding any buffer overflows and take programming precautions. Of course, if you have the source and that source isn't an unmaintainable mess. One or both of those conditions may fail, and then the IDS becomes useful. Other way is to chroot the services, if running it on a firewall. Provided it is an unixish server . . . 
There are various mechanisms which can be used like bounding the memory region it self. Stack Randomisation and Canary based approaches can also avoid any buffer overflow attacks. These may or may not be available. You can always stick a proxy firewall in front of the server though, no matter what os and server apps it runs. Helge Hafting . - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, Jul 14, 2005 at 12:25:40PM +0200, Krzysztof Halasa wrote:
> Linus Torvalds <[EMAIL PROTECTED]> writes:
>
> > And in short-term things, the timeval/jiffie conversion is likely to be a
> > _bigger_ issue than the crystal frequency conversion.
> >
> > So we should aim for a HZ value that makes it easy to convert to and from
> > the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all
> > good values for that reason. 864 is not.
>
> Probably only theoretical, and probably the hardware isn't up to it...
> But what if we have:
> - 64-bit jiffies done in hardware (a counter). 1 cycle = 1 microsecond
>   or even a CPU clock cycle. Can *APIC or another HPET do that?

HPETs have a fixed frequency (usually 14.31818 MHz, but that depends on
the manufacturer).

> - 64-bit "match timer" (i.e., a register in the counter which fires IRQ
>   when it matches the counter value)

That's implemented in the HPET hardware.

> - the CPU(s) sorting the timer list and programming "match timer" with
>   software timer next to be executed. Upon firing the timer, a new "next
>   to be executed" timer would be programmed into the counter's "match
>   timer".
>
> We would have no timer ticks when nobody requested them - the CPUs would
> be allowed to sleep for, say, even 50 ms when no task is RUNNING.

--
Vojtech Pavlik
SuSE Labs, SuSE CR
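The scheme in the quoted list (find the software timer that expires next, program it into the match register) can be sketched as follows. This is a userspace illustration with invented names; it assumes the usual 14.31818 MHz rate mentioned above and does not touch real HPET registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed counter rate: 14.31818 MHz. Real hardware reports its own period. */
#define HPET_HZ 14318180ull

/*
 * Convert microseconds to counter ticks (rounds down).
 * The intermediate product overflows 64 bits only for intervals of
 * roughly two weeks or more, which is fine for a sketch.
 */
static uint64_t us_to_ticks(uint64_t us)
{
    return us * HPET_HZ / 1000000ull;
}

/*
 * Return the earliest absolute expiry (in ticks) among n pending timers,
 * i.e. the value that would be written to the match register.
 * Returns UINT64_MAX when no timer is pending, meaning "no interrupt
 * needed".
 */
static uint64_t next_match(const uint64_t *expiry, size_t n)
{
    uint64_t best = UINT64_MAX;

    for (size_t i = 0; i < n; i++)
        if (expiry[i] < best)
            best = expiry[i];
    return best;
}
```

For the 50 ms idle period mentioned in the quote, us_to_ticks(50000) gives the number of counter ticks the CPU could sleep before the next match fires.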
Re: Open source firewalls
RVK wrote: I don't think buffer overflow has anything to do with transparent proxy. Transparent proxying is just doing some protocol filtering. A transparent proxy is a protocol filter, which is why it is an ideal way of detecting protocol-dependent buffer overflow attacks. The detection code have to be built into the proxy, of course. Examples: A web proxy can check for anomalous long "get" request, there have been web servers with buffer overflows when the URL was too long. The proxy can terminate such connections, protecting the possibly vulnerable webserver. An ftp proxy can check for (and remove) anomalous long filenames, as well as funnies like "ls */*/*/*/*/*" Similiar for many other services. The proxy approach is useful because knowledge of the protocol is necessary. After all, it is ok to up/download a huge file via ftp, while a 2M filename is suspicious. Size alone is not enough. Still the proxy code may have some buffer overflows. A proxy (or any other attempt at a firewall) may have its own holes of course, but avoiding making them isn't that hard. The best way is first to try avoiding any buffer overflows and take programming precautions. Of course, if you have the source and that source isn't an unmaintainable mess. One or both of those conditions may fail, and then the IDS becomes useful. Other way is to chroot the services, if running it on a firewall. Provided it is an unixish server . . . There are various mechanisms which can be used like bounding the memory region it self. Stack Randomisation and Canary based approaches can also avoid any buffer overflow attacks. These may or may not be available. You can always stick a proxy firewall in front of the server though, no matter what os and server apps it runs. 
Helge Hafting - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, Jul 14, 2005 at 09:42:18AM +0200, Arjan van de Ven wrote:
> > IOW, nothing ever sees any "variable frequency", and there's never any
> > question about what the timer tick is: the timer tick is 2kHz as far as
> > everybody is concerned. It's just that the ticks sometimes come in
> > "bunches of 20".
>
> btw we can hide all of this a lot nicer from just about the entire
> kernel by reducing the usage of both HZ and jiffies in drivers/non
> platform code. That isn't hard; msleep() is a good step forward there
> already; the next step is a nicer api for add_timer/mod_timer that is
> both relative and in miliseconds; with those 2 the majority of code that
> has "knowledge" about this shrinks to near zero. Once we have that the
> actual implementation of this in the background matters a whole lot
> less.

A note on the relative timer API: There needs to be a way to say
"x milliseconds from the time this timer should have triggered" instead
of "x milliseconds from now", to avoid skew in timers that try to be
strictly periodic.

But other than that - such an API would be a great thing for drivers.

--
Vojtech Pavlik
SuSE Labs, SuSE CR
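The difference between the two ways of re-arming a periodic timer is easy to show with plain arithmetic. The function names below are invented for this sketch and do not correspond to any existing kernel API.

```c
#include <stdint.h>

/*
 * Skew-free re-arm: the next expiry is based on when the timer was
 * supposed to fire, so handler latency does not accumulate.
 */
static uint64_t rearm_from_expiry(uint64_t scheduled_ms, uint64_t period_ms)
{
    return scheduled_ms + period_ms;
}

/*
 * Skew-prone re-arm: the next expiry is based on the current time, so
 * every late handler run pushes all later expiries back.
 */
static uint64_t rearm_from_now(uint64_t now_ms, uint64_t period_ms)
{
    return now_ms + period_ms;
}
```

For example, with a 100 ms period, a timer scheduled for t=1000 whose handler only runs at t=1003 is re-armed for 1100 by the first function and for 1103 by the second. After many periods the second variant has drifted by the sum of all such delays.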
[2.6-GIT] NTFS: Release 2.1.23.
Hi Linus, please pull from rsync://rsync.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs-2.6.git/HEAD This is a big NTFS update. It was meant for as soon as 2.6.12 was released but it was delayed due to the need for a patch I submitted to Andrew for -mm to make it to the mainline kernel (which it has as of yesterday). This update includes lots of fixes including a really nasty deadlock that with recent kernels was triggered with 100% probability on umount of an NTFS volume so it is important to go in before 2.6.13 is released. Please apply. Thanks! Best regards, Anton -- Anton Altaparmakov (replace at with @) Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net WWW: http://linux-ntfs.sf.net/, http://www-stu.christs.cam.ac.uk/~aia21/ This will update the following files: Documentation/filesystems/ntfs.txt | 29 + fs/ntfs/ChangeLog | 179 - fs/ntfs/Makefile |4 fs/ntfs/aops.c | 166 +--- fs/ntfs/attrib.c | 630 + fs/ntfs/attrib.h | 16 fs/ntfs/compress.c | 46 +- fs/ntfs/debug.c| 15 fs/ntfs/dir.c | 32 - fs/ntfs/file.c |2 fs/ntfs/index.c| 16 fs/ntfs/inode.c| 530 ++-- fs/ntfs/inode.h|7 fs/ntfs/layout.h | 87 ++-- fs/ntfs/lcnalloc.c | 72 +-- fs/ntfs/logfile.c | 11 fs/ntfs/mft.c | 227 fs/ntfs/namei.c| 34 + fs/ntfs/ntfs.h |8 fs/ntfs/runlist.c | 278 ++ fs/ntfs/runlist.h | 16 fs/ntfs/super.c| 692 - fs/ntfs/sysctl.c |4 fs/ntfs/time.h |4 fs/ntfs/types.h| 10 fs/ntfs/unistr.c |2 fs/ntfs/usnjrnl.c | 84 fs/ntfs/usnjrnl.h | 205 ++ fs/ntfs/volume.h | 12 29 files changed, 2522 insertions(+), 896 deletions(-) through these ChangeSets: commit ba6d2377c85c9b8a793f455d8c9b6cf31985d70f tree 21e65c76db693869c84864af02e91c4b997a6ba5 parent af859a42d798f047fbfe198ed315a942662c39d2 author Anton Altaparmakov <[EMAIL PROTECTED]> Sun, 26 Jun 2005 22:12:02 +0100 committer Anton Altaparmakov <[EMAIL PROTECTED]> Sun, 26 Jun 2005 22:12:02 +0100 NTFS: Fix a nasty deadlock that appeared in recent kernels. 
The situation: VFS inode X on a mounted ntfs volume is dirty. For same inode X, the ntfs_inode is dirty and thus corresponding on-disk inode, i.e. mft record, which is in a dirty PAGE_CACHE_PAGE belonging to the table of inodes, i.e. $MFT, inode 0. What happens: Process 1: sys_sync()/umount()/whatever... calls __sync_single_inode() for $MFT -> do_writepages() -> write_page for the dirty page containing the on-disk inode X, the page is now locked -> ntfs_write_mst_block() which clears PageUptodate() on the page to prevent anyone else getting hold of it whilst it does the write out. This is necessary as the on-disk inode needs "fixups" applied before the write to disk which are removed again after the write and PageUptodate is then set again. It then analyses the page looking for dirty on-disk inodes and when it finds one it calls ntfs_may_write_mft_record() to see if it is safe to write this on-disk inode. This then calls ilookup5() to check if the corresponding VFS inode is in icache(). This in turn calls ifind() which waits on the inode lock via wait_on_inode whilst holding the global inode_lock. Process 2: pdflush results in a call to __sync_single_inode for the same VFS inode X on the ntfs volume. This locks the inode (I_LOCK) then calls write-inode -> ntfs_write_inode -> map_mft_record() -> read_cache_page() for the page (in page cache of table of inodes $MFT, inode 0) containing the on-disk inode. This page has PageUptodate() clear because of Process 1 (see above) so read_cache_page() blocks when it tries to take the page lock for the page so it can call ntfs_read_page(). Thus Process 1 is holding the page lock on the page containing the on-disk inode X and it is waiting on the inode X to be unlocked in ifind() so it can write the page out and then unlock the page. And Process 2 is holding the inode lock on inode X and is waiting for the page to be unlocked so it can call ntfs_readpage() or discover that Process 1 set PageUptodate() again and use the page. 
Thus we have a deadlock due to ifind() waiting on the inode lock. The solution: The fix is
Re: GIT tree broken? (rsync depreciated)
Stelian Pop wrote:
> After resyncing cogito to the latest version (which incorporates the
> 'pack' changes, which were causing the failure), it does indeed work
> again, when using rsync.
>

hm, i haven't updated my git-tree (linux-2.6.git) for a while and i got
similar error messages. i updated cogito to:

% cg-version
cogito-0.12.1 (cbec08d191d36126ddaf021961cc8995794b4a72)

and the "cannot map sha1 file..." errors went away.
now i get:

Applying changes...
error: Could not read 043d051615aa5da09a7e44f1edbb69798458e067
error: Could not read 043d051615aa5da09a7e44f1edbb69798458e067
error: Could not read c101f3136cc98a003d0d16be6fab7d0d950581a6
error: Could not read c101f3136cc98a003d0d16be6fab7d0d950581a6
error: Could not read c101f3136cc98a003d0d16be6fab7d0d950581a6
error: Could not read a18bcb7450840f07a772a45229de4811d930f461
Merging 99f95e5286df2f69edab8a04c7080d986ee4233b -> 514fd7fd01d378a7b5584c657d9807fc28f22079
	to 62351cc38d3eaf3de0327054dd6ebc039f4da80d...
fatal: failed to unpack tree object bda3910b7737a4fac464792657ffedcba185d799
cg-merge: git-read-tree failed (merge likely blocked by local changes)

i *think* i did not make any local changes, but if i did - i want to get
rid of them and want a clean tree. cg-status prints a lot of files with
a "D" in front of them but "cg-status -h" does not know about the "D"
status flag.

any hints for this one?

thank you,
Christian.
--
BOFH excuse #378:

Operators killed by year 2000 bug bite.
[patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c
The setup-bus code doesn't work correctly for configurations
with more than one display adapter in the same PCI domain.
This stuff actually is a leftover of an early 2.4 PCI setup code
and apparently it stopped working after some "bridge_ctl" changes.
So the best thing we can do is just to remove it and rely on the fact
that any firmware *has* to configure VGA port forwarding for the boot
display device properly.

But then we need to ensure that the bus->bridge_ctl will always
contain valid information collected at the probe time, therefore
the following change in pci_scan_bridge() is needed.

Signed-off-by: Ivan Kokshaysky <[EMAIL PROTECTED]>

--- 2.6.13-rc3/drivers/pci/probe.c	Thu Jul 14 11:09:52 2005
+++ linux/drivers/pci/probe.c	Thu Jul 14 11:22:06 2005
@@ -507,7 +507,7 @@ int __devinit pci_scan_bridge(struct pci
 	pci_write_config_dword(dev, PCI_PRIMARY_BUS, buses);
 
 	if (!is_cardbus) {
-		child->bridge_ctl = PCI_BRIDGE_CTL_NO_ISA;
+		child->bridge_ctl = bctl | PCI_BRIDGE_CTL_NO_ISA;
 		/*
 		 * Adjust subordinate busnr in parent buses.
 		 * We do this before scanning for children because
--- 2.6.13-rc3/drivers/pci/setup-bus.c	Thu Jul 14 11:09:52 2005
+++ linux/drivers/pci/setup-bus.c	Thu Jul 14 11:22:54 2005
@@ -51,8 +51,6 @@ pbus_assign_resources_sorted(struct pci_
 	struct resource_list head, *list, *tmp;
 	int idx;
 
-	bus->bridge_ctl &= ~PCI_BRIDGE_CTL_VGA;
-
 	head.next = NULL;
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		u16 class = dev->class >> 8;
@@ -62,10 +60,6 @@ pbus_assign_resources_sorted(struct pci_
 		    class == PCI_CLASS_BRIDGE_HOST)
 			continue;
 
-		if (class == PCI_CLASS_DISPLAY_VGA ||
-		    class == PCI_CLASS_NOT_DEFINED_VGA)
-			bus->bridge_ctl |= PCI_BRIDGE_CTL_VGA;
-
 		pdev_sort_resources(dev, &head);
 	}
 
@@ -509,12 +503,6 @@ pci_bus_assign_resources(struct pci_bus
 
 	pbus_assign_resources_sorted(bus);
 
-	if (bus->bridge_ctl & PCI_BRIDGE_CTL_VGA) {
-		/* Propagate presence of the VGA to upstream bridges */
-		for (b = bus; b->parent; b = b->parent) {
-			b->bridge_ctl |= PCI_BRIDGE_CTL_VGA;
-		}
-	}
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		b = dev->subordinate;
 		if (!b)
Re: console remains blanked
>Before 2.6.12-rc2, the console was unblanked by just
>writing to the console.
>For keyboardless and mouseless systems (which is my
>case, embedded) this new behaviour is a bit annoying.

Interesting. I have observed the following (2.6.13-rc1 and a little earlier):

mplayer bla.avi -vo cvidix

After the blanking time, all chars turn black[1] but are still "visible" thanks to the movie in the background. It looks like a VGA palette manipulation of entries 0-15. This is quite different from filling the 80x25 screen with space characters.

Jan Engelhardt
-- 
Patch to make mount follow a symlink at /etc/mtab
I attach a patch that modifies the mount program in the util-linux package so that if /etc/mtab is a symbolic link (to a location outside of /proc) then mount accesses mtab at the target of the symbolic link. This feature is useful when the root filesystem is mounted read-only; /etc/mtab can then be symlinked to a location on a writable filesystem. In the long run mtab should be eliminated entirely but in the meantime it is nice to be able to relocate the file. The patch deals correctly with the fact that mount creates lock files in the same directory as mtab. This patch also fixes a bug in umount.c whereby umount will update mtab in some circumstances even though the -n option has been given. I wrote the patch in August 2003 and submitted it to the util-linux maintainer at that time. He said that he would apply it if it proved to be reliable after some testing. The patch has been on my web page http://panopticon.csustan.edu/thood/readonly-root.html for almost two years since then, updated from time to time as new versions of util-linux were released. I have advertised the patch in various forums and I have used the patch myself for a long time. No problems have ever been reported. The latest version of the patch applies to versions 2.12p and 2.12q. util-linux-2.12q-symlinkmtab_jdth20050709.patch I tested it by patching the latest Debian and Ubuntu packages. In order for the latter to build I had to modify 10fstab.dpatch as well. I attach the patch for that file too. 
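The symlink rule described above can be sketched roughly as follows. This is not code from the patch: `resolve_mtab` is a made-up helper name, and the sketch assumes a POSIX system where realpath() fills a PATH_MAX-sized buffer. It uses the path as given when it is missing or a regular file, follows a symlink to its target otherwise, and refuses targets under /proc/ (the common /etc/mtab -> /proc/mounts case).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: decide which file to treat as mtab.
 * On success writes the path into out (at least PATH_MAX bytes) and
 * returns 0. Returns -1 if the symlink cannot be resolved or if it
 * points into /proc/.
 */
static int resolve_mtab(const char *path, char *out)
{
	struct stat st;

	assert(path != NULL && out != NULL);

	if (lstat(path, &st) != 0 || !S_ISLNK(st.st_mode)) {
		/* Missing, or present but not a symlink: use the path as given. */
		if (strlen(path) >= PATH_MAX)
			return -1;
		strcpy(out, path);
		return 0;
	}

	/* A symlink: follow it to the real file. */
	if (realpath(path, out) == NULL)
		return -1;

	/* Never write through a link that ends up in /proc/. */
	if (strncmp(out, "/proc/", 6) == 0)
		return -1;

	return 0;
}
```

A caller would then derive the lock and temporary file names from the resolved path, so that they land in the same (writable) directory as the real mtab.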
util-linux-2.12q-symlinkmtab-10fstab_jdth20050709.patch -- Thomas Hood diff -uNr util-linux-2.12p_ORIG/mount/fstab.c util-linux-2.12p/mount/fstab.c --- util-linux-2.12p_ORIG/mount/fstab.c 2004-12-21 20:09:24.0 +0100 +++ util-linux-2.12p/mount/fstab.c 2005-07-09 11:53:19.0 +0200 @@ -1,7 +1,10 @@ -/* 1999-02-22 Arkadiusz Mi¶kiewicz <[EMAIL PROTECTED]> +/* + * 1999-02-22 Arkadiusz Mi¶kiewicz <[EMAIL PROTECTED]> * - added Native Language Support - * Sun Mar 21 1999 - Arnaldo Carvalho de Melo <[EMAIL PROTECTED]> + * 1999-03-21 Arnaldo Carvalho de Melo <[EMAIL PROTECTED]> * - fixed strerr(errno) in gettext calls + * 2003-08-08 Thomas Hood <[EMAIL PROTECTED]> with help from Patrick McLean + * - Write through a symlink at /etc/mtab if it doesn't point into /proc/ */ #include @@ -11,67 +14,129 @@ #include #include "mntent.h" #include "fstab.h" +#include "realpath.h" #include "sundries.h" #include "xmalloc.h" #include "mount_blkid.h" #include "paths.h" #include "nls.h" -#define streq(s, t) (strcmp ((s), (t)) == 0) - -#define PROC_MOUNTS "/proc/mounts" - - /* Information about mtab. 
*/ -static int have_mtab_info = 0; -static int var_mtab_does_not_exist = 0; -static int var_mtab_is_a_symlink = 0; +/* A 64 bit number can be displayed in 20 decimal digits */ +#define LEN_LARGEST_PID 20 +#define MTAB_PATH_MAX (PATH_MAX - (sizeof(MTAB_LOCK_SUFFIX) - 1) - LEN_LARGEST_PID) +static char mtab_path[MTAB_PATH_MAX]; +static char mtab_lock_path[PATH_MAX]; +static char mtab_lock_targ[PATH_MAX]; +static char mtab_temp_path[PATH_MAX]; -static void +/* + * Set mtab_path to the real path of the mtab file + * or to the null string if that path is inaccessible + * + * Run this early + */ +void get_mtab_info(void) { struct stat mtab_stat; - if (!have_mtab_info) { - if (lstat(MOUNTED, _stat)) - var_mtab_does_not_exist = 1; - else if (S_ISLNK(mtab_stat.st_mode)) - var_mtab_is_a_symlink = 1; - have_mtab_info = 1; + if (lstat(MOUNTED, _stat)) { + /* Assume that the lstat error means that the file does not exist */ + /* (Maybe we should check errno here) */ + strcpy(mtab_path, MOUNTED); + } else if (S_ISLNK(mtab_stat.st_mode)) { + /* Is a symlink */ + int len; + char *r = myrealpath(MOUNTED, mtab_path, MTAB_PATH_MAX); + mtab_path[MTAB_PATH_MAX - 1] = 0; /* Just to be sure */ + len = strlen(mtab_path); + if ( + r == NULL + || len == 0 + || len >= (MTAB_PATH_MAX - 1) + || streqn(mtab_path, PATH_PROC, sizeof(PATH_PROC) - 1) + ) { + /* Real path invalid or inaccessible */ + mtab_path[0] = '\0'; + return; + } + /* mtab_path now contains mtab's real path */ + } else { + /* Exists and is not a symlink */ + strcpy(mtab_path, MOUNTED); } + + sprintf(mtab_lock_path, "%s%s", mtab_path, MTAB_LOCK_SUFFIX); + sprintf(mtab_lock_targ, "%s%s%d", mtab_path, MTAB_LOCK_SUFFIX, getpid()); + sprintf(mtab_temp_path, "%s%s", mtab_path, MTAB_TEMP_SUFFIX); + + return; } -int -mtab_does_not_exist(void) { - get_mtab_info(); - return var_mtab_does_not_exist; +/* + * Tell whether or not the mtab real path is accessible + * + * get_mtab_info() must have been run + */ +static int 
+mtab_is_accessible(void) { + return (mtab_path[0] != '\0'); } +/* + * Tell whether or not the mtab file currently exists + * + * Note that the answer here is independent of whether or + * not the file is writable, so if you are planning to create + * the mtab file then check
Re: Thread_Id
Ian Campbell wrote:
> On Thu, 2005-07-14 at 16:32 +0530, RVK wrote:
> > Ian Campbell wrote:
> > > What Arjan is saying is that pthread_t is a cookie -- this means that
> > > you cannot interpret it in any way, it is just a "thing" which you can
> > > pass back to the API, that pthread_t happens to be typedef'd to unsigned
> > > long int is irrelevant.
> > Do you want to say for both 2.6.x and 2.4.x I should interpret that way ?
> As I understand it, yes, you should never try and assign any meaning to
> the values. The fact that you may have been able to find some apparent
> meaning under 2.4 is just a coincidence.

I am sorry, I don't agree on this. This confusion has been created only because of the different behavior of pthread_self() on 2.4.18 and 2.6.x kernels, and I am looking to clarify my doubt. I can't digest this at all.

rvk

> Ian.
> -- 
> Ian Campbell
> Current Noise: Nile - Annihilation Of The Wicked
>
> BOFH excuse #127: Sticky bits on disk.
Re: Thread_Id
Arjan van de Ven wrote:
> On Thu, 2005-07-14 at 15:36 +0530, RVK wrote:
> > > it doesn't return a number it returns a pointer ;) or a floating point
> > > number. You don't know :)
> > >
> > > what it returns is a *cookie*. A cookie that you can only use to pass
> > > back to various pthread functions.
> >
> > Hahaha..common. Please clarify following
>
> I'm missing the joke

It's not a joke, it's a confusion created by the thread identifier.

> > SYNOPSIS
> >        #include <pthread.h>
> >
> >        pthread_t pthread_self(void);
> >
> > DESCRIPTION
> >        pthread_self return the thread identifier for the calling thread.
>
> *identifier*. It doesn't give a meaning beyond that, and if you look at
> other pthread manpages (say pthread_join) it just wants that identifier
> back. If you want to attach meaning to a thread identifier, please come
> up with a manpage/standard that actually defines the meaning of it.
>
> > bits/pthreadtypes.h:150:typedef unsigned long int pthread_t;
>
> and here you 1) look at implementation details of your specific
> threading implementation and 2) you prove that your analysis is wrong
> since the implementation you look at defines it as *unsigned* so it
> can't be negative. So what your app does is clearly wrong even within
> the implementation you look at.

So then what is the meaning of that typedef, and why is it still there ?

> Other implementations are allowed to use different types for this. In
> fact, I'd be surprised if NPTL and LinuxThreads would have the same
> type... (they'll have the same size for ABI compat reasons of course,
> but type... not so sure).

I haven't seen the same return values with 2.4.18. So why is it so with 2.6.x kernels ? pthread_self() on 2.4.18 was returning the same value that gettid() returns on 2.6.x.

rvk
Re: Thread_Id
On Thu, 2005-07-14 at 16:32 +0530, RVK wrote:
> Ian Campbell wrote:
> >What Arjan is saying is that pthread_t is a cookie -- this means that
> >you cannot interpret it in any way, it is just a "thing" which you can
> >pass back to the API, that pthread_t happens to be typedef'd to unsigned
> >long int is irrelevant.
> Do you want to say for both 2.6.x and 2.4.x I should interpret that way ?

As I understand it, yes, you should never try and assign any meaning to the values. The fact that you may have been able to find some apparent meaning under 2.4 is just a coincidence.

Ian.
-- 
Ian Campbell
Current Noise: Nile - Annihilation Of The Wicked

BOFH excuse #127: Sticky bits on disk.
Re: Thread_Id
Ian Campbell wrote:
> On Thu, 2005-07-14 at 15:36 +0530, RVK wrote:
> > bits/pthreadtypes.h:150:typedef unsigned long int pthread_t;
>
> That's an implementation detail which you cannot determine any
> information from.
>
> What Arjan is saying is that pthread_t is a cookie -- this means that
> you cannot interpret it in any way, it is just a "thing" which you can
> pass back to the API, that pthread_t happens to be typedef'd to unsigned
> long int is irrelevant.

Do you want to say for both 2.6.x and 2.4.x I should interpret that way ?

rvk

> Ian.
> -- 
> Ian Campbell
> Current Noise: Nile - Annihilation Of The Wicked
>
> Don't tell me what you dreamed last night for I've been reading Freud.
[PATCH] V4L: Bug fixes at tuner, cx88 and tea5767 (against 2.6.13-rc3)
- Bug fixes: 1) On CX88 code, some cards needs to have audio reprogramed after changing video channel; 2) Tuner autodetection code seems not to work on some cards. Now, no_autodetect insmod option allows disabling autodetection code; 3) Minor fixes at tea5767 to reduce integer trunc; 4) There are some new Pixelview Ultra Pro cards that doesn't use TEA5767 for radio. As autodetection is capable of checking for tea, radio tuners and addresses removed. - CX88 version number incremented. Signed-off-by: Mauro Carvalho Chehab <[EMAIL PROTECTED]> linux/drivers/media/video/cx88/cx88-cards.c |8 ++-- linux/drivers/media/video/cx88/cx88-dvb.c |2 - linux/drivers/media/video/cx88/cx88-video.c |7 +++- linux/drivers/media/video/cx88/cx88.h |6 +-- linux/drivers/media/video/tea5767.c | 34 +++- linux/drivers/media/video/tuner-core.c | 31 +++--- 6 files changed, 52 insertions(+), 36 deletions(-) diff -u linux-2.6.13/drivers/media/video/cx88/cx88.h linux/drivers/media/video/cx88/cx88.h --- linux-2.6.13/drivers/media/video/cx88/cx88.h 2005-07-13 11:07:25.0 -0300 +++ linux/drivers/media/video/cx88/cx88.h 2005-07-14 07:32:17.0 -0300 @@ -1,5 +1,5 @@ /* - * $Id: cx88.h,v 1.68 2005/07/07 14:17:47 mchehab Exp $ + * $Id: cx88.h,v 1.69 2005/07/13 17:25:25 mchehab Exp $ * * v4l2 device driver for cx2388x based TV cards * @@ -35,8 +35,8 @@ #include "btcx-risc.h" #include "cx88-reg.h" -#include -#define CX88_VERSION_CODE KERNEL_VERSION(0,0,4) +#include +#define CX88_VERSION_CODE KERNEL_VERSION(0,0,5) #ifndef TRUE # define TRUE (1==1) diff -u linux-2.6.13/drivers/media/video/cx88/cx88-cards.c linux/drivers/media/video/cx88/cx88-cards.c --- linux-2.6.13/drivers/media/video/cx88/cx88-cards.c 2005-07-13 11:07:25.0 -0300 +++ linux/drivers/media/video/cx88/cx88-cards.c 2005-07-14 07:32:17.0 -0300 @@ -1,5 +1,5 @@ /* - * $Id: cx88-cards.c,v 1.85 2005/07/04 19:35:05 mkrufky Exp $ + * $Id: cx88-cards.c,v 1.86 2005/07/14 03:06:43 mchehab Exp $ * * device driver for Conexant 2388x based TV cards * 
card-specific stuff. @@ -682,9 +682,9 @@ .name = "PixelView PlayTV Ultra Pro (Stereo)", /* May be also TUNER_YMEC_TVF_5533MF for NTSC/M or PAL/M */ .tuner_type = TUNER_PHILIPS_FM1216ME_MK3, - .radio_type = TUNER_TEA5767, - .tuner_addr = 0xc2>>1, - .radio_addr = 0xc0>>1, + .radio_type = UNSET, + .tuner_addr = ADDR_UNSET, + .radio_addr = ADDR_UNSET, .input = {{ .type = CX88_VMUX_TELEVISION, .vmux = 0, diff -u linux-2.6.13/drivers/media/video/cx88/cx88-video.c linux/drivers/media/video/cx88/cx88-video.c --- linux-2.6.13/drivers/media/video/cx88/cx88-video.c 2005-07-13 11:07:25.0 -0300 +++ linux/drivers/media/video/cx88/cx88-video.c 2005-07-14 07:32:17.0 -0300 @@ -1,5 +1,5 @@ /* - * $Id: cx88-video.c,v 1.79 2005/07/07 14:17:47 mchehab Exp $ + * $Id: cx88-video.c,v 1.80 2005/07/13 08:49:08 mchehab Exp $ * * device driver for Conexant 2388x based TV cards * video4linux video interface @@ -1346,6 +1346,11 @@ dev->freq = f->frequency; cx88_newstation(core); cx88_call_i2c_clients(dev->core,VIDIOC_S_FREQUENCY,f); + + /* When changing channels it is required to reset TVAUDIO */ + msleep (10); + cx88_set_tvaudio(core); + up(>lock); return 0; } diff -u linux-2.6.13/drivers/media/video/cx88/cx88-dvb.c linux/drivers/media/video/cx88/cx88-dvb.c --- linux-2.6.13/drivers/media/video/cx88/cx88-dvb.c 2005-07-13 11:07:25.0 -0300 +++ linux/drivers/media/video/cx88/cx88-dvb.c 2005-07-14 07:32:17.0 -0300 @@ -1,5 +1,5 @@ /* - * $Id: cx88-dvb.c,v 1.41 2005/07/04 19:35:05 mkrufky Exp $ + * $Id: cx88-dvb.c,v 1.42 2005/07/12 15:44:55 mkrufky Exp $ * * device driver for Conexant 2388x based TV cards * MPEG Transport Stream (DVB) routines diff -u linux-2.6.13/drivers/media/video/tuner-core.c linux/drivers/media/video/tuner-core.c --- linux-2.6.13/drivers/media/video/tuner-core.c 2005-07-13 11:07:25.0 -0300 +++ linux/drivers/media/video/tuner-core.c 2005-07-14 07:32:17.0 -0300 @@ -1,5 +1,5 @@ /* - * $Id: tuner-core.c,v 1.55 2005/07/08 13:20:33 mchehab Exp $ + * $Id: tuner-core.c,v 1.58 2005/07/14 
03:06:43 mchehab Exp $ * * i2c tv tuner chip device driver * core core, i.e. kernel interfaces, registering and so on @@ -39,6 +39,9 @@ static unsigned int addr = 0; module_param(addr, int, 0444); +static unsigned int no_autodetect = 0; +module_param(no_autodetect, int, 0444); + /* insmod options used at runtime => read/write */ unsigned int tuner_debug = 0; module_param(tuner_debug, int, 0644); @@ -318,17 +321,19 @@ tuner_info("chip found @ 0x%x (%s)\n", addr << 1, adap->name); /* TEA5767 autodetection code - only for addr = 0xc0 */ - if (addr == 0x60) { - if (tea5767_autodetection(>i2c) != EINVAL) { - t->type = TUNER_TEA5767; - t->mode_mask = T_RADIO; - t->mode = T_STANDBY; -
Re: [patch 2.6.13-git] 8250 tweaks
On Thu, Jul 14, 2005 at 12:12:02AM -0700, Sam Song wrote:
> It turned out the conflict of uart init definition
> like MPC10X_UART0_IRQ in ../syslib/mpc10x_common.c
> and SERIAL_PORT_DFNS in ../platform/sandpoint.h. By
> now, only MPC10X_UART0_IRQ stuff is needed.
> SERIAL_PORT_DFNS should be omitted.

Oh dear, it seems that I missed a load of fixups then. I only scanned include/asm-* for SERIAL_PORT_DFNS, and I stupidly thought that on PPC this "platform" directory would be in include/asm-ppc.

> Seems it's time for me to stand with Russell:-)

Well, in this case, the "whinging" resulted in finding a _real_ bug and locating why your ports weren't being found. So I guess it's good for something.

Can you mail me a diff of the changes you made to arch/ppc/platforms/sandpoint.h please?

If that file is being used, it seems that you actually have 4 ports defined in total. However, I'm a little confused because the sandpoint.h defines don't seem to match your original dmesg output.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core
Re: Thread_Id
On Thu, 2005-07-14 at 15:36 +0530, RVK wrote:
> bits/pthreadtypes.h:150:typedef unsigned long int pthread_t;

That's an implementation detail which you cannot determine any information from.

What Arjan is saying is that pthread_t is a cookie -- this means that you cannot interpret it in any way, it is just a "thing" which you can pass back to the API, that pthread_t happens to be typedef'd to unsigned long int is irrelevant.

Ian.
-- 
Ian Campbell
Current Noise: Nile - Annihilation Of The Wicked

Don't tell me what you dreamed last night for I've been reading Freud.
2.6.13-rc3 ACPI regression and hang on x86-64
On my x86-64 laptop (Targa Visionary 811: Athlon64 + VIA chipset, Arima OEM:d HW also sold by eMachines and others), ACPI is broken and hangs the x86-64 2.6.13-rc3 kernel. During boot, ACPI reduces the screen's brightness (it's always done this in the x86-64 kernels but not the i386 ones), so I have to press a specific key combination (Fn+F8) to increase the brightness. This worked up to and including the 2.6.13-rc2 kernel, but with 2.6.13-rc3 it causes an error message: acpi_ec-0217 [04] acpi_ec_leave_burst_mo: --->status fail on the console, and then the machine is hung hard. With the i386 kernel, both this key combination and the other one for reducing the brightness work as expected. A diff between the dmesg logs for 2.6.13-rc2 and -rc3 (included below) indicates that APCI experiences several new errors in rc3. /Mikael --- dmesg-2.6.13-rc2-x86_64 2005-07-14 11:59:58.0 +0200 +++ dmesg-2.6.13-rc3-x86_64 2005-07-14 11:59:59.0 +0200 @@ -1,5 +1,5 @@ Bootdata ok (command line is ro root=/dev/hda7) -Linux version 2.6.13-rc2 ([EMAIL PROTECTED]) (gcc version 4.0.1) #1 Fri Jul 8 15:44:53 CEST 2005 +Linux version 2.6.13-rc3 ([EMAIL PROTECTED]) (gcc version 4.0.1) #1 Wed Jul 13 17:51:48 CEST 2005 BIOS-provided physical RAM map: BIOS-e820: - 0009f800 (usable) BIOS-e820: 0009f800 - 000a (reserved) @@ -37,46 +37,49 @@ Initializing CPU#0 PID hash table entries: 2048 (order: 11, 65536 bytes) time.c: Using 1.193182 MHz PIT timer. -time.c: Detected 1603.693 MHz processor. +time.c: Detected 1603.705 MHz processor. time.c: Using PIT/TSC based timekeeping. Console: colour VGA+ 80x25 Dentry cache hash table entries: 65536 (order: 7, 524288 bytes) Inode-cache hash table entries: 32768 (order: 6, 262144 bytes) -Memory: 511408k/523200k available (1653k kernel code, 11012k reserved, 941k data, 128k init) -Calibrating delay using timer specific routine.. 
3211.68 BogoMIPS (lpj=16058428) +Memory: 511408k/523200k available (1656k kernel code, 11012k reserved, 941k data, 128k init) +Calibrating delay using timer specific routine.. 3211.67 BogoMIPS (lpj=16058383) Mount-cache hash table entries: 256 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 1024K (64 bytes/line) +mtrr: v2.0 (20020519) CPU: Mobile AMD Athlon(tm) 64 Processor 2800+ stepping 0a - tbxface-0118 [02] acpi_load_tables : ACPI Tables successfully acquired + tbxface-0120 [02] acpi_load_tables : ACPI Tables successfully acquired Parsing all Control Methods: Table [DSDT](id F005) - 482 Objects with 46 Devices 148 Methods 16 Regions Parsing all Control Methods: Table [SSDT](id F003) - 3 Objects with 0 Devices 0 Methods 0 Regions -ACPI Namespace successfully loaded at root 803ac6e0 -evxfevnt-0094 [03] acpi_enable : Transition to ACPI mode successful +ACPI Namespace successfully loaded at root 803ad260 +evxfevnt-0096 [03] acpi_enable : Transition to ACPI mode successful Using local APIC timer interrupts. Detected 12.528 MHz APIC timer. testing NMI watchdog ... OK. NET: Registered protocol family 16 +ACPI: bus type pci registered PCI: Using configuration type 1 -mtrr: v2.0 (20020519) -ACPI: Subsystem revision 20050309 -evgpeblk-0979 [06] ev_create_gpe_block : GPE 00 to 0F [_GPE] 2 regs on int 0xA -evgpeblk-0987 [06] ev_create_gpe_block : Found 7 Wake, Enabled 0 Runtime GPEs in this block +ACPI: Subsystem revision 20050408 +evgpeblk-1016 [06] ev_create_gpe_block : GPE 00 to 0F [_GPE] 2 regs on int 0xA +evgpeblk-1024 [06] ev_create_gpe_block : Found 7 Wake, Enabled 0 Runtime GPEs in this block Completing Region/Field/Buffer/Package initialization:... Initialized 16/16 Regions 0/0 Fields 18/18 Buffers 17/27 Packages (494 nodes) -Executing all Device _STA and_INI methods:..[ACPI Debug] String: [0x24] " AC _STA" +Executing all Device _STA and_INI methods:..[ACPI Debug] String: [0x24] " AC _STA" ... 
49 Devices found containing: 49 _STA, 2 _INI methods ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing -nsxfeval-0250 [06] acpi_evaluate_object : Handle is NULL and Pathname is relative -nsxfeval-0250 [06] acpi_evaluate_object : Handle is NULL and Pathname is relative -nsxfeval-0250 [06] acpi_evaluate_object : Handle is NULL and Pathname is relative -nsxfeval-0250 [06] acpi_evaluate_object : Handle is NULL and Pathname is relative +nsxfeval-0251 [06] acpi_evaluate_object : Handle is NULL and Pathname is relative +nsxfeval-0251 [06] acpi_evaluate_object : Handle is NULL and Pathname is relative +nsxfeval-0251 [06] acpi_evaluate_object :
Re: Thread_Id
On Thu, 2005-07-14 at 15:36 +0530, RVK wrote:
> >it doesn't return a number it returns a pointer ;) or a floating point
> >number. You don't know :)
> >
> >what it returns is a *cookie*. A cookie that you can only use to pass
> >back to various pthread functions.
>
> Hahaha..common. Please clarify following

I'm missing the joke

> SYNOPSIS
>        #include <pthread.h>
>
>        pthread_t pthread_self(void);
>
> DESCRIPTION
>        pthread_self return the thread identifier for the calling thread.

*identifier*. It doesn't give a meaning beyond that, and if you look at other pthread manpages (say pthread_join) it just wants that identifier back. If you want to attach meaning to a thread identifier, please come up with a manpage/standard that actually defines the meaning of it.

> bits/pthreadtypes.h:150:typedef unsigned long int pthread_t;

and here you 1) look at implementation details of your specific threading implementation and 2) you prove that your analysis is wrong since the implementation you look at defines it as *unsigned* so it can't be negative. So what your app does is clearly wrong even within the implementation you look at.

Other implementations are allowed to use different types for this. In fact, I'd be surprised if NPTL and LinuxThreads would have the same type... (they'll have the same size for ABI compat reasons of course, but type... not so sure).
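For readers who want something concrete, here is a minimal sketch of the "cookie" point, assuming Linux with glibc. The pthread_t value is only ever handed back to pthread functions (here pthread_equal()), and a numeric thread ID, if one is really needed, is requested from the kernel through the gettid system call. The helper names `kernel_tid`, `same_thread` and `record_tid` are made up for illustration and are not part of any library.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: kernel thread ID of the caller (Linux-specific). */
static pid_t kernel_tid(void)
{
	return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: compare two pthread_t cookies without looking inside. */
static int same_thread(pthread_t a, pthread_t b)
{
	return pthread_equal(a, b) != 0;
}

/* Thread body: store this thread's kernel TID in the pid_t that arg points to. */
static void *record_tid(void *arg)
{
	assert(arg != NULL);
	*(pid_t *)arg = kernel_tid();
	return NULL;
}
```

A caller would pass record_tid to pthread_create() together with the address of a pid_t, join the thread, and then read the stored value. The number obtained this way is meaningful to the kernel (it shows up under /proc, for example), whereas the pthread_t itself should not be printed or compared with ==.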
Re: Open source firewalls
I don't think buffer overflows have anything to do with transparent proxying. Transparent proxying is just doing some protocol filtering, and the proxy code itself may still have buffer overflows. The best way is first to try to avoid buffer overflows by taking programming precautions. Another way is to chroot the services if they run on a firewall. There are various mechanisms which can be used, like bounding the memory region itself. Stack randomisation and canary-based approaches can also stop buffer overflow attacks.

An IDS runs at L7; the best example is snort. It's not possible for an IDS to detect these attacks accurately.

rvk

Helge Hafting wrote:
> Vinay Venkataraghavan wrote:
> > I know how to implement buffer overflow attacks. But how would an
> > intrusion detection system detect a buffer overflow attack.
>
> Buffer overflow attacks vary, but have one thing in common. The
> overflow string is much longer than what's usual for the app/protocol
> in question. It may also contain illegal characters, but be careful -
> non-English users use plenty of valid non-ASCII characters in
> filenames, passwords and so on.
>
> The way to do this is to implement a transparent proxy module for every
> protocol you want to do overflow prevention for. Collect the strings
> transmitted, pass them on after validating them. Or reset the
> connection when one gets "too long". For example, you may want to limit
> POP usernames to whatever the maximum username length is on your
> system. But make such things configurable, others may want longer
> usernames than you.
>
> > My question is at the layer that the intrusion detection system
> > operates, how will it know that a particular string for example is
> > liable to overflow a vulnerable buffer.
>
> It can't know of course, but it can suspect that 1000-character
> usernames, passwords or filenames is foul play and reset the
> connection. Or 10k URLs . . .
>
> Helge Hafting
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Linus Torvalds <[EMAIL PROTECTED]> writes:

> And in short-term things, the timeval/jiffie conversion is likely to be a
> _bigger_ issue than the crystal frequency conversion.
>
> So we should aim for a HZ value that makes it easy to convert to and from
> the standard user-space interface formats. 100Hz, 250Hz and 1000Hz are all
> good values for that reason. 864 is not.

Probably only theoretical, and probably the hardware isn't up to it...

But what if we have:
- 64-bit jiffies done in hardware (a counter). 1 cycle = 1 microsecond or even a CPU clock cycle. Can *APIC or another HPET do that?
- 64-bit "match timer" (i.e., a register in the counter which fires IRQ when it matches the counter value)
- the CPU(s) sorting the timer list and programming "match timer" with software timer next to be executed.

Upon firing the timer, a new "next to be executed" timer would be programmed into the counter's "match timer".

We would have no timer ticks when nobody requested them - the CPUs would be allowed to sleep for, say, even 50 ms when no task is RUNNING.
-- 
Krzysztof Halasa
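The "match timer" idea above could be modelled along these lines. This is only a toy sketch: `struct soft_timer`, `next_expiry` and `match_irq` are invented names, the hardware comparator is represented by nothing more than the returned value, and a linear scan stands in for the sorted timer list mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel: no timer pending, so the comparator can stay disarmed. */
#define NO_TIMER UINT64_MAX

/* Hypothetical software timer; expires is an absolute counter value. */
struct soft_timer {
	uint64_t expires;
	int pending;
};

/* Return the earliest pending expiry, or NO_TIMER if nothing is pending. */
static uint64_t next_expiry(const struct soft_timer *timers, size_t n)
{
	uint64_t best = NO_TIMER;
	size_t i;

	assert(timers != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (timers[i].pending && timers[i].expires < best)
			best = timers[i].expires;
	return best;
}

/*
 * Model of the comparator interrupt at counter value now: retire every
 * timer that has expired and return the value to load into the match
 * register next.
 */
static uint64_t match_irq(struct soft_timer *timers, size_t n, uint64_t now)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (timers[i].pending && timers[i].expires <= now)
			timers[i].pending = 0;	/* a real kernel would run the callback here */
	return next_expiry(timers, n);
}
```

With pending timers at 1200, 50000 and 900000 counter units, the comparator would be armed for 1200 first, then 50000, then 900000, and no interrupt would arrive in between, which is the "no ticks when nobody requested them" property.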
Re: console remains blanked
Hi,

--- Jan Engelhardt <[EMAIL PROTECTED]> wrote:
> The console is unblanked when you hit a key (or
> probably move a mouse too),
> not when some application outputs something on
> stdout/stderr/etc.

Before 2.6.12-rc2, the console was unblanked by just writing to the console. For keyboardless and mouseless systems (which is my case, embedded) this new behaviour is a bit annoying.

> Which kernel versions have this patch? I'm on
> 2.6.13-rc1 and have no problems
> with unblanking.

I have had this problem since 2.6.12-rc2. If I add back the patch hunk specified in my original message, the blanking behaviour changes back to that of pre-2.6.12-rc2 kernels.

I would just like to know whether this new behaviour is intentional and makes sense to everyone (except me :-)

Thanks for your feedback,
Albert
Re: Thread_Id
Arjan van de Ven wrote:
> On Thu, 2005-07-14 at 11:03 +0530, RVK wrote:
> > Robert Hancock wrote:
> > > RVK wrote:
> > > > Can anyone suggest me how to get the threadId using 2.6.x kernels.
> > > > pthread_self() does not work and returns some -ve integer.
> > >
> > > What do you mean, negative integer? It's not an integer, it's a
> > > pthread_t, you're not even supposed to look at it..
> >
> > What is pthread_t inturn defined to ? pthread_self for 2.4.x thread
> > libraries return +ve number(as u have a objection me calling it as
> > integer :-))
>
> it doesn't return a number it returns a pointer ;) or a floating point
> number. You don't know :)
>
> what it returns is a *cookie*. A cookie that you can only use to pass
> back to various pthread functions.

Hahaha..common. Please clarify following

SYNOPSIS
       #include <pthread.h>

       pthread_t pthread_self(void);

DESCRIPTION
       pthread_self return the thread identifier for the calling thread.

bits/pthreadtypes.h:150:typedef unsigned long int pthread_t;

rvk
Re: Open source firewalls
Vinay Venkataraghavan wrote:
> I know how to implement buffer overflow attacks. But how would an
> intrusion detection system detect a buffer overflow attack.

Buffer overflow attacks vary, but have one thing in common. The overflow string is much longer than what's usual for the app/protocol in question. It may also contain illegal characters, but be careful - non-English users use plenty of valid non-ASCII characters in filenames, passwords and so on.

The way to do this is to implement a transparent proxy module for every protocol you want to do overflow prevention for. Collect the strings transmitted, pass them on after validating them. Or reset the connection when one gets "too long". For example, you may want to limit POP usernames to whatever the maximum username length is on your system. But make such things configurable, others may want longer usernames than you.

> My question is at the layer that the intrusion detection system
> operates, how will it know that a particular string for example is
> liable to overflow a vulnerable buffer.

It can't know of course, but it can suspect that 1000-character usernames, passwords or filenames are foul play and reset the connection. Or 10k URLs . . .

Helge Hafting
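The per-field check described above might look something like this inside a proxy module. It is only a sketch under stated assumptions: `check_field` and the verdict names are invented, the limit is supplied by the caller so that it stays configurable, and bytes of 0x80 and above are accepted deliberately so that non-ASCII usernames and filenames pass.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict a (hypothetical) proxy module returns for one protocol field. */
enum verdict { FIELD_OK, FIELD_TOO_LONG, FIELD_BAD_BYTE };

/*
 * Check one field, e.g. a POP username, against a configurable limit.
 * Only ASCII control characters are rejected; bytes >= 0x80 are allowed
 * because they occur in legitimate non-ASCII names.
 */
static enum verdict check_field(const unsigned char *buf, size_t len,
				size_t max_len)
{
	size_t i;

	assert(buf != NULL || len == 0);

	if (len > max_len)
		return FIELD_TOO_LONG;	/* suspicious: reset the connection */

	for (i = 0; i < len; i++)
		if (buf[i] < 0x20 || buf[i] == 0x7f)
			return FIELD_BAD_BYTE;

	return FIELD_OK;		/* pass the string on to the real server */
}
```

The proxy would collect each field from the stream, call the check with the limit configured for that protocol, and forward the data only on FIELD_OK. As noted in the message, this cannot prove that an overflow is being attempted; it only flags input that is implausibly long for the field.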
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Wed, 13 Jul 2005, Benjamin LaHaise wrote:

> That's one thing I truly dislike about the current timer code. If we
> could program the RTC interrupt to come into the system as an NMI (iirc
> oprofile already has code to do this), we could get much better TSC
> interpolation since we would be sampling the TSC at a much smaller, less
> variable offset, which can only be a good thing.

And we'd get a lot more crashes on broken systems that do not handle NMIs in the SMM -- this is the very reason the NMI watchdog is disabled these days by default. A whole lot of systems simply cannot handle NMIs happening randomly.

Programming an I/O APIC to deliver the RTC interrupt (or any other that's edge-triggered) as an NMI is itself trivial (we can do this for the PIT for the NMI watchdog already).

  Maciej
Re: [PATCH 0/19] Kconfig I18N completion
>> Patch 19/19 contains a .po file.
>
>Yes, the patch 19/19 contains the translation of configuration...
>I see Linus doesn't want the huge language files in kernel source.
>But what is Linus' opinion about this little .po file?

What is little? Given that there are roughly 119 languages (find /usr/share/locale -type d -maxdepth 1 | wc -l), you'd surely reconsider whether adding 119 files of 23KB each still counts as "small".

As I perceive it, the policy is: no PO files in mainline at all. I'm fine with that. Keeping the translations in sync with the mainline Kconfig help texts etc. is also not an easy task unless you've got a lot of time to spare.

Jan Engelhardt
--
Re: [PATCH] Fuse chardevice number
> >> /** The minor number of the fuse character device */
> >> -#define FUSE_MINOR 229
> >> +#define FUSE_MINOR MISC_DYNAMIC_MINOR
> >
> >FUSE has an allocated fixed minor. Dynamic minor is much harder to
> >handle with legacy /dev (not udev).
>
> How many users of 2.6.13 and up really do not have/run udev? [Please don't
> send too many responses]

Don't be afraid, 2.6.13 is not yet released. So the number of users of udev under 2.6.13 is exactly zero ;)

> A module option could be added to specify an explicit minor.

That's just making it more complicated without any gain. An assigned device number (if it exists) is exactly as good as a dynamic one.

Miklos
Re: 2.6.13-rc2-mm2
Chuck Ebbert wrote:
>Looks like Quilt is adding the space during push/pop operations. Only the
> lines it has touched in the series file have the trailing space.

Quilt versions prior to 0.39 would add a trailing space to the series file entry when doing a quilt refresh with the default -p1 patch level.

David Vrabel
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Wed, 13 Jul 2005, Lee Revell wrote:

> Did anyone else find this strange:
>
> "The RTC is used in periodic mode to provide the system profiling
> interrupt on uni-processor systems and the clock interrupt on
> multi-processor systems."
>
> We just take NR_CPUS * HZ timer interrupts per second, what's the
> advantage of using the RTC?

It tends to work in the APIC mode all the time (with all systems), unlike the PIT, which has "interesting" routing problems with its IRQ0, which you've probably already noticed. Have a look at all the hassle in check_timer() if you want to double-check it.

Of course using APIC internal timers is generally the best idea on SMP, but they may have had reasons to avoid them (it's not an ISA interrupt, so it could have been simply out of the question in the initial design).

  Maciej
Re: [PATCH] Fuse chardevice number
Hi,

>> /** The minor number of the fuse character device */
>> -#define FUSE_MINOR 229
>> +#define FUSE_MINOR MISC_DYNAMIC_MINOR
>
>FUSE has an allocated fixed minor. Dynamic minor is much harder to
>handle with legacy /dev (not udev).

How many users of 2.6.13 and up really do not have/run udev? [Please don't send too many responses]

A module option could be added to specify an explicit minor.

Jan Engelhardt
--
Re: 2.6.13-rc2-mm2
On Wed, Jul 13, 2005 at 05:29:32PM -0400, Chuck Ebbert wrote:
> On Wed, 13 Jul 2005 at 00:23:42 -0700, Andrew Morton wrote:
>
> >>...and BTW why does every line in the series file have a trailing space?
> >
> > Not in my copy of
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc2/2.6.13-rc2-mm2/patch-series
> > ?
>
> Looks like Quilt is adding the space during push/pop operations. Only the
> lines it has touched in the series file have the trailing space.

Nope. For me quilt leaves a trailing space if I add patches with -p0 to the series file and then do a "quilt refresh -p1".

Johannes
Re: PROBLEM: Oops when running mkreiserfs on large (9TB) raid0 set on AMD64 SMP
On Thu 14 Jul 2005, Neil Brown wrote:

> > Aug 9 20:09:18 localhost kernel:
> > {:raid0:raid0_make_request+472}
>
> Looks like the problem is at:
>     sector_div(x, (unsigned long)conf->hash_spacing);
>     zone = conf->hash_table[x];
[...]
> Anyway, the following patch, if it compiles, might change the
> behaviour of raid0 -- possibly even improve it :-)
>
> Thanks for the report.
>
> Success/failure reports of this patch would be most welcome.

Thanks for the quick fix. I just tried it again with your patch, and now it works fine!

Filesystem            Size  Used Avail Use% Mounted on
/dev/md11             8.8T   33M  8.8T   1% /mnt

Very nice... :)

Paul Slootman
Re: console remains blanked
>Looks like, since [1] was merged, a blanked console
>(due to inactivity for example) doesn't get unblanked
>anymore when new output is written to it.

The console is unblanked when you hit a key (or probably move a mouse too), not when some application outputs something on stdout/stderr/etc.

>[1]
>http://marc.theaimsgroup.com/?l=linux-kernel&m=111052009232499&w=2

Which kernel versions have this patch? I'm on 2.6.13-rc1 and have no problems with unblanking.

Jan Engelhardt
--
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
"My expectation is if we want to beat the competition, we'll want the ability to go *under* 100Hz." >>> >>> What does Windows do here? >> >> windows xp base rate is 100Hz... but multimedia apps can ask for almost > > 83Hz Well, Windoes 98 (vmmon) shows very different ones: /dev/vmmon[4355]: host clock rate change request 0 -> 19 /dev/vmmon[4355]: host clock rate change request 19 -> 0 /dev/vmmon[4355]: host clock rate change request 0 -> 19 /dev/vmmon[4355]: host clock rate change request 19 -> 63 /dev/vmmon[4355]: host clock rate change request 63 -> 200 /dev/vmmon[4355]: host clock rate change request 200 -> 201 /dev/vmmon[4355]: host clock rate change request 201 -> 1001 >> any rate they want (depends on the hw capabilities). i recall seeing >> rates >1200Hz when you launch some of the media player apps -- sorry i >> forget the exact number. I have seen some apps which seem to schedule themselves using some kind of SCHED_FIFO and therefore seem to get good RT: from an ini file... # This option determines the multi-tasking capabilities of WinDEU. # The priority determines the minimum number of milliseconds WinDEU # will work before giving control back to Windows. # For example, if you set it to 20, it means WinDEU will gives # back control to Windows approximately (at most) 50 times a second. # A value of 0 means WinDEU WON'T multi-task. # (Can be changed in the preferences dialog box.) BuildPriority=25 Jan Engelhardt -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC][PATCH] don't bind to PCI express links [8/9]
This patch prevents the PCI<->PCI bridge driver from binding to PCI express devices. This is needed to coexist with the PCI express root port driver. Eventually we may want to rework and better integrate Linux PCI express link support, but for now this should work.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c 2005-07-14 02:30:09.0 -0400
+++ b/drivers/pci/bus/pci-bridge.c 2005-07-14 02:46:12.0 -0400
@@ -132,6 +132,10 @@
 	if (dev->subordinate)
 		return -ENODEV;
 
+	/* don't bind to pci express links */
+	if (pci_find_capability(dev, PCI_CAP_ID_EXP))
+		return -ENODEV;
+
 	bus = ppb_detect_bus(dev);
 	if (!bus)
 		return -ENODEV;
[RFC][PATCH] master abort on scanning fixes [6/9]
The PCI bridge driver now checks if changing bridge_ctl is necessary. It also restores the original bridge_ctl settings when finished scanning for devices. Finally, a pci_bus setup fix is included.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c 2005-07-12 01:45:46.0 -0400
+++ b/drivers/pci/bus/pci-bridge.c 2005-07-14 02:09:15.0 -0400
@@ -30,7 +30,7 @@
 	bus->bridge = &dev->dev;
 	bus->ops = bus->parent->ops;
 	bus->sysdata = bus->parent->sysdata;
-	bus->bridge = get_device(&dev->dev);
+	bus->self = dev;
 
 	/* Set up default resource pointers and names.. */
 	for (i = 0; i < 4; i++) {
@@ -82,12 +82,7 @@
 	if (!bus)
 		return NULL;
 
-	/* Disable MasterAbortMode during probing to avoid reporting
-	 * of bus errors (in some architectures)
-	 */
 	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &bctl);
-	pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
-			      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
 
 	bus->number = bus->secondary = busnr;
 	bus->primary = buses & 0xFF;
@@ -105,10 +100,22 @@
 {
 	unsigned int devfn;
 
+	/* Disable MasterAbortMode during probing to avoid reporting
+	 * of bus errors (in some architectures)
+	 */
+	if (!(bus->bridge_ctl & PCI_BRIDGE_CTL_MASTER_ABORT))
+		pci_write_config_word(bus->self, PCI_BRIDGE_CONTROL,
+			bus->bridge_ctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
+
 	/* Go find them, Rover! */
 	for (devfn = 0; devfn < 0x100; devfn += 8)
 		pci_scan_slot(bus, devfn);
 
+	/* restore the original bridge_ctl configuration */
+	if (!(bus->bridge_ctl & PCI_BRIDGE_CTL_MASTER_ABORT))
+		pci_write_config_word(bus->self, PCI_BRIDGE_CONTROL,
+			bus->bridge_ctl);
+
 	pcibios_fixup_bus(bus);
 	pci_bus_add_devices(bus);
 }
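The save / clear / restore idea from the description above can be tried outside the kernel with a fake register. This is a userspace sketch, not the driver code: struct fake_bridge, fake_write_ctl(), begin_scan() and end_scan() are hypothetical names, and it assumes the intent is to touch the register only when the master-abort bit was set to begin with. The bit value 0x20 matches PCI_BRIDGE_CTL_MASTER_ABORT in the kernel's PCI register definitions.

```c
#include <stdint.h>

/* Master Abort Mode bit of the PCI bridge control register. */
#define BRIDGE_CTL_MASTER_ABORT 0x20

/* Hypothetical stand-in for a bridge: one 16-bit control register plus
 * a counter so the number of register writes can be observed. */
struct fake_bridge {
    uint16_t ctl;
    unsigned int writes;
};

static void fake_write_ctl(struct fake_bridge *b, uint16_t v)
{
    b->ctl = v;
    b->writes++;
}

/* Clear the master-abort bit before probing and return the saved value.
 * No write is issued if the bit is already clear. */
static uint16_t begin_scan(struct fake_bridge *b)
{
    uint16_t saved = b->ctl;

    if (saved & BRIDGE_CTL_MASTER_ABORT)
        fake_write_ctl(b, saved & (uint16_t)~BRIDGE_CTL_MASTER_ABORT);
    return saved;
}

/* Put the saved value back, again only if it differs from what is there. */
static void end_scan(struct fake_bridge *b, uint16_t saved)
{
    if (saved & BRIDGE_CTL_MASTER_ABORT)
        fake_write_ctl(b, saved);
}
```

Keeping the saved value in the caller, rather than re-reading the register afterwards, is what makes the restore step independent of anything the scan itself may have changed.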