[PATCH] 2.2.18pre21 ide-disk.c for OB800
hi folks, I've been playing with hdparms on the OB800 (running debian 2.2 - that 2.2.18pre21 based) as shown on: http://www.phys.unsw.edu.au/~mcba/hp800ct.html The appended patch attemps to fix the following errors seen when the IDE drive spins up after a sleep mode: hda: multwrite_intr: status=0x51 { DriveReady SeekComplete Error } hda: multwrite_intr: error=0x04 { DriveStatusError } *or* hda: read_intr: status=0x59 { DriveReady SeekComplete DataReady Error } hda: read_intr: error=0x04 { DriveStatusError } After several retries, both result in "ide0: reset: success" and life is good again. The patch avoids the reset. I have two theories on what's wrong: drive forgets multmode was enabled *or* drive doesn't spin up media when poked. The web page above claims the drive forgets the hdparm -S (idle time before sleep) and provides a script to reset that. I've wondered if the same might be true for Multi-sector mode. Anyway, resetting MultMode seems to fix the problem. I've learned the drive will spin down in four different cases: 1) hdparm -S10 (50 seconds) and let the machine idle that long 2) BIOS APM has been idle and decides it time to sleep (I have mine set to 3 minutes right now) 3) "on/off" button on keyboard 4) hdparm -Y /dev/hda (this case still generates similar errors.) This patch does NOT fix another symptom: o "irq timeout: status=0xd0 { Busy }" followed by "ide0: reset: success". enjoy! grant grundler at puffin.external.hp.com parisc-linux I/O hacker --- kernel-source-2.2.18pre21-2.2.18pre21.ORIG/drivers/block/ide-disk.c Wed Jun 7 14:26:42 2000 +++ linux/drivers/block/ide-disk.c Wed Jan 10 15:03:15 2001 @@ -123,6 +123,23 @@ static int lba_capacity_is_ok (struct hd } /* + * set_multmode_intr() is invoked on completion of a WIN_SETMULT cmd. + */ +static ide_startstop_t set_multmode_intr (ide_drive_t *drive) +{ + byte stat = GET_STAT(); + + if (OK_STAT(stat,READY_STAT,BAD_STAT)) { + drive-mult_count = drive-mult_req; + } else { + drive-mult_req = drive-mult_count = 0; + drive-special.b.recalibrate = 1; + (void) ide_dump_status(drive, "set_multmode", stat); + } + return ide_stopped; +} + +/* * read_intr() is the handler for disk read/multread interrupts */ static ide_startstop_t read_intr (ide_drive_t *drive) @@ -145,6 +162,32 @@ static ide_startstop_t read_intr (ide_dr return ide_started; } #endif + else if ((0x59 == stat) drive-mult_count) { + /* + ** HP OB800 laptop HD (IBM-DMCA-21440) will spin down + ** and *forget* multi-mode parms when it spins up on + ** the next access. The following fixes the stat 0x51 + ** w/error 0x4 messages and reset after spinning up. + ** ggg 9.1.2001 + ** + ** If first request which triggers resume is a write, + ** stat == 0x51 (vs. 0x59 for read). + ** + ** 1) Check if HD forgot. + ** 2) If so, set them again, otherwise report error. + ** 3) I/O is still queued - will get retried (I hope). + ** + ** REVISIT: Don't know how to easily check HD settings. + **Not sure it's possible since it doesn't just seem to + **be a simple register read. Not checking HD settings + **risks an infinite loop/wedged system. I would expect + **to get a different error for the SETMULT cmd. + */ + { + ide_cmd(drive, WIN_SETMULT, drive-mult_req, +set_multmode_intr); + return ide_started; + } + } msect = drive-mult_count; read_next: @@ -331,24 +374,17 @@ static ide_startstop_t multwrite_intr (i } return ide_stopped; /* the original code did this here (?) */ } - return ide_error(drive, "multwrite_intr", stat); -} - -/* - * set_multmode_intr() is invoked on completion of a WIN_SETMULT cmd. - */ -static ide_startstop_t set_multmode_intr (ide_drive_t *drive) -{ - byte stat = GET_STAT(); - - if (OK_STAT(stat,READY_STAT,BAD_STAT)) { - drive-mult_count = drive-mult_req; - } else { - drive-mult_req = drive-mult_count = 0; - drive-special.b.recalibrate = 1; - (void) ide_dump_status(drive, "set_multmode", stat); + else if (0x51 == stat) { + /* + ** HP OB800 laptop HD (IBM-DMCA-21440) fix. + ** See comments in read_intr(). + */ + { + ide_cmd(drive, WIN_SETMULT, drive-mult_req, +set_multmode_intr); + return ide_started;
Re: [PATCH] 2.2.18pre21 ide-disk.c for OB800
Andre, Alan, My grand total experience with IDE drivers is now around 4 hours. I have no clue what's right or wrong and am quite clueless what the role of apmd is wrt ide-disk driver. I'm open to testing other fixes for this problem. AFAIK, this could be a BIOS bug since no one else seems to have run into on other laptops and it's reproduce with two different makes of drives. The reason I put the fix in the read/multwrite_intr path is the recovery has to occur before the I/O is retried and before accessing the disk. If multmode can be set before even trying the I/O, then that's definitely a better solution. I just have no clue how to implement it. thanks, grant Grant Grundler Unix Systems Enablement Lab +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: The IO problem on multiple PCI busses
Benjamin Herrenschmidt wrote: Hi Grant ! Alan Cox suggested I contact you about this. I'm trying to figure out a way to cleanly resolve the problem of doing IO accesses on machines with multiple PCI host bridges (and multiple IO bases when IO cycles are not generated by the CPU). I'd be glad if you could catch on the "The IO problem on multiple PCI busses" thread on linux-kernel list and let us share your point of viw. To l-k, Benjamin wrote: | I've looked at the parisc code (thanks Alan for pointing that out), and | it seem they implement all inb/outb as quite big functions that decypher | the address, retreive the bus, and do the proper IO call. Unfortunately, | that's a bit bloated, and I don't think I'll ever get other PPC | maintainers to agree with such a mecanism (everybody seem to be quite | concerned with IO speed, I admit including me). Benjamin, As the main author/maintainer of that code, let me explain why it's so ugly. Hopefully this will give you insight into a "better" (arch independent) solution. Apologies for the length. For IO Port space, I didn't worry about the bloat. A nice side effect of this bloat is it will discourage use of I/O Port space. That's good for everyone, AFAICT. (I know some devices *only* support I/O port space and I personnally don't care about them. If someone who does care about one wants to talk to me about it...fine...I'll help) [ Caveat: I've simplified the following *alot* to keep it short. ] parisc supports two different PCI host bus adapters with each having variants that behave differently. All work under the model we are using with one binary. One kernel binary is important since we want to make install's easy for users. Under Dino (GSCtoPCI), each PCI HBA has it's own 64K I/O port space. I/O port space transactions are generated by poking registers on Dino. Yes - performance sucks - that's why HPUX (almost) exclusively uses devices which support MMIO. Under Elroy (aka LBA or RopesToPCI), we have two methods of accessing I/O port space. One view of I/O space can be shared across all Elroy's which share the same IOMMU (aka SBA). This method distributes the 64K I/O space over the 8 (or 16) "ropes" with rope 0 getting the first 8k (or 4k) and so on. The other view is each LBA has it's own 64K of I/O port space. The second view is mapped above 4GB and requires 64-bit kernel to access. In both cases, processor loads/stores from/to the region will generate an I/O cycle on the respective PCI bus. Generally speaking, parisc doesn't support VGA or ISA legacy crud on it's PCI busses. But I think those are orthogonal issues. The inb/outb support hings on this definition in include/asm-parisc/pci.h: struct pci_port_ops { u8 (*inb) (struct pci_hba_data *hba, u16 port); u16 (*inw) (struct pci_hba_data *hba, u16 port); u32 (*inl) (struct pci_hba_data *hba, u16 port); void (*outb) (struct pci_hba_data *hba, u16 port, u8 data); void (*outw) (struct pci_hba_data *hba, u16 port, u16 data); void (*outl) (struct pci_hba_data *hba, u16 port, u32 data); }; Code which uses this is in arch/parisc/kernel/pci.c at: http://puffin.external.hp.com/cvs/linux/arch/parisc/kernel/pci.c (look for PCI_PORT_HBA usage) In a nut shell, the HBA number is encoded in the upper 16-bits of the 32-bit I/O port space address. The inb() *function* uses the decoded HBA number to lookup the matching pci_port_ops function table and pci_hba_data * to pass in. PCI fixup_bus() code virtualizes the I/O port addresses found by the generic PCI bus walk. inb() is function so drivers work under *all* parisc PCI HBAs with one binary. This scheme allows us to support "PCI-like" busses as well. Some parisc machines have both PCI and EISA slots which are completely independent of each other. We'd like to keep the semantics of inb/outb the same and support both at the same time. It might be possible to do this by feeding the drivers different versions of inb/outb definitions at compile time. But initial attempts to do this ran into problems (which I don't remember the details of). Last comment is regarding who *configures* the PCI devices. On legacy PDC (parisc's "BIOS on steriods"), the PDC sets everything up but does not enable everything (ie pci_enable_device will set bits in PCI_COMMAND cfg register). On card-mode Dino, (GSC cards plugged in proprietary bus), the firmware doesn't know *anything* about the PCI devices and the arch support has to set everything up - PCI MMIO space is not currently supported there. And new servers (like L2000 or A500) with "PAT PDC" only initialize PCI devices for boot. OS has to initialize the rest. grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at h
Re: IO issues vs. multiple busses
Benjamin, Here are my comments directly responding to your mail. Benjamin Herrenschmidt wrote: Hi Grant ! My original mail is: Here's the return of an ld problem for which we really need a solution asap since it's now biting us in real life configurations... So the problem happens when you have a machine with more than one PCI host bridge. This is typically the case of all new Apple machines as they have 3 host bridges in one chip (2 of them are relevant: the AGP and the PCI). I don't think the problem exist on x86 machines with real IO cycles, at least in that case, the problem is different. Large systems have problems with I/O port space and legacy devices. There just isn't enough I/O port space to support large configs and ISA aliasing and all the other crud. That's why Intel is (a) ditching all the legacy crap in IA64 and (b) strongly encouraging people to use MMIO space on PCI. In order to generate IO cycles, the bridge provides us with a region in CPU physical memory space (a 16Mb region in our case) that translates accesses to IO cycles on the PCI bus. Our implementation of inb/outb currently relies on the kernel ioremap'ing one of these regions (the PCI one) and using the ioremap result as a base (offset) inside the inb/outb functions. If you only support one type of bridge, you could avoid the indirect function call (which parisc-linux uses) and encode the access method directly in the inb/outb macros. Just note that processor speed is so much faster (and getter faster) than the ISA bus (and PCI-1X bus), that CPU overhead is mostly irrelevant to the cost of accessing IO port space. On older x86 boxes it is relevant. So that mean that the current design won't allow access to IOs located on any bus but the one we arbitrarily choose (the PCI bus). That's fine in most case, until you decide to put a 3dfx or nvidia card in the AGP slot. Those cards require some IO accesses to be done to the legacy VGA addresses, and of course, our inb/outb functions can't do that. parisc-linux has solved exactly that problem. Obviously, we can hack some driver specific thing that would use the arch-specific code to retreive the proper io base address for a given host bridge, but that's a hack. I'm looking for a solution that would cleanly apply to all archs that may potentially face this problem. I don't believe such a solution exists which is "cleaner" than what parisc-linux does and meets the same objectives. Right now, it's important the install be easy in order to make it easy for people to migrate from HPUX to parisc-linux. :^) The problem potentially exist also for any PCI card that has PCI IOs on anything but the main PCI bus. One possibility is to limit our IO space to 64k per bus (to avoid bloating) and then use a hacked ioremap to create a single virtually contiguous kernel region that appends all those IO spaces together. Accessing IOs on bus N would just be the matter of calculating an address of the type 64k*N+offset and doing normal inb/outb on the result. This might work for other arches. I'm pretty sure it won't for parisc. Again, the issue is the IO port space access method varies by HBA. The arch PCI code could then properly fixup PCI IO resources for PCI drivers, and we could add a function of the kind unsigned long pci_bus_io_offset(int busno); that would return the offset to add to inb/outb when accessing IOs on the N'th PCI bus. Basically, parisc-linux does that but the arch support hides that from the device drivers. If we want to go a bit further, and allow ISA drivers that don't have a pci_dev structure to work on legacy devices on any bus, we could provide a set of function of the type int isa_get_bus_count(); unsigned long isa_get_bus_io_offset(int busno); and eventually int isa_bus_to_pci_bus(int isa_busno); int pci_bus_to_isa_bus(int pci_busno); I don't like this either. Reserving bus 0 for E/ISA solves the problem. I'm, of course open to any comments about this (in fact, I'd really like some feedback). One thing is that we also need to find a way to pass those infos to userland. Currently, we implement an arch-specific syscall that allow to retreive the IO physical base of a given PCI bus. That may be enough, but we may also want something that match more closely what we do in the kernel. I agree with davem on this. But maybe for different reasons. The issue with exporting IO port regions is the access method. Access method varies by platform (for parisc arch) and I don't want to see user apps encoding the access method for specific platforms by default. If someone intentionally wants to build an app for a specific platform, that's different. grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.
PATCH 2.4.0 parisc PCI support
Hi all, This patch contains the support parisc-linux needs in PCI generic. My patch is not as clean as I'd like - but it should work. Please send changes/feedback directly to me. Code in parisc-linux CVS (based on 2.4.0) does boot on my OB800 (133Mhz Pentium), C3000, and A500 with PCI-PCI bridge support working. I'm quite certain PCI-PCI bridge configuration (ie BIOS didn't configure the bridge) support was broken. I'm not able to test on alpha though...alpha may want to see #ifdef __hppa__ around some of the code I've changed. I think the plan is to update the arch/parisc support in the near future so parisc builds actually work from linus' tree. grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 Index: drivers/pci/Makefile === RCS file: /home/cvs/parisc/linux/drivers/pci/Makefile,v retrieving revision 1.1.1.4 retrieving revision 1.6 diff -u -p -r1.1.1.4 -r1.6 --- Makefile2001/01/09 16:57:56 1.1.1.4 +++ Makefile2001/02/02 15:35:25 1.6 @@ -21,6 +21,7 @@ obj-$(CONFIG_PROC_FS) += proc.o # obj-$(CONFIG_ALPHA) += setup-bus.o setup-irq.o obj-$(CONFIG_ARM) += setup-bus.o setup-irq.o +obj-$(CONFIG_PARISC64) += setup-bus.o ifndef CONFIG_X86 obj-y += syscall.o Index: drivers/pci/pci.c === RCS file: /home/cvs/parisc/linux/drivers/pci/pci.c,v retrieving revision 1.1.1.6 diff -u -p -r1.1.1.6 pci.c --- pci.c 2001/01/09 16:57:56 1.1.1.6 +++ pci.c 2001/03/02 18:44:59 @@ -615,6 +615,7 @@ static void pci_read_bases(struct pci_de } } + void __init pci_read_bridge_bases(struct pci_bus *child) { struct pci_dev *dev = child-self; @@ -628,7 +629,7 @@ void __init pci_read_bridge_bases(struct if (!dev) /* It's a host bus, nothing to read */ return; - for(i=0; i3; i++) + for(i=0; i4; i++) child-resource[i] = dev-resource[PCI_BRIDGE_RESOURCES+i]; res = child-resource[0]; @@ -644,12 +645,16 @@ void __init pci_read_bridge_bases(struct res-end = limit + 0xfff; res-name = child-name; } else { + /* -* Ugh. We don't know enough about this bridge. Just assume -* that it's entirely transparent. +* Either this is not a PCI-PCI bridge or it's not +* configured yet. Since this code only supports PCI-PCI +* bridge, we better not be called for any other type. +* Don't muck the resources since it will confuse the +* platform specific code which does that. */ - printk("Unknown bridge resource %d: assuming transparent\n", 0); - child-resource[0] = child-parent-resource[0]; + printk("PCI : ignoring %s PCI-PCI bridge (I/O BASE not configured)\n", +child-self-slot_name); + return; } res = child-resource[1]; @@ -664,8 +669,8 @@ void __init pci_read_bridge_bases(struct res-name = child-name; } else { /* See comment above. Same thing */ - printk("Unknown bridge resource %d: assuming transparent\n", 1); - child-resource[1] = child-parent-resource[1]; + printk("PCI : ignoring %s PCI-PCI bridge (MMIO base not +configured)\n", child-self-slot_name); + return; } res = child-resource[2]; @@ -690,11 +695,10 @@ void __init pci_read_bridge_bases(struct res-end = limit + 0xf; res-name = child-name; } else { - /* See comments above */ - printk("Unknown bridge resource %d: assuming transparent\n", 2); - child-resource[2] = child-parent-resource[2]; + /* Base limit means the prefetchable mem is disabled.*/ } } + static struct pci_bus * __init pci_alloc_bus(void) { Index: drivers/pci/setup-bus.c === RCS file: /home/cvs/parisc/linux/drivers/pci/setup-bus.c,v retrieving revision 1.1.1.2 retrieving revision 1.5 diff -u -p -r1.1.1.2 -r1.5 --- setup-bus.c 2001/01/09 16:57:56 1.1.1.2 +++ setup-bus.c 2001/02/22 01:11:47 1.5 @@ -23,7 +23,7 @@ #include linux/slab.h -#define DEBUG_CONFIG 1 +#define DEBUG_CONFIG 0 #if DEBUG_CONFIG # define DBGC(args) printk args #else @@ -32,6 +32,7 @@ #define ROUND_UP(x, a) (((x) + (a) - 1) ~((a) - 1)) + static int __init pbus_assign_resources_sorted(struct pci_bus *bus, struct pbus_set_ranges_data *ranges) @@ -46,7 +47,6 @@ pbus_assign_resources_sorted(struct pci_ for (ln=bus-devices.next; ln != bus-devices; ln=ln-next) { struct pci_dev *dev = pci_d
Re: The IO problem on multiple PCI busses
"David S. Miller" wrote: There is another case you are ignoring. Some devices support memory space as well as I/O space, but only operate reliably when their I/O space window is used to access it. ok. Those also fall into the category of "I personally don't care" :^) It just sounds to me like the hppa pci controllers are crap, especially the GSC one. In defense of the HW designers, Dino operates extremely well in the environment it was designed for. Principally, workstations with HP graphics cards (which only use MMIO). Optimizations for graphics make it one of the fastest PCI-1X (and Cujo is PCI-2X) HBA's - that's according to a 3rd party graphics card vendor who has ported to the major high-end platforms. At least the rope one does something reasonable when you have a 64-bit kernel. The horrors you've told me about the IOMMUs and stream-caches on these chips further confirms my theory :-) Yup. *sigh*. Between chip bugs, tradeoffs of performance, time to market, and simple programming interface, things got pretty ugly (its the old saying about "Pick any two"). grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH 2.4.0 parisc PCI support
Jeff Garzik wrote: IIRC these "assuming transparent" lines were put in to -fix- PCI-PCI bridges on at least some x86 boxes... I didn't really understand the bridge code well enough at the time to comment one way or the other on its correctness, but it definitely fixed some problems. Jeff, If someone could clarify, I'd be happy to rework/resubmit the patch. My gut feeling is it was to support something other than a PCI-PCI bridge. pci_read_bridge_bases() assumes the device is a PCI-PCI Bridge (layout and interpretation of the window registers). Either the code needs to be more explicit about the type of bridge being handled or the caller (arch specific code) should. Only x86 and parisc PCI support call this code in my 2.4.0 tree. Maybe the right answer is the "assuming transperent" support in pci_read_bridge_bases() move to arch/x86. I'm pretty sure Alpha and parisc/PAT_PDC systems don't use this code since the registers programmed in pci_setup_bridge(). This makes me think none of the other arches attempt to support PCI-PCI bridges. Is that correct? thanks, grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH 2.4.0 parisc PCI support
Jeff Garzik wrote: The patch worked 100% on my laptop, but failed to allocate a PCI memory region on my desktop machine. Two attachments... "diff -u" output for dmesg before and after your patch, and "diff -u" output for lspci before and after your patch. Jeff, Thanks for trying. I'll rework and resubmit later. Can you send me the complete lspci output of your desktop? (either with or without the patch) I'd like to pull whatever docs I can find on the offending bridge. I'll also look at moving "transparent Bridge" support to x86 pci_fixup_bus() code (and see if I can find a machine locally which has this same "feature"). thanks, grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: IO issues vs. multiple busses
Benjamin Herrenschmidt wrote: ... The reason why I'm getting this problem on the public place (again ?) is that we are now faced with people who want to put video cards in both AGP PCI busses, those cards requiring accesses to some legacy VGA IOs on each of their busses. I don't see any way of getting around virtualizing the IO port space. (by that I mean encoding more info in the upper IO port address bits) ... Right. That's my opinion too. But it's difficult to make everybody agree on it ;) Even the simple mechanism Paul Mackerras did so that IOs to non-existent devices don't kill the kernel (very small overhead) caused some barking ;) Well, make it a CONFIG_XXX option for your arch and the people who insist on doing complicated things will have to live with complicated solutions. I don't have a better idea. It's my understanding that you use high bits of the IO address to store the HBA number and then use that to call the proper access functions. Correct. For some reason, I thought alpha (or sparc? hose number?) did this as well. That would solve the PCI IO problem (PCI cards requiring IOs to BAR-mapped regions), but I don't see how it can fix the problem of a card accessing legacy VGA addresses, except if you hand-fixed the video drivers to fill those high bits before doing IOs. That's right. That happens after the bus walk but before drivers see the device in pci_fixup_bus(). If I understand things correctly, that mean that each card, instead of accessing the legacy VGA port 0xpp, would instead access 0x00bb00pp (or whatever mangling you use to stuff the HBA number). right. The question is then to decide is all ISA busses are on a matching PCI bus, in which case a simple unsigned pci_get_bus_io_base(int bus_no) -like functio n would be enough, or if we want a scheme that supports other ISA-like busses ? I don't know enough about your arch to answer that. We could eventually decide to support only PCI, and additionally declare a fake PCI bus for an ISA bus not matched to a PCI bus, whose config ops would return no device in any slot. Do we agree on this ? In theory yes. But davem already wrote: | There is no 'fake' ISA bus number you need. There is a 'real' one, | the one on which the PCI--ISA bridge lives, why not use that one | :-) Well, from the driver point of view, I think it _do_ exist. Basically, the driver will do inb/outb friends. Whatever those function do in reality is arch-dependant. Right. I meant the underlying implementation of inb/outb. But we agree on the fact that in order for those functions to know on which bus to tap, an additional information must be "cooked" inside the IO address passed to them. yes. Additionally, the same problem is true for ISA memory, when it exist obviously. Really? I expected ISA memory to look like reguler uncacheable memory and the drivers would simply dereference the address. But I'm not an expert on how ISA busses work... With those two simple functions, we could at least [ deleted list ] I have to defer to someone like Alan Cox or Davem or someone who has a clue about VGA. I don't. Like sparc, parisc doesn't support VGA. Alan is the only person I know of who's even considered plugging a VGA card into a parisc box. The only thing that's annoying me in the fact that we keep tied to PCI is that in various embedded system, there is one (or more) ISA-like bus hanging around with legacy devices and no PCI, and abstracting a In short, where the HW doesn't route transactions down the right bus adapter, the SW has too. bit the notion of IO bus would have helped. But it seem that now, more embedded systems are also going toward in-CPU IOs anyway, along with eventually a PCI bus for the most expensive ones, so it may finally not be an issue in the long term. It sounds like the HW will do (some of?) the routing. grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH 2.4.0 parisc PCI support
Ivan Kokshaysky wrote: On Fri, Mar 02, 2001 at 11:32:35AM -0800, Grant Grundler wrote: Code in parisc-linux CVS (based on 2.4.0) does boot on my OB800 (133Mhz Pentium), C3000, and A500 with PCI-PCI bridge support working. I'm quite certain PCI-PCI bridge configuration (ie BIOS didn't configure the bridge) support was broken. I believe it isn't. ;-) It works on various alphas including configurations with chained PCI-PCI bridges. Ok. I overlooked the changes in arch/alpha/kernel/pci.c:pcibios_fixup_bus(). (As you know, things changed alot between -test10 and -test12) Some comments on the patch: +** If I/O or MEM ranges are overlapping, that's a BIOS bug. No. As we reallocate everything, it is quite possible that we'll have temporary overlaps during setup with resources allocated by BIOS. I'm not sure if it is harmful though. The other part of the comment I added was: +** Disabling *all* devices is bad. Console, root, etc get +** disabled this way. I can't debug with *all* devices disabled. Normally, the whole point of resource mgt is to (a) avoid overlaps and (b) reflect the state of the HW. I thought the arch specific code was responsible for setting the HW state and resource mgt state congruent. If the arch/alpha code wants everything reallocated anyway, why not have the arch code disable the HW during the bus walk in pcibios_fixup_bus() before calling pci_assign_unassigned_resources()? (I'm looking at 2.4.0 linux/arch/alpha/kernel/pci.c:common_init_pci() ) FYI, under PDC PAT (eg A500), unused devices are left in the "power on" state (which AFAIK, implies disabled). +#ifdef __hppa__ +/* XXX FIXME +** PCI_BRIDGE_CONTROL and PCI_COMMAND programming need to be revisited +** to support FBB. Make all this crud "configurable" by the arch specific +** (ie "PCI BIOS") support and the ifdef __hppa__ crap can go away then. +*/ Agreed. Something like pcibios_set_bridge_control(). possibly...I have another problem I wanted to solve - FBB support. I think generic Fast Back-Back support wants a new field in struct pci_bus (u32 bridge_ctl) to save and manage the FBB state (and SERR/PERR). Arch support would need a way to initialize bridge_ctl *before* pci_do_scan_bus() is called to indicate FBB is or is not supported by the parent PCI bus controller (either HBA or PCI-PCI Bridge). Originally I was thinking of seperating the "root" bus allocation from pci_scan_bus(). But calling pcibios_set_bridge_control() before the bus walk would work too if it passes struct pci_bus * as the parameter. And that could allow arch specific control over SERR/PERR bits as well. In pcibios_fixup_bus(), the arch code could check FBB state to see if it should be enabled on that HBA or not. Ideally, generic code would fully handle FBB for PCI-PCI secondary busses. Perhaps the FBB test could be in pci_setup_bridge() but I'm not sure if that would work for all arches (ie not sure off hand which uses pci_setup_bridge()). [ deleted code changes in drivers/pci/setup-bus.c:pbus_assign_resources() driver/pci/setup-res.c:pdev_sort_resources() ] This change totally breaks PCI allocation logic. Probably you assign PCI-PCI bridge windows in arch specific code - why? I think my change in pdev_sort_resources() permitted it to occur in generic code. parisc HBA code only calls request_resources for resources assigned by firmware to the HBA. The only thing you need is to set up the root bus resources properly and generic code will do the rest. hmmm...Code in alpha's pcibios_fixup_bus() modifies PCI-PCI Bridge resources. It wouldn't if your statement were literally true. I reversed the two changes in my tree to see what breaks on A500: | lba version TR4.0 (0x5) found at 0xfed3c000 | lba_fixup_bus(0x18b4b780) bus 48 sysdata 0x18b4a800 | lba_fixup_bus() LBA I/O Port [3/3]/100 | lba_fixup_bus() LBA LMMIO [fb00/fb7f]/200 | lba_fixup_bus(0x18b4b880) bus 49 sysdata 0x18b4a800 | lba_fixup_bus(0x18b4b980) bus 50 sysdata 0x18b4a800 | PCI: Failed to allocate resource 0 for 31:04.0 | PCI: Failed to allocate resource 0 for 31:04.1 [ I have a 4-port 100BT card and a 2-port 100BT/896 SCSI "combo" card installed in bus 48 - both have PCI-PCI bridges. No resources are available for any devices under either PPB. ] ... | tulip: eth1: I/O region (0xfffd@0x31000) unavailable, aborting ... | sym53c896-6: rev 0x5 on pci bus 49 device 4 function 0 irq 320 | sym53c896-6: ID 7, Fast-40, Parity Checking | sym53c896-6: on-chip RAM at 0xfb10 | CACHE TEST FAILED: reg dstat-sstat2 readback . | CACHE INCORRECTLY CONFIGURED. ... Should I try to follow alpha's pcibios_fixup_bus() and add the code following (linux 2.4.0, arch/alpha/kernel/pci.c line 256) /* This is a bridge. Do not care how it's initialized,
Re: PATCH 2.4.0 parisc PCI support
d = bus-resource[1]-end; + /* Turn off downstream PF memory address range */ + bus-resource[2]-start = 1024*1024; + bus-resource[2]-end = 0; pbus_assign_resources(b, ranges); Yes, sort of. If my patch to pdev_sort_resources() makes sense now, I'm not sure this is needed either. My first reaction was initialization of b-resource pointer would have to happen earlier in order to match arch code in pcibios_fixup_bus(). The idea being generic PCI code *in general* do the same things for primary bus resources as secondary bus resources (ie window registers). I would prefer this but it doesn't matter ATM; just needs to work. Certainly all this should be cleaned up in 2.5; I'm not sure for 2.4. I don't think existing PCI code is very "dirty". Understanding the interactions between generic and arch PCI code is non-trivial. But it's not that bad. Understanding how various arches use the code is the hard part. parisc support is mostly in place in 2.4.0. It would be quite frustrating to not have it fully working in a 2.4.x release because of 10 or 20 lines of code change. FBB support will cause more change than what I've proposed for in the patch parisc support. thanks again, grant ps. Ivan - this has been a good exchange since it's forcing me to revisit code I haven't looked at in a few monthes with a fresh perspective. This (and my previous) reply took me about 4 hours to write. I have to keep looking at code. :^) Ivan. Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH 2.4.0 parisc PCI support
Ivan Kokshaysky wrote: ... Well, it seems that I finally see what is wrong with your code, and why it worked in your case. You assume that "window" resources of the bridge are already known when we call pbus_assign_resources_sorted(). This is incorrect. Oh...do we know the "sizes" of all child resources from the bus walk? I'll check that and see if it can be safely changed. Probably you rely on pci_read_bridge_bases() doing something meaningful (I looked at the parisc pci code in current 2.4.x, don't know about your CVS tree). Nope - don't call that for A500 (machines with PDC PAT)...that might in fact be another problem later related to some PDC (aka BIOS) changes. Yes, at least some of the DEC bridges after power-up/reset have 0s in base/limit registers. This means that you have ranges -0fff (4K) for IO and -000f (1M) for MEM. Obviously it's enough to hold all resources on the cards you've tested, but it won't work in common case. There is a lot of reasons why; just a couple of them: - according to PPB specification, base/limits registers of the bridge after reset are *undefined*, so you'll probably have troubles with non-DEC bridges. - there is a number of alpha systems with a built-in PCI-PCI bridge and real PCI slots behind it. Obviously 4K/1M isn't enough for these systems, and it was the main reason of rewriting that code. etc etc etc. Yup - I think you are right on all counts here. I'll rework the parisc code tonight/tomorrow and see if I can get rid of the contentious generic PCI changes. I should be able to. Basically, you won't know bridge "window" size for a given bus until you'll have allocated *all* devices on *all* its child busses. Linux doesn't. It's possible to deal with window register size in the initial bus walk (where BAR sizes are determined). Besides, including bridge resources in the "sort lists" is meaningless, since these resources have fixed alignment - 4K for IO and 1M for MEM, unlike "regular" ones, which alignment == size. The alignment would have to be handled correctly and I thought pcibios_align_resource() did that. I see now the arch/parisc one doesn't and others probably don't either. Let me think about this more... Unfortunately I haven't anything with a bridge handy at the moment to test that patch. Besides, we'll have here a sort of holidays till Sunday. So maybe next week... np. thanks. I don't think existing PCI code is very "dirty". I hope so. :-) :^) However, some problems need to be worked out: 1. generic vs. arch code - we've already discussed some of these 2. Prefetchable Memory - do we need to deal with it? Though looking at modern x86 systems I tend to keep it disabled :-) Ditto for parisc. 3. pdev_enable_device() - it's a bit ugly, confuses people and possibly is not needed at all. Agreed. thanks, grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] move xchg/cmpxchg to atomic.h
On parisc-linux mailing list, Grant Grundler wrote: After surveying all the arches that define __HAVE_ARCH_CMPXCHG: ./include/asm-alpha/system.h:#define __HAVE_ARCH_CMPXCHG 1 ./include/asm-i386/system.h:#define __HAVE_ARCH_CMPXCHG 1 ./include/asm-ia64/system.h:#define __HAVE_ARCH_CMPXCHG 1 ./include/asm-ppc/system.h:#define __HAVE_ARCH_CMPXCHG 1 ./include/asm-sparc64/system.h:#define __HAVE_ARCH_CMPXCHG 1 I've come to the conclusion xchg/cmpxchg definitions do NOT belong in system.h. AFAICT, all the above use Load Linked semantics (or in the i386 case, operation is atomic). In other words, xchg/cmpxchg are atomic operations. Shouldn't xchg/cmpxchg definitions live with other atomic operations - asm/atomic.h? On Sat, 30 Dec 2000 16:46:57 + (GMT), Alan Cox replied: | Seems a reasonable thing to try and move to atomic.h yes Fundemental problem is parisc only supports one atomic operation (LDCW/LDCD) and uses spinlocks for all atomic operations including xchg/cmpxchg. Issue is dependencies between system.h, atomic.h and spinlock.h are *really* ugly and prevented parisc port from inlining xchg/cmpxchg definitions. This is a first step in fixing that problem. I've already made this change to the parisc-linux source tree for parisc and parisc64 builds. Below is the i386 patch for linux-2.4.0-prerelease. This is a simple cut/paste. thanks, grant diff -ruNp linux/include/asm-i386/atomic.h linux.patch/include/asm-i386/atomic.h --- linux/include/asm-i386/atomic.h Sun Dec 31 11:10:16 2000 +++ linux.patch/include/asm-i386/atomic.h Mon Jan 1 23:28:08 2001 @@ -2,6 +2,7 @@ #define __ARCH_I386_ATOMIC__ #include linux/config.h +#include linux/bitops.h /* for LOCK_PREFIX */ /* * Atomic operations that C can't guarantee us. Useful for @@ -111,4 +112,136 @@ __asm__ __volatile__(LOCK "andl %0,%1" \ __asm__ __volatile__(LOCK "orl %0,%1" \ : : "r" (mask),"m" (*addr) : "memory") + +/* xchg/cmpxchg moved from asm/system.h */ +#define xchg(ptr,v) ((__typeof__(*(ptr)))__xchg((unsigned +long)(v),(ptr),sizeof(*(ptr + +#define tas(ptr) (xchg((ptr),1)) + +struct __xchg_dummy { unsigned long a[100]; }; +#define __xg(x) ((struct __xchg_dummy *)(x)) + + +/* + * The semantics of XCHGCMP8B are a bit strange, this is why + * there is a loop and the loading of %%eax and %%edx has to + * be inside. This inlines well in most cases, the cached + * cost is around ~38 cycles. (in the future we might want + * to do an SIMD/3DNOW!/MMX/FPU 64-bit store here, but that + * might have an implicit FPU-save as a cost, so it's not + * clear which path to go.) + */ +extern inline void __set_64bit (unsigned long long * ptr, + unsigned int low, unsigned int high) +{ +__asm__ __volatile__ ( + "1: movl (%0), %%eax; + movl 4(%0), %%edx; + cmpxchg8b (%0); + jnz 1b" + :: "D"(ptr), + "b"(low), + "c"(high) + : + "ax","dx","memory"); +} + +extern void inline __set_64bit_constant (unsigned long long *ptr, +unsigned long long value) +{ + __set_64bit(ptr,(unsigned int)(value), (unsigned int)((value)32ULL)); +} +#define ll_low(x) *(((unsigned int*)(x))+0) +#define ll_high(x) *(((unsigned int*)(x))+1) + +extern void inline __set_64bit_var (unsigned long long *ptr, +unsigned long long value) +{ + __set_64bit(ptr,ll_low(value), ll_high(value)); +} + +#define set_64bit(ptr,value) \ +(__builtin_constant_p(value) ? \ + __set_64bit_constant(ptr, value) : \ + __set_64bit_var(ptr, value) ) + +#define _set_64bit(ptr,value) \ +(__builtin_constant_p(value) ? \ + __set_64bit(ptr, (unsigned int)(value), (unsigned int)((value)32ULL) ) : \ + __set_64bit(ptr, ll_low(value), ll_high(value)) ) + +/* + * Note: no "lock" prefix even on SMP: xchg always implies lock anyway + * Note 2: xchg has side effect, so that attribute volatile is necessary, + * but generally the primitive is invalid, *ptr is output argument. --ANK + */ +static inline unsigned long __xchg(unsigned long x, volatile void * ptr, int size) +{ + switch (size) { + case 1: + __asm__ __volatile__("xchgb %b0,%1" + :"=q" (x) + :"m" (*__xg(ptr)), "0" (x) + :"memory"); + break; + case 2: + __asm__ __volatile__("xchgw %w0,%1" + :"=r" (x) + :"m" (*__xg(ptr)), "0" (x) + :"memory"); + b
Re: [PATCH] move xchg/cmpxchg to atomic.h
David, Sorry for being dense - but I don't see the problem in using a spinlock to implement xchg(). The example algorithm looks broken. Or am I missing something obvious here? "David S. Miller" wrote: It is very common to do things like: producer(elem) { elem-next = list-head; xchg(list-head, elem); } consumer() { local_list = xchg(list-head, NULL); for_each(elem, local_list) do_something(elem); } producer() looks broken. The problem is two producers can race and one will put the wrong value of list-head in elem-next. I think prepending to list-head needs to either be protected by a spinlock or be a per-cpu data structure. consumer() should be ok assuming the code can tolerate picking up "late arrivals" in the next pass. Or am I missing something obvious here? It's worse if producer were inlined: the arch specific optimisers might re-order the "elem-next = list-head" statement to be quite a bit more than 1 or 2 cycles from the xchg() operation. thanks, grant Grant Grundler Unix Systems Enablement Lab +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
PATCH: remove dead code resource_fixup()
ooops... --- Forwarded Message [EMAIL PROTECTED]: host vger.rutgers.edu[128.6.14.121] said: 550 [EMAIL PROTECTED]... User unknown Date: Fri, 3 Nov 2000 09:54:19 -0800 (PST) From: Grant Grundler [EMAIL PROTECTED] Message-Id: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject: PATCH: remove dead code resource_fixup() Cc: [EMAIL PROTECTED] Hi Linus, The following patch against linux-2.4.0-test10 removes resource_free() code. "David S. Miller" [EMAIL PROTECTED] wrote: | Yeah, it's dead code, feel free to send Linus a patch which kills | it off :-) thanks, grant diff -uNpr linux/arch/ppc/kernel/pci.c linux.patch/arch/ppc/kernel/pci.c - --- linux/arch/ppc/kernel/pci.c Sun Sep 17 09:48:07 2000 +++ linux.patch/arch/ppc/kernel/pci.c Fri Nov 3 09:37:25 2000 @@ -344,11 +344,6 @@ pcibios_fixup_pbus_ranges(struct pci_bus ranges-mem_end -= bus-resource[1]-start; } - -unsigned long resource_fixup(struct pci_dev * dev, struct resource * res, - - unsigned long start, unsigned long size) - -{ - - return start; - -} void __init pcibios_fixup_bus(struct pci_bus *bus) { diff -uNpr linux/arch/sparc/kernel/pcic.c linux.patch/arch/sparc/kernel/pcic.c - --- linux/arch/sparc/kernel/pcic.cTue Oct 3 09:24:41 2000 +++ linux.patch/arch/sparc/kernel/pcic.cFri Nov 3 09:37:13 2000 @@ -866,23 +866,6 @@ void pcibios_update_resource(struct pci_ { } - -#if 0 - -void pcibios_update_irq(struct pci_dev *pdev, int irq) - -{ - -} - - - -unsigned long resource_fixup(struct pci_dev *pdev, struct resource *res, - - unsigned long start, unsigned long size) - -{ - - return start; - -} - - - -void pcibios_fixup_pbus_ranges(struct pci_bus *pbus, - -struct pbus_set_ranges_data *pranges) - -{ - -} - -#endif - - void pcibios_align_resource(void *data, struct resource *res, unsigned long size) { } diff -uNpr linux/arch/sparc64/kernel/pci.c linux.patch/arch/sparc64/kernel/pci.c - --- linux/arch/sparc64/kernel/pci.c Tue Oct 10 10:33:51 2000 +++ linux.patch/arch/sparc64/kernel/pci.c Fri Nov 3 09:37:42 2000 @@ -202,12 +202,6 @@ void pcibios_update_irq(struct pci_dev * { } - -unsigned long resource_fixup(struct pci_dev *pdev, struct resource *res, - - unsigned long start, unsigned long size) - -{ - - return start; - -} - - void pcibios_fixup_pbus_ranges(struct pci_bus *pbus, struct pbus_set_ranges_data *pranges) { --- End of Forwarded Message - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [parisc-linux] Re: OK, let's try cleaning up another nit. Is anyone paying attention?
"Eric S. Raymond" wrote: Here's what I have for you guys: ... CONFIG_DMB_TRAP: arch/parisc/kernel/sba_iommu.c CONFIG_FUNC_SIZE: arch/parisc/kernel/sba_iommu.c Would you please take these out of the CONFIG_ namespace? Changing the prefix to CONFIGURE would do nicely. As willy noted, both mine. I'll remove or rename them rename them so they aren't in the CONFIG_ name space. Probably s/CONFIG_/SBA_/ for those two. I'm going to submit a "wishlist" bug to our debian BTS (bugs.parisc-linux.org) for "Data Memory Break Trap" support. It's a damn good Hammer! :^) (GDB will probably want to use this too) I once had a working "Data Memory Break Trap" handler to catch other parts of the kernel when they corrupted the IO Pdirs. Hooks in sba_ccio.c helped mark which pages would trap and define which code was allowed to touch the page. My implementation had issues and I never bothered to re-implement as suggested by our parisc CPU god, John Marvin. CONFIG_FUNC_SIZE is just a bad choice of name (asking for trouble). One might consider this a bug that hasn't happened yet - thanks Eric! #define CONFIG_FUNC_SIZE 4096 /* SBA configuration function reg set */ CONFIG_KWDB: arch/parisc/Makefile arch/parisc/config.in arch/parisc/defconfig arch/parisc/kernel/entry.S arch/parisc/kernel/traps.c arch/parisc/mm/init. c This ones actually mine too. It could be replaced with the SGI debugger CONFIG option if/when that ever gets supported. The hooks will have to be in the same place. I'm pretty sure now the HP KWBD team will never give me permission to publish KWDB sources (they've had almost a year now). I sorta almost had the damn thing working too...*sigh*. Willy should do whatever he thinks is right in this case. CONFIG_PCI_LBA: arch/parisc/config.in arch/parisc/defconfig arch/parisc/kerne l/Makefile ... Looks like these need Configure.help entries. That's mine too. We've been lazy about documentation since the getting the code working has been a higher priority. I think having them documented will be a prerequisite to merging upstream (either to Alan Cox or Linus). thanks, grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[BUG] 2.4.4 PCI /proc output
Hi, The /proc/bus/pci/devices data looks correct. /proc/bus/pci/0[01]/* entries look correct. The /proc/bus/pci/0[23]/* entries don't match devices data and looks wrong. The host machine is a HP LXR8000 (4x 500Mhz PIII, 2GB RAM, ~8 PCI slots). Eg for 02/6.0 lspci -v says: 02:06.0 Non-VGA unclassified device: Digital Equipment Corporation DECchip 21154 Flags: fast devsel I/O ports at ignored [disabled] Memory at ignored (type 3, non-prefetchable) [disabled] Memory at ignored (type 3, non-prefetchable) [disabled] Memory at ignored (low-1M, non-prefetchable) [disabled] Memory at ignored (32-bit, prefetchable) [disabled] (This is a 64-bit PCI-PCI bridge) od -Ax -x /proc/bus/pci/02/06.0 00 1000 1000 0040 10 020b 4011 1000 000b 0157 0210 20 0007 0100 8008 0080 4001 4004 fe40 30 0004 fe40 40 * 000100 /proc/bus/pci/devices for 02/06.0 says: 0230101100260 Full output for lspci -t, lspci -v, /proc/bus/pci/0?/*, and devices is available at ftp://gsyprf10.external.hp.com/pub/244_pci/. If more info is desired, send me mail. I didn't see anything obviously wrong with proc_bus_pci_read() in drivers/pci/proc.c. My first guess is the *ppos parameter is fubar but I'm not able to test this theory. My excuse is the LXR8000 doesn't reboot reliably and is 1km away (I'm in Germany instead of California). If this isn't already fixed in 2.4.5 (or .6), I'll look at it in July when I get back. grant Grant Grundler parisc PCI|IOMMU|SMP hacker +1 408-447-7253 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] 2.4.4 PCI FBB
Hi Ivan, Jeff, Appended is the 2.4.4 patch for PCI Fast Back-Back (FBB) support. Could you please review/comment on it? Some caveats/notes: o Since I'm on the road (visiting relatives in Germany mostly, currently in Zurich), I'm only able to verify it boots on my Omnibook 800. PA-RISC port is still based on 2.4.0 so I can't test there. My lxr8000 (in Cupertino) doesn't reboot remotely reliably. Consider this a first cut. o I've added logic to pull the secondary PCI bus out of reset properly in case the PCI-PCI bridge not been initialized by BIOS. This was implicitly happening before in pci_setup_bridge(). The OB800 has a *broken* VLSI PCI-PCI bridge onboard with respect to the BUS_RESET bit - see the KLUGE ALERT. o lspci segfaults on the OB800. I don't know why. I don't think it's related to any changes I've made. I'm running debian unstable and haven't been able to update in several weeks. If it persists after an update, then I'll chase it. o Ivan proposed pcibios_set_bridge_ctl(); I used pcibios_init_bus(). The calling location in pci_do_scan_bus() seemed like a per-bus initialization point rather than a narrow/specific task. I'd like to make use of pcibios_init_bus() in the parisc port. I've only modified arch/i386 to provide pcibios_init_bus(). o For each secondary bus, pci_setup_bridge() gets called before pcibios_init_bus(). The former handles generic PCI-PCI bridge and the later deals with arch specific (eg Host-PCI bridge) stuff. However the difference is on the primary bus - only pcibios_init_bus() is called. FWIW, PA-RISC host-PCI bridges support FBB and I would like to add support to enable FBB on the primary busses (yes - plural!). o I intentionally put all FBB support in pci_setup_bridge() (arch common). FBB could also live in the arch specific location (pcibios_init_bus()) but then that gets replicated for each arch. Not sure if that's a problem or not. The trade off is how arch code interacts with common code for FBB support on the primary bus(ses). thanks! grant --- 2.4.4-orig/arch/i386/kernel/pci-i386.c Mon Aug 7 14:31:40 2000 +++ 2.4.4/arch/i386/kernel/pci-i386.c Sun Jun 17 02:30:29 2001 @@ -324,6 +324,11 @@ int pcibios_enable_resources(struct pci_ } if (dev-resource[PCI_ROM_RESOURCE].start) cmd |= PCI_COMMAND_MEMORY; + + /* If bridge/bus controller has FBB enabled, child must too. */ + if (dev-bus-bridge_ctl PCI_BRIDGE_CTL_FAST_BACK) + cmd |= PCI_COMMAND_FAST_BACK; + if (cmd != old_cmd) { printk(PCI: Enabling device %s (%04x - %04x)\n, dev-slot_name, old_cmd, cmd); pci_write_config_word(dev, PCI_COMMAND, cmd); --- 2.4.4-orig/arch/i386/kernel/pci-pc.cThu Apr 19 22:57:06 2001 +++ 2.4.4/arch/i386/kernel/pci-pc.c Sun Jun 17 02:46:26 2001 @@ -11,6 +11,7 @@ #include linux/pci.h #include linux/init.h #include linux/ioport.h +#include linux/delay.h #include asm/segment.h #include asm/io.h @@ -1015,6 +1016,29 @@ struct pci_fixup pcibios_fixups[] = { }; /* + * Called before each bus is probed. Allows us to tweak struct pci_bus *. + */ +void __init pcibios_init_bus(struct pci_bus *b) +{ + struct pci_dev *dev = b-self; + + /* If host PCI bridge supports FBB, could add support here and + ** in pcibios_fixup_bus(). For the moment, hope the BIOS is + ** smart enough to take advantage of FBB. + */ + + /* don't forward all ISA IO addresses */ + if (dev (dev-hdr_type == PCI_HEADER_TYPE_BRIDGE) +((dev-class 8) == PCI_CLASS_BRIDGE_PCI) +!(b-bridge_ctl | PCI_BRIDGE_CTL_NO_ISA) ) + { + b-bridge_ctl |= PCI_BRIDGE_CTL_NO_ISA; + pci_write_config_word(dev, PCI_BRIDGE_CONTROL, b-bridge_ctl); + } +} + + +/* * Called after each bus is probed, but before its children * are examined. */ @@ -1023,6 +1047,11 @@ void __init pcibios_fixup_bus(struct pci { pcibios_fixup_ghosts(b); pci_read_bridge_bases(b); + + /* if any i386 PCI host bus adapters support FBB, test FBB bit + ** in b-bridge_ctl (dis-) enable FBB in the host bus adapter. + ** Also look at comments in pcibios_init_bus(). + */ } /* --- 2.4.4-orig/drivers/pci/pci.cThu Apr 19 08:38:48 2001 +++ 2.4.4/drivers/pci/pci.c Sun Jun 17 03:06:58 2001 @@ -36,6 +36,7 @@ LIST_HEAD(pci_root_buses); LIST_HEAD(pci_devices); +unsigned int pci_post_reset_delay = 50;/* spec says 1sec but this works */ /** * pci_find_slot - locate PCI device from a given PCI slot @@ -978,9 +979,17 @@ static int __init pci_scan_bridge(struct } } else { /* -* We need to assign a number to this bus which we always -* do in the second pass. We also keep all address decoders -* on the bridge disabled during scanning. FIXME: Why? +
2.4.0 pdev_enable_device() call in setup-bus.c
Hi Ivan, Can you explain why pci_assign_unassigned_resources() calls pdev_enable_device() for every PCI device instead of having each PCI *driver* call pci_enable_device() as part of driver initialization? I'm thinking I missed something that a comment in the code should have explained. After having written the bulk of PCI support for parisc port, I was clearly under the impression the PCI driver was supposed to call pci_enable_device(). IMHO, it's a *bad* idea to enable a device when it's driver might not be present. thanks, grant Code from drivers/pci/setup-bus.c: void __init pci_assign_unassigned_resources(void) { ... pci_for_each_dev(dev) { pdev_enable_device(dev); } } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: 2.4.0 pdev_enable_device() call in setup-bus.c
Ivan Kokshaysky wrote: Mainly because there are driverless devices like display adapters, PCI bridges, or PCI devices with legacy drivers (IDE, for example). Ok. It seems that all of those would have to interact with the PCI code someplace. And that's were pci_enable_device() could be called. eg. PCI Bridges could be handled in it's "driver": pci_setup_bridge(). OTOH, pdev_enable_device() most likely will be removed, but it's 2.5 material. Agreed - standardizing the enable path would be good for above devices. I'd like to see arch hooks added for stuff like default latency and PCI_BRIDGE_CONTROL. My gut feeling is the "root" struct pci_bus allocation and initialization should be done by arch specific code *before* pci_scan_bus() is called. That would allow cfg space defaults to be set to arch specific values on a per bus basis *before* doing the bus walks instead of hacking this in pci_bus_fixup() later. thanks, grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] 2.4.2-pre3: parport_pc init_module bug
Philipp Rumpf wrote: Jeff Garzik wrote: Looks ok, but I wonder if we should include this list in the docs. These is stuff defined by the PCI spec, and this list could potentially get longer... (opinions either way wanted...) Having people look things up in the spec isn't very user friendly. Finding a copy of the PCI 2.1 or 2.2 spec I could pass on to others (outside of HP) was a problem last year. The best I could do then (legally) was point them to "PCI Systems Architecture" published by MindShare. I'm not sure whether the plan is to have drivers handle MSIs or do it in the generic PCI code. Grant ? Generic PCI code can d very little by itself with MSI since not all platforms provide support for it - even within the same arch. Support for MSI is very platform specific. Eg for parisc: I expect everything but V-class to support MSI. There are some subtle differences between transaction based interrupts used by parisc CPU and MSI. But I don't recall what they are at the moment - I'd have to look it up again. I thought any x86 with SAPIC (not sure about APIC) could support MSI as well. But I don't know the x86 arch nearly as well. It's also possible for the driver to just ignore MSI and not use it. ie use regular PCI IRQ lines for interrupts. grant Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] 2.4.2-pre3: parport_pc init_module bug
Philipp Rumpf wrote: On Wed, 14 Feb 2001, Grant Grundler wrote: Having people look things up in the spec isn't very user friendly. Having the constants in some well-known header file should be sufficient, shouldn't it ? I would hope anyone bothering to include the constants in a document would spend a few minutes explaining them as well. Perhaps a bad assumption on my part... It depends on the platform and maybe the exact PCI slot used, but I don't think it depends on the driver (unless MSI support is broken in which case you would want to fix it up in the driver). correct. At least I can't find anything in the PCI 2.2 spec that would suggest we need to consult the driver before enabling MSIs with one message only. I don't know how BIOS's treat this (if at all). Need to know this first. If they manage this resource and pre-assign everything, ok. That's how it goes. But if generic PCI manages this, I prefer to avoid allocating resources that may not get used. The host platform may have a limited pool of usable MSI data values (think parisc EIRR bits) and some cards may want to use more than one MSI. grant ps. seems this thread has gotten a bit far off from the original subject. :^/ Grant Grundler parisc-linux {PCI|IOMMU|SMP} hacker +1.408.447.7253 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH/RFC] PCI prepare/activate instead of enable to avoid IRQ storm and rogue DMA access
On Thu, Mar 15, 2007 at 11:37:20AM +0900, Tejun Heo wrote: ... Also, the current implementation doesn't have any arch independent part. I thnk you meant arch dependent here. It's wholly contained in arch independent PCI layer, but it might be beneficial to have arch dependent hooks (IRQ line enable/disable?) in the future. What if the device with the IRQ problem is never loaded? Sometimes devices aren't loaded until after boot. What do you mean by loading a device? Do you mean loading driver for the device? Yes, I think that's what he meant. Any change like this has to be done without changing device drivers. Changing the skge/sky2 drivers as special case is not acceptable. I don't like the idead of changing the driver API for PCI device setup. But if it's necessary to solve this class of problem, I think it's ok. I dunno about that. What I'm proposing is alternative two-step PCI initialization step - the first step enables the device just enough for initialization/reset and the second one enables full access. We're doing part of it already for bus master. I'm proposing to expand that approach and make them handled by generic PCI layer. As you can see, it doesn't add noticeable complexity to drivers. I think it's even clearer than doing pci_set_master() explicitly. Please update Documentation/pci.txt to reflect the API changes too. If this way of solving the problem is chosen, eventually most drivers should be converted to new initialization steps. And there is no way to do this without modifying low level driver. Only low level driver knows when full blown access can be enabled and such thing must happen before registering the device to upper layer (e.g. ATA/SCSI, netif). Agreed. ISTR this has been discussed before but don't recall the exact context. I'll try to find the previous thread. When I started the parisc port on 2.4 kernels, the policy was to leave all interrupts enabled even if no interrupt handler was registered. This is useful for debugging misconfigured IRQ routing. Did the policy already change or is this a proposal to change the policy? thanks, grant sky2/skge aren't exceptions. If this way of solving the problem is chosen, eventually most if not all drivers should be converted to new model. It may take two years, maybe five, but as a start just converting ATA and network drivers shouldn't take too long and that would help a lot of cases. Thanks. -- tejun - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.21-rc5] MSI: read-flush MSI-X table
On Tue, Mar 27, 2007 at 07:23:16AM -0600, Eric W. Biederman wrote: I guess I should add that I'm not certain that the code is exactly correct there are weird differences between enable/disable and mask. My understanding was enable would clear (or ignore) pending interrupts and unmask would deliver pending interrupts. Disable and mask could in many implementations be the same thing as long as the enable/unmask difference was supported. thanks, grnat Where generally the mask/unmask methods do the work and enable/disable do some weird software thing. Having them different and enable/disable not doing some software thing concerns me a little. I think mask/unmask may been overoptimized in this case. So I expect someone will wind up refactor this code at some point. However the code is clearly better than what we have now, and it can't affect anything else. Eric - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.21-rc5] MSI: read-flush MSI-X table
On Mon, Mar 26, 2007 at 04:18:22PM -0700, Mitch Williams wrote: This patch fixes a kernel bug which is triggered when using the irqbalance daemon with MSI-X hardware. Because both MSI-X interrupt messages and MSI-X table writes are posted, it's possible for them to cross while in-flight. This results in interrupts being received long after the kernel thinks they're disabled, and in interrupts being sent to stale vectors after rebalancing. This patch performs a read flush after writes to the MSI-X table for enable/disable and rebalancing operations. Why wouldn't MSI have the same problems as MSI-X? ... diff -urpN -X dontdiff linux-2.6.21-rc4-clean/drivers/pci/msi.c linux-2.6.21-rc4/drivers/pci/msi.c --- linux-2.6.21-rc4-clean/drivers/pci/msi.c 2007-03-19 16:16:32.0 -0700 +++ linux-2.6.21-rc4/drivers/pci/msi.c2007-03-21 12:44:51.0 -0700 @@ -68,6 +68,29 @@ static void msix_set_enable(struct pci_d } } +void msix_flush_writes(unsigned int irq) +{ + struct msi_desc *entry; + + entry = get_irq_msi(irq); + BUG_ON(!entry || !entry-dev); + switch (entry-msi_attrib.type) { + case PCI_CAP_ID_MSI: + /* nothing to do */ + break; + case PCI_CAP_ID_MSIX: + { + int offset = entry-msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE + + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET; + readl(entry-mask_base + offset); + break; + } + default: + BUG(); + break; + } +} PCI_CAP_ID_MSI case seems wrong to me. I was expecting a readl() in that case as well. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [parisc-linux] [KJ][PATCH] ROUND_?UP macro cleanup in drivers/parisc
On Sat, Mar 31, 2007 at 04:27:46PM +0530, Milind Arun Choudhary wrote: Clean up ROUND_?UP, Use ALIGN where ever appropriate. Signed-off-by: Milind Arun Choudhary [EMAIL PROTECTED] Milind, Looks good to me. Signed-off-by: Grant Grundler [EMAIL PROTECTED] Kyle, can you remind me how these patches should be pushed into your tree? Or do you want to directly apply them yourself? I have the impression git.parisc-linux.org tree is stale and I should be using something different. I just don't know what. thanks, grant --- ccio-dma.c |8 iommu-helpers.h |4 ++-- sba_iommu.c |6 ++ 3 files changed, 8 insertions(+), 10 deletions(-) diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c index 894fdb9..1459ca8 100644 --- a/drivers/parisc/ccio-dma.c +++ b/drivers/parisc/ccio-dma.c @@ -32,6 +32,7 @@ */ #include linux/types.h +#include linux/kernel.h #include linux/init.h #include linux/mm.h #include linux/spinlock.h @@ -292,7 +293,6 @@ static int ioc_count; #define PDIR_INDEX(iovp)((iovp)IOVP_SHIFT) #define MKIOVP(pdir_idx)((long)(pdir_idx) IOVP_SHIFT) #define MKIOVA(iovp,offset) (dma_addr_t)((long)iovp | (long)offset) -#define ROUNDUP(x,y) ((x + ((y)-1)) ~((y)-1)) /* ** Don't worry about the 150% average search length on a miss. @@ -668,7 +668,7 @@ ccio_mark_invalid(struct ioc *ioc, dma_addr_t iova, size_t byte_cnt) size_t saved_byte_cnt; /* round up to nearest page size */ - saved_byte_cnt = byte_cnt = ROUNDUP(byte_cnt, IOVP_SIZE); + saved_byte_cnt = byte_cnt = ALIGN(byte_cnt, IOVP_SIZE); while(byte_cnt 0) { /* invalidate one page at a time */ @@ -751,7 +751,7 @@ ccio_map_single(struct device *dev, void *addr, size_t size, offset = ((unsigned long) addr) ~IOVP_MASK; /* round up to nearest IOVP_SIZE */ - size = ROUNDUP(size + offset, IOVP_SIZE); + size = ALIGN(size + offset, IOVP_SIZE); spin_lock_irqsave(ioc-res_lock, flags); #ifdef CCIO_MAP_STATS @@ -814,7 +814,7 @@ ccio_unmap_single(struct device *dev, dma_addr_t iova, size_t size, iova ^= offset;/* clear offset bits */ size += offset; - size = ROUNDUP(size, IOVP_SIZE); + size = ALIGN(size, IOVP_SIZE); spin_lock_irqsave(ioc-res_lock, flags); diff --git a/drivers/parisc/iommu-helpers.h b/drivers/parisc/iommu-helpers.h index 38d9e1a..0a1f99a 100644 --- a/drivers/parisc/iommu-helpers.h +++ b/drivers/parisc/iommu-helpers.h @@ -138,7 +138,7 @@ iommu_coalesce_chunks(struct ioc *ioc, struct scatterlist *startsg, int nents, ** exceed DMA_CHUNK_SIZE if we coalesce the ** next entry. */ - if(unlikely(ROUNDUP(dma_len + dma_offset + startsg-length, + if(unlikely(ALIGN(dma_len + dma_offset + startsg-length, IOVP_SIZE) DMA_CHUNK_SIZE)) break; @@ -158,7 +158,7 @@ iommu_coalesce_chunks(struct ioc *ioc, struct scatterlist *startsg, int nents, ** Allocate space for DMA stream. */ sg_dma_len(contig_sg) = dma_len; - dma_len = ROUNDUP(dma_len + dma_offset, IOVP_SIZE); + dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE); sg_dma_address(contig_sg) = PIDE_FLAG | (iommu_alloc_range(ioc, dma_len) IOVP_SHIFT) diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c index 76a29da..7d83d9f 100644 --- a/drivers/parisc/sba_iommu.c +++ b/drivers/parisc/sba_iommu.c @@ -113,8 +113,6 @@ module_param(sba_reserve_agpgart, int, 1); MODULE_PARM_DESC(sba_reserve_agpgart, Reserve half of IO pdir as AGPGART); #endif -#define ROUNDUP(x,y) ((x + ((y)-1)) ~((y)-1)) - / ** SBA register read and write support @@ -352,7 +350,7 @@ sba_search_bitmap(struct ioc *ioc, unsigned long bits_wanted) ** SBA HW features in the unmap path. */ unsigned long o = 1 get_order(bits_wanted PAGE_SHIFT); - uint bitshiftcnt = ROUNDUP(ioc-res_bitshift, o); + uint bitshiftcnt = ALIGN(ioc-res_bitshift, o); unsigned long mask; if (bitshiftcnt = BITS_PER_LONG) { @@ -779,7 +777,7 @@ sba_unmap_single(struct device *dev, dma_addr_t iova, size_t size, offset = iova ~IOVP_MASK; iova ^= offset;/* clear offset bits */ size += offset; - size = ROUNDUP(size, IOVP_SIZE); + size = ALIGN(size, IOVP_SIZE); spin_lock_irqsave(ioc-res_lock, flags); -- Milind Arun Choudhary ___ parisc-linux mailing list [EMAIL PROTECTED] http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
Re: [KJ][PATCH] ROUND_UP macro cleanup in arch/parisc
On Sun, Apr 01, 2007 at 01:06:46PM +0530, Milind Arun Choudhary wrote: ROUND_UP macro cleanup, use ALIGN where ever appropriate Signed-off-by: Milind Arun Choudhary [EMAIL PROTECTED] Also looks good to me. Just one white space nit we can fixup by hand. Signed-off-by: Grant Grundler [EMAIL PROTECTED] --- hpux/fs.c |5 ++--- kernel/sys_parisc32.c |3 +-- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/arch/parisc/hpux/fs.c b/arch/parisc/hpux/fs.c index 4204cd1..7ff5546 100644 --- a/arch/parisc/hpux/fs.c +++ b/arch/parisc/hpux/fs.c @@ -21,6 +21,7 @@ *Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ +#include linux/kernel.h #include linux/mm.h #include linux/sched.h #include linux/file.h @@ -70,7 +71,6 @@ struct getdents_callback { }; #define NAME_OFFSET(de) ((int) ((de)-d_name - (char *) (de))) -#define ROUND_UP(x) (((x)+sizeof(long)-1) ~(sizeof(long)-1)) static int filldir(void * __buf, const char * name, int namlen, loff_t offset, u64 ino, unsigned d_type) @@ -78,7 +78,7 @@ static int filldir(void * __buf, const char * name, int namlen, loff_t offset, struct hpux_dirent * dirent; struct getdents_callback * buf = (struct getdents_callback *) __buf; ino_t d_ino; - int reclen = ROUND_UP(NAME_OFFSET(dirent) + namlen + 1); + int reclen = ALIGN(NAME_OFFSET(dirent) + namlen + 1,sizeof(long)); Nit: I'd want a blank space after the comma. thanks, grant buf-error = -EINVAL; /* only used if we fail.. */ if (reclen buf-count) @@ -103,7 +103,6 @@ static int filldir(void * __buf, const char * name, int namlen, loff_t offset, } #undef NAME_OFFSET -#undef ROUND_UP int hpux_getdents(unsigned int fd, struct hpux_dirent *dirent, unsigned int count) { diff --git a/arch/parisc/kernel/sys_parisc32.c b/arch/parisc/kernel/sys_parisc32.c index ce3245f..e590880 100644 --- a/arch/parisc/kernel/sys_parisc32.c +++ b/arch/parisc/kernel/sys_parisc32.c @@ -311,14 +311,13 @@ struct readdir32_callback { int count; }; -#define ROUND_UP(x,a)((__typeof__(x))(((unsigned long)(x) + ((a) - 1)) ~((a) - 1))) #define NAME_OFFSET(de) ((int) ((de)-d_name - (char __user *) (de))) static int filldir32 (void *__buf, const char *name, int namlen, loff_t offset, u64 ino, unsigned int d_type) { struct linux32_dirent __user * dirent; struct getdents32_callback * buf = (struct getdents32_callback *) __buf; - int reclen = ROUND_UP(NAME_OFFSET(dirent) + namlen + 1, 4); + int reclen = ALIGN(NAME_OFFSET(dirent) + namlen + 1, 4); u32 d_ino; buf-error = -EINVAL; /* only used if we fail.. */ -- Milind Arun Choudhary - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC/PARTIAL PATCH 0/3] dma: passing attributes to dma_map_* routines
On Wed, Jan 09, 2008 at 03:30:58PM -0600, James Bottomley wrote: ... Also, all of this is quite separate from the PCIe loose ordering stuff. In fact, it's quite conceivable that SGI could hook up a PCIe 3.0 host bridge to the Altix NUMA interconnect, so that you could have a PCI bus that supported loose ordering and also a system interconnect that allowed a different type of reordering too. Actually, no ... there'd just be an even weirder attribute, I suspect. The attributes will be set per *transaction* not per bus. A transaction is an operation (DMA, PIO, config space, etc) from the actual bus to the CPU. It doesn't matter how many physical bus segments this traverses and whether they're different bus types; all that matters for the attributes are the characteristics that are CPU visible. James is right. Setting the PCI-e Relaxed Ordering bit in config space allows the device to set the Relaxed Order in specific DMA transactions. Upstream bridges may choose to ignore the bit or follow well defined transaction flushing/ordering rules if they do implement Relaxed Ordering. Key thing here is the device decides when to set RO bit in each transaction. This is completely different than the re-ordering which occurs in Altix NUMA bus for _all_ (default config) transactions. Given SGI/Altix only cares about a limited number of drivers and only a subset of those support RDMA (or something like it), would it be reasonable to add the new API to the RDMA code instead of the generic DMA API? Please don't shoot me for suggesting that...just thinking out loud here. thanks, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PCI Failed to allocate mem for PCI ROM
On Fri, Jan 11, 2008 at 02:27:16PM -0600, Kumar Gala wrote: I'm getting the following message from the kernel on an embedded ppc32 system: PCI: Failed to allocate mem resource #9:[EMAIL PROTECTED] for :00:00.0 The HW setup is a PCIe host controller and an e1000 NIC card. ... I'm happy to debug, is the fact that the resno == 9 ok or does that seem wrong? That is fine for the Bridge. See include/linux/pci.h : #define PCI_ROM_RESOURCE6 #define PCI_BRIDGE_RESOURCES7 #define PCI_NUM_RESOURCES 11 IIRC, Bridges may have two 32-bit or one 64-bit BAR, Expansion ROM BAR and three range registers: IO Port, MMIO (Prefetchable and non-prefetchable). The BRIDGE_RESOURCES (7-10) are what failed to be assigned for some reason. Looking at setup-bus.c:pci_bridge_check_ranges(), I'm concluding that: [7] is IO Range. [8] is MMIO [9] is Prefetchable MMIO [10] no clue...maybe used by host PCI bus controllers. 0x10 is 1MB and would be the minimum MMIO range that can be allocated. So that looks right too. Probably need to find out what is allocating 0xe000 instead. hth, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Align PCI memory regions to page size (4K) - Fix
On Thu, Nov 08, 2007 at 05:24:00PM -0600, Linas Vepstas wrote: ... E.g. 4 port Gige card could directly support the host and 3 guests with somewhat lower risk of tromping on each other's MMIO space. If Xen is cooperative, this seems a bit paranoid. I don't recall ever seeing a driver bug where the driver accidentally poked MMIO space at the wrong device. I presume the issue is not a driver bug per-se, but a spying/hacking-type security issue: Having root in one guest could in principle allow one to write a driver that snooped on data in other guests, and/or intentionally corrupted data on other guests. If someone has root on a guest, they could modprobe a driver that can map any unused virtual address to any physical address they want. Unless the chipset somehow blocks/refuses to route IO for that guest, then they can still poke at any other device once they figure out where addresses are being routed (e.g. directly reading configuration space or directly accessing chipset specific registers.) I envision some ISP renting out 1/3 of a machine with a 4-port card, and having some nosey college-kid wannabe hacker getting root on one of the guests and causing trouble. But perhaps I'm wy off base here. I agree this will make it slightly harder. Also makes it much more likely the box will crash - taking down all the guests. And someone should notice that. (Just like occasional cigarette smoking is known to inevitably lead to full-fledged heroin addiction, I am pretty sure that the culture of cheat codes among 12-year-olds is going to lead to an epidemic of hackers in about 10 years. I am atuned to wannabe hacker culture). Ok - but I think there are more serious issues if someone can get root on a remote box (ignore Virtualization). Several other possible layers of security have already been defeated by then. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Align PCI memory regions to page size (4K) - Fix
On Wed, Nov 14, 2007 at 08:17:33AM +1100, Benjamin Herrenschmidt wrote: ... Though he's trying to fix a real issue, his patch is not the right approach imho. A better approach would be to have a mechanism to be triggered by the hypervisor administration tools that will attempt to reassign -that- specific device if it happens to share pages with another. The remapping would thus only happen for that single device, after it's been put in control of the hypervisor (no host driver is bound, maybe just an HV specific bridging driver is), and only when the action of assigning it to the partition is performed, so that if the machine crashes as a result, at least you know why :-) So something like your hypervisor binds a special driver to a device that is to be reflected to a partition, at which point we are sure no other driver is using it, then that driver can call something in the pci layer that attempts to re-assign the device resources to keep it in a separate page. We already have that something: pci_enable_device(). The guest OS Arch code can then do the reassignment. See 3.1 Enable the PCI device in Documentation/pci.txt. A patch implementing such a helper, and maybe reserving the rest of the MMIO page via some dummy resource to make sure nobody else gets in, that would make more sense. The Hypervisor could be responsible for making the right devices visible to the appropriate partitions/guests by intercepting the PCI bus walk and/or hotplug support. I don't think you should need any dummy resource/drivers in the guest OS. hth, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Align PCI memory regions to page size (4K) - Fix
On Wed, Nov 14, 2007 at 07:16:18PM +1100, Benjamin Herrenschmidt wrote: We already have that something: pci_enable_device(). The guest OS Arch code can then do the reassignment. See 3.1 Enable the PCI device in Documentation/pci.txt. No, that can't be done there because that would mean the guest OS has the ability to reassign resources of PCI devices which is really not something you want. You really want the host OS to do that -before- it makes the device visible to the guest imho. Agreed. I didn't explain well enough. The assignment doesn't need to happen until that point. My assumption was the host OS (or Hypervisor) would already know what it's supposed to be before then and just set it up at that point. The Hypervisor could be responsible for making the right devices visible to the appropriate partitions/guests by intercepting the PCI bus walk and/or hotplug support. I don't think you should need any dummy resource/drivers in the guest OS. I'm not talking about a dummy resource/driver in the guest OS, but rather, in the host. When a PCI device, that is visible to the host, is to be reflected into a guest, it makes sense to have a proxy driver take over in the host from a resource management point of view, thus making this device effectively used and thus preventing another process or partition from trying to bind to it. That proxy driver can then be responsible for doing the appropriate resource tweaking before making the device effectively visible to the guest. It might make sense to provide a helper in the PCI layer to make that easier tho. Ah ok.. I was assuming there was only a Hypervisor and all the guests were equal. If one OS instance is Host and can see the device before hand, then yeah, it makes sense to hide the device from the normal device drivers. grant Host OS Ben. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] include/asm-parisc/: Spelling fixes
Andrew, Please include in -mm. Cosmetic - but I appreciate correct spelling too. On Mon, Dec 17, 2007 at 11:30:11AM -0800, Joe Perches wrote: Signed-off-by: Joe Perches [EMAIL PROTECTED] Signed-off-by: Grant Grundler [EMAIL PROTECTED] thanks, grant --- include/asm-parisc/elf.h |2 +- include/asm-parisc/linkage.h |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/asm-parisc/elf.h b/include/asm-parisc/elf.h index f628ac7..8e7946a 100644 --- a/include/asm-parisc/elf.h +++ b/include/asm-parisc/elf.h @@ -28,7 +28,7 @@ #define EFA_PARISC_1_1 0x0210 /* PA-RISC 1.1 big-endian. */ #define EFA_PARISC_2_0 0x0214 /* PA-RISC 2.0 big-endian. */ -/* Additional section indeces. */ +/* Additional section indices. */ #define SHN_PARISC_ANSI_COMMON 0xff00 /* Section for tenatively declared symbols in ANSI C. */ diff --git a/include/asm-parisc/linkage.h b/include/asm-parisc/linkage.h index ad8cd0d..0b19a72 100644 --- a/include/asm-parisc/linkage.h +++ b/include/asm-parisc/linkage.h @@ -8,7 +8,7 @@ /* * In parisc assembly a semicolon marks a comment while a - * exclamation mark is used to seperate independent lines. + * exclamation mark is used to separate independent lines. */ #ifdef __ASSEMBLY__ -- 1.5.3.7.949.g2221a6 - To unsubscribe from this list: send the line unsubscribe linux-parisc in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] drivers/parisc/: Spelling fixes
Andrew, ditto - thanks On Mon, Dec 17, 2007 at 11:40:10AM -0800, Joe Perches wrote: Signed-off-by: Joe Perches [EMAIL PROTECTED] Signed-off-by: Grant Grundler [EMAIL PROTECTED] thanks again, grant --- drivers/parisc/ccio-dma.c |4 ++-- drivers/parisc/hppb.c |2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c index 7c60cbd..ca52307 100644 --- a/drivers/parisc/ccio-dma.c +++ b/drivers/parisc/ccio-dma.c @@ -363,7 +363,7 @@ ccio_alloc_range(struct ioc *ioc, size_t size) if (pages_needed = 8) { /* * LAN traffic will not thrash the TLB IFF the same NIC - * uses 8 adjacent pages to map seperate payload data. + * uses 8 adjacent pages to map separate payload data. * ie the same byte in the resource bit map. */ #if 0 @@ -1589,7 +1589,7 @@ static int __init ccio_probe(struct parisc_device *dev) } /** - * ccio_init - ccio initalization procedure. + * ccio_init - ccio initialization procedure. * * Register this driver. */ diff --git a/drivers/parisc/hppb.c b/drivers/parisc/hppb.c index a728a7c..65eee67 100644 --- a/drivers/parisc/hppb.c +++ b/drivers/parisc/hppb.c @@ -95,7 +95,7 @@ static struct parisc_driver hppb_driver = { }; /** - * hppb_init - HP-PB bus initalization procedure. + * hppb_init - HP-PB bus initialization procedure. * * Register this driver. */ -- 1.5.3.7.949.g2221a6 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC/PATCH 3/4] pci: Remove pci_enable_device_bars()
On Tue, Dec 18, 2007 at 10:01:14AM +1100, Benjamin Herrenschmidt wrote: Now that all in-tree users are gone, this removes pci_enable_device_bars() completely. Almost completely. Patch below removes pci_enable_device_bars() from Documentation/pci.txt . (And I'd be happy to review patches to pci.txt that add pci_enable_device_io() and _mmio() as well. ) Signed-off-by: Grant Grundler [EMAIL PROTECTED] diff --git a/Documentation/pci.txt b/Documentation/pci.txt index 7754f5a..72b20c6 100644 --- a/Documentation/pci.txt +++ b/Documentation/pci.txt @@ -274,8 +274,6 @@ the PCI device by calling pci_enable_device(). This will: o allocate an IRQ (if BIOS did not). NOTE: pci_enable_device() can fail! Check the return value. -NOTE2: Also see pci_enable_device_bars() below. Drivers can -attempt to enable only a subset of BARs they need. [ OS BUG: we don't check resource allocations before enabling those resources. The sequence would make more sense if we called @@ -605,40 +603,7 @@ device lists. This is still possible but discouraged. -10. pci_enable_device_bars() and Legacy I/O Port space -~~ - -Large servers may not be able to provide I/O port resources to all PCI -devices. I/O Port space is only 64KB on Intel Architecture[1] and is -likely also fragmented since the I/O base register of PCI-to-PCI -bridge will usually be aligned to a 4KB boundary[2]. On such systems, -pci_enable_device() and pci_request_region() will fail when -attempting to enable I/O Port regions that don't have I/O Port -resources assigned. - -Fortunately, many PCI devices which request I/O Port resources also -provide access to the same registers via MMIO BARs. These devices can -be handled without using I/O port space and the drivers typically -offer a CONFIG_ option to only use MMIO regions -(e.g. CONFIG_TULIP_MMIO). PCI devices typically provide I/O port -interface for legacy OSes and will work when I/O port resources are not -assigned. The PCI Local Bus Specification Revision 3.0 discusses -this on p.44, IMPLEMENTATION NOTE. - -If your PCI device driver doesn't need I/O port resources assigned to -I/O Port BARs, you should use pci_enable_device_bars() instead of -pci_enable_device() in order not to enable I/O port regions for the -corresponding devices. In addition, you should use -pci_request_selected_regions() and pci_release_selected_regions() -instead of pci_request_regions()/pci_release_regions() in order not to -request/release I/O port regions for the corresponding devices. - -[1] Some systems support 64KB I/O port space per PCI segment. -[2] Some PCI-to-PCI bridges support optional 1KB aligned I/O base. - - - -11. MMIO Space and Write Posting +10. MMIO Space and Write Posting ~~ Converting a driver from using I/O Port space to using MMIO space -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC/PATCH 2/4] pci: Remove users of pci_enable_device_bars()
Just a style nit... On Tue, Dec 18, 2007 at 10:01:14AM +1100, Benjamin Herrenschmidt wrote: This patch converts users of pci_enable_device_bars() to the new pci_enable_device_{io,mem} interface. The new API fits nicely, except maybe for the QLA case where a bit of code re-organization might be a good idea but I prefer sticking to the simple patch as I don't have hardware to test on. I'll also need some feedback on the cs5520 change. Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED] /* We must not grab the entire device, it has 'ISA' space in its -BARS too and we will freak out other bits of the kernel */ - if (pci_enable_device_bars(dev, 12)) { + * BARS too and we will freak out other bits of the kernel + * + * pci_enable_device_bars() is going away. I replaced it with + * IO only enable for now but I'll need confirmation this is + * allright for that device. If not, it will need some kind of + * quirk. --BenH. + */ This comment doesn't really belong in the code. It should be part of the commit log or email RFC. cheers, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC/PATCH 4/4] [POWERPC] pci: Disable IO/Mem on a device when resources can't be allocated
On Tue, Dec 18, 2007 at 10:01:15AM +1100, Benjamin Herrenschmidt wrote: This patch changes the PowerPC PCI code to disable IO and/or Memory decoding on a PCI device when a resource of that type failed to be allocated. This is done to avoid having unallocated dangling BARs enabled that might try to decode on top of other devices. If a proper resource is assigned later on, then pci_enable_device{,_io,_mem} will take care of re-enabling decoding. Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED] @@ -1062,8 +1065,12 @@ static void __init pcibios_allocate_reso disabled = !(command PCI_COMMAND_IO); else disabled = !(command PCI_COMMAND_MEMORY); - if (pass == disabled) - alloc_resource(dev, idx); + if (pass == disabled alloc_resource(dev, idx)) { + command = ~(r-flags (IORESOURCE_IO | + IORESOURCE_MEM)); While this may be ok for PPC, in general, wouldn't we want to only disable which ever type of resource that couldn't be allocated? ie make two calls: alloc_resource_io() and alloc_resource_mem() and disable the respective flag if the alloc call fails? Thus a device which was enable and programmed by BIOS could remain functional despite one resource not being allocated. (e.g. MMIO was allocated successfully while IO Port space was not) Again, this is just a hypothetical question since I have no example (yet) off hand where this is true. ISTR, the original discussion around pci_enable_device_bars() suggested some machines already have this issue. cheers, grant + pci_write_config_word(dev, + PCI_COMMAND, command); + } } if (pass) continue; -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
On Sun, Dec 23, 2007 at 03:16:24PM -0500, Loic Prylli wrote: ... I just realized one thing: the bar sizing code in pci_read_bases() (that writes 0x in the bars) does not seem to disable the PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before manipulating the BARs. And it seems nobody else ensures they are disabled at this point either (or am I missing something?). You are missing some history... I posted such a patch in 2002: http://lkml.org/lkml/2002/12/19/145 Touching the bars while they are enabled would be buggy behaviour from our part, and something trivial to fix. And it might well fix that particular problem (it's fair play from the machine to crash if we create a decoding conflict, simply disabling the cmd bits in pci_read_bases() should remove that conflict). ISTR willy or Ivan recently posted a patch that was suggested in 2002 as well (don't disable MMIO on bridge devices when sizing BARs)...so the main objections might be resolved to this obvious fix. *sigh* FWIW, to partially answer your last question, Windows does disable mem-space and/or IO-space when sizing the bars of a device (I have some traces of configuration-space-access taken on a window machine for one of the PCI busses). Thanks for posting the traces...it's past midnight here and I'll try to look at those tomorrow. (Sorry - sounds like a lame excuse but I'm likely to read the trace incorrectly at the moment.) cheers, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
On Thu, Dec 20, 2007 at 02:40:06PM -0800, Greg KH wrote: Sure, I realize this, but it solves the problem in one way for broken hardware, such that it at least allows it to work, right? It also provides a better incentive for the manufacturer to fix their bios, which as you are on-site at HP, it would seem odd that they would just not do that instead of trying to work around this in the kernel... Greg, you know why: Cost. Asking the BIOS provider to roll a new version costs. Rolling the BIOS version might also require re-running the OS Certification. Besides the cost of an OS re-cert, BIOS changes are unlikely to happen if it will delay the introduction and/or sale of new products. And it's not considered to be a BIOS problem unless win XP (or maybe Vista these days) fails the same way (which BIOS supplier would have fixed...). hth, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC/PATCH 4/4] [POWERPC] pci: Disable IO/Mem on a device when resources can't be allocated
On Wed, Dec 26, 2007 at 08:26:27AM +1100, Benjamin Herrenschmidt wrote: This is exactly what's supposed to be happening, but the code is buggy and nobody noticed :-) (I'm mixing up IORESOURCE_* flags and PCI_COMMAND_* flags). Thanks for reviewing ! Note that this patch isn't in the series Greg queued up anyway. The powerpc specific bits will be going in via Paulus an I already asked him to drop that specific one until I send a fixed version. Ah okbut I was mostly worried about the use of PPC bits as an example. Anyway, in general it looks like the right direction to me too. cheers, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: How to debug complete kernel lock-ups
On Wed, Oct 24, 2007 at 11:17:40AM +0200, John Sigler wrote: ... I've tested with a vanilla 2.6.22.10 kernel (no PREEMPT_RT patch). That system also locks up and remains completely unresponsive (I can't open new ssh sessions, the system won't answer ICMP echo requests). How do driver writers deal with complete kernel hangs? Use different HW. Both IA64 and PARISC gives useful diagnostics when the machine has a hard crash (MCA or HPMC respectively). I'll bet PPC does too on the POWER machines. Maybe a newer x86 machine can provide some MCE data as well? Otherwise it's what gregkh said...not the we slowly go crazy part. :) Well, sometimes. :) BTW, getting PCI bus traces would be quite helpful in this case. It'll give you clear data as to whether the devices are being programmed as expected (also to rule out chipset/Host bus controller issues) and whether they are responding as expected (maybe something else dies when they do). hth, grant hth, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Align PCI memory regions to page size (4K) - Fix
On Sun, Oct 28, 2007 at 01:03:36PM -0700, Greg KH wrote: On Sun, Oct 28, 2007 at 03:53:20PM -0400, Barak Fargoun wrote: ... About your question: today, some of the hypervisors are using linux kernel as their domain-0 (e.g. Xen). In order to implement direct hardware access for these native domains (e.g. running windows in a virtual machine above Xen), the PCI memory regions should be aligned at-least at the page-level (so, a virtual machine - can't see data of other devices which may not be assigned to it). So, for that reason, we wanted a boot parameter to let us force the kernel to align PCI memory regions at-least at a PAGE_SIZE alignment. It is very useful for hypervisors which are developed at Linux environment (e.g.: Xen). ... And if not, why would we not do this for all devices not just for virtual machines, if it is such a benefit? It's a benefit IFF multiple devices are spread across more than one guest _and_ we don't trust every particating guest to play nicely with IO. That way the Hypervisor can assign one device to a specific guest OS for direct access. E.g. 4 port Gige card could directly support the host and 3 guests with somewhat lower risk of tromping on each other's MMIO space. If Xen is cooperative, this seems a bit paranoid. I don't recall ever seeing a driver bug where the driver accidentally poked MMIO space at the wrong device. That's much more common for IO Port space. The only exception was xfree86 where it poked around in random places to deal with buggy HW quirks. The use of an IOMMU will provide much more useful protection. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
On Sun, Feb 10, 2008 at 07:51:22AM -0700, Matthew Wilcox wrote: From: Matthew Wilcox [EMAIL PROTECTED] Date: Sun, 10 Feb 2008 09:45:28 -0500 Subject: [PATCH] Change pci_raw_ops to pci_raw_read/write ... -static int -pci_read (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value) +static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, + int size, u32 *value) { - return raw_pci_ops-read(pci_domain_nr(bus), bus-number, + return raw_pci_read(pci_domain_nr(bus), bus-number, devfn, where, size, value); Willy, Just wondering...why don't we just pass struct bus* through to the raw_pci* ops? My thinking is if a PCI bus controller or bridge is discovered, then we should always create a matching struct bus *. Your patch looks fine to me but if you (and others) agree with the above, I can make patch to change the internal interface. The pci_*_config API needs to remain the same. ... --- a/arch/x86/kernel/quirks.c +++ b/arch/x86/kernel/quirks.c @@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev) pci_write_config_byte(dev, 0xf4, config|0x2); /* read xTPR register */ - raw_pci_ops-read(0, 0, 0x40, 0x4c, 2, word); + raw_pci_read(0, 0, 0x40, 0x4c, 2, word); Why are we using raw_pci_read here instead of pci_read_config_dword()? If the pci_write_config_byte() above works, then I expect the read to work too. To be clear, this is not a problem with this patch...rather a seperate problem with the original code. hth, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: raw_pci_read in quirk_intel_irqbalance
On Sun, Feb 10, 2008 at 10:04:16PM -0700, Matthew Wilcox wrote: A disabled or non-existent device's configuration register space is hidden. A disabled or non-existent device will return all ones for reads and will drop writes just as if the cycle terminated with a Master Abort on PCI. I'd like to thank Grant for pointing out to me that this is exactly what the write immediately above this is doing -- enabling device 8 to respond to config space cycles. welcome. ... From f565b65591a3f90a272b1d511e4ab1728861fe77 Mon Sep 17 00:00:00 2001 From: Matthew Wilcox [EMAIL PROTECTED] Date: Sun, 10 Feb 2008 23:18:15 -0500 Subject: [PATCH] Use proper abstractions in quirk_intel_irqbalance Since we may not have a pci_dev for the device we need to access, we can't use pci_read_config_word. But raw_pci_read is an internal implementation detail; it's better to use the architected pci_bus_read_config_word interface. Using PCI_DEVFN instead of a mysterious constant helps reassure everyone that we really do intend to access device 8. Signed-off-by: Matthew Wilcox [EMAIL PROTECTED] --- arch/x86/kernel/quirks.c |9 ++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c index 1941482..c47208f 100644 --- a/arch/x86/kernel/quirks.c +++ b/arch/x86/kernel/quirks.c @@ -11,7 +11,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev) { u8 config, rev; - u32 word; + u16 word; /* BIOS may enable hardware IRQ balancing for * E7520/E7320/E7525(revision ID 0x9 and below) @@ -26,8 +26,11 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev) pci_read_config_byte(dev, 0xf4, config); pci_write_config_byte(dev, 0xf4, config|0x2); Can you also add a comment which points at the Intel documentation? http://download.intel.com/design/chipsets/datashts/30300702.pdf Page 34 documents 0xf4 register. And I just doubled checked that the 0xf4 register value is restored later in the quirk (obvious when you look at the code but not from the patch). - /* read xTPR register */ - raw_pci_read(0, 0, 0x40, 0x4c, 2, word); + /* + * read xTPR register. We may not have a pci_dev for device 8 + * because it might be hidden until the above write. + */ + pci_bus_read_config_word(dev-bus, PCI_DEVFN(8, 0), 0x4c, word); Yeah, this should work even though we don't have a dev for it. Acked-by: Grant Grundler [EMAIL PROTECTED] thanks, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: raw_pci_read in quirk_intel_irqbalance
On Mon, Feb 11, 2008 at 09:18:49AM -0800, Linus Torvalds wrote: I put it in the commit message, but it wasn't on page 34 when I checked (I changed it to 69), Sorry - page 34 was just the first reference to Extended Configuration Registers when I originally scrounged up the info for willy. Page 69 is in fact what I wanted to point at (DEVPRES1 reg). and I added the naem for the datasheet so that if/when it moves, maybe google can help. It should. But doing a quick check now only shows one other copy (in .es domain :) when searching for 30300702.pdf. Searching for the full document title results in several intel.com locations and lots of other misc references that don't look quite right. Many of those just reference the product brief and not the data sheet. yahoo.com gives similar results. thanks, grant Linus -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/20] drivers/net/ethernet/dec/tulip/dmfe.c: fix error return code
On Wed, Oct 3, 2012 at 9:17 AM, Peter Senna Tschudin peter.se...@gmail.com wrote: From: Peter Senna Tschudin peter.se...@gmail.com Convert a nonnegative error return code to a negative one, as returned elsewhere in the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // smpl ( if@p1 (\(ret 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != ret *if(...) { ... when != ret = e2 when forall return ret; } // /smpl Signed-off-by: Peter Senna Tschudin peter.se...@gmail.com Thanks! Looks good to me. Acked-by: Grant Grundler grund...@parisc-linux.org cheers, grant --- drivers/net/ethernet/dec/tulip/dmfe.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/dec/tulip/dmfe.c b/drivers/net/ethernet/dec/tulip/dmfe.c index 4d6fe60..d23755e 100644 --- a/drivers/net/ethernet/dec/tulip/dmfe.c +++ b/drivers/net/ethernet/dec/tulip/dmfe.c @@ -446,13 +446,17 @@ static int __devinit dmfe_init_one (struct pci_dev *pdev, /* Allocate Tx/Rx descriptor memory */ db-desc_pool_ptr = pci_alloc_consistent(pdev, sizeof(struct tx_desc) * DESC_ALL_CNT + 0x20, db-desc_pool_dma_ptr); - if (!db-desc_pool_ptr) + if (!db-desc_pool_ptr) { + err = -ENOMEM; goto err_out_res; + } db-buf_pool_ptr = pci_alloc_consistent(pdev, TX_BUF_ALLOC * TX_DESC_CNT + 4, db-buf_pool_dma_ptr); - if (!db-buf_pool_ptr) + if (!db-buf_pool_ptr) { + err = -ENOMEM; goto err_out_free_desc; + } db-first_tx_desc = (struct tx_desc *) db-desc_pool_ptr; db-first_tx_desc_dma = db-desc_pool_dma_ptr; @@ -462,8 +466,10 @@ static int __devinit dmfe_init_one (struct pci_dev *pdev, db-chip_id = ent-driver_data; /* IO type range. */ db-ioaddr = pci_iomap(pdev, 0, 0); - if (!db-ioaddr) + if (!db-ioaddr) { + err = -ENOMEM; goto err_out_free_buf; + } db-chip_revision = pdev-revision; db-wol_mode = 0; -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 079/193] drivers/net/ethernet/dec/tulip: remove CONFIG_EXPERIMENTAL
On Tue, Oct 23, 2012 at 1:02 PM, Kees Cook keesc...@chromium.org wrote: This config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it. CC: Grant Grundler grund...@parisc-linux.org Acked-by: Grant Grundler grund...@parisc-linux.org It clearly makes no sense for this driver (obsolete HW for the most part). Thanks Kees! grant Signed-off-by: Kees Cook keesc...@chromium.org --- drivers/net/ethernet/dec/tulip/Kconfig |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/dec/tulip/Kconfig b/drivers/net/ethernet/dec/tulip/Kconfig index 1203be0..0c37fb2 100644 --- a/drivers/net/ethernet/dec/tulip/Kconfig +++ b/drivers/net/ethernet/dec/tulip/Kconfig @@ -57,8 +57,8 @@ config TULIP be called tulip. config TULIP_MWI - bool New bus configuration (EXPERIMENTAL) - depends on TULIP EXPERIMENTAL + bool New bus configuration + depends on TULIP ---help--- This configures your Tulip card specifically for the card and system cache line size type you are using. -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] allow drivers to flush in-flight DMA
[+jejb to cc] On Tue, Sep 25, 2007 at 04:58:43PM -0700, [EMAIL PROTECTED] wrote: This is a followup to http://lkml.org/lkml/2007/8/24/280 Despite Grant's desire for a more elegant solution, there's not much new here. I moved the API change from pci.h to dma-mapping.h and removed the pci_ prefix from the name. Thanks - but I don't have a better idea either. I think you are right to just move forward with this until someone provides a better API. Problem Description --- On Altix, DMA may be reordered within the NUMA interconnect. This can be a problem with Infiniband, where DMA to Completion Queues allocated in user-space can race with data DMA. This patchset allows a driver to associate a user-space memory region with a dmaflush attribute, so that writes to the memory region flush in-flight DMA, preventing the CQ/data race. Can we define this API to provide the same semantics as the memory that dma_alloc_coherent() returns? Did I summarize this correctly? Defining it terms of completion queues won't mean much to most folks. Better to add a description of completion queues to the DMA-API.txt if necessary. dma_alloc_coherent() API is pretty well understood. There are four patches in this set: [1/4] dma: add dma_flags_set_dmaflush() to dma interface Sorry - this feels like a color of the shed argument, but isn't this about DMA ordering attribute? dmaflush is an action and not an attribute to me. Is dma_flags_set_coherent() better since it's doing the same thing as dma_alloc_coherent()? [2/4] dma: redefine dma_flags_set_dmaflush() for sn-ia64 [3/4] dma: document dma_flags_set_dmaflush() This patch updates Documentation/DMA-mapping.txt. But it's a change to the generic (not PCI specific) API described in DMA-API.txt. Can you update that as well please? Upon reading the 2) Platforms that permit DMA reordering, I think I have been confusing coherency with ordering. I think I have because DMA is leaving the PCI domain, crossing an unordered domain (NUMA, interconnect), and then finally hitting the cache coherency domain when it reaches a far away memory controller. That's why I've been thinking of this as a coherency problem. The description and API uses the word flush (which is ok I guess) instead of describing this in terms of enforcing DMA ordering. Any DMA write to the strongly ordered region will cause _all_ inflight DMA to be visible to cache coherency, thus preserving the illusion of strong DMA ordering. Does that sound right/better to you too? I don't have chipset docs and some of this is just trying to rephrase what I've heard before from former SGI employees. [4/4] mthca: allow setting dmaflush attribute on user-allocated memory Besides calling the parameter dmaflush, it looks fine to me. (It's either a DMA ordering or coherency attribute depending on how you want to look at it.) thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [3/4] dma: document dma_flags_set_dmabarrier()
On Thu, Sep 27, 2007 at 06:13:02PM -0700, [EMAIL PROTECTED] wrote: Document dma_flags_set_dmabarrier(). Signed-off-by: Arthur Kepner [EMAIL PROTECTED] This looks really good! thanks, grant Acked-by: Grant Grundler [EMAIL PROTECTED] --- DMA-API.txt | 26 ++ 1 files changed, 26 insertions(+) diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index cc7a8c3..5fc0bba 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -544,3 +544,29 @@ size is the size (and should be a page-sized multiple). The return value will be either a pointer to the processor virtual address of the memory, or an error (via PTR_ERR()) if any part of the region is occupied. + +int +dma_flags_set_dmabarrier(int dir) + +Amend dir (one of the enum dma_data_direction values), with a +platform-specific dmabarrier attribute. The dmabarrier attribute +forces a flush of all in-flight DMA when the associated memory +region is written to (see example below.) + +This provides a mechanism to enforce ordering of DMA on platforms that +permit DMA to be reordered between device and host memory (within a +NUMA interconnect). On other platforms this is a nop. + +The dmabarrier would be set when the memory region is mapped for DMA, +e.g.: + + int count, flags = dma_flags_set_dmabarrier(DMA_BIDIRECTIONAL); + + count = dma_map_sg(dev, sglist, nents, flags); + +As an example of a situation where this would be useful, suppose that +the device does a DMA write to indicate that data is ready and +available in memory. The DMA of the completion indication could +race with data DMA. Using a dmabarrier on the memory used for +completion indications would prevent the race. + - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.23-rc4] irq: irq and pci_ids patch for Intel Tolapai
On Thu, Aug 30, 2007 at 03:46:56PM -0700, Jason Gaston wrote: This updated patch adds the Intel Tolapai LPC and SMBus Controller DID's. Signed-off-by: ?Jason Gaston [EMAIL PROTECTED] --- linux-2.6.23-rc4/arch/i386/pci/irq.c.orig 2007-08-27 18:32:35.0 -0700 +++ linux-2.6.23-rc4/arch/i386/pci/irq.c 2007-08-28 16:58:31.0 -0700 @@ -550,6 +550,7 @@ case PCI_DEVICE_ID_INTEL_ICH9_3: case PCI_DEVICE_ID_INTEL_ICH9_4: case PCI_DEVICE_ID_INTEL_ICH9_5: + case PCI_DEVICE_ID_INTEL_Tolapai_0: If this is the only place it's used, the prefence is to define the constant locally (in the file) and not in pci_ids.h. Please do submit new PCI device IDs to pciids.sf.net project. thanks, grant r-name = PIIX/ICH; r-get = pirq_piix_get; r-set = pirq_piix_set; --- linux-2.6.23-rc4/include/linux/pci_ids.h.orig 2007-08-27 18:32:35.0 -0700 +++ linux-2.6.23-rc4/include/linux/pci_ids.h 2007-08-28 16:58:31.0 -0700 @@ -2293,6 +2293,8 @@ #define PCI_DEVICE_ID_INTEL_MCH_PC 0x3599 #define PCI_DEVICE_ID_INTEL_MCH_PC1 0x359a #define PCI_DEVICE_ID_INTEL_E7525_MCH0x359e +#define PCI_DEVICE_ID_INTEL_Tolapai_00x5031 +#define PCI_DEVICE_ID_INTEL_Tolapai_10x5032 #define PCI_DEVICE_ID_INTEL_82371SB_00x7000 #define PCI_DEVICE_ID_INTEL_82371SB_10x7010 #define PCI_DEVICE_ID_INTEL_82371SB_20x7020 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [parisc-linux] [patch 15/23] Add cmpxchg_local to parisc
On Sun, Aug 12, 2007 at 10:54:49AM -0400, Mathieu Desnoyers wrote: Use the new generic cmpxchg_local (disables interrupt). Also use the generic cmpxchg as fallback if SMP is not set. Mathieu, thanks for adding __cmpxchg_local to parisc but why do we need it? By definition, atomic operators are, well, atomic. I searched for __cmpxchg_local and found this reference: http://www.ussg.iu.edu/hypermail/linux/kernel/0612.2/1337.html but the root of that thread (Dec 20, 2006): http://www.ussg.iu.edu/hypermail/linux/kernel/0612.2/1334.html Doesn't explain the difference between local and non-local either. Per CPU data should only need memory barriers (in some cases) and protection against interrupts (in probably more cases). So I'm not understanding why a new set of APIs is needed. Can you add a description to Documentation/atomic_ops.txt ? *sigh* sorry for being late to the party on this one... cheers, grant Signed-off-by: Mathieu Desnoyers [EMAIL PROTECTED] CC: [EMAIL PROTECTED] CC: [EMAIL PROTECTED] --- include/asm-parisc/atomic.h | 29 + 1 file changed, 29 insertions(+) Index: linux-2.6-lttng/include/asm-parisc/atomic.h === --- linux-2.6-lttng.orig/include/asm-parisc/atomic.h 2007-07-20 19:44:40.0 -0400 +++ linux-2.6-lttng/include/asm-parisc/atomic.h 2007-07-20 19:44:47.0 -0400 @@ -122,6 +122,35 @@ __cmpxchg(volatile void *ptr, unsigned l (unsigned long)_n_, sizeof(*(ptr))); \ }) +#include asm-generic/cmpxchg-local.h + +static inline unsigned long __cmpxchg_local(volatile void *ptr, + unsigned long old, + unsigned long new_, int size) +{ + switch (size) { +#ifdef CONFIG_64BIT + case 8: return __cmpxchg_u64((unsigned long *)ptr, old, new_); +#endif + case 4: return __cmpxchg_u32(ptr, old, new_); + default: + return __cmpxchg_local_generic(ptr, old, new_, size); + } +} + +/* + * cmpxchg_local and cmpxchg64_local are atomic wrt current CPU. Always make + * them available. + */ +#define cmpxchg_local(ptr,o,n) \ + (__typeof__(*(ptr)))__cmpxchg_local((ptr), (unsigned long)(o), \ + (unsigned long)(n), sizeof(*(ptr))) +#ifdef CONFIG_64BIT +#define cmpxchg64_local(ptr,o,n) cmpxchg_local((ptr), (o), (n)) +#else +#define cmpxchg64_local(ptr,o,n) __cmpxchg64_local_generic((ptr), (o), (n)) +#endif + /* Note that we need not lock read accesses - aligned word writes/reads * are atomic, so a reader never sees unconsistent values. * -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ___ parisc-linux mailing list [EMAIL PROTECTED] http://lists.parisc-linux.org/mailman/listinfo/parisc-linux - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] pci: let devices flush DMA to host memory
On Fri, Aug 24, 2007 at 11:02:32AM -0700, [EMAIL PROTECTED] wrote: On Altix, DMA may be reordered within the NUMA interconnect. This can be a problem with Infiniband, where DMA to Completion Queues can race with data DMA. This patchset allows a driver to associate a memory region with a dmaflush attribute, so that writes to the memory region flush in-flight DMA, preventing the CQ/data race. FYI to linux-pci folks This patch had previous discussion on LKML: http://lkml.org/lkml/2007/8/17/336 James Bottomley at one point obliquely referred to my OLS2003 paper: DMA Hints on ia64/PARISC After reading the thread, my take is we need a more elegant way for a device driver to handle registration of DMA regions allocated by user space. The API would make this page/region act like dma_alloc_coherent(). That implies strong ordering between CPU and DMA to/from the device. Maybe the code is the right thing and I want a name that makes sense in the context of current DMA API. On IRC, willy suggested an mmap() flag and that sounds reasonable too though I don't know if it's feasible. hth, grant There are three patches in this set: [1/3]: add pci_dma_flags_set_dmaflush() to pci interface [2/3]: redefine pci_dma_flags_set_dmaflush() for sn-ia64 [3/3]: document pci_dma_flags_set_dmaflush() And there would be additional patches to IB drivers to make use of the interface, of course. -- Arthur - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [parisc-linux] [patch 15/23] Add cmpxchg_local to parisc
On Mon, Aug 27, 2007 at 05:11:40PM -0400, Mathieu Desnoyers wrote: ... Can you add a description to Documentation/atomic_ops.txt ? *sigh* sorry for being late to the party on this one... Does Documentation/local_ops.txt answer your questions ? If not, please tell me and I'll gladly explain more. Yes, it does mostly - thanks. A few questions/nits: o Did you attempt quantify how many places in the kernel could use this? I'm just trying to get a feel for how useful this really is vs just using existing mechanisms (that people understand) to implement a non-SMP-safe counter that protects updates (writes) against interrupts. If you did, adding some referencs to local_ops.txt would be helpful so folks could look for examples of correct usage. o Wording in local_ops.txt: on the ... it will then appear to be written out of order wrt other memory writes on the owner CPU. I'd like to suggest by the owner CPU. o How can a local_t counter protect updates (writes) against interrupts but not preemption? I always thought preemption required some sort of interrupt or trap. Maybe the local_ops.txt explains that and I just missed it. DaveM explained updates in flight would not be visible to interrupts and I suspect that's the answer to my questionbut then I don't feel good the local_ops are safe to update in interrupts _and_ the process context kernel. Maybe the relationship between local_ops, preemption, and interrupts could be explained more carefully in local_ops.txt. o OK to add a reference for local_ops.txt to atomic_ops.txt? They are obviously related and anyone discovering one of the docs should be made aware of the other. Patch+log entry appended below. Please sign-off if that's ok with you. thanks, grant Diff+Commit entry against 2.6.22.5: local_t is a variant of atomic_t and has related ops to match. Add reference for local_t documentation to atomic_ops.txt. Signed-off-by: Grant Grundler [EMAIL PROTECTED] --- 2.6.22.5-ORIG/Documentation/atomic_ops.txt 2007-08-27 22:50:27.0 -0700 +++ 2.6.22.5-ggg/Documentation/atomic_ops.txt 2007-08-27 22:54:44.0 -0700 @@ -14,6 +14,10 @@ typedef struct { volatile int counter; } atomic_t; +local_t is very similar to atomic_t. If the counter is per CPU and only +updated by one CPU, local_t is probably more appropriate. Please see +Documentation/local_ops.txt for the semantics of local_t. + The first operations to implement for atomic_t's are the initializers and plain reads. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [parisc-linux] [patch 15/23] Add cmpxchg_local to parisc
On Tue, Aug 28, 2007 at 07:50:18AM -0400, Mathieu Desnoyers wrote: ... A few questions/nits: o Did you attempt quantify how many places in the kernel could use this? I'm just trying to get a feel for how useful this really is vs just using existing mechanisms (that people understand) to implement a non-SMP-safe counter that protects updates (writes) against interrupts. If you did, adding some referencs to local_ops.txt would be helpful so folks could look for examples of correct usage. Good question. Since it is useful to implement fast, interrupt reentrant, counters of any kind without disabling interrupts, I think it could be vastely used in the kernel. I also use it in my LTTng kernel tracer implementation to provide very fast buffer management. It is used in LTTng, but could be used for most kind of buffering management too; meaning that we could manage buffers without disabling interrupts. So I don't expect to come with an upper bound about where it can be used... Ok...so I'll try to find one in 2.6.22.5: grundler 1855find -name \*.c | xargs fgrep DEFINE_PER_CPU | fgrep atomic_t ./arch/s390/kernel/time.c:static DEFINE_PER_CPU(atomic_t, etr_sync_word); grundler 1856find -name \*.c | xargs fgrep DEFINE_PER_CPU | fgrep local_t ./arch/x86_64/kernel/nmi.c:static DEFINE_PER_CPU(local_t, alert_counter); uhm, I was expecting more than that. Maybe there is some other systemic problem with how PER_CPU stuff is used/declared? In any case, some references to LTT usage would be quite helpful. E.g. a list of file and variable names at the end of local_ops.txt file. o How can a local_t counter protect updates (writes) against interrupts but not preemption? I always thought preemption required some sort of interrupt or trap. Maybe the local_ops.txt explains that and I just missed it. Local atomic operations only guarantee variable modification atomicity wrt the CPU which owns the data. Therefore, care must taken to make sure that only one CPU writes to the local_t data. This is done by using per cpu data and making sure that we modify it from within a preemption safe context. - therefore, preemption must be disabled around local ops usage. This is required to be pinned to one CPU anyway. Sorry...the quoted text doesn't answer my question. It's a definition of semantics, not an explanation of the mechanics. I want to know what happens when (if?) an interrupt occurs in the middle of a read/modify/write sequence that isn't prefixed with LOCK (or something similar for other arches like store locked conditional ops). Stating the semantics is a good thing - but not a substitution for describing how it works for a given architecture. Either in the code or in local_ops.txt. Otherwise people like me won't use it because we don't believe that (or understand how) it really works. DaveM explained updates in flight would not be visible to interrupts and I suspect that's the answer to my questionbut then I don't feel good the local_ops are safe to update in interrupts _and_ the process context kernel. Maybe the relationship between local_ops, preemption, and interrupts could be explained more carefully in local_ops.txt. Does the paragraph above explain it enough or should I add some more explanation ? Please add a bit more detail. If DaveM is correct (he normally is), then there must be limits on how the local_t can be used in the kernel process and interrupt contexts. I'd like those rules spelled out very clearly since it's easy to get wrong and tracking down such a bug is quite painful. Note: I already missed the one critical sentence about only the owning CPU can write the valuethere seem to be other limitations as well with respect to interrupts. o OK to add a reference for local_ops.txt to atomic_ops.txt? They are obviously related and anyone discovering one of the docs should be made aware of the other. Patch+log entry appended below. Please sign-off if that's ok with you. I'm perfectly ok with the idea, but suggest a small modification. See below. Looks fine to me. Add your Signed-off-by and submit to DaveM since he seems to be the maintainer of atomic_ops.txt. cheers, grant thanks, grant Diff+Commit entry against 2.6.22.5: local_t is a variant of atomic_t and has related ops to match. Add reference for local_t documentation to atomic_ops.txt. Signed-off-by: Grant Grundler [EMAIL PROTECTED] --- 2.6.22.5-ORIG/Documentation/atomic_ops.txt 2007-08-27 22:50:27.0 -0700 +++ 2.6.22.5-ggg/Documentation/atomic_ops.txt 2007-08-27 22:54:44.0 -0700 @@ -14,6 +14,10 @@ typedef struct { volatile int counter; } atomic_t; +local_t is very similar to atomic_t. If the counter is per CPU and only +updated by one CPU, local_t is probably more appropriate. Please see +Documentation/local_ops.txt for the semantics
Re: [PATCH] Fix boot-time hang on G31/G33 PC
On Sat, Aug 25, 2007 at 07:55:56PM -0600, Matthew Wilcox wrote: This patch, loosely based on a patch from Robert Hancock, which was in turn based on a patch from Jesse Barnes, fixes a boot-time hang on my shiny new PC. The 'conflict' mentioned in the patch in my case happens to be between mmconfig and the graphics card, but it could easily be between any pair of devices if they are left enabled by the BIOS and mappen in the wrong place. This issue has come up before: http://www.uwsg.indiana.edu/hypermail/linux/kernel/0212.2/0232.html and Ivan Kokshaysky suggested a very similar patch: http://www.uwsg.indiana.edu/hypermail/linux/kernel/0212.2/0324.html cheers, grant Signed-off-by: Matthew Wilcox [EMAIL PROTECTED] diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 34b8dae..51ef450 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -180,11 +180,26 @@ static inline int is_64bit_memory(u32 mask) return 0; } +/* + * Sizing PCI BARs requires us to disable decoding, otherwise we may run + * into conflicts with other devices while trying to size the BAR. Normally + * this isn't a problem, but it happens on some machines normally, and can + * happen on others during PCI device hotplug. Don't disable BARs for host + * bridges, though. Some of them do silly things like disable accesses to + * RAM from the CPU + */ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom) { unsigned int pos, reg, next; u32 l, sz; struct resource *res; + u16 orig_cmd; + + if ((dev-class 8) != PCI_CLASS_BRIDGE_HOST) { + pci_read_config_word(dev, PCI_COMMAND, orig_cmd); + pci_write_config_word(dev, PCI_COMMAND, + orig_cmd ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO)); + } for(pos=0; poshowmany; pos = next) { u64 l64; @@ -283,6 +298,9 @@ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom) } } } + + if ((dev-class 8) != PCI_CLASS_BRIDGE_HOST) + pci_write_config_word(dev, PCI_COMMAND, orig_cmd); } void __devinit pci_read_bridge_bases(struct pci_bus *child) -- Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix boot-time hang on G31/G33 PC
On Tue, Aug 28, 2007 at 11:59:08AM -0600, Grant Grundler wrote: On Sat, Aug 25, 2007 at 07:55:56PM -0600, Matthew Wilcox wrote: This patch, loosely based on a patch from Robert Hancock, which was in turn based on a patch from Jesse Barnes, fixes a boot-time hang on my shiny new PC. The 'conflict' mentioned in the patch in my case happens to be between mmconfig and the graphics card, but it could easily be between any pair of devices if they are left enabled by the BIOS and mappen in the wrong place. This issue has come up before: http://www.uwsg.indiana.edu/hypermail/linux/kernel/0212.2/0232.html Ok...finally found the thread I was looking for: http://www.ussg.iu.edu/hypermail/linux/kernel/0212.2/0978.html or look at the by Thread page and search for disable BAR: http://www.ussg.iu.edu/hypermail/linux/kernel/0212.2/index.html Main difference now is not disabling anything on any sort of Bridge. Summary: Sizing BARs has never been a very safe operation. We have to mitigate as best we can and then live with the remaining risks. cheers, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [parisc-linux] [patch 15/23] Add cmpxchg_local to parisc
[davem: patch for you at the bottom to Documentation/atomic_ops.txt ] On Tue, Aug 28, 2007 at 02:38:35PM -0400, Mathieu Desnoyers wrote: * Grant Grundler ([EMAIL PROTECTED]) wrote: On Tue, Aug 28, 2007 at 07:50:18AM -0400, Mathieu Desnoyers wrote: ... So I don't expect to come with an upper bound about where it can be used... Ok...so I'll try to find one in 2.6.22.5: grundler 1855find -name \*.c | xargs fgrep DEFINE_PER_CPU | fgrep atomic_t ./arch/s390/kernel/time.c:static DEFINE_PER_CPU(atomic_t, etr_sync_word); grundler 1856find -name \*.c | xargs fgrep DEFINE_PER_CPU | fgrep local_t ./arch/x86_64/kernel/nmi.c:static DEFINE_PER_CPU(local_t, alert_counter); uhm, I was expecting more than that. Maybe there is some other systemic problem with how PER_CPU stuff is used/declared? the local ops has just been standardized in 2.6.22 though a patchset I did. I would not expect the code to start using them this quickly. Or maybe is it just that I am doing a terrible marketing job ;) Yeah, I didn't expect many users of local_t. The search for atomic_t usage in DEFINE_PER_CPU was an attempt to find other potential candidates which could be local_t. Is there any other programmatic way we could look for code which could (or should) be using local_t? In any case, some references to LTT usage would be quite helpful. E.g. a list of file and variable names at the end of local_ops.txt file. LTT is not mainlined (yet!) ;) Ok...probably not such a good example then. :) ... Sorry...the quoted text doesn't answer my question. It's a definition of semantics, not an explanation of the mechanics. I want to know what happens when (if?) an interrupt occurs in the middle of a read/modify/write sequence that isn't prefixed with LOCK (or something similar for other arches like store locked conditional ops). Stating the semantics is a good thing - but not a substitution for describing how it works for a given architecture. Either in the code or in local_ops.txt. Otherwise people like me won't use it because we don't believe that (or understand how) it really works. Quoting Intel 64 and IA-32 Architectures Software Developer's Manual 3.2 Instructions LOCK - Assert LOCK# Signal Prefix ... I've read this before and understand how LOCK works. This isn't helpful since I want a description of the behavior without LOCK. And if we take a look at some of the atomic primitives which are used in i386 local.h: add (for inc/dec/add/sub) xadd cmpxchg All these instructions, just like any other, can be interrupted by an external interrupt or cause a trap, exception, or fault. Interrupt handler are executing between instructions and traps/exceptions/faults will either execute instead of the faulty instruction or after is has been executed. In all these cases, each instruction can be seen as executing atomically wrt the local CPU. This is exactly what permits asm-i386/local.h to define out the LOCK prefix for UP kernels. I think what I'm looking for but don't know if it's true: The cmpxchg (for example) at the kernel process context will not clobber or be clobbered by a cmpxchg done to the same local_t performed at the kernel interrupt context by the same CPU. If that's not true, then it would be good to add that as another restriction to usage. I use the same trick UP kernel are using, but I deploy it in SMP context, but I require the CPU to be the only one to access the memory locations written to by the local ops. Basically, since the memory location is _not_ shared across CPUs for writing, we can safely write to it without holding the LOCK signal. ok. ... Note: I already missed the one critical sentence about only the owning CPU can write the valuethere seem to be other limitations as well with respect to interrupts. Ok, let's give a try at a clear statement: - Variables touched by local ops must be per cpu variables. - _Only_ the CPU owner of these variables must write to them. - This CPU can use local ops from any context (process, irq, softirq, nmi, ...) to update its local_t variables. - Preemption (or interrupts) must be disabled when using local ops in process context to make sure the process won't be migrated to a different CPU between getting the per-cpu variable and doing the actual local op. - When using local ops in interrupt context, no special care must be taken on a mainline kernel, since they will run on the local CPU with preemption already disabled. I suggest, however, to explicitly disable preemption anyway to make sure it will still work correctly on -rt kernels. - Reading the local cpu variable will provide the current copy of the variable. - Reads of these variables can be done from any CPU, because updates to long, aligned, variables are always atomic. Since no memory synchronization is done by the writer CPU, an outdated copy
Re: [PATCH] local_t Documentation update 2
On Wed, Aug 29, 2007 at 08:19:53AM -0400, Mathieu Desnoyers wrote: local_t Documentation update 2 Grant Grundler was asking for more detail about correct usage of local atomic operations and suggested adding the resulting summary to local_ops.txt. Please add a bit more detail. If DaveM is correct (he normally is), then there must be limits on how the local_t can be used in the kernel process and interrupt contexts. I'd like those rules spelled out very clearly since it's easy to get wrong and tracking down such a bug is quite painful. It applies on top of 2.6.23-rc3-mm1. Signed-off-by: Mathieu Desnoyers [EMAIL PROTECTED] CC: Grant Grundler [EMAIL PROTECTED] Signed-off-by: Grant Grundler [EMAIL PROTECTED] thanks! grant --- Documentation/local_ops.txt | 23 +++ 1 file changed, 23 insertions(+) Index: linux-2.6-lttng/Documentation/local_ops.txt === --- linux-2.6-lttng.orig/Documentation/local_ops.txt 2007-08-29 08:09:34.0 -0400 +++ linux-2.6-lttng/Documentation/local_ops.txt 2007-08-29 08:15:37.0 -0400 @@ -45,6 +45,29 @@ long fails. The definition looks like : typedef struct { atomic_long_t a; } local_t; +* Rules to follow when using local atomic operations + +- Variables touched by local ops must be per cpu variables. +- _Only_ the CPU owner of these variables must write to them. +- This CPU can use local ops from any context (process, irq, softirq, nmi, ...) + to update its local_t variables. +- Preemption (or interrupts) must be disabled when using local ops in + process context to make sure the process won't be migrated to a + different CPU between getting the per-cpu variable and doing the + actual local op. +- When using local ops in interrupt context, no special care must be + taken on a mainline kernel, since they will run on the local CPU with + preemption already disabled. I suggest, however, to explicitly + disable preemption anyway to make sure it will still work correctly on + -rt kernels. +- Reading the local cpu variable will provide the current copy of the + variable. +- Reads of these variables can be done from any CPU, because updates to + long, aligned, variables are always atomic. Since no memory + synchronization is done by the writer CPU, an outdated copy of the + variable can be read when reading some _other_ cpu's variables. + + * How to use local atomic operations #include linux/percpu.h -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add missing KERN_level to multiline printk(KERN_level...)
On Wed, Aug 08, 2007 at 05:28:46PM -0700, Joe Perches wrote: Some uses of printk are missing KERN_level on the second and subsequent lines. Nice! Thanks! ACK the three parisc-linux bits. thanks, grant For instance: printk(KERN_INFO line1: %d\nline2: %d\n, val1, val2); Line1 is marked log_level: info Line2 is marked log_level: unknown Some lines have trailing spaces after \n, removed where appropriate. Signed-off-by: Joe Perches [EMAIL PROTECTED] arch/arm/kernel/ecard.c |3 ++- arch/blackfin/kernel/dualcore_test.c |3 ++- arch/blackfin/kernel/traps.c |4 +++- arch/h8300/kernel/setup.c|4 +++- arch/i386/kernel/io_apic.c |3 ++- arch/m68knommu/kernel/setup.c|4 +++- arch/m68knommu/kernel/traps.c|5 +++-- arch/m68knommu/mm/init.c |9 ++--- arch/m68knommu/platform/68328/config.c |3 ++- arch/m68knommu/platform/68360/config.c |3 ++- arch/m68knommu/platform/68EZ328/config.c |3 ++- arch/mips/vr41xx/common/pmu.c|9 ++--- arch/parisc/kernel/traps.c |3 ++- arch/parisc/math-emu/driver.c|5 +++-- arch/v850/kernel/setup.c |6 -- arch/x86_64/kernel/io_apic.c |3 ++- arch/x86_64/kernel/mpparse.c |3 ++- drivers/acpi/acpi_memhotplug.c |3 ++- drivers/char/dtlk.c |3 ++- drivers/char/tpm/tpm_bios.c |2 +- drivers/ide/ide-cd.c |3 ++- drivers/input/serio/hil_mlc.c|2 +- drivers/message/fusion/mptlan.c |3 ++- drivers/mtd/maps/cdb89712.c |5 - drivers/net/cs89x0.c |2 +- drivers/net/dgrs.c |3 ++- drivers/net/wireless/arlan-main.c|2 +- drivers/net/wireless/arlan-proc.c| 19 ++- drivers/parisc/led.c |3 ++- drivers/scsi/aha152x.c | 16 +++- drivers/scsi/dpt_i2o.c |3 ++- drivers/scsi/mac_scsi.c |3 ++- drivers/scsi/megaraid.c |3 ++- drivers/scsi/megaraid/megaraid_sas.c | 25 - drivers/scsi/osst.c |3 ++- drivers/scsi/zalon.c |2 +- drivers/video/savage/savagefb_driver.c | 21 - fs/dlm/dlm_internal.h|9 + fs/freevxfs/vxfs_bmap.c |8 ++-- fs/jffs2/wbuf.c |3 ++- mm/slub.c| 18 -- 41 files changed, 152 insertions(+), 85 deletions(-) diff --git a/arch/arm/kernel/ecard.c b/arch/arm/kernel/ecard.c index f56d48c..6402ad2 100644 --- a/arch/arm/kernel/ecard.c +++ b/arch/arm/kernel/ecard.c @@ -547,7 +547,8 @@ static void ecard_check_lockup(struct irq_desc *desc) if (last == jiffies) { lockup += 1; if (lockup 100) { - printk(KERN_ERR \nInterrupt lockup detected - + printk(KERN_ERR \n +KERN_ERR Interrupt lockup detected - disabling all expansion card interrupts\n); desc-chip-mask(IRQ_EXPANSIONCARD); diff --git a/arch/blackfin/kernel/dualcore_test.c b/arch/blackfin/kernel/dualcore_test.c index 0fcba74..3c94199 100644 --- a/arch/blackfin/kernel/dualcore_test.c +++ b/arch/blackfin/kernel/dualcore_test.c @@ -35,7 +35,8 @@ static int *testarg = (int *)0xfeb0; static int test_init(void) { *testarg = 1; - printk(KERN_INFO Dual core test module inserted: set testarg = [%d]\n @ [%p]\n, + printk(KERN_INFO Dual core test module inserted: set testarg = [%d]\n +KERN_INFO @ [%p]\n, *testarg, testarg); return 0; } diff --git a/arch/blackfin/kernel/traps.c b/arch/blackfin/kernel/traps.c index 8766bd6..f76b78b 100644 --- a/arch/blackfin/kernel/traps.c +++ b/arch/blackfin/kernel/traps.c @@ -350,7 +350,9 @@ asmlinkage void trap_c(struct pt_regs *fp) info.si_code = ILL_CPLB_MULHIT; #ifdef CONFIG_DEBUG_HUNT_FOR_ZERO sig = SIGSEGV; - printk(KERN_EMERG \n\nJump to address 0 - 0x0fff\n); + printk(KERN_EMERG \n +KERN_EMERG \n +KERN_EMERG Jump to address 0 - 0x0fff\n); #else sig = SIGILL; printk(KERN_EMERG EXC_0x2D); diff --git a/arch/h8300/kernel/setup.c b/arch/h8300/kernel/setup.c index b2e86d0..cb45404 100644 --- a/arch/h8300/kernel/setup.c +++ b/arch/h8300/kernel/setup.c @@ -127,7 +127,9 @@ void __init setup_arch(char **cmdline_p) register_console((struct console
Re: [RFC][PATCH] uli526x: Add suspend and resume routines (updated)
On Wed, Aug 08, 2007 at 12:56:41AM +0200, Rafael J. Wysocki wrote: ... Apologies, I missed this. I'll look to our new tulip maintainer to queue your resent patch, or at least ACK it... OK Below is the updated version. It's functionally equivalent to the previous one. ACK. Looks fine to me too. I don't have the HW to test it though. cheers, grant Greetings, Rafael --- From: Rafael J. Wysocki [EMAIL PROTECTED] Add suspend/resume support to the uli526x network driver (tested on x86_64, with 'Ethernet controller: ALi Corporation M5263 Ethernet Controller, rev 40'). This patch is based on the suspend/resume code in the tg3 driver. Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED] --- drivers/net/tulip/uli526x.c | 109 +--- 1 file changed, 103 insertions(+), 6 deletions(-) Index: linux-2.6.23-rc2/drivers/net/tulip/uli526x.c === --- linux-2.6.23-rc2.orig/drivers/net/tulip/uli526x.c +++ linux-2.6.23-rc2/drivers/net/tulip/uli526x.c @@ -1110,19 +1110,15 @@ static void uli526x_timer(unsigned long /* - * Dynamic reset the ULI526X board * Stop ULI526X board * Free Tx/Rx allocated memory - * Reset ULI526X board - * Re-initialize ULI526X board + * Init system variable */ -static void uli526x_dynamic_reset(struct net_device *dev) +static void uli526x_reset_prepare(struct net_device *dev) { struct uli526x_board_info *db = netdev_priv(dev); - ULI526X_DBUG(0, uli526x_dynamic_reset(), 0); - /* Sopt MAC controller */ db-cr6_data = ~(CR6_RXSC | CR6_TXSC); /* Disable Tx/Rx */ update_cr6(db-cr6_data, dev-base_addr); @@ -1141,6 +1137,22 @@ static void uli526x_dynamic_reset(struct db-link_failed = 1; db-init=1; db-wait_reset = 0; +} + + +/* + * Dynamic reset the ULI526X board + * Stop ULI526X board + * Free Tx/Rx allocated memory + * Reset ULI526X board + * Re-initialize ULI526X board + */ + +static void uli526x_dynamic_reset(struct net_device *dev) +{ + ULI526X_DBUG(0, uli526x_dynamic_reset(), 0); + + uli526x_reset_prepare(dev); /* Re-initialize ULI526X board */ uli526x_init(dev); @@ -1150,6 +1162,89 @@ static void uli526x_dynamic_reset(struct } +#ifdef CONFIG_PM_SLEEP + +/* + * Suspend the interface. + */ + +static int uli526x_suspend(struct pci_dev *pdev, pm_message_t state) +{ + struct net_device *dev = pci_get_drvdata(pdev); + pci_power_t power_state; + int err; + + ULI526X_DBUG(0, uli526x_suspend, 0); + + if (!(dev netdev_priv(dev))) + return 0; + + pci_save_state(pdev); + + if (!netif_running(dev)) + return 0; + + netif_device_detach(dev); + uli526x_reset_prepare(dev); + + power_state = pci_choose_state(pdev, state); + pci_enable_wake(pdev, power_state, 0); + err = pci_set_power_state(pdev, power_state); + if (err) { + netif_device_attach(dev); + /* Re-initialize ULI526X board */ + uli526x_init(dev); + /* Restart upper layer interface */ + netif_wake_queue(dev); + } + + return err; +} + +/* + * Resume the interface. + */ + +static int uli526x_resume(struct pci_dev *pdev) +{ + struct net_device *dev = pci_get_drvdata(pdev); + struct uli526x_board_info *db = netdev_priv(dev); + int err; + + ULI526X_DBUG(0, uli526x_resume, 0); + + if (!(dev db)) + return 0; + + pci_restore_state(pdev); + + if (!netif_running(dev)) + return 0; + + err = pci_set_power_state(pdev, PCI_D0); + if (err) { + printk(KERN_WARNING %s: Could not put device into D0\n, + dev-name); + return err; + } + + netif_device_attach(dev); + /* Re-initialize ULI526X board */ + uli526x_init(dev); + /* Restart upper layer interface */ + netif_wake_queue(dev); + + return 0; +} + +#else /* !CONFIG_PM_SLEEP */ + +#define uli526x_suspend NULL +#define uli526x_resume NULL + +#endif /* !CONFIG_PM_SLEEP */ + + /* * free all allocated rx buffer */ @@ -1689,6 +1784,8 @@ static struct pci_driver uli526x_driver .id_table = uli526x_pci_tbl, .probe = uli526x_init_one, .remove = __devexit_p(uli526x_remove_one), + .suspend= uli526x_suspend, + .resume = uli526x_resume, }; MODULE_AUTHOR(Peer Chen, [EMAIL PROTECTED]); - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 11/68] 0 - NULL, for arch/parisc
On Fri, Jul 27, 2007 at 11:45:05AM +0200, Yoann Padioleau wrote: When comparing a pointer, it's clearer to compare it to NULL than to 0. ... diff --git a/arch/parisc/kernel/smp.c b/arch/parisc/kernel/smp.c index 04c7e1d..16fccbe 100644 --- a/arch/parisc/kernel/smp.c +++ b/arch/parisc/kernel/smp.c @@ -333,7 +333,7 @@ smp_call_function (void (*func) (void *i if (retry) { spin_lock (lock); - while (smp_call_function_data != 0) + while (smp_call_function_data != NULL) barrier(); } else { Yoann, Thanks! I like comparing pointers to NULL since it makes it explicit we are dealing with a pointer and is consistent with the assignment to NULL later in the code. But I'd like the later comparisons of smp_call_function_data to be consistent with your suggestion above. Patch below adds another != NULL. thanks grant Signed-off-by: Grant Grundler [EMAIL PROTECTED] diff --git a/arch/parisc/kernel/smp.c b/arch/parisc/kernel/smp.c index 04c7e1d..c9ce659 100644 --- a/arch/parisc/kernel/smp.c +++ b/arch/parisc/kernel/smp.c @@ -333,12 +333,12 @@ smp_call_function (void (*func) (void *info), void *info, int retry, int wait) if (retry) { spin_lock (lock); - while (smp_call_function_data != 0) + while (smp_call_function_data != NULL) barrier(); } else { spin_lock (lock); - if (smp_call_function_data) { + if (smp_call_function_data != NULL) { spin_unlock (lock); return -EBUSY; } - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [TULIP] Need new maintainer
On Mon, Jul 30, 2007 at 03:31:58PM -0400, Kyle McMartin wrote: On Mon, Jul 30, 2007 at 01:04:13PM -0600, Valerie Henson wrote: The Tulip network driver needs a new maintainer! I no longer have time to maintain the Tulip network driver and I'm stepping down. Jeff Garzik would be happy to get volunteers. Val! I'm sorry to see you have to drop this one...C'est la Vie. Since I already take care of a major consumer of these devices (parisc, which pretty much all have tulip) I'm willing to take care of this. Alternately, Grant is probably willing. Yeah, I am willing and able to maintain tulip as well. Either way, parisc and some mips64 folks are stuck with tulip since that's what is embedded on the motherboard. So it would make sense for someone from either camp to maintain it. Thanks to David Lang for the offer of 4-port Dlink Tulip card. HP has two types of 4-port tulip cards (64-bit/33Mhz and 32-bit/33Mhz) that I gave to Val...so whoever picks up the maintainership can probably (eventually) get those from Val. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] Coding style fix include/asm-parisc/compat_rt_sigframe.h
On Wed, Aug 01, 2007 at 04:55:00PM +0200, Michal Piotrowski wrote: Hi, Coding style fix Acked-by: Grant Grundler [EMAIL PROTECTED] thanks, grant Regards, Michal -- LOG http://www.stardust.webpages.pl/log/ Signed-off-by: Michal Piotrowski [EMAIL PROTECTED] --- linux-mm-clean/include/asm-parisc/compat_rt_sigframe.h2007-07-09 01:32:17.0 +0200 +++ linux-mm/include/asm-parisc/compat_rt_sigframe.h 2007-08-01 16:51:57.0 +0200 @@ -1,6 +1,6 @@ -#includelinux/compat.h -#includelinux/compat_siginfo.h -#includeasm/compat_ucontext.h +#include linux/compat.h +#include linux/compat_siginfo.h +#include asm/compat_ucontext.h #ifndef _ASM_PARISC_COMPAT_RT_SIGFRAME_H #define _ASM_PARISC_COMPAT_RT_SIGFRAME_H - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-usb-devel] [PATCH] USB BIOS early handoff only when the we the driver is configured
On Thu, Aug 02, 2007 at 09:43:29AM -0700, Greg KH wrote: ... It wasn't just MIPS. IBM has a very popular blade system that has huge issues with this, and I think there are some other IBM systems based on the same BIOS that also do bad things if we don't grab the USB controller away from the BIOS as soon as possible (nasty interrupt and other messes happen...) PA-RISC has the same problem with USB. We can't reprogram the IOMMU windows at boot time unless the USB controller is forcefully stopped from doing DMA. BIOS leaves the USB DMA enabled to avoid loosing activity between polls for input. PA-RISC solves this by calling back into the BIOS (aka PDC) to cleanly stop all possible DMA devices. Other arches probably don't have this luxury. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] add a clear_pages function to clear pages of higher order
On Tue, Apr 05, 2005 at 10:15:18PM -0700, Gerrit Huizenga wrote: SpecSDET, Aim7 or ReAim from OSDL are probably what you are thinking of. SDET isn't publicly available. I hope by now osdl-reaim is called osdl-aim7: http://lkml.org/lkml/2003/8/1/172 grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] parisc: kfree cleanup in arch/parisc/
On Mon, Apr 11, 2005 at 10:54:25PM +0200, Jesper Juhl wrote: Get rid of redundant NULL pointer checks before kfree() in arch/parisc/ as well as a few blank lines. Signed-off-by: Jesper Juhl [EMAIL PROTECTED] thanks! Commited to cvs.parisc-linux.org: http://lists.parisc-linux.org/pipermail/parisc-linux-cvs/2005-April/035542.html I expect Matthew Wilcox will submit to linus with other parisc code changes in the next RC release. thanks, grant diff -upr linux-2.6.12-rc2-mm3-orig/arch/parisc/kernel/ioctl32.c linux-2.6.12-rc2-mm3/arch/parisc/kernel/ioctl32.c --- linux-2.6.12-rc2-mm3-orig/arch/parisc/kernel/ioctl32.c2005-04-05 21:21:08.0 +0200 +++ linux-2.6.12-rc2-mm3/arch/parisc/kernel/ioctl32.c 2005-04-11 22:48:03.0 +0200 @@ -104,12 +104,9 @@ static int drm32_version(unsigned int fd } out: - if (kversion.name) - kfree(kversion.name); - if (kversion.date) - kfree(kversion.date); - if (kversion.desc) - kfree(kversion.desc); + kfree(kversion.name); + kfree(kversion.date); + kfree(kversion.desc); return ret; } @@ -166,9 +163,7 @@ static int drm32_getsetunique(unsigned i ret = -EFAULT; } - if (karg.unique != NULL) - kfree(karg.unique); - + kfree(karg.unique); return ret; } @@ -265,7 +260,6 @@ static int drm32_info_bufs(unsigned int } kfree(karg.list); - return ret; } @@ -305,7 +299,6 @@ static int drm32_free_bufs(unsigned int out: kfree(karg.list); - return ret; } @@ -494,15 +487,10 @@ static int drm32_dma(unsigned int fd, un } out: - if (karg.send_indices) - kfree(karg.send_indices); - if (karg.send_sizes) - kfree(karg.send_sizes); - if (karg.request_indices) - kfree(karg.request_indices); - if (karg.request_sizes) - kfree(karg.request_sizes); - + kfree(karg.send_indices); + kfree(karg.send_sizes); + kfree(karg.request_indices); + kfree(karg.request_sizes); return ret; } @@ -555,9 +543,7 @@ static int drm32_res_ctx(unsigned int fd ret = -EFAULT; } - if (karg.contexts) - kfree(karg.contexts); - + kfree(karg.contexts); return ret; } PS. If you reply to lists other than Linux-kernel, then please keep me on CC: - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: M7101
On Sun, Feb 06, 2005 at 03:26:15PM +0100, Jean Delvare wrote: Maarten Deprez then converted it to the proper kernel coding-style: http://marc.theaimsgroup.com/?l=linux-kernelm=110726276414532 ... Any chance we could get the PCI folks to review the code and push it upwards if it is OK? I'm not the maintainer, but it looks fine to me except for use of numeric constants (e.g 0x5f, 0x18) instead of adding #defines to name the values. But if no one knows what those offsets are for we'll just have to leave it. For reference, here are links to the original m7101 unhiding driver code and help file: http://www2.lm-sensors.nu/~lm78/cvs/lm_sensors2/prog/hotplug/m7101.c Unfortunately, I didn't anything documenting 0x5f and 0x18. Will m7101.c be modified once quirks enabling the device? hth, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: pci: Arch hook to determine config space size
On Mon, Jan 31, 2005 at 01:40:04PM -0600, Brian King wrote: CC'ing the linux-pci mailing list... thanks... This patch adds an arch hook so that individual archs can indicate if the underlying system supports expanded config space accesses or not. @@ -653,6 +653,8 @@ static int pci_cfg_space_size(struct pci goto fail; } +if (!pcibios_exp_cfg_space(dev)) +goto fail; if (pci_read_config_dword(dev, 256, status) != PCIBIOS_SUCCESSFUL) goto fail; pci_read_config_dword lands in arch specific code. See drivers/pci/access.c:PCI_OP_READ() macro. I'm missing what pcibios_exp_cfg_space() does that can't be handled by the bus_ops supplied by pci_scan_bus(). I would expect the pci_read_config_dword to fail for being out of bounds. Is that wrong? Or is bus_ops not feasible in this case because pcibios needs access to pci_dev? If it's feasible, maybe the right place to add this hook is to pci_read_config_dword which is also handed the pci_dev. And add another function pointer to bus_ops (which could be NULL) to check chipset support for Expanded Config space before calling pci_bus_read_config_dword. Thats cleaner than adding a hook before each use of pci_read_config_dword. hth, grant
Re: [patch 2.6.12 (repost w/ corrected subject)] pci: restore BAR values in pci_enable_device_bars
On Tue, Jul 05, 2005 at 01:46:20PM -0400, John W. Linville wrote: On Sat, Jul 02, 2005 at 01:29:54AM -0600, Grant Grundler wrote: On Thu, Jun 30, 2005 at 10:26:37PM -0400, John W. Linville wrote: + /* Some devices lose PCI config header data during D3hot-D0 Can you name some of those devices here? I just want to know what sort of devices need to be tested if this code changes in the future. I don't really have a list. The devices that brought this issue to my attention are a 3c905B and a 3c556B, both covered by the 3c59x driver. John, apologies for the late reply - been offline the past two weeks on holiday. Just listing the two devices in a comment would be sufficient. According to section 5.4.1 of the PCI BUS POWER MANAGEMENT INTERFACE SPECIFICATION, REV. 1.2, a device transitioning from D3hot to D0 _may_ perform an internal reset, thereby going to D0 Uninitialized rather than D0 Initialized. Including the above paragraph in a comment would be a good thing. I don't know if this spec is publicly available. But even if it is, typically only a handful of people will be familiar enough with it to know where to look in it. Since this behaviour is ratified by the spec, I think we need to accomodate it. Yes - sounds reasonable to me too. A bit in the PMCSR register indicates how a device will behave in this regard. We could have a test to only execute the BAR restoration for those devices that seem to need it. I left that out because it seemed to add needless complexity. In the meantime the patch has gotten bigger and more complex, so maybe that code doesn't make it any worse. Do you want me to add that? I think I'd keep it simpler until someone proves we need it. I've read the rest of the thread and don't recall any such proof. +transition. Since some firmware leaves devices in D3hot +state at boot, this information needs to be restored. Again, which firmware? Examples are good since it makes it possible to track down the offending devices for testing. The Thinkpad T21 does this. I don't know of any others specifically, but it seems like something laptop BIOSes would be likely to do. That's fine - just listing the Thinkpad T21 in a comment is helpful. If you happen to know the firmware version too, that would be even better. The following chunk looks like it will have issues with 64-bit BARs: As RMK pointed-out, this code is inspired by setup-res.c; specifically that in pci_update_resource. I'd prefer not to blaze any new trails regarding 64-bit BAR support ATM... :-) After thinking about this more, I'm convinced it's broken if a 64-bit BAR is present on the PCI device. It doesn't matter if the MMIO value is greater than 4GB or not. The problem is pci_dev-resource[i] does NOT map 1:1 with PCI_BASE_ADDRESS_0+(i*4). So, is the current patch acceptable? I don't think so. 64-bit BARs are just too common today. One solution is to use a seperate variable to track the offset into PCI config space. ie use i to walk through pci_dev-resource[] and add unsigned int pcibar_offset to keep track of 32 vs 64-bit BARs. Or shall I add the check for the no soft reset bit in the PMCSR register? I don't see why that's necessary. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 8/8] pci and yenta: pcibios_bus_to_resource
On Tue, Jul 12, 2005 at 12:21:38AM +0200, Dominik Brodowski wrote: In yenta_socket, we default to using the resource setting of the CardBus bridge. However, this is a PCI-bus-centric view of resources and thus needs to be converted to generic resources first. Therefore, add a call to pcibios_bus_to_resource() call in between. This function is a mere wrapper on x86 and friends, however on some others it already exists, is added in this patch (alpha, arm, ppc, ppc64) or still needs to be provided (parisc -- where is its pcibios_resource_to_bus() ?). in arch/parisc/kernel/pci.c? At least, it seems to be present in the 2.6.13-rc1 tree on cvs.parisc-linux.org tree. Arnaldo De Carmelo had add-on pci-pcmcia cards working in his a500 (64-bit w/IOMMU PA-RISC) last year. ISTR a few other people have similar cards working for on B180 workstation (32-bit w/o IOMMU PARISC). grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6] remove PCI_BRIDGE_CTL_VGA handling from setup-bus.c
On Thu, Jul 14, 2005 at 02:53:27PM +0100, Russell King wrote: On Thu, Jul 14, 2005 at 03:53:44PM +0400, Ivan Kokshaysky wrote: The setup-bus code doesn't work correctly for configurations with more than one display adapter in the same PCI domain. This stuff actually is a leftover of an early 2.4 PCI setup code and apparently it stopped working after some bridge_ctl changes. So the best thing we can do is just to remove it and rely on the fact that any firmware *has* to configure VGA port forwarding for the boot display device properly. What happens when there is no firmware? I helped test/add bridge_ctl patch but PARISC general does NOT support VGA at this time. I'm sure this code would not have been added had there not been a reason for it. Do we know why it was added? It was a replacement for the previous hacks and should represent essentially the same functionality. I suspect we just didn't care about (or test) multiheaded gfx at the time. Certainly not on parisc. This was in 2000/2001 timeframe originally. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.13-rc1 05/10] IOCHK interface for I/O error handling/detecting
On Wed, Jul 06, 2005 at 02:11:42PM +0900, Hidetoshi Seto wrote: [This is 5 of 10 patches, iochk-05-check_bridge.patch] ... It means that A or B hits a bus error, but there is no data which one actually hits the error. So, C should notify the error to both of A and B, and clear the H's status to start its own I/Os. If there are only two devices, it become more simple. It is clear if one find a bridge error while another is check-in, the error is nothing except for another's. Sorry, I don't understand this last paragraph. I don't see how it's more simple with two devices (vs three) if we don't exactly know which device caused the error. I thought one still needed to reset/restart both devices. Is that correct? The devices operate asyncronously from the drivers. Only the driver can tell us for sure if IO was in flight for a particular device and decide that a device could NOT have generated an error. Otherwise, so far, the patches look fine to me. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] pcibios_bus_to_resource for parisc [Was: Re: [PATCH 8/8] pci and yenta: pcibios_bus_to_resource]
On Sat, Jul 23, 2005 at 09:54:11PM +0200, Dominik Brodowski wrote: Oh, yes, I seem to have missed it. Sorry. Does this patch look good? Yes. Acked-by: Grant Grundler [EMAIL PROTECTED] I'll commit this to the cvs.parisc-linux.org tree as well. Willy can let me deal with the collision if it's not trivial on his next merge. thanks, grant Add pcibios_bus_to_resource for parisc. Signed-off-by: Dominik Brodowski [EMAIL PROTECTED] Index: 2.6.13-rc3-git2/arch/parisc/kernel/pci.c === --- 2.6.13-rc3-git2.orig/arch/parisc/kernel/pci.c +++ 2.6.13-rc3-git2/arch/parisc/kernel/pci.c @@ -255,8 +255,26 @@ void __devinit pcibios_resource_to_bus(s pcibios_link_hba_resources(hba-lmmio_space, bus-resource[1]); } +void pcibios_bus_to_resource(struct pci_dev *dev, struct resource *res, + struct pci_bus_region *region) +{ + struct pci_bus *bus = dev-bus; + struct pci_hba_data *hba = HBA_DATA(bus-bridge-platform_data); + + if (res-flags IORESOURCE_MEM) { + res-start = PCI_HOST_ADDR(hba, region-start); + res-end = PCI_HOST_ADDR(hba, region-end); + } + + if (res-flags IORESOURCE_IO) { + res-start = region-start; + res-end = region-end; + } +} + #ifdef CONFIG_HOTPLUG EXPORT_SYMBOL(pcibios_resource_to_bus); +EXPORT_SYMBOL(pcibios_bus_to_resource); #endif /* - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver
Tom, A co-worker made the following observation (I'm paraphrasing): ...this proposal does not deal with the Error Reporting ECN. For example, they do not show the advisory non-fatal bit in the correctable error status register. I believe he is referring to the Error Clarifications ECN: http://www.pcisig.com/specifications/pciexpress/ECN_-_Error_Clarifications.pdf Looks like all PCI-E ECNs are available [just not the original docs :^( ]: http://www.pcisig.com/specifications/pciexpress/specifications hth, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver
On Tue, Mar 15, 2005 at 01:54:32PM -0800, Nguyen, Tom L wrote: On Tuesday, March 15, 2005 12:12 PM Grant Grundler wrote: Tom, A co-worker made the following observation (I'm paraphrasing): ...this proposal does not deal with the Error Reporting ECN. For example, they do not show the advisory non-fatal bit in the correctable error status register. Does he refer to the ECN update on the Received Error Bit[0] of the Correctable Error Status Register and on the Training Error Bit[0] of the Uncorrectable Error Status Register? If not, please clarify his comments for us. Yes - I believe so. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/6] PCI Express Advanced Error Reporting Driver
On Tue, Mar 15, 2005 at 04:51:01PM -0600, Linas Vepstas wrote: Hi, On Fri, Mar 11, 2005 at 04:12:18PM -0800, long was heard to remark: +void hw_aer_unregister(void) +{ + struct pci_dev *dev = (struct pci_dev*)host-dev; I'm more nervous about host being defined as a global instead of being passed in. I've not review the other code and don't know if that's safe. + unsigned short id; + + id = (dev-bus-number 8) | dev-devfn; + + /* Unregister with AER Root driver */ + pcie_aer_unregister(id); +} I don't understand how this can work on a system with more than one domain. On any midrange/high-end system, you'll have a number of devices with identical values for (bus-number 8) | devfn) Yes - this is an error reported within a particular domain. I'm expecting host- to refer to a particular domain. Maybe it doesn't? [ example deleted ] Or am I being stupid/dense/all-of-the-above? Probably not. grant --linas - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver
On Tue, Mar 15, 2005 at 01:11:39PM -0700, Grant Grundler wrote: Tom, A co-worker made the following observation (I'm paraphrasing): ...this proposal does not deal with the Error Reporting ECN. For example, they do not show the advisory non-fatal bit in the correctable error status register. I believe he is referring to the Error Clarifications ECN: http://www.pcisig.com/specifications/pciexpress/ECN_-_Error_Clarifications.pdf Tom, Sorry - I got this wrong. He was referring to an unpublished draft Error Reporting ECN. You'll have to talk to Intel's PCI-SIG representative to get a copy. [ Ugh. And everyone else is SOL - sorry ] I'm annoyed he wanted me to raise this in a public forum without having a public document to point at. And I'm annoyed at myself for being lazy and not verifying that before hand... sorry, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PCI Error Recovery API Proposal. (WAS:: [PATCH/RFC]PCIErrorRecovery)
On Fri, Mar 18, 2005 at 09:24:02AM -0800, Nguyen, Tom L wrote: Likewise, with EEH the device driver could take recovery action on its own. But we don't want to end up with multiple sets of recovery code in drivers, if possible. Also we want the recovery code to be as simple as possible, otherwise driver authors will get it wrong. Drivers own their devices register sets. Therefore if there are any vendor unique actions that can be taken by the driver to recovery we expect the driver to do so. ... All drivers also need to cleanup driver state if they can't simply recover (and restart pending IOs). ie they need to release DMA resources and return suitable errors for pending requests. I would see the AER driver as being included in the platform code. The AER driver would be be closely involved in the recovery process. Our goal is to have the AER driver be part of the general code base because it is based on a PCI SIG specification that can be implemented across all architectures. To the driver writer, it's all platform code. Folks who maintain PCI (and other) services differentiate between generic and arch/platform specific. Think first like a driver writer and then worry about if/how that can be divided between platform generic and platform/arch specific code. Even PCI-Express has *some* arch specific component. At a minimum each architecture has it's own chipset and firmware to deal with for PCI Express bus discovery and initialization. But driver writers don't have to worry about that and they shouldn't for error recovery either. For a FATAL error the link is unreliable. This means MMIO operations may or may not succeed. That is why the reset is performed by the upstream port driver. The interface to that is reliable. A reset of an upstream port will propagate to all downstream links. So we need an interface to the bus/port driver to request a reset on its downstream link. We don't want the AER driver writing port bus driver bridge control registers. We are trying to keep the ownership of the devices register read/write within the domain of the devices driver. In our case the port bus driver. A port bus driver does NOT sound like a normal device driver. If PCI Express defines a standard register set for a bridge device (like PCI COnfig space for PCI-PCI Bridges), then I don't see a problem with PCI-Express error handling code mucking with those registers. Look at how PCI-PCI bridges are supported today and which bits of code poke registers on PCI-PCI Bridges. hth, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/6] PCI Express Advanced Error Reporting Driver
On Tue, Mar 15, 2005 at 07:12:07PM -0700, Grant Grundler wrote: ... He was referring to an unpublished draft Error Reporting ECN. You'll have to talk to Intel's PCI-SIG representative to get a copy. Good News: the Error Reporting ECN is now posted on the PCISIG website. http://www.pcisig.com/specifications/pciexpress/specifications/ECN_Error_Reporting_050127_clean.pdf Tom, please review and see if/how that changes your implementation. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 64bit build of tulip driver
On Thu, Mar 31, 2005 at 07:52:06PM -0800, Jim Gifford wrote: Grant Thanx for your feedback. I got it working, but I don't think the patch is the best. Here is the patch, and the information, but if you can recommend a different way to fix it, let me know. I can not reccomend one. I can suggest other things to try since I'm very skeptical this patch will get accepted by the maintainer (Jeff Garzik). He's normally wants a much better explanation of the problem than this works. The patch was done by Peter Horton. Here is the link to the full patch, http://ftp.jg555.com/patches/raq2/linux/linux-2.6.11.6-raq2_fix-2.patch but here is the section for this issue Jim, You have other changes to tulip_core.c: + /* Avoid a chip errata by prefixing a dummy entr y. Don't do + this on the ULI526X as it triggers a differen t problem */ Picking a few nits: o comment extends past 80 columns - please wrap before 80 columns o *Which* chip errata? o *Which* other problem? o I prefer diffs with -p when reviewing patches so I know which function is getting mangled. - /* No media table either */ - tp-flags = ~HAS_MEDIA_TABLE; + /* Ensure our media table fixup get's applied */ + memcpy(ee_data + 16, ee_data, 8); This isn't likely to get far either unless it's better explained. You don't have to explain it to me, now. But have something handy if you want jgarzik to accept it. @@ -1628,6 +1631,16 @@ } } +#if defined(CONFIG_MIPS_COBALT) defined(CONFIG_MIPS64) +/* + * something very bad is happening. without this + * cache flush the PHY can't be read. I've tried + * various ins outs, delays etc but only a call + * to printk or this flush seems to fix it ... help! + */ +flush_cache_all(); +#endif The code immediately before this calls tulip_select_media(). Code paths exist in tulip_select_media() where the last thing the driver does to the NIC is io_write(). This could easily be a posted write flush problem. Does replacing flush_cache_all() with ioread32(ioaddr + CSR12) also work? Can you find out how long one has to wait after banging on CSR12 before it's safe to call tulip_find_mii()? How long does flush_cache_all() take in microseconds? It's possible this is a very fast PPC chip and it's executing the code path between tulip_select_media() and tulip_find_mii() faster than the chips can finish dealing with the writes to CSR12. I'd consider this issue if flushing posted PCI writes doesn't help. The tulip changes I maintain in parisc-linux port deal with similar issues where the driver is not following the specified timing requirements. Search google for tulip 802.3 22.2.4 Management functions or look into http://cvs.parisc-linux.org/linux-2.6/. + /* Find the connected MII xcvrs. Doing this in open() would allow detecting external xcvrs later, but takes much time. */ Are there any config option differences? e.g. MWI or MMIO options enabled on 64-bit but not 32-bit? I verified that there are no differences. ok. thanks. ... Applied the patch, here is the output :00:07.0: tulip_stop_rxtx() failed (CSR5 0xf066 CSR6 0xb3862002) ... Sorry, I don't have time to decode what these mean right now. But I think the publicly available tulip chips docs sufficiently explain what the registers mean and what state the chip is in. I was able to get some more information on the bootup sequence with the updates. Here is the output now from the driver Linux Tulip driver version 1.1.13 (May 11, 2002) PCI: Enabling device :00:07.0 (0045 - 0047) tulip0: Old format EEPROM on 'Cobalt Microserver' board. Using substitute media control info. tulip0: EEPROM default media type Autosense. tulip0: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block. tulip0: ***WARNING***: No MII transceiver found! ok. I assume this is unpatched. thanks, grant eth0: Digital DS21143 Tulip rev 65 at b0001400, 00:10:E0:00:32:DE, IRQ 19. PCI: Enabling device :00:0c.0 (0005 - 0007) tulip1: Old format EEPROM on 'Cobalt Microserver' board. Using substitute media control info. tulip1: EEPROM default media type Autosense. tulip1: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block. tulip1: ***WARNING***: No MII transceiver found! eth1: Digital DS21143 Tulip rev 65 at b0001480, 00:10:E0:00:32:DF, IRQ 20. -- Jim Gifford [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 64bit build of tulip driver
On Fri, Apr 01, 2005 at 08:46:33AM -0800, Jim Gifford wrote: Code paths exist in tulip_select_media() where the last thing the driver does to the NIC is io_write(). This could easily be a posted write flush problem. Does replacing flush_cache_all() with ioread32(ioaddr + CSR12) also work? The code immediately before this calls tulip_select_media(). Didn't work, Can you try replacing flush_cache_all() with the following? ioread32(ioaddr + CSR12); udelay(500);/* random delay until someone looks up what is spec'd */ I'm going to revert back and try your code and see if it fixes the issue. Erm...the code in parisc-linux tree won't have the COBALT hacks. You might try adding selective bits from the parisc-linux tulip. That fact the flush_cache_all() changes the behavior made me wonder if IORESOURCE_CACHEABLE is set in the pci resource. But that doesn't seem to matter for ppc (32 or 64). Notes on what I learned below. arch/ppc64/kernel/iomap.c doesn't look at that flag. arch/ppc/kernel/io.c:pci_ioremap() has the nice comment: if (flags IORESOURCE_MEM) /* Not checking IORESOURCE_CACHEABLE because PPC does * not currently distinguish between ioremap and * ioremap_nocache. */ return ioremap(start, len); ioremap resolves to: void __iomem * ioremap64(unsigned long long addr, unsigned long size) { return __ioremap(addr, size, _PAGE_NO_CACHE); } I *think* (too many ifdefs) ppc64 does the same in arch/ppc64/mm/init.c. Cacheing is clear not an issue for accessing MMIO space via pci_iomap(). grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 64bit build of tulip driver
On Fri, Apr 01, 2005 at 12:23:25PM -0800, Jim Gifford wrote: Grant, Thank you, I took your driver as a reference and added in the cobalt specifics to the eeprom.c file, works perfectly now. Cool! very welcome. Can you do me a favor and submit a diff of all the tulip changes you have at this point back to lkml (and whatever other lists are cc'd)? jgarzik might accept your bits and ignore the parts that have been submitted/rejected before. But whatever you post will get archived with this thread for others to find in the future. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Avoiding fragmentation through different allocator
On Mon, Jan 24, 2005 at 10:29:52AM -0200, Marcelo Tosatti wrote: Grant Grundler and James Bottomley have been working on this area, they might want to add some comments to this discussion. It seems HP (Grant et all) has pursued using big pages on IA64 (64K) for this purpose. Marcello, That might have been Alex Williamson...but the reasons for 64K pages is to reduce TLB thrashing, not faster IO. On HP ZX1 boxes, SG performance is slightly better (max +5%) when going through the IOMMU than when bypassing it. The IOMMU can perfectly coalesce DMA pages but has a small CPU and DMA cost to do so as well. Otherwise, I totally agree with James. IO devices do scatter-gather pretty well and IO subsystems are tuned for page-size chunk or smaller anyway. ... I could keep digging, but I think the bottom line is that having large pages generally available rather than a fixed setting is desirable. Definately, yes. Thanks for the pointers. Big pages are good for CPU TLB and that's where most of the research has been done. I think IO devices have learned to cope with the fact the alot less has been (or can be for many workloads) done to coalesce IO pages. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Avoiding fragmentation through different allocator
On Tue, Jan 25, 2005 at 09:02:34AM -0500, Mukker, Atul wrote: The megaraid driver is open source, do you see anything that driver can do to improve performance. We would greatly appreciate any feedback in this regard and definitely incorporate in the driver. The FW under Linux and windows is same, so I do not see how the megaraid stack should perform differently under Linux and windows? Just to second what Andy already stated: it's more likely the Megaraid firmware could be better at fetching the SG lists. This is a difficult problem since the firmware needs to work well on so many different platforms/chipsets. If LSI has time to turn more stones, get a PCI bus analyzer and filter it to only capture CPU MMIO traffic and DMA traffic to/from some well known SG lists (ie instrument the driver to print those to the console). Then run AIM7 or similar multithreaded workload. A perfect PCI trace will show the device pulling the SG list in cacheline at time after the CPU MMIO reads/writes from the card to indicate a new transaction is ready to go. Another stone LSI could turn is to verify the megaraid controller is NOT contending with the CPU for cachelines used to build SG lists. This something the driver controls but I only know how to measure this on ia64 machines (with pfmon or caliper or similar tool). If you want examples, see http://iou.parisc-linux.org/ols2004/pfmon_for_iodorks.pdf In case it's not clear from above, optimal IO flow means the device is moving control data and streaming data in cacheline or bigger units. If Megaraid is already doing that, then the PCI trace timing info should point at where the latencies are. hth, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Fwd: Patch to control VGA bus routing and active VGA device.
On Thu, Jan 27, 2005 at 08:28:43AM -0800, Jesse Barnes wrote: But then again, I suppose if a platform supports more than one legacy I/O space, Eh?! there can only be *one* legacy I/O space. We can support multipl IO port spaces, but only one can be the legacy. Moving the VGA device can only function within that legacy space the way the code is written now (using hard coded addresses). If it is intended to work with multiple IO Port address spaces, then it needs to use the pci_dev-resource[] and mangle that appropriately. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Fwd: Patch to control VGA bus routing and active VGA device.
On Fri, Jan 28, 2005 at 01:36:48PM -0500, Jon Smirl wrote: If it is intended to work with multiple IO Port address spaces, then it needs to use the pci_dev-resource[] and mangle that appropriately. Post a patch an I will incorporate it. Sorry - I only wanted to point out the short coming. I don't care if it gets fixed (or not) since I don't use or need to support multiple VGA cards. If someone else (in HP) does, it's just nice to warn them what's broken. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Fwd: Patch to control VGA bus routing and active VGA device.
On Fri, Jan 28, 2005 at 02:26:40PM -0500, Jon Smirl wrote: Next year we are going to see a lot of multiple VGAs. Depending on configuration the Nvidia4 chipset can support from one up to eight PCI Express video cards simultaneously. Oh geezsomeone is going to implement IO port space on PCI express device?! /me gets out the cluebat... grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Fwd: Patch to control VGA bus routing and active VGA device.
On Fri, Jan 28, 2005 at 10:41:41AM -0800, Jesse Barnes wrote: Eh?! there can only be *one* legacy I/O space. We can support multipl IO port spaces, but only one can be the legacy. What do you mean? If you define legacy I/O space to be 0x-0x, then yes of course you're right. Yes - exactly. But if you mean being able to access legacy ports at all, then no. On SGI machines, there's a per-bus base address that can be used as the base for port I/O, which is what I was getting at. Ok - my point was 0x3fc will get routed to exactly one of those IO port address spaces. If it is intended to work with multiple IO Port address spaces, then it needs to use the pci_dev-resource[] and mangle that appropriately. There is no resource for some of the I/O port space that cards respond to. Yes - I've heard several graphics cards are horrible broken WRT address decoding. Are PCI quirks supposed to handle that sort of thing? Another example was Xf86 was poking around in MMIO space to determine if such broken cards are installed. I can set the I/O BAR of my VGA card to 0x400 and it'll still respond to accesses at 0x3bc for example. That's what I mean by legacy space--space that cards respond to but don't report in their PCI resources. Can't PCI quirks fix up the resources to reflect this? I think one needs to fix up PCI IO Port resources to adjust for The One legacy IO port space getting routed to a different PCI segment - assuming no one submits a patch to change current behavior of using hard coded addresses. HP parisc and ia64 platforms implement seperate PCI segments under each PCI host bus controller. Linux PCI BIOS support provides the illusion it's all in one PCI segment on most (not all) platforms. Some HP chipsets also provide a Legacy IO Port space that gets routed to a chosen PCI Host bus controller. parisc PCI BIOS adds the controller instance number to the IO port space resource to help inb() generate the IO port cycle on the right PCI segment. This needs to be fixed up if we decide a different PCI segment should be segment 0 (and thus get references to 0x3fc). I expect other arches with multi-segment support to do similar fix ups. Am I making more sense now? thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Fwd: Patch to control VGA bus routing and active VGA device.
On Fri, Jan 28, 2005 at 11:41:16AM -0800, Jesse Barnes wrote: Yeah, I think I understand. We could probably do the same thing on sn2 as you do on parisc--add a 'segment 0' offset to the port so that it's routed correctly. I think that's a little less flexible than adding a new resource though, since it makes it harder for drivers to support more than one device or devices on non-segment 0 busses. To be clear, the parisc code defines this in include/asm-parisc/pci.h: #define HBA_PORT_SPACE_BITS 16 #define PCI_PORT_HBA(a)((a) HBA_PORT_SPACE_BITS) and PCI_PORT_HBA gets used in arch/parisc/kernel/pci.c. Offhand, I don't know if ia64 has the equivalent. But it sounds like it might need it. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Horrible regression with -CURRENT from Don't busy-lock-loop in preemptable spinlocks patch
On Thu, Jan 20, 2005 at 08:43:30AM +1100, Paul Mackerras wrote: I suggest read_poll(), write_poll(), spin_poll(),... Erm...those names sound way too much like existing interfaces. Perhaps check_read_lock()/check_write_lock() ? If they don't get used too much, the longer name should be ok. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mmc: dw_mmc: rewrite CLKDIV computation
Last year Seungwon Jeon (Samsung) fixed a bug in CLKDIV computation. But when debugging a related issue (http://crbug.com/221828) I found the code unreadable. This rewrite simplifies the computation and explains each step. Signed-off-by: Grant Grundler grund...@chromium.org --- Tested on Samsung Series 3 Chromebook (exynos 5250 chipset) using ChromeOS 3.4 kernel (not 3.9-rc3 which this patch is based against). I've written a test program to confirm this patch generates the same correct values and will share that separately. drivers/mmc/host/dw_mmc.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 9834221..6fdf55b 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -631,14 +631,22 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit) if (slot-clock != host-current_speed || force_clkinit) { div = host-bus_hz / slot-clock; - if (host-bus_hz % slot-clock host-bus_hz slot-clock) - /* -* move the + 1 after the divide to prevent -* over-clocking the card. + if (host-bus_hz slot-clock) { + /* don't overclock due to integer math losses */ + if ((div * slot-clock) host-bus_hz) + div++; + + /* don't overclock due to resolution of HW */ + if (div 1) + div++; + + /* See 6.2.3 CLKDIV in Mobile Storage Host Databook +* Look for dwc_mobile_storage_db.pdf from Synopsys. +* CLKDIV value 0 means divisor 1, val 1 - 2, ... */ - div += 1; - - div = (host-bus_hz != slot-clock) ? DIV_ROUND_UP(div, 2) : 0; + div /= 2; + } else + div = 0;/* use bus_hz */ dev_info(slot-mmc-class_dev, Bus speed (slot %d) = %dHz (slot req %dHz, actual %dHZ -- 1.8.1.3 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mmc: dw_mmc: rewrite CLKDIV computation
I've attached the test program I wrote to compare the different flavors of CLKDIV computation: old (3.4 kernel), current upstream, and my rewrite. thanks grant On Tue, Mar 26, 2013 at 3:50 PM, Grant Grundler grund...@chromium.org wrote: Last year Seungwon Jeon (Samsung) fixed a bug in CLKDIV computation. But when debugging a related issue (http://crbug.com/221828) I found the code unreadable. This rewrite simplifies the computation and explains each step. Signed-off-by: Grant Grundler grund...@chromium.org --- Tested on Samsung Series 3 Chromebook (exynos 5250 chipset) using ChromeOS 3.4 kernel (not 3.9-rc3 which this patch is based against). I've written a test program to confirm this patch generates the same correct values and will share that separately. drivers/mmc/host/dw_mmc.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 9834221..6fdf55b 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -631,14 +631,22 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit) if (slot-clock != host-current_speed || force_clkinit) { div = host-bus_hz / slot-clock; - if (host-bus_hz % slot-clock host-bus_hz slot-clock) - /* -* move the + 1 after the divide to prevent -* over-clocking the card. + if (host-bus_hz slot-clock) { + /* don't overclock due to integer math losses */ + if ((div * slot-clock) host-bus_hz) + div++; + + /* don't overclock due to resolution of HW */ + if (div 1) + div++; + + /* See 6.2.3 CLKDIV in Mobile Storage Host Databook +* Look for dwc_mobile_storage_db.pdf from Synopsys. +* CLKDIV value 0 means divisor 1, val 1 - 2, ... */ - div += 1; - - div = (host-bus_hz != slot-clock) ? DIV_ROUND_UP(div, 2) : 0; + div /= 2; + } else + div = 0;/* use bus_hz */ dev_info(slot-mmc-class_dev, Bus speed (slot %d) = %dHz (slot req %dHz, actual %dHZ -- 1.8.1.3 #include stdio.h /* from include/linux/kernel.h */ #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d)) unsigned int original(unsigned int bus_hz, unsigned int clock) { unsigned int div; if (bus_hz % clock) /* * move the + 1 after the divide to prevent * over-clocking the card. */ div = ((bus_hz / clock) 1) + 1; else div = (bus_hz / clock) 1; return div; } unsigned int upstream(unsigned int bus_hz, unsigned int clock) { unsigned int div; div = bus_hz / clock; if (bus_hz % clock bus_hz clock) /* * move the + 1 after the divide to prevent * over-clocking the card. */ div += 1; div = (bus_hz != clock) ? DIV_ROUND_UP(div, 2) : 0; return div; } unsigned int ggg(unsigned int bus_hz, unsigned int clock) { unsigned int div; div = bus_hz / clock; if (bus_hz clock) { /* don't overclock due to integer math losses */ if ((div * clock) bus_hz) div++; /* don't overclock due to resolution of HW */ if (div 1) div++; /* See 6.2.3 CLKDIV in Mobile Storage Host Databook * Look for dwc_mobile_storage_db.pdf from Synopsys. * CLKDIV value 0 means divisor 1, value 1 - 2, val 2 - 4 etc. */ div /= 2; } else div = 0; /* use bus_hz */ return div; } unsigned int bus_hz_tbl[] = { 13, 17, 19, 20, 5000, 1}; unsigned int slot_hz_tbl[] = { 10, 13, 21, 40, 57, 40, 784314, 5200}; static unsigned int div_to_hz(unsigned int bus_hz, unsigned int d) { return d ? (bus_hz/(d*2)) : bus_hz; } static void verify_hz(unsigned int bus_hz, unsigned int requested_hz, unsigned int divided_hz, const char *algo_name) { if (divided_hz bus_hz) printf( [FAIL: %s bus hz!], algo_name); if (divided_hz requested_hz) printf( [FAIL: %s slot hz!], algo_name); } void main(void) { unsigned int i, j; printf(bus/slot hz Original Upstream GGG\n); for(i=0; i sizeof(bus_hz_tbl)/sizeof(unsigned int); i++) { unsigned int bus_hz = bus_hz_tbl[i]; for (j=0; j sizeof(slot_hz_tbl)/sizeof(unsigned int); j++) { unsigned int slot_hz = slot_hz_tbl[j]; unsigned int x = original(bus_hz, slot_hz); unsigned int y = upstream(bus_hz, slot_hz); unsigned int z = ggg(bus_hz, slot_hz); unsigned int hz_x, hz_y, hz_z; hz_x = div_to_hz(bus_hz, x); hz_y = div_to_hz(bus_hz, y); hz_z = div_to_hz(bus_hz, z); printf(%8d/%6d %7d %6d %7d %6d %7d %6d, bus_hz, slot_hz, x, hz_x, y, hz_y, z, hz_z); verify_hz(bus_hz, slot_hz, hz_x, Original); verify_hz(bus_hz
Re: [PATCH] mmc: dw_mmc: rewrite CLKDIV computation
Sorry - please ignore the previous version. Did not include a copyright (implied to be mine...but it's not) nor license. Same thing but with proper attribution. cheers, grant On Tue, Mar 26, 2013 at 3:53 PM, Grant Grundler grund...@chromium.org wrote: I've attached the test program I wrote to compare the different flavors of CLKDIV computation: old (3.4 kernel), current upstream, and my rewrite. thanks grant On Tue, Mar 26, 2013 at 3:50 PM, Grant Grundler grund...@chromium.org wrote: Last year Seungwon Jeon (Samsung) fixed a bug in CLKDIV computation. But when debugging a related issue (http://crbug.com/221828) I found the code unreadable. This rewrite simplifies the computation and explains each step. Signed-off-by: Grant Grundler grund...@chromium.org --- Tested on Samsung Series 3 Chromebook (exynos 5250 chipset) using ChromeOS 3.4 kernel (not 3.9-rc3 which this patch is based against). I've written a test program to confirm this patch generates the same correct values and will share that separately. drivers/mmc/host/dw_mmc.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 9834221..6fdf55b 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -631,14 +631,22 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit) if (slot-clock != host-current_speed || force_clkinit) { div = host-bus_hz / slot-clock; - if (host-bus_hz % slot-clock host-bus_hz slot-clock) - /* -* move the + 1 after the divide to prevent -* over-clocking the card. + if (host-bus_hz slot-clock) { + /* don't overclock due to integer math losses */ + if ((div * slot-clock) host-bus_hz) + div++; + + /* don't overclock due to resolution of HW */ + if (div 1) + div++; + + /* See 6.2.3 CLKDIV in Mobile Storage Host Databook +* Look for dwc_mobile_storage_db.pdf from Synopsys. +* CLKDIV value 0 means divisor 1, val 1 - 2, ... */ - div += 1; - - div = (host-bus_hz != slot-clock) ? DIV_ROUND_UP(div, 2) : 0; + div /= 2; + } else + div = 0;/* use bus_hz */ dev_info(slot-mmc-class_dev, Bus speed (slot %d) = %dHz (slot req %dHz, actual %dHZ -- 1.8.1.3 /* Test dw_mmc driver CLKDIV computation. * Copyright (c) 2013 The Chromium OS Authors. All rights reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ #include stdio.h /* from include/linux/kernel.h */ #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d)) unsigned int original(unsigned int bus_hz, unsigned int clock) { unsigned int div; if (bus_hz % clock) /* * move the + 1 after the divide to prevent * over-clocking the card. */ div = ((bus_hz / clock) 1) + 1; else div = (bus_hz / clock) 1; return div; } unsigned int upstream(unsigned int bus_hz, unsigned int clock) { unsigned int div; div = bus_hz / clock; if (bus_hz % clock bus_hz clock) /* * move the + 1 after the divide to prevent * over-clocking the card. */ div += 1; div = (bus_hz != clock) ? DIV_ROUND_UP(div, 2) : 0; return div; } unsigned int ggg(unsigned int bus_hz, unsigned int clock) { unsigned int div; div = bus_hz / clock; if (bus_hz clock) { /* don't overclock due to integer math losses */ if ((div * clock) bus_hz) div++; /* don't overclock due to resolution of HW */ if (div 1) div++; /* See 6.2.3 CLKDIV in Mobile Storage Host Databook * Look for dwc_mobile_storage_db.pdf from Synopsys. * CLKDIV value 0 means divisor 1, value 1 - 2, val 2 - 4 etc. */ div /= 2; } else div = 0; /* use bus_hz */ return div; } unsigned int bus_hz_tbl[] = { 13, 17, 19, 20, 5000, 1}; unsigned int slot_hz_tbl[] = { 10, 13, 21, 40, 57, 40, 784314, 5200}; static unsigned
Re: [PATCH] mmc: dw_mmc: rewrite CLKDIV computation
On Wed, Mar 27, 2013 at 5:25 AM, Seungwon Jeon tgih@samsung.com wrote: On Wednesday, March 27, 2013, Grant Grundler wrote: Last year Seungwon Jeon (Samsung) fixed a bug in CLKDIV computation. For easily identifying, it would be good to point the commit id and subject. commit id will be different for different git trees and branches - not as useful as one might think. I will include the following and update the patch: Author: Seungwon Jeon tgih@samsung.com Date: Tue May 22 13:01:21 2012 +0900 mmc: dw_mmc: correct the calculation for CLKDIV But when debugging a related issue (http://crbug.com/221828) I found It is not easy to catch up issue in your site. What problem is bothering you? Could you describe the problem and conditions you have found? The bug is not relevant to this patch - it's a related bug. I mentioned the bug only to explain my motivation for looking at this code. I will move the bug reference out of the commit message. To summarize, I was trying to backport mmc: dw_mmc: correct the calculation for CLKDIV patch to 3.4 kernel to support faster eMMC parts since the original computation in 3.4 kernel was returning wrong CLKDIV value. But then ran into other problems specific to one firmware combination breaking the eMMC clock settings. cheers, grant Thanks, Seungwon Jeon the code unreadable. This rewrite simplifies the computation and explains each step. Signed-off-by: Grant Grundler grund...@chromium.org --- Tested on Samsung Series 3 Chromebook (exynos 5250 chipset) using ChromeOS 3.4 kernel (not 3.9-rc3 which this patch is based against). I've written a test program to confirm this patch generates the same correct values and will share that separately. drivers/mmc/host/dw_mmc.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 9834221..6fdf55b 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -631,14 +631,22 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit) if (slot-clock != host-current_speed || force_clkinit) { div = host-bus_hz / slot-clock; - if (host-bus_hz % slot-clock host-bus_hz slot-clock) - /* - * move the + 1 after the divide to prevent - * over-clocking the card. + if (host-bus_hz slot-clock) { + /* don't overclock due to integer math losses */ + if ((div * slot-clock) host-bus_hz) + div++; + + /* don't overclock due to resolution of HW */ + if (div 1) + div++; + + /* See 6.2.3 CLKDIV in Mobile Storage Host Databook + * Look for dwc_mobile_storage_db.pdf from Synopsys. + * CLKDIV value 0 means divisor 1, val 1 - 2, ... */ - div += 1; - - div = (host-bus_hz != slot-clock) ? DIV_ROUND_UP(div, 2) : 0; + div /= 2; + } else + div = 0;/* use bus_hz */ dev_info(slot-mmc-class_dev, Bus speed (slot %d) = %dHz (slot req %dHz, actual %dHZ -- 1.8.1.3 -- To unsubscribe from this list: send the line unsubscribe linux-mmc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mmc: dw_mmc: rewrite CLKDIV computation
On Wed, Mar 27, 2013 at 8:07 AM, Doug Anderson diand...@chromium.org wrote: Grant, Thanks for posting! See below... thanks for reviewing/feedback! :) On Tue, Mar 26, 2013 at 3:50 PM, Grant Grundler grund...@chromium.org wrote: Last year Seungwon Jeon (Samsung) fixed a bug in CLKDIV computation. But when debugging a related issue (http://crbug.com/221828) I found the code unreadable. This rewrite simplifies the computation and explains each step. The fact that you mention a bug here is confusing. I think this patch has nothing to do with fixing that bug and is just a readability patch. Is that correct? Correct. related implies not the same. But you are right - the reference is causing confusion. I will move the reference to the bug out of the commit log to below the '---' area of the patch. Please add to the description if so and maybe remove unrelated comment about the bug. Thanks! Will do and repost later today. ... + /* See 6.2.3 CLKDIV in Mobile Storage Host Databook +* Look for dwc_mobile_storage_db.pdf from Synopsys. +* CLKDIV value 0 means divisor 1, val 1 - 2, ... You are quoting exynos5250 docs here. This driver is used for more than just exynos and so this could be confusing to users on other platforms. I'm quoting Synopsys docs - that's the origin of this HW's ip. You and I looked at exynos5250 docs originally and they say exactly the same thing. But the section numbers are different. */ - div += 1; - - div = (host-bus_hz != slot-clock) ? DIV_ROUND_UP(div, 2) : 0; + div /= 2; It does look like you're re-implementing DIV_ROUND_UP. Yes, it does look like that but by breaking it out into simple steps AND explaining why we do each step, the code becomes maintainable by normal developers. The comments are key to *quickly* understanding the code in this case. Maybe replace your if test and division with just a DIV_ROUND_UP? No. I'd rather just drop the patch. This code can and should be stupid simple. DIV_ROUND_UP just makes it harder to understand and impossible to document as clearly. cheers, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mmc: dw_mmc: rewrite CLKDIV computation
Hi Chris, On Wed, Mar 27, 2013 at 10:58 AM, Chris Ball c...@laptop.org wrote: Hi Grant, ... Please use the following (standard) syntax in the commit message: Commit e419990b5e8 (mmc: dw_mmc: correct the calculation for CLKDIV) fixed a bug in CLKDIV computation. [..] Ok - I didn't know that was standard. But easy enough to do. cheers, grant -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH V2] mmc: dw_mmc: rewrite CLKDIV computation
When backporting Commit e419990b5e8 (mmc: dw_mmc: correct the calculation for CLKDIV) to 3.4 kernel and debugging a FW issue, I found the code unreadable. This rewrite simplifies the computation and explains each step. Signed-off-by: Grant Grundler grund...@chromium.org --- V2: rewrote commit msg per feedback Tested on Samsung Series 3 Chromebook (exynos 5250 chipset) using ChromeOS 3.4 kernel (not 3.9-rc3 which this patch is based against). I've posted the test_dw_mmc.c source to confirm this patch generates the same correct values. The CLKDIV issue I was trying to resolve in ChromeOS 3.4 kernel: http://crbug.com/221828 drivers/mmc/host/dw_mmc.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 9834221..6fdf55b 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -631,14 +631,22 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot, bool force_clkinit) if (slot-clock != host-current_speed || force_clkinit) { div = host-bus_hz / slot-clock; - if (host-bus_hz % slot-clock host-bus_hz slot-clock) - /* -* move the + 1 after the divide to prevent -* over-clocking the card. + if (host-bus_hz slot-clock) { + /* don't overclock due to integer math losses */ + if ((div * slot-clock) host-bus_hz) + div++; + + /* don't overclock due to resolution of HW */ + if (div 1) + div++; + + /* See 6.2.3 CLKDIV in Mobile Storage Host Databook +* Look for dwc_mobile_storage_db.pdf from Synopsys. +* CLKDIV value 0 means divisor 1, val 1 - 2, ... */ - div += 1; - - div = (host-bus_hz != slot-clock) ? DIV_ROUND_UP(div, 2) : 0; + div /= 2; + } else + div = 0;/* use bus_hz */ dev_info(slot-mmc-class_dev, Bus speed (slot %d) = %dHz (slot req %dHz, actual %dHZ -- 1.8.1.3 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.13] IOCHK interface for I/O error handling/detecting (for ia64)
On Thu, Sep 01, 2005 at 05:45:54PM -0500, Brent Casavant wrote: ... The first is serialization of all I/O reads and writes. This will be a severe problem on systems with large numbers of PCI buses, the very type of system that stands the most to gain in reliability from these efforts. At a minimum any locking should be done on a per-bus basis. The lock could be per error domain - that would require some arch specific support though to define the scope of the error domain. The second is the raw performance penalty from acquiring and dropping a lock with every read and write. This will be a substantial amount of activity for any I/O-intensive system, heck even for moderate I/O levels. Sorry - I think this is BS. Please run mmio_test on your box and share the results. mmio_test is available here: svn co http://svn.gnumonks.org/trunk/mmio_test/ Then we can talk about the cost of spinlocks vs cost of MMIO access. thanks, grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] pci: Block config access during BIST (resend)
On Fri, Sep 02, 2005 at 06:56:35PM -0500, Brian King wrote: Andrew Morton wrote: Brian King [EMAIL PROTECTED] wrote: +void pci_block_user_cfg_access(struct pci_dev *dev) +{ + unsigned long flags; + + pci_save_state(dev); + spin_lock_irqsave(pci_lock, flags); + dev-block_ucfg_access = 1; + spin_unlock_irqrestore(pci_lock, flags); Are you sure the locking in here is meaningful? All it will really do is give you a couple of barriers. Actually, it is meaningful. It synchronizes the blocking of pci config accesses with other pci config accesses that may be going on when this function is called. Without the locking, the API cannot guarantee that no further user initiated PCI config accesses will be initiated after this function is called. I don't have the impression you understood what Andrew wrote. dev-block_ucfg_access = 1 is essentially an atomic operation. AFAIK, Use of the pci_lock doesn't solve any race conditions that mb() wouldn't solve. If you had: spin_lock_irqsave(pci_lock, flags); pci_save_state(dev); dev-block_ucfg_access = 1; spin_unlock_irqrestore(pci_lock, flags); Then I could buy your arguement since the flag now implies we need to atomically save state and set the flag. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.13] IOCHK interface for I/O error handling/detecting (for ia64)
On Fri, Sep 02, 2005 at 11:16:10AM -0700, david mosberger wrote: Sorry - I think this is BS. Please run mmio_test on your box and share the results. mmio_test is available here: svn co http://svn.gnumonks.org/trunk/mmio_test/ Reads are slow, sure, but writes are not (or should not). Sure, MMIO writes are generally posted. But those aren't always free. At some point, I expect MMIO reads typically will flush those writes and thus stall until 2 (or more) PCI bus transactions complete. ISTR locking around MMIO writes was necessary if the box to enforce syncronization of the error with the driver. ISTR this syncronization was neccessary. Was that wrong? Complicating the MMIO perf picture are fabrics connecting the NUMA cells which do NOT enforce MMIO ordering (e.g. Altix). In that case, arch code will sometimes need to enforce the write ordering by flushing MMIO writes before a driver releases a spinlock or other syncronization primitive. This was discussed before and is archived in the dialog between Jesse Barns and myself in late 2004 (IIRC). In any case, mmio_test currently only tests MMIO read perf. I need to think about how we might also test MMIO write perf. Ie how much more expensive is MMIO read when it follows an MMIO write. grant - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/