Re: [PATCH] 3c59x: read current link status from phy
On Thu, 8 Sep 2005, Andy Fleming wrote:

> The new PHY Layer (drivers/net/phy/*) can provide all these features
> for you without much difficulty, I suspect.

As pointed out by Andrew a few days ago, this driver supports a lot of chips - for most of them the test hardware would be hard to come by and the documentation even more so. Unless you'd like to do it based on "whoever is interested should cry loud"...

> The layer supports handling the interrupts for you, or (if it's shared
> with your controller's interrupt)

Yes, there is only one interrupt, which is used for data transmission (both Tx and Rx), statistics, errors and (for those chips that support it) link state change.

> Is the cost of an extra read every minute really too high?

You probably didn't look at the code. The MII registers are not exposed in the PCI space; they need to be accessed through a serial protocol, such that each MII register read in fact takes about 200 outw and inw/inl operations in total.

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
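To make the cost argument above concrete, here is a minimal sketch that only counts the port accesses a bit-banged MII management read would need. It is not the 3c59x code: the helper names are hypothetical, the counters stand in for outw() and inw()/inl(), and the use of a dummy read as a delay after every clock edge is an assumption of the sketch. The frame layout (32 preamble bits, 14 start/opcode/address bits, 2 turnaround plus 16 data bits) is the standard MII management frame.

```c
#include <assert.h>

/* Counts simulated port accesses; nothing here touches real hardware. */
struct io_counter {
    unsigned writes;  /* stands in for outw() calls */
    unsigned reads;   /* stands in for inw()/inl() calls */
};

/* Drive one bit onto the management data line: two clock edges, each
 * followed by a dummy read used as a delay (an assumption of this sketch). */
static void clock_out_bit(struct io_counter *c)
{
    c->writes++;  /* data bit, clock low */
    c->reads++;   /* delay */
    c->writes++;  /* data bit, clock high */
    c->reads++;   /* delay */
}

/* Sample one bit driven by the PHY: same clocking plus one data read. */
static void clock_in_bit(struct io_counter *c)
{
    c->writes++;  /* release data line, clock low */
    c->reads++;   /* delay */
    c->reads++;   /* sample the data bit */
    c->writes++;  /* clock high */
    c->reads++;   /* delay */
}

/* Total port accesses for one register read with the given frame layout. */
static unsigned mdio_read_cost(int preamble_bits, int command_bits,
                               int response_bits)
{
    struct io_counter c = { 0, 0 };
    int i;

    assert(preamble_bits >= 0 && command_bits >= 0 && response_bits >= 0);
    for (i = 0; i < preamble_bits; i++)
        clock_out_bit(&c);
    for (i = 0; i < command_bits; i++)
        clock_out_bit(&c);
    for (i = 0; i < response_bits; i++)
        clock_in_bit(&c);
    return c.writes + c.reads;
}
```

Under these assumptions mdio_read_cost(32, 14, 18) gives 274 accesses with the preamble and mdio_read_cost(0, 14, 18) gives 146 without it, so the "about 200" figure quoted above is the right order of magnitude; the exact number depends on how a given driver clocks and delays.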
Re: [PATCH] 3c59x: read current link status from phy
On Thu, 8 Sep 2005, Tommy Christensen wrote:

> Besides, how long would you like to wait for network connectivity
> after plugging in the cable? It is now lowered from [60-120] to
> [0-60] seconds.

I now understand what the problem was, so I'll put it in words for posterity: the Link Status bit of the MII Status register needs to be read twice, first to clear the error state (link bit=0), after which the bit reports the actual value of the link. From the manual:

  This bit has a latching function. A link failure causes the bit to clear and remain clear until it has been read through the management interface.

I tested this on a Tornado chip and it works as advertised (after link is back up, first read gives 0x7829, the second 0x782d).

But I still don't agree with your solution: you are reading the Status register twice in all cases, which is wrong. What you want is to read it a second time only after the link was marked as down: a simple check if bit 2 of the Status register is 0, in which case you issue the second read. This still means that there will be 2 reads if the link remains down, but at least there is only 1 read for the case where the link is up and remains up.

> Personally, I'd prefer the delay to be < 10 seconds.

If you sample every 60 seconds ? Teach Shannon how to do it ;-)

If you mean to reduce the sampling period, there is a very good reason not to do it: these MDIO operations are expensive - it's a serial protocol. vortex_timer() might do 2 (and with the discussed change - 3) of them - there are better things to do for the CPU than wait for these I/O operations. Plus, vortex_timer() also disables the interrupt...

The Tornado and at least some Cyclone chips support generating an interrupt whenever the link changes, which can be used instead of polling for link state. This feature is not used in the 3c59x driver and could give you much less than 10 seconds accuracy - but you have to code it. ;-)

-- 
Bogdan Costescu
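The conditional second read described above can be sketched as follows. This is a user-space model, not driver code: struct fake_phy and its helpers are hypothetical and merely imitate the latching behaviour quoted from the manual, using the 0x7829/0x782d values reported for the Tornado test. Only link_is_up() shows the suggested logic.

```c
#include <assert.h>
#include <stddef.h>

#define BMSR_LSTATUS 0x0004  /* bit 2 of the MII Status register */

/* Hypothetical stand-in for a PHY with a latching Link Status bit. */
struct fake_phy {
    int link_up;       /* current physical link state */
    int latched_fail;  /* set by a link failure, cleared by the next read */
    unsigned reads;    /* number of (expensive) status reads issued */
};

/* Model a cable event; a failure latches until the register is read. */
static void fake_phy_set_link(struct fake_phy *phy, int up)
{
    assert(phy != NULL);
    if (!up)
        phy->latched_fail = 1;
    phy->link_up = up;
}

/* Model one MII Status register read. */
static int fake_mdio_read_status(struct fake_phy *phy)
{
    int val = 0x7829;  /* status value with the link bit clear */

    phy->reads++;
    if (phy->latched_fail)
        phy->latched_fail = 0;  /* report link=0 once, then unlatch */
    else if (phy->link_up)
        val |= BMSR_LSTATUS;
    return val;
}

/* Read once; read a second time only if the first read says "down". */
static int link_is_up(struct fake_phy *phy)
{
    int status = fake_mdio_read_status(phy);

    if (!(status & BMSR_LSTATUS))
        status = fake_mdio_read_status(phy);
    return (status & BMSR_LSTATUS) != 0;
}
```

In this model a link that is up and stays up costs one read per poll, while a link that is down, or that dropped and recovered between polls, costs two.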
Re: [PATCH] 3c59x: read current link status from phy
On Thu, 8 Sep 2005, Tommy Christensen wrote:

> The idea is to avoid an extra delay of 60 seconds before detecting
> link-up.

But you are adding the read to a function that is called repeatedly to fix an event that happens only once at start-up ! If this read is really needed (I still doubt it...), can't it be performed in vortex_up(), by possibly doubling the existing one there ? vortex_up() is executed only once at start-up, not every 60 seconds.

> Please see http://bugzilla.kernel.org/show_bug.cgi?id=5025

Hah, a Cisco switch. Look in Documentation/networking/vortex.txt for "portfast".

-- 
Bogdan Costescu
Re: [PATCH] 3c59x: read current link status from phy
On Wed, 7 Sep 2005, Jeff Garzik wrote:

> The phy status register must be read twice in order to get the actual
> link state.

Can the original poster give an explanation ? I've enjoyed a rather well functioning 3c59x driver for the past ~6 years without such double reading. Plus:

- this operation is I/O expensive
- it is performed inside a region protected by a spinlock
- it is performed often, every 60 seconds

Is there some specific hardware that exhibits a problem that is solved by this double reading ?

-- 
Bogdan Costescu
Re: [PATCH 2.6.13] libata: Marvell SATA support (PIO mode)
):

mv_init_one: ENTER for PCI Bus:Slot.Func=2:8.0
mv_port_init: EDMA cfg=0x011f EDMA IRQ err cause/mask=0x/0x1f7f
mv_port_init: EDMA cfg=0x011f EDMA IRQ err cause/mask=0x/0x1f7f
mv_port_init: EDMA cfg=0x011f EDMA IRQ err cause/mask=0x/0x1f7f
mv_port_init: EDMA cfg=0x011f EDMA IRQ err cause/mask=0x/0x1f7f
mv_host_init: HC0: HC config=0x11dcf013 HC IRQ cause=0x0111
mv_host_init: HC MAIN IRQ cause/mask=0x0102/0x0007 PCI int cause/mask=0x/6
mv_init_one: PCI config space:
mv_init_one: 504111ab 02b7 0103 2008
mv_init_one: f504
mv_init_one: 81241043
mv_init_one: 0040 0109
mv_host_intr: ENTER, hc0 relevant=0x0102 HC IRQ cause=0x0111
mv_host_intr: EXIT
mv_phy_reset: ENTER, port 0, mmio 0xf8a22000
mv_host_intr: ENTER, hc0 relevant=0x0001 HC IRQ cause=0x
mv_err_intr: port 0 error; EDMA err cause: 0x0008 SERR: 0x
mv_phy_reset: ENTER, port 0, mmio 0xf8a22000
mv_phy_reset: Done. Now calling __sata_phy_reset()
Badness in __sata_phy_reset at drivers/scsi/libata-core.c:1413
 [f8961dc8] __sata_phy_reset+0x75/0x12e [libata]
 [f887566e] mv_phy_reset+0xe8/0x175 [sata_mv]
 [f8875449] mv_host_intr+0xf1/0x187 [sata_mv]
 [f887555e] mv_interrupt+0x7f/0xa7 [sata_mv]
 [c010745e] handle_IRQ_event+0x25/0x4f
 [c01079be] do_IRQ+0x11c/0x1ae
 ===
 [c02d1c88] common_interrupt+0x18/0x20
 [c012007b] dup_task_struct+0x71/0xc0
 [c0111714] delay_tsc+0x9/0x13
 [c01c18d9] __delay+0x9/0xa
 [f8875652] mv_phy_reset+0xcc/0x175 [sata_mv]
 [c0107e9e] setup_irq+0xae/0xb7
 [f88754df] mv_interrupt+0x0/0xa7 [sata_mv]
 [f8961ce1] ata_bus_probe+0xe/0x7b [libata]
 [f8963fd4] ata_device_add+0x16c/0x1e8 [libata]
 [f8875abf] mv_init_one+0x1f8/0x235 [sata_mv]
 [c01c744d] pci_device_probe_static+0x2a/0x3d
 [c01c747b] __pci_device_probe+0x1b/0x2c
 [c01c74a7] pci_device_probe+0x1b/0x2d
 [c021d734] bus_match+0x27/0x45
 [c021d7fd] driver_attach+0x37/0x66
 [c021dbbb] bus_add_driver+0x78/0x99
 [c01c764e] pci_register_driver+0x6e/0x8a
 [f881b00a] mv_init+0xa/0x15 [sata_mv]
 [c0137c05] sys_init_module+0x116/0x238
 [c02d12cb] syscall_call+0x7/0xb
bad: scheduling while atomic!
 [c02ced41] schedule+0x2d/0x87a
 [c011e1ae] scheduler_tick+0x3e/0x3e5
 [c0122bba] profile_hook+0x1b/0x26
 [c0129741] __mod_timer+0x101/0x10b
 [c02cfa90] schedule_timeout+0xd3/0xee
 [c0129feb] process_timeout+0x0/0x5
 [f8875082] mv_scr_read+0xf/0x54 [sata_mv]
 [c012a562] msleep+0x4f/0x55
 [f8961dfb] __sata_phy_reset+0xa8/0x12e [libata]
 [f887566e] mv_phy_reset+0xe8/0x175 [sata_mv]
 [f8875449] mv_host_intr+0xf1/0x187 [sata_mv]
 [f887555e] mv_interrupt+0x7f/0xa7 [sata_mv]
 [c010745e] handle_IRQ_event+0x25/0x4f
 [c01079be] do_IRQ+0x11c/0x1ae
 ===
 [c02d1c88] common_interrupt+0x18/0x20
 [c012007b] dup_task_struct+0x71/0xc0
 [c0111714] delay_tsc+0x9/0x13
 [c01c18d9] __delay+0x9/0xa
 [f8875652] mv_phy_reset+0xcc/0x175 [sata_mv]
 [c0107e9e] setup_irq+0xae/0xb7
 [f88754df] mv_interrupt+0x0/0xa7 [sata_mv]
 [f8961ce1] ata_bus_probe+0xe/0x7b [libata]
 [f8963fd4] ata_device_add+0x16c/0x1e8 [libata]
 [f8875abf] mv_init_one+0x1f8/0x235 [sata_mv]
 [c01c744d] pci_device_probe_static+0x2a/0x3d
 [c01c747b] __pci_device_probe+0x1b/0x2c
 [c01c74a7] pci_device_probe+0x1b/0x2d
 [c021d734] bus_match+0x27/0x45
 [c021d7fd] driver_attach+0x37/0x66
 [c021dbbb] bus_add_driver+0x78/0x99
 [c01c764e] pci_register_driver+0x6e/0x8a
 [f881b00a] mv_init+0xa/0x15 [sata_mv]
 [c0137c05] sys_init_module+0x116/0x238
 [c02d12cb] syscall_call+0x7/0xb

after which the system is dead.

> Either way, mv_phy_reset() is called from mv_err_intr() which
> doesn't appear in either of the stack dumps above.

It appears now, both in the debug messages and on the stack (but not in the success case).

> -do the phy_reset part of error recovery after returning from
> interrupt handler?

I might be completely off here, but what the 3c59x network driver does for the case where a MII link is used is to start a timer which checks the state of the link, asynchronously from the init routine, which is very similar in my understanding to your intention expressed above. This scheme works fine for network...

Please note that I also exposed another problem in my previous message: after 'rmmod sata_mv', the controller can still generate interrupts, e.g. when a drive is attached.

-- 
Bogdan Costescu
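The idea discussed above, doing the PHY reset after returning from the interrupt handler, can be illustrated with a small user-space model. Nothing here is sata_mv or libata code: every structure and function name is hypothetical, and the in_interrupt flag merely imitates the rule that sleeping (as msleep() does in the traces) is not allowed in interrupt context.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PORTS 4

/* Hypothetical per-port bookkeeping. */
struct port_state {
    bool reset_pending;    /* set by the interrupt path */
    unsigned resets_done;  /* incremented by the deferred path */
};

struct host_state {
    struct port_state ports[MAX_PORTS];
    bool in_interrupt;     /* models "we are in atomic context" */
};

/* Stands in for a reset routine that sleeps; it must never run while
 * in_interrupt is set ("scheduling while atomic" in the real kernel). */
static void slow_phy_reset(struct host_state *host, int port)
{
    assert(!host->in_interrupt);
    host->ports[port].resets_done++;
}

/* Interrupt path: only record that a reset is needed, then return. */
static void error_interrupt(struct host_state *host, int port)
{
    assert(port >= 0 && port < MAX_PORTS);
    host->in_interrupt = true;
    host->ports[port].reset_pending = true;
    host->in_interrupt = false;
}

/* Deferred path (a timer or worker in a real driver): runs later,
 * outside interrupt context, and performs the slow resets. */
static unsigned deferred_worker(struct host_state *host)
{
    unsigned handled = 0;
    int port;

    for (port = 0; port < MAX_PORTS; port++) {
        if (host->ports[port].reset_pending) {
            host->ports[port].reset_pending = false;
            slow_phy_reset(host, port);
            handled++;
        }
    }
    return handled;
}
```

How the deferred path is scheduled (timer, work queue, kernel thread) is left open here; the sketch only shows the split between recording the event and acting on it.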
Re: [PATCH 2.6.13] libata: Marvell SATA support (PIO mode)
IRQ 5 for device :02:08.0
PCI: Sharing IRQ 5 with :00:1d.0
IRQ routing conflict for :02:08.0, have irq 9, want irq 5
ata1: SATA max PIO4 cmd 0x0 ctl 0xF8A22120 bmdma 0x0 irq 9
ata2: SATA max PIO4 cmd 0x0 ctl 0xF8A24120 bmdma 0x0 irq 9
ata3: SATA max PIO4 cmd 0x0 ctl 0xF8A26120 bmdma 0x0 irq 9
ata4: SATA max PIO4 cmd 0x0 ctl 0xF8A28120 bmdma 0x0 irq 9
Badness in __sata_phy_reset at drivers/scsi/libata-core.c:1413
 [f88e8f0c] __sata_phy_reset+0x75/0x12e [libata]
 [f883f62f] mv_phy_reset+0xbf/0x11e [sata_mv]
 [c0250f16] end_that_request_last+0x6c/0x7e
 [f883f3bf] mv_host_intr+0xd6/0x142 [sata_mv]
 [f883f500] mv_interrupt+0xd5/0x145 [sata_mv]
 [c0107e2b] handle_IRQ_event+0x25/0x4f
 [c01087d3] do_IRQ+0x18a/0x2bf
 ===
 [c030fb7c] common_interrupt+0x18/0x20
 [f883f618] mv_phy_reset+0xa8/0x11e [sata_mv]
 [c01091d8] setup_irq+0x179/0x181
 [f883f42b] mv_interrupt+0x0/0x145 [sata_mv]
 [f88e8e25] ata_bus_probe+0xe/0x7b [libata]
 [f88eb34d] ata_device_add+0x186/0x202 [libata]
 [f883f97a] mv_init_one+0x197/0x1d5 [sata_mv]
 [c01ec15d] pci_device_probe_static+0x2a/0x3d
 [c01ec18b] __pci_device_probe+0x1b/0x2c
 [c01ec1b7] pci_device_probe+0x1b/0x2d
 [c024a33b] bus_match+0x27/0x45
 [c024a404] driver_attach+0x37/0x66
 [c024a7b9] bus_add_driver+0x77/0x97
 [c024abd4] driver_register+0x51/0x58
 [c01ec375] pci_register_driver+0x85/0xa1
 [f881a00a] mv_init+0xa/0x15 [sata_mv]
 [c013d5a3] sys_init_module+0x1f1/0x2d9
 [c030fa37] syscall_call+0x7/0xb
bad: scheduling while atomic!
 [c030d515] schedule+0x2d/0x552
 [c0107e2b] handle_IRQ_event+0x25/0x4f
 [c030e40e] schedule_timeout+0xf1/0x10c
 [c012ad7e] process_timeout+0x0/0x5
 [f883f082] mv_scr_read+0xf/0x54 [sata_mv]
 [c012b498] msleep+0x4e/0x54
 [f88e8f3f] __sata_phy_reset+0xa8/0x12e [libata]
 [f883f62f] mv_phy_reset+0xbf/0x11e [sata_mv]
 [c0250f16] end_that_request_last+0x6c/0x7e
 [f883f3bf] mv_host_intr+0xd6/0x142 [sata_mv]
 [f883f500] mv_interrupt+0xd5/0x145 [sata_mv]
 [c0107e2b] handle_IRQ_event+0x25/0x4f
 [c01087d3] do_IRQ+0x18a/0x2bf
 ===
 [c030fb7c] common_interrupt+0x18/0x20
 [f883f618] mv_phy_reset+0xa8/0x11e [sata_mv]
 [c01091d8] setup_irq+0x179/0x181
 [f883f42b] mv_interrupt+0x0/0x145 [sata_mv]
 [f88e8e25] ata_bus_probe+0xe/0x7b [libata]
 [f88eb34d] ata_device_add+0x186/0x202 [libata]
 [f883f97a] mv_init_one+0x197/0x1d5 [sata_mv]
 [c01ec15d] pci_device_probe_static+0x2a/0x3d
 [c01ec18b] __pci_device_probe+0x1b/0x2c
 [c01ec1b7] pci_device_probe+0x1b/0x2d
 [c024a33b] bus_match+0x27/0x45
 [c024a404] driver_attach+0x37/0x66
 [c024a7b9] bus_add_driver+0x77/0x97
 [c024abd4] driver_register+0x51/0x58
 [c01ec375] pci_register_driver+0x85/0xa1
 [f881a00a] mv_init+0xa/0x15 [sata_mv]
 [c013d5a3] sys_init_module+0x1f1/0x2d9
 [c030fa37] syscall_call+0x7/0xb

I don't know how much of the problem comes from BIOS/ACPI and how much from the combination of this driver with the RHEL kernel and my hacking. But I don't know how to proceed further, so I'm waiting for some hints or patches :-)

Side note: I was able some time ago to use this controller with the mv_sata driver 3.40, also with a RHEL kernel, without any fiddling with ACPI.

-- 
Bogdan Costescu
Re: Linux and system area networks
On Wed, 27 Jun 2001, Pekka Pietikainen wrote:

> Providing a wrapper library for use with Infiniband and the current
> SAN boards like WSD would probably be a useful exercise, but to really get
> good performance (especially latency-wise) you probably want to use
> something like MPI. For many applications a wrapper will be enough, though.

I'm sorry, but I don't understand your reference to MPI here. MPI is a high-level API; it can run on top of whatever communication features exist: TCP/IP, shared memory, VI, etc. MPI (as well as other "standards" for parallel programming - PVM, OpenMP) came from the need to have a common interface, so that parallel programs would not each have to include specific code to deal with TCP/IP, shared memory, VI, etc. whenever they were available. Instead, MPI serves as a middle-man between them and the parallel programs. So, MPI cannot be faster than the underlying communication features.

Sincerely,

Bogdan Costescu
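The layering argument above can be put as a toy model. This is not MPI's API and the structures are hypothetical: a message layer sits on top of some transport and adds its own non-negative bookkeeping cost, so the latency seen by the application can never be lower than that of the transport underneath.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of an underlying communication feature
 * (TCP/IP, shared memory, VI, ...). */
struct transport {
    const char *name;
    double latency_us;   /* per-message latency of the raw transport */
};

/* Hypothetical high-level message layer running on one transport. */
struct msg_layer {
    const struct transport *backend;
    double overhead_us;  /* cost added by the layer itself, >= 0 */
};

/* Latency seen by the application: transport cost plus layer overhead. */
static double msg_layer_latency_us(const struct msg_layer *layer)
{
    assert(layer != NULL && layer->backend != NULL);
    assert(layer->overhead_us >= 0.0);
    return layer->backend->latency_us + layer->overhead_us;
}
```

Whatever values are plugged in, msg_layer_latency_us() is bounded below by the backend's latency; the high-level layer buys portability, not extra speed.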
Re: 3C905B -- EEPROM (i blive so) problem
On Wed, 13 Jun 2001, L. K. wrote:

> I have a 3COM 3C905B ethernet card that has been hit by a power outage for
> aprox. 0.5 sec.

What do you mean by "power outage" ? If you mean cutting the power, this is not a serious reason for EEPROM damage, unless you were modifying it at that moment.

> I do belive something happened to the eeprom of the card. I would like
> to know if I can overwrite-it with a new one so that I can make my
> ethernet card work again.

Maybe 3Com's DOS-based tool (3c90xcfg.exe) can help.

In order to re-write the EEPROM, you need to use vortex-diag; I think that you need to hack it a bit, as the EEPROM writing code is not enabled. But most important is that you need a good EEPROM image to write; if you have another similar card, you can use vortex-diag to dump the EEPROM, then change the MAC address (if you put both cards on the same network segment). If you don't have a similar card... you have to download the card's documentation from 3Com and build your own EEPROM image based on what you know about your card's capabilities - having an EEPROM image from a different card might screw things up badly. If you decide to go the last way, maybe I can help with some interpretation of the docs - please e-mail me in private.

Sincerely,

Bogdan Costescu
Re: PATCH: ethtool MII helpers
On Sun, 10 Jun 2001, Jeff Garzik wrote:

> Comments appreciated.

Some general comments first; the others are spread through the code.

- I don't know what the long-term plan is for ethtool vs. the MII ioctls. If you plan to replace the MII ioctls completely, there should be a way to access _all_ MII registers provided by the PHY, even if only in a restricted way (i.e. for CAP_NET_ADMIN only). There is useful info in registers other than the 4 you have in your implementation.
- You are proposing some caching for the MII registers. I suppose that you would like this code to also work with whatever caching will be done for MII access, as was recently discussed. Wouldn't this produce double caching under some circumstances?

+ int speed; /* 10, 100, 1000 or -1 (ask hw) */

Please note that the comment specifies 1000, while the code in several places assumes only 2 possibilities: 10 and 100.

+ if (mii->autoneg < 0)
+         autoneg = mii->autoneg = (bmcr & BMCR_ANENABLE) ? 1 : 0;
+ else    autoneg = mii->autoneg;

You don't read anything from the hardware at this point. Why do you want caching?

Not related: I know that this comes from David Miller's older work, but wouldn't it be possible to have a more uniform naming scheme? You have BMCR_ANENABLE, but you have BMSR_ANEGCAPABLE...

+ if (mii->full_duplex < 0)
+         full_duplex = mii->full_duplex =
+                 mii_nway_result(negotiated) & LPA_DUPLEX;
+ else    full_duplex = mii->full_duplex;

If autonegotiation is disabled, I don't think that you always get useful info in 'negotiated'. This applies to the next chunk, too.

+ if (mii->speed < 0) {
+         if (negotiated & LPA_100)
+                 speed = mii->speed = 100;
+         else
+                 speed = mii->speed = 10;
+ } else
+         speed = mii->speed;

That's one of the places where you don't have 1000...

+ ecmd->speed = speed == 100 ? SPEED_100 : SPEED_10;

... and that's the second.

+ ecmd->transceiver = XCVR_INTERNAL;

I didn't understand what XCVR_INTERNAL should mean as opposed to XCVR_EXTERNAL or whatever. For example: some older 3Com cards use external transceivers (not on the chip), while newer ones have NWAY-capable MII transceivers on the chip. So, you can have:

1. chip + MII
2. NWAY-chip
3. NWAY-chip + MII

All MII accesses are done through the serial mdio_* protocol. How should this be handled w.r.t. XCVR_*, or is it completely orthogonal?

+ if ((in.phy_address != out.phy_address) ||
+     (in.transceiver != XCVR_INTERNAL) ||
+     (in.maxtxpkt != out.maxtxpkt) ||
+     (in.maxrxpkt != out.maxrxpkt))
+         return -EOPNOTSUPP;

... and here too.

+ if (advert != mii->advertising) {
+         bmcr |= BMCR_ANRESTART;
+         mii->mdio_write(dev, mii->phy_id, MII_ADVERTISE, advert);
+         mii->advertising = advert;
+ }
+
+ /* some phys need autoneg dis/enabled separately from other settings */
+ if ((bmcr & BMCR_ANENABLE) && (!(mii->bmcr & BMCR_ANENABLE))) {
+         mii->mdio_write(dev, mii->phy_id, MII_BMCR,
+                         mii->bmcr | BMCR_ANENABLE | BMCR_ANRESTART);
+         bmcr &= ~BMCR_ANRESTART;
+ } else if ((!(bmcr & BMCR_ANENABLE)) && (mii->bmcr & BMCR_ANENABLE)) {
+         mii->mdio_write(dev, mii->phy_id, MII_BMCR,
+                         mii->bmcr & ~BMCR_ANENABLE);
+ }

This is nice, but I would like to be able to restart autonegotiation even without changing any of the advertised capabilities. If I missed this possibility, please point me to it...

Nice work!

--
Bogdan Costescu
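The two review points about 'negotiated' and about restarting autonegotiation can be illustrated with a small standalone sketch. It is not the code from the patch under review: the helper names mii_resolve_mode() and mii_bmcr_restart_autoneg() and the struct link_mode are hypothetical, and the register bit values are copied from <linux/mii.h> only so that the example compiles on its own. It assumes a 10/100 PHY; 1000 Mbit/s is left out.

```c
#include <stdint.h>

/* Bit values as found in <linux/mii.h>, repeated so the sketch is standalone. */
#define BMCR_FULLDPLX   0x0100
#define BMCR_ANRESTART  0x0200
#define BMCR_ANENABLE   0x1000
#define BMCR_SPEED100   0x2000
#define LPA_10HALF      0x0020
#define LPA_10FULL      0x0040
#define LPA_100HALF     0x0080
#define LPA_100FULL     0x0100
#define LPA_100BASE4    0x0200

/* Hypothetical result type: speed in Mbit/s and a duplex flag. */
struct link_mode {
    int speed;
    int full_duplex;
};

/*
 * Hypothetical helper: with autonegotiation enabled, derive the mode from
 * the common abilities (advertise & lpa); with autonegotiation disabled,
 * the link partner register is not meaningful, so use the forced bits in
 * BMCR instead.
 */
static struct link_mode mii_resolve_mode(uint16_t bmcr, uint16_t advertise,
                                         uint16_t lpa)
{
    struct link_mode m = { 10, 0 };

    if (bmcr & BMCR_ANENABLE) {
        uint16_t negotiated = advertise & lpa;

        if (negotiated & LPA_100FULL) {
            m.speed = 100;
            m.full_duplex = 1;
        } else if (negotiated & (LPA_100BASE4 | LPA_100HALF)) {
            m.speed = 100;
        } else if (negotiated & LPA_10FULL) {
            m.full_duplex = 1;
        }
    } else {
        m.speed = (bmcr & BMCR_SPEED100) ? 100 : 10;
        m.full_duplex = (bmcr & BMCR_FULLDPLX) ? 1 : 0;
    }
    return m;
}

/*
 * Hypothetical helper: the BMCR value to write in order to restart
 * autonegotiation without touching the advertised capabilities.
 */
static uint16_t mii_bmcr_restart_autoneg(uint16_t bmcr)
{
    return bmcr | BMCR_ANENABLE | BMCR_ANRESTART;
}
```

A driver would feed this with values read through its own mdio_read routine; the point of the sketch is only that the forced-mode branch never looks at the link partner register.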
Re: Looking for device to write device driver for
On Sun, 3 Jun 2001, Kip Macy wrote:

> I then tried to get the interface information from 3com on their new
> 3cr990 card to add IPsec offload support to the linux driver.

Which Linux driver? They only provide a 2.2 one which is in an alpha stage (as written in it!).

> They responded by telling me that due to IP-heavy nature of the product
> that they would not be releasing the interface.

You were much luckier than me. To me, they said that they don't provide any support for Linux with these cards, when I was only asking for docs on how to use their own firmware to do basic operations!

Sincerely,
Bogdan Costescu
Re: MII access (was [PATCH] support for Cobalt Networks (x86 only)
On Sun, 3 Jun 2001, Jeff Garzik wrote:

> Bogdan Costescu wrote:
> > With clearer mind, I have to make a correction to one of the previous
> > messages: the problem of not checking the argument range does not apply to
> > 3c59x, which has '& 0x1f' in the ioctl function for both transceiver number
> > and register number. However, eepro100 and tulip don't do that. (I'm
> > checking now with 2.4.3 from Mandrake 8, but I don't think that there were
> > recent changes in these areas).
>
> half right -- tulip does this for the phy id but not the MII register
> number. I'll fix that up. Please bug Andrey about fixing up
> eepro100...

OK, Andrey is now CC-ed. However, I only checked the 3 mentioned drivers, while MII ioctls are used in many more... I was hoping that the maintainers would jump in!

--
Bogdan Costescu
Re: MII access (was [PATCH] support for Cobalt Networks (x86 only)
On Sat, 2 Jun 2001, Alan Cox wrote:

> > One application needs to poll link status with 1 second resolution. On a
>
> Then it needs to be privileged

Fine. Can you think of a default value for the cache expiry time?

> And if the approach is to block until the time for the next read occurs is
> done then the program get stuck for 30 seconds, misses its deadline and kills
> the cluster - how is this better ??

It is not better. Well, when somebody is playing against you, you're in trouble either way:

- rate limit:
  - blocking - as above
  - non-blocking - notify the user that you can't get the info and probably stop, or acquire elevated privileges and try to restart the network
- cache: get outdated info

But when an HA application runs, it's usually preferable to stop (and you notice it) than to continue with wrong data, especially if you set the cache expiry to something like 30 seconds; think in terms of how many transactions/second today's hardware allows...

> Doing the MII monitoring somewhere centralised like the routing daemons would
> certainly let more intelligent management and reporting get done

I don't argue with this point; several people have already mentioned it. But I explained the present situation in a previous message: the MII info is normally read at a low rate and some applications need it more often. It doesn't matter whether it's delivered through ioctl, netlink or any other way: you have to read it from the hardware and deliver it to user space at the user's request. So "doing the MII monitoring" is the tough part.

--
Bogdan Costescu
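To make the "cache: get outdated info" case concrete, here is a minimal sketch of a per-register cache that also reports the age of the value it returns, so a caller can at least tell how stale the data is. Everything in it is an assumption for illustration: struct mii_cache, mii_cached_read(), the tick-based clock passed in as 'now', and the fake_mdio_read() stand-in for the driver's real (slow) MII read are hypothetical names, not existing kernel interfaces.

```c
#include <stdint.h>

#define MII_NREGS 32            /* MII management register space */

/* Hypothetical per-device cache: one value and one timestamp per register. */
struct mii_cache {
    uint16_t value[MII_NREGS];
    uint64_t stamp[MII_NREGS];  /* tick at which value[] was read from hw */
    uint8_t  valid[MII_NREGS];
    uint64_t max_age;           /* cache expiry, in ticks */
};

typedef uint16_t (*mdio_read_fn)(int reg);

/* Stand-in for the real hardware access; tests can change fake_bmsr. */
static uint16_t fake_bmsr = 0x0004;     /* link status bit set */
static uint16_t fake_mdio_read(int reg)
{
    (void)reg;
    return fake_bmsr;
}

/*
 * Return the cached register value, refreshing it from hardware only when
 * it is missing or older than max_age. The age of the returned value is
 * stored in *age so the caller can decide whether it is recent enough.
 */
static uint16_t mii_cached_read(struct mii_cache *c, int reg, uint64_t now,
                                mdio_read_fn read_hw, uint64_t *age)
{
    reg &= 0x1f;
    if (!c->valid[reg] || now - c->stamp[reg] >= c->max_age) {
        c->value[reg] = read_hw(reg);
        c->stamp[reg] = now;
        c->valid[reg] = 1;
    }
    if (age)
        *age = now - c->stamp[reg];
    return c->value[reg];
}
```

With max_age set to 30 ticks, a link that drops at tick 1001 is still reported as up at tick 1010, with an age of 10; an application that needs 1-tick resolution can only detect this if the age is exposed to it.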
MII access (was [PATCH] support for Cobalt Networks (x86 only)systems)
[ As this is becoming more and more MII specific, I changed the subject ]

On Sat, 2 Jun 2001, Alan Cox wrote:

> > This only answered the first part of the question: when. How do you pass
> > the "how long" info ?
> > Does the same applies for the MII ioctl case ?
>
> The mtime tells you exactly that.

Alan, please consider this situation: one application needs to poll link status with 1 second resolution. On a system where caching is done with an unknown cache expiry time, this application is sometimes fed incorrect data. So you need a way to tell for how long this situation lasts. If you have a proc/ioctl interface for setting the cache expiry time, the same interface can be used for reading this value back. The application can then check that the value is lower than 1 second and, if not, notify the user that it cannot run.

As this thread started as a general hardware access problem, would only _one_ value for all these cases be sufficient? Or should each case have its own timeout? Anyway, for MII, accessing the status at sub-second intervals might be legitimate, so what units should be used?

> I disagree. A non priviledged app should not be able to poke around in MII
> registers anyway. So you only have to cache the generic state of the link.

At the beginning of this thread, Jeff said "calling the ioctls without priveleges is quite useful". Now, if you say that there is no such case, the whole problem could simply be solved by checking for the appropriate privileges.

I just realized another thing, important (IMHO) if a normal user is still allowed to access MII: the drivers (checked for 3c59x, eepro100, tulip) do not verify that the value passed for the register number is within the allowed range, and use it as:

int read_cmd = (0xf6 << 10) | ((phy_id & 0x1f) << 5) | location;

(phy_id is the MII address and location is the register number). There is also no check that the MII address specified is actually in use by the driver, but this is used with mii-diag to query a MII which was not correctly identified (maybe this should be allowed for CAP_NET_ADMIN only?).

From one of Don Becker's pages: "MII transceivers have 32 management registers. The first 16 are reserved for standard-defined uses, and the remaining ones are available for chip-specific features. Only the first seven registers are currently defined."

Usually, the transceivers return garbage if you read from locations you are not supposed to (overwriting the phy_id field). But if you begin overwriting the READ command (0xf6 above)... Something like this should do:

int read_cmd = (0xf6 << 10) | ((phy_id & 0x1f) << 5) | (location & 0x1f);

> You don't need timers.

Too tired to think straight yesterday... You're right. And if you allocate 32*sizeof(int) (you want to keep jiffies, right?) per netdevice, I think that it could even be done outside the driver. Hmm, most of my previous arguments are no longer valid 8-(

--
Bogdan Costescu
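The effect of the missing mask can be checked with a few lines of standalone C. This is only a sketch built around the read_cmd expression quoted in the message; the function and helper names (mii_read_cmd_unchecked, mii_read_cmd_masked, cmd_prefix, cmd_phy, cmd_reg) are made up for the illustration and do not come from any driver.

```c
#include <stdint.h>

/* 0xf6 is the prefix used in the quoted drivers; it carries the READ command. */
#define MII_READ_PREFIX 0xf6u

/* As in the drivers checked: 'location' is ORed in without any range check. */
static uint32_t mii_read_cmd_unchecked(uint32_t phy_id, uint32_t location)
{
    return (MII_READ_PREFIX << 10) | ((phy_id & 0x1f) << 5) | location;
}

/* Proposed form: the register number is limited to its 5-bit field. */
static uint32_t mii_read_cmd_masked(uint32_t phy_id, uint32_t location)
{
    return (MII_READ_PREFIX << 10) | ((phy_id & 0x1f) << 5) | (location & 0x1f);
}

/* Field extractors, used to see which part of the command was changed. */
static uint32_t cmd_prefix(uint32_t cmd) { return cmd >> 10; }
static uint32_t cmd_phy(uint32_t cmd)    { return (cmd >> 5) & 0x1f; }
static uint32_t cmd_reg(uint32_t cmd)    { return cmd & 0x1f; }
```

For example, with phy_id 1, an unchecked location of 0x40 lands in the address field and selects PHY 3 instead of PHY 1, and a location of 0x400 turns the 0xf6 prefix into 0xf7, so the command sent is no longer the intended READ. With the mask applied, both values are reduced to register 0 of PHY 1.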
Re: [PATCH] support for Cobalt Networks (x86 only) systems (for
[ OK, this time I cc'ed netdev 8-) ]

On Fri, 1 Jun 2001, Alan Cox wrote:

> Please re-read your comment. Then think about it. Then tell me how rate
> limiting differs from caching to the application.

With caching, the kernel establishes the rate at which the info is updated. There's nothing wrong with that, but how is the application to know whether the value is current or cached (from when, until when)? It means that a single application that needs data more often than the caching rate will get bogus data and not know about it.

With rate limiting, you always get new values, unless the limit is exceeded. When the limit is exceeded, you log it and either:

- block any request until some timer has expired. The application can detect that it has been blocked and react. You can detect if there are several calls waiting and return the same value to all.
- return an error until some timer has expired. The application can again detect that.

In both cases, the application is also capable of guessing the value of the delay.

For an application which follows the rules (doesn't need data more often than the caching rate, or doesn't exceed the rate limit) there is no difference, I agree. But when somebody is playing tricks while you need data, you have a chance of detecting this by using rate limits.

And yes, I agree that either of them (cache or rate limit) should be modifiable through a proc entry/ioctl/whatever.

--
Bogdan Costescu
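The "return an error" variant of rate limiting can be sketched as follows. This is an illustration under stated assumptions, not a proposal for a kernel interface: struct hw_ratelimit, ratelimited_read(), the tick clock passed in as 'now' and the fake_read_link() stand-in for the hardware access are all hypothetical, and -EAGAIN is just one plausible choice of error code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical limiter state: at most one hardware read per min_interval. */
struct hw_ratelimit {
    uint64_t min_interval;  /* minimum number of ticks between reads */
    uint64_t last_read;     /* tick of the last read that reached hardware */
    int      primed;        /* 0 until the first read has been done */
    unsigned rejected;      /* rejected calls, e.g. for a log message */
};

/* Stand-in for the real hardware access; tests can change fake_link. */
static int fake_link = 1;
static int fake_read_link(void)
{
    return fake_link;
}

/*
 * Either perform a fresh hardware read and return 0, or refuse with
 * -EAGAIN when called again too soon. A stale value is never returned,
 * so the caller always knows whether *value is current.
 */
static int ratelimited_read(struct hw_ratelimit *rl, uint64_t now,
                            int (*read_hw)(void), int *value)
{
    if (rl->primed && now - rl->last_read < rl->min_interval) {
        rl->rejected++;
        return -EAGAIN;
    }
    rl->last_read = now;
    rl->primed = 1;
    *value = read_hw();
    return 0;
}
```

A caller that gets -EAGAIN knows that it has no current data and can react, which is the difference from a cache argued for above; the rejected counter is where a warning for the sysadmin could be hooked in.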
Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis
On Fri, 1 Jun 2001, jamal wrote:

> Jeff, Thanks for copying netdev. Wish more people would do that.

Shame on me, I should have thought of that too... I joined lkml only about 2 weeks ago because netdev related topics are sometimes discussed only there...

> Not really.
>
> One idea i have been toying with is to maintain hysteresis or threshold of
> some form in dev_watchdog;

AFAIK, dev_watchdog is right now used only for Tx (if I'm wrong, please correct me!). So how do you sense link loss if you expect only high Rx traffic?

> example: if watchdog timer expires threshold times, you declare the link
> dead and send netif_carrier_off netlink message.
> On recovery, you send netif_carrier_on

I assume that you mean "on recovery" as in "first successful hard_start_xmit".

> Assumption:
> If the tx path is blocked, more than likely the link is down.

Yes, but is this a good approximation? I'm not saying that it's not, I'm merely asking for counter-arguments.

--
Bogdan Costescu
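The threshold idea quoted above can be written down as a small state machine. This is only a sketch of one possible reading of it: struct link_watch, enum link_event and the two functions are hypothetical names, "recovery" is taken to mean the first successful transmit, and the places where a real driver would call netif_carrier_off/netif_carrier_on are reduced to return values.

```c
/* Hypothetical per-device state for threshold-based link detection. */
struct link_watch {
    unsigned threshold;     /* consecutive Tx timeouts before "link down" */
    unsigned timeouts;      /* consecutive Tx timeouts seen so far */
    int      carrier_ok;    /* current idea of the link state */
};

enum link_event { LINK_NO_CHANGE, LINK_WENT_DOWN, LINK_CAME_UP };

/* Called when the Tx watchdog expires. */
static enum link_event link_watch_tx_timeout(struct link_watch *lw)
{
    if (lw->timeouts < lw->threshold)
        lw->timeouts++;
    if (lw->carrier_ok && lw->timeouts >= lw->threshold) {
        lw->carrier_ok = 0;
        return LINK_WENT_DOWN;  /* here a driver would signal carrier off */
    }
    return LINK_NO_CHANGE;
}

/* Called after a successful transmit. */
static enum link_event link_watch_tx_ok(struct link_watch *lw)
{
    lw->timeouts = 0;
    if (!lw->carrier_ok) {
        lw->carrier_ok = 1;
        return LINK_CAME_UP;    /* here a driver would signal carrier on */
    }
    return LINK_NO_CHANGE;
}
```

Note that the sketch only ever reacts to transmit activity, which is exactly the limitation raised in the reply: a host that mostly receives never feeds either function.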
Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis
On Fri, 1 Jun 2001, David S. Miller wrote:

> Don't such HA apps need to run as root anyways?

Not necessarily, but you can let root (CAP_NET_ADMIN, anyway) go through without any limitations; root can bring down the system at will in other ways. In addition, the rate limiting solution allows a warning to be issued when the limit is exceeded, so that the poor sysadmin knows what hit him 8-)

--
Bogdan Costescu
Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis
On Fri, 1 Jun 2001, Jeff Garzik wrote:

> The loss and regain of link status should be proactively signalled to
> userspace using netlink or something similar.

[ For the general discussion ]

I fully agree, but I just wanted to give an example of a legitimate use from user space of _current_ values from hardware.

> Currently we have
> netif_carrier_{on,off,ok} but it is only passively checked.
> netif_carrier_{on,off} should probably schedule_task() to fire off a
> netlink message...

[ Link status details ]

The problem is that not all NICs have hardware support (and/or not all drivers use these facilities) for link status change notification using interrupts. Right now, most drivers _poll_ for media status and, based on the poll rate, the netif_carrier routines are (or should be) called. We can't make the poll interval very small for the general case, as MII access is time consuming (the same discussion took place some months ago when the bonding driver was updated). However, for users who know that they need this info to be more accurate (at the expense of CPU time), polling through ioctls is the only solution.

[ Back to general discussion ]

So far, 2 solutions were proposed for the problem of too-frequent access to hardware:

1. cache the values. You can then let the user shoot him-/herself in the foot by making too many ioctl calls. But this prevents any legitimate use of the current hardware state.
2. rate limiting. You don't let the user access the hardware too often (to be defined), so he/she can't shoot him-/herself in the foot. Legitimate use of the current hardware state is possible.

IMHO, solution 2 is much better. Can you find situations when it's not?

--
Bogdan Costescu
Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis
On Fri, 1 Jun 2001, Alan Cox wrote:

> I am sure that to an unpriviledged application reporting back the same result
> as we saw last time we asked the hardware unless it is over 30 seconds old
> will work fine. Maybe 10 for link partner ?

No way! If I implement an HA application which depends on link status, I want the info to be accurate; I don't want to know that 30 seconds ago I had good link. IMHO, rate limiting is the only solution.

--
Bogdan Costescu
Re: [PATCH] support for Cobalt Networks (x86 only) systems (for real this time)
On Fri, 1 Jun 2001, Pete Zaitcev wrote:

> > But, each time a user cats this proc file, the user is banging the
> > hardware. What happens when a malicious user forks off 100 processes to
> > continually cat this file? :)
>
> Nothing good, probably. Same story as /proc/apm, which only
> hits BIOS instead (and it's debateable what is better).

Hmm, the MII related ioctls in some net drivers (checked for 3c59x, tulip, eepro100) are also querying the hardware. And a user is allowed to ask for this info (but not able to modify it).

Sincerely,
Bogdan Costescu
Re: [PATCH] support for Cobalt Networks (x86 only) systems (for real this time)
On Fri, 1 Jun 2001, Pete Zaitcev wrote: But, each time a user cats this proc file, the user is banging the hardware. What happens when a malicious user forks off 100 processes to continually cat this file? :) Nothing good, probably. Same story as /proc/apm, which only hits BIOS instead (and it's debateable what is better). Hmm, the MII related ioctl's in some net drivers (checked for 3c59x, tulip, eepro100) are also querying the hardware. And a user is allowed to ask for this info (but not able to modify it). Sincerely, Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] support for Cobalt Networks (x86 only) systems (for
[ OK, this time I cc'ed netdev 8-) ]

On Fri, 1 Jun 2001, Alan Cox wrote:

> Please re-read your comment. Then think about it. Then tell me how rate
> limiting differs from caching to the application.

With caching, the kernel establishes the rate at which the info is updated. There's nothing wrong with that, but how is the application to know whether a value is current or cached (from when, until when)? It means that a single application that needs data more often than the caching rate will get stale data and not know about it.

With rate limiting, you always get new values unless the limit is exceeded. When the limit is exceeded, you log it and either:
- block any request until some timer has expired. The application can detect that it has been blocked and react. You can detect whether several calls are waiting and return the same value to all of them.
- return an error until some timer has expired. The application can again detect that.
In both cases, the application is also able to guess the value of the delay.

For one application that follows the rules (doesn't need data more often than the caching rate, or doesn't exceed the rate limit) there is no difference, I agree. But when somebody is playing tricks while you need data, rate limits give you a chance of detecting it.

And yes, I agree that either of them (cache or rate limit) should be modifiable through a proc entry/ioctl/whatever.

--
Bogdan Costescu
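The "return an error until some timer has expired" variant described above can be sketched roughly as follows. This is only an illustration using a simple fixed-window counter; the struct, the function name and the millisecond clock are hypothetical and are not taken from any driver discussed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device limiter state; all names are illustrative. */
struct hw_ratelimit {
    uint64_t window_start_ms; /* start of the current accounting window */
    uint32_t window_len_ms;   /* length of one window */
    uint32_t max_reads;       /* hardware reads allowed per window */
    uint32_t reads;           /* reads already granted in this window */
};

/*
 * Returns true if the caller may query the hardware now.
 * On false, *retry_after_ms holds the time left in the window, so the
 * caller can tell that it was throttled and for how long.
 */
bool hw_ratelimit_allow(struct hw_ratelimit *rl, uint64_t now_ms,
                        uint32_t *retry_after_ms)
{
    /* Start a fresh window once the previous one has elapsed. */
    if (now_ms - rl->window_start_ms >= rl->window_len_ms) {
        rl->window_start_ms = now_ms;
        rl->reads = 0;
    }

    if (rl->reads < rl->max_reads) {
        rl->reads++;
        *retry_after_ms = 0;
        return true;
    }

    /* Limit exceeded: refuse, and report the remaining delay. */
    *retry_after_ms =
        (uint32_t)(rl->window_len_ms - (now_ms - rl->window_start_ms));
    return false;
}
```

Unlike a cache, a refused call here is visible to the caller, which is the difference argued for above: the application never receives an old value presented as a current one.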
Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis
On Fri, 1 Jun 2001, Jeff Garzik wrote:

> The loss and regain of link status should be proactively signalled to
> userspace using netlink or something similar.

[ For the general discussion ]
I fully agree, but I just wanted to give an example of legit use from user space of _current_ values from hardware.

> Currently we have netif_carrier_{on,off,ok} but it is only passively
> checked. netif_carrier_{on,off} should probably schedule_task() to fire
> off a netlink message...

[ Link status details ]
Just that not all NICs have hardware support (and/or not all drivers use these facilities) for link status change notification using interrupts. Right now, most drivers _poll_ for media status and, based on the poll rate, the netif_carrier routines are (or should be) called. We can't make the poll interval very short for the general case, as MII access is time consuming (the same discussion took place some months ago when the bonding driver was updated). However, for users who know that they need this info to be more accurate (at the expense of CPU time), polling through ioctls is the only solution.

[ Back to general discussion ]
So far, two solutions have been proposed to the problem of accessing the hardware too often:
1. Caching the values. You can then let the user shoot him-/herself in the foot by making too many ioctl calls. But this prevents any legit use of the current hardware state.
2. Rate limiting. You don't let the user access the hardware too often (to be defined), so he/she can't shoot him-/herself in the foot. Legit use of the current hardware state is possible.

IMHO, solution 2 is much better. Can you find situations where it's not?
--
Bogdan Costescu
Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis
On Fri, 1 Jun 2001, David S. Miller wrote:

> Don't such HA apps need to run as root anyways?

Not necessarily, but you could let root (CAP_NET_ADMIN, anyway) go through without any limitations, since root can bring down the system at will in other ways.

In addition, the rate limiting solution allows a warning to be issued when the limit is exceeded, so that the poor sysadmin knows what hit him 8-)

--
Bogdan Costescu
Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis
On Fri, 1 Jun 2001, Alan Cox wrote:

> I am sure that to an unpriviledged application reporting back the same
> result as we saw last time we asked the hardware unless it is over 30
> seconds old will work fine. Maybe 10 for link partner ?

No way! If I implement an HA application which depends on link status, I want the info to be accurate; I don't want to know that I had a good link 30 seconds ago. IMHO, rate limiting is the only solution.

--
Bogdan Costescu
Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis
On Fri, 1 Jun 2001, jamal wrote:

> Jeff, Thanks for copying netdev. Wish more people would do that.

Shame on me, I should have thought of that too... I joined lkml only about 2 weeks ago because netdev-related topics are sometimes discussed only there...

> Not really. One idea i have been toying with is to maintain hysteris or
> threshold of some form in dev_watchdog;

AFAIK, dev_watchdog is right now used only for Tx (if I'm wrong, please correct me!). So how do you sense link loss if you expect only high Rx traffic?

> example: if watchdog timer expires threshold times, you declare the link
> dead and send netif_carrier_off netlink message. On recovery, you send
> netif_carrier_on

I assume that you mean "on recovery" as in the first successful hard_start_xmit.

> Assumption: If the tx path is blocked, more than likely the link is down.

Yes, but is this a good approximation? I'm not saying that it's not, I'm merely asking for counter-arguments.

--
Bogdan Costescu
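To make the threshold idea under discussion concrete, here is a rough sketch of the bookkeeping it implies. It is not code from dev_watchdog or from any driver; the struct and function names are hypothetical, and the places where a netlink message would be sent are only marked by the return value.

```c
#include <stdbool.h>

/* Hypothetical state for guessing link status from Tx behaviour. */
struct link_guess {
    unsigned int threshold;  /* consecutive Tx timeouts before "link down" */
    unsigned int timeouts;   /* consecutive Tx timeouts seen so far */
    bool carrier_ok;         /* current guess about the link */
};

/*
 * Call when the Tx watchdog expires.
 * Returns true if the guess just changed to "down", i.e. the point where
 * a carrier-off notification would be sent.
 */
bool link_guess_tx_timeout(struct link_guess *lg)
{
    if (lg->timeouts < lg->threshold)
        lg->timeouts++;

    if (lg->carrier_ok && lg->timeouts >= lg->threshold) {
        lg->carrier_ok = false;
        return true;
    }
    return false;
}

/*
 * Call on the first successful transmit after a stall.
 * Returns true if the guess just changed to "up", i.e. the point where
 * a carrier-on notification would be sent.
 */
bool link_guess_tx_ok(struct link_guess *lg)
{
    lg->timeouts = 0;
    if (!lg->carrier_ok) {
        lg->carrier_ok = true;
        return true;
    }
    return false;
}
```

Note that the sketch only ever learns something when a transmit is attempted, which is exactly the limitation raised above for hosts that mostly receive.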
Re: APIC problem or 3com 3c590 driver problem in smp kernel 2.4.x
On Wed, 30 May 2001, Feng Xian wrote:

> when I run the kernel smp-2.4.x, my PCI
> device can not receive any interrupt while the /proc/interrupts shows that
> 3c905 receives over million of interrupts and number grows very fast.

That's a bit strange, as one of the first things done in the 3c59x ISR is:

        if ((status & IntLatch) == 0)
                goto handler_exit;      /* No interrupt: shared IRQs can cause this */

with which the driver detects whether the interrupt was generated by the card. Are you sure the driver for the other PCI device knows how to play nice with shared interrupts?

Sincerely,
Bogdan Costescu
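For readers unfamiliar with the pattern, the shared-IRQ check quoted above can be modelled in plain user-space C. This is a simplified illustration, not the 3c59x handler: the fake_nic struct, the handler name and the bit value chosen for INT_LATCH are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define INT_LATCH 0x0001u   /* assumed "interrupt latched" bit, for illustration */

/* Hypothetical stand-in for a NIC: a status register and a work counter. */
struct fake_nic {
    uint16_t status;        /* what reading the status register would return */
    unsigned int handled;   /* how many interrupts this device really serviced */
};

/*
 * Model of a handler on a shared interrupt line.
 * Returns true if this device raised the interrupt and it was serviced,
 * false if the line was asserted by some other device sharing the IRQ.
 */
bool fake_nic_interrupt(struct fake_nic *nic)
{
    uint16_t status = nic->status;  /* a real driver reads the hardware here */

    if ((status & INT_LATCH) == 0)
        return false;               /* not ours: leave quickly */

    nic->handled++;                 /* service the event */
    nic->status = 0;                /* model acknowledging the interrupt */
    return true;
}
```

Every handler on a shared line is called for every interrupt on that line, so a device that never acknowledges its own interrupt makes the counters of all handlers on the line grow, even though the early-exit path does no real work.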
Re: LANANA: To Pending Device Number Registrants
On Tue, 15 May 2001, Jonathan Lundell wrote:

> >The 2.4 kernel allows you to rename an interface. So you can build
> >a little database of (MAC address/name) pairs. Apply this after booting
> >and before bringing up the interfaces and everything has the name
> >you wanted, based on MAC address.
>
> There's a bit of a catch 22, though, if you don't have unique MAC
> addresses in the system (across multiple interfaces).

The same situation appears when using bonding.o. For several years, Don Becker's (and derived) network drivers have supported changing the MAC address while the interface is down. So Al's /dev/eth/<n>/MAC has different values depending on whether bonding is active or not.

Should /dev/eth/<n>/MAC always have the original value (to be able to uniquely identify this card) or the in-use value (used by ARP, I believe)? Or maybe have a /dev/eth/<n>/MAC_in_use?

Sincerely,
Bogdan Costescu
Re: [3com905b freeze Alpha SMP 2.4.2] FullDuplex issue ?
On Wed, 2 May 2001, Cabaniols, Sebastien wrote:

> I insert the 3c59x module with debug=7.

Why? debug=7 is the highest debug level and produces _lots_ of debug data under high network activity. Do you have problems when insmod-ing without any option, and use a higher debug level just to see what's going on?

> The first of the above machines launching the get freezes.

Why do you believe that the card/driver is responsible for the freeze? The outputs that you provided show no problems to me.

A duplex mismatch would not freeze a computer. You would get crappy transfer rates and usually some error messages from the driver, but everything should otherwise work. To verify the media settings, you might want to use mii-diag (from ftp.scyld.com).

Sincerely,
Bogdan Costescu
Re: [RFC] Configuring synchronous interfaces in Linux
On Fri, 1 Dec 2000, Chris Wedgwood wrote:

> Actually; Ethernet badly needs something like this too. I would kill
> to be able to do something like:
>
> ifconfig eth0 speed 100 duplex full

Even if you are thinking about Ethernet only, it's not easy to do. Most modern NICs have MII transceivers, where media setting more or less follows a standard. All drivers written by Donald Becker, and probably everything derived from them, support MII get/set operations from user space through ioctls, using mii-diag (from ftp.scyld.com). But there are NICs which do not have MII transceivers, and for those media setting/selection is NIC-specific. Take a look at the media-specific module options for several drivers (e.g. 3c59x and tulip) and you'll see what I'm talking about.

Moreover, there is a problem with the proposed ifconfig interface: do you want the media setting to be locked? Quite a lot of NICs can do NWAY autonegotiation, or the driver can go through the available modes trying to get one working. So if you say "I want to use this speed", do you want to use specifically that speed, or give it just as a starting point for the driver, which can decrease the speed in case it's not able to get it? (The example is Ethernet-specific, but the idea is not.)

And finally (also Ethernet-specific): some devices don't like forced media settings when they support autonegotiation. Look at the recent tulip archives for examples.

Sincerely,
Bogdan Costescu
Re: Preallocated skb's?
On Fri, 15 Sep 2000, jamal wrote:

> Only the timer runs at HZ granularity ;-<

Some cards provide their own high-resolution timers; the latest 3Com cards provide several with different purposes (none currently used). The question is how many of these also provide the Rx early interrupts. You also mentioned an auto-tunable Rx mitigation scheme. How do you implement it without using hardware timers?

> 20Msec is probably too much time. If my math is not wrong, 1 bit time in
> a 100Mps is 1 ns; 64 bytes is 512ns.

I think you are off by a factor of 10 here: 1 bit time at 100Mbps should be 10 ns. Then 64 bytes is 5.12 us (u=micro). Anyway, this is comparable with the time needed to reach the ISR, so you can have several (but a small number of) packets already waiting for processing.

> You use the period(5-10micros), while waiting
> for full packet arrival, to make the route decision (lookup etc).
> i.e this will allow for a better FF; it will not offload things.

Just that you span several layers by doing this; it's not driver-specific anymore.

Sincerely,
Bogdan Costescu
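The arithmetic in the correction above is easy to check with a small helper. The function name is made up for this example, and it counts only the bytes given: preamble and inter-frame gap are deliberately ignored.

```c
#include <stdint.h>

/*
 * Time on the wire, in nanoseconds, for `bytes` bytes at `mbps` Mbit/s.
 * 1 Mbit/s is one bit per microsecond, so bits * 1000 / mbps yields ns.
 * Preamble and inter-frame gap are not included.
 */
uint64_t wire_time_ns(uint32_t bytes, uint32_t mbps)
{
    return (uint64_t)bytes * 8u * 1000u / mbps;
}
```

With these definitions, wire_time_ns(64, 100) gives 5120 ns, i.e. the 5.12 us quoted above for a minimum-size frame at 100 Mbit/s, which corresponds to 10 ns per bit.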
Re: Preallocated skb's?
On Thu, 14 Sep 2000, jamal wrote:

> If i remember correctly some of the 3coms still give this 'mid-interupt',
> no? It could useful to just say quickly read the header and make routing
> decisions as in fast routing but not under heavy load.

The 3Com cards can generate this interrupt; however, it is not used in the current 3c59x.c. I suggested this to Andrew, but he is already worried about the current interrupt rate and unhappy that 3Com cards do not provide hardware support for Rx mitigation.

An idea might be to combine Rx early interrupts with some kind of software timer-based mitigation. IMHO this has 2 advantages:
- because of the overhead that Andrew pointed out, by the time the CPU reaches the ISR code and the skbuff allocation is done, the entire packet might already have been transferred; however, a check has to be done to ensure that the packet was not dropped by the hardware and that you do not try to fit a packet into an skbuff sized for the previous packet (in case several packets can be transferred during the "overhead" time)
- under load, because interrupts occur anyway (the Rx early ones), you don't lose anything in terms of latency.

Sincerely,
Bogdan Costescu