from:"Bogdan Costescu"

Re: [PATCH] 3c59x: read current link status from phy

2005-09-09 Thread Bogdan Costescu


On Thu, 8 Sep 2005, Andy Fleming wrote:

> The new PHY Layer (drivers/net/phy/*) can provide all these features
> for you without much difficulty, I suspect.


As pointed out by Andrew a few days ago, this driver supports a lot of
chips - for most of them the test hardware would be hard to come by
and the documentation even more so. Unless you'd like to do it based on
"whoever is interested should cry loud"...


> The layer supports handling the interrupts for you, or (if it's
> shared with your controller's interrupt)


Yes, there is only one interrupt, which serves data transmission (both
Tx and Rx), statistics, errors and (for those chips that support it)
link state change.



> Is the cost of an extra read every minute really too high?


You probably didn't look at the code. The MII registers are not
exposed in the PCI space; they need to be accessed through a serial
protocol, such that each MII register read in fact takes about 200
outw and inw/inl operations in total.
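
To make the cost concrete, here is a minimal user-space sketch of a
bit-banged management read that only counts port accesses. It is not
the driver's code: the port helpers, the simulated PHY and the register
value are all hypothetical, and the real driver adds delay reads
between clock edges, so its total is higher than the figure this
sketch produces.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated management port: every access bumps a counter instead of
 * touching hardware. All names in this sketch are hypothetical. */
static unsigned long io_ops;                  /* port accesses so far */
static const uint16_t sim_reg_value = 0x782d; /* value the fake PHY returns */
static int sim_read_clock;                    /* read clocks seen in this frame */

static void port_write(int clk, int data)
{
    (void)clk;
    (void)data;
    io_ops++;                                 /* stands in for one outw */
}

static int port_read(void)
{
    int i = sim_read_clock++;

    io_ops++;                                 /* stands in for one inw */
    if (i < 2)
        return 0;                             /* turnaround bits */
    return (sim_reg_value >> (15 - (i - 2))) & 1; /* data, MSB first */
}

/* One output bit costs two writes: clock low, then clock high. */
static void write_bit(int bit)
{
    port_write(0, bit);
    port_write(1, bit);
}

/* One input bit costs a write, a read and another write. */
static int read_bit(void)
{
    int bit;

    port_write(0, 0);
    bit = port_read();
    port_write(1, 0);
    return bit;
}

/* Read one management register: 32 preamble bits, 14 command bits
 * (start 01, read opcode 10, 5-bit PHY address, 5-bit register),
 * 2 turnaround bits and 16 data bits. */
static uint16_t mdio_read_sim(int phy, int reg)
{
    uint32_t cmd;
    uint16_t val = 0;
    int i;

    assert(phy >= 0 && phy < 32 && reg >= 0 && reg < 32);
    cmd = (0x06u << 10) | ((uint32_t)phy << 5) | (uint32_t)reg;

    for (i = 0; i < 32; i++)
        write_bit(1);
    for (i = 13; i >= 0; i--)
        write_bit((int)((cmd >> i) & 1));

    sim_read_clock = 0;
    for (i = 0; i < 18; i++) {
        int bit = read_bit();

        if (i >= 2)
            val = (uint16_t)((val << 1) | bit);
    }
    return val;
}
```

In this sketch a single register read already costs 146 port accesses
(64 + 28 + 54) before any delay reads are added, which is the same
order of magnitude as the figure quoted above.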


--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] 3c59x: read current link status from phy

2005-09-08 Thread Bogdan Costescu


On Thu, 8 Sep 2005, Tommy Christensen wrote:

> Besides, how long would you like to wait for network connectivity
> after plugging in the cable?  It is now lowered from [60-120] to
> [0-60] seconds.


I now understand what the problem is, so I'll put it in words for
posterity: the Link Status bit of the MII Status register needs to be
read twice, first to clear the latched error state (link bit=0), after
which the bit reports the actual state of the link. From the manual:


This bit has a latching function. A link failure causes the
bit to clear and remain clear until it has been read through
the management interface.

I tested this on a Tornado chip and it works as advertised (after link 
is back up, first read gives 0x7829, the second 0x782d).


But I still don't agree with your solution: you are reading the Status 
register twice in all cases, which is wrong. What you want is to read 
it a second time only after the link was marked as down: a simple 
check if bit 2 of the Status register is 0, in which case you issue 
the second read. This still means that there will be 2 reads if the 
link remains down, but at least there is only 1 read for the case 
where the link is up and remains up.
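
The conditional second read can be sketched as follows. This is a
user-space simulation, not the driver code: struct fake_phy and its
helpers are hypothetical stand-ins for the latched Status register,
the 0x7829/0x782d values are the ones observed above, and BMSR_LSTATUS
is the bit 2 mask as named in linux/mii.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BMSR_LSTATUS 0x0004     /* link status, bit 2 of the Status register */

/* Hypothetical model of a PHY whose link bit latches low on failure. */
struct fake_phy {
    bool link_up;       /* real state of the link right now */
    bool latched_fail;  /* a link failure has not been read out yet */
    unsigned reads;     /* number of Status register reads issued */
};

static void fake_phy_set_link(struct fake_phy *p, bool up)
{
    p->link_up = up;
    if (!up)
        p->latched_fail = true;
}

/* Reading reports 0 in the link bit while a failure is latched, then
 * clears the latch (it stays set if the link is still down). */
static uint16_t fake_phy_read_status(struct fake_phy *p)
{
    uint16_t val = 0x7829;

    p->reads++;
    if (p->link_up && !p->latched_fail)
        val |= BMSR_LSTATUS;
    p->latched_fail = !p->link_up;
    return val;
}

/* Read once; read a second time only if the link bit came back as 0. */
static bool link_is_up(struct fake_phy *p)
{
    uint16_t status = fake_phy_read_status(p);

    if (!(status & BMSR_LSTATUS))
        status = fake_phy_read_status(p);
    return (status & BMSR_LSTATUS) != 0;
}
```

With this scheme the steady link-up case costs one read per poll, and
only a poll that sees the bit cleared pays for the second read.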



> Personally, I'd prefer the delay to be < 10 seconds.


If you sample every 60 seconds ? Teach Shannon how to do it ;-)

If you mean to reduce the sampling period, there is a very good reason 
not to do it: these MDIO operations are expensive - it's a serial 
protocol. vortex_timer() might do 2 (and with the discussed change - 
3) of them - there are better things to do for the CPU than wait for 
these I/O operations. Plus, vortex_timer() also disables the 
interrupt...
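
A rough way to put numbers on this trade-off, as a sketch only: the
helper below is hypothetical, it assumes a link change is detected by
the first poll after it happens, and the figure of about 200 port
operations per register read is an estimate, not a measurement.

```c
#include <assert.h>

struct poll_cost {
    unsigned max_delay_s;           /* worst-case detection delay */
    unsigned long io_ops_per_hour;  /* port operations spent on polling */
};

/* Estimate what a given polling period costs. Assumes every poll
 * issues reads_per_poll register reads of ops_per_read port
 * operations each. */
static struct poll_cost poll_cost(unsigned period_s,
                                  unsigned reads_per_poll,
                                  unsigned ops_per_read)
{
    struct poll_cost c;

    assert(period_s > 0);
    c.max_delay_s = period_s;
    c.io_ops_per_hour = (3600UL / period_s) * reads_per_poll * ops_per_read;
    return c;
}
```

Under these assumptions, polling every 60 seconds with 3 reads costs
36000 port operations per hour, while polling every 10 seconds costs
216000, all of them executed with the interrupt disabled.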


The Tornado and at least some Cyclone chips support generating an 
interrupt whenever the link changes, which can be used instead of 
polling for link state. This feature is not used in the 3c59x driver 
and could give you much less than 10 seconds accuracy - but you have 
to code it. ;-)


--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: [PATCH] 3c59x: read current link status from phy

2005-09-08 Thread Bogdan Costescu


On Thu, 8 Sep 2005, Tommy Christensen wrote:

> The idea is to avoid an extra delay of 60 seconds before detecting
> link-up.


But you are adding the read to a function that is called repeatedly to 
fix an event that happens only once at start-up !


If this read is really needed (I still doubt it...), can't it be 
performed in vortex_up(), by possibly doubling the existing one there ?

vortex_up() is executed only once at start-up, not every 60 seconds.


> Please see http://bugzilla.kernel.org/show_bug.cgi?id=5025


Hah, a Cisco switch.  Look in Documentation/networking/vortex.txt for 
"portfast".


--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: [PATCH] 3c59x: read current link status from phy

2005-09-08 Thread Bogdan Costescu


On Wed, 7 Sep 2005, Jeff Garzik wrote:


> The phy status register must be read twice in order to get the actual link
> state.


Can the original poster give an explanation ? I've enjoyed a rather 
well functioning 3c59x driver for the past ~6 years without such 
double reading. Plus:

- this operation is I/O expensive
- it is performed inside a region protected by a spinlock
- it is performed often, every 60 seconds

Is there some specific hardware that exhibits a problem that is solved 
by this double reading ?


--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: [PATCH 2.6.13] libata: Marvell SATA support (PIO mode)

2005-09-07 Thread Bogdan Costescu

):

mv_init_one: ENTER for PCI Bus:Slot.Func=2:8.0
mv_port_init: EDMA cfg=0x011f EDMA IRQ err cause/mask=0x/0x1f7f
mv_port_init: EDMA cfg=0x011f EDMA IRQ err cause/mask=0x/0x1f7f
mv_port_init: EDMA cfg=0x011f EDMA IRQ err cause/mask=0x/0x1f7f
mv_port_init: EDMA cfg=0x011f EDMA IRQ err cause/mask=0x/0x1f7f
mv_host_init: HC0: HC config=0x11dcf013 HC IRQ cause=0x0111
mv_host_init: HC MAIN IRQ cause/mask=0x0102/0x0007 PCI int cause/mask=0x/6
mv_init_one: PCI config space:
mv_init_one: 504111ab 02b7 0103 2008
mv_init_one: f504   
mv_init_one:    81241043
mv_init_one:  0040  0109
mv_host_intr: ENTER, hc0 relevant=0x0102 HC IRQ cause=0x0111
mv_host_intr: EXIT
mv_phy_reset: ENTER, port 0, mmio 0xf8a22000
mv_host_intr: ENTER, hc0 relevant=0x0001 HC IRQ cause=0x
mv_err_intr: port 0 error; EDMA err cause: 0x0008 SERR: 0x
mv_phy_reset: ENTER, port 0, mmio 0xf8a22000
mv_phy_reset: Done.  Now calling __sata_phy_reset()
Badness in __sata_phy_reset at drivers/scsi/libata-core.c:1413
 [f8961dc8] __sata_phy_reset+0x75/0x12e [libata]
 [f887566e] mv_phy_reset+0xe8/0x175 [sata_mv]
 [f8875449] mv_host_intr+0xf1/0x187 [sata_mv]
 [f887555e] mv_interrupt+0x7f/0xa7 [sata_mv]
 [c010745e] handle_IRQ_event+0x25/0x4f
 [c01079be] do_IRQ+0x11c/0x1ae
 ===
 [c02d1c88] common_interrupt+0x18/0x20
 [c012007b] dup_task_struct+0x71/0xc0
 [c0111714] delay_tsc+0x9/0x13
 [c01c18d9] __delay+0x9/0xa
 [f8875652] mv_phy_reset+0xcc/0x175 [sata_mv]
 [c0107e9e] setup_irq+0xae/0xb7
 [f88754df] mv_interrupt+0x0/0xa7 [sata_mv]
 [f8961ce1] ata_bus_probe+0xe/0x7b [libata]
 [f8963fd4] ata_device_add+0x16c/0x1e8 [libata]
 [f8875abf] mv_init_one+0x1f8/0x235 [sata_mv]
 [c01c744d] pci_device_probe_static+0x2a/0x3d
 [c01c747b] __pci_device_probe+0x1b/0x2c
 [c01c74a7] pci_device_probe+0x1b/0x2d
 [c021d734] bus_match+0x27/0x45
 [c021d7fd] driver_attach+0x37/0x66
 [c021dbbb] bus_add_driver+0x78/0x99
 [c01c764e] pci_register_driver+0x6e/0x8a
 [f881b00a] mv_init+0xa/0x15 [sata_mv]
 [c0137c05] sys_init_module+0x116/0x238
 [c02d12cb] syscall_call+0x7/0xb
bad: scheduling while atomic!
 [c02ced41] schedule+0x2d/0x87a
 [c011e1ae] scheduler_tick+0x3e/0x3e5
 [c0122bba] profile_hook+0x1b/0x26
 [c0129741] __mod_timer+0x101/0x10b
 [c02cfa90] schedule_timeout+0xd3/0xee
 [c0129feb] process_timeout+0x0/0x5
 [f8875082] mv_scr_read+0xf/0x54 [sata_mv]
 [c012a562] msleep+0x4f/0x55
 [f8961dfb] __sata_phy_reset+0xa8/0x12e [libata]
 [f887566e] mv_phy_reset+0xe8/0x175 [sata_mv]
 [f8875449] mv_host_intr+0xf1/0x187 [sata_mv]
 [f887555e] mv_interrupt+0x7f/0xa7 [sata_mv]
 [c010745e] handle_IRQ_event+0x25/0x4f
 [c01079be] do_IRQ+0x11c/0x1ae
 ===
 [c02d1c88] common_interrupt+0x18/0x20
 [c012007b] dup_task_struct+0x71/0xc0
 [c0111714] delay_tsc+0x9/0x13
 [c01c18d9] __delay+0x9/0xa
 [f8875652] mv_phy_reset+0xcc/0x175 [sata_mv]
 [c0107e9e] setup_irq+0xae/0xb7
 [f88754df] mv_interrupt+0x0/0xa7 [sata_mv]
 [f8961ce1] ata_bus_probe+0xe/0x7b [libata]
 [f8963fd4] ata_device_add+0x16c/0x1e8 [libata]
 [f8875abf] mv_init_one+0x1f8/0x235 [sata_mv]
 [c01c744d] pci_device_probe_static+0x2a/0x3d
 [c01c747b] __pci_device_probe+0x1b/0x2c
 [c01c74a7] pci_device_probe+0x1b/0x2d
 [c021d734] bus_match+0x27/0x45
 [c021d7fd] driver_attach+0x37/0x66
 [c021dbbb] bus_add_driver+0x78/0x99
 [c01c764e] pci_register_driver+0x6e/0x8a
 [f881b00a] mv_init+0xa/0x15 [sata_mv]
 [c0137c05] sys_init_module+0x116/0x238
 [c02d12cb] syscall_call+0x7/0xb

after which the system is dead.

> Either way, mv_phy_reset() is called from mv_err_intr()  which
> doesn't appear in either of the stack dumps above.

It appears now, both in the debug messages and on the stack (but not 
in the success case).

> -do the phy_reset part of error recovery after returning from
> interrupt handler?

I might be completely off here, but what the 3c59x network driver does
for the case where a MII link is used is to start a timer which checks
the state of the link, async from the init routine, which is very
similar in my understanding to your intention expressed above. This
scheme works fine for network...
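
The general shape of such a deferral, as a user-space sketch only: all
names below are hypothetical and none of this is sata_mv or libata
code. The interrupt path merely records which port needs a reset; the
reset itself, which may sleep, runs later from a context where
sleeping is allowed.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PORTS 4

static bool in_atomic_context;       /* true while the fake handler runs */
static unsigned pending_reset_mask;  /* ports waiting for a PHY reset */
static unsigned resets_done;

/* Stands in for a reset routine that sleeps; it must never be called
 * from the interrupt path. */
static void sleepy_phy_reset(int port)
{
    assert(!in_atomic_context);
    assert(port >= 0 && port < NUM_PORTS);
    resets_done++;
}

/* Interrupt path: note the error and return without sleeping. */
static void fake_interrupt(int port, bool error)
{
    assert(port >= 0 && port < NUM_PORTS);
    in_atomic_context = true;
    if (error)
        pending_reset_mask |= 1u << port;
    in_atomic_context = false;
}

/* Deferred path (timer or worker): perform the resets that were
 * requested by earlier interrupts. */
static void deferred_worker(void)
{
    int port;

    for (port = 0; port < NUM_PORTS; port++) {
        if (pending_reset_mask & (1u << port)) {
            pending_reset_mask &= ~(1u << port);
            sleepy_phy_reset(port);
        }
    }
}
```

The assertion in sleepy_phy_reset() plays the role of the "scheduling
while atomic" check: it trips if the reset is ever reached from the
handler.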

Please note that I also exposed another problem in my previous
message: after 'rmmod sata_mv', the controller can still generate
interrupts, e.g. when a drive is attached.

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: [PATCH 2.6.13] libata: Marvell SATA support (PIO mode)

2005-09-02 Thread Bogdan Costescu

 IRQ 5 for device :02:08.0
 PCI: Sharing IRQ 5 with :00:1d.0
 IRQ routing conflict for :02:08.0, have irq 9, want irq 5
 ata1: SATA max PIO4 cmd 0x0 ctl 0xF8A22120 bmdma 0x0 irq 9
 ata2: SATA max PIO4 cmd 0x0 ctl 0xF8A24120 bmdma 0x0 irq 9
 ata3: SATA max PIO4 cmd 0x0 ctl 0xF8A26120 bmdma 0x0 irq 9
 ata4: SATA max PIO4 cmd 0x0 ctl 0xF8A28120 bmdma 0x0 irq 9
 Badness in __sata_phy_reset at drivers/scsi/libata-core.c:1413
  [f88e8f0c] __sata_phy_reset+0x75/0x12e [libata]
  [f883f62f] mv_phy_reset+0xbf/0x11e [sata_mv]
  [c0250f16] end_that_request_last+0x6c/0x7e
  [f883f3bf] mv_host_intr+0xd6/0x142 [sata_mv]
  [f883f500] mv_interrupt+0xd5/0x145 [sata_mv]
  [c0107e2b] handle_IRQ_event+0x25/0x4f
  [c01087d3] do_IRQ+0x18a/0x2bf
  ===
  [c030fb7c] common_interrupt+0x18/0x20
  [f883f618] mv_phy_reset+0xa8/0x11e [sata_mv]
  [c01091d8] setup_irq+0x179/0x181
  [f883f42b] mv_interrupt+0x0/0x145 [sata_mv]
  [f88e8e25] ata_bus_probe+0xe/0x7b [libata]
  [f88eb34d] ata_device_add+0x186/0x202 [libata]
  [f883f97a] mv_init_one+0x197/0x1d5 [sata_mv]
  [c01ec15d] pci_device_probe_static+0x2a/0x3d
  [c01ec18b] __pci_device_probe+0x1b/0x2c
  [c01ec1b7] pci_device_probe+0x1b/0x2d
  [c024a33b] bus_match+0x27/0x45
  [c024a404] driver_attach+0x37/0x66
  [c024a7b9] bus_add_driver+0x77/0x97
  [c024abd4] driver_register+0x51/0x58
  [c01ec375] pci_register_driver+0x85/0xa1
  [f881a00a] mv_init+0xa/0x15 [sata_mv]
  [c013d5a3] sys_init_module+0x1f1/0x2d9
  [c030fa37] syscall_call+0x7/0xb
 bad: scheduling while atomic!
  [c030d515] schedule+0x2d/0x552
  [c0107e2b] handle_IRQ_event+0x25/0x4f
  [c030e40e] schedule_timeout+0xf1/0x10c
  [c012ad7e] process_timeout+0x0/0x5
  [f883f082] mv_scr_read+0xf/0x54 [sata_mv]
  [c012b498] msleep+0x4e/0x54
  [f88e8f3f] __sata_phy_reset+0xa8/0x12e [libata]
  [f883f62f] mv_phy_reset+0xbf/0x11e [sata_mv]
  [c0250f16] end_that_request_last+0x6c/0x7e
  [f883f3bf] mv_host_intr+0xd6/0x142 [sata_mv]
  [f883f500] mv_interrupt+0xd5/0x145 [sata_mv]
  [c0107e2b] handle_IRQ_event+0x25/0x4f
  [c01087d3] do_IRQ+0x18a/0x2bf
  ===
  [c030fb7c] common_interrupt+0x18/0x20
  [f883f618] mv_phy_reset+0xa8/0x11e [sata_mv]
  [c01091d8] setup_irq+0x179/0x181
  [f883f42b] mv_interrupt+0x0/0x145 [sata_mv]
  [f88e8e25] ata_bus_probe+0xe/0x7b [libata]
  [f88eb34d] ata_device_add+0x186/0x202 [libata]
  [f883f97a] mv_init_one+0x197/0x1d5 [sata_mv]
  [c01ec15d] pci_device_probe_static+0x2a/0x3d
  [c01ec18b] __pci_device_probe+0x1b/0x2c
  [c01ec1b7] pci_device_probe+0x1b/0x2d
  [c024a33b] bus_match+0x27/0x45
  [c024a404] driver_attach+0x37/0x66
  [c024a7b9] bus_add_driver+0x77/0x97
  [c024abd4] driver_register+0x51/0x58
  [c01ec375] pci_register_driver+0x85/0xa1
  [f881a00a] mv_init+0xa/0x15 [sata_mv]
  [c013d5a3] sys_init_module+0x1f1/0x2d9
  [c030fa37] syscall_call+0x7/0xb


I don't know how much of the problem comes from BIOS/ACPI and how much
from the combination of this driver with the RHEL kernel and my
hacking. But I don't know how to proceed further, so I'm waiting for 
some hints or patches :-)

Side note: I was able some time ago to use this controller with the
mv_sata driver 3.40, also with a RHEL kernel, without any fiddling
with ACPI.

--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: Linux and system area networks

2001-06-28 Thread Bogdan Costescu

On Wed, 27 Jun 2001, Pekka Pietikainen wrote:

> Providing a wrapper library for use with Infiniband and the current
> SAN boards like WSD would probably be a useful exercise, but to really get
> good performance (especially latency-wise) you probably want to use
> something like MPI. For many applications a wrapper will be enough, though.

I'm sorry, but I don't understand your reference to MPI here. MPI is a
high-level API; MPI can run on top of whatever communication features
exist: TCP/IP, shared memory, VI, etc.
MPI (as well as other "standards" for parallel programming - PVM, OpenMP)
came from the need to have a common interface, not to have all parallel
programs include specific code to deal with TCP/IP, shared memory, VI,
etc. whenever they were available. Instead, MPI serves as a middle-man
between them and the parallel programs. So, MPI cannot be faster than the
underlying communication features.

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: 3C905B -- EEPROM (i blive so) problem

2001-06-15 Thread Bogdan Costescu

On Wed, 13 Jun 2001, L. K. wrote:

> I have a 3COM 3C905B ethernet card that has been hit by a power outage for
> aprox. 0.5 sec.

What do you mean by "power outage" ? If you mean cutting the power, this
is not a likely cause of EEPROM damage, unless you were modifying it
at that moment.

> I do belive something happened to the eeprom of the card. I would like
> to know if I can overwrite-it with a new one so that I can make my
> ethernet card work again.

Maybe 3Com's DOS-based tool (3c90xcfg.exe) can help.

In order to re-write the EEPROM, you need to use vortex-diag; I think that
you need to hack it a bit as the EEPROM writing code is not enabled. But
most important, you need a good EEPROM image to write; if you have
another similar card, you can use vortex-diag to dump its EEPROM, then
change the MAC address (needed if you put both cards on the same network
segment). If you don't have a similar card... you have to download the
card's documentation from 3Com and build your own EEPROM image based on
what you know about your card's capabilities - using an EEPROM image from
a different card might screw things up badly.

If you decide to take that last route, maybe I can help with some
interpretation of the docs; please e-mail me in private.

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: PATCH: ethtool MII helpers

2001-06-12 Thread Bogdan Costescu


On Sun, 10 Jun 2001, Jeff Garzik wrote:

> Comments appreciated.

Some general comments first, the others are spread through the code.

- I don't know what the long-term plan is about ethtool vs. MII ioctl's.
If you do plan to completely replace the MII ioctl's, there should be a
way to access _all_ MII registers provided by the PHY, even if you do this
in a restricted way (i.e. for CAP_NET_ADMIN only). There is also useful
info in registers other than the 4 you have in your implementation.
- You are proposing some caching for the MII registers. I suppose that you
would like this code to also work with whatever caching will be
done for MII access, as recently discussed. Wouldn't this produce
double caching under some circumstances?

+   int speed;  /* 10, 100, 1000 or -1 (ask hw) */

Please note that the comment specifies 1000, while the code in several
places assumes only 2 possibilities: 10 and 100.

+   if (mii->autoneg < 0)
+       autoneg = mii->autoneg = (bmcr & BMCR_ANENABLE) ? 1 : 0;
+   else    autoneg = mii->autoneg;

You don't read anything from the hardware at this point. Why do you want
caching ?

Not related: I know that this comes from David Miller's older work, but
wouldn't it be possible to have a more uniform naming scheme? You have
BMCR_ANENABLE, but you have BMSR_ANEGCAPABLE...

+   if (mii->full_duplex < 0)
+       full_duplex = mii->full_duplex =
+           mii_nway_result(negotiated) & LPA_DUPLEX;
+   else    full_duplex = mii->full_duplex;

If autoneg. is disabled, I don't think that you always get useful info in
'negotiated'. Applies to the next chunk, too.

+   if (mii->speed < 0) {
+       if (negotiated & LPA_100)
+           speed = mii->speed = 100;
+       else
+           speed = mii->speed = 10;
+   } else
+       speed = mii->speed;

That's one of the places where you don't have 1000...

+   ecmd->speed = speed == 100 ? SPEED_100 : SPEED_10;

... and that's the second.
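
To make the point concrete, here is a minimal sketch of a mapping that does not silently fold 1000 into 10. It is not part of the patch under review: the helper name mii_speed_to_ethtool() is made up for this illustration, and the SPEED_* values are redefined locally (assuming they match the ethtool header) so that the fragment compiles on its own.

```c
/* Local copies so the sketch builds in user space; assumed to match
 * the SPEED_* constants from the ethtool header. */
#define SPEED_10    10
#define SPEED_100   100
#define SPEED_1000  1000

/* Hypothetical helper: map the cached speed (10, 100, 1000) to the
 * ethtool constant, or return -1 for anything unexpected instead of
 * silently reporting 10 Mbit/s. */
int mii_speed_to_ethtool(int speed)
{
	switch (speed) {
	case 10:
		return SPEED_10;
	case 100:
		return SPEED_100;
	case 1000:
		return SPEED_1000;
	default:
		return -1;	/* unknown or "ask hw" value */
	}
}
```

The caller would then have to decide what to report for -1; that choice is left open here.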

+   ecmd->transceiver = XCVR_INTERNAL;

I didn't understand what XCVR_INTERNAL should mean as opposed to
XCVR_EXTERNAL or whatever. For example: some older 3Com cards use external
transceivers (not on the chip), while newer ones have NWAY capable MII
transceivers on the chip. So, you can have:
1. chip + MII
2. NWAY-chip
3. NWAY-chip + MII
All MII accesses are done through the serial mdio_* protocol. How should
this be handled w.r.t. XCVR_*, or is it completely orthogonal?

+   if ((in.phy_address != out.phy_address) ||
+   (in.transceiver != XCVR_INTERNAL) ||
+   (in.maxtxpkt != out.maxtxpkt) ||
+   (in.maxrxpkt != out.maxrxpkt))
+   return -EOPNOTSUPP;

... and here too.

+   if (advert != mii->advertising) {
+       bmcr |= BMCR_ANRESTART;
+       mii->mdio_write(dev, mii->phy_id, MII_ADVERTISE, advert);
+       mii->advertising = advert;
+   }
+
+   /* some phys need autoneg dis/enabled separately from other settings */
+   if ((bmcr & BMCR_ANENABLE) && (!(mii->bmcr & BMCR_ANENABLE))) {
+       mii->mdio_write(dev, mii->phy_id, MII_BMCR,
+               mii->bmcr | BMCR_ANENABLE | BMCR_ANRESTART);
+       bmcr &= ~BMCR_ANRESTART;
+   } else if ((!(bmcr & BMCR_ANENABLE)) && (mii->bmcr & BMCR_ANENABLE)) {
+       mii->mdio_write(dev, mii->phy_id, MII_BMCR,
+               mii->bmcr & ~BMCR_ANENABLE);
+   }

This is nice, but I would like to be able to restart autonegotiation even
without changing any of the advertised capabilities. If I missed this
possibility, please point me to it...
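
What I have in mind is roughly the following. This is only a sketch under assumptions: bmcr_restart_autoneg() is a hypothetical helper, not something in the patch, and the two bit values are copied in locally (bit 12 for autoneg enable, bit 9 for restart, as in the standard MII control register) so that it compiles stand-alone. It computes the value to write back to MII_BMCR and leaves the advertisement register alone.

```c
/* Standard MII control register bits, redefined locally for the sketch. */
#define BMCR_ANRESTART  0x0200	/* restart autonegotiation */
#define BMCR_ANENABLE   0x1000	/* autonegotiation enabled  */

/* Hypothetical helper: given the current BMCR value, return the value
 * to write in order to restart autonegotiation without touching the
 * advertised capabilities. Returns -1 if autonegotiation is disabled,
 * since a restart makes no sense then. */
int bmcr_restart_autoneg(int bmcr)
{
	if (!(bmcr & BMCR_ANENABLE))
		return -1;
	return bmcr | BMCR_ANRESTART;
}
```

In the driver, the result would be passed to the mdio_write() hook for MII_BMCR; whether that belongs in the ethtool helpers or in a separate command is the open question.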

Nice work!

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]




Re: Looking for device to write device driver for

2001-06-05 Thread Bogdan Costescu

On Sun, 3 Jun 2001, Kip Macy wrote:

> I then tried to get the interface information from 3com on their new
> 3cr990 card to add IPsec offload support to the linux driver.

Which Linux driver ? They only provide a 2.2 one which is in an alpha
stage (as written in it!).

> They responded by telling me that due to IP-heavy nature of the product
> that they would not be releasing the interface.

You were much luckier than me. To me, they said that they don't provide
any support for Linux with these cards when I was only asking for docs for
how to use their own firmware to do basic operations!

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: MII access (was [PATCH] support for Cobalt Networks (x86 only)

2001-06-05 Thread Bogdan Costescu


On Sun, 3 Jun 2001, Jeff Garzik wrote:

> Bogdan Costescu wrote:
> > With clearer mind, I have to make some a correction to one of the previous
> > messages: the problem of not checking arguments range does not apply to
> > 3c59x which has in the ioctl function '& 0x1f' for both transceiver number
> > and register number. However, eepro100 and tulip don't do that. (I'm
> > checking now with 2.4.3 from Mandrake 8, but I don't think that there were
> > recent changes in these areas).
>
> half right -- tulip does this for the phy id but not the MII register
> number.  I'll fix that up.  Please bug Andrey about fixing up
> eepro100...

OK, Andrey is now CC-ed. However, I only checked the 3 mentioned drivers,
while MII ioctl's are used in many more... I was hoping that the
maintainers would jump in!

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: MII access (was [PATCH] support for Cobalt Networks (x86 only)

2001-06-02 Thread Bogdan Costescu

On Sat, 2 Jun 2001, Alan Cox wrote:

> > One application needs to poll link status with 1 second resolution. On a
>
> Then it needs to be privileged

Fine. Can you think of a default value for expiring cache ?

> And if the approach is to block until the time for the next read occurs is
> done then the program get stuck for 30 seconds, misses its deadline and kills
> the cluster - how is this better ??

It's not better. Well, when somebody is playing against you, you're in
trouble either way:
- rate limit:
  - blocking: as above
  - non-blocking: notify the user that you can't get the info, and
    probably stop, or acquire elevated privileges and try to restart
    the network
- cache: get outdated info

But when a HA application runs, it's usually preferable to stop (and you
notice it) than to continue with wrong data. Especially if you set the
cache expiry to something like 30 seconds; think in terms of how many
transactions/second today's hardware allows...

> Doing the MII monitoring somewhere centralised like the routing daemons would
> certainly let more inteillgent management and reporting get done

I don't argue over this point; several people have already mentioned it.
But I explained the present situation in a previous message: the MII info
is normally read at a low rate and some applications need it more often. It
doesn't matter whether it's delivered through ioctl, netlink or any other
way; you have to read it from the hardware and deliver it to user-space at
user request. So "doing the MII monitoring" is the tough part.

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


MII access (was [PATCH] support for Cobalt Networks (x86 only)systems)

2001-06-02 Thread Bogdan Costescu

[ As this is becoming more and more MII specific, I changed the subject ]

On Sat, 2 Jun 2001, Alan Cox wrote:

> > This only answered the first part of the question: when. How do you pass
> > the "how long" info ?
> > Does the same applies for the MII ioctl case ?
>
> The mtime tells you exactly that.

Alan, please consider this situation:

One application needs to poll link status with 1 second resolution. On a
system where caching is done with an unknown cache expiring time, this
application is sometimes fed incorrect data. So, you need a way to tell
for how long this situation lasts. If you have a proc/ioctl interface for
setting cache expiring time, this same interface can then be used for
reading back this info. This application can then check that this value is
lower than 1 second and if not, notify the user that it cannot run.

As this thread started as a general hardware access problem, would only
_one_ value for all these cases be sufficient? Or should each case have
its own timeout? Anyway, for MII, reading the status at sub-second
intervals might be a legitimate need, so what units should be used?

> I disagree. A non priviledged app should not be able to poke around in MII
> registers anyway. So you only have to cache the generic state of the link.

At the beginning of this thread, Jeff said "calling the ioctls without
privileges is quite useful". Now if you say that there is no such case,
the whole problem could simply be solved by checking for the appropriate
privileges.

I just realized another thing, important (IMHO) if a normal user is still
allowed to access MII: the drivers (checked for 3c59x, eepro100, tulip) do
not verify that the value passed for register number is within the allowed
range and use it as:

int read_cmd = (0xf6 << 10) | ((phy_id & 0x1f) << 5) | location;

(phy_id is the MII address and location is the register number).

There is also no check that the MII address specified is actually in use
by the driver, but this is used with mii-diag to query a MII which was not
correctly identified (maybe this should be allowed for CAP_NET_ADMIN only ?)

From one of Don Becker's pages:
"MII transceivers have 32 management registers. The first 16 are reserved
for standard-defined uses, and the remaining one are available for
chip-specific features. Only the first seven registers are currently
defined."

Usually, the transceivers return garbage if you read from locations you
are not supposed to (a register number above 31 spills into the phy_id
bits). But if you begin overwriting the READ command (0xf6 above)...
Something like this should do:

int read_cmd = (0xf6 << 10) | ((phy_id & 0x1f) << 5) | (location & 0x1f);
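
To see what goes wrong, here is a small stand-alone sketch comparing the two expressions. The function names are made up for this illustration and do not exist in any of the drivers; only the bit layout is taken from the lines above.

```c
#define MII_READ_OP  0xf6	/* READ command prefix, as quoted above */

/* What the drivers currently do: the register number is OR-ed in as is. */
int mii_read_cmd_unchecked(int phy_id, int location)
{
	return (MII_READ_OP << 10) | ((phy_id & 0x1f) << 5) | location;
}

/* Proposed form: keep only the low 5 bits of the register number. */
int mii_read_cmd_masked(int phy_id, int location)
{
	return (MII_READ_OP << 10) | ((phy_id & 0x1f) << 5) | (location & 0x1f);
}
```

With the unchecked form, asking for register 0x21 on transceiver 0 produces the same command word as register 1 on transceiver 1, and a value such as 0x400 reaches into the command field itself. Masking keeps the request on the transceiver it was meant for; returning an error for out-of-range values would be the stricter alternative.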

> You don't need timers.

Too tired to think straight yesterday... You're right. And if you alloc
32*sizeof(int) (you want to keep jiffies, right ?) per netdevice, I think
that it could even be done outside the driver. Hmm, most of my
previous arguments are no longer valid 8-(

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: [PATCH] support for Cobalt Networks (x86 only) systems (for

2001-06-01 Thread Bogdan Costescu

[ OK, this time I cc'ed netdev 8-) ]

On Fri, 1 Jun 2001, Alan Cox wrote:

> Please re-read your comment. Then think about it. Then tell me how rate
> limiting differs from caching to the application.

For caching, the kernel establishes the rate at which the info is
updated. There's nothing wrong with that, but how is the application to
know whether the value is current or cached (from when, until when)? That
means that a single application that needs data more often than the
caching rate will get bogus data and not know about it.

With rate limiting, you always get new values, unless the limit is
exceeded. When the limit is exceeded, you log and:
- block any request until some timer is expired. The application can
detect that it's been blocked and react. You can detect if there are
several calls waiting and return the same value to all.
- return error until some timer is expired. The application can again
detect that.
In both cases, the application is also capable of guessing the value of
the delay.

For one application which follows the rules (doesn't need data more often
than the caching rate or doesn't exceed the rate limit) there is no
difference, I agree. But when somebody is playing tricks while you need
data, you have the chance of detecting this by using rate limits.

And yes, I agree that either of them (cache or rate limit) should be
modifiable through proc entry/ioctl/whatever.
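
The difference as seen by the application can be sketched like this. It is a user-space toy under stated assumptions, not kernel code: all names are invented for the example, fake_hw_read() stands in for the slow MII read, time is passed in as a plain counter instead of jiffies, and only the "return error" variant of rate limiting is shown.

```c
#include <errno.h>

/* Stand-in for the real (slow) MII read; the value is set by the caller. */
int fake_link_state;
int fake_hw_read(void) { return fake_link_state; }

/* Caching: the caller always gets a value, but cannot tell how old it is. */
struct mii_cache {
	unsigned long stamp;	/* time of the last real read */
	unsigned long max_age;	/* cache expiry interval */
	int value;
	int valid;
};

int cached_read(struct mii_cache *c, unsigned long now)
{
	if (!c->valid || now - c->stamp >= c->max_age) {
		c->value = fake_hw_read();
		c->stamp = now;
		c->valid = 1;
	}
	return c->value;	/* possibly stale, silently */
}

/* Rate limiting: every successful call is a fresh read; calls that come
 * too early fail visibly, so the application can react. */
struct mii_ratelimit {
	unsigned long last;		/* time of the last real read */
	unsigned long min_interval;	/* minimum spacing between reads */
	int used;
};

int ratelimited_read(struct mii_ratelimit *r, unsigned long now, int *out)
{
	if (r->used && now - r->last < r->min_interval)
		return -EAGAIN;	/* limit exceeded: no data, but you know it */
	r->last = now;
	r->used = 1;
	*out = fake_hw_read();
	return 0;
}
```

If the link changes between two calls that are closer together than the interval, cached_read() keeps returning the old state without any indication, while ratelimited_read() returns -EAGAIN and the caller knows it has no current data.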

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis

2001-06-01 Thread Bogdan Costescu

On Fri, 1 Jun 2001, jamal wrote:

> Jeff, Thanks for copying netdev. Wish more people would do that.

Shame on me, I should have thought of that too... I joined lkml only about
2 weeks ago because netdev related topics are sometimes discussed only
there...

> Not really.
>
> One idea i have been toying with is to maintain hysteris or threshold of
> some form in dev_watchdog;

AFAIK, dev_watchdog is right now used only for Tx (if I'm wrong, please
correct me!). So how do you sense link loss if you expect only high Rx
traffic ?

> example: if watchdog timer expires threshold times, you declare the link
> dead and send netif_carrier_off netlink message.
> On recovery, you send  netif_carrier_on

I assume that you mean "on recovery" as in "first successful hard_start_xmit".

> Assumption:
> If the tx path is blocked, more than likely the link is down.

Yes, but is this a good approximation ? I'm not saying that it's not, I'm
merely asking for counter-arguments.
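
For the sake of discussion, the threshold idea as I understand it could be sketched as below. This is my reading of the proposal, not existing code: the struct and function names are invented, and the places where a netlink message would be sent are only marked by the return value. Note that it looks at Tx only, which is exactly my concern above.

```c
/* Hypothetical per-device state for guessing link status from Tx timeouts. */
struct link_guess {
	int timeouts;	/* consecutive Tx watchdog expiries */
	int threshold;	/* expiries needed before declaring the link dead */
	int carrier_ok;	/* current guess: 1 = link up, 0 = link down */
};

/* Called when the Tx watchdog expires. Returns 1 if this call changed
 * the state to "down" (the point where carrier-off would be signalled). */
int link_guess_tx_timeout(struct link_guess *lg)
{
	lg->timeouts++;
	if (lg->timeouts >= lg->threshold && lg->carrier_ok) {
		lg->carrier_ok = 0;
		return 1;
	}
	return 0;
}

/* Called on the first successful transmit after a timeout. Returns 1 if
 * this call changed the state to "up" (where carrier-on would be signalled). */
int link_guess_tx_ok(struct link_guess *lg)
{
	lg->timeouts = 0;
	if (!lg->carrier_ok) {
		lg->carrier_ok = 1;
		return 1;
	}
	return 0;
}
```

A host that only receives never reaches link_guess_tx_timeout(), so a dead link would go unnoticed there.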

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis

2001-06-01 Thread Bogdan Costescu


On Fri, 1 Jun 2001, David S. Miller wrote:

> Don't such HA apps need to run as root anyways?

Not necessarily, but you could let root (or rather CAP_NET_ADMIN)
through without any limitations; root can bring down the system at will
in other ways anyway.

In addition, the rate limiting solution allows a warning to be issued when
the limit is exceeded, so that the poor sysadmin knows what hit him 8-)

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]


Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis

2001-06-01 Thread Bogdan Costescu

On Fri, 1 Jun 2001, Jeff Garzik wrote:

> The loss and regain of link status should be proactively signalled to
> userspace using netlink or something similar.

[ For the general discussion ]
I fully agree, but I just wanted to give an example of legit use from
user space of _current_ values from hardware.

>  Currently we have
> netif_carrier_{on,off,ok} but it is only passively checked.
> netif_carrier_{on,off} should probably schedule_task() to fire off a
> netlink message...

[ Link status details ]
The catch is that not all NICs have hardware support for link status change
notification using interrupts (and not all drivers use it where it exists).
Right now, most drivers _poll_ for media status, and the netif_carrier
routines are (or should be) called at whatever rate that polling runs. We
can't make the poll interval very short in the general case, as MII access
is time-consuming (the same discussion came up some months ago when the
bonding driver was updated). However, for users who know that they need
this info to be more accurate (at the expense of CPU time), polling through
ioctls is the only solution.

[ Back to general discussion ]
So far, two solutions have been proposed for the problem of too-frequent
hardware access:
1. Caching the values. You can then let the user shoot himself/herself in
   the foot by making too many ioctl calls, but this prevents any legit use
   of the current hardware state.
2. Rate limiting. You don't let the user access the hardware too often (the
   limit is to be defined), so he/she can't shoot himself/herself in the
   foot. Legit use of the current hardware state remains possible.

IMHO, solution 2 is much better. Can you find situations where it's not?

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: [PATCH] support for Cobalt Networks (x86 only) systems (forrealthis

2001-06-01 Thread Bogdan Costescu


On Fri, 1 Jun 2001, Alan Cox wrote:

> I am sure that to an unpriviledged application reporting back the same result
> as we saw last time we asked the hardware unless it is over 30 seconds old
> will work fine. Maybe 10 for link partner ?

No way! If I implement an HA application which depends on link status, I
want the info to be accurate; I don't want to learn that I had a good link
30 seconds ago.

IMHO, rate limiting is the only solution.

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: [PATCH] support for Cobalt Networks (x86 only) systems (for real this time)

2001-06-01 Thread Bogdan Costescu


On Fri, 1 Jun 2001, Pete Zaitcev wrote:

> > But, each time a user cats this proc file, the user is banging the
> > hardware.  What happens when a malicious user forks off 100 processes to
> > continually cat this file?  :)
>
> Nothing good, probably. Same story as /proc/apm, which only
> hits BIOS instead (and it's debateable what is better).

Hmm, the MII-related ioctls in some net drivers (I checked 3c59x, tulip and
eepro100) also query the hardware, and a user is allowed to ask for this
info (but not to modify it).

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: [PATCH] support for Cobalt Networks (x86 only) systems (for

2001-06-01 Thread Bogdan Costescu



[ OK, this time I cc'ed netdev 8-) ]

On Fri, 1 Jun 2001, Alan Cox wrote:

> Please re-read your comment. Then think about it. Then tell me how rate
> limiting differs from caching to the application.

With caching, the kernel sets the rate at which the info is updated.
There's nothing wrong with that, but how is the application to know whether
a value is fresh or cached (and from when, valid until when)? A single
application that needs data more often than the caching rate will get stale
data and not know about it.

With rate limiting, you always get new values, unless the limit is
exceeded. When the limit is exceeded, you log and either:
- block any request until some timer expires. The application can
  detect that it's been blocked and react. You can detect if there are
  several calls waiting and return the same value to all.
- return an error until some timer expires. The application can again
  detect that.
In both cases, the application can also estimate the length of the delay.

For one application which follows the rules (doesn't need data more often
than the caching rate or doesn't exceed the rate limit) there is no
difference, I agree. But when somebody is playing tricks while you need
data, you have the chance of detecting this by using rate limits.

And yes, I agree that either of them (cache or rate limit) should be
modifiable through proc entry/ioctl/whatever.

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: APIC problem or 3com 3c590 driver problem in smp kernel 2.4.x

2001-05-31 Thread Bogdan Costescu


On Wed, 30 May 2001, Feng Xian wrote:

> when I run the kernel smp-2.4.x, my PCI
> device can not receive any interrupt while the /proc/interrupts shows that
> 3c905 receives over million of interrupts and number grows very fast.

That's a bit strange as one of the first things done in 3c59x ISR is:

if ((status & IntLatch) == 0)
 goto handler_exit;  /* No interrupt: shared IRQs can cause this */

with which the driver detects if the interrupt was generated by the card.
Are you sure the driver for the other PCI device knows to play nice with
shared interrupts ?

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: LANANA: To Pending Device Number Registrants

2001-05-16 Thread Bogdan Costescu

On Tue, 15 May 2001, Jonathan Lundell wrote:

> >The 2.4 kernel allows you to rename an interface.  So you can build
> >a little database of (MAC address/name) pairs. Apply this after booting
> >and before bringing up the interfaces and everything has the name
> >you wanted, based on MAC address.
>
> There's a bit of a catch 22, though, if you don't have unique MAC
> addresses in the system (across multiple interfaces).

The same situation appears when using bonding.o. For several years,
Don Becker's (and derived) network drivers have supported changing the MAC
address when the interface is down. So Al's /dev/eth/n/MAC has different
values depending on whether bonding is active or not. Should /dev/eth/n/MAC
always have the original value (to be able to uniquely identify this card)
or the in-use value (used by ARP, I believe)? Or maybe have a
/dev/eth/n/MAC_in_use?

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: [3com905b freeze Alpha SMP 2.4.2] FullDuplex issue ?

2001-05-02 Thread Bogdan Costescu

On Wed, 2 May 2001, Cabaniols, Sebastien wrote:

> I insert the 3c59x module with debug=7.

Why? debug=7 is the highest debug level and produces _lots_ of debug data
under high network activity. Do you have problems when insmod-ing without
any options, and are using a higher debug level just to see what's going on?

> The first of the above machines launching the get freezes.

Why do you believe that the card/driver is responsible for the freeze ?
The outputs that you provided show no problems to me.

A duplex mismatch would not freeze a computer. You would get crappy
transfer rates, usually some error messages from the driver, but
everything should otherwise work. To verify the media settings, you might
want to use mii-diag (from ftp.scyld.com).

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: [RFC] Configuring synchronous interfaces in Linux

2000-12-01 Thread Bogdan Costescu

On Fri, 1 Dec 2000, Chris Wedgwood wrote:

> Actually; Ethernet badly needs something like this too. I would kill
> to be able to do something like:
>
>   ifconfig eth0 speed 100 duplex full

Even if you are thinking about Ethernet only, it's not easy to do it. Most
modern NICs have MII transceivers, where media setting is more or less
following a standard. All drivers written by Donald Becker and probably
everything derived from them support MII get/set operations from
user-space through ioctls, using mii-diag (from ftp.scyld.com).
But there are NICs which do not have MII transceivers and media
setting/selection is NIC-specific. Take a look at the media specific
module options for several drivers (e.g. 3c59x and tulip) and you'll see
what I'm talking about.

Moreover, with the proposed ifconfig interface, there is a problem: do you
want the media setting to be locked? Quite a lot of NICs can do
NWAY autonegotiation, or the driver can go through the available modes
trying to get one working. So if you say "I want to use this speed", do
you want to use exactly that speed, or just give it as a starting
point for a driver that can decrease the speed if it's not able to
get it? (The example is Ethernet-specific, but the idea is not.)

And finally (also Ethernet-specific): some devices don't like forced
media settings when they support autonegotiation. Look at the recent tulip
archives for examples.

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: Preallocated skb's?

2000-09-15 Thread Bogdan Costescu

On Fri, 15 Sep 2000, jamal wrote:

> Only the timer runs at HZ granularity ;-<

Some cards provide their own high resolution timers; latest 3Com cards
provide several with different purposes (none currently used). The
question is how many of these also provide the Rx early interrupts.
You also mentioned an auto-tunable Rx mitigation scheme. How do you
implement it without using hardware timers ?

> 20Msec is probably too much time. If my math is not wrong, 1 bit time in
> a 100Mps is 1 ns; 64 bytes is 512ns.

I think you are off by a factor of 10 here: 1 bit time at 100 Mbps
should be 10 ns, so 64 bytes take 5.12 us (u=micro). Anyway, this is
comparable with the time needed to reach the ISR, so you can have several
(but a small number of) packets already waiting for processing.
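
The arithmetic is easy to check mechanically. A small sketch, with helper
names made up for this example and integer division that is exact for the
usual 10/100/1000 Mbit/s rates:

```c
#include <assert.h>
#include <stdint.h>

/* Time on the wire for one bit, in nanoseconds, at a rate in Mbit/s.
 * One Mbit/s means one bit per microsecond, i.e. 1000 ns per bit. */
static uint64_t bit_time_ns(uint64_t mbps)
{
    assert(mbps > 0);
    return 1000 / mbps;
}

/* Time on the wire for a frame of the given size, in nanoseconds
 * (preamble and inter-frame gap are not counted). */
static uint64_t frame_time_ns(uint64_t bytes, uint64_t mbps)
{
    assert(mbps > 0);
    return bytes * 8 * 1000 / mbps;
}
```

bit_time_ns(100) gives 10 and frame_time_ns(64, 100) gives 5120, i.e.
5.12 us. The quoted figures of 1 ns and 512 ns are what the same formulas
give for 1000 Mbit/s.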

> You use the period(5-10micros), while waiting
> for full packet arrival, to make the route decision (lookup etc).
> i.e this will allow for a better FF; it will not offload things.

But by doing this you span several layers; it's not driver-specific
anymore.

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]

Re: Preallocated skb's?

2000-09-15 Thread Bogdan Costescu


On Thu, 14 Sep 2000, jamal wrote:

> If i remember correctly some of the 3coms still give this 'mid-interupt',
> no? It could useful to just say quickly read the header and make routing
> decisions as in fast routing but not under heavy load.

The 3Com cards can generate this interrupt, however this is not used in
current 3c59x.c. I suggested this to Andrew, but he is already worried
about the current interrupt rate and unhappy that 3Com cards do not
provide hardware support for Rx mitigation.

An idea might be to combine Rx early interrupts with some kind of
software timer-based mitigation. IMHO this has 2 advantages:
- because of the overhead that Andrew pointed out, by the time the CPU
reaches the ISR code and the skbuff allocation is done, the entire packet
might already be transferred; however, a check is needed to ensure
that the packet was not dropped by the hardware, so that you don't try to
fit a packet into an skbuff sized for the previous packet (in case several
packets can be transferred during the "overhead" time)
- under load, because interrupts occur anyway (the Rx early ones), you
don't lose anything in terms of latency.

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]
