Re: [patch] ipv4: fix lock usage in udp_ioctl
On Thu, Jun 15, 2006 at 08:28:07AM +0200, Ingo Molnar wrote:
>
> * Herbert Xu <[EMAIL PROTECTED]> wrote:
>
> > This is bogus. These two locks belong to two different queues and
> > they never intersect.
>
> yeah - qeth does its own skb-queue management here, and it's done in an
> irq-safe manner.
>
> Heiko, in qeth_main.c, could you do something like:
>
> + static struct lockdep_type_key qdio_out_skb_queue_key;
>
> ...
>   skb_queue_head_init(&card->qdio.out_qs[i]->bufs[j].
>                       skb_list);
> + lockdep_reinit_key(&card->qdio.out_qs[i]->bufs[j].skb_list,
>                      &qdio_out_skb_queue_key)

How about the patch below? The warning goes away and I assume "tmp_list"
needs lockdep_reinit_key too, since it should have the same locking rules
as the rest of qeth's skb-queue management.

From: Heiko Carstens <[EMAIL PROTECTED]>

Avoid false positive illegal lock usage message in qeth driver.

Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
---

 drivers/s390/net/qeth_main.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/drivers/s390/net/qeth_main.c	2006-06-15 08:46:26.0 +0200
+++ b/drivers/s390/net/qeth_main.c	2006-06-15 08:29:58.0 +0200
@@ -85,6 +85,8 @@ static debug_info_t *qeth_dbf_qerr = NUL
 
 DEFINE_PER_CPU(char[256], qeth_dbf_txt_buf);
 
+static struct lockdep_type_key qdio_out_skb_queue_key;
+
 /**
  * some more definitions and declarations
  */
@@ -3230,6 +3232,9 @@ qeth_alloc_qdio_buffers(struct qeth_card
 				&card->qdio.out_qs[i]->qdio_bufs[j];
 			skb_queue_head_init(&card->qdio.out_qs[i]->bufs[j].
 					    skb_list);
+			lockdep_reinit_key(
+				&card->qdio.out_qs[i]->bufs[j].skb_list.lock,
+				&qdio_out_skb_queue_key);
 			INIT_LIST_HEAD(&card->qdio.out_qs[i]->bufs[j].ctx_list);
 		}
 	}
@@ -5273,6 +5278,7 @@ qeth_free_vlan_buffer(struct qeth_card *
 	struct sk_buff_head tmp_list;
 
 	skb_queue_head_init(&tmp_list);
+	lockdep_reinit_key(&tmp_list.lock, &qdio_out_skb_queue_key);
 	for(i = 0; i < QETH_MAX_BUFFER_ELEMENTS(card); ++i){
 		while ((skb = skb_dequeue(&buf->skb_list))){
 			if (vlan_tx_tag_present(skb) &&
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
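
Why a separate key silences the warning can be shown with a toy model. The sketch below is plain userspace C, not kernel code and not the lock validator's real implementation; every name in it (toy_class, toy_lock, toy_acquire, toy_scenario) is hypothetical. It assumes only what the thread states: the validator tracks usage per lock class, and two queues that never intersect were being filed under the same class.

```c
#include <stdbool.h>

/* Toy model of a validator that tracks usage per lock *class*, not per
 * lock instance. All names are hypothetical illustrations. */
struct toy_class {
    bool used_in_hardirq;  /* class was taken inside a hardirq handler */
    bool used_hardirq_on;  /* class was taken with hardirqs enabled */
};

struct toy_lock {
    struct toy_class *class;  /* the "key" this lock is filed under */
};

/* Record one acquisition. Returns false once the class has been seen both
 * inside a hardirq and with hardirqs enabled (the reported usage clash). */
static bool toy_acquire(struct toy_lock *lock, bool in_hardirq)
{
    if (in_hardirq)
        lock->class->used_in_hardirq = true;
    else
        lock->class->used_hardirq_on = true;
    return !(lock->class->used_in_hardirq && lock->class->used_hardirq_on);
}

/* The driver queue is locked from hardirq context, the socket queue with
 * hardirqs enabled. Returns true if the toy validator stays quiet. */
static bool toy_scenario(bool separate_keys)
{
    struct toy_class skb_queue_class = { false, false };
    struct toy_class qdio_out_class = { false, false };
    struct toy_lock socket_queue = { &skb_queue_class };
    struct toy_lock driver_queue = {
        separate_keys ? &qdio_out_class : &skb_queue_class
    };
    bool ok = true;

    ok = toy_acquire(&driver_queue, true) && ok;
    ok = toy_acquire(&socket_queue, false) && ok;
    return ok;
}
```

With a shared class, toy_scenario(false) reports a clash even though the two locks are different objects; with a dedicated class for the driver queue, toy_scenario(true) stays quiet.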
Re: [PATCH 2.6.17-rc6] Remove Prism II support from Orinoco
Hi,

Pavel Roskin wrote:
> The patch in question was never submitted to the orinoco mailing list.
> I believe any such changes should be discussed by people using the
> driver and participating in its development. It's not some minor change
> or API update.

I'm sorry for not submitting/CCing this to the orinoco mailing list, I
should have.

> I'm ready to consider disabling some ID's conditionally, primarily for
> systems that cannot use udev. But it's far from the top of my TODO
> list. And I'm not sure it would actually help users of desktop
> distributions.

Well, that would be very nice, thanks!
But do you think that it will be enough? I mean, orinoco_{pci,plx,tmd}
exist only to support Prism2 chipsets. HostAP is a rather big and complete
driver for Prism2/2.5/3.0 chipsets; we're going to have another one based
on the dscape stack too. Do you think that there's any point in having the
orinoco driver support this chipset any more?

I don't mean to disrespect your work -- orinoco has served us for quite
a few years, but HostAP seems like a better alternative for these cards.
Please don't get offended by this -- I'm sure you've spent quite some time
on the particular code.

Regarding the disabling of IDs, I could prepare a patch for orinoco_cs
that would disable Prism2 support via a configuration option.
Would that be helpful/acceptable?

Best regards,
Faidon
Re: [patch] ipv4: fix lock usage in udp_ioctl
* Herbert Xu <[EMAIL PROTECTED]> wrote:

> This is bogus. These two locks belong to two different queues and
> they never intersect.

yeah - qeth does its own skb-queue management here, and it's done in an
irq-safe manner.

Heiko, in qeth_main.c, could you do something like:

+ static struct lockdep_type_key qdio_out_skb_queue_key;

...
  skb_queue_head_init(&card->qdio.out_qs[i]->bufs[j].
                      skb_list);
+ lockdep_reinit_key(&card->qdio.out_qs[i]->bufs[j].skb_list,
                     &qdio_out_skb_queue_key)

	Ingo
Re: [PATCH 2.6.17-rc6] Remove Prism II support from Orinoco
On Mon, 2006-06-12 at 17:10 -0700, Jesse Brandeburg wrote:
> my problem is that for my prism 2 adapter both drivers are loaded at
> which point neither of them works. I'm running FC5, and i have to
> keep removing the orinoco*.ko files to keep them from loading, so I'm
> all for this patch.

I believe the right solution would be to do it in userspace. The kernel
should not be deciding which driver is _better_ for the device. I have
yet to see any serious argument for why the kernel should be doing it.

As for the non-working driver, this should be reported with sufficient
details. I haven't seen any detailed reports of this problem.

-- 
Regards,
Pavel Roskin
Re: [PATCH 2.6.17-rc6] Remove Prism II support from Orinoco
Hello, John!

On Mon, 2006-06-12 at 11:24 -0400, John W. Linville wrote:
> On Mon, Jun 12, 2006 at 01:49:54AM +0300, Faidon Liambotis wrote:
>
> > Having two drivers supporting the same set of hardware seems pretty
> > pointless to me. Plus, it confuses hotplugging/automatic detection.
>
> This subject comes-up from time to time. In fact, I'm pretty sure
> it came-up very recently w.r.t. orinoco and hostap.
>
> The consensus seems to be that drivers should have IDs for all devices
> they support, even if that means that some devices are supported by
> multiple drivers. This leaves the choice of which driver to use in
> the hands of the user and/or distro.
>
> If the Orinoco guys want this patch, I'll consider it. Otherwise,
> I'm not inclined to take it.

I really appreciate your position in this regard.

The patch in question was never submitted to the orinoco mailing list.
I believe any such changes should be discussed by people using the
driver and participating in its development. It's not some minor change
or API update.

I'm ready to consider disabling some ID's conditionally, primarily for
systems that cannot use udev. But it's far from the top of my TODO
list. And I'm not sure it would actually help users of desktop
distributions.

-- 
Regards,
Pavel Roskin
Re: [patch] e100 statistic value "rx_bytes" error
Wei Dong wrote:
> Hi All:
> When I test linux kernel(2.6.9-16), I found that maybe there is a
> bug in e100 driver. See function e100_rx_indicate() at line 1847:
> nic->net_stats.rx_bytes += actual_size;
>
> Here, actual_size is the actual size of an Ethernet frame sans FCS. And
> the e100 driver gets it from skb. Because "rx_bytes" is a statistic value
> for a NIC, I think rx_bytes should include the FCS (4 bytes).
>
> The following is the patch for the function in e100.c

This is definitely not an issue, and I'm not for changing this:

It has always been like this. Many drivers do it this way, mostly those
without real hardware counters (I count half a dozen or so at first
glance).

On top of that we would be changing statistics numbers after x years of
e100 driver. I'm sure everyone doing real performance work will frown
upon this.

Next, it's unlikely that every driver (or worse, every NIC in hardware)
accounts for FCS in the rx_bytes count. It really wouldn't surprise me if
a driver (or chip) got this wrong here or there.

Bottom line is that for e100, it's well known and easy to see that
e100 is counting skb sizes. That's consistent and I think we should keep
it that way.

Auke

PS please cc the driver maintainers when you post patches to a specific
driver.

> diff -ruN old/drivers/net/e100.c new/drivers/net/e100.c
> --- old/drivers/net/e100.c	2006-03-20 13:53:29.0 +0800
> +++ new/drivers/net/e100.c	2006-06-15 11:16:04.0 +0800
> @@ -1844,7 +1844,8 @@
>  		dev_kfree_skb_any(skb);
>  	} else {
>  		nic->net_stats.rx_packets++;
> -		nic->net_stats.rx_bytes += actual_size;
> +		/* Don't forget FCS */
> +		nic->net_stats.rx_bytes += actual_size + 4;
>  		nic->netdev->last_rx = jiffies;
>  		netif_receive_skb(skb);
>  		if(work_done)
>
> BR.
> Weidong
>
> Signed-off-by: Weidong <[EMAIL PROTECTED]>
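
The two accounting policies argued over in this thread differ by a fixed four bytes per frame. A minimal userspace sketch of both follows; struct rx_stats and rx_account are hypothetical stand-ins, not the e100 driver's data structures, and ETH_FCS_LEN is defined locally for the example.

```c
#include <stdint.h>

#define ETH_FCS_LEN 4  /* Ethernet frame check sequence, in bytes */

/* Hypothetical stand-in for a driver's receive counters. */
struct rx_stats {
    uint64_t rx_packets;
    uint64_t rx_bytes;
};

/* Account one received frame. skb_len is the frame length as handed to the
 * stack, i.e. without the FCS. count_fcs selects between counting skb
 * sizes only (0) and counting on-wire bytes including the FCS (1). */
static void rx_account(struct rx_stats *stats, uint32_t skb_len, int count_fcs)
{
    stats->rx_packets++;
    stats->rx_bytes += skb_len + (count_fcs ? ETH_FCS_LEN : 0);
}
```

For a minimum-size frame, skb_len is 60 and the on-wire size is 64, so the two policies drift apart by 4 bytes for every packet received.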
[patch] e100 statistic value "rx_bytes" error
Hi All:
While testing Linux kernel 2.6.9-16, I found what may be a bug in the
e100 driver. See function e100_rx_indicate() at line 1847:
nic->net_stats.rx_bytes += actual_size;

Here, actual_size is the actual size of an Ethernet frame sans FCS, and
the e100 driver gets it from the skb. Because "rx_bytes" is a statistic
value for a NIC, I think rx_bytes should include the FCS (4 bytes).

The following is the patch for the function in e100.c

diff -ruN old/drivers/net/e100.c new/drivers/net/e100.c
--- old/drivers/net/e100.c	2006-03-20 13:53:29.0 +0800
+++ new/drivers/net/e100.c	2006-06-15 11:16:04.0 +0800
@@ -1844,7 +1844,8 @@
 		dev_kfree_skb_any(skb);
 	} else {
 		nic->net_stats.rx_packets++;
-		nic->net_stats.rx_bytes += actual_size;
+		/* Don't forget FCS */
+		nic->net_stats.rx_bytes += actual_size + 4;
 		nic->netdev->last_rx = jiffies;
 		netif_receive_skb(skb);
 		if(work_done)

BR.
Weidong

Signed-off-by: Weidong <[EMAIL PROTECTED]>
Re: [PATCH] Make in-kernel hostap less annoying
On Mon, Jun 12, 2006 at 03:13:02PM -0400, Kyle McMartin wrote:

> Most users don't want their kern.log/dmesg filled with
> debugging gibberish, and could turn it on if prompted.
>
> ( Example:
> wifi0: TXEXC - status=0x0004 ([Discon]) tx_control=000c
>    retry_count=0 tx_rate=0 fc=0x0108 (Data::0 ToDS)
>    A1=00:0f:66:43:d7:0a A2=00:05:3c:06:63:01 A3=33:33:00:00:00:16
>    A4=00:00:00:00:00:00 )

I agree with removing these by default. However, I would prefer to do
this in a more selective manner than disabling all debugging information
at build time. This would probably involve going through all debug
messages using this mechanism and selecting whether they are reasonable
to enable by default or not and ideally doing this as a run-time option.

> Also make hostap default to managed mode, instead of master mode, which
> has bitten a few users expecting it to behave like the orinoco driver
> it is replacing.

NAK. Host AP has been configured to use master mode by default for the
past six years and that is what most users would expect it to continue
to do. I do understand that this default differs from all drivers that
do not support AP mode, but I think it is too late to change this now.
The default could change once Host AP gets replaced with a
net/d80211-based implementation for Prism2/2.5/3, but I would not change
this for the Host AP driver.

> Two minor things I've been carrying around in my personal tree
> for quite some time. (This is only relevant to the in-kernel driver,
> I see no reason to change the out-of-tree driver.)

That would be even more confusing for the default mode.. I believe that
both versions should continue to use Master mode as the default unless
overridden by user.

-- 
Jouni Malinen                                            PGP id EFC895FA
Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)
On Wed, 2006-06-14 at 11:57 +0100, Alan Cox wrote:
> The other problem I see with this code is it is very tightly tied to ATM
> cell sizes, not to solving the generic question of packetisation.

Others have made this point also. I can't speak for Jesper,
but I did consider making it generic. The issue was that
doing so would add more code, but I don't personally know
of any real world situation that would use the generic
solution. I didn't fancy the thought of arguing on these
lists for code that no one would actually use.

If someone could put up their hand and say "Hey, I need
this," then expanding the patch to accommodate them would
be a pleasure. I like generic code too.

Russell
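
For readers wondering what "tied to ATM cell sizes" means in arithmetic terms, here is a small sketch. It is not taken from the patch under discussion. It assumes AAL5 framing (an 8-byte trailer, padding to whole 48-byte cell payloads, 53 bytes on the wire per cell); the function name atm_wire_len and the encap_overhead parameter are hypothetical.

```c
#include <stdint.h>

#define ATM_CELL_SIZE     53  /* bytes on the wire per cell */
#define ATM_CELL_PAYLOAD  48  /* payload bytes carried per cell */
#define AAL5_TRAILER_LEN   8  /* AAL5 trailer appended to each packet */

/* Bytes a packet occupies on an ATM link, assuming AAL5 framing plus
 * encap_overhead bytes of link encapsulation (a per-setup assumption). */
static uint32_t atm_wire_len(uint32_t pkt_len, uint32_t encap_overhead)
{
    uint32_t pdu = pkt_len + encap_overhead + AAL5_TRAILER_LEN;
    /* Round up to a whole number of cells; the last cell is padded. */
    uint32_t cells = (pdu + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;

    return cells * ATM_CELL_SIZE;
}
```

Under these assumptions a 40-byte packet fits one cell (53 bytes on the wire) while a 41-byte packet needs two (106 bytes), which is the step behavior a scheduler that only looks at packet length does not see.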
Re: [PATCH 1/2] e1000: fix netpoll with NAPI
On Wed, 14 Jun 2006, Neil Horman wrote:

> Hey, as promised, I've done some rudimentary performance benchmarking on
> various ways that we have talked about to solve this problem. As I
> previously mentioned

We see the same results here, Neil. However, we've got a much less
invasive patch undergoing internal review, and which we will post to
netdev once everybody gets happy with it.

Basically, we just do our NAPI scheduling on the "real" netdev structure
instead of our polling netdev, in the case where we only have one RX
queue. Since this is the case for all our currently-shipping parts under
Linux, netpoll works again across the board.

It's a short-term fix because we do want to support multiple queues going
forward, but for now we need to get everybody working.

One of our engineers (on the I/O AT team) has been tasked with modifying
the Linux kernel to properly support multiple hardware queues (both TX
and RX). We'll make sure that he looks at the netpoll interface as part
of that process.

Stay tuned for our impending patch.

-Mitch
Re: [patch] ipv4: fix lock usage in udp_ioctl
Heiko Carstens <[EMAIL PROTECTED]> wrote:
>
> As reported by the lock validator:
>
> [ BUG: illegal lock usage! ]
>
> illegal {in-hardirq-W} -> {hardirq-on-W} usage.
> syslogd/739 [HC0[0]:SC0[1]:HE1:SE0] takes:
>  (&list->lock){++..}, at: [<002e36d6>] udp_ioctl+0x96/0x100
> {in-hardirq-W} state was registered at:
>   [<00062128>] lock_acquire+0x9c/0xc0
>   [<0036209e>] _spin_lock_irqsave+0x66/0x84
>   [<002912ce>] skb_dequeue+0x32/0xb0
>   [<00263160>] qeth_qdio_output_handler+0x3e8/0xf8c
>   [<00219fdc>] tiqdio_thinint_handler+0xde0/0x2234
>   [<0020448c>] do_adapter_IO+0x5c/0xa8
>   [<0020842c>] do_IRQ+0x13c/0x18c
>   [<000208a2>] io_no_vtime+0x16/0x1c
>   [<0001978c>] cpu_idle+0x1d0/0x20c

This is bogus. These two locks belong to two different queues and
they never intersect.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: tcp_slow_start_after_idle
From: Rick Jones <[EMAIL PROTECTED]>
Date: Wed, 14 Jun 2006 09:46:58 -0700

> Also, does the congestion window "time out" or does it decay?

The modification made to the cwnd is indeed a decay function, but the
event is a time out, and it is also termed a restart in other writings
and contexts. I think it's all the same. :)

> > +/* By default, RFC2861 behavior. */
> > +int sysctl_tcp_slow_start_after_idle = 1;
> > +
>
> Is this a candidate for "readmostly?"

All the networking sysctls are, we should do a sweep over
them at some point. Thanks for reminding me.
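
The "decay function triggered by a time out" can be modeled in a few lines. The following is a userspace sketch of the halve-per-RTO idea associated with RFC 2861, not the kernel's code; cwnd_after_idle and its parameters are hypothetical names, and the flag slow_start_after_idle merely stands in for the sysctl quoted above.

```c
#include <stdint.h>

/* Stand-in for the sysctl from the patch: 1 keeps the decay behavior,
 * 0 leaves the window untouched after an idle period. */
static int slow_start_after_idle = 1;

/* Return the congestion window (in segments) to use after the connection
 * has been idle for idle_ms, given the current retransmission timeout.
 * restart_cwnd is the floor the window is never decayed below. */
static uint32_t cwnd_after_idle(uint32_t cwnd, uint32_t restart_cwnd,
                                int64_t idle_ms, int64_t rto_ms)
{
    if (!slow_start_after_idle || rto_ms <= 0)
        return cwnd;
    if (restart_cwnd > cwnd)
        restart_cwnd = cwnd;
    /* Each pass consumes one RTO of idle time; the window is halved as
     * long as idle time remains and the floor has not been reached. */
    while ((idle_ms -= rto_ms) > 0 && cwnd > restart_cwnd)
        cwnd >>= 1;
    return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

With a 200 ms RTO, a window of 64 segments and a floor of 4, an idle period just over three RTOs leaves a window of 8, while a very long idle period decays all the way to the floor. An application that sends a burst at intervals longer than the RTO therefore restarts from a small window each time unless the flag is cleared.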
Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
On Wed, Jun 14, 2006 at 10:47:20PM +0200, Francois Romieu wrote:
> Grant Grundler <[EMAIL PROTECTED]> :
> [...]
> > I'm not keen on adding more code to tulip_interrupt() routine
> > for something that rarely happens (compared to IRQs) and is handled
> > outside the interrupt routine. I'm pretty sure stopping interrupts
> > before stopping DMA is sufficient.
> > Can you show an example where it doesn't work?
>
> Shared irq.
>
> The device has not quiesced, the kernel stops listening to it and the
> neighbor device receives a late interruption from the network device.

I thought we'd worked through that already:
	http://www.spinics.net/lists/netdev/msg05902.html

Patch v3 takes care of that problem. The first step in the sequence
is to mask IRQs on the tulip. The "neighbor" device sharing the IRQ
will not see any interrupts from the tulip after that.

thanks,
grant
Re: tcp_slow_start_after_idle
From: Zach Brown <[EMAIL PROTECTED]>
Date: Wed, 14 Jun 2006 10:09:52 -0700

> Nice, thanks for the heads-up. I'll pass the notice on to the guys who
> were asking about this in that thread.

Which Wall Street brokerage firm was it? :-)

That's basically who wants this stuff, people doing financial
transactions. They seem to open up a connection, and just
blast out data periodically (with frequency > RTO, which is
the whole problem) and they want good latency results from
that.
[Ubuntu PATCH] IRDA: Add some IBM think pads
From: Ben Collins <[EMAIL PROTECTED]>

[UBUNTU:nsc-ircc] Add some IBM think pads

Add Thinkpad T60/X60/Z60/T43/R52 Infrared driver support.

http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=7b8d2713435a9fb69719a282ba75e117f3f76a5b

Signed-off-by: Ben Collins <[EMAIL PROTECTED]>
---

--- a/drivers/net/irda/nsc-ircc.c
+++ b/drivers/net/irda/nsc-ircc.c
@@ -115,8 +115,12 @@ static nsc_chip_t chips[] = {
 	/* Contributed by Jan Frey - IBM A30/A31 */
 	{ "PC8739x", { 0x2e, 0x4e, 0x0 }, 0x20, 0xea, 0xff,
 	  nsc_ircc_probe_39x, nsc_ircc_init_39x },
-	{ "IBM", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf4, 0xff,
-	  nsc_ircc_probe_39x, nsc_ircc_init_39x },
+	/* IBM ThinkPads using PC8738x (T60/X60/Z60) */
+	{ "IBM-PC8738x", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf4, 0xff,
+	  nsc_ircc_probe_39x, nsc_ircc_init_39x },
+	/* IBM ThinkPads using PC8394T (T43/R52/?) */
+	{ "IBM-PC8394T", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf9, 0xff,
+	  nsc_ircc_probe_39x, nsc_ircc_init_39x },
 	{ NULL }
 };
 
[Ubuntu PATCH] Make tulip driver not handle Davicom NICs
Make tulip driver not handle Davicom NICs, let dmfe take over

Reference: https://launchpad.net/bugs/48287

Source URL of Patch:
http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=1804482911a71bee9114cae1c2079507a38e9e7f

--- linux-2.6.17-rc5/drivers/net/tulip/tulip_core.c	2006-06-05 09:20:30.0 +0800
+++ ubuntu-kernel/drivers/net/tulip/tulip_core.c	2006-06-05 09:56:55.0 +0800
@@ -223,8 +223,12 @@
 	{ 0x1259, 0xa120, PCI_ANY_ID, PCI_ANY_ID, 0, 0, COMET },
 	{ 0x11F6, 0x9881, PCI_ANY_ID, PCI_ANY_ID, 0, 0, COMPEX9881 },
 	{ 0x8086, 0x0039, PCI_ANY_ID, PCI_ANY_ID, 0, 0, I21145 },
+	/* dmfe module seems to handle these better. See:
+	 * https://launchpad.net/bugs/48287 */
+#if 0
 	{ 0x1282, 0x9100, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DM910X },
 	{ 0x1282, 0x9102, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DM910X },
+#endif
 	{ 0x1113, 0x1216, PCI_ANY_ID, PCI_ANY_ID, 0, 0, COMET },
 	{ 0x1113, 0x1217, PCI_ANY_ID, PCI_ANY_ID, 0, 0, MX98715 },
 	{ 0x1113, 0x9511, PCI_ANY_ID, PCI_ANY_ID, 0, 0, COMET },
Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
On Wed, Jun 14, 2006 at 03:51:37PM -0400, Jeff Garzik wrote:
> You need to turn off the thing that generates work (DMA engine), before
> turning off the thing that reaps work (irq handler).
...
> It should be completely obvious that the chip is still generating
> work...

Yes, I agree it still generates work, i.e. we can still RX packets.
But those will get discarded anyway.
In other words, if work is generated and I don't know it and
don't care, was it really work? :)

> You don't want to leave the hardware in a position where it has
> unacknowledged events.

Ok. Let me repeat two questions I asked a while ago:

| Are you worried about a masked, pending interrupt causing
| problems when the driver is re-opened or resumed?

The answer seems to be "Yes". And we will disagree on that
since I've proven it's not a problem. And it can't be a
problem anyone else has seen since it's been this way for 5+ years.

| If firmware left the device in that state at boot time wouldn't
| the driver be required to handle it?

Can you comment on this please?

thanks,
grant
[Ubuntu PATCH] forcedeth: Let the driver work when no PHY is found
From: Ben Collins <[EMAIL PROTECTED]>

[UBUNTU:forcedeth] Let the driver work when no PHY is found

This matches breezy behavior.

Reference: https://launchpad.net/products/launchpad/+bug/45257

http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=38c1aaedc1f907e138698e54ceadeb9ae560b0d7

Signed-off-by: Ben Collins <[EMAIL PROTECTED]>
---

--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -2582,14 +2582,13 @@ static int __devinit nv_probe(struct pci
 		np->phy_oui = id1 | id2;
 		break;
 	}
-	if (i == 33) {
+
+	/* Let the damn card work if it can */
+	if (i == 33)
 		printk(KERN_INFO "%s: open: Could not find a valid PHY.\n",
 		       pci_name(pci_dev));
-		goto out_freering;
-	}
-
-	/* reset it */
-	phy_init(dev);
+	else
+		phy_init(dev);
 
 	/* set default link speed settings */
 	np->linkspeed = NVREG_LINKSPEED_FORCE|NVREG_LINKSPEED_10;
[Ubuntu PATCH] Broadcom wireless patch, PCIE/Mactel support
From: Matthew Garrett <[EMAIL PROTECTED]> Broadcom wireless patch, PCIE/Mactel support http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=1373a8487e911b5ee204f4422ddea00929c8a4cc This patch adds support for PCIE cores to the bcm43xx driver. This is needed for wireless to work on the Intel imacs. I've submitted it to bcm43xx upstream. (cherry picked from d88edf6a433074323a1805365a8dfc9c26fceae3 commit) (cherry picked from 7dbd83ed3255fde4371edcbb6ad1d30f3e6ddf08 commit) --- --- a/drivers/net/wireless/bcm43xx/bcm43xx.h +++ b/drivers/net/wireless/bcm43xx/bcm43xx.h @@ -202,6 +202,8 @@ #define BCM43xx_COREID_USB20_HOST 0x819 #define BCM43xx_COREID_USB20_DEV0x81a #define BCM43xx_COREID_SDIO_HOST0x81b +#define BCM43xx_COREID_PCIE0x820 +#define BCM43xx_COREID_CHIPCOMMON_NEW 0x900 /* Core Information Registers */ #define BCM43xx_CIR_BASE0xf00 --- a/drivers/net/wireless/bcm43xx/bcm43xx_main.c +++ b/drivers/net/wireless/bcm43xx/bcm43xx_main.c @@ -130,6 +130,8 @@ MODULE_PARM_DESC(fwpostfix, "Postfix for { PCI_VENDOR_ID_BROADCOM, 0x4301, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Broadcom 4307 802.11b */ { PCI_VENDOR_ID_BROADCOM, 0x4307, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, + /* Broadcom 4312 80211a/b/g */ + { PCI_VENDOR_ID_BROADCOM, 0x4312, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0}, /* Broadcom 4318 802.11b/g */ { PCI_VENDOR_ID_BROADCOM, 0x4318, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Broadcom 4319 802.11a/b/g */ @@ -2580,7 +2582,8 @@ static int bcm43xx_probe_cores(struct bc core_vendor = (sb_id_hi & 0x) >> 16; /* if present, chipcommon is always core 0; read the chipid from it */ - if (core_id == BCM43xx_COREID_CHIPCOMMON) { + if (core_id == BCM43xx_COREID_CHIPCOMMON || + core_id == BCM43xx_COREID_CHIPCOMMON_NEW) { chip_id_32 = bcm43xx_read32(bcm, 0); chip_id_16 = chip_id_32 & 0x; bcm->core_chipcommon.available = 1; @@ -2614,7 +2617,8 @@ static int bcm43xx_probe_cores(struct bc /* ChipCommon with Core Rev >=4 encodes number of cores, * otherwise consult 
hardcoded table */ - if ((core_id == BCM43xx_COREID_CHIPCOMMON) && (core_rev >= 4)) { + if (((core_id == BCM43xx_COREID_CHIPCOMMON) && (core_rev >= 4)) || +core_id == BCM43xx_COREID_CHIPCOMMON_NEW) { core_count = (chip_id_32 & 0x0F00) >> 24; } else { switch (chip_id_16) { @@ -2686,6 +2690,7 @@ static int bcm43xx_probe_cores(struct bc core = NULL; switch (core_id) { case BCM43xx_COREID_PCI: + case BCM43xx_COREID_PCIE: core = &bcm->core_pci; if (core->available) { printk(KERN_WARNING PFX "Multiple PCI cores found.\n"); @@ -2724,6 +2729,7 @@ static int bcm43xx_probe_cores(struct bc case 6: case 7: case 9: + case 10: break; default: printk(KERN_ERR PFX "Error: Unsupported 80211 core revision %u\n", @@ -3002,7 +3008,7 @@ static int bcm43xx_setup_backplane_pci_c if (err) goto out; - if (bcm->core_pci.rev < 6) { + if (bcm->core_pci.rev < 6 && bcm->core_pci.id != BCM43xx_COREID_PCIE) { value = bcm43xx_read32(bcm, BCM43xx_CIR_SBINTVEC); value |= (1 << backplane_flag_nr); bcm43xx_write32(bcm, BCM43xx_CIR_SBINTVEC, value); @@ -3024,7 +3030,7 @@ static int bcm43xx_setup_backplane_pci_c value |= BCM43xx_SBTOPCI2_PREFETCH | BCM43xx_SBTOPCI2_BURST; bcm43xx_write32(bcm, BCM43xx_PCICORE_SBTOPCI2, value); - if (bcm->core_pci.rev < 5) { + if (bcm->core_pci.rev < 5 && bcm->core_pci.id != BCM43xx_COREID_PCIE) { value = bcm43xx_read32(bcm, BCM43xx_CIR_SBIMCONFIGLOW); value |= (2 << BCM43xx_SBIMCONFIGLOW_SERVICE_TOUT_SHIFT) & BCM43xx_SBIMCONFIGLOW_SERVICE_TOUT_MASK; @@ -3351,7 +3357,7 @@ static int bcm43xx_read_phyinfo(struct b bcm->ieee->freq_band = IEEE80211_24GHZ_BAND; break; case BCM43xx_PHYTYPE_G: - if (phy_rev > 7) + if (phy_rev > 8) phy_rev_ok = 0; bcm->ieee->modulation = IEEE80211_OFDM_MODULATION | IEEE80211_CCK_MODULATION; - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch] ipv4: fix lock usage in udp_ioctl
From: Heiko Carstens <[EMAIL PROTECTED]>
Date: Wed, 14 Jun 2006 21:43:05 +0200

> From: Heiko Carstens <[EMAIL PROTECTED]>
>
> Fix lock usage in udp_ioctl().
>
> Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>

More likely the qeth driver shouldn't call into the socket
code in hardware interrupt context. From your logs that's
what it seems is happening.

The socket receive queue should only be touched in software
interrupt context, never in hardware interrupt context. That's
why the locking does BH disabling at best.
type of sadb_x_kmprivate_reserved in pfkeyv2.h
Hi David and folks,

In include/linux/pfkeyv2.h, is the type 'u_int32_t' for
sadb_x_kmprivate_reserved intentional or just an error while bringing
in the PF_KEY IPsec extensions from KAME?

struct sadb_x_kmprivate {
	uint16_t	sadb_x_kmprivate_len;
	uint16_t	sadb_x_kmprivate_exttype;
	u_int32_t	sadb_x_kmprivate_reserved;	<
} __attribute__((packed));

This breaks ipsec-tools builds. How does the __BIT_TYPES_DEFINED define
work? It seems that u_int32_t does not get defined in
include/linux/types.h when building a userland program such as
ipsec-tools.

An easy fix is to change the type to uint32_t. Patch attached.

Thanks.
- Tushar

Source: MontaVista Software, Inc.
MR: 19039
Type: Defect Fix
Disposition: needs submitting to kernel.org
Signed-off-by: Tushar Gohad <[EMAIL PROTECTED]>
Description:
    When bringing over the PF_KEY extensions for IPsec from the KAME
    stack, folks probably forgot to change this one field to type
    uint32_t. Or otherwise. This is the easiest fix, and a harmless one.

Index: linux-p4/include/linux/pfkeyv2.h
===================================================================
--- linux-p4.orig/include/linux/pfkeyv2.h
+++ linux-p4/include/linux/pfkeyv2.h
@@ -159,7 +159,7 @@ struct sadb_spirange {
 struct sadb_x_kmprivate {
 	uint16_t	sadb_x_kmprivate_len;
 	uint16_t	sadb_x_kmprivate_exttype;
-	u_int32_t	sadb_x_kmprivate_reserved;
+	uint32_t	sadb_x_kmprivate_reserved;
 } __attribute__((packed));
 /* sizeof(struct sadb_x_kmprivate) == 8 */
 
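
A quick way to confirm that the type change is layout-neutral is a compile-time size check in userspace. The sketch below copies the struct from the patch and relies only on <stdint.h>; it assumes a GCC- or Clang-compatible compiler for __attribute__((packed)) and a C11 compiler for _Static_assert, and it is not part of the submitted patch.

```c
#include <stdint.h>

/* Userspace copy of the struct from the patch, using the C99 fixed-width
 * type uint32_t from <stdint.h> in place of u_int32_t. */
struct sadb_x_kmprivate {
    uint16_t sadb_x_kmprivate_len;
    uint16_t sadb_x_kmprivate_exttype;
    uint32_t sadb_x_kmprivate_reserved;
} __attribute__((packed));

/* Mirrors the comment in the header: the struct must stay 8 bytes. */
_Static_assert(sizeof(struct sadb_x_kmprivate) == 8,
               "sadb_x_kmprivate must be 8 bytes");
```

Because the check runs at compile time, a translation unit that includes only <stdint.h> either builds with the expected layout or fails loudly, with no dependence on which headers happen to provide u_int32_t.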
Re: [RFC/PATCH 1/2] in-kernel sockets API
On Wed, 2006-06-14 at 10:48 -0700, Daniel Phillips wrote:
>
> Did we settle the question of whether these particular exports should be
> EXPORT_SYMBOL_GPL?

When I submitted this patch, I didn't really think about the different
ways to export these symbols. I simply used the EXPORT_SYMBOL() that is
used by all the other exports in net/socket.c, including
kernel_sendmsg() and kernel_recvmsg().

I am OK with either option (EXPORT_SYMBOL or EXPORT_SYMBOL_GPL) and I
will leave it to David Miller to make that decision at this point.

Thanks
Sridhar
Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
Grant Grundler <[EMAIL PROTECTED]> :
[...]
> I'm not keen on adding more code to tulip_interrupt() routine
> for something that rarely happens (compared to IRQs) and is handled
> outside the interrupt routine. I'm pretty sure stopping interrupts
> before stopping DMA is sufficient.
> Can you show an example where it doesn't work?

Shared irq.

The device has not quiesced, the kernel stops listening to it and the
neighbor device receives a late interruption from the network device.

-- 
Ueimor
Re: [PATCH 1/2] e1000: fix netpoll with NAPI
On Mon, Jun 12, 2006 at 02:06:00PM -0400, Neil Horman wrote:
> On Mon, Jun 12, 2006 at 09:42:14AM -0700, Mitch Williams wrote:
> > On Sun, 2006-06-11 at 17:13 -0700, Neil Horman wrote:
> > > Any further thoughts on this guys? I still think my last solution
> > > solves all of the netpoll problems, and isn't going to have any
> > > noticeable impact on performance.
> > >
> > I haven't had time to evaluate performance on your patch (sorry!), but
> > after thinking about it, I agree that it should not have any noticeable
> > impact. OTOH, performance tuning is a funny thing, and things you think
> > won't cause problems often do.
> >
> That's ok, I just didn't hear out of anyone on Friday, so I was curious as to
> where we were on this. I don't have the ability to do any real world
> performance testing here, but I'll try to record the run time of the interrupt
> routine on a limited number of frames here.
>

Hey, as promised, I've done some rudimentary performance benchmarking on the various ways that we have talked about to solve this problem. As I previously mentioned, I didn't have the equipment to do any real full scale testing here, so what I did was take a read of the real time counter at the start and end of the e1000_intr routine with various patches applied, and I recorded the number of ticks elapsed on the tsc during its run. I did this on my single cpu x86_64 machine here, using the latest unpatched e1000 driver as a base, and then comparing it to the e1000 driver using my patch and, separately, with a patch that spinlocks the e1000_clean_rx_irq routine (so as to serialize the critical section that would otherwise be subject to data corruption). Here are my results:

Base line:
Avg. 8145 Ticks on the tsc.

With my patch: http://marc.theaimsgroup.com/?l=linux-netdev&m=114970807606096&w=2
Avg. 8159 Ticks on the tsc. (+0.17% increase)

With a spinlock added to e1000_clean_rx_irq:
Avg. 8238 Ticks on the tsc. (+1.1% increase)

If you like I can send you the time stamp counter patch that I used, as well as the patch which adds spinlocks to the clean routine. Note that the free running counter values will vary, so you probably want to look at the percentage increase. Either way, I think both solutions have very little impact on interrupt run time. Given that my patch (granted, using my test methodology here) is the faster of the two, and arguably the more correct in terms of not using the poll controller method to receive frames, we should go with that patch.

Thoughts/opinions?

Neil

-- /*** *Neil Horman *Software Engineer *gpg keyid: 1024D / 0x92A74FA1 - http://pgp.mit.edu ***/
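The percentages quoted above follow directly from the tick averages. A minimal sketch of that arithmetic in plain userspace C; the helper names `pct_increase` and `print_overheads` are made up for illustration, and only the three tick averages come from the message:

```c
#include <stdio.h>

/* Relative increase of `measured` over `baseline`, in percent. */
static double pct_increase(double baseline, double measured)
{
	return (measured - baseline) / baseline * 100.0;
}

/* Print the overhead of each variant relative to the unpatched driver. */
static void print_overheads(void)
{
	const double base = 8145.0;    /* unpatched e1000_intr, avg TSC ticks */
	const double patched = 8159.0; /* with the netpoll patch */
	const double locked = 8238.0;  /* with a spinlock in e1000_clean_rx_irq */

	printf("patch:    +%.2f%%\n", pct_increase(base, patched));
	printf("spinlock: +%.2f%%\n", pct_increase(base, locked));
}
```

Calling `print_overheads()` from a `main()` gives roughly +0.17% and +1.14%, in line with the rounded figures in the message. Since the raw TSC values vary from run to run, the relative figure is the one worth comparing.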
Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
Grant Grundler wrote:
> On Wed, Jun 14, 2006 at 11:03:48AM -0400, Jeff Garzik wrote:
> > Grant Grundler wrote:
> > > Switching the order to be:
> > > 	tulip_stop_rxtx(tp);	/* Stop DMA */
> > > 	free_irq (dev->irq, dev); /* no more races after this */
> > >
> > > still leaves us open to IRQs being delivered _after_ we've stopped DMA.
> >
> > Correct. And that is the preferred, natural, logical, obvious order:
> >
> > 1) Turn things off.
> > 2) Wait for activity to cease.
>
> Patch v3 does this in two stages:
> 1) turn off tulip interrupts
> 2) free_irq() calls synchronize_irq() to handle pending IRQs
>
> then calls tulip_stop_rxtx() which:
> 1) tells tulip to stop DMA
> 2) poll until DMA completes
>
> After this we can free remaining resources.

You need to turn off the thing that generates work (DMA engine), before turning off the thing that reaps work (irq handler).

> > > That in turn allows the interrupt handler to re-enable DMA again.
> >
> > Then that would be a problem to solve... Some interrupt handlers will
> > test netif_running() or a driver-specific shutting-down flag,
> > specifically to avoid such behaviors.
>
> I'm not keen on adding more code to tulip_interrupt() routine
> for something that rarely happens (compared to IRQs) and is handled
> outside the interrupt routine. I'm pretty sure stopping interrupts
> before stopping DMA is sufficient.
> Can you show an example where it doesn't work?

It should be completely obvious that the chip is still generating work... You don't want to leave the hardware in a position where it has unacknowledged events.

	Jeff
[patch] ipv4: fix lock usage in udp_ioctl
From: Heiko Carstens <[EMAIL PROTECTED]>

Fix lock usage in udp_ioctl().

Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
---

udp_poll() seems to have the same problem, right?

As reported by the lock validator:

[ BUG: illegal lock usage! ]
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
syslogd/739 [HC0[0]:SC0[1]:HE1:SE0] takes:
 (&list->lock){++..}, at: [<002e36d6>] udp_ioctl+0x96/0x100
{in-hardirq-W} state was registered at:
  [<00062128>] lock_acquire+0x9c/0xc0
  [<0036209e>] _spin_lock_irqsave+0x66/0x84
  [<002912ce>] skb_dequeue+0x32/0xb0
  [<00263160>] qeth_qdio_output_handler+0x3e8/0xf8c
  [<00219fdc>] tiqdio_thinint_handler+0xde0/0x2234
  [<0020448c>] do_adapter_IO+0x5c/0xa8
  [<0020842c>] do_IRQ+0x13c/0x18c
  [<000208a2>] io_no_vtime+0x16/0x1c
  [<0001978c>] cpu_idle+0x1d0/0x20c
irq event stamp: 1694
hardirqs last enabled at (1693): [<003629c2>] _spin_unlock_irqrestore+0x92/0xa8
hardirqs last disabled at (1692): [<00362074>] _spin_lock_irqsave+0x3c/0x84
softirqs last enabled at (1682): [<0028c7c4>] release_sock+0xe4/0xf4
softirqs last disabled at (1694): [<00361f7e>] _spin_lock_bh+0x2e/0x70

other info that might help us debug this:
no locks held by syslogd/739.

stack backtrace:
0fd6c148 0de2f960 0002 0de2fa00 0de2f978 0de2f978 0001737c 0de2f960 000c 0de2f960 0de2f9d0 0036fe70 0001737c 0de2f960 0de2f9b0
Call Trace:
([<0001730a>] show_trace+0x166/0x16c)
 [<000173d6>] show_stack+0xc6/0xf8
 [<00017436>] dump_stack+0x2e/0x3c
 [<0005f978>] print_usage_bug+0x23c/0x250
 [<000607cc>] mark_lock+0x594/0x714
 [<000613be>] __lock_acquire+0x252/0xf20
 [<00062128>] lock_acquire+0x9c/0xc0
 [<00361fa8>] _spin_lock_bh+0x58/0x70
 [<002e36d6>] udp_ioctl+0x96/0x100
 [<002eadd6>] inet_ioctl+0x72/0x11c
 [<002893f2>] sock_ioctl+0x1ca/0x2c0
 [<000c13ee>] do_ioctl+0x56/0xe0
 [<000c14f2>] vfs_ioctl+0x7a/0x384
 [<000c184e>] sys_ioctl+0x52/0x84
 [<000e80a2>] do_ioctl32_pointer+0x2a/0x3c
 [<000e55c8>] compat_sys_ioctl+0x168/0x378
 [<00020338>] sysc_noemu+0x10/0x16

diffstat:
 net/ipv4/udp.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 3f93292..b15a17b 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -740,7 +740,7 @@ int udp_ioctl(struct sock *sk, int cmd,
 			unsigned long amount;
 
 			amount = 0;
-			spin_lock_bh(&sk->sk_receive_queue.lock);
+			spin_lock_irq(&sk->sk_receive_queue.lock);
 			skb = skb_peek(&sk->sk_receive_queue);
 			if (skb != NULL) {
 				/*
@@ -750,7 +750,7 @@ int udp_ioctl(struct sock *sk, int cmd,
 				 */
 				amount = skb->len - sizeof(struct udphdr);
 			}
-			spin_unlock_bh(&sk->sk_receive_queue.lock);
+			spin_unlock_irq(&sk->sk_receive_queue.lock);
 			return put_user(amount, (int __user *)arg);
 		}
Re: Remove Prism II support from Orinoco
On Tue, Jun 13, 2006 at 09:24:49PM +0300, Jar wrote: > It always loads itself with or without blacklist. That's why I have to > do 'rm -f orinoco*.* && depmod -a' when the new kernel arrives. Seems > that users are directed to use insecure orinoco (wep) driver rather than > secure hostap (wpa/wpa2,tkip,aes) driver for their prism2 hardware. The hostap drivers are also much better behaved for rfmon than the orinoco drivers for prism2. -m -- Mike Kershaw/Dragorn <[EMAIL PROTECTED]> GPG Fingerprint: 3546 89DF 3C9D ED80 3381 A661 D7B2 8822 738B BDB1 Bus Error at 008BE426 while reading byte from DEADBEEF in User data space
Re: [PATCH 6/5] rt2x00: per-queue TX flow control
Hi, On Wednesday 14 June 2006 18:36, Jiri Benc wrote: > This is a patch for rt2x00 driver to do TX flow control. > > It is compile-tested only. > > Signed-off-by: Jiri Benc <[EMAIL PROTECTED]> I'll put my comments for the rt2400pci driver only, since the same changes are made for each rt2x00 driver. > --- dscape.orig/drivers/net/wireless/d80211/rt2x00/rt2400pci.c > +++ dscape/drivers/net/wireless/d80211/rt2x00/rt2400pci.c > @@ -1001,7 +1001,7 @@ rt2400pci_txdone(void *data) > struct data_entry *entry; > struct txd *txd; > int tx_status; > - int ack; > + int ack, wake, queue; > > while (!rt2x00_ring_empty(ring)) { > entry = rt2x00_get_data_entry_done(ring); > @@ -1048,7 +1048,11 @@ rt2400pci_txdone(void *data) > rt2x00_set_field32(&txd->word0, TXD_W0_VALID, 0); > entry->skb = NULL; > > + wake = rt2x00_ring_full(ring); > + queue = entry->tx_status.control.queue; > rt2x00_ring_index_done_inc(ring); > + if (wake) > + ieee80211_wake_queue(ring->net_dev, queue); > } This will not give the correct result I fear, and it would cause (unwanted) overhead of checking if the queue was full. Queue_full can be checked when the loop starts, and the waking of the queue can best be done after freeing all entries and after the second check if the queue is still not full. 
(There is no guarantee the while() loop will end while there are free entries in the queue) > /* > @@ -1541,24 +1545,31 @@ rt2400pci_tx(struct net_device *net_dev, > ERROR("Attempt to send packet over invalid queue %d.\n" > "Please file bug report to %s.\n", > control->queue, DRV_PROJECT); > - return NET_XMIT_DROP; > + dev_kfree_skb_any(skb); > + return NETDEV_TX_OK; > } > > - if (rt2x00_ring_full(ring)) > - return NET_XMIT_DROP; > + if (rt2x00_ring_full(ring)) { > + ieee80211_stop_queue(net_dev, control->queue); > + return NETDEV_TX_BUSY; > + } > > entry = rt2x00_get_data_entry(ring); > txd = entry->desc_addr; > > - if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) > - || rt2x00_get_field32(txd->word0, TXD_W0_VALID)) > - return NET_XMIT_DROP; > + if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) || > + rt2x00_get_field32(txd->word0, TXD_W0_VALID)) { > + ieee80211_stop_queue(net_dev, control->queue); > + return NETDEV_TX_BUSY; > + } Not sure if I am happy with this one. When this check is made, it occurs after the ring_full check. This means that when this statement is true, the queue is not full. Instead it means that something has gone wrong with the queue and this should not have happened. But this is not really a problem in the patch itself, just a problem I only now recognize thanks to your patch. ;) For the time being I'll add a debug message, but I need to find a method to clean up the ring if this occurs. This check currently does not happen in the rt2570 and rt73 USB drivers, but it is safer to add it there as well. 
> memcpy(entry->data_addr, skb->data, skb->len); > rt2400pci_write_tx_desc(rt2x00pci, txd, skb, control); > entry->skb = skb; > > rt2x00_ring_index_inc(ring); > + if (rt2x00_ring_full(ring)) > + ieee80211_stop_queue(net_dev, control->queue); > > rt2x00_register_read(rt2x00pci, TXCSR0, &reg); > if (control->queue == IEEE80211_TX_QUEUE_DATA0) > @@ -1668,6 +1679,7 @@ rt2400pci_open(struct net_device *net_de > rt2x00_register_write(rt2x00pci, CSR8, reg); > > SET_FLAG(rt2x00pci, RADIO_ENABLED); > + ieee80211_start_queues(net_dev); > > return 0; Based on Jiri's patch for rt2x00 driver to do TX flow control. Signed-off-by Ivo van Doorn <[EMAIL PROTECTED]> --- diff --git a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c index 8b856dd..946cf86 100644 --- a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c +++ b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c @@ -1002,6 +1002,12 @@ rt2400pci_txdone(void *data) struct txd *txd; int tx_status; int ack; + int ring_full; + + /* +* Store the current status of the ring. +*/ + ring_full = rt2x00_ring_full(ring); while (!rt2x00_ring_empty(ring)) { entry = rt2x00_get_data_entry_done(ring); @@ -1062,6 +1068,16 @@ rt2400pci_txdone(void *data) rt2x00pci->scan->status = SCANNING_READY; complete(&rt2x00pci->scan->completion); } + + /* +* If the data ring was full before the txdone hand
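The flow-control scheme discussed in this message (stop the queue when the ring fills, record whether the ring was full before reaping, and wake the queue once afterwards if room has appeared) can be modelled outside the kernel. The following is only a toy userspace sketch under stated assumptions: `toy_ring`, `TOY_RING_SIZE` and all `toy_*` helpers are hypothetical names, not rt2x00 or d80211 symbols, and the `queue_stopped` flag merely stands in for ieee80211_stop_queue() and ieee80211_wake_queue().

```c
#include <stdbool.h>

#define TOY_RING_SIZE 4 /* hypothetical ring depth */

/* Toy TX ring: `used` counts entries currently owned by the hardware. */
struct toy_ring {
	int used;
	bool queue_stopped; /* stands in for the stack's stopped-queue state */
};

static bool toy_ring_full(const struct toy_ring *r)
{
	return r->used == TOY_RING_SIZE;
}

static bool toy_ring_empty(const struct toy_ring *r)
{
	return r->used == 0;
}

/* Transmit path: returns false where a driver would return NETDEV_TX_BUSY. */
static bool toy_tx(struct toy_ring *r)
{
	if (toy_ring_full(r)) {
		r->queue_stopped = true;
		return false;
	}
	r->used++;
	/* Stop early so the stack does not hand over a frame we cannot take. */
	if (toy_ring_full(r))
		r->queue_stopped = true;
	return true;
}

/* Completion path: sample the state first, reap, then wake at most once. */
static void toy_txdone(struct toy_ring *r, int completed)
{
	bool was_full = toy_ring_full(r);

	while (completed-- > 0 && !toy_ring_empty(r))
		r->used--;

	/* Second check: wake only if reaping actually made room. */
	if (was_full && !toy_ring_full(r))
		r->queue_stopped = false;
}
```

The design point is that the wake decision is taken once per completion run instead of once per reaped entry, and only after confirming that the ring is no longer full.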
Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
On Wed, Jun 14, 2006 at 11:03:48AM -0400, Jeff Garzik wrote: > Grant Grundler wrote: > >Switching the order to be: > >tulip_stop_rxtx(tp);/* Stop DMA */ > >free_irq (dev->irq, dev); /* no more races after this */ > > > >still leaves us open to IRQs being delivered _after_ we've stopped DMA. > > Correct. And that is the preferred, natural, logical, obvious order: > > 1) Turn things off. > 2) Wait for activity to cease. Patch v3 does this in two stages: 1) turn off tulip interrupts 2) free_irq() calls synchronize_irq() to handle pending IRQs then calls tulip_stop_rxtx() which: 1) tells tulip to stop DMA 2) poll until DMA completes After this we can free remaining resources. > >That in turn allows the interrupt handler to re-enable DMA again. > > Then that would be a problem to solve... Some interrupt handlers will > test netif_running() or a driver-specific shutting-down flag, > specifically to avoid such behaviors. I'm not keen on adding more code to tulip_interrupt() routine for something that rarely happens (compared to IRQs) and is handled outside the interrupt routine. I'm pretty sure stopping interrupts before stopping DMA is sufficient. Can you show an example where it doesn't work? This is important since I'm going to propose a new Documentation/pci.txt based on this experience. thanks, grant
RE: [openib-general] [PATCH v2 1/2] iWARP Connection Manager.
[EMAIL PROTECTED] wrote: > On Tue, 2006-06-13 at 16:46 -0500, Steve Wise wrote: >> On Tue, 2006-06-13 at 14:36 -0700, Sean Hefty wrote: > Er...no. It will lose this event. Depending on the event...the > carnage varies. We'll take a look at this. > This behavior is consistent with the Infiniband CM (see drivers/infiniband/core/cm.c function cm_recv_handler()). But I think we should at least log an error because a lost event will usually stall the rdma connection. >>> >>> I believe that there's a difference here. For the Infiniband CM, an >>> allocation error behaves the same as if the received MAD were lost >>> or dropped. Since MADs are unreliable anyway, it's not so much that >>> an IB CM event gets lost, as it doesn't ever occur. A remote CM >>> should retry the send, which hopefully allows the > connection to make forward progress. >>> >> >> hmm. Ok. I see. I misunderstood the code in cm_recv_handler(). >> >> Tom and I have been talking about what we can do to not drop the >> event. Stay tuned. > > Here's a simple solution that solves the problem: > > For any given cm_id, there are a finite (and small) number of > outstanding CM events that can be posted. So we just > pre-allocate them when the cm_id is created and keep them on > a free list hanging off of the cm_id struct. Then the event > handler function will pull from this free list. > > The only case where there is any non-finite issue is on the > passive listening cm_id. Each incoming connection request > will consume a work struct. So based on client connects, we > could run out of work structs. > However, the CMA has the concept of a backlog, which is > defined as the max number of pending unaccepted connection > requests. So we allocate these work structs based on that > number (or a computation based on that number), and if we run > out, we simply drop the incoming connection request due to > backlog overflow (I suggest we log the drop event too). 
> When a MPA connection request is dropped, the (IETF > conforming) MPA client will eventually time out the > connection and the consumer can retry. > > Comments? > If the IWCM cannot accept a Connection Request event from the driver then *someone* should generate a non-peer reject MPA Response frame. Since the IWCM does not have the resources to relay the event, it probably does not have the resources to generate the MPA Response frame either. So simply returning an "I'm Busy" error and expecting the driver to handle it makes sense to me.
Re: [RFC/PATCH 1/2] in-kernel sockets API
Daniel, On Wed, 14 Jun 2006, Daniel Phillips wrote: > > Speaking as a former member of a "grey market" binary module vendor that > came in from the cold I can assure you that the distinction between EXPORT > and EXPORT_GPL _is_ meaningful. That tainted flag makes it extremely > difficult to do deals with mainstream Linux companies and there is always > the fear that it will turn into a legal problem. The latter bit tends to > make venture capitalists nervous. > EXPORT_SYMBOL_GPL and the Tainted flag have nothing to do with each other.
Re: [RFC/PATCH 1/2] in-kernel sockets API
Hi Harald,

You wrote:
> On Tue, Jun 13, 2006 at 02:12:41PM -0700, I wrote:
> > This has the makings of a nice stable internal kernel api. Why do we want
> > to provide this nice stable internal api to proprietary modules?
>
> because there is IMHO legally nothing we can do about it anyway.

Speaking as a former member of a "grey market" binary module vendor that came in from the cold I can assure you that the distinction between EXPORT and EXPORT_GPL _is_ meaningful. That tainted flag makes it extremely difficult to do deals with mainstream Linux companies and there is always the fear that it will turn into a legal problem. The latter bit tends to make venture capitalists nervous.

That said, the EXPORT_GPL issue is not about black and white legal issues, it is about gentle encouragement. In this case we are offering a clumsy, on-the-metal, guaranteed-to-change-and-make-you-edit-code interface to non-GPL-compatible modules and a decent, stable (in the deserves to live sense) interface for the pure of heart. Gentle encouragement at exactly the right level.

Did we settle the question of whether these particular exports should be EXPORT_SYMBOL_GPL?

Regards,

Daniel
Interrupt handling on SMP system
Hi all, I have a few questions about how interrupt handling works on Linux on SMP systems. 1. Which processor gets the interrupt when a new packet arrives? Is there any policy mechanism which can steer the interrupt to an idle processor, etc.? Do the processors share an interrupt line, with an interrupt controller assigning it to a specific processor? 2. When a softirq is scheduled by the driver and picked up by a ksoftirqd thread, does it get processed entirely on the same processor? Does the ksoftirqd/i thread process only the packets in the i-th processor's queue? Regards, Majid
Re: tcp_slow_start_after_idle
David Miller wrote: > Bringing back up this old topic: > > http://marc.theaimsgroup.com/?l=linux-netdev&m=114564962420171&w=2 > > I've decided to add this tunable to the net-2.6.18 tree, patch below. Nice, thanks for the heads-up. I'll pass the notice on to the guys who were asking about this in that thread. - z
Re: tcp_slow_start_after_idle
+tcp_slow_start_after_idle - BOOLEAN
+	If set, provide RFC2861 behavior and time out the congestion
+	window after an idle period. An idle period is defined at
+	the current RTO. If unset, the congestion window will not
+	be timed out after an idle period.
+	Default: 1

Did you mean "defined as" rather than "defined at?" Also, does the congestion window "time out" or does it decay? Perhaps:

tcp_slow_start_after_idle - BOOLEAN
	If set, provide RFC2861 behavior and decay the congestion
	window after the connection has been idle for the connection's
	current RTO. If unset, the congestion window will not decay
	when the connection has been idle.
	Default: 1

diff --git a/include/net/tcp.h b/include/net/tcp.h
index de88c54..bfc71f9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -227,6 +227,7 @@ extern int sysctl_tcp_abc;
 extern int sysctl_tcp_mtu_probing;
 extern int sysctl_tcp_base_mss;
 extern int sysctl_tcp_workaround_signed_windows;
+extern int sysctl_tcp_slow_start_after_idle;

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 743016b..be6d929 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -59,6 +59,9 @@ int sysctl_tcp_tso_win_divisor = 3;
 int sysctl_tcp_mtu_probing = 0;
 int sysctl_tcp_base_mss = 512;
+/* By default, RFC2861 behavior. */
+int sysctl_tcp_slow_start_after_idle = 1;
+

Is this a candidate for "readmostly?"

rick jones
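On the "time out or decay" question, the general idea behind RFC 2861 is that the congestion window is halved for each RTO the connection stays idle, down to a restart window, instead of being discarded in one step. A hedged userspace sketch of that idea follows; the function `cwnd_after_idle`, its parameters, and the abstract time units are all assumptions made for illustration, and this is not the kernel's actual code.

```c
#include <stdint.h>

/*
 * Toy model of congestion-window decay after idle.
 *   cwnd          congestion window before the idle period (segments)
 *   restart_cwnd  floor for the decay, e.g. the initial window
 *   idle, rto     idle time and retransmission timeout, same time unit
 *   enabled       models tcp_slow_start_after_idle (0 = never decay)
 */
static uint32_t cwnd_after_idle(uint32_t cwnd, uint32_t restart_cwnd,
				int32_t idle, int32_t rto, int enabled)
{
	if (!enabled || rto <= 0)
		return cwnd;

	/* Never grow the window as a side effect of being idle. */
	if (restart_cwnd > cwnd)
		restart_cwnd = cwnd;

	/* Halve once per full RTO of idle time beyond the first. */
	while ((idle -= rto) > 0 && cwnd > restart_cwnd)
		cwnd >>= 1;

	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

With a window of 32 segments, a restart window of 4 and an RTO of 200 time units, 450 units of idle time would leave a window of 8 in this model, while a very long idle period bottoms out at 4. That supports describing the behavior as decay.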
[PATCH 6/5] rt2x00: per-queue TX flow control
This is a patch for rt2x00 driver to do TX flow control. It is compile-tested only. Signed-off-by: Jiri Benc <[EMAIL PROTECTED]> --- drivers/net/wireless/d80211/rt2x00/rt2400pci.c | 26 ++--- drivers/net/wireless/d80211/rt2x00/rt2500pci.c | 26 ++--- drivers/net/wireless/d80211/rt2x00/rt2500usb.c | 18 + drivers/net/wireless/d80211/rt2x00/rt61pci.c | 26 ++--- drivers/net/wireless/d80211/rt2x00/rt73usb.c | 18 + 5 files changed, 85 insertions(+), 29 deletions(-) --- dscape.orig/drivers/net/wireless/d80211/rt2x00/rt2400pci.c +++ dscape/drivers/net/wireless/d80211/rt2x00/rt2400pci.c @@ -1001,7 +1001,7 @@ rt2400pci_txdone(void *data) struct data_entry *entry; struct txd *txd; int tx_status; - int ack; + int ack, wake, queue; while (!rt2x00_ring_empty(ring)) { entry = rt2x00_get_data_entry_done(ring); @@ -1048,7 +1048,11 @@ rt2400pci_txdone(void *data) rt2x00_set_field32(&txd->word0, TXD_W0_VALID, 0); entry->skb = NULL; + wake = rt2x00_ring_full(ring); + queue = entry->tx_status.control.queue; rt2x00_ring_index_done_inc(ring); + if (wake) + ieee80211_wake_queue(ring->net_dev, queue); } /* @@ -1541,24 +1545,31 @@ rt2400pci_tx(struct net_device *net_dev, ERROR("Attempt to send packet over invalid queue %d.\n" "Please file bug report to %s.\n", control->queue, DRV_PROJECT); - return NET_XMIT_DROP; + dev_kfree_skb_any(skb); + return NETDEV_TX_OK; } - if (rt2x00_ring_full(ring)) - return NET_XMIT_DROP; + if (rt2x00_ring_full(ring)) { + ieee80211_stop_queue(net_dev, control->queue); + return NETDEV_TX_BUSY; + } entry = rt2x00_get_data_entry(ring); txd = entry->desc_addr; - if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) - || rt2x00_get_field32(txd->word0, TXD_W0_VALID)) - return NET_XMIT_DROP; + if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) || + rt2x00_get_field32(txd->word0, TXD_W0_VALID)) { + ieee80211_stop_queue(net_dev, control->queue); + return NETDEV_TX_BUSY; + } memcpy(entry->data_addr, skb->data, skb->len); rt2400pci_write_tx_desc(rt2x00pci, txd, skb, 
control); entry->skb = skb; rt2x00_ring_index_inc(ring); + if (rt2x00_ring_full(ring)) + ieee80211_stop_queue(net_dev, control->queue); rt2x00_register_read(rt2x00pci, TXCSR0, &reg); if (control->queue == IEEE80211_TX_QUEUE_DATA0) @@ -1668,6 +1679,7 @@ rt2400pci_open(struct net_device *net_de rt2x00_register_write(rt2x00pci, CSR8, reg); SET_FLAG(rt2x00pci, RADIO_ENABLED); + ieee80211_start_queues(net_dev); return 0; --- dscape.orig/drivers/net/wireless/d80211/rt2x00/rt2500pci.c +++ dscape/drivers/net/wireless/d80211/rt2x00/rt2500pci.c @@ -1089,7 +1089,7 @@ rt2500pci_txdone(void *data) struct data_entry *entry; struct txd *txd; int tx_status; - int ack; + int ack, wake, queue; while (!rt2x00_ring_empty(ring)) { entry = rt2x00_get_data_entry_done(ring); @@ -1136,7 +1136,11 @@ rt2500pci_txdone(void *data) rt2x00_set_field32(&txd->word0, TXD_W0_VALID, 0); entry->skb = NULL; + wake = rt2x00_ring_full(ring); + queue = entry->tx_status.control.queue; rt2x00_ring_index_done_inc(ring); + if (wake) + ieee80211_wake_queue(ring->net_dev, queue); } /* @@ -1664,24 +1668,31 @@ rt2500pci_tx(struct net_device *net_dev, ERROR("Attempt to send packet over invalid queue %d.\n" "Please file bug report to %s.\n", control->queue, DRV_PROJECT); - return NET_XMIT_DROP; + dev_kfree_skb_any(skb); + return NETDEV_TX_OK; } - if (rt2x00_ring_full(ring)) - return NET_XMIT_DROP; + if (rt2x00_ring_full(ring)) { + ieee80211_stop_queue(net_dev, control->queue); + return NETDEV_TX_BUSY; + } entry = rt2x00_get_data_entry(ring); txd = entry->desc_addr; - if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) - || rt2x00_get_field32(txd->word0, TXD_W0_VALID)) - return NET_XMIT_DROP; + if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) || +
Re: [openib-general] [PATCH v2 1/2] iWARP Connection Manager.
On Tue, 2006-06-13 at 16:46 -0500, Steve Wise wrote: > On Tue, 2006-06-13 at 14:36 -0700, Sean Hefty wrote: > > >> Er...no. It will lose this event. Depending on the event...the carnage > > >> varies. We'll take a look at this. > > >> > > > > > >This behavior is consistent with the Infiniband CM (see > > >drivers/infiniband/core/cm.c function cm_recv_handler()). But I think > > >we should at least log an error because a lost event will usually stall > > >the rdma connection. > > > > I believe that there's a difference here. For the Infiniband CM, an > > allocation > > error behaves the same as if the received MAD were lost or dropped. Since > > MADs > > are unreliable anyway, it's not so much that an IB CM event gets lost, as it > > doesn't ever occur. A remote CM should retry the send, which hopefully > > allows > > the connection to make forward progress. > > > > hmm. Ok. I see. I misunderstood the code in cm_recv_handler(). > > Tom and I have been talking about what we can do to not drop the event. > Stay tuned. Here's a simple solution that solves the problem: For any given cm_id, there are a finite (and small) number of outstanding CM events that can be posted. So we just pre-allocate them when the cm_id is created and keep them on a free list hanging off of the cm_id struct. Then the event handler function will pull from this free list. The only case where there is any non-finite issue is on the passive listening cm_id. Each incoming connection request will consume a work struct. So based on client connects, we could run out of work structs. However, the CMA has the concept of a backlog, which is defined as the max number of pending unaccepted connection requests. So we allocate these work structs based on that number (or a computation based on that number), and if we run out, we simply drop the incoming connection request due to backlog overflow (I suggest we log the drop event too). 
When a MPA connection request is dropped, the (IETF conforming) MPA client will eventually time out the connection and the consumer can retry. Comments?
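The pre-allocation idea in this message can be sketched as a fixed pool of work structs threaded onto a free list, where an empty list means the event is dropped and counted. This is a userspace illustration only: `toy_cm_id`, `toy_work`, `TOY_BACKLOG` and the helper functions are hypothetical names, not the IWCM data structures, and real code would also need locking around the list since events arrive in interrupt or callback context.

```c
#include <stddef.h>

#define TOY_BACKLOG 2 /* hypothetical listen backlog */

struct toy_work {
	struct toy_work *next;
	int event; /* placeholder for the CM event payload */
};

struct toy_cm_id {
	struct toy_work pool[TOY_BACKLOG]; /* allocated together with the cm_id */
	struct toy_work *free_list;
	unsigned int dropped; /* events dropped due to backlog overflow */
};

/* Thread every pre-allocated work struct onto the free list. */
static void toy_cm_id_init(struct toy_cm_id *id)
{
	id->free_list = NULL;
	id->dropped = 0;
	for (size_t i = 0; i < TOY_BACKLOG; i++) {
		id->pool[i].next = id->free_list;
		id->free_list = &id->pool[i];
	}
}

/* Event handler side: never allocates; NULL means "drop and count". */
static struct toy_work *toy_get_work(struct toy_cm_id *id, int event)
{
	struct toy_work *w = id->free_list;

	if (!w) {
		id->dropped++;
		return NULL;
	}
	id->free_list = w->next;
	w->next = NULL;
	w->event = event;
	return w;
}

/* Called once the event has been delivered to the consumer. */
static void toy_put_work(struct toy_cm_id *id, struct toy_work *w)
{
	w->next = id->free_list;
	id->free_list = w;
}
```

Because the pool is sized from the backlog, running out of work structs coincides with backlog overflow, which is the one case where dropping a connection request is already expected.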
[PATCH] bcm43xx: use softmac-suggested TX rate
Hi John,

Sorry, took a little bit longer than expected, but here it is. :)
Please queue for 2.6.18.

--

From: Daniel Drake <[EMAIL PROTECTED]>

Use Softmac-suggested TX ratecode: ieee80211softmac_suggest_txrate()

Signed-off-by: Daniel Drake <[EMAIL PROTECTED]>
Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>

Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c
===
--- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c 2006-06-14 16:53:50.0 +0200
+++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c2006-06-14 17:44:23.0 +0200
@@ -296,11 +296,14 @@
 	u16 control = 0;
 	u16 wsec_rate = 0;
 	u16 encrypt_frame;
+	const u16 ftype = WLAN_FC_GET_TYPE(le16_to_cpu(wireless_header->frame_ctl));
+	const int is_mgt = (ftype == IEEE80211_FTYPE_MGMT);
 
 	/* Now construct the TX header. */
 	memset(txhdr, 0, sizeof(*txhdr));
 
-	bitrate = bcm->softmac->txrates.default_rate;
+	bitrate = ieee80211softmac_suggest_txrate(bcm->softmac,
+		is_multicast_ether_addr(wireless_header->addr1), is_mgt);
 	ofdm_modulation = !(ieee80211_is_cck_rate(bitrate));
 	fallback_bitrate = bcm43xx_calc_fallback_rate(bitrate);
 	fallback_ofdm_modulation = !(ieee80211_is_cck_rate(fallback_bitrate));

--
Greetings Michael.
Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL
jamal wrote:
> I have taken linux-kernel off the list.
>
> Russell's site is inaccessible to me (I actually think this is related to
> some DNS issues i may be having) and your masters is too long to spend 2
> minutes and glean it; so here's a question or two for you:
>
> - Have you tried to do a long-lived session such as a large FTP and seen
> how far off the deviation was? That would provide some interesting data
> point.
> - To be a devil's advocate (and not claim there is no issue), where do you
> draw the line with "overhead"?

Me and many others have run a similar hack for years; there is also a userspace project still alive which does the same. The difference is that without it I would need to sacrifice almost half my 288kbit atm/dsl showtime bandwidth to be sure of control. With the modification I can run at 286kbit / 288 and know I will never have jitter worse than the bitrate latency of a mtu packet. The 286 figure was chosen to allow a full buffer to drain, allow for timer inaccuracy, etc. On a p200 with tsc, 2.6.12, it's never gone over for me - though talking of timers I notice on my desktop 2.6.16 I gain 2 minutes a day now.

Andy.
Re: [RFC/PATCH 1/2] in-kernel sockets API
On Wed, Jun 14, 2006 at 04:29:04PM +0200, Erik Mouw wrote: > On Wed, Jun 14, 2006 at 03:30:22PM +0200, Harald Welte wrote: > > On Tue, Jun 13, 2006 at 02:12:41PM -0700, Daniel Phillips wrote: > > > > > This has the makings of a nice stable internal kernel api. Why do we want > > > to provide this nice stable internal api to proprietary modules? > > > > because there is IMHO legally nothing we can do about it anyway. Use of > > an industry-standard API that is provided in multiple operating systems > > is one of the clearest indications of some program _not_ being a > > derivative work. > > IMHO there is no industry-standard API for in-kernel use of sockets. > There is however one for user space. it doesn't matter in what space you are. If the API really is similar enough, then any piece of code (no matter where it was originally intended to run) will be able to work with any such socket API. The whole point of this is: Where is the derivation of an existing work? I can write a program against some BSD socket api somewhere, and I can easily make it use the proposed in-kernel sockets API. No derivation of anything that is inside the kernel and GPL licensed. > (IANAL, etc) Neither am I, but I'm constantly dealing with legal questions related to the GPL while running gpl-violations.org. -- - Harald Welte <[EMAIL PROTECTED]> http://gnumonks.org/ We all know Linux is great...it does infinite loops in 5 seconds. -- Linus
Re: Refactor Netlink connector?
On Wed, 14 Jun 2006, jamal wrote: > > So what's the resolution on this? I actually have some cycles this coming > weekend that i was hoping to spend updating the doc instead. Haven't had a chance to look at it since. -- James Morris <[EMAIL PROTECTED]>
Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL
On Wed, 2006-06-14 at 10:27 -0400, Phillip Susi wrote: > Jesper Dangaard Brouer wrote: > > The Linux traffic's control engine inaccurately calculates > > transmission times for packets sent over ADSL links. For > > some packet sizes the error rises to over 50%. This occurs > > because ADSL uses ATM as its link layer transport, and ATM > > transmits packets in fixed sized 53 byte cells. > > > > I could have sworn that DSL uses its own framing protocol that is > similar to the frame/superframe structure of HDSL ( T1 ) lines and over > that you can run ATM or ethernet. Or is it typically ethernet -> ATM -> > HDSL? Nope, not according to the ADSL standards G.992.1 and G.992.2. > In any case, why does the kernel care about the exact time that the IP > packet has been received and reassembled on the headend? I think you have misunderstood what the rate table does... (There is an explanation in the thesis, page 57, section 6.1.2) http://www.adsl-optimizer.dk/thesis/ -- Med venlig hilsen / Best regards Jesper Brouer ComX Networks A/S Linux Network developer Cand. Scient Datalog / MSc. Author of http://adsl-optimizer.dk
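To make the cell-alignment effect concrete: every ATM cell is 53 bytes on the wire and carries 48 bytes of payload, so a packet occupies a whole number of cells and the last cell is padded. A minimal sketch of the size calculation, assuming a fixed per-packet overhead; `atm_wire_size` is a made-up helper name, and the 8 bytes used in the note below are the AAL5 trailer only (real links add encapsulation overhead that depends on the setup):

```c
#define ATM_CELL_SIZE    53 /* bytes per cell on the wire */
#define ATM_CELL_PAYLOAD 48 /* payload bytes carried per cell */

/*
 * Bytes actually transmitted for a packet of `pkt_len` bytes when
 * `overhead` bytes of per-packet framing are added before the data is
 * cut into cells. The last cell is padded, hence the round-up.
 */
static unsigned int atm_wire_size(unsigned int pkt_len, unsigned int overhead)
{
	unsigned int cells =
		(pkt_len + overhead + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;

	return cells * ATM_CELL_SIZE;
}
```

With an 8-byte overhead, a 40-byte packet fits in one cell (53 bytes on the wire), while a 41-byte packet needs two cells (106 bytes). A scheduler that charges only the IP length misjudges the transmission time around such cell boundaries, and the error is relatively largest for small packets.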
Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
Grant Grundler wrote:
> On Tue, Jun 13, 2006 at 08:33:22PM -0400, Jeff Garzik wrote:
> > Grant Grundler wrote:
> > > o tulip_stop_rxtx() has to be called _after_ free_irq().
> > >   ie. v2 patch didn't fix the original race condition
> > >   and when under test, dies about as fast as the original code.
> >
> > You made the race window smaller, but it's still there. The chip's DMA
> > engines should be stopped before you unregister the interrupt handler.
>
> Switching the order to be:
> 	tulip_stop_rxtx(tp);	/* Stop DMA */
> 	free_irq (dev->irq, dev); /* no more races after this */
>
> still leaves us open to IRQs being delivered _after_ we've stopped DMA.

Correct. And that is the preferred, natural, logical, obvious order:

1) Turn things off.
2) Wait for activity to cease.

> That in turn allows the interrupt handler to re-enable DMA again.

Then that would be a problem to solve... Some interrupt handlers will test netif_running() or a driver-specific shutting-down flag, specifically to avoid such behaviors.

	Jeff
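The "shutting-down flag" mentioned at the end of this message can be illustrated with a toy userspace model. Everything here is hypothetical: `toy_nic` and the `toy_*` functions are not tulip driver code, and the handler that restarts DMA merely stands in for whatever re-arming a real interrupt handler might do.

```c
#include <stdbool.h>

/* Toy device state; not the tulip driver's private structure. */
struct toy_nic {
	bool dma_running;
	bool shutting_down; /* driver-specific shutting-down flag */
};

/* Interrupt handler model: normally re-arms DMA after servicing the chip. */
static void toy_interrupt(struct toy_nic *nic)
{
	if (nic->shutting_down)
		return; /* acknowledge only, never restart DMA during close */
	nic->dma_running = true;
}

/* Step 1 of the close path: turn things off. */
static void toy_stop_dma(struct toy_nic *nic, bool use_flag)
{
	if (use_flag)
		nic->shutting_down = true;
	nic->dma_running = false;
}
```

If a late interrupt arrives after `toy_stop_dma(&nic, true)`, DMA stays off. With `use_flag` false, the same late interrupt switches DMA back on, which is the race the thread is arguing about.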
Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
On Wed, Jun 14, 2006 at 09:05:06AM -0400, Kyle McMartin wrote:
> I think the correct sequence would be:
>
>     reset tulip interrupt mask
>     flush posted write
>
>     synchronize irq   /* make sure we got 'em all */
>     tulip_stop_rxtx   /* turn off dma */
>     free irq          /* bye bye */
>
> The synchronize irq guarantees we shouldn't see another irq
> generated by the card because it was held up somewhere.

Kyle,
synchronize_irq() only guarantees that any currently executing interrupt
handler completes before control is handed back to the caller. It does
not guarantee that IRQ signals still in flight are "flushed".

Remember that IRQ lines are a "sideband" signal and not subject
to PCI data ordering rules.

thanks,
grant
Re: [RFC/PATCH 1/2] in-kernel sockets API
On Wed, Jun 14, 2006 at 03:30:22PM +0200, Harald Welte wrote:
> On Tue, Jun 13, 2006 at 02:12:41PM -0700, Daniel Phillips wrote:
>
> > This has the makings of a nice stable internal kernel api. Why do we want
> > to provide this nice stable internal api to proprietary modules?
>
> because there is IMHO legally nothing we can do about it anyway. Use of
> an industry-standard API that is provided in multiple operating systems
> is one of the clearest indications of some program _not_ being a
> derivative work.

IMHO there is no industry-standard API for in-kernel use of sockets.
There is however one for user space.

Erik (IANAL, etc)

--
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL
Jesper Dangaard Brouer wrote:
> The Linux traffic control engine inaccurately calculates
> transmission times for packets sent over ADSL links. For
> some packet sizes the error rises to over 50%. This occurs
> because ADSL uses ATM as its link layer transport, and ATM
> transmits packets in fixed sized 53 byte cells.

I could have sworn that DSL uses its own framing protocol that is
similar to the frame/superframe structure of HDSL ( T1 ) lines and over
that you can run ATM or ethernet. Or is it typically ethernet -> ATM ->
HDSL?

In any case, why does the kernel care about the exact time that the IP
packet has been received and reassembled on the headend?
Re: http://bugzilla.kernel.org/show_bug.cgi?id=6197
Michael Tokarev wrote:
> Patrick McHardy wrote:
> []
>
>> He patched his kernel with the IMQ device, which is known to cause all
>> kinds of weird problems.
>
> Which problems? Known to whom?

Known to me (who wrote the original implementation of the current
IMQ device) and numerous people who were hit by them. IIRC it does
some invalid skb refcounting hacks which result in crashes in certain
scenarios - but I don't remember the exact details.

> I was considering using imq for our needs (not done yet), and from the
> FAQ at http://www.linuximq.net/faq.html (item #3, "Is it stable?") it
> seems there are no problems except with gre tunnels and locally
> generated traffic...
>
> Googling for "imq linux problem" shows the usual pile of various user
> support questions (how to configure.. what did I do wrong.. etc),
> but nothing relevant.

The lartc list had lots of reports of crashes. I guess "imq crash"
or "imq oops" will give better results.

> So... I'm curious whether the claim on the linuximq.net site about the
> stability is true, or whether there are in fact some real issues...

From what I know these problems haven't been fixed. Current kernels
include Jamal's tc actions and the ifb device, which obsolete IMQ
anyway.
Offering presents for more routing tables!
I asked for this feature several years ago, and evidently it is not
trivial to increase the number of routing tables. But perhaps someone
now has the time & inclination?

I would like to have more (a few thousand) routing tables available in
the kernel so that I can use a routing table for each of my many VLANs.

Currently, the netlink protocol only specifies an 8-bit id for the
routing table:
http://www.faqs.org/rfcs/rfc3549.html  Section 3.1.1

so a new netlink message would need to be created and the 'ip' tool
updated. I think at least a 16-bit identifier should be used, possibly
a full 32 bits so we don't have to revisit this again for a while!

The kernel itself would also need to be modified so that it can have
more routing tables. I realize most people don't need a large number of
tables, so the maximum number should be configurable at either compile
time or run time.

If I remember right, there are certain tables (253 - 255) that are
currently special in the kernel. For complete backwards compatibility,
this hole would probably have to remain as it is, with the new tables
starting at 256.

I would be willing to help test any resulting patches, and can also
offer bribes of money, hardware, beer, etc.

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com
Re: [RFC/PATCH 1/2] in-kernel sockets API
On Tue, Jun 13, 2006 at 02:12:41PM -0700, Daniel Phillips wrote:

> This has the makings of a nice stable internal kernel api. Why do we want
> to provide this nice stable internal api to proprietary modules?

because there is IMHO legally nothing we can do about it anyway. Use of
an industry-standard API that is provided in multiple operating systems
is one of the clearest indications of some program _not_ being a
derivative work.

Whether we like it or not, it doesn't really matter if we export them
GPL-only or not. Anybody using those socket API calls will have an
easy time arguing in favor of non-derivative work.

The GPL doesn't extend beyond what copyright law allows you to do...

--
- Harald Welte <[EMAIL PROTECTED]>  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus
Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late
On Tue, Jun 13, 2006 at 10:44:12PM -0600, Grant Grundler wrote:
> On Tue, Jun 13, 2006 at 08:33:22PM -0400, Jeff Garzik wrote:
> > Grant Grundler wrote:
> > >o tulip_stop_rxtx() has to be called _after_ free_irq().
> > >  ie. v2 patch didn't fix the original race condition
> > >  and when under test, dies about as fast as the original code.
> >
> > You made the race window smaller, but it's still there. The chip's DMA
> > engines should be stopped before you unregister the interrupt handler.
>
> Switching the order to be:
>     tulip_stop_rxtx(tp);        /* Stop DMA */
>     free_irq (dev->irq, dev);   /* no more races after this */
>

I think the correct sequence would be:

    reset tulip interrupt mask
    flush posted write

    synchronize irq   /* make sure we got 'em all */
    tulip_stop_rxtx   /* turn off dma */
    free irq          /* bye bye */

The synchronize irq guarantees we shouldn't see another irq
generated by the card because it was held up somewhere.

Cheers,
	Kyle M.
Re: driver for pptp
On Sat, Jun 03, 2006 at 11:06:19AM +0400, [EMAIL PROTECTED] wrote:
> I have developed the driver for Point-to-Point Tunneling Protocol (PPTP).

great news. something that I always thought of as a nice-to-have.

> I have published the project on http://accel-pptp.sourceforge.net/

Please don't expect Linux Kernel networking developers to actually go to
sourceforge, download and extract code that you want to have
reviewed/submitted.

Please read Documentation/SubmittingPatches (and CodingStyle) and submit
your kernel patch to netdev.

> Hope this driver will go to a kernel tree and will make linux more productive.

not without you pushing it actively and getting through review cycles
(which I hope you will!).

Some initial comments:

1) why wasn't it possible to use the PPPoX infrastructure of the kernel
   which is already being used by PPPoE? Or at least model it somehow
   similar to the existing PPPoE/PPPoX infrastructure?

2) why are you using a timer for asynchronous processing of GRE frames?
   First of all, why does it have to happen asynchronously at all?
   Secondly, why use a timer when there's nothing time related (or am
   I missing something)? If deferred, out-of-context execution is
   required, there are other primitives such as tasklets.

3) you conflict with the ip_gre.c generic GRE encapsulation driver. this
   is because both want to register a proto handler for GRE. Ideally,
   there needs to be another demultiplex between the GRE protocol and
   its users. The code registered for GRE would look at the packet and
   determine whether e.g. it is a PPTP GRE packet and then pass it on to
   the pptp module.

4) your code doesn't look nonlinear skb clean

5) why did you choose to implement /dev/pptp rather than a socket family
   like the existing pppox/pppoe code?

6) lots of codingstyle issues

--
- Harald Welte <[EMAIL PROTECTED]>  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus
Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)
On Wed, 2006-06-14 at 11:57 +0100, Alan Cox wrote:
> On Wed, 2006-06-14 at 11:40 +0200, Jesper Dangaard Brouer wrote:
> > option to calculate traffic transmission times (rate table)
> > over all ATM links, including ADSL, with perfect accuracy.
>
> The other problem I see with this code is it is very tightly tied to ATM
> cell sizes, not to solving the generic question of packetisation.

Well, we did consider doing so, but we thought that it would be harder
to get it into the kernel.

Actually that's the reason for the defines:

 #define ATM_CELL_SIZE    53
 #define ATM_CELL_PAYLOAD 48

Changing these should make it possible to adapt to any other SAR
(Segmentation And Reassembly) link layer.

> I'm
> not sure if that matters but for modern processors I'm also sceptical
> that the clever computation is actually any faster than just doing the
> maths, especially if something cache intensive is also running.

I guess you are referring to the rate table lookup system, which is
based upon array lookups. I do think that the rate table array lookup
system is outdated, as memory access is the bottleneck on modern CPUs.
But it was designed by Alexey a long time ago, when the hardware
restrictions were different. It also avoids floating point operations
in the kernel.

Thanks for your comments.

--
Med venlig hilsen / Best regards
Jesper Brouer
ComX Networks A/S
Linux Network developer
Cand. Scient Datalog / MSc.
Author of http://adsl-optimizer.dk
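For readers unfamiliar with the rate table discussed in this exchange,
the idea can be modelled roughly in userspace as below. This is a
simplified sketch under stated assumptions, not the iproute2 or kernel
implementation: all names (xmit_time_us(), fill_rate_table(),
lookup_time_us(), RTAB_SLOTS) are hypothetical, the 256-slot size simply
mirrors the ptab[256] array visible in the patch, and charging each slot
for its upper bound is a choice made for this sketch. It uses integer
arithmetic only, in the spirit of avoiding floating point.

```c
#include <stdint.h>

#define RTAB_SLOTS 256	/* assumed table size, mirrors ptab[256] */

/* Time in microseconds to send `size` bytes at `rate` bytes/second.
 * `rate` must be non-zero. */
uint32_t xmit_time_us(uint32_t rate, uint32_t size)
{
	return (uint32_t)(((uint64_t)size * 1000000u) / rate);
}

/* Precompute one transmission time per slot. Slot i covers packet
 * sizes from (i << cell_log) up to ((i + 1) << cell_log) - 1; in this
 * sketch every size in the slot is charged for the slot's upper bound. */
void fill_rate_table(uint32_t rtab[RTAB_SLOTS], uint32_t rate,
		     unsigned cell_log)
{
	for (unsigned i = 0; i < RTAB_SLOTS; i++)
		rtab[i] = xmit_time_us(rate, (i + 1) << cell_log);
}

/* Per-packet cost: one shift and one array read, no division.
 * Sizes beyond the table are clamped to the last slot. */
uint32_t lookup_time_us(const uint32_t rtab[RTAB_SLOTS], uint32_t size,
			unsigned cell_log)
{
	uint32_t slot = size >> cell_log;

	if (slot >= RTAB_SLOTS)
		slot = RTAB_SLOTS - 1;
	return rtab[slot];
}
```

The trade-off raised in the thread is visible here: the lookup replaces
a 64-bit multiply and divide with a memory access, which is only a win
if that access is cheap, and it quantises the result to the slot width.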
Problems with xfrm (IPSec) and multicast
Hi, I have configured two Linux PC's to use IPSec to encrypt some mcast traffic, using "ip xfrm". Each PC has two network cards, one connected to a LAN (unencrypted side, also called red side) and one connected to the other node (encrypted side, also called black side). Currently the setup uses static keys in the SA entries, so IKE is not a problem. |<-- RED side -->|<- BLACK network ->|<-- RED side -->| +---+ +-+ +-+ +---+ | LAN A +---+ IPSEC-A +---+ IPSEC-B +---+ LAN B | +---+ +-+ +-+ +---+ 10.0.10.0/24 192.168.0.0/24 10.0.20.0/24 Configuration: Kernel tested: Linux-2.6.16.13 + 2.6.17-rc4 LAN A: 10.0.10.0/24 LAN B: 10.0.20.0/24 IPSEC-A: RED IP: 10.0.10.1, BLACK IP: 192.168.0.1 IPSEC-B: RED IP: 10.0.20.1, BLACK IP: 192.168.0.2 RED mcast group used: 239.192.20.1 BLACK mcast group used: 239.192.10.1 IPSEC-A SA and SP entries: [EMAIL PROTECTED] ~]# ip xfrm state src 192.168.0.1 dst 239.192.10.1 proto esp spi 0x0001 reqid 0 mode tunnel replay-window 4 auth sha1 0x01020301 enc aes 0x0001 encap type espinudp sport 4500 dport 4500 addr 0.0.0.0 [EMAIL PROTECTED] ~]# ip xfrm policy src 0.0.0.0/0 dst 239.192.20.1/32 dir in priority 2147483648 tmpl src 192.168.0.1 dst 239.192.10.1 proto esp reqid 0 mode tunnel src 10.0.10.0/24 dst 239.192.20.1/32 dir out priority 2147483648 tmpl src 192.168.0.1 dst 239.192.10.1 proto esp reqid 0 mode tunnel src 0.0.0.0/0 dst 239.192.20.1/32 dir fwd priority 2147483648 tmpl src 192.168.0.1 dst 239.192.10.1 proto esp reqid 0 mode tunnel (The entries for IPSEC-B is similar, but the address 192.168.0.1 is changed to the IPSEC-B BLACK IP 192.168.0.2). And I have a small userspace app that opens a socket and bind to port 4500 and issue setsockopt (fd, SOL_UDP, UDP_ENCAP) so that the kernel will accept UDP encap ESP packets. 
When I send multicast traffic from IPSEC-A (bound to the RED interface 10.0.10.1) to mcast group 239.192.20.1, the traffic matches the out SP entry and is encrypted according to the SA entry and sent as UDP encap ESP to mcast group 239.192.10.1 on the BLACK network. On IPSEC-B the UDP encap ESP packet is decrypted and is visible for userspace processes. So far so good. But then I start sending similar mcast traffic the other way, but from IPSEC-B (bound to IPSEC-B RED IP 10.0.20.1). This traffic is also encrypted and sent to IPSEC-A. But this packet is not decrypted at IPSEC-A, it seems to disappear. The IP and UDP SNMP counters increase for the received UDP encap ESP packet, but I cannot see what happens to the packet after the UDP layer. Seems like it is dropped somewhere in XFRM? By sending some more packets from IPSEC-B (roughly 5-8 more packets), these packets suddenly starts to be decrypted at IPSEC-A and all is well. Until I start traffic the other way around again, when the same problem occurs at IPSEC-B. So I cannot make encrypted multicast traffic to flow both ways at the same time, and has no clue as to why the first packets after changing direction is dropped somewhere. Anyone have a clue to this observed problem with linux xfrm and multicast or a better solution for encrypted multicast on linux 2.6.x? Any help with this is appreciated and more info (tcpdump, snmp stats and so on) can be obtained if needed. -- Roar Bjørgum Rotvik - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL
On Wed, 2006-06-14 at 08:06 -0400, jamal wrote:
> Russell's site is inaccessible to me (I actually think this is related
> to some DNS issues i may be having)

Strange, I have access to Russell's site. Maybe it's his redirect
feature that confuses your browser, try:
http://ace-host.stuart.id.au/russell/files/tc/tc-atm/

> and your masters is too long to
> spend 2 minutes and glean it; so here's a question or two for you:

Yes, it is quite long and very detailed. But it is worth reading
(... says the author himself ;-))

> - Have you tried to do a long-lived session such as a large FTP and
> seen how far off the deviation was? That would provide some interesting
> data point.

The deviation can be calculated. The impact is of course small for large
packets. But the argument that bulk TCP transfers are not as badly
affected is wrong, because all the TCP ACK packets get the maximum
penalty. On an ADSL link with more than 8 bytes of overhead, a 40 byte
TCP ACK will use more than one ATM cell, causing 2 ATM cells to be sent,
which consume 106 bytes, i.e. 62% overhead. On a small upstream ADSL
line that hurts! (See thesis page 53, table 5.3 "Overhead summary").

> - To be a devil's advocate (and not claim there is no issue), where do
> you draw the line with "overhead"?
> Example the smallest ethernet packet is 64 bytes of which 14 bytes are
> ethernet headers ("overhead" for IP) - and this is not counting CRC etc.
> If you were to set an MTU of say 64 bytes and tried to do a http or ftp,
> how accurate do you think the calculation would be? I would think not
> very different.

I do think we handle this situation, but I'm not quite sure that I fully
understand the question (sorry).

> Does it matter if it is accurate on the majority of the cases?
> - For further reflection: Have you considered the case where the rate
> table has already been considered on some link speed in user space and
> then somewhere post-config the physical link speed changes? This would
> happen in the case where ethernet AN is involved and the partner makes
> some changes (use ethtool).
>
> I would say the last bullet is a more interesting problem than a corner
> case of some link layer technology that has high overhead.

We only claim to do magic on ATM/ADSL links... nothing else ;-)

> Your work would be more interesting if it was generic for many link
> layers instead of just ATM.

Well, we did consider doing so, but we thought that it would be harder
to get it into the kernel.

Actually that's the reason for the defines:

 #define ATM_CELL_SIZE    53
 #define ATM_CELL_PAYLOAD 48

Changing these should make it possible to adapt to any other SAR
(Segmentation And Reassembly) link layer.

> On Wed, 2006-14-06 at 11:40 +0200, Jesper Dangaard Brouer wrote:
> > The Linux traffic control engine inaccurately calculates
> > transmission times for packets sent over ADSL links. For
> > some packet sizes the error rises to over 50%. This occurs
> > because ADSL uses ATM as its link layer transport, and ATM
> > transmits packets in fixed sized 53 byte cells.
> >
> > The following patches to iproute2 and the kernel add an
> > option to calculate traffic transmission times over all
> > ATM links, including ADSL, with perfect accuracy.
> >
> > A longer presentation of the patch, its rationale, what it
> > does and how to use it can be found here:
> >    http://www.stuart.id.au/russell/files/tc/tc-atm/
> >
> > An earlier version of the patch, and a _detailed_ empirical
> > investigation of its effects can be found here:
> >    http://www.adsl-optimizer.dk/
> >
> > The patches are both backwards and forwards compatible.
> > This means unpatched kernels will work with a patched
> > version of iproute2, and an unpatched iproute2 will work
> > on patched kernels.
> >
> >
> > This is a combined effort of Jesper Brouer and Russell Stuart,
> > to get these patches into the upstream kernel.
> >
> > Let the discussion start about what we need to change to get this
> > upstream?
> >
> > We see this as a feature enhancement, and thus hope that it can be
> > queued in davem's net-2.6.18.git tree.
> >
> > ---
> > Regards,
> > Jesper Brouer & Russell Stuart.
> >

Thanks for your comments :-)

--
Med venlig hilsen / Best regards
Jesper Brouer
ComX Networks A/S
Linux Network developer
Cand. Scient Datalog / MSc.
Author of http://adsl-optimizer.dk
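The cell arithmetic behind the 106-byte figure in this message can be
checked with a few lines of C. This is a minimal sketch, not code from
the patch: the function names atm_cells() and atm_wire_bytes() are made
up for illustration, and the 10-byte overhead used in the note below is
an assumed value (the mail only says "more than 8 bytes"). The two
constants are the ones quoted from the patch.

```c
#define ATM_CELL_SIZE    53	/* bytes on the wire per ATM cell */
#define ATM_CELL_PAYLOAD 48	/* payload bytes carried per cell */

/* Number of cells needed for a packet plus its per-packet link
 * overhead: the total is rounded up to a whole number of payloads. */
unsigned int atm_cells(unsigned int pkt_len, unsigned int overhead)
{
	unsigned int len = pkt_len + overhead;

	return (len + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;
}

/* Bytes that actually occupy the link, cell headers included. */
unsigned int atm_wire_bytes(unsigned int pkt_len, unsigned int overhead)
{
	return atm_cells(pkt_len, overhead) * ATM_CELL_SIZE;
}
```

With a 40-byte ACK and an assumed 10 bytes of overhead, atm_cells()
gives 2 and atm_wire_bytes() gives 106, so 66 of the 106 bytes (about
62%) are not IP payload. With exactly 8 bytes of overhead the ACK still
fits in a single cell of 53 bytes, which is why the threshold in the
message is "more than 8".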
Re: netif_tx_disable and lockless TX
On Wed, 2006-31-05 at 19:52 +0200, Robert Olsson wrote: > jamal writes: > > > Latency-wise: TX completion interrupt provides the best latency. > > Processing in the poll() -aka softirq- was almost close to the hardirq > > variant. So if you can make things run in a softirq such as transmit > > one, then the numbers will likely stay the same. > > I don't remember we tried tasklet for TX a la Herbert's suggestion but we > used use tasklets for controlling RX processing to avoid hardirq livelock > in pre-NAPI versions. > Hrm - it may have been a private thing i did then. I could swear we did that experiment together ... Perhaps Herbert's motivation was not really to optimize but rather to get something unstuck in the transmit path state machine maybe in a context of netconsole? The conditions for which that tasklet would even run require a CPU collision to the transmit. Sorry, I didnt quiet follow the motivation/discussion that ended in that patch. > Had variants of tulip driver with both TX cleaning at ->poll and TX > cleaning at hardirq and didn't see any performance difference. The > ->poll was much cleaner but we kept Alexey's original work for tulip. > It certainly is cleaner - but i do recall the hardirq variant had better latency much observable under high packet rates aka small packets. > > Sorry, I havent been following discussions on netchannels[1] so i am not > > qualified to comment on the "replacement" part Dave mentioned earlier. > > What I can say is the tx processing doesnt have to be part of the NAPI > > poll() and still use hardirq. > > Yes true but I see TX numbers with newer boards (wire rate small pakets) > with cleaing in ->poll. Also now linux is very safe in network "overload" > situations. Moving work to hardirq may change that. > Oh, I am not suggesting a change - i am a lot more conservative than that ;-> these areas are delicate (not code-delicate Acme ;->) but rather what seems obvious requires a lot of experimental results first. 
Robert, your transmit results Intel or AMD based? cheers, jamal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Refactor Netlink connector?
So whats the resolution on this? I actually have some cycles this coming weekend that i was hopping to spend updating the doc instead. cheers, jamal On Thu, 2006-01-06 at 10:24 -0400, James Morris wrote: > On Thu, 1 Jun 2006, Thomas Graf wrote: > > > It shouldn't be hard to split what is implemented in nlmsg_route_perms[] > > for NETLINK_ROUTE into the definitions of the generic netlink > > operations, could look like this: > > > > struct genl_ops some_op = { > > [...] > > .perm= NETLINK_GENERIC_SOCKET__NLMSG_READ, > > }; > > We wouldn't need the socket class outside of SELinux, just the perm, so > something like: > > NL_PERM_READ > > > int genl_peek_cmd(struct nlmsghdr *nlh) > > { > > struct genlmsghdr *hdr = nlmsg_data(nlh); > > > > if (nlh->nlmsglen < nlmsg_msg_sizeo(GENL_HDRLEN)) > > return -EINVAL; > > > > return hdr->cmd; > > } > > Unless I'm mistaken, people are already multiplexing commands inside genl > commands (and if so, why even bother with registerable ops?). > > > I'll look at it in more detail soon. > > - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL
I have taken linux-kernel off the list. Russell's site is inaccessible to me (I actually think this is related to some DNS issues i may be having) and your masters is too long to spend 2 minutes and glean it; so heres a question or two for you: - Have you tried to do a long-lived session such as a large FTP and seen how far off the deviation was? That would provide some interesting data point. - To be a devil's advocate (and not claim there is no issue), where do you draw the line with "overhead"? Example the smallest ethernet packet is 64 bytes of which 14 bytes are ethernet headers ("overhead" for IP) - and this is not counting CRC etc. If you were to set an MTU of say 64 bytes and tried to do a http or ftp, how accurate do you think the calculation would be? I would think not very different. Does it matter if it is accurate on the majority of the cases? - For further reflection: Have you considered the case where the rate table has already been considered on some link speed in user space and then somewhere post-config the physical link speed changes? This would happen in the case where ethernet AN is involved and the partner makes some changes (use ethtool). I would say the last bullet is a more interesting problem than a corner case of some link layer technology that has high overhead. Your work would be more interesting if it was generic for many link layers instead of just ATM. cheers, jamal On Wed, 2006-14-06 at 11:40 +0200, Jesper Dangaard Brouer wrote: > The Linux traffic's control engine inaccurately calculates > transmission times for packets sent over ADSL links. For > some packet sizes the error rises to over 50%. This occurs > because ADSL uses ATM as its link layer transport, and ATM > transmits packets in fixed sized 53 byte cells. > > The following patches to iproute2 and the kernel add an > option to calculate traffic transmission times over all > ATM links, including ADSL, with perfect accuracy. 
> > A longer presentation of the patch, its rational, what it > does and how to use it can be found here: >http://www.stuart.id.au/russell/files/tc/tc-atm/ > > A earlier version of the patch, and a _detailed_ empirical > investigation of its effects can be found here: >http://www.adsl-optimizer.dk/ > > The patches are both backwards and forwards compatible. > This means unpatched kernels will work with a patched > version of iproute2, and an unpatched iproute2 will work > on patches kernels. > > > This is a combined effort of Jesper Brouer and Russell Stuart, > to get these patches into the upstream kernel. > > Let the discussion start about what we need to change to get this > upstream? > > We see this as a feature enhancement, as thus hope that it can be > queued in davem's net-2.6.18.git tree. > > --- > Regards, > Jesper Brouer & Russell Stuart. > - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Remove Prism II support from Orinoco
On Tue, 13 Jun 2006 07:30:56 +0300, Faidon Liambotis wrote:
> Unfortunately, that workaround doesn't work so well when you want to
> have the ability to plug real orinoco (hermes) cards to your computer...
> In other words and unless I'm missing something, there isn't currently a
> way to have a Hermes card and a Prism II card both plugged in and working.

Do you know about /sys/bus/pci/drivers/*/bind and unbind?
http://lwn.net/Articles/143397/

--
Jiri Benc
SUSE Labs
Re: [RFC/PATCH 1/2] in-kernel sockets API
Alan,

On Wed, 14 Jun 2006, Alan Cox wrote:
> It isn't "policy", it's called copyright law.

I know that I said I'd shut up, but I missed in TRIPS where it said that
symbols must be EXPORT_SYMBOL_GPL... Could you point that out?
(Just kidding.)

> You don't seem to understand copyright law either. The GPL like all
> copyright licenses deals with the right to make copies and to create and
> control derivative works. It's not "defeated" by four lines of code.

The 3 or 4 lines of code that I wrote as an original expression before
the patch was submitted.

> Is that a confession ;)

No, just a declaration: the code in question was released under GPL
Version 2.

> Copyright is not about novelty, you have it confused with the
> theoretical (not actual) role of patents. Wrong kind of intellectual
> monopoly right.

Yes, perhaps I should have said "original" instead of "novel". The patch
is not "original" as it was predated by equivalent (machine
translatable) original expressions.
Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)
On Wed, 2006-06-14 at 11:40 +0200, Jesper Dangaard Brouer wrote:
> option to calculate traffic transmission times (rate table)
> over all ATM links, including ADSL, with perfect accuracy.

Only if the lowest level is encoded in a time linear manner. If you are
using NRZ, NRZI etc at the bottom level then you may still be out...

The other problem I see with this code is it is very tightly tied to ATM
cell sizes, not to solving the generic question of packetisation. I'm
not sure if that matters but for modern processors I'm also sceptical
that the clever computation is actually any faster than just doing the
maths, especially if something cache intensive is also running.

Alan
Re: [RFC/PATCH 1/2] in-kernel sockets API
On Tue, Jun 13, 2006 at 07:53:19PM -0500, Chase Venters wrote:
> > It is the lack of an ABI that is most frustrating to these users.
>
> And the presence of an ABI would be _very_ frustrating to core
> developers. Not only would these people suffer, everyone would --
> developer time would be wasted dealing with cruft, and forward
> progress would be slowed.

Note that just because an interface is EXPORT_SYMBOL doesn't mean that
the interface is guaranteed to be stable. So folks who are arguing
that an interface shouldn't be usable by non-GPL applications because
we are therefore guaranteeing a stable API are making an unwarranted
assumption.

						- Ted
Re: [RFC/PATCH 1/2] in-kernel sockets API
On Wed, 2006-06-14 at 00:07 -0600, Brian F. G. Bidulock wrote:
> I think that a policy that intentionally makes it hard for proprietary
> modules to be developed defeats the purpose of ultimate opening and merging.

It isn't "policy", it's called copyright law.

> The interface currently under discussion is ultimately derived from the BSD
> socket-protocol interface, and IMHO should be EXPORT_SYMBOL instead of
> EXPORT_SYMBOL_GPL, if only because using _GPL serves no purpose here and can
> be defeated with 3 or 4 obvious (and probably existing) lines of code

You don't seem to understand copyright law either. The GPL like all
copyright licenses deals with the right to make copies and to create and
control derivative works. It's not "defeated" by four lines of code.

> I wrote similar wrappers for STREAMS TPI to Linux NET4 interface instead of
> using pointers directly quite a few years ago. I doubt I was the first.

Is that a confession ;)

> There is nothing really so novel here that it deserves _GPL.

Copyright is not about novelty, you have it confused with the
theoretical (not actual) role of patents. Wrong kind of intellectual
monopoly right.

Alan
[PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)
The Linux traffic control engine inaccurately calculates transmission times for packets sent over ADSL links. For some packet sizes the error rises to over 50%. This occurs because ADSL uses ATM as its link-layer transport, and ATM transmits packets in fixed-size 53-byte cells.

This changes the userspace tool iproute2/tc by adding an option to calculate traffic transmission times (rate table) over all ATM links, including ADSL, with perfect accuracy.

A longer presentation of the patch, its rationale, what it does and how to use it can be found here:
 http://www.stuart.id.au/russell/files/tc/tc-atm/

An earlier version of the patch, and a _detailed_ empirical investigation of its effects, can be found here:
 http://www.adsl-optimizer.dk/

Signed-off-by: Jesper Dangaard Brouer <[EMAIL PROTECTED]>
Signed-off-by: Russell Stuart <[EMAIL PROTECTED]>
---

diff -Nurp iproute2.orig/include/linux/pkt_sched.h iproute2/include/linux/pkt_sched.h
--- iproute2.orig/include/linux/pkt_sched.h	2005-12-10 09:27:44.0 +1000
+++ iproute2/include/linux/pkt_sched.h	2006-06-13 11:53:27.0 +1000
@@ -77,8 +77,9 @@ struct tc_ratespec
 {
 	unsigned char	cell_log;
 	unsigned char	__reserved;
-	unsigned short	feature;
-	short		addend;
+	unsigned short	feature;	/* Always 0 in pre-atm patch kernels */
+	char		cell_align;	/* Always 0 in pre-atm patch kernels */
+	unsigned char	__unused;
 	unsigned short	mpu;
 	__u32		rate;
 };
diff -Nurp iproute2.orig/tc/m_police.c iproute2/tc/m_police.c
--- iproute2.orig/tc/m_police.c	2005-01-19 08:11:58.0 +1000
+++ iproute2/tc/m_police.c	2006-06-13 11:53:27.0 +1000
@@ -35,7 +35,7 @@ struct action_util police_action_util =
 static void explain(void)
 {
 	fprintf(stderr, "Usage: ... police rate BPS burst BYTES[/BYTES] [ mtu BYTES[/BYTES] ]\n");
-	fprintf(stderr, "[ peakrate BPS ] [ avrate BPS ]\n");
+	fprintf(stderr, "[ peakrate BPS ] [ avrate BPS ] [ overhead OVERHEAD ] [ atm ]\n");
 	fprintf(stderr, "[ ACTIONTERM ]\n");
 	fprintf(stderr, "Old Syntax ACTIONTERM := action [/NOTEXCEEDACT] \n");
 	fprintf(stderr, "New Syntax ACTIONTERM := conform-exceed [/NOTEXCEEDACT] \n");
@@ -134,7 +134,10 @@ int act_parse_police(struct action_util
 	__u32 ptab[256];
 	__u32 avrate = 0;
 	int presult = 0;
-	unsigned buffer=0, mtu=0, mpu=0;
+	unsigned buffer=0, mtu=0;
+	__u8 mpu=0;
+	__s8 overhead=0;
+	int atm=0;
 	int Rcell_log=-1, Pcell_log = -1;
 	struct rtattr *tail;
 
@@ -184,7 +187,7 @@ int act_parse_police(struct action_util
 				fprintf(stderr, "Double \"mpu\" spec\n");
 				return -1;
 			}
-			if (get_size(&mpu, *argv)) {
+			if (get_u8(&mpu, *argv, 10)) {
 				explain1("mpu");
 				return -1;
 			}
@@ -198,6 +201,18 @@ int act_parse_police(struct action_util
 				explain1("rate");
 				return -1;
 			}
+		} else if (strcmp(*argv, "overhead") == 0) {
+			NEXT_ARG();
+			if (p.rate.rate) {
+				fprintf(stderr, "Double \"overhead\" spec\n");
+				return -1;
+			}
+			if (get_s8(&overhead, *argv, 10)) {
+				explain1("overhead");
+				return -1;
+			}
+		} else if (strcmp(*argv, "atm") == 0) {
+			atm = 1;
 		} else if (strcmp(*argv, "avrate") == 0) {
 			NEXT_ARG();
 			if (avrate) {
@@ -264,22 +279,12 @@ int act_parse_police(struct action_util
 	}
 	if (p.rate.rate) {
-		if ((Rcell_log = tc_calc_rtable(p.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
-			fprintf(stderr, "TBF: failed to calculate rate table.\n");
-			return -1;
-		}
+		tc_calc_ratespec(&p.rate, rtab, p.rate.rate, Rcell_log, mtu, mpu, atm, overhead);
 		p.burst = tc_calc_xmittime(p.rate.rate, buffer);
-		p.rate.cell_log = Rcell_log;
-		p.rate.mpu = mpu;
 	}
 	p.mtu = mtu;
 	if (p.peakrate.rate) {
-		if ((Pcell_log = tc_calc_rtable(p.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
-			fprintf(stderr, "POLICE: failed to calculate peak rate table.\n");
-			return -1;
-		}
-		p.peakrate.cell_log = Pcell_log;
-		p.peakrate.mpu = mpu;
+
[PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL
The Linux traffic control engine inaccurately calculates transmission times for packets sent over ADSL links. For some packet sizes the error rises to over 50%. This occurs because ADSL uses ATM as its link-layer transport, and ATM transmits packets in fixed-size 53-byte cells.

The following patches to iproute2 and the kernel add an option to calculate traffic transmission times over all ATM links, including ADSL, with perfect accuracy.

A longer presentation of the patch, its rationale, what it does and how to use it can be found here:
 http://www.stuart.id.au/russell/files/tc/tc-atm/

An earlier version of the patch, and a _detailed_ empirical investigation of its effects, can be found here:
 http://www.adsl-optimizer.dk/

The patches are both backwards and forwards compatible. This means unpatched kernels will work with a patched version of iproute2, and an unpatched iproute2 will work on patched kernels.

This is a combined effort of Jesper Brouer and Russell Stuart to get these patches into the upstream kernel. Let the discussion start about what we need to change to get this upstream.

We see this as a feature enhancement, and thus hope that it can be queued in davem's net-2.6.18.git tree.

---
Regards,
 Jesper Brouer & Russell Stuart.
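The cell arithmetic behind the error described above can be sketched as follows. This is only an illustration, not code from the patches: `atm_wire_size` and its `overhead` parameter are hypothetical names, and the 48 bytes of payload per 53-byte cell is the standard ATM split (5-byte cell header).

```c
#include <assert.h>
#include <limits.h>

#define ATM_CELL_SIZE    53 /* bytes on the wire per cell */
#define ATM_CELL_PAYLOAD 48 /* payload bytes carried by each cell */

/*
 * Bytes actually transmitted for a packet of `len` bytes plus a fixed
 * per-packet `overhead` (hypothetical parameter; the real value depends
 * on the encapsulation in use). A partially filled last cell is still
 * sent as a full cell, hence the round-up.
 */
static unsigned atm_wire_size(unsigned len, unsigned overhead)
{
	unsigned total, cells;

	assert(overhead <= UINT_MAX - len); /* guard against wrap-around */
	total = len + overhead;
	cells = (total + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;
	return cells * ATM_CELL_SIZE;
}
```

Under these assumptions a 48-byte packet occupies one cell (53 bytes on the wire), while a 49-byte packet needs two cells (106 bytes), which is where an estimate based only on the packet length goes wrong.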
[PATCH 1/2] NET: Accurate packet scheduling for ATM/ADSL (kernel)
The Linux traffic control engine inaccurately calculates transmission times for packets sent over ADSL links. For some packet sizes the error rises to over 50%. This occurs because ADSL uses ATM as its link-layer transport, and ATM transmits packets in fixed-size 53-byte cells.

This changes the kernel rate table lookup so that it can look up packet transmission times over all ATM links, including ADSL, with perfect accuracy. The accuracy depends on the rate table that is calculated in userspace by the iproute2 command tc.

A longer presentation of the patch, its rationale, what it does and how to use it can be found here:
 http://www.stuart.id.au/russell/files/tc/tc-atm/

An earlier version of the patch, and a _detailed_ empirical investigation of its effects, can be found here:
 http://www.adsl-optimizer.dk/

Signed-off-by: Jesper Dangaard Brouer <[EMAIL PROTECTED]>
Signed-off-by: Russell Stuart <[EMAIL PROTECTED]>
---

diff -Nurp kernel-source-2.6.16.orig/include/linux/pkt_sched.h kernel-source-2.6.16/include/linux/pkt_sched.h
--- kernel-source-2.6.16.orig/include/linux/pkt_sched.h	2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/include/linux/pkt_sched.h	2006-06-13 11:42:12.0 +1000
@@ -77,8 +77,9 @@ struct tc_ratespec
 {
 	unsigned char	cell_log;
 	unsigned char	__reserved;
-	unsigned short	feature;
-	short		addend;
+	unsigned short	feature;	/* Always 0 in pre-atm patch kernels */
+	char		cell_align;	/* Always 0 in pre-atm patch kernels */
+	unsigned char	__unused;
 	unsigned short	mpu;
 	__u32		rate;
 };
diff -Nurp kernel-source-2.6.16.orig/include/net/sch_generic.h kernel-source-2.6.16/include/net/sch_generic.h
--- kernel-source-2.6.16.orig/include/net/sch_generic.h	2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/include/net/sch_generic.h	2006-06-13 11:42:12.0 +1000
@@ -307,4 +307,18 @@ drop:
 	return NET_XMIT_DROP;
 }
 
+/* Lookup a qdisc_rate_table to determine how long it will take to send a
+   packet given its size.
+ */
+static inline u32 qdisc_l2t(struct qdisc_rate_table* rtab, int pktlen)
+{
+	int slot = pktlen + rtab->rate.cell_align;
+	if (slot < 0)
+		slot = 0;
+	slot >>= rtab->rate.cell_log;
+	if (slot > 255)
+		return rtab->data[255] + 1;
+	return rtab->data[slot];
+}
+
 #endif
diff -Nurp kernel-source-2.6.16.orig/net/sched/act_police.c kernel-source-2.6.16/net/sched/act_police.c
--- kernel-source-2.6.16.orig/net/sched/act_police.c	2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/net/sched/act_police.c	2006-06-13 11:42:12.0 +1000
@@ -33,8 +33,8 @@
 #include
 #include
 
-#define L2T(p,L)   ((p)->R_tab->data[(L)>>(p)->R_tab->rate.cell_log])
-#define L2T_P(p,L) ((p)->P_tab->data[(L)>>(p)->P_tab->rate.cell_log])
+#define L2T(p,L)   qdisc_l2t((p)->R_tab,L)
+#define L2T_P(p,L) qdisc_l2t((p)->P_tab,L)
 #define PRIV(a) ((struct tcf_police *) (a)->priv)
 
 /* use generic hash table */
diff -Nurp kernel-source-2.6.16.orig/net/sched/sch_cbq.c kernel-source-2.6.16/net/sched/sch_cbq.c
--- kernel-source-2.6.16.orig/net/sched/sch_cbq.c	2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/net/sched/sch_cbq.c	2006-06-13 11:42:12.0 +1000
@@ -193,7 +193,7 @@ struct cbq_sched_data
 };
 
 
-#define L2T(cl,len)	((cl)->R_tab->data[(len)>>(cl)->R_tab->rate.cell_log])
+#define L2T(cl,len)	qdisc_l2t((cl)->R_tab,len)
 
 
 static __inline__ unsigned cbq_hash(u32 h)
diff -Nurp kernel-source-2.6.16.orig/net/sched/sch_htb.c kernel-source-2.6.16/net/sched/sch_htb.c
--- kernel-source-2.6.16.orig/net/sched/sch_htb.c	2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/net/sched/sch_htb.c	2006-06-13 11:42:12.0 +1000
@@ -206,12 +206,10 @@ struct htb_class
 static __inline__ long L2T(struct htb_class *cl,struct qdisc_rate_table *rate,
 	int size)
 {
-	int slot = size >> rate->rate.cell_log;
-	if (slot > 255) {
+	long result = qdisc_l2t(rate, size);
+	if (result > rate->data[255])
 		cl->xstats.giants++;
-		slot = 255;
-	}
-	return rate->data[slot];
+	return result;
 }
 
 struct htb_sched
diff -Nurp kernel-source-2.6.16.orig/net/sched/sch_tbf.c kernel-source-2.6.16/net/sched/sch_tbf.c
--- kernel-source-2.6.16.orig/net/sched/sch_tbf.c	2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/net/sched/sch_tbf.c	2006-06-13 11:42:12.0 +1000
@@ -132,8 +132,8 @@ struct tbf_sched_data
 	struct Qdisc	*qdisc;		/* Inner qdisc, default - bfifo queue */
 };
 
-#define L2T(q,L)   ((q)->R_tab->data[(L)>>(q)->R_tab->rate.cell_log])
-#define L2T_P(q,L) ((q)->P_tab->data[(L)>>(q)->P_tab->rate.cell_log])
+#define L2T(q,L)   qdisc_l2t((q)->R_tab,L)
+#define L2T_P(q,L) qdisc_l2t((q)->P_tab,L)
 
 static int tbf_enqueue(struct sk_buff *skb, struct Qdisc* sch
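The lookup added by this patch can be tried outside the kernel with a small stand-in. The sketch below mirrors the logic of `qdisc_l2t` from the diff; the struct definitions, `l2t` and `fill_identity` are simplified, hypothetical userspace substitutes rather than kernel code. It also assumes `signed char` for `cell_align` (the patch declares plain `char`) so that a negative alignment behaves the same whatever the platform's default `char` signedness is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; only the fields the lookup touches are modelled. */
struct ratespec {
	unsigned char cell_log;   /* log2 of the bytes covered by one slot */
	signed char   cell_align; /* added to the length before shifting */
};

struct rate_table {
	struct ratespec rate;
	uint32_t data[256];       /* transmission time for each slot */
};

/* Same steps as qdisc_l2t in the patch: align, clamp at 0, shift, bound. */
static uint32_t l2t(const struct rate_table *rtab, int pktlen)
{
	int slot;

	assert(rtab != NULL);
	slot = pktlen + rtab->rate.cell_align;
	if (slot < 0)
		slot = 0;
	slot >>= rtab->rate.cell_log;
	if (slot > 255)
		return rtab->data[255] + 1; /* marks an oversize packet */
	return rtab->data[slot];
}

/* Store i in slot i so the result of l2t() reveals which slot was chosen. */
static void fill_identity(struct rate_table *rtab)
{
	for (int i = 0; i < 256; i++)
		rtab->data[i] = (uint32_t)i;
}
```

With `cell_log = 3` and `cell_align = -1`, lengths 1 to 8 land in slot 0 and length 9 in slot 1; with `cell_align = 0`, length 8 already falls into slot 1. That one-byte shift is what lets a table slot line up with a cell boundary.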
Re: [RFC/PATCH 1/2] in-kernel sockets API
Chase,

On Wed, 14 Jun 2006, Chase Venters wrote:
>
> One point I remember coming up in the discussion was that the
> EXPORT_SYMBOL()/EXPORT_SYMBOL_GPL() split was a compromise of sorts.
> Interfaces that were needed to support users would reasonably be placed under
> EXPORT_SYMBOL(). By contrast, EXPORT_SYMBOL_GPL() would indicate
> functionality that would only seem to be used by derived works. It implies
> that any code using it should probably be GPL as well.

The difficulty with EXPORT_SYMBOL_GPL(), as I see it, is that it reaches farther than the GPL. The GPL does not impact non-derived works, which can be licensed under any terms their authors see fit. EXPORT_SYMBOL_GPL(), by contrast, requires a non-derived work to declare a GPL license to even use it.

If you subscribe to the FSF view of derived work (just linking is a derivation) then I suppose you would support EXPORT_SYMBOL_GPL(). IANAL, but I don't believe that either TRIPS or Berne Convention case law supports the FSF view. Linus' statements in the COPYING file take a different view: that simple use of a technical interface is not necessarily (in itself) derivation.

Now, I understand the use of EXPORT_SYMBOL() vs. EXPORT_SYMBOL_GPL() to allow authors to differ on this idea. But, in the case in point, the function pointers can be accessed by merely including the appropriate header files. Changing the wrapper access to them to EXPORT_SYMBOL_GPL() strikes me as similar to changing kmalloc() from EXPORT_SYMBOL() to EXPORT_SYMBOL_GPL().

Understand that all exported symbols, regardless of licensing or modversions or whatever, are available in the kernel boot image and can be linked to by any module at any time. That is, those that would abuse the concept of derivation will not be impeded by EXPORT_SYMBOL_GPL(). (Rip the symbol from the kernel image, write a thin GPL'ed module that aliases the symbol and then exports it again as EXPORT_SYMBOL() without module versioning, copy the lines of code into the proprietary module, reversing the order of arbitrary lines, etc.) In any case, all it serves to do is to punish honest non-derivative works that are not published under GPL-compatible terms.

What I resist is the apparent attempt to change these symbols to _GPL as some matter of general policy in this case, contrary to the author's original intentions as expressed in the original patch submission, and without the author of the interface being wrappered jumping up and screaming that his code was under strict FSF linking-is-derivation GPL (in which case we could have had a good discussion on whether Linux NET4 is actually a derivative work of BSD 4.4 Lite, which was licensed under the "old" BSD license, incompatible with the GPL ;)

As a general policy I would say make it EXPORT_SYMBOL() unless the author of the patch (derivation) or author of the original (derived) code dictates that it be EXPORT_SYMBOL_GPL().

Ok, I'll shut up now...

...really.
Re: 2.6.17: networking bug??
Mark Lord wrote:
> Further to this, the current behaviour is badly unpredictable.
> A machine could be working perfectly, not (noticeably) affected
> by this bug. And then the user adds another stick of RAM to it.

This "bug" already existed in 2.6.16 to a certain extent: you were losing out on a lot of TCP performance. Go back to 2.6.7, measure TCP performance, and you'll probably find it was significantly better.

Also, there aren't that many broken end-points out there. www.everymac.com loads fine for me and does not ignore the window scale factor. The problem in your case is a broken router in the middle.

I had the same problem: certain sites would not load, but there is absolutely nothing wrong with the servers that run these sites:

http://marc.theaimsgroup.com/?l=linux-netdev&m=114478312100641&w=2

I contacted my ISP and informed them of the issue. They fixed it nationwide within a few weeks. You might try confirming that your problem only applies to HTTP, as mine did (the ISP runs some lame transparent webcaches); in my case it was a bug in the software there (NetApp).

We already had the "some routers are broken, should we do anything" discussion back at the time of 2.6.8:

http://lwn.net/Articles/92727/

Daniel
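To make the window scale issue concrete, here is a small sketch of how a scaled window is interpreted. It is an illustration under stated assumptions, not kernel code: `effective_window` is a hypothetical helper, and the limit of 14 on the shift count comes from the TCP window scale option as specified in RFC 1323.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Window the sender may use: the 16-bit window field from the TCP header
 * shifted left by the scale factor agreed during the handshake.
 */
static uint32_t effective_window(uint16_t field, unsigned shift)
{
	assert(shift <= 14); /* maximum shift allowed by the option */
	return (uint32_t)field << shift;
}
```

For example, a header field of 183 with a shift of 7 stands for a 23424-byte window. A device in the path that ignores the scale factor would read the same field as a 183-byte window, which is one way a connection through it can slow to a crawl or stall.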
Re: [RFC/PATCH 1/2] in-kernel sockets API
On Wednesday 14 June 2006 01:06, Brian F. G. Bidulock wrote:
>
> The interface currently under discussion is ultimately derived from the BSD
> socket-protocol interface, and IMHO should be EXPORT_SYMBOL instead of
> EXPORT_SYMBOL_GPL, if only because using _GPL serves no purpose here and
> can be defeated with 3 or 4 obvious (and probably existing) lines of code.
> I wrote similar wrappers for STREAMS TPI to Linux NET4 interface instead of
> using pointers directly quite a few years ago. I doubt I was the first.
> There is nothing really so novel here that it deserves _GPL.

I mentioned that I don't have any particular opinion on the BSD socket API in this discussion. All that I'm speaking of here is a property of licensing.

I've watched a lot of what has happened with binary drivers. You'll find in the LKML archives plenty of lengthy discussions about whether or not binary drivers are allowed under the GPL. If I were to guess, there is still disagreement. Although some hardware support could improve, we thankfully seem to have some kind of an equilibrium capable of supporting lots of users.

One point I remember coming up in the discussion was that the EXPORT_SYMBOL()/EXPORT_SYMBOL_GPL() split was a compromise of sorts. Interfaces that were needed to support users would reasonably be placed under EXPORT_SYMBOL(). By contrast, EXPORT_SYMBOL_GPL() would indicate functionality that would only seem to be used by derived works. It implies that any code using it should probably be GPL as well.

I don't raise this in an attempt to belittle anything people are working on. It's an observation about the ecosystem - Linux in the 2.6 series has seen a great amount of corporate contribution in terms of enhancing what the kernel is capable of doing. GPL, I believe, encourages this.

Thanks,
Chase