Re: Bcm43xx softMac Driver in 2.6.18
On 9/22/06, Larry Finger [EMAIL PROTECTED] wrote:
> When we found the cause of NETDEV watchdog timeouts in the wireless-2.6 code, I knew that the 2.6.18 release code would cause a serious regression.

I don't know if this is the lockup you're trying to address, but 2.6.18's bcm43xx has definitely regressed for me versus 2.6.17.x. 2.6.18 vanilla and 2.6.18 with your patch both lock my system hard with bcm43xx.

I've got an HP/Compaq nx6125 laptop. Symptoms are that it will associate fine on its own and send traffic to/fro upon ifup, but when I do an iwconfig, ifdown, ifup to change the access point, the system locks (somewhat randomly) during one of those operations. Well, the iwconfig or the ifup, actually.

lspci -v:
02:02.0 Network controller: Broadcom Corporation BCM4309 802.11a/b/g (rev 03)
        Subsystem: Hewlett-Packard Company Unknown device 12f9
        Flags: bus master, fast devsel, latency 64, IRQ 11
        Memory at d001 (32-bit, non-prefetchable) [size=8K]

./bcm43xx-fwcutter -i BCMWL5.SYS
  filename : bcmwl5.sys
  version  : 4.10.40.1
  MD5      : 69f940672be0ecee5bd1e905706ba8ce

Wireless tools are Version: 28-1ubuntu2.

I've got multiple access points in view of the laptop, a g (54Mb), and a b (11Mb). Neither with encryption enabled, if that makes a difference (we live in the boonies).

It's 2.6.18 + your patch, compiled for x86_64, ubuntu devel.

Any suggestions or requests for tests?

Ray

-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [e2e] performance of BIC-TCP, High-Speed-TCP, H-TCP etc
I suggest you take a closer look Injong - there is a whole page of data from tests covering a wide range of levels of background traffic. These results are all new, and significantly strengthen the conclusions I think, as is the expanded explanatory discussion of the observed behaviour of the various algorithms (the result of a fair bit of detective work of course). Your claim that Your report mostly ignores the effect of background traffic is simply not true. I can't really comment on your own tests without more information, although I can say that we went to a good bit of trouble to make sure our results were consistent and reproducible - in fact all our reported results are from at least five, and usually more, runs of each test. We were also careful to control for differences in kernel implementation so that we compare congestion control algorithms rather than other aspects of the network stack implementation. All of this is documented in the paper. The kernel we used is available on the web. Our measurements are also publicly available - the best way forward might be to pick one or two tests and compare results of them in detail with a view to diagnosing the source of any differences. General comments such as our experience tells that the RTT variations of mid size flows play a very important role in creating significant dynamics in testing environments are not too helpful. What do you mean by a mid-sized flow ? What do you mean by significant dynamics ? What do you mean by important role - is this quantified ? Best to stick to science rather than grandstanding. This is especially true when dealing with a sensitive subject such as the evaluation of competing algorithms. Re FAST, we have of course discussed our results with the Caltech folks. As stated in the paper, some of the observed behaviour seems to be associated with the alpha tuning algorithm. 
Other behaviour seems to be associated with packet burst effects that have also been reported independently by the Caltech folks. Similar results to ours have since been observed by other groups I believe. Perhaps differences between our results point to some issue in your testbed setup. Doug Injong Rhee wrote: This is a resend with fixed web links. The links were broken in my previous email -- sorry about multiple transmissions. - Hi Doug, Thanks for sharing your paper. Also congratulations to the acceptance of your journal paper to TONs. But I am wondering what's new in this paper. At first glance, I did not find many new things that are different from your previously publicized reports. How much is this different from the ones you put out in this mail list a year or two ago and also the one publicized in PFLDnet February this year http://www.hpcc.jp/pfldnet2006/? In that same workshop, we also presented our experimental results that shows significant discrepancy from yours but i am not sure why you forgot to reference our experimental work presented in that same PFLDnet. Here is a link to a more detailed version of that report accepted to COMNET http://netsrv.csc.ncsu.edu/highspeed/comnet-asteppaper.pdf The main point of contention [that we talked about in that PFLDnet workshop] is the presence of background traffic and the method to add them. Your report mostly ignores the effect of background traffic. Some texts in this paper state that you added some web traffic (10%), but the paper shows only the results from NO background traffic scenarios. But our results differ from yours in many aspects. Below are the links to our results (the links to them have been available in our BIC web site for a long time and also mentioned in our PFLDnet paper; this result is with the patch that corrects HTCP bugs). 
[Convergence and intra protocol fairness] without background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/intra_protocol/intra_protocol.htm with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/intra_protocol/intra_protocol.htm [RTT fairness]: w/o background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/rtt_fairness/rtt_fairness.htm with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/rtt_fairness/rtt_fairness.htm [TCP friendliness] without background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/tcp_friendliness/tcp_friendliness.htm with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/tcp_friendliness/tcp_friendliness.htm After our discussion in that PFLDnet, I puzzled why we get different results. My guess is that the main difference between your experiment and ours is the inclusion of mid-sized flows with various RTTs -- our experience tells that the RTT variations of mid size flows play a very important role in creating significant dynamics in testing environments. The same
[PATCH 2.6.17.13 1/2] LARTC: trace control for netem: userspace
Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

user space (iproute2): The directory tc/netem was split in two parts, one containing the original distribution tables and the other the tools to generate trace files as well as the program responsible for reading the delay values from the trace file and sending them to the kernel (called flowseed). If the trace option is set, netem initializes the kernel and starts the flowseed process. The flowseed process does not send data to the kernel until the registration is completed. The data is sent to the kernel module via configfs. For each qdisc applied, a new directory (in /config/tcn/) is created. The write returns when the kernel needs new data, or when the corresponding qdisc was deleted. In the first case new data is sent, and in the latter case the flowseed process terminates itself.

Signed-off-by: Rainer Baumann [EMAIL PROTECTED]
---
Patch for iproute2-2.6.16-060323: http://tcn.hypert.net/tcn_iproute2.patch
[PATCH 2.6.17.13 2/2] LARTC: trace control for netem: kernelspace
Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

kernel space: The delay, drop, duplication and corruption values are read out in user space and sent to kernel space via configfs. The userspace process will hang on write until the kernel needs new data. In order to always have packet action values ready to apply, there are two buffers that hold these values. Packet action values can be read from one buffer while the other buffer is refilled with new values. The synchronization between "need more delay values" and "return from write" is done with wait queues. Once the delay value has been applied to a packet, the packet is processed by the original netem functions.

Signed-off-by: Rainer Baumann [EMAIL PROTECTED]
---
Patch for linux kernel 2.6.17.13: http://tcn.hypert.net/tcn_kernel_configfs.patch
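The two-buffer scheme described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not code from the patch: the names trace_dbuf, dbuf_init, dbuf_refill and dbuf_next, and the buffer length, are hypothetical, and the "drained" result stands in for the wait-queue wakeup that lets the blocked write return in the real kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TRACE_BUF_LEN 4	/* hypothetical size, chosen for illustration */

struct trace_dbuf {
	unsigned int buf[2][TRACE_BUF_LEN];	/* two buffers of packet action values */
	size_t fill[2];		/* number of valid values in each buffer */
	size_t pos;		/* read position inside the active buffer */
	int active;		/* index of the buffer currently being read */
};

static void dbuf_init(struct trace_dbuf *d)
{
	memset(d, 0, sizeof(*d));
}

/* Producer side: copy up to TRACE_BUF_LEN values into buffer idx.
 * Returns the number of values actually stored. */
static size_t dbuf_refill(struct trace_dbuf *d, int idx,
			  const unsigned int *vals, size_t n)
{
	if (n > TRACE_BUF_LEN)
		n = TRACE_BUF_LEN;
	memcpy(d->buf[idx], vals, n * sizeof(vals[0]));
	d->fill[idx] = n;
	return n;
}

/* Consumer side: fetch the next value into *out.
 * Returns false if no value is ready (the producer fell behind).
 * *drained is set to the index of a buffer that was just used up and
 * should be refilled, or to -1 if no refill is needed yet. */
static bool dbuf_next(struct trace_dbuf *d, unsigned int *out, int *drained)
{
	*drained = -1;
	if (d->pos >= d->fill[d->active])
		return false;

	*out = d->buf[d->active][d->pos++];

	if (d->pos == d->fill[d->active]) {
		/* Active buffer is used up: hand it back for refilling
		 * and continue reading from the other one. */
		*drained = d->active;
		d->fill[d->active] = 0;
		d->active ^= 1;
		d->pos = 0;
	}
	return true;
}
```

In this sketch the reader never waits: as long as the producer refills the drained buffer before the other one runs out, dbuf_next always has a value ready, which is the property the two buffers are meant to provide.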
[PATCH 2.6.17.13 0/2] LARTC: trace control for netem
Trace Control for Netem: Emulate network properties such as long range dependency and self-similarity of cross-traffic.

A new option (trace) has been added to the netem command. If the trace option is used, the values for packet delay etc. are read from a pregenerated trace file; afterwards the packets are processed by the normal netem functions. The packet action values are read out from the trace file in user space and sent to kernel space via configfs.

Sorry, what I sent yesterday was the old version; this is the new version!

Following our patches of the 2nd and 22nd of August, we have integrated the comments from Stephen and hope we are on the right track now.

We look forward to any comments, feedback and suggestions!
Re: [e2e] performance of BIC-TCP, High-Speed-TCP, H-TCP etc
Doug Leith wrote- I suggest you take a closer look Injong - there is a whole page of data from tests covering a wide range of levels of background traffic. These results are all new, and significantly strengthen the conclusions I think, as is the expanded explanatory discussion of the observed behaviour of the various algorithms (the result of a fair bit of detective work of course). I was not sure whether this whole new page is good enough to make another public announcement about this paper -- this paper has been publicized by you many times in these mailing lists and also in the workshop. It would have saved us some time if you had just pointed out the new stuff. I can't really comment on your own tests without more information, although I can say that we went to a good bit of trouble to make sure our results were consistent and reproducible - in fact all our reported results are from at least five, and usually more, runs of each test. I am not doubting your effort here and I am sure your methods are correct. Just i was pondering why we got different results and try to see if we can come to some understanding on this different results we got. Who knows we together might run into some fundamental research issues regarding testing. Also the more information about our own experiment is already given in the paper and also in our web site. If you could tell what specific info you need more, I can provide. Let's put our heads together to solve this mystery of different results. General comments such as our experience tells that the RTT variations of mid size flows play a very important role in creating significant dynamics in testing environments are not too helpful. What do you mean by a mid-sized flow ? What do you mean by significant dynamics ? What do you mean by important role - is this quantified ? Best to stick to science rather than grandstanding. This is especially true when dealing with a sensitive subject such as the evaluation of competing algorithms. 
I hope you can perhaps enlighten us with this science. Well..this WAS just email. There wasn't much space to delve into science there. So that is why I gave the link to Floyd and Kohler's paper. Sally's paper on this role of RTT variations provides more scientific explanation on this dynamics. In case you missed it, here is the link again. http://www.icir.org/models/hotnetsFinal.pdf. Please read Section 3.3. Also about mid size flows, I am referring to the flow lifetimes. The mid sized flows cannot be represented well by the Pareto distribution -- the ones that are in the middle of the distribution that heavy tail is not capable of providing with a large number. Since the Pareto distribution (of your web traffic sz) follows the power law, the distribution of flow sizes around the origin (very short-term) is very high while very long-term flows have relatively high probability. So speaking of science, can you please tell me whether all flows of your web traffic have the same RTTs or not? If you could please point me to the results you have with your web traffic tests instead of simply hand-wavy about the results saying they are just the same (or similar) as the results from your NO background traffic tests, I'd appreciate that very much. Re FAST, we have of course discussed our results with the Caltech folks. As stated in the paper, some of the observed behaviour seems to be associated with the alpha tuning algorithm. Other behaviour seems to be associated with packet burst effects that have also been reported independently by the Caltech folks. Similar results to ours have since been observed by other groups I believe. Perhaps differences between our results point to some issue in your testbed setup. That might be the case. Thanks for pointing that out. But it is hard to explain why we got coincidently the same results as the FAST folks. Maybe our and FAST folks' testbeds have this issue while yours are completely sound and scientific. 
But I think it is more to do with the different setups we have regarding buffer sizes and the maximum bandwidth. FAST doesn't adapt very well especially under small buffers because of this alpha tuning.
Re: [PATCH] bcm43xx: further fix for periodic work errors
On Saturday 23 September 2006 06:08, Larry Finger wrote:
> Recent changes in the setup for preemptible periodic work fixed most of the problems with NETDEV watchdog timeouts; however, some variants of the bcm43xx device still had the problem. These were fixed by setting the parameter MAXIMUM_BADNESS to 0. By doing so, all the functionality associated with calculating the 'badness' of the upcoming periodic work is no longer needed; therefore it is removed.

Uhm, no. Wait. _Why_ does the watchdog trigger? All periodic work in the fastpath (which you remove with this patch) is supposed to execute in a few microseconds. I don't think we want to fix this by removing the fastpath and always taking the _expensive_ slowpath periodic work.

So why does the watchdog trigger for the fast periodic work? We need to find out. Removing the fastpath is just bad for overall latency.

The two fastpath periodic works are the 15 sec and 30 sec ones, if executed standalone. If the 15 and/or 30 is executed alongside a 60sec work, it's all slowpath, of course.

> Signed-off-by: Larry Finger [EMAIL PROTECTED]
> ---
>
> John,
>
> This patch relies on [PATCH] bcm43xx: fix netdev watchdog timeouts, which was submitted on 9/14/06. It is important for this one, as well as those already queued, to make the 2.6.19 cutoff.
>
> Thanks,
>
> Larry
>
> Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
> ===
> --- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
> +++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
> @@ -3136,67 +3136,32 @@ static void do_periodic_work(struct bcm4
>  	schedule_delayed_work(&bcm->periodic_work, HZ * 15);
>  }
>  
> -/* Estimate a Badness value based on the periodic work
> - * state-machine state. Badness is worse (bigger), if the
> - * periodic work will take longer.
> - */
> -static int estimate_periodic_work_badness(unsigned int state)
> -{
> -	int badness = 0;
> -
> -	if (state % 8 == 0) /* every 120 sec */
> -		badness += 10;
> -	if (state % 4 == 0) /* every 60 sec */
> -		badness += 5;
> -	if (state % 2 == 0) /* every 30 sec */
> -		badness += 1;
> -	if (state % 1 == 0) /* every 15 sec */
> -		badness += 1;
> -
> -#define BADNESS_LIMIT	4
> -	return badness;
> -}
> -
>  static void bcm43xx_periodic_work_handler(void *d)
>  {
>  	struct bcm43xx_private *bcm = d;
>  	unsigned long flags;
>  	u32 savedirqs = 0;
> -	int badness;
> -
> -	badness = estimate_periodic_work_badness(bcm->periodic_state);
> -	if (badness > BADNESS_LIMIT) {
> -		/* Periodic work will take a long time, so we want it to
> -		 * be preemtible.
> -		 */
> -		mutex_lock(&bcm->mutex);
> -		netif_tx_disable(bcm->net_dev);
> -		spin_lock_irqsave(&bcm->irq_lock, flags);
> -		bcm43xx_mac_suspend(bcm);
> -		if (bcm43xx_using_pio(bcm))
> -			bcm43xx_pio_freeze_txqueues(bcm);
> -		savedirqs = bcm43xx_interrupt_disable(bcm, BCM43xx_IRQ_ALL);
> -		spin_unlock_irqrestore(&bcm->irq_lock, flags);
> -		bcm43xx_synchronize_irq(bcm);
> -	} else {
> -		/* Periodic work should take short time, so we want low
> -		 * locking overhead.
> -		 */
> -		mutex_lock(&bcm->mutex);
> -		spin_lock_irqsave(&bcm->irq_lock, flags);
> -	}
> +	/* Periodic work may take a long time, so we want it to
> +	 * be preemtible. In any case, we need to disable transmits.
> +	 */
> +	mutex_lock(&bcm->mutex);
> +	netif_tx_disable(bcm->net_dev);
> +	spin_lock_irqsave(&bcm->irq_lock, flags);
> +	bcm43xx_mac_suspend(bcm);
> +	if (bcm43xx_using_pio(bcm))
> +		bcm43xx_pio_freeze_txqueues(bcm);
> +	savedirqs = bcm43xx_interrupt_disable(bcm, BCM43xx_IRQ_ALL);
> +	spin_unlock_irqrestore(&bcm->irq_lock, flags);
> +	bcm43xx_synchronize_irq(bcm);
>  
>  	do_periodic_work(bcm);
> -
> -	if (badness > BADNESS_LIMIT) {
> -		spin_lock_irqsave(&bcm->irq_lock, flags);
> -		tasklet_enable(&bcm->isr_tasklet);
> -		bcm43xx_interrupt_enable(bcm, savedirqs);
> -		if (bcm43xx_using_pio(bcm))
> -			bcm43xx_pio_thaw_txqueues(bcm);
> -		bcm43xx_mac_enable(bcm);
> -		netif_wake_queue(bcm->net_dev);
> -	}
> +	spin_lock_irqsave(&bcm->irq_lock, flags);
> +	tasklet_enable(&bcm->isr_tasklet);
> +	bcm43xx_interrupt_enable(bcm, savedirqs);
> +	if (bcm43xx_using_pio(bcm))
> +		bcm43xx_pio_thaw_txqueues(bcm);
> +	bcm43xx_mac_enable(bcm);
> +	netif_wake_queue(bcm->net_dev);
>  	mmiowb();
>  	spin_unlock_irqrestore(&bcm->irq_lock, flags);
>  	mutex_unlock(&bcm->mutex);

-- 
Greetings Michael.
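To make the fastpath/slowpath split concrete, here is a small standalone sketch of the badness arithmetic that the patch removes. The weights and the limit of 4 are taken from the quoted diff; the function names estimate_badness and takes_slow_path are hypothetical helpers for illustration only, and the assumption is that the state counter advances once per 15-second tick, as the comments in the diff suggest.

```c
#include <stdbool.h>

/* Threshold from the quoted driver code: work whose badness exceeds
 * this value took the preemptible slow path. */
#define BADNESS_LIMIT 4

/* Same arithmetic as estimate_periodic_work_badness() in the diff. */
static int estimate_badness(unsigned int state)
{
	int badness = 0;

	if (state % 8 == 0)	/* every 120 sec */
		badness += 10;
	if (state % 4 == 0)	/* every 60 sec */
		badness += 5;
	if (state % 2 == 0)	/* every 30 sec */
		badness += 1;
	badness += 1;		/* every 15 sec (state % 1 is always 0) */

	return badness;
}

/* Hypothetical helper, not part of the driver: true if this tick
 * would have used the slow (preemptible) path before the patch. */
static bool takes_slow_path(unsigned int state)
{
	return estimate_badness(state) > BADNESS_LIMIT;
}
```

With these weights a tick that runs only the 15 sec work scores 1 and a tick that also runs the 30 sec work scores 2, so both stay under the limit. Any tick that includes the 60 sec work scores at least 7 and goes to the slow path, which matches Michael's description above.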
Re: [PATCH] Restore the original TX FIFO overflow process.
On Fri, 22 Sep 2006 15:30:01 -0400 Jesse Huang [EMAIL PROTECTED] wrote:
> From: Jesse Huang [EMAIL PROTECTED]
>
> Change Logs:
> - Restore the original TX FIFO overflow process.
>
> Signed-off-by: Jesse Huang [EMAIL PROTECTED]
>
> ...
>
> + txthreshold = ioread16 (ioaddr + TxStartThresh);

Your patch ip100a-fix-tx-pause-bug-reset_tx-intr_handler.patch removed TxStartThresh, so it won't compile.

I don't have a clue what's happening with this driver - I'll drop everything. I suggest you send a complete new patch series against Jeff's latest tree. I'll send you a copy of that.
Re: Bcm43xx softMac Driver in 2.6.18
On Saturday, 23 September 2006 08:03, Ray Lee wrote:
> On 9/22/06, Larry Finger [EMAIL PROTECTED] wrote:
> > When we found the cause of NETDEV watchdog timeouts in the wireless-2.6 code, I knew that the 2.6.18 release code would cause a serious regression.
>
> I don't know if this is the lockup you're trying to address, but 2.6.18's bcm43xx has definitely regressed for me versus 2.6.17.x. 2.6.18 vanilla and 2.6.18 with your patch both lock my system hard with bcm43xx.
>
> I've got an HP/Compaq nx6125 laptop. Symptoms are that it will associate fine on its own and send traffic to/fro upon ifup, but when I do an iwconfig, ifdown, ifup to change the access point, the system locks (somewhat randomly) during one of those operations. Well, the iwconfig or the ifup, actually.

I have observed similar symptoms on HPC nx6325, although I haven't managed to get the adapter to associate with an AP. This is a PCI-E card, so I need some additional patches to make the driver detect it, and I use the firmware cut from wl_apsta.o. The kernel is also 64-bit, 2.6.18-rc6-mm2.

lspci -v:
30:00.0 Network controller: Broadcom Corporation BCM4310 UART (rev 01)
        Subsystem: Hewlett-Packard Company Unknown device 1361
        Flags: bus master, fast devsel, latency 0, IRQ 10
        Memory at c800 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 2
        Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-
        Capabilities: [d0] Express Legacy Endpoint IRQ 0

Greetings,
Rafael

-- 
You never change things by fighting the existing reality.
		R. Buckminster Fuller
Re: [e2e] performance of BIC-TCP, High-Speed-TCP, H-TCP etc
I was not sure whether this whole new page is good enough to make another public announcement about this paper At the risk of repeating myself, the page referred to contains the results of approx. 500 new test runs (and we have carried out many more than that which are summarised in the text) and directly addresses the primary concern raised by yourself and others that situations with a mix of connection lengths may lead to significantly different conclusions from tests with only long-lived flows. Our finding is that, for the metrics studied, the mix of flow sizes makes little difference to our conclusions. That, combined with the scrutiny provided by the peer review process, greatly strengthens our conclusions and certainly seems worth reporting. I am not doubting your effort here and I am sure your methods are correct. Just i was pondering why we got different results and try to see if we can come to some understanding on this different results we got. Who knows we together might run into some fundamental research issues regarding testing. I'm certainly up for taking a closer look at this. Sally's paper on this role of RTT variations provides more scientific explanation on this dynamics. In case you missed it, here is the link again. http://www.icir.org/models/hotnetsFinal.pdf. Please read Section 3.3. Section 3.3 of this paper seems to concern Active Queue Management: Oscillations. The discussion relates to queue dynamics of RED. How is this relevant ? All of our tests are for drop-tail queues only. Also about mid size flows, I am referring to the flow lifetimes. The mid sized flows cannot be represented well by the Pareto distribution -- the ones that are in the middle of the distribution that heavy tail is not capable of providing with a large number. 
> Since the Pareto distribution (of your web traffic sz) follows the power law, the distribution of flow sizes around the origin (very short-term) is very high while very long-term flows have relatively high probability.

I suspect your answers in the previous point and here just re-emphasise my point. It's not clear, for example, what actual values of flow lifetime you consider mid-size nor what the basis for those values is - there are a huge number of measurement studies on traffic stats and if the aim is to get closer to real link behaviour then it seems sensible to make use of this sort of data. I do agree it might be interesting to see if our test results are sensitive to the connection size distribution used, although I suspect the answer will be that they are largely insensitive - should be easy enough to check though if you'd be kind enough to send me details of the sort of distribution you have in mind.

> That might be the case. Thanks for pointing that out. But it is hard to explain why we got coincidently the same results as the FAST folks.

It's hard for me to comment without more information - can you post a link to the results by the FAST folks that you mention ? Perhaps they also might like to comment here ? See also the next comment below ...

> But I think it is more to do with the different setups we have regarding buffer sizes and the maximum bandwidth. FAST doesn't adapt very well especially under small buffers because of this alpha tuning.

I thought you were suggesting in your last post that you obtained different results for the *same* setup as us ? Some clarity here seems important as otherwise your comments are in danger of just serving to muddy the water. If the network setup is different, then it's maybe no surprise if the results are a little different.
Our own experience (and a key part of the rationale for our work) underlines the need to carry out tests over a broad range of conditions rather than confining testing to a small number of specific scenarios (e.g. only gigabit speed links or only links with large buffers) - otherwise it's hard to get an overall feel for expected behaviour. We did carry out tests for really quite a wide range of network conditions and do already comment, for example, that FAST performance does depend on the buffer size.

Doug
Re: [e2e] performance of BIC-TCP, High-Speed-TCP, H-TCP etc
This is a resend with fixed web links. The links were broken in my previous email -- sorry about multiple transmissions. - Hi Doug, Thanks for sharing your paper. Also congratulations to the acceptance of your journal paper to TONs. But I am wondering what's new in this paper. At first glance, I did not find many new things that are different from your previously publicized reports. How much is this different from the ones you put out in this mail list a year or two ago and also the one publicized in PFLDnet February this year http://www.hpcc.jp/pfldnet2006/? In that same workshop, we also presented our experimental results that shows significant discrepancy from yours but i am not sure why you forgot to reference our experimental work presented in that same PFLDnet. Here is a link to a more detailed version of that report accepted to COMNET http://netsrv.csc.ncsu.edu/highspeed/comnet-asteppaper.pdf The main point of contention [that we talked about in that PFLDnet workshop] is the presence of background traffic and the method to add them. Your report mostly ignores the effect of background traffic. Some texts in this paper state that you added some web traffic (10%), but the paper shows only the results from NO background traffic scenarios. But our results differ from yours in many aspects. Below are the links to our results (the links to them have been available in our BIC web site for a long time and also mentioned in our PFLDnet paper; this result is with the patch that corrects HTCP bugs). 
[Convergence and intra protocol fairness] without background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/intra_protocol/intra_protocol.htm with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/intra_protocol/intra_protocol.htm [RTT fairness]: w/o background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/rtt_fairness/rtt_fairness.htm with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/rtt_fairness/rtt_fairness.htm [TCP friendliness] without background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/tcp_friendliness/tcp_friendliness.htm with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/tcp_friendliness/tcp_friendliness.htm After our discussion in that PFLDnet, I puzzled why we get different results. My guess is that the main difference between your experiment and ours is the inclusion of mid-sized flows with various RTTs -- our experience tells that the RTT variations of mid size flows play a very important role in creating significant dynamics in testing environments. The same point about the importance of mid size flows with RTT variations has been raised in several occasions by Sally Floyd as well, including in this year's E2E research group meeting. You can find some reference to the importance of RTT variations in her paper too [ http://www.icir.org/models/hotnetsFinal.pdf]. Just having web-traffic (all with the same RTTs) does not create a realistic environment as it does not do anything about RTTs and also flow sizes tend to be highly skewed with the Pareto distribution-- but I don't know exactly how you create your testing environment with web-traffic -- I can only guess from the description you have about the web traffic in your paper. 
Another puzzle in this difference seems that even under no background traffic, we also get different results from yours..hmm...especially with FAST because under no background traffic, FAST seems to work fairly well with good RTT fairness in our experiment. But your results show FAST has huge RTT-unfairness. That is very strange. Is that because we have different bandwidth and buffer sizes in the setup? I think we need to compare our notes more. Also in the journal paper of FAST experimental results [ http://netlab.caltech.edu/publications/FAST-ToN-final-060209-2007.pdf ], FAST seems to work very well under no background traffic. We will verify our results again in the exact same environment as you have in your report, to make sure we can reproduce your results, but here are some samples of our results for FAST.

http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/rtt_fairness/1200--2.4_FAST-2.4_FAST-NONE--400-3-1333--1000-76-3-0-0-0-5-500--20-0.6-1000-10-1200-64000-150--1/

In this experiment, FAST flows are just perfect. Also the same result is confirmed in the FAST journal paper [ http://netlab.caltech.edu/publications/FAST-ToN-final-060209-2007.pdf -- please look at Section IV.B and C ]. But your results show really bad RTT fairness.

Best regards,
Injong
---
Injong Rhee
NCSU

On Sep 22, 2006, at 10:22 AM, Douglas Leith wrote:
> For those interested in TCP for high-speed environments, and perhaps also people interested in TCP evaluation generally, I'd like to point you towards the results of a detailed experimental study which are now available at:
tested: Re: [PATCH] tcp: make cubic the default
Stephen,

I've applied both of your patches (http://marc.theaimsgroup.com/?l=linux-netdev&m=115878447914612&w=2 and http://marc.theaimsgroup.com/?l=linux-netdev&m=115878448125216&w=2) and tried to break them, but they now appear to do the right thing in all cases; even after malforming the .config by hand, a 'make oldconfig' restores sanity.

Reno is chosen if none of the non-scary congestion avoidance algorithms are available, and the defaults when they are available are as you intended.

I've test-booted the resulting kernel and everything appears to work as desired: the proper TCP gets chosen, and loading other ones does not change the default, but does make them available. Unloading the module containing the configured policy sets the policy to 'cubic', which is probably the next entry in the policy list.

All in all, this final iteration of the congestion selection patches appears to do the job!

Davem, I'd recommend both patches for merging.

Bert

On Wed, Sep 20, 2006 at 01:32:58PM -0700, Stephen Hemminger wrote:
> Change default congestion control used from BIC to the newer CUBIC
> which is the successor to BIC but has better properties over long delay links.
>
> Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
>
> ---
>  net/ipv4/Kconfig |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> --- net-test.orig/net/ipv4/Kconfig	2006-09-20 12:22:06.0 -0700
> +++ net-test/net/ipv4/Kconfig	2006-09-20 13:31:21.0 -0700
> @@ -454,7 +454,7 @@
>  	  modules.
>  
>  	  Nearly all users can safely say no here, and a safe default
> -	  selection will be made (BIC-TCP with new Reno as a fallback).
> +	  selection will be made (CUBIC with new Reno as a fallback).
>  
>  	  If unsure, say N.
>  
> @@ -462,7 +462,7 @@
>  
>  config TCP_CONG_BIC
>  	tristate "Binary Increase Congestion (BIC) control"
> -	default y
> +	default m
>  	---help---
>  	BIC-TCP is a sender-side only change that ensures a linear RTT
>  	fairness under large windows while offering both scalability and
> @@ -476,7 +476,7 @@
>  
>  config TCP_CONG_CUBIC
>  	tristate "CUBIC TCP"
> -	default m
> +	default y
>  	---help---
>  	This is version 2.0 of BIC-TCP which uses a cubic growth function
>  	among other techniques.
> @@ -573,7 +573,7 @@
>  
>  choice
>  	prompt "Default TCP congestion control"
> -	default DEFAULT_BIC
> +	default DEFAULT_CUBIC
>  	help
>  	  Select the TCP congestion control that will be used by default
>  	  for all connections.
> @@ -600,7 +600,7 @@
>  
>  endif
>  
> -config TCP_CONG_BIC
> +config TCP_CONG_CUBIC
>  	tristate
>  	depends on !TCP_CONG_ADVANCED
>  	default y
> @@ -613,7 +613,7 @@
>  	default "vegas" if DEFAULT_VEGAS
>  	default "westwood" if DEFAULT_WESTWOOD
>  	default "reno" if DEFAULT_RENO
> -	default "bic"
> +	default "cubic"
>  
>  source "net/ipv4/ipvs/Kconfig"

-- 
http://www.PowerDNS.com      Open source, database driven DNS Software
http://netherlabs.nl              Open and Closed source services
[PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver
This patchset is a resubmission of the Ethernet over IPv4 tunnel driver for Linux. I want to thank all reviewers for their annotations and helpful input.

This version contains some major changes to the driver. It now uses its own device type (ARPHRD_ETHERIP). This fixes the problem that EtherIP devices could not be safely distinguished from Ethernet devices. This change required some further changes. First, a second patch to the bridge code is included to allow the use of EtherIP devices in a bridge. The third patch includes the necessary changes to iproute2 (support for the new ARPHRD and general tunnel configuration support for EtherIP).

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver
On Sat, 2006-09-23 14:07:04 +0200, Joerg Roedel [EMAIL PROTECTED] wrote:
> This patchset is the resubmit of the Ethernet over IPv4 tunnel driver for Linux. I want to thank all reviewers for their annotations and helpful input. This version contains some major changes to the driver. It uses its own device type now (ARPHRD_ETHERIP). This fixes the problem that EtherIP devices could not be safely distinguished from Ethernet devices. This change also required some other changes. First a second patch to the bridge code is included to allow the use of EtherIP devices in a bridge. The third patch includes the necessary changes to iproute2 (support of the new ARPHRD and general tunnel configuration support for EtherIP).

I haven't seen the first submission, but is this driver really needed? Can't this be done by creating two tap interfaces on both endpoints and bridging them with a local ethernet device using userland software?

MfG, JBG

-- 
Jan-Benedict Glaw [EMAIL PROTECTED] +49-172-7608481
Signature of: Everything will be fine! ...and today it is already getting a little better.
[PATCH 01/03] net: EtherIP driver, header and MAINTAINERS changes
This patch contains the reworked EtherIP driver, the necessary header updates and adds an entry for EtherIP to the MAINTAINERS file.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]

diff -uprN -X linux-2.6.18-vanilla/Documentation/dontdiff linux-2.6.18-vanilla/include/linux/if_arp.h linux-2.6.18/include/linux/if_arp.h
--- linux-2.6.18-vanilla/include/linux/if_arp.h	2006-09-20 05:42:06.0 +0200
+++ linux-2.6.18/include/linux/if_arp.h	2006-09-23 12:50:05.0 +0200
@@ -85,6 +85,7 @@
 #define ARPHRD_IEEE80211 801	/* IEEE 802.11 */
 #define ARPHRD_IEEE80211_PRISM 802	/* IEEE 802.11 + Prism2 header */
 #define ARPHRD_IEEE80211_RADIOTAP 803	/* IEEE 802.11 + radiotap header */
+#define ARPHRD_ETHERIP 804	/* Ethernet over IPv4 tunnel */
 
 #define ARPHRD_VOID	0x	/* Void type, nothing is known */
 #define ARPHRD_NONE	0xFFFE	/* zero header length */
diff -uprN -X linux-2.6.18-vanilla/Documentation/dontdiff linux-2.6.18-vanilla/include/linux/in.h linux-2.6.18/include/linux/in.h
--- linux-2.6.18-vanilla/include/linux/in.h	2006-09-20 05:42:06.0 +0200
+++ linux-2.6.18/include/linux/in.h	2006-09-20 22:52:30.0 +0200
@@ -40,6 +40,7 @@ enum {
 
   IPPROTO_ESP = 50,	/* Encapsulation Security Payload protocol */
   IPPROTO_AH = 51,	/* Authentication Header protocol */
+  IPPROTO_ETHERIP = 97,	/* Ethernet over IPv4 protocol */
   IPPROTO_PIM = 103,	/* Protocol Independent Multicast */
 
   IPPROTO_COMP = 108,	/* Compression Header protocol */
diff -uprN -X linux-2.6.18-vanilla/Documentation/dontdiff linux-2.6.18-vanilla/net/ipv4/etherip.c linux-2.6.18/net/ipv4/etherip.c
--- linux-2.6.18-vanilla/net/ipv4/etherip.c	1970-01-01 01:00:00.0 +0100
+++ linux-2.6.18/net/ipv4/etherip.c	2006-09-23 12:52:38.0 +0200
@@ -0,0 +1,542 @@
+/*
+ * etherip.c: Ethernet over IPv4 tunnel driver (according to RFC3378)
+ *
+ * This driver could be used to tunnel Ethernet packets through IPv4
+ * networks. This is especially useful together with the bridging
+ * code in Linux.
+ *
+ * This code was written with an eye on the IPIP driver in linux from
+ * Sam Lantinga. Thanks for the great work.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 (no later version) as published by the
+ * Free Software Foundation.
+ *
+ */
+
+#include <linux/capability.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/skbuff.h>
+#include <linux/ip.h>
+#include <linux/if_tunnel.h>
+#include <linux/if_arp.h>
+#include <linux/list.h>
+#include <linux/string.h>
+#include <linux/netfilter_ipv4.h>
+#include <net/ip.h>
+#include <net/protocol.h>
+#include <net/route.h>
+#include <net/ipip.h>
+#include <net/xfrm.h>
+#include <net/inet_ecn.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Joerg Roedel [EMAIL PROTECTED]");
+MODULE_DESCRIPTION("Ethernet over IPv4 tunnel driver");
+
+/*
+ * These 2 defines are taken from ipip.c - if it's good enough for them
+ * it's good enough for me.
+ */
+#define HASH_SIZE	16
+#define HASH(addr)	((addr^(addr>>4))&0xF)
+
+#define ETHERIP_HEADER	((u16)0x0300)
+#define ETHERIP_HLEN	2
+
+#define BANNER1 "etherip: Ethernet over IPv4 tunneling driver\n"
+
+struct etherip_tunnel {
+	struct list_head list;
+	struct net_device *dev;
+	struct net_device_stats stats;
+	struct ip_tunnel_parm parms;
+	unsigned int recursion;
+};
+
+static struct net_device *etherip_tunnel_dev;
+static struct list_head tunnels[HASH_SIZE];
+
+static DEFINE_RWLOCK(etherip_lock);
+
+static void etherip_tunnel_setup(struct net_device *dev);
+
+/* add a tunnel to the hash */
+static void etherip_tunnel_add(struct etherip_tunnel *tun)
+{
+	unsigned h = HASH(tun->parms.iph.daddr);
+	list_add_tail(&tun->list, &tunnels[h]);
+}
+
+/* delete a tunnel from the hash */
+static void etherip_tunnel_del(struct etherip_tunnel *tun)
+{
+	list_del(&tun->list);
+}
+
+/* find a tunnel in the hash by parameters from userspace */
+static struct etherip_tunnel* etherip_tunnel_find(struct ip_tunnel_parm *p)
+{
+	struct etherip_tunnel *ret;
+	unsigned h = HASH(p->iph.daddr);
+
+	list_for_each_entry(ret, &tunnels[h], list)
+		if (ret->parms.iph.daddr == p->iph.daddr)
+			return ret;
+
+	return NULL;
+}
+
+/* find a tunnel by its destination address */
+static struct etherip_tunnel* etherip_tunnel_locate(u32 remote)
+{
+	struct etherip_tunnel *ret;
+	unsigned h = HASH(remote);
+
+	list_for_each_entry(ret, &tunnels[h], list)
+		if (ret->parms.iph.daddr == remote)
+			return
[PATCH 02/03] net/bridge: add support for EtherIP devices
This patch changes the device check in the bridge code to allow EtherIP devices to be added.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]

diff -uprN -X linux-2.6.18-vanilla/Documentation/dontdiff linux-2.6.18-vanilla/net/bridge/br_if.c linux-2.6.18/net/bridge/br_if.c
--- linux-2.6.18-vanilla/net/bridge/br_if.c	2006-09-20 05:42:06.0 +0200
+++ linux-2.6.18/net/bridge/br_if.c	2006-09-20 23:03:26.0 +0200
@@ -407,7 +407,8 @@ int br_add_if(struct net_bridge *br, str
 	struct net_bridge_port *p;
 	int err = 0;
 
-	if (dev->flags & IFF_LOOPBACK || dev->type != ARPHRD_ETHER)
+	if (dev->flags & IFF_LOOPBACK ||
+	    dev->type != ARPHRD_ETHER && dev->type != ARPHRD_ETHERIP)
 		return -EINVAL;
 
 	if (dev->hard_start_xmit == br_dev_xmit)
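
For readers who want to see the effect of the changed check in isolation, here is a small user-space sketch. It is only an illustration: `struct fake_dev` and `br_may_add()` are made-up names, the values of IFF_LOOPBACK and ARPHRD_ETHER are the usual Linux ones, and ARPHRD_ETHERIP = 804 is the value proposed in patch 01/03, not something that exists in mainline.

```c
#include <assert.h>
#include <errno.h>

#define IFF_LOOPBACK   0x8   /* usual Linux value */
#define ARPHRD_ETHER   1     /* usual Linux value */
#define ARPHRD_ETHERIP 804   /* value proposed in patch 01/03 */

/* Hypothetical stand-in for the two net_device fields the check reads. */
struct fake_dev {
	unsigned int flags;
	unsigned short type;
};

/* Mirrors the patched condition: reject loopback devices and any
 * device that is neither plain Ethernet nor EtherIP. */
static int br_may_add(const struct fake_dev *dev)
{
	if ((dev->flags & IFF_LOOPBACK) ||
	    (dev->type != ARPHRD_ETHER && dev->type != ARPHRD_ETHERIP))
		return -EINVAL;
	return 0;
}
```

The explicit parentheses do not change the logic of the patch (&& already binds tighter than ||); they only make the grouping obvious.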
Re: [PATCH 03/03][IPROUTE2] EtherIP tunnel and device support for iproute2
This patch adds support for EtherIP tunnels and devices to the iproute2 userspace software package. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] diff -urp iproute2-2.6.16-060323.orig/ip/iptunnel.c iproute2-2.6.16-060323/ip/iptunnel.c --- iproute2-2.6.16-060323.orig/ip/iptunnel.c 2005-02-10 19:31:18.0 +0100 +++ iproute2-2.6.16-060323/ip/iptunnel.c2006-09-20 22:35:30.0 +0200 @@ -44,7 +44,7 @@ static void usage(void) __attribute__((n static void usage(void) { fprintf(stderr, Usage: ip tunnel { add | change | del | show } [ NAME ]\n); - fprintf(stderr, [ mode { ipip | gre | sit } ] [ remote ADDR ] [ local ADDR ]\n); + fprintf(stderr, [ mode { ipip | gre | sit | etherip } ] [ remote ADDR ] [ local ADDR ]\n); fprintf(stderr, [ [i|o]seq ] [ [i|o]key KEY ] [ [i|o]csum ]\n); fprintf(stderr, [ ttl TTL ] [ tos TOS ] [ [no]pmtudisc ] [ dev PHYS_DEV ]\n); fprintf(stderr, \n); @@ -202,6 +202,12 @@ static int parse_args(int argc, char **a exit(-1); } p-iph.protocol = IPPROTO_IPV6; + } else if (strcmp(*argv, etherip) == 0) { + if (p-iph.protocol p-iph.protocol != IPPROTO_ETHERIP) { + fprintf(stderr,You managed to ask for more than one tunnel mode.\n); + exit(-1); + } + p-iph.protocol = IPPROTO_ETHERIP; } else { fprintf(stderr,Cannot guess tunnel mode.\n); exit(-1); @@ -324,11 +330,15 @@ static int parse_args(int argc, char **a p-iph.protocol = IPPROTO_IPIP; else if (memcmp(p-name, sit, 3) == 0) p-iph.protocol = IPPROTO_IPV6; + else if (memcmp(p-name, ethip, 5) == 0) + p-iph.protocol = IPPROTO_ETHERIP; } - if (p-iph.protocol == IPPROTO_IPIP || p-iph.protocol == IPPROTO_IPV6) { + if (p-iph.protocol == IPPROTO_IPIP || + p-iph.protocol == IPPROTO_IPV6 || + p-iph.protocol == IPPROTO_ETHERIP) { if ((p-i_flags GRE_KEY) || (p-o_flags GRE_KEY)) { - fprintf(stderr, Keys are not allowed with ipip and sit.\n); + fprintf(stderr, Keys are not allowed with ipip, sit or etherip.\n); return -1; } } @@ -351,6 +361,21 @@ static int parse_args(int argc, char **a fprintf(stderr, Broadcast tunnel 
requires a source address.\n); return -1; } + + if (p-iph.protocol == IPPROTO_ETHERIP) { + if ((cmd == SIOCADDTUNNEL || cmd == SIOCCHGTUNNEL) !p-iph.daddr) { + fprintf(stderr, EtherIP tunnel requires a + destination address.\n); + return -1; + } + + /* + if (cmd != SIOCDELTUNNEL p-iph.frag_off htons(IP_DF)) { + fprintf(stderr, Warning: [no]pmtudisc is ignored on +EtherIP tunnels\n); + } + */ + } return 0; } @@ -374,6 +399,8 @@ static int do_add(int cmd, int argc, cha return do_add_ioctl(cmd, gre0, p); case IPPROTO_IPV6: return do_add_ioctl(cmd, sit0, p); + case IPPROTO_ETHERIP: + return do_add_ioctl(cmd, ethip0, p); default: fprintf(stderr, cannot determine tunnel mode (ipip, gre or sit)\n); return -1; @@ -395,6 +422,8 @@ int do_del(int argc, char **argv) return do_del_ioctl(gre0, p); case IPPROTO_IPV6: return do_del_ioctl(sit0, p); + case IPPROTO_ETHERIP: + return do_del_ioctl(ethip0, p); default: return do_del_ioctl(p.name, p); } @@ -418,7 +447,8 @@ void print_tunnel(struct ip_tunnel_parm p-name, p-iph.protocol == IPPROTO_IPIP ? ip : (p-iph.protocol == IPPROTO_GRE ? gre : - (p-iph.protocol == IPPROTO_IPV6 ? ipv6 : unknown)), + (p-iph.protocol == IPPROTO_ETHERIP ? etherip : + (p-iph.protocol == IPPROTO_IPV6 ? ipv6 : unknown))), p-iph.daddr ? format_host(AF_INET, 4, p-iph.daddr, s1, sizeof(s1)) : any, p-iph.saddr ? rt_addr_n2a(AF_INET, 4, p-iph.saddr, s2, sizeof(s2)) : any); @@ -431,19 +461,19 @@ void print_tunnel(struct ip_tunnel_parm if (p-iph.ttl) printf( ttl %d , p-iph.ttl); else - printf( ttl inherit ); + printf( ttl %s,
Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver
On Sat, Sep 23, 2006 at 02:13:27PM +0200, Jan-Benedict Glaw wrote:
> I haven't seen the first submission, but is this driver really needed? Can't this be done with creating two tap interfaces on both endpoints and bridge them with a local ethernet device using userland software?

In general it is possible to use a tap interface to tunnel Ethernet packets. But this driver uses the EtherIP protocol defined in RFC 3378, which has its own IP protocol number (97). This protocol is also supported by several other operating systems (some of the major BSD versions). This driver makes Linux interoperable with these implementations.

Regards,
Joerg Roedel
forcedeth broken powermanagement/irq handling ?
Hi,

since there hasn't been much progress with the bugzilla entry I'm bringing this issue to your attention here. :)

http://bugzilla.kernel.org/show_bug.cgi?id=6398

Vanilla forcedeth doesn't seem to support suspend, and an ifdown/up cycle is needed to get it working again after suspend. Francois Romieu's "Awfully experimental" patch is working just fine for me (with message signalled interrupts disabled) and has survived quite a few suspend/resume cycles. So I'd very much like to see (at least partial, with MSI disabled) suspend support for forcedeth in mainline.

Romieu's patch:

--- linux-2.6.18-rc6/drivers/net/forcedeth.c	2006-09-09 09:45:43.0 +0200
+++ linux-2.6.17.11-xen/drivers/net/forcedeth.c	2006-09-09 09:41:25.0 +0200
@@ -4433,6 +4433,50 @@
 	pci_set_drvdata(pci_dev, NULL);
 }
 
+
+#ifdef CONFIG_PM
+
+static int nv_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+	struct net_device *dev = pci_get_drvdata(pdev);
+	struct fe_priv *np = netdev_priv(dev);
+
+	if (!netif_running(dev))
+		goto out;
+
+	netif_device_detach(dev);
+
+	// Gross.
+	nv_close(dev);
+
+	pci_save_state(pdev);
+	pci_enable_wake(pdev, pci_choose_state(pdev, state), np->wolenabled);
+	pci_set_power_state(pdev, pci_choose_state(pdev, state));
+out:
+	return 0;
+}
+
+static int nv_resume(struct pci_dev *pdev)
+{
+	struct net_device *dev = pci_get_drvdata(pdev);
+	int rc = 0;
+
+	if (!netif_running(dev))
+		goto out;
+
+	netif_device_attach(dev);
+
+	pci_set_power_state(pdev, PCI_D0);
+	pci_restore_state(pdev);
+	pci_enable_wake(pdev, PCI_D0, 0);
+
+	rc = nv_open(dev);
+out:
+	return rc;
+}
+
+#endif /* CONFIG_PM */
+
 static struct pci_device_id pci_tbl[] = {
 	{	/* nForce Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_1),
@@ -4534,6 +4578,10 @@
 	.id_table	= pci_tbl,
 	.probe		= nv_probe,
 	.remove		= __devexit_p(nv_remove),
+#ifdef CONFIG_PM
+	.suspend	= nv_suspend,
+	.resume		= nv_resume,
+#endif
 };

-- 
Tobias
PGP: http://9ac7e0bc.uguu.de
This mail is made of 100% recycled bits.
Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver
On Sat, 2006-23-09 at 14:13 +0200, Jan-Benedict Glaw wrote:
> On Sat, 2006-09-23 14:07:04 +0200, Joerg Roedel [EMAIL PROTECTED] wrote:
> > This patchset is the resubmit of the Ethernet over IPv4 tunnel driver for Linux. I want to thank all reviewers for their annotations and helpful input. This version contains some major changes to the driver. It uses its own device type now (ARPHRD_ETHERIP). This fixes the problem that EtherIP devices could not be safely distinguished from Ethernet devices. This change also required some other changes. First a second patch to the bridge code is included to allow the use of EtherIP devices in a bridge. The third patch includes the necessary changes to iproute2 (support of the new ARPHRD and general tunnel configuration support for EtherIP).
>
> I haven't seen the first submission, but is this driver really needed? Can't this be done with creating two tap interfaces on both endpoints and bridge them with a local ethernet device using userland software?

You just need to use a GRE tunnel instead of what you describe above.

While I feel bad that Joerg (and Lennert and others before) have put in the effort to do the work, I too question the need for this driver. I don't think even the authors of the original RFC feel this provides anything that GRE can't (according to a posting on netdev that one of the authors made). My understanding is also that the only other OS that implemented this got it wrong, hence you will have to interoperate with them and provide quirk checks. I am actually curious whether anyone uses it instead of GRE in OpenBSD.

You could argue that including this driver would allow Linux to have another bulb in the christmas tree; the other (more pragmatic) way to look at this is that it allows spreading a bad idea and needs to be censored. I prefer the latter, and hope this doesn't discourage Joerg from contributing in the future.

cheers,
jamal
Re: [PATCH 5/9] network namespaces: async socket operations
On Fri, Sep 22, 2006 at 05:33:56PM +0200, Daniel Lezcano wrote:
> Andrey Savochkin wrote:
> > Non-trivial part of socket namespaces: asynchronous events
> > should be run in proper context.
> >
> > Signed-off-by: Andrey Savochkin [EMAIL PROTECTED]
> > ---
> >  af_inet.c            |   10 ++
> >  inet_timewait_sock.c |    8
> >  tcp_timer.c          |    9 +
> >  3 files changed, 27 insertions(+)
> >
> > --- ./net/ipv4/af_inet.c.venssock-asyn	Mon Aug 14 17:04:07 2006
> > +++ ./net/ipv4/af_inet.c	Tue Aug 15 13:45:44 2006
> > @@ -366,10 +366,17 @@ out_rcu_unlock:
> >  int inet_release(struct socket *sock)
> >  {
> >  	struct sock *sk = sock->sk;
> > +	struct net_namespace *ns, *orig_net_ns;
> >
> >  	if (sk) {
> >  		long timeout;
> >
> > +		/* Need to change context here since protocol ->close
> > +		 * operation may send packets.
> > +		 */
> > +		ns = get_net_ns(sk->sk_net_ns);
> > +		push_net_ns(ns, orig_net_ns);
> > +
>
> Is it not a race condition here ? What happens if you have a packet
> incoming during the namespace context switching ?

All asynchronous operations (RX softirq, timers) should set their context explicitly, and can't rely on the current context being the right one (or a valid pointer at all).

Andrey
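
The rule Andrey states (every asynchronous path sets its own context and restores the previous one afterwards) can be sketched in user space as below. This is an illustration under stated assumptions, not the code from the patch set: the macro bodies are guesses at what `push_net_ns()` might do, `pop_net_ns()`, `current_net_ns`, `struct fake_sock` and `async_close()` are hypothetical names, and the reference counting done by `get_net_ns()` is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

struct net_namespace {
	int id;
};

static struct net_namespace init_ns = { 0 };

/* Hypothetical "current context" pointer; in the kernel this would be
 * per-task or per-CPU state rather than a plain global. */
static struct net_namespace *current_net_ns = &init_ns;

/* Assumed semantics: remember the old context in 'orig', then switch. */
#define push_net_ns(to, orig) \
	do { (orig) = current_net_ns; current_net_ns = (to); } while (0)

/* Hypothetical counterpart that restores the saved context. */
#define pop_net_ns(orig) \
	do { current_net_ns = (orig); } while (0)

struct fake_sock {
	struct net_namespace *sk_net_ns;
};

/* Stand-in for a timer or softirq handler: it does not trust whatever
 * context happens to be current, but takes it from the socket. */
static int async_close(struct fake_sock *sk)
{
	struct net_namespace *orig;
	int seen;

	push_net_ns(sk->sk_net_ns, orig);
	seen = current_net_ns->id;	/* the real work would run here */
	pop_net_ns(orig);

	return seen;
}
```

Because the handler derives the namespace from the socket rather than from the interrupted context, it does not matter which context was active when the event fired.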
Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver
On Sat, Sep 23, 2006 at 08:38:37AM -0400, jamal wrote:

Hello Jamal,

> You just need to use GRE tunnel instead of what you describe above.

The main intention for this driver was not only to provide Ethernet over IPv4 tunneling. That is also possible in userspace using a tap interface (as Jan-Benedict Glaw mentioned). The other main intention was to provide tunneling of Ethernet packets using the EtherIP protocol.

> While i feel bad that Joerg (and Lennert and others before) have put the effort to do the work, i too question the need for this driver. I dont think even the authors of the original RFC feel this provides anything that GRE cant (according to some posting on netdev that one of the authors made).

You are right. I completely agree with this. But this is also true for the IPIP and the SIT drivers. You can do both with GRE, and there are reasons to keep both in the kernel.

> My understanding is also that the only other OS that implemented this got it wrong - hence you will have to interop with them and provide quirks checks.

At the moment I know that at least OpenBSD, NetBSD and FreeBSD support the EtherIP protocol. The first of them was OpenBSD, that's right. I don't think OpenBSD made a wrong implementation at this point (I assume you are speaking of the position of the 3 in the header). The RFC is not clear here. It defines that the first 4 bits in the 16-bit EtherIP header MUST be 0011, but it defines neither the byte order of that 16-bit word nor whether the least or most significant bit comes first. This was the reason (to keep interoperability with the existing implementations) I implemented it the same way as OpenBSD, and my driver does not check the incoming EtherIP header.

> I am actually curious if anyone uses it instead of GRE in openbsd?

When I searched Google for EtherIP I found some entries in BSD forums discussing questions concerning EtherIP usage. This, and the fact that I know a BSD user who uses EtherIP too, makes me believe there are numerous users of EtherIP in the BSD world. And at least the BSD user I know wants interoperability of his NetBSD implementation with Linux. This request was the starting point for this driver.

> You could argue that including this driver would allow Linux to have another bulb in the christmas tree; the other (more pragmatic way) to look at this is it allows spreading a bad idea and needs to be censored.

I am not a friend of censorship. I think the users should have the freedom to decide what they want to use. There are reasons to have more than one way to tunnel Ethernet packets in the kernel (the reason for EtherIP is the interoperability with the BSD implementations). I don't know if the GRE driver in mainline already supports Ethernet tunneling. But if not, my driver is already the second way to do it (after the tap devices).

> I prefer the later - and hope this doesnt discourage Joerg from contributing in the future.

Surely not. I intend to contribute further even if this driver is finally rejected :)

Regards,
Joerg Roedel
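
To make the header ambiguity concrete, here is a small sketch of the two readings of "the first 4 bits are 0011" as bytes on the wire. It is only an illustration: all function and enum names are made up, it does not claim which reading any particular BSD or this driver uses, and treating the remaining 12 bits as zero is an assumption about the reserved field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHERIP_HLEN    2
#define ETHERIP_VERSION 3

/* Hypothetical labels for the two interpretations. */
enum etherip_layout {
	ETHERIP_UNKNOWN,
	ETHERIP_VER_HIGH_NIBBLE,	/* wire bytes 0x30 0x00 */
	ETHERIP_VER_LOW_NIBBLE		/* wire bytes 0x03 0x00 */
};

/* Reading A: the version occupies the most significant 4 bits of the
 * first octet on the wire. */
static void etherip_write_high(uint8_t hdr[ETHERIP_HLEN])
{
	hdr[0] = (uint8_t)(ETHERIP_VERSION << 4);
	hdr[1] = 0;
}

/* Reading B: the version occupies the least significant 4 bits of the
 * first octet on the wire. */
static void etherip_write_low(uint8_t hdr[ETHERIP_HLEN])
{
	hdr[0] = ETHERIP_VERSION;
	hdr[1] = 0;
}

/* A tolerant receiver could classify the header instead of rejecting
 * one of the two forms outright. */
static enum etherip_layout etherip_classify(const uint8_t *hdr, size_t len)
{
	if (len < ETHERIP_HLEN || hdr[1] != 0)
		return ETHERIP_UNKNOWN;
	if (hdr[0] == (ETHERIP_VERSION << 4))
		return ETHERIP_VER_HIGH_NIBBLE;
	if (hdr[0] == ETHERIP_VERSION)
		return ETHERIP_VER_LOW_NIBBLE;
	return ETHERIP_UNKNOWN;
}
```

A receiver that accepts both forms, or one that skips the check as described above, interoperates with either kind of sender, at the cost of not detecting a malformed header.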
Re: Bcm43xx softMac Driver in 2.6.18
Rafael J. Wysocki wrote:
> > 2.6.18 vanilla and 2.6.18 with your patch both lock my system hard with bcm43xx. I've got an HP/Compaq nx6125 laptop. Symptoms are that it will associate fine on its own and send traffic to/fro upon ifup, but when I do an iwconfig, ifdown, ifup to change the access point, the system locks (somewhat randomly) during one of those operations. Well, the iwconfig or the ifup, actually.
>
> I have observed similar symptoms on HPC nx6325, although I haven't managed to get the adapter associate with an AP.

Yeah, I'm having the same troubles. Carefully watching the iwconfig results showed me that only half of the time did my `iwconfig eth1 essid AccessPointName` actually take. (It listed the essid of the AP I told it to associate with, but then showed "Access Point: Invalid" or words to that effect, until I issued the exact same iwconfig again.)

So, try it twice, double check the iwconfig output, then try bringing up the interface. Though that seems awfully difficult to do as well (DHCP is just sending out stuff with nothing coming back). When I switch consoles while DHCP is plaintively asking for an IP, and issue *another* iwconfig with the same essid, then it seems to kick something in the driver and DHCP immediately associates. Happened twice for me so far, though that could merely be a coincidence.

Ray
[PATCH] Advertise PPPoE MTU / avoid memory leak.
PPPoE must advertise the underlying device's MTU via the ppp channel descriptor structure, as multilink functionality depends on it.

__pppoe_xmit must free any skb it allocates if there is an error submitting the skb downstream.

Signed-off-by: Michal Ostrowski [EMAIL PROTECTED]
---
 drivers/net/pppoe.c |    5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 475dc93..b4dc516 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -600,6 +600,7 @@ static int pppoe_connect(struct socket *
 		po->chan.hdrlen = (sizeof(struct pppoe_hdr) +
 				   dev->hard_header_len);
 
+		po->chan.mtu = dev->mtu - sizeof(struct pppoe_hdr);
 		po->chan.private = sk;
 		po->chan.ops = &pppoe_chan_ops;
 
@@ -831,7 +832,7 @@ static int __pppoe_xmit(struct sock *sk,
 	struct pppoe_hdr *ph;
 	int headroom = skb_headroom(skb);
 	int data_len = skb->len;
-	struct sk_buff *skb2;
+	struct sk_buff *skb2 = NULL;
 
 	if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED))
 		goto abort;
@@ -887,6 +888,8 @@ static int __pppoe_xmit(struct sock *sk,
 	return 1;
 
 abort:
+	if (skb2)
+		kfree_skb(skb2);
 	return 0;
 }
 
-- 
1.4.1.1
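
As a worked example of the MTU arithmetic in the first hunk, the sketch below recomputes the advertised channel MTU in user space. `struct pppoe_hdr_sketch` is a hypothetical mirror of the fixed part of the PPPoE header (version/type, code, session id, length) and `pppoe_channel_mtu()` is a made-up helper; neither is the kernel's definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the fixed 6-byte PPPoE header. */
struct pppoe_hdr_sketch {
	uint8_t  ver_type;	/* 4-bit version, 4-bit type */
	uint8_t  code;
	uint16_t sid;		/* session id */
	uint16_t length;	/* payload length */
};

/* Same computation as the patched line: the channel can carry what the
 * underlying device carries, minus the PPPoE header. */
static int pppoe_channel_mtu(int dev_mtu)
{
	return dev_mtu - (int)sizeof(struct pppoe_hdr_sketch);
}
```

For a 1500-byte Ethernet device this gives 1494. The commonly quoted PPPoE MTU of 1492 additionally accounts for the 2-byte PPP protocol field, which this sketch does not model.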
softmac mtu
Hi,

why can't softmac (and maybe other devices using the Linux 802.11 stack) increase their MTU above 1500? IIRC 802.11 allows sending bigger frames. Moreover, some drivers such as airo allow an MTU bigger than 2000.

thanks,
Matthieu
Re: 2.6.1[78] page allocation failure. order:3, mode:0x20
Andrew Morton wrote:
> On Fri, 22 Sep 2006 22:25:07 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote:
> > From: Andrew Morton [EMAIL PROTECTED]
> > Date: Fri, 22 Sep 2006 21:50:00 -0700
> > > On Fri, 22 Sep 2006 10:10:36 -0700 Auke Kok [EMAIL PROTECTED] wrote:
> > > > e1000: account for NET_IP_ALIGN when calculating bufsiz
> > > >
> > > > Account for NET_IP_ALIGN when requesting buffer sizes from netdev_alloc_skb to reduce slab allocation by half.
> > >
> > > Could we please do whatever is needed to get this blessed and merged? This is such a common problem on such a common driver that I would suggest that we want this in 2.6.18.x as well. At least, I'd expect distributors to ship this fix (they're nuts if they don't) and so it makes sense to deliver it from kernel.org.
> >
> > The NET_IP_ALIGN existed not just for fun :) There are ramifications for removing it.
>
> It's still there, isn't it? For the 9k MTU case, for example, we end up allocating 16384 byte skbs instead of 32768 byte ones.

Yes, the only thing I'm doing is accounting for the 2 bytes one step earlier. It works fine for the general case and I tested it too, but I am not too sure about the corner cases, as the hardware has no notion of MTU at all and could possibly overwrite by two bytes. I think my patch actually gives the hardware two bytes too many now, so we're on the other (safe) side of that problem, but I have to verify this first of course.

I'll be wrestling with this on Monday with Jesse and try to nail it down.

Auke

diff -puN drivers/net/e1000/e1000_main.c~e1000-account-for-net_ip_align-when-calculating-bufsiz drivers/net/e1000/e1000_main.c
--- a/drivers/net/e1000/e1000_main.c~e1000-account-for-net_ip_align-when-calculating-bufsiz
+++ a/drivers/net/e1000/e1000_main.c
@@ -1101,7 +1101,7 @@ e1000_sw_init(struct e1000_adapter *adap
 
 	pci_read_config_word(pdev, PCI_COMMAND, &hw->pci_cmd_word);
 
-	adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE;
+	adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE + NET_IP_ALIGN;
 	adapter->rx_ps_bsize0 = E1000_RXBUFFER_128;
 	hw->max_frame_size = netdev->mtu +
 			     ENET_HEADER_SIZE + ETHERNET_FCS_SIZE;
@@ -3163,26 +3163,27 @@ e1000_change_mtu(struct net_device *netd
 	 * larger slab size
 	 * i.e. RXBUFFER_2048 --> size-4096 slab */
 
-	if (max_frame <= E1000_RXBUFFER_256)
+	if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_256)
 		adapter->rx_buffer_len = E1000_RXBUFFER_256;
-	else if (max_frame <= E1000_RXBUFFER_512)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_512)
 		adapter->rx_buffer_len = E1000_RXBUFFER_512;
-	else if (max_frame <= E1000_RXBUFFER_1024)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_1024)
 		adapter->rx_buffer_len = E1000_RXBUFFER_1024;
-	else if (max_frame <= E1000_RXBUFFER_2048)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_2048)
 		adapter->rx_buffer_len = E1000_RXBUFFER_2048;
-	else if (max_frame <= E1000_RXBUFFER_4096)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_4096)
 		adapter->rx_buffer_len = E1000_RXBUFFER_4096;
-	else if (max_frame <= E1000_RXBUFFER_8192)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_8192)
 		adapter->rx_buffer_len = E1000_RXBUFFER_8192;
-	else if (max_frame <= E1000_RXBUFFER_16384)
+	else
 		adapter->rx_buffer_len = E1000_RXBUFFER_16384;
 
 	/* adjust allocation if LPE protects us, and we aren't using SBP */
 	if (!adapter->hw.tbi_compatibility_on &&
 	    ((max_frame == MAXIMUM_ETHERNET_FRAME_SIZE) ||
 	     (max_frame == MAXIMUM_ETHERNET_VLAN_SIZE)))
-		adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE;
+		adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE
+					 + NET_IP_ALIGN;
 
 	netdev->mtu = new_mtu;
 
@@ -4002,7 +4003,8 @@ e1000_alloc_rx_buffers(struct e1000_adap
 	struct e1000_buffer *buffer_info;
 	struct sk_buff *skb;
 	unsigned int i;
-	unsigned int bufsz = adapter->rx_buffer_len + NET_IP_ALIGN;
+	/* we have already accounted for NET_IP_ALIGN */
+	unsigned int bufsz = adapter->rx_buffer_len;
 
 	i = rx_ring->next_to_use;
 	buffer_info = &rx_ring->buffer_info[i];
_
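
The "reduce slab allocation by half" claim can be checked with a little arithmetic. The sketch below is a simplified user-space model, not driver code: `rx_bucket()` and `slab_size_for()` are hypothetical helpers, the allocator is assumed to round requests up to the next power of two, and every other per-skb overhead is ignored, so real numbers will differ.

```c
#include <assert.h>
#include <stddef.h>

#define NET_IP_ALIGN 2

/* Smallest power-of-two receive buffer (256..16384) that holds 'need'. */
static size_t rx_bucket(size_t need)
{
	size_t bucket = 256;

	while (bucket < need && bucket < 16384)
		bucket <<= 1;
	return bucket;
}

/* Assumed allocator behaviour: round up to the next power of two. */
static size_t slab_size_for(size_t bytes)
{
	size_t slab = 32;

	while (slab < bytes)
		slab <<= 1;
	return slab;
}

/* Before the patch: pick the bucket from max_frame alone, then add
 * NET_IP_ALIGN when allocating. */
static size_t old_alloc(size_t max_frame)
{
	return slab_size_for(rx_bucket(max_frame) + NET_IP_ALIGN);
}

/* After the patch: include NET_IP_ALIGN when picking the bucket and
 * allocate exactly the bucket size. */
static size_t new_alloc(size_t max_frame)
{
	return slab_size_for(rx_bucket(max_frame + NET_IP_ALIGN));
}
```

With a 9000-byte MTU (a 9018-byte frame including Ethernet header and FCS) the old scheme asks for 16386 bytes and lands in a 32768-byte slab, while the new scheme asks for 16384 bytes and stays in the 16384-byte slab.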
Re: Bcm43xx softMac Driver in 2.6.18
Ray Lee wrote: Rafael J. Wysocki wrote: 2.6.18 vanilla and 2.6.18 with your patch both lock my system hard with bcm43xx. I've got an HP/Compaq nx6125 laptop. Symptoms are that it will associate fine on its own and send traffic to/fro upon ifup, but when I do an iwconfig, ifdown, ifup to change the access point, the system locks (somewhat randomly) during one of those operations. Well, the iwconfig or the ifup, actually. I have observed similar symptoms on HPC nx6325, although I haven't managed to get the adapter associate with an AP. Yeah, I'm having the same troubles. Carefully watching the iwconfig results showed me that only half of the time did my `iwconfig eth1 essid AccessPointName` actually take. (It listed the essid of the ap I told it to associate with, but then showed Access Point: Invalid or words to that effect, until I issued the exact same iwconfig again.) So, try it twice, double check the iwconfig output, then try bringing up the interface. Though that seems awfully difficult to do as well (DHCP is just sending out stuff with nothing coming back). When I switch consoles while DHCP is plaintively asking for an IP, and issue *another* iwconfig with the same essid, then it seems to kick something in the driver and DHCP immediately associates. Happened twice for me so far, though that could merely be a coincidence. I don't know about the problems associating, and/or with changing APs - I have only one and it associates and authenticates with WPA-PSK without any trouble. As to the lockups that you are seeing, I have generated a diff between vanilla 2.6.18 and wireless-2.6 with some essential patches added. At the moment, I'm compiling and testing it. There are more problems with locking than I realized. If the patch works here, I'll post it to you and to the bcm43xx list. The hard part may be getting stable to accept it for 2.6.18.1. Thanks for the bug reports. 
Larry
Re: [PATCH] bcm43xx: further fix for periodic work errors
Michael Buesch wrote: On Saturday 23 September 2006 06:08, Larry Finger wrote:
Recent changes in the setup for preemptible periodic work fixed most of the problems with NETDEV watchdog timeouts; however, some variants of the bcm43xx device still had the problem. These were fixed by setting the parameter MAXIMUM_BADNESS to 0. By doing so, all the functionality associated with calculating the 'badness' of the upcoming periodic work is no longer needed; therefore it is removed.
Uhm, no. Wait. _Why_ does the watchdog trigger? All periodic work in the fastpath (which you remove with this patch) is supposed to execute in a few microseconds. I don't think we want to fix this by removing the fastpath and always taking the _expensive_ slowpath periodic work. So why does the watchdog trigger for the fast periodic work? We need to find out. Removing the fastpath is just bad for overall latency. The two fastpath periodic works are the 15 and 30 second ones, if executed standalone. If the 15 and/or 30 second work is executed alongside a 60 second work, it's all slowpath, of course.
I was thinking that the 15 second periodic work called mac suspend, which is the most expensive part of the slowpath, but I see that is an unlikely condition. I'm now testing to see if moving the netif_tx_disable/netif_wake_queue pair into all paths fixes the errors. Those calls should be relatively inexpensive.
Larry
Re: [PATCH] bcm43xx: further fix for periodic work errors
On Saturday 23 September 2006 21:06, Larry Finger wrote: Michael Buesch wrote: On Saturday 23 September 2006 06:08, Larry Finger wrote:
Recent changes in the setup for preemptible periodic work fixed most of the problems with NETDEV watchdog timeouts; however, some variants of the bcm43xx device still had the problem. These were fixed by setting the parameter MAXIMUM_BADNESS to 0. By doing so, all the functionality associated with calculating the 'badness' of the upcoming periodic work is no longer needed; therefore it is removed.
Uhm, no. Wait. _Why_ does the watchdog trigger? All periodic work in the fastpath (which you remove with this patch) is supposed to execute in a few microseconds. I don't think we want to fix this by removing the fastpath and always taking the _expensive_ slowpath periodic work. So why does the watchdog trigger for the fast periodic work? We need to find out. Removing the fastpath is just bad for overall latency. The two fastpath periodic works are the 15 and 30 second ones, if executed standalone. If the 15 and/or 30 second work is executed alongside a 60 second work, it's all slowpath, of course.
I was thinking that the 15 second periodic work called mac suspend, which is the most expensive part of the slowpath, but I see that is an unlikely condition. I'm now testing to see if moving the netif_tx_disable/netif_wake_queue pair into all paths fixes the errors. Those calls should be relatively inexpensive.
Well, even _if_ mac_suspend takes a few milliseconds (which it does not), it would not trigger the watchdog. I measured the time it takes to execute the various works and based the badness selection on the results. If the 15 or 30 second work is really able to trigger a watchdog timeout, it's a _bug_ that needs to be fixed and not to be papered over. It can't trigger the watchdog by running uninterruptibly for too long (it won't run for 5 sec...). If it triggers, it's triggered by something else (like the synchronize_net thingie in the past).
-- Greetings Michael.
Re: tested: Re: [PATCH] tcp: make cubic the default
From: bert hubert [EMAIL PROTECTED] Date: Sat, 23 Sep 2006 13:14:34 +0200 All in all, this final iteration of the congestion selection patches appears to do the job! Davem, I'd recommend both patches for merging. Great, I'll make sure I review them too and integrate them. Thanks for checking this stuff out so thoroughly.
Re: 2.6.1[78] page allocation failure. order:3, mode:0x20
From: Auke Kok [EMAIL PROTECTED] Date: Sat, 23 Sep 2006 11:50:34 -0700 Andrew Morton wrote: It's still there, isn't it? For the 9k MTU case, for example, we end up allocating 16384 byte skbs instead of 32768 byte ones. yes, the only thing I'm doing is accounting for the 2 bytes one step earlier. Ok, I'm fine with this patch unless it causes some regression that hasn't been discovered yet :-)
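As a rough illustration of the sizes discussed above, the sketch below models how a receive buffer request maps onto a power-of-two allocation and a page order. It is a simplification under stated assumptions: 4096-byte pages, an alignment pad of 2 bytes, and pure power-of-two rounding with no skb bookkeeping overhead. The helper names are hypothetical and are not taken from the e1000 driver.

```c
#include <stddef.h>

#define PAGE_SIZE_ASSUMED    4096u  /* assumption: 4 KiB pages */
#define NET_IP_ALIGN_ASSUMED 2u     /* assumption: 2-byte IP header alignment pad */

/* Round a request up to the next power of two (hypothetical helper). */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Smallest page order whose block (page size << order) can hold n bytes. */
static unsigned int page_order(size_t n)
{
    unsigned int order = 0;

    while (((size_t)PAGE_SIZE_ASSUMED << order) < n)
        order++;
    return order;
}
```

Under these assumptions a 16384-byte buffer fits an order-2 block, while 16384 + 2 bytes spills into an order-3 block, which is consistent with the order:3 failures named in the subject line.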
Re: [PATCH] bcm43xx: further fix for periodic work errors
Michael Buesch wrote:
Well, even _if_ mac_suspend takes a few milliseconds (which it does not), it would not trigger the watchdog. I measured the time it takes to execute the various works and based the badness selection on the results. If the 15 or 30 second work is really able to trigger a watchdog timeout, it's a _bug_ that needs to be fixed and not to be papered over. It can't trigger the watchdog by running uninterruptibly for too long (it won't run for 5 sec...). If it triggers, it's triggered by something else (like the synchronize_net thingie in the past).
Even the synchronize_net problem wasn't taking 5 seconds to complete; it was messing up the transmit process. I went back to check my logs again, and the actual error was BCM43xx_IRQ_XMIT_ERROR, which is always preceded by a 'MAC suspend failed' message. These never happened all the time I was running with MAXIMUM_BADNESS of 0. I think the _bug_ is letting the transmit process run while doing the periodic work, which is why I'm testing with the tx_disable before all periodic work. I'll let you know in 2 or 3 days if it fixes the problem. It takes that long to trigger.
Larry
Re: [PATCH] bcm43xx: further fix for periodic work errors
On Saturday 23 September 2006 22:05, Larry Finger wrote: Michael Buesch wrote:
Well, even _if_ mac_suspend takes a few milliseconds (which it does not), it would not trigger the watchdog. I measured the time it takes to execute the various works and based the badness selection on the results. If the 15 or 30 second work is really able to trigger a watchdog timeout, it's a _bug_ that needs to be fixed and not to be papered over. It can't trigger the watchdog by running uninterruptibly for too long (it won't run for 5 sec...). If it triggers, it's triggered by something else (like the synchronize_net thingie in the past).
Even the synchronize_net problem wasn't taking 5 seconds to complete, it was messing up the transmit process.
That's what I am saying. There must be another similar bug.
I went back to check my logs again, and the actual error was BCM43xx_IRQ_XMIT_ERROR, which is always preceded by a MAC suspend failed. These never happened all the time I was running with MAXIMUM_BADNESS of 0.
We can debug with the recently spec'ed reason and error registers why this is triggered. See v4 specs.
I think the _bug_ is letting the transmit process run while doing the periodic work,
No. We don't let TX run while doing any periodic work (slow or fast). Same for the IRQ handler. We take the IRQ lock, which protects against the IRQ and TX paths (and everything else). The _only_ difference between slowpath and fastpath periodic work is that slowpath (long) periodic work is preemptible. This is gained by not taking the IRQ lock, but protecting it otherwise (disabling IRQs and TX). So what you are doing by your patch is: _never_ taking the lock.
which is why I'm testing with the tx_disable before all periodic work. I'll let you know in 2 or 3 days if it
It is not needed. tx_disable is only needed for long periodic work.
-- Greetings Michael.
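The fastpath/slowpath distinction described above can be sketched as a small userspace model. This is not the bcm43xx code: the weights, the example limit of 4, and all names are hypothetical. Only the 15/30/60-second grouping, and the consequence that a badness limit of 0 sends every run down the slow path, come from the thread.

```c
/* How a periodic run is protected, per the description in the thread. */
enum work_path {
    FAST_PATH,  /* runs with the IRQ lock held; short and not preemptible */
    SLOW_PATH   /* IRQs and TX disabled instead of the lock; preemptible */
};

/*
 * The work fires every 15 s; 'tick' counts those firings.
 * Weights are assumptions chosen only to make the 60 s work "expensive".
 */
static unsigned int badness(unsigned int tick)
{
    unsigned int b = 1;     /* the 15 s work runs on every tick */

    if (tick % 2 == 0)
        b += 1;             /* the 30 s work */
    if (tick % 4 == 0)
        b += 5;             /* the 60 s work */
    return b;
}

/* Pick the slow path whenever the upcoming work exceeds the limit. */
static enum work_path choose_path(unsigned int tick, unsigned int max_badness)
{
    return badness(tick) > max_badness ? SLOW_PATH : FAST_PATH;
}
```

With a limit of 4 in this model, the standalone 15 and 30 second runs stay on the fast path and any run that includes the 60 second work goes slow. With a limit of 0 nothing qualifies for the fast path, which is the behaviour Michael objects to.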
Re: softmac mtu
Matthieu CASTET wrote: Hi, why can't softmac (and maybe devices using the Linux 80211 stack) increase their MTU above 1500? IIRC 802.11 allows sending bigger frames. Moreover, some drivers like airo allow an MTU bigger than 2000. The maximum value for MTU is set in include/linux/if_ether.h for all ethernet-type communications, not in softmac or ieee80211. I doubt that one could easily change the number. It may be that the 802.11 standard allows bigger frames, but it looks to me as if Linux does not. Larry
Re: softmac mtu
From: Larry Finger [EMAIL PROTECTED] Date: Sat, 23 Sep 2006 16:40:15 -0500 The maximum value for MTU is set in include/linux/if_ether.h for all ethernet-type communications, not in softmac or ieee80211. I doubt that one could easily change the number. It may be that the 802.11 standard allows bigger frames, but it looks to me as if Linux does not. Not correct. Linux is perfectly fine with setting 9000 byte MTU on ethernet devices that support it, and in fact just about every gigabit ethernet driver supports it. That macro you see in if_ether.h is just the value of the base MTU limit, so larger MTU settings are easily allowable on a per-device basis.
Re: softmac mtu
David Miller wrote: From: Larry Finger [EMAIL PROTECTED] Date: Sat, 23 Sep 2006 16:40:15 -0500 The maximum value for MTU is set in include/linux/if_ether.h for all ethernet-type communications, not in softmac or ieee80211. I doubt that one could easily change the number. It may be that the 802.11 standard allows bigger frames, but it looks to me as if Linux does not. Not correct. Linux is perfectly fine with setting 9000 byte MTU on ethernet devices that support it, and in fact just about every gigabit ethernet driver supports it. That macro you see in if_ether.h is just the value of the base MTU limit, so larger MTU settings are easily allowable on a per-device basis. Where/how does the device allow it? When I tried 'ifconfig eth0 mtu 2000' on my VIA Technologies, Inc. VT6102 [Rhine-II] wired controller, I got a 'SIOCSIFMTU: Invalid argument' message, which is the same message I get on my BCM4306 wireless card. Larry
Re: softmac mtu
On 9/23/06, Arnaldo Carvalho de Melo [EMAIL PROTECTED] wrote: On 9/23/06, Larry Finger [EMAIL PROTECTED] wrote: David Miller wrote: From: Larry Finger [EMAIL PROTECTED] Date: Sat, 23 Sep 2006 16:40:15 -0500 The maximum value for MTU is set in include/linux/if_ether.h for all ethernet-type communications, not in softmac or ieee80211. I doubt that one could easily change the number. It may be that the 802.11 standard allows bigger frames, but it looks to me as if Linux does not. Not correct. Linux is perfectly fine with setting 9000 byte MTU on ethernet devices that support it, and in fact just about every gigabit ethernet driver supports it. That macro you see in if_ether.h is just the value of the base MTU limit, so larger MTU settings are easily allowable on a per-device basis. Where/how does the device allow it? When I tried 'ifconfig eth0 mtu 2000' on my VIA Technologies, Inc. VT6102 [Rhine-II] wired controller, I got a 'SIOCSIFMTU: Invalid argument' message, which is the same message I get on my BCM4306 wireless card.
David didn't say 1500 all the way to 9000; he said that some drivers support 9000 and some don't. Lemme check for ya which ones do...
drivers/net/8139cp.c: max is 4096
drivers/net/acenic.c: 9000
just do a:
vi $(find drivers/net | xargs grep -l change_mtu)
and check the rest :-)
- Arnaldo
Re: [PATCH] Advertise PPPoE MTU / avoid memory leak.
From: [EMAIL PROTECTED] Date: Sat, 23 Sep 2006 12:30:23 -0500 __pppoe_xmit must free any skb it allocates if there is an error submitting the skb downstream.
This isn't right. dev_queue_xmit() can return -ENETDOWN and still free the SKB, so your change will cause the SKB to be freed twice in that case. From dev_queue_xmit():
	rc = -ENETDOWN;
	rcu_read_unlock_bh();
out_kfree_skb:
	kfree_skb(skb);
	return rc;
dev_queue_xmit() is basically expected to consume the packet, error or not. What case of calling dev_queue_xmit() did you discover that did not kfree the SKB on error? We should fix that. On a quick scan of the entire dev_queue_xmit() implementation, I cannot find such a case.
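The ownership rule described above can be illustrated with a userspace model. The types and names below (`struct pkt`, `xmit_consume`, `free_pkt`, `send_buggy`, `send_correct`) are hypothetical stand-ins and not kernel APIs. The sketch only shows why a caller that frees after an error return ends up releasing the same buffer twice.

```c
#include <errno.h>

/* Stand-in for a packet buffer; counts releases instead of freeing memory. */
struct pkt {
    int free_count;
};

static void free_pkt(struct pkt *p)
{
    p->free_count++;
}

/* Model of a transmit call that always consumes the packet, error or not. */
static int xmit_consume(struct pkt *p, int link_down)
{
    if (link_down) {
        free_pkt(p);        /* error path still releases the packet */
        return -ENETDOWN;
    }
    free_pkt(p);            /* stands in for "sent, then released" */
    return 0;
}

/* Buggy caller: releases again on error, so the packet is freed twice. */
static int send_buggy(struct pkt *p, int link_down)
{
    int rc = xmit_consume(p, link_down);

    if (rc < 0)
        free_pkt(p);
    return rc;
}

/* Correct caller: ownership passed to xmit_consume(), nothing left to free. */
static int send_correct(struct pkt *p, int link_down)
{
    return xmit_consume(p, link_down);
}
```

In this model the buggy caller leaves `free_count` at 2 after an -ENETDOWN return, which is the double free the reply warns about.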
Re: softmac mtu
From: Larry Finger [EMAIL PROTECTED] Date: Sat, 23 Sep 2006 16:59:48 -0500 Where/how does the device allow it? When I tried 'ifconfig eth0 mtu 2000' on my VIA Technologies, Inc. VT6102 [Rhine-II] wired controller, I got a 'SIOCSIFMTU: Invalid argument' message, which is the same message I get on my BCM4306 wireless card.
It allows it in the device specific ->change_mtu() method. Tigon3, for example, overrides this with its own function called tg3_change_mtu() which checks if the particular model of the chip supports jumbo MTU and if so allows such a setting. The VIA driver simply doesn't override that function, and uses the default ethernet one because either that ethernet chip doesn't support the larger MTU or the author simply hasn't gotten around to implementing the override.
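The per-device override described above can be sketched in userspace C. This is a model and not kernel code: `struct net_device_model`, `default_change_mtu`, `jumbo_change_mtu`, the `jumbo_capable` flag, and the 68 and 9000 byte bounds are assumptions made for illustration. Only the idea of a default Ethernet check that a driver may replace comes from the thread.

```c
#include <errno.h>

#define ETH_DATA_LEN_MODEL 1500  /* base Ethernet MTU limit */
#define MIN_MTU_ASSUMED    68    /* assumption: lower bound of the default check */
#define JUMBO_MTU_ASSUMED  9000  /* assumption: jumbo limit of the example device */

struct net_device_model {
    int mtu;
    int jumbo_capable;  /* hypothetical flag: chip supports jumbo frames */
    int (*change_mtu)(struct net_device_model *dev, int new_mtu);
};

/* Default Ethernet behaviour: refuse anything above the base limit. */
static int default_change_mtu(struct net_device_model *dev, int new_mtu)
{
    if (new_mtu < MIN_MTU_ASSUMED || new_mtu > ETH_DATA_LEN_MODEL)
        return -EINVAL;
    dev->mtu = new_mtu;
    return 0;
}

/* Driver override: allow a larger MTU only when the chip supports it. */
static int jumbo_change_mtu(struct net_device_model *dev, int new_mtu)
{
    int max = dev->jumbo_capable ? JUMBO_MTU_ASSUMED : ETH_DATA_LEN_MODEL;

    if (new_mtu < MIN_MTU_ASSUMED || new_mtu > max)
        return -EINVAL;
    dev->mtu = new_mtu;
    return 0;
}
```

A device that keeps the default hook rejects an MTU of 2000 with -EINVAL, which corresponds to the 'Invalid argument' message reported above, while a device with the override and a capable chip accepts 9000.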
Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver
From: jamal [EMAIL PROTECTED] Date: Sat, 23 Sep 2006 08:38:37 -0400
You just need to use GRE tunnel instead of what you describe above.
While I feel bad that Joerg (and Lennert and others before) have put in the effort to do the work, I too question the need for this driver. I don't think even the authors of the original RFC feel this provides anything that GRE can't (according to some posting on netdev that one of the authors made). My understanding is also that the only other OS that implemented this got it wrong - hence you will have to interop with them and provide quirk checks. I am actually curious if anyone uses it instead of GRE in OpenBSD? You could argue that including this driver would allow Linux to have another bulb in the christmas tree; the other (more pragmatic) way to look at this is that it allows spreading a bad idea and needs to be censored. I prefer the latter - and hope this doesn't discourage Joerg from contributing in the future.
First, the only mentioned real use of EtherIP I've seen anywhere is to tunnel old LAN based games that used protocols other than IP :-) Second, the OpenBSD interoperability issues are very real, and there is even a Xerox implementation that used an 8-bit instead of a 16-bit header size. Third, even the introductory material in RFC3378 mentions that people are strongly encouraged to use other technologies over EtherIP. Fourth, and finally, if GRE can provide the same functionality then that plus the first three points makes EtherIP something we really should not latch onto. And if it doesn't go in, it's not the end of the world. Anyone can maintain and use the external patch, and if usage gets widespread enough we'll of course be required to reevaluate integration. So I think we should pass on this for now.
Re: [PATCH] [ATM] he: Fix __init/__devinit conflict
he_init_one() is declared __devinit, but calls lots of init functions that are marked __init. However, if CONFIG_HOTPLUG is enabled, __devinit functions go into normal .text, which leads to

WARNING: drivers/atm/he.o - Section mismatch: reference to .init.text: from .text between 'he_start' (at offset 0x2130) and 'he_service_tbrq'

Fix this by changing the __init functions to __devinit.

Signed-off-by: Roland Dreier [EMAIL PROTECTED]
---
Dave, this was acked by Chas (and he even requested it go into 2.6.18) but it seems to have gotten dropped somewhere -- it's not in Linus's tree even after he pulled your net tree. Please apply.

diff --git a/drivers/atm/he.c b/drivers/atm/he.c
index d369130..9e0383d 100644
--- a/drivers/atm/he.c
+++ b/drivers/atm/he.c
@@ -454,7 +454,7 @@ #define NONZERO (1 << 14)
 	return (NONZERO | (exp << 9) | (rate & 0x1ff));
 }
 
-static void __init
+static void __devinit
 he_init_rx_lbfp0(struct he_dev *he_dev)
 {
 	unsigned i, lbm_offset, lbufd_index, lbuf_addr, lbuf_count;
@@ -485,7 +485,7 @@ he_init_rx_lbfp0(struct he_dev *he_dev)
 	he_writel(he_dev, he_dev->r0_numbuffs, RLBF0_C);
 }
 
-static void __init
+static void __devinit
 he_init_rx_lbfp1(struct he_dev *he_dev)
 {
 	unsigned i, lbm_offset, lbufd_index, lbuf_addr, lbuf_count;
@@ -516,7 +516,7 @@ he_init_rx_lbfp1(struct he_dev *he_dev)
 	he_writel(he_dev, he_dev->r1_numbuffs, RLBF1_C);
 }
 
-static void __init
+static void __devinit
 he_init_tx_lbfp(struct he_dev *he_dev)
 {
 	unsigned i, lbm_offset, lbufd_index, lbuf_addr, lbuf_count;
@@ -546,7 +546,7 @@ he_init_tx_lbfp(struct he_dev *he_dev)
 	he_writel(he_dev, lbufd_index - 1, TLBF_T);
 }
 
-static int __init
+static int __devinit
 he_init_tpdrq(struct he_dev *he_dev)
 {
 	he_dev->tpdrq_base = pci_alloc_consistent(he_dev->pci_dev,
@@ -568,7 +568,7 @@ he_init_tpdrq(struct he_dev *he_dev)
 	return 0;
 }
 
-static void __init
+static void __devinit
 he_init_cs_block(struct he_dev *he_dev)
 {
 	unsigned clock, rate, delta;
@@ -664,7 +664,7 @@ he_init_cs_block(struct he_dev *he_dev)
 
 }
 
-static int __init
+static int __devinit
 he_init_cs_block_rcm(struct he_dev *he_dev)
 {
 	unsigned (*rategrid)[16][16];
@@ -785,7 +785,7 @@ #define RTGTBL_OFFSET 0x400
 	return 0;
 }
 
-static int __init
+static int __devinit
 he_init_group(struct he_dev *he_dev, int group)
 {
 	int i;
@@ -955,7 +955,7 @@ #endif
 	return 0;
 }
 
-static int __init
+static int __devinit
 he_init_irq(struct he_dev *he_dev)
 {
 	int i;
bcm43xx driver unstable behaviour (and linux wireless is junk btw)
Hi folks ! So this is 2.6.18 + Larry's fix (though I've seen this problem before, it seems using WPA just makes it happen more often). This is also a 4318, so the link is pretty weak due to the Tx Power problem and I suspect it makes the driver problems more visible... So basically, I lose the link every few minutes for a minute or so. I suspect it's related to wpa_supplicant vs. the ack losses due to the 4318 Tx Power problems. That alone would be ok though, if the driver wasn't totally stuck after a while. (Similar problem to after sleep/wakeup, looks like nothing goes through.) When it goes bunk, it looks like this in the logs:
Sep 24 12:24:18 localhost kernel: [ 285.686826] SoftMAC: Sent Authentication Request to 00:0f:66:52:4b:60.
Sep 24 12:24:18 localhost kernel: [ 285.686976] SoftMAC: generic IE set to
Sep 24 12:24:18 localhost kernel: [ 285.686999] SoftMAC: Already associating or associated to 00:0f:66:52:4b:60
Sep 24 12:24:28 localhost kernel: [ 295.687229] SoftMAC: Start scanning with channel: 1
Sep 24 12:24:28 localhost kernel: [ 295.687240] SoftMAC: Scanning 14 channels
Sep 24 12:24:29 localhost kernel: [ 296.027053] SoftMAC: Scanning finished
Sep 24 12:24:29 localhost kernel: [ 296.035267] SoftMAC: generic IE set to
Sep 24 12:24:29 localhost kernel: [ 296.035310] SoftMAC: Already associating or associated to 00:0f:66:52:4b:60
Sep 24 12:24:31 localhost kernel: [ 297.690969] SoftMAC: Sent Authentication Request to 00:0f:66:52:4b:60.
Sep 24 12:24:39 localhost kernel: [ 306.039210] SoftMAC: Start scanning with channel: 1
Sep 24 12:24:39 localhost kernel: [ 306.039222] SoftMAC: Scanning 14 channels
Sep 24 12:24:39 localhost kernel: [ 306.375046] SoftMAC: Scanning finished
Sep 24 12:24:39 localhost kernel: [ 306.383018] SoftMAC: generic IE set to
Sep 24 12:24:39 localhost kernel: [ 306.383075] SoftMAC: Already associating or associated to 00:0f:66:52:4b:60
Sep 24 12:24:42 localhost kernel: [ 309.695021] SoftMAC: Sent Authentication Request to 00:0f:66:52:4b:60.
Sep 24 12:24:49 localhost kernel: [ 316.387211] SoftMAC: Start scanning with channel: 1
etc...
Then, if you rmmod, you get back a prompt, and about a second later, the kernel blows up. At this point, I've always been in X and it's too dead to dump anything into the disk logs so I don't know what the precise crash is, but it looks to me like the driver is not properly removing some timer or something there. Note that it also goes bunk on sleep/wakeup, and sometimes ifdown/ifup... in general, it's fragile and just 'loses it', in which case the only way to get it back is to rmmod/insmod. Doesn't help me to have my prism54 not working with WPA (apparently, the driver looks like it handles hostap ioctls but it doesn't agree on the ioctl numbers, among others, with whatever wpa_supplicant sends when configured to wpa mode... does somebody know if that driver is maintained ?) So at this point I have a choice between two wireless devices that don't work (and neither of them is less than a couple of years old). Looks like the linux wireless situation isn't getting any better since last KS. Oh and I don't care about the "it works in the dscape stack" sort of crap I regularly get. I want something that works with upstream kernels. That isn't that much to ask... or is it ? Ben, back to ethernet cables.
Re: bcm43xx driver unstable behaviour (and linux wireless is junk btw)
Benjamin Herrenschmidt wrote: Oh and I don't care about the "it works in the dscape stack" sort of crap I regularly get. I want something that works with upstream kernels. That isn't that much to ask... or is it ?
wpa_supplicant triggers races in softmac relatively easily, which are hard to fix properly. At least for me, motivation to work on this stuff is low given the potentially impending merge of devicescape, and every time I do spend some time investigating I just get even more frustrated at how difficult WE is to implement *properly* for non-hardmac drivers. We really have a need for a configuration system designed around 802.11. I agree, the stuff in mainline should be fixed, but at least personally I am finding it harder and harder to justify working on softmac. Daniel
Re: bcm43xx driver unstable behaviour (and linux wireless is junk btw)
On Sun, 2006-09-24 at 12:43 +1000, Benjamin Herrenschmidt wrote:
Hi folks ! So this is 2.6.18 + Larry's fix (though I've seen this problem before, it seems using WPA just makes it happen more often). This is also a 4318, so the link is pretty weak due to the Tx Power problem and I suspect it makes the driver problems more visible... So basically, I lose the link every few minutes for a minute or so. I suspect it's related to wpa_supplicant vs. the ack losses due to the 4318 Tx Power problems. That alone would be ok though, if the driver wasn't totally stuck after a while. (Similar problem to after sleep/wakeup, looks like nothing goes through.) When it goes bunk, it looks like this in the logs:
Sep 24 12:24:18 localhost kernel: [ 285.686826] SoftMAC: Sent Authentication Request to 00:0f:66:52:4b:60.
Sep 24 12:24:18 localhost kernel: [ 285.686976] SoftMAC: generic IE set to
Sep 24 12:24:18 localhost kernel: [ 285.686999] SoftMAC: Already associating or associated to 00:0f:66:52:4b:60
Sep 24 12:24:28 localhost kernel: [ 295.687229] SoftMAC: Start scanning with channel: 1
Sep 24 12:24:28 localhost kernel: [ 295.687240] SoftMAC: Scanning 14 channels
Sep 24 12:24:29 localhost kernel: [ 296.027053] SoftMAC: Scanning finished
Sep 24 12:24:29 localhost kernel: [ 296.035267] SoftMAC: generic IE set to
Sep 24 12:24:29 localhost kernel: [ 296.035310] SoftMAC: Already associating or associated to 00:0f:66:52:4b:60
Sep 24 12:24:31 localhost kernel: [ 297.690969] SoftMAC: Sent Authentication Request to 00:0f:66:52:4b:60.
Sep 24 12:24:39 localhost kernel: [ 306.039210] SoftMAC: Start scanning with channel: 1
Sep 24 12:24:39 localhost kernel: [ 306.039222] SoftMAC: Scanning 14 channels
Sep 24 12:24:39 localhost kernel: [ 306.375046] SoftMAC: Scanning finished
Sep 24 12:24:39 localhost kernel: [ 306.383018] SoftMAC: generic IE set to
Sep 24 12:24:39 localhost kernel: [ 306.383075] SoftMAC: Already associating or associated to 00:0f:66:52:4b:60
Sep 24 12:24:42 localhost kernel: [ 309.695021] SoftMAC: Sent Authentication Request to 00:0f:66:52:4b:60.
Sep 24 12:24:49 localhost kernel: [ 316.387211] SoftMAC: Start scanning with channel: 1
etc...
Then, if you rmmod, you get back a prompt, and about a second later, the kernel blows up. At this point, I've always been in X and it's too dead to dump anything into the disk logs so I don't know what the precise crash is, but it looks to me like the driver is not properly removing some timer or something there. Note that it also goes bunk on sleep/wakeup, and sometimes ifdown/ifup... in general, it's fragile and just 'loses it', in which case the only way to get it back is to rmmod/insmod. Doesn't help me to have my prism54 not working with WPA (apparently, the driver looks like it handles hostap ioctls but it doesn't agree on the ioctl numbers, among others, with whatever wpa_supplicant sends when configured to wpa mode... does somebody know if that driver is maintained ?)
prism54 fullmac, right? Try using -Dwext; the prism54 wpa_supplicant driver is a dead-end and I added WE-19 commands to it a bit ago anyway. Oddly enough, I couldn't seem to get the driver to work reliably for me using straight WEP either, let alone WPA. It's pretty unmaintained at the moment.
Dan
So at this point I have a choice between two wireless devices that don't work (and neither of them is less than a couple of years old). Looks like the linux wireless situation isn't getting any better since last KS. Oh and I don't care about the "it works in the dscape stack" sort of crap I regularly get. I want something that works with upstream kernels. That isn't that much to ask... or is it ? Ben, back to ethernet cables.
Re: [PATCH 02/03] net/bridge: add support for EtherIP devices
On Sat, 23 Sep 2006 14:16:29 +0200 Joerg Roedel [EMAIL PROTECTED] wrote: This patch changes the device check in the bridge code to allow EtherIP devices to be added. Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
If the device looks like a duck (Ethernet), then why does it need a separate ARP type? There are other tools that might work without modification if it just fully pretended to be an ether device.
Re: [e2e] performance of BIC-TCP, High-Speed-TCP, H-TCP etc
On Fri, 22 Sep 2006 22:43:22 -0400 Injong Rhee [EMAIL PROTECTED] wrote: This is a resend with fixed web links. The links were broken in my previous email -- sorry about multiple transmissions. - Hi Doug, Thanks for sharing your paper. Also congratulations to the acceptance of your journal paper to TONs. But I am wondering what's new in this paper. At first glance, I did not find many new things that are different from your previously publicized reports. How much is this different from the ones you put out in this mail list a year or two ago and also the one publicized in PFLDnet February this year http://www.hpcc.jp/pfldnet2006/? In that same workshop, we also presented our experimental results that shows significant discrepancy from yours but i am not sure why you forgot to reference our experimental work presented in that same PFLDnet. Here is a link to a more detailed version of that report accepted to COMNET http://netsrv.csc.ncsu.edu/highspeed/comnet-asteppaper.pdf The main point of contention [that we talked about in that PFLDnet workshop] is the presence of background traffic and the method to add them. Your report mostly ignores the effect of background traffic. Some texts in this paper state that you added some web traffic (10%), but the paper shows only the results from NO background traffic scenarios. But our results differ from yours in many aspects. Below are the links to our results (the links to them have been available in our BIC web site for a long time and also mentioned in our PFLDnet paper; this result is with the patch that corrects HTCP bugs). 
[Convergence and intra protocol fairness]
without background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/intra_protocol/intra_protocol.htm
with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/intra_protocol/intra_protocol.htm
[RTT fairness]
w/o background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/rtt_fairness/rtt_fairness.htm
with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/rtt_fairness/rtt_fairness.htm
[TCP friendliness]
without background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/tcp_friendliness/tcp_friendliness.htm
with background traffic: http://netsrv.csc.ncsu.edu/highspeed/1200/bk/tcp_friendliness/tcp_friendliness.htm
After our discussion in that PFLDnet, I puzzled why we get different results. My guess is that the main difference between your experiment and ours is the inclusion of mid-sized flows with various RTTs -- our experience tells that the RTT variations of mid size flows play a very important role in creating significant dynamics in testing environments. The same point about the importance of mid size flows with RTT variations has been raised in several occasions by Sally Floyd as well, including in this year's E2E research group meeting. You can find some reference to the importance of RTT variations in her paper too [ http://www.icir.org/models/hotnetsFinal.pdf]. Just having web-traffic (all with the same RTTs) does not create a realistic environment as it does not do anything about RTTs and also flow sizes tend to be highly skewed with the Pareto distribution-- but I don't know exactly how you create your testing environment with web-traffic -- I can only guess from the description you have about the web traffic in your paper.
Another puzzle is that even with no background traffic, we also get different results from yours, especially with FAST: with no background traffic, FAST seems to work fairly well, with good RTT fairness, in our experiment. But your results show that FAST has huge RTT unfairness. That is very strange. Is that because we have different bandwidths and buffer sizes in the setup? I think we need to compare notes more. Also, in the journal paper of FAST experimental results [ http://netlab.caltech.edu/publications/FAST-ToN-final-060209-2007.pdf ], FAST seems to work very well with no background traffic. We will verify our results again in exactly the same environment as you have in your report, to make sure we can reproduce your results, but here are some samples of our results for FAST.

http://netsrv.csc.ncsu.edu/highspeed/1200/nobk/rtt_fairness/1200--2.4_FAST-2.4_FAST-NONE--400-3-1333--1000-76-3-0-0-0-5-500--20-0.6-1000-10-1200-64000-150--1/

In this experiment, the FAST flows are just perfect. The same result is confirmed in the FAST journal paper [ http://netlab.caltech.edu/publications/FAST-ToN-final-060209-2007.pdf -- please look at Sections IV.B and C ]. But your results show really bad RTT fairness.

Best regards,
Injong

Since a lot of the discussion seems to be about emulated environments, has anyone run tests with the current crop of TCP variants over a real high BDP
Re: Is TCP over IPsec broken in 2.6.18?
On Sat, 23 Sep 2006, Evgeniy Polyakov wrote:

I never saw unencrypted packets before.

It's normal and expected; perhaps you didn't notice or had tcpdump filtering them.

17:45:11.102212 IP 192.168.4.78 > 192.168.4.79: ESP(spi=0x01f452be,seq=0x3), length 84
17:45:12.098146 IP 192.168.4.79.isakmp > 192.168.4.78.isakmp: isakmp: phase 2/others ? oakley-quick[E]
17:45:12.098427 IP 192.168.4.78.isakmp > 192.168.4.79.isakmp: isakmp: phase 2/others ? inf

And why are racoon packets here at this stage?

Can you try this with either a fully manual config (setkey only) or openswan?

I use racoon; maybe there are some problems with its version. I will try a new one after the weekend.

I just verified that racoon is working with current kernels. Racoon can be troublesome. I'm using racoon from ipsec-tools-0.6.5-3.1.

You didn't specify a lifetime in your phase 1 spec ('remote anonymous') section. Not sure what happens in that case; it could have something to do with it.

- James
--
James Morris [EMAIL PROTECTED]
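
For reference, a phase 1 lifetime in racoon.conf is set inside the 'remote' section. The fragment below is only a sketch of what such a section might look like: the lifetime value, exchange mode and proposal algorithms are assumptions chosen for illustration, not taken from the configuration discussed in this thread.

```
# Hypothetical phase 1 section; all values are illustrative assumptions.
remote anonymous
{
        exchange_mode main;

        # Phase 1 (ISAKMP SA) lifetime proposed to the peer.
        lifetime time 24 hour;

        proposal {
                encryption_algorithm 3des;
                hash_algorithm sha1;
                authentication_method pre_shared_key;
                dh_group 2;
        }
}
```

Whether a missing lifetime actually explains the behaviour reported above is not established here; adding one explicitly is simply a cheap way to rule it out.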