Re: Please review: lazy ext refcount initialization

2013-09-09 Thread Andre Oppermann
On 30.08.2013 00:58, Navdeep Parhar wrote: I'd like to merge r254342 from user/np/cxl_tuning to head if there are no objections. I don't object in principle, though I'm wondering whether we should have a more generic way of passing this kind of flag to the allocator? We probably get more demands

Re: mbuf autotuning effect

2013-09-08 Thread Andre Oppermann
On 07.09.2013 21:56, Ian Lepore wrote: On Sat, 2013-09-07 at 12:21 -0700, hiren panchasara wrote: On Sep 6, 2013 8:26 PM, Warner Losh i...@bsdimp.com wrote: On Sep 6, 2013, at 7:11 PM, Adrian Chadd wrote: Yeah, why is VM_KMEM_SIZE only 12mbyte for MIPS? That's a little low for a platform

Re: Flow ID, LACP, and igb

2013-08-29 Thread Andre Oppermann
On 29.08.2013 01:42, Alan Somers wrote: On Mon, Aug 26, 2013 at 2:40 PM, Andre Oppermann an...@freebsd.org wrote: On 26.08.2013 19:18, Justin T. Gibbs wrote: Hi Net, I'm an infrequent traveler through the networking code and would appreciate some feedback on some proposed solutions

Re: Network stack changes

2013-08-28 Thread Andre Oppermann
On 28.08.2013 20:30, Alexander V. Chernikov wrote: Hello list! Hello Alexander, you sent quite a few things in the same email. I'll try to respond as much as I can right now. Later you should split it up to have more in-depth discussions on the individual parts. If you could make it to the

Re: Flow ID, LACP, and igb

2013-08-27 Thread Andre Oppermann
On 27.08.2013 01:30, Adrian Chadd wrote: ... is there any reason we wouldn't want to have the TX and RX for a given flow mapped to the same core? They are. Thing is the inbound and outbound packet flow id's are totally independent from each other. The inbound one determines the RX ring it

Re: Please review: LRO entry last-active timestamp.

2013-08-26 Thread Andre Oppermann
On 21.08.2013 21:11, Navdeep Parhar wrote: I'd like to add a last-active timestamp to the structure that tracks the LRO state in a NIC's rx handler. This is r254336 in user/np/cxl_tuning that will be merged to head if there are no objections. No objections. This is a good thing. The last time

Re: route/arp lifetime (Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux))

2013-08-26 Thread Andre Oppermann
On 25.08.2013 16:42, Adrian Chadd wrote: On 24 August 2013 10:09, Andre Oppermann an...@freebsd.org wrote: On 24.08.2013 19:04, Adrian Chadd wrote: I'm very close to starting an mbuf batching thing to use in a few places like receive, transmit

Re: route/arp lifetime (Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux))

2013-08-26 Thread Andre Oppermann
On 26.08.2013 16:46, Luigi Rizzo wrote: On Mon, Aug 26, 2013 at 04:27:59PM +0200, Andre Oppermann wrote: ... 1. lle lock to rmlock. 2. if_addr and IN_ADDR locks to rmlocks. 3. routing table locking (rmlocks, and by doing away with rtentry locks and refcounting through copy-out

Re: Flow ID, LACP, and igb

2013-08-26 Thread Andre Oppermann
On 26.08.2013 19:18, Justin T. Gibbs wrote: Hi Net, I'm an infrequent traveler through the networking code and would appreciate some feedback on some proposed solutions to issues Spectra has seen with outbound LACP traffic. lacp_select_tx_port() uses the flow ID if it is available in the

Re: route/arp lifetime (Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux))

2013-08-24 Thread Andre Oppermann
On 24.08.2013 19:04, Adrian Chadd wrote: I'm very close to starting an mbuf batching thing to use in a few places like receive, transmit and transmit completion - free path. I'd be interested in your review/feedback and testing as it sounds like something you can easily stress test there. :)

Re: Netmap ixgbe stripping Vlan tags

2013-08-23 Thread Andre Oppermann
On 23.08.2013 00:36, Harika Tandra wrote: Hi all, I am running Netmap with intel 10G 82598EB card in promiscuous mode. While capturing packets via Netmap the driver is stripping off Vlan tags. I tested my setup, I am able to see Vlan tags when the same card is in promiscuous mode without

Re: Netmap ixgbe stripping Vlan tags

2013-08-23 Thread Andre Oppermann
On 23.08.2013 09:13, Juli Mallett wrote: On Fri, Aug 23, 2013 at 12:02 AM, Andre Oppermann an...@freebsd.org wrote: On 23.08.2013 00:36, Harika Tandra wrote: Hi all, I am running Netmap with intel 10G 82598EB card in promiscuous mode

Re: Netmap ixgbe stripping Vlan tags

2013-08-23 Thread Andre Oppermann
On 23.08.2013 15:12, Harika Tandra wrote: Hi all, I agree with Andre's statement: A netmap consumer typically doesn't expect packets to be mangled at all; most likely netmap is expressly used to get the packets exactly as they were seen on the wire. For my application I want to see the whole

Further mbuf adjustments and changes

2013-08-21 Thread Andre Oppermann
I want to put these mbuf changes/updates/adjustments up for objections, if any, before committing them. This is a moderate overhaul of the mbuf headers and fields to take us into the next 5 years and two releases. The mbuf headers, in particular the pkthdr, have seen a number of new uses and

Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann
On 20.08.2013 05:13, Julian Elischer wrote: On 8/20/13 6:38 AM, Peter Grehan wrote: Hi Andre, (moving to the more appropriate freebsd-net) I'm sorry for ambushing but this stuff has to be done. I have provided an alternative way of handling it and I'm happy to help you with your use case

Re: CFR: FIB handling improvements

2013-08-21 Thread Andre Oppermann
On 21.08.2013 17:42, Will Andrews wrote: Hi, I'm working to port forward to FreeBSD/head, improvements made to FIB handling by my colleagues Alan Somers and Justin Gibbs. Please review: http://people.freebsd.org/~will/fix-fib-issues.1.diff This patch includes fixes for several issues relating

Re: TSO and FreeBSD vs Linux

2013-08-21 Thread Andre Oppermann
On 13.08.2013 19:29, Julian Elischer wrote: I have been tracking down a performance embarrassment on AMAZON EC2 and have found it I think. Our OS cousins over at Linux land have implemented some interesting behaviour when TSO is in use. There used to be a different problem with EC2 and

Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann
On 21.08.2013 18:38, Navdeep Parhar wrote: On 08/21/13 08:08, Andre Oppermann wrote: On 20.08.2013 00:38, Peter Grehan wrote: snip If there's an alternative to M_NOFREE, I'd be more than happy to use that. Set up your own (*ext_free) function and omit freeing of the mbuf itself. Make

Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)

2013-08-21 Thread Andre Oppermann
On 18.08.2013 23:54, Adrian Chadd wrote: Hi, I think the UNIX architecture is a bit broken for anything other than the occasional (for various traffic levels defining occasional!) traffic connection. It's serving us well purely through the sheer force of will of modern CPU power but I think we

Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)

2013-08-21 Thread Andre Oppermann
On 14.08.2013 12:21, Luigi Rizzo wrote: On Wed, Aug 14, 2013 at 05:23:02PM +1000, Lawrence Stewart wrote: I think (check the driver code in question as I'm not sure) that if you ifconfig if lro and the driver has hardware support or has been made aware of our software implementation, it should

Re: TSO and FreeBSD vs Linux

2013-08-21 Thread Andre Oppermann
On 15.08.2013 01:27, Kevin Oberman wrote: On Wed, Aug 14, 2013 at 12:46 PM, Julian Elischer jul...@freebsd.org wrote: On 8/14/13 3:23 PM, Lawrence Stewart wrote: On 08/14/13 16:33, Julian Elischer wrote: They switched to using an initial window of 10 segments some time ago. FreeBSD starts

Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann
On 21.08.2013 20:23, Navdeep Parhar wrote: I believe we need an extra patch to get M_NOFREE correct. I've had it forever in some of my internal repos but never committed it upstream (just plain forgot). Since this stuff is fresh in your mind, can you review this: diff -r cd78031b7885

Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann
On 21.08.2013 21:40, Navdeep Parhar wrote: On 08/21/13 12:22, Andre Oppermann wrote: On 21.08.2013 20:23, Navdeep Parhar wrote: I believe we need an extra patch to get M_NOFREE correct. I've had it forever in some of my internal repos but never committed it upstream (just plain forgot

Re: route/arp lifetime (Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux))

2013-08-21 Thread Andre Oppermann
On 19.08.2013 13:42, Alexander V. Chernikov wrote: On 14.08.2013 19:48, Luigi Rizzo wrote: On Wed, Aug 14, 2013 at 05:40:28PM +0200, Marko Zec wrote: On Wednesday 14 August 2013 14:40:24 Luigi Rizzo wrote: On Wed, Aug 14, 2013 at 04:15:25PM +0400, Alexander V. Chernikov wrote: ... FWIW,

Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann
On 21.08.2013 22:52, Navdeep Parhar wrote: On 08/21/13 13:44, Andre Oppermann wrote: On 21.08.2013 21:40, Navdeep Parhar wrote: On 08/21/13 12:22, Andre Oppermann wrote: On 21.08.2013 20:23, Navdeep Parhar wrote: I believe we need an extra patch to get M_NOFREE correct. I've had it forever

Re: TCP Initial Window 10 MFC

2013-08-14 Thread Andre Oppermann
On 14.08.2013 04:36, Lawrence Stewart wrote: Hi Andre, [RE team is BCCed so they're aware of this discussion] On 07/06/13 00:58, Andre Oppermann wrote: Author: andre Date: Fri Jul 5 14:58:24 2013 New Revision: 252789 URL: http://svnweb.freebsd.org/changeset/base/252789 Log: MFC r242266

Re: [net] protecting interfaces from races between control and data ?

2013-08-07 Thread Andre Oppermann
On 07.08.2013 09:18, Luigi Rizzo wrote: On Wed, Aug 7, 2013 at 5:26 AM, Mike Karels m...@karels.net wrote: Jumping to (near) the end of the thread, I like most of Andre's proposal. Running with minimal locks at this layer is an admirable goal, and I agree with

Re: [net] protecting interfaces from races between control and data ?

2013-08-07 Thread Andre Oppermann
On 07.08.2013 22:48, Adrian Chadd wrote: On 7 August 2013 13:08, Scott Long scott4l...@yahoo.com wrote: An even more relevant difference is that taskqueues have a much stronger management API. Ithreads can only be scheduled by generating a hardware interrupt, can only be drained by calling

Re: [net] protecting interfaces from races between control and data ?

2013-08-06 Thread Andre Oppermann
On 05.08.2013 23:53, Luigi Rizzo wrote: On Mon, Aug 05, 2013 at 11:04:44PM +0200, Andre Oppermann wrote: On 05.08.2013 19:36, Luigi Rizzo wrote: ... [picking a post at random to reply in this thread] tell whether or not we should bail out). Ideally we don't want to have any locks

Re: [net] protecting interfaces from races between control and data ?

2013-08-05 Thread Andre Oppermann
On 05.08.2013 16:59, Bryan Venteicher wrote: - Original Message - i am slightly unclear of what mechanisms we use to prevent races between interface being reconfigured (up/down/multicast setting, etc, all causing reinitialization of the rx and tx rings) and i) packets from the host

Re: [net] protecting interfaces from races between control and data ?

2013-08-05 Thread Andre Oppermann
On 05.08.2013 19:36, Luigi Rizzo wrote: On Mon, Aug 5, 2013 at 7:17 PM, Adrian Chadd adr...@freebsd.org wrote: I'm travelling back to San Jose today; poke me tomorrow and I'll brain dump what I did in ath(4) and the lessons learnt. The TL;DR version - you don't want to grab an extra lock in

Re: Recommendations for 10gbps NIC

2013-07-27 Thread Andre Oppermann
On 27.07.2013 10:42, Alexander V. Chernikov wrote: On 27.07.2013 12:15, Luigi Rizzo wrote: On Sat, Jul 27, 2013 at 10:02 AM, Alexander V. Chernikov melif...@freebsd.org wrote: This makes me curious because i believe people have used netmap with the 82598 and achieved close to line rate even

Re: A huge amount of sonewconn: pcb 0xfffffe0053916dc8: Listen queue overflow: 193 already in queue awaiting acceptance in logs recently (9-STABLE)

2013-07-25 Thread Andre Oppermann
On 25.07.2013 12:46, Lev Serebryakov wrote: Hello, Freebsd-net. I have 9.1-STABLE r253105 system, which started to flood logs with sonewconn: pcb 0xfe0053916dc8: Listen queue overflow: 193 already in queue awaiting acceptance messages (there are thousands of them, if you take last message

Re: Improved SYN Cookies: Looking for testers

2013-07-16 Thread Andre Oppermann
On 16.07.2013 13:32, Loganaden Velvindron wrote: On Thu, Jul 11, 2013 at 10:36:22AM +0200, Andre Oppermann wrote: On 10.07.2013 15:18, Fabian Keil wrote: Andre Oppermann an...@freebsd.org wrote: We have a SYN cookie implementation for quite some time now but it has some limitations

Re: Listen queue overflow: N already in queue awaiting acceptance

2013-07-12 Thread Andre Oppermann
On 12.07.2013 10:25, Gleb Smirnoff wrote: On Thu, Jul 11, 2013 at 05:43:09PM +0200, Andre Oppermann wrote: Andriy for example would never have found out about this problem other than receiving vague user complaints about aborted connection attempts. Maybe after spending many hours

Re: Listen queue overflow: N already in queue awaiting acceptance

2013-07-11 Thread Andre Oppermann
On 11.07.2013 09:05, Andriy Gapon wrote: kernel: sonewconn: pcb 0xfe0047db3930: Listen queue overflow: 193 already in queue awaiting acceptance last message repeated 113 times last message repeated 518 times last message repeated 2413 times last message repeated 2041 times last message

Re: Improved SYN Cookies: Looking for testers

2013-07-11 Thread Andre Oppermann
On 10.07.2013 15:18, Fabian Keil wrote: Andre Oppermann an...@freebsd.org wrote: We have had a SYN cookie implementation for quite some time now but it has some limitations with current realities for window scaling and SACK encoding in the few available bits. This patch updates and improves

Re: Listen queue overflow: N already in queue awaiting acceptance

2013-07-11 Thread Andre Oppermann
On 11.07.2013 15:35, Gleb Smirnoff wrote: On Thu, Jul 11, 2013 at 09:19:40AM +0200, Andre Oppermann wrote: On 11.07.2013 09:05, Andriy Gapon wrote: kernel: sonewconn: pcb 0xfe0047db3930: Listen queue overflow: 193 already in queue awaiting acceptance last message repeated 113

Re: Listen queue overflow: N already in queue awaiting acceptance

2013-07-11 Thread Andre Oppermann
On 11.07.2013 17:04, Andriy Gapon wrote: on 11/07/2013 17:28 Andre Oppermann said the following: Andriy for example would never have found out about this problem other than receiving vague user complaints about aborted connection attempts. Maybe after spending many hours searching for the cause

Improved SYN Cookies: Looking for testers

2013-07-08 Thread Andre Oppermann
We have had a SYN cookie implementation for quite some time now but it has some limitations with current realities for window scaling and SACK encoding in the few available bits. This patch updates and improves SYN cookies mainly by: a) encoding of MSS, WSCALE (window scaling) and SACK into

Re: hw.igb.num_queues default

2013-06-20 Thread Andre Oppermann
On 20.06.2013 15:37, Eugene Grosbein wrote: On 20.06.2013 17:34, Eggert, Lars wrote: real memory = 8589934592 (8192 MB) avail memory = 8239513600 (7857 MB) By default, the igb driver seems to set up one queue per detected CPU. Googling around, people seemed to suggest that limiting the

Re: [PATH] ALTQ(9) codel algorithm implementation

2013-06-14 Thread Andre Oppermann
On 14.06.2013 11:51, Gleb Smirnoff wrote: Ermal, On Mon, Jun 10, 2013 at 03:43:12PM +0200, Ermal Luçi wrote: at location [1] can be found a patch for Codel[3] algorithm implementation. Triggered by a mail to the mailing lists[2] of OpenBSD I completed the implementation for FreeBSD.

Re: RFC: removing redundant checks in ether_input_internal()

2013-05-22 Thread Andre Oppermann
On 22.05.2013 14:58, Luigi Rizzo wrote: if_ethersubr.c :: ether_input_internal() is only called as follows: static void ether_nh_input(struct mbuf *m) { ether_input_internal(m->m_pkthdr.rcvif, m); } hence the following checks in the body are unnecessary:

Re: netmap bridge can tranmit big packet in line rate ?

2013-05-21 Thread Andre Oppermann
On 21.05.2013 16:21, Hooman Fazaeli wrote: As Barney pointed out already, your numbers are reasonable. You have almost saturated the link with 1514 byte packets. In the case of 64 byte packets, you do not achieve line rate probably because of the congestion on the bus. Can you show us top -SI

Re: Siftr inflight byte question

2013-05-07 Thread Andre Oppermann
On 06.05.2013 12:37, Lawrence Stewart wrote: [ccing freebsd-net@ so my problem description enters the collective subconscious in case I forget about this again] For everyone tuning in, Aris asked me the apt question of why siftr(4)'s # inflight bytes field doesn't take into account sacked

Re: Calculation of inflight data

2013-05-04 Thread Andre Oppermann
On 03.05.2013 09:28, Aris Angelo wrote: Hi, I am trying to implement an extension to the FreeBSD TCP stack. In order to do that, I have a question regarding the calculation of the pipe variable, the amount of data that the sender calculates as being inflight. I am puzzled for the case when no

Re: Is there any way to limit the amount of data in an mbuf chain submitted to a driver?

2013-05-04 Thread Andre Oppermann
On 04.05.2013 22:47, Jack Vogel wrote: Yes, I checked: #define IXGBE_TSO_SIZE 262140 So, the driver is not limiting you to 64K assuming you are using a version of recent vintage. The stack won't generate TCP and IP packets larger than 64K. However the ethernet header gets prepended to it

Re: pf performance?

2013-04-26 Thread Andre Oppermann
On 26.04.2013 16:49, Erich Weiler wrote: The pf isn't a process, so you can't see it in top. pf has some helper threads however, but packet processing isn't performed by any of them. But the work pf does would show up in 'system' on top right? So if I see all my CPUs tied up 100% in

Re: forwarding/ipfw/pf evolution (in pps) on -current

2013-04-25 Thread Andre Oppermann
On 25.04.2013 07:40, Olivier Cochard-Labbé wrote: On Wed, Apr 24, 2013 at 1:46 PM, Sami Halabi sodyn...@gmail.com wrote: 3. there some point of improved performance (without fw) that went down again somewhere before Clang got prod. Found it ! It's commit 242402: Rework the known mutexes...

Re: forwarding/ipfw/pf evolution (in pps) on -current

2013-04-24 Thread Andre Oppermann
On 24.04.2013 12:45, Olivier Cochard-Labbé wrote: Hi all, here is the result of my simple-and-dummy bench script regarding forwarding/ipfw/pf performance evolution on -current on a single-core server with one flow only. It's the result of more than 810 bench tests (including reboot between

Re: ipfilter(4) needs maintainer

2013-04-16 Thread Andre Oppermann
On 15.04.2013 19:48, Cy Schubert wrote: I did consider a port but given it would have to touch bits and pieces of the source tree (/usr/src), a port would be messy and the decision was made to work on importing it into base. Actually it shouldn't touch many if any pieces of src/sys. Everything

Re: RFC 3042 Implementation

2013-04-11 Thread Andre Oppermann
On 11.04.2013 20:59, Matt Miller wrote: In some of our tests, we noticed some duplicate pure ACKs (not window updates), most of which the duplicates were coming from this tcp_output() call in tcp_do_segment() (line 2534): 2508 } else if (V_tcp_do_rfc3042) { 2509

Re: panic in tcp_do_segment()

2013-04-09 Thread Andre Oppermann
On 09.04.2013 10:16, Peter Holm wrote: On Mon, Apr 08, 2013 at 02:13:40PM +0200, Andre Oppermann wrote: On 05.04.2013 13:09, Matt Miller wrote: Hey Rick, I believe Juan and I have root caused this crash recently. The t_state = 0x1, TCPS_LISTEN, in the link provided at the time

Re: panic in tcp_do_segment()

2013-04-08 Thread Andre Oppermann
On 05.04.2013 13:09, Matt Miller wrote: Hey Rick, I believe Juan and I have root caused this crash recently. The t_state = 0x1, TCPS_LISTEN, in the link provided at the time of the assertion. In tcp_input(), if we're in TCPS_LISTEN, SO_ACCEPTCONN should be set on the socket and we should

Re: Syncookies break with Windows 8

2013-04-05 Thread Andre Oppermann
On 04.04.2013 23:52, Kevin Day wrote: On Feb 1, 2013, at 5:09 PM, Andre Oppermann an...@freebsd.org wrote: I'm working on a solution. Have to make sure that the chance to crack a reduced cookie during its 30 seconds lifetime isn't too high. That means involving our resident crypto experts

Re: panic in tcp_do_segment()

2013-04-05 Thread Andre Oppermann
On 05.04.2013 00:33, Rick Macklem wrote: Hi, When pho@ was doing some NFS testing, he got the following crash, which I can't figure out. (As far as I can see, INP_WLOCK() is always held when tp->t_state = TCPS_CLOSED and it is held from before the test for TCPS_CLOSED in tcp_input() up until the

Re: Limits on jumbo mbuf cluster allocation

2013-03-19 Thread Andre Oppermann
On 19.03.2013 05:29, Garrett Wollman wrote: On Tue, 12 Mar 2013 23:48:00 -0400 (EDT), Rick Macklem rmack...@uoguelph.ca said: I've attached a patch that has assorted changes. So I've done some preliminary testing on a slightly modified form of this patch, and it appears to have no major

Re: MPLS

2013-03-18 Thread Andre Oppermann
On 18.03.2013 13:20, Alexander V. Chernikov wrote: On 17.03.2013, at 23:54, Andre Oppermann an...@freebsd.org wrote: On 17.03.2013 19:57, Alexander V. Chernikov wrote: On 17.03.2013 13:20, Sami Halabi wrote: ITOH OpenBSD has a complete implementation of MPLS out of the box, maybe

Re: MPLS

2013-03-17 Thread Andre Oppermann
On 17.03.2013 19:57, Alexander V. Chernikov wrote: On 17.03.2013 13:20, Sami Halabi wrote: ITOH OpenBSD has a complete implementation of MPLS out of the box, maybe Their control plane code is mostly useless due to design approach (routing daemons talk via kernel). What's your approach?

Re: kern/176446: [netinet] [patch] Concurrency in ixgbe driving out-of-order packet process and spurious RST

2013-03-15 Thread Andre Oppermann
On 15.03.2013 14:57, John Baldwin wrote: On Thursday, March 14, 2013 5:59:44 pm Ryan Stone wrote: What's the benefit in having both an interrupt thread and a task that perform the same function? It seems to me that having two threads that do the same job is what is making this so complicated.

Re: increasing 'requests for jumbo clusters denied'

2013-03-11 Thread Andre Oppermann
On 11.03.2013 08:52, Ihsan Junaidi Ibrahim wrote: Hi, I'm on 9.0-RELEASE-p3 and have had a number of instances where my igb0 network connectivity locked up under heavy load. This problem is also known on CURRENT and we are under active investigation on how to solve it properly. I've had

Re: Limits on jumbo mbuf cluster allocation

2013-03-11 Thread Andre Oppermann
On 11.03.2013 00:46, Rick Macklem wrote: Andre Oppermann wrote: On 10.03.2013 03:22, Rick Macklem wrote: Garrett Wollman wrote: Also, it occurs to me that this strategy is subject to livelock. To put backpressure on the clients, it is far better to get them to stop sending (by advertising

Re: Limits on jumbo mbuf cluster allocation

2013-03-11 Thread Andre Oppermann
On 11.03.2013 17:05, Garrett Wollman wrote: In article 513db550.5010...@freebsd.org, an...@freebsd.org writes: Garrett's problem is receive side specific and NFS can't do much about it. Unless, of course, NFS is holding on to received mbufs for a longer time. Well, I have two problems: one

Re: Limits on jumbo mbuf cluster allocation

2013-03-10 Thread Andre Oppermann
On 09.03.2013 01:47, Rick Macklem wrote: Garrett Wollman wrote: On Fri, 08 Mar 2013 08:54:14 +0100, Andre Oppermann an...@freebsd.org said: [stuff I wrote deleted] You have an amd64 kernel running HEAD or 9.x? Yes, these are 9.1 with some patches to reduce mutex contention on the NFS

Re: Limits on jumbo mbuf cluster allocation

2013-03-10 Thread Andre Oppermann
On 10.03.2013 07:04, Garrett Wollman wrote: On Fri, 8 Mar 2013 12:13:28 -0800, Jack Vogel jfvo...@gmail.com said: Yes, in the past the code was in this form, it should work fine Garrett, just make sure the 4K pool is large enough. [Andre Oppermann's patch:] if (adapter->max_frame_size <=

Re: Limits on jumbo mbuf cluster allocation

2013-03-10 Thread Andre Oppermann
On 10.03.2013 03:22, Rick Macklem wrote: Garrett Wollman wrote: Also, it occurs to me that this strategy is subject to livelock. To put backpressure on the clients, it is far better to get them to stop sending (by advertising a small receive window) than to accept their traffic but queue it for

Re: Limits on jumbo mbuf cluster allocation

2013-03-08 Thread Andre Oppermann
On 08.03.2013 18:04, Garrett Wollman wrote: On Fri, 8 Mar 2013 00:31:18 -0800, Jack Vogel jfvo...@gmail.com said: I am not strongly opposed to trying the 4k mbuf pool for all larger sizes, Garrett maybe if you would try that on your system and see if that helps you, I could envision making

Re: [patch] interface routes

2013-03-07 Thread Andre Oppermann
On 07.03.2013 12:43, Alexander V. Chernikov wrote: On 07.03.2013 11:39, Andre Oppermann wrote: On 07.03.2013 07:34, Alexander V. Chernikov wrote: Hello list! There is a known long-lived issue with interface routes addition/deletion: ifconfig iface inet 1.2.3.4/24 can fail if given prefix

Re: [patch] interface routes

2013-03-07 Thread Andre Oppermann
On 07.03.2013 14:38, Ermal Luçi wrote: On Thu, Mar 7, 2013 at 12:55 PM, Andre Oppermann an...@freebsd.org wrote: On 07.03.2013 12:43, Alexander V. Chernikov wrote: On 07.03.2013 11:39, Andre Oppermann wrote: On 07.03.2013 07:34, Alexander V

Re: [patch] interface routes

2013-03-07 Thread Andre Oppermann
On 07.03.2013 14:54, Alexander V. Chernikov wrote: On 07.03.2013 15:55, Andre Oppermann wrote: On 07.03.2013 12:43, Alexander V. Chernikov wrote: On 07.03.2013 11:39, Andre Oppermann wrote: This brings up a long standing sore point of our routing code which this patch makes more pronounced

Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102)

2013-03-07 Thread Andre Oppermann
about this for FreeBSD 9.x It's not compiled in GENERIC on 9.x because it had/has some stability issues. I just wanted to make sure that the problem really comes out of the arpresolve area before digging into it. -- Andre On Wed, Mar 6, 2013 at 10:27 AM, Andre Oppermann an...@freebsd.org wrote

Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102)

2013-03-07 Thread Andre Oppermann
On 07.03.2013 20:27, Krzysztof Barcikowski wrote: W dniu 2013-03-07 18:09, Andre Oppermann pisze: On 07.03.2013 17:54, Nick Rogers wrote: I'm not sure. I have not explicitly enabled/disabled it. I am using the GENERIC kernel from 9.1 plus PF+ALTQ. # sysctl net.inet.flowtable.enable sysctl

Re: [patch] interface routes

2013-03-07 Thread Andre Oppermann
On 07.03.2013 16:34, Alexander V. Chernikov wrote: On 07.03.2013 17:51, Andre Oppermann wrote: On 07.03.2013 14:38, Ermal Luçi wrote: Isn't it better to teach the routing code about metrics. Routing daemons cope better this way and they can handle this. So the policy of this behaviour can

Re: Limits on jumbo mbuf cluster allocation

2013-03-07 Thread Andre Oppermann
On 08.03.2013 08:10, Garrett Wollman wrote: I have a machine (actually six of them) with an Intel dual-10G NIC on the motherboard. Two of them (so far) are connected to a network using jumbo frames, with an MTU a little under 9k, so the ixgbe driver allocates 32,000 9k clusters for its receive

Re: Default route changes unexpectedly

2013-03-06 Thread Andre Oppermann
On 05.03.2013 18:39, Nick Rogers wrote: Hello, I am attempting to create awareness of a serious issue affecting users of FreeBSD 9.x and PF. There appears to be a bug that allows the kernel's routing table to be corrupted by traffic routing through the system. Under heavy traffic load, the

Re: Default route changes unexpectedly #2 (was Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102)

2013-03-06 Thread Andre Oppermann
Courtland, the arpresolve observation is very important. Do you have flowtable enabled in your kernel? -- Andre On 06.03.2013 17:16, Adrian Chadd wrote: Another instance of it.. Adrian On 6 March 2013 07:21, Courtland ncrog...@gmail.com wrote: Has there been any progress on resolving this

Re: [patch] interface routes

2013-03-06 Thread Andre Oppermann
On 07.03.2013 07:34, Alexander V. Chernikov wrote: Hello list! There is a known long-lived issue with interface routes addition/deletion: ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in kernel route table (for example, advertised by IGP like OSPF). Interface route can

Re: Bug in sbsndptr()

2013-03-05 Thread Andre Oppermann
On 05.03.2013 04:21, Lawrence Stewart wrote: On 03/05/13 03:35, Andre Oppermann wrote: On 26.02.2013 14:38, Lawrence Stewart wrote: Hi Andre, Hi Lawrence, :-) A colleague and I spent a very frustrating day tracing an accounting bug in the multipath TCP patch we're working on at CAIA

Re: Bug in sbsndptr()

2013-03-04 Thread Andre Oppermann
On 26.02.2013 14:38, Lawrence Stewart wrote: Hi Andre, Hi Lawrence, :-) A colleague and I spent a very frustrating day tracing an accounting bug in the multipath TCP patch we're working on at CAIA to a bug in sbsndptr(). I haven't tested it with regular TCP yet, but I believe the following

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-02-13 Thread Andre Oppermann
On 13.02.2013 09:25, Lawrence Stewart wrote: FYI I've read the whole thread as of this reply and plan to follow up to a few of the other posts separately, but first for my initial thoughts... On 01/23/13 07:11, John Baldwin wrote: As I mentioned in an earlier thread, I recently had to debug an

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-02-13 Thread Andre Oppermann
On 13.02.2013 15:26, Lawrence Stewart wrote: On 02/13/13 21:27, Andre Oppermann wrote: On 13.02.2013 09:25, Lawrence Stewart wrote: The idea is useful. I'd just like to discuss the implementation specifics a little further before recommending whether the patch should go in as is to provide

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-02-12 Thread Andre Oppermann
On 12.02.2013 11:55, Andrey Zonov wrote: On 2/11/13 3:18 PM, Andre Oppermann wrote: Smaller RTO (1s) has become an RFC so there was very broad consensus in TCPM that it is a good thing. We don't have it yet because we were not fully compliant in one case (loss of first segment). I've fixed

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-02-12 Thread Andre Oppermann
On 11.02.2013 19:56, Adrian Chadd wrote: On 11 February 2013 03:18, Andre Oppermann an...@freebsd.org wrote: In general Google does provide quite a bit of data with their experiments showing that it isn't harmful and that it helps the case. Smaller RTO (1s) has become an RFC so there was very

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-02-11 Thread Andre Oppermann
On 09.02.2013 15:41, Alfred Perlstein wrote: However, the end result must be far different than what has occurred so far. If the code was deemed unacceptable for general inclusion, then we must find a way to provide a light framework to accomplish the needs of the community member. We've got

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-02-11 Thread Andre Oppermann
On 05.02.2013 22:40, John Baldwin wrote: On Tuesday, February 05, 2013 12:44:27 pm Andre Oppermann wrote: I would prefer to encapsulate it into its own not-so-much-congestion-management algorithm so you can eventually do other tweaks as well like more aggressive loss recovery which would fit

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-02-11 Thread Andre Oppermann
On 10.02.2013 11:36, Andrey Zonov wrote: On 2/10/13 9:05 AM, Kevin Oberman wrote: This is a subject rather near to my heart, having fought battles with congestion back in the dark days of Windows when it essentially defaulted to TCPIGNOREIDLE. It was a huge pain, but it was the only way

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-02-05 Thread Andre Oppermann
On 05.02.2013 18:11, John Baldwin wrote: On Wednesday, January 30, 2013 12:26:17 pm Andre Oppermann wrote: You can simply create your own congestion control algorithm with only the restart window changed. See (pseudo) code below. BTW, I just noticed that the other cc algos do not reset

Re: A question about SYN cookies...

2013-02-04 Thread Andre Oppermann
On 04.02.2013 01:09, George Neville-Neil wrote: Howdy, I've been reviewing the SYN cache and SYN cookie code and I'm wondering why we do all the work of generating a SYN cache entry before sending a SYN cookie. If the point of SYN cookies is to defend against a SYN flood then, to my mind,

Re: m_get2() name

2013-02-01 Thread Andre Oppermann
On 01.02.2013 13:04, Gleb Smirnoff wrote: Hi! The m_get2() function allocates a single mbuf with enough space to hold specified amount of data. It can return either a single mbuf, an mbuf with a standard cluster, page size cluster, or jumbo cluster. While m_get2() is a good function,

Re: m_get2() name

2013-02-01 Thread Andre Oppermann
On 01.02.2013 14:15, Gleb Smirnoff wrote: On Fri, Feb 01, 2013 at 02:04:38PM +0100, Andre Oppermann wrote: The m_get2() function allocates a single mbuf with enough space to hold specified amount of data. It can return either a single mbuf, an mbuf with a standard cluster, page size

Re: Syncookies break with Windows 8

2013-02-01 Thread Andre Oppermann
On 01.02.2013 22:21, Kevin Day wrote: We've got a large cluster of HTTP servers, each server handling 10,000req/sec. Occasionally, and during periods of heavy load, we'd get complaints from some users that downloads were working but going EXTREMELY slowly. After a whole lot of debugging, we

Re: Syncookies break with Windows 8

2013-02-01 Thread Andre Oppermann
On 01.02.2013 23:54, Kevin Day wrote: On Feb 1, 2013, at 4:39 PM, Andre Oppermann opperm...@networx.ch wrote: This is not true. FreeBSD uses bits in the timestamp to encode all recognized TCP options including window scaling. Sorry, you are correct here. Reading through a half dozen TCP

Re: [patch] good bye sockaddr_inarp

2013-01-30 Thread Andre Oppermann
On 30.01.2013 10:25, Gleb Smirnoff wrote: Hello! It looks to me that the only thing the sockaddr_inarp was ever used for is to carry the SIN_PROXY flag. The SIN_PROXY flag in its turn, meant install a proxy only ARP entry. Such entry behaves as any published entry, but doesn't modify

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-30 Thread Andre Oppermann
On 30.01.2013 17:58, John Baldwin wrote: On Tuesday, January 29, 2013 6:07:22 pm Andre Oppermann wrote: On 29.01.2013 19:50, John Baldwin wrote: On Thursday, January 24, 2013 11:14:40 am John Baldwin wrote: Agree, a per-socket option could be more useful than global sysctls in certain situations

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-30 Thread Andre Oppermann
On 30.01.2013 18:11, Alfred Perlstein wrote: On 1/30/13 11:58 AM, John Baldwin wrote: On Tuesday, January 29, 2013 6:07:22 pm Andre Oppermann wrote: Yes, unfortunately I do object. This option, combined with the inflated CWND at the end of a burst, effectively removes much, if not all

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-29 Thread Andre Oppermann
On 29.01.2013 19:50, John Baldwin wrote: On Thursday, January 24, 2013 11:14:40 am John Baldwin wrote: Agree, a per-socket option could be more useful than global sysctls in certain situations. However, in addition to the per-socket option, could global sysctl nodes to disable idle_restart/idle_cwv

Re: [PATCH] Allow tcpdrop to use non-space separators

2013-01-29 Thread Andre Oppermann
On 29.01.2013 18:05, John Baldwin wrote: A common use case I have at work is to find a busted connection using netstat -n or sockstat and then want to tcpdrop it. However, tcpdrop requires spaces between the address and port so I can't simply cut and paste from one terminal window into another

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-24 Thread Andre Oppermann
On 24.01.2013 03:31, Sepherosa Ziehau wrote: On Thu, Jan 24, 2013 at 12:15 AM, John Baldwin j...@freebsd.org wrote: On Wednesday, January 23, 2013 1:33:27 am Sepherosa Ziehau wrote: On Wed, Jan 23, 2013 at 4:11 AM, John Baldwin j...@freebsd.org wrote: As I mentioned in an earlier thread, I

Re: Some questions about the new TCP congestion control code

2013-01-24 Thread Andre Oppermann
On 24.01.2013 14:28, Lawrence Stewart wrote: On 01/16/13 06:27, John Baldwin wrote: One other thing I noticed, which may or may not be odd during this, is that if you have a connection with TCP_NODELAY enabled and you fill your cwnd and then you get an ACK back for an earlier small segment

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-22 Thread Andre Oppermann
On 22.01.2013 21:35, Alfred Perlstein wrote: On 1/22/13 12:11 PM, John Baldwin wrote: As I mentioned in an earlier thread, I recently had to debug an issue we were seeing across a link with a high bandwidth-delay product (both high bandwidth and high RTT). Our specific use case was to use a
