Re: svn commit: r292379 - in head/sys: netinet netinet6
On 17/12/2015 23:46, Gleb Smirnoff wrote:
> On Thu, Dec 17, 2015 at 11:26:52PM +, Steven Hartland wrote:
> S> You may not have read all the detail in the review, so you might not have noticed that I identified that carp IPv6 NA was broken by r251584, which was committed 2 1/2 years ago. I'm guessing not many people use it for IPv6.
>
> My suggestion is to look at this regression separately from the lagg failover and fix it separately.

We could, but with this new code it was only a few characters; implemented separately you'd need a good portion of the code from this change anyway, so it made sense to just include it here IMO.

> S> > The "link aggregation" itself refers to an aggregation of links between two logical devices. If you build a lagg(4) interface on top of two ports that are plugged into different switches, you are calling for trouble.
> S>
> S> While multiple switches complicate the matter, it's not the only issue, as you can reproduce this with a single switch and two NICs in LAGG failover mode with a simple ifconfig down. At this point any traffic entering the switch for the LAGG member will be black-holed instead of being received by the other NIC.
> S>
> S> It is much more common in networking now to have multiple physical switches configured as part of bigger logical devices using protocols such as MLAG, which is what we're using with Cisco and Arista, so not some cheapo network ;-)
>
> Right, you are confirming what I said above. Multiple physical devices, but still one logical on each side of lagg.

In our target environment this is correct.

> S> > Nevertheless, someone wants to give a kick to this initially broken network design and run it somehow. And this "somehow" implies Layer2 upcalling into upper layers to do something, since there is no established standard layer2 heartbeat packet. I have chatted with networking gurus at my job, and they said that they don't know any decent network equipment that supports such a setup. However, they noticed that Windows is capable of such failover. I haven't yet learned how Windows solves the problem. Actually, those who pushed committing 156226 should have done these investigations. Probably Windows does exactly the same, sends gratuitous ARP or its IPv6 analog. Or maybe does something better, like sending a useless L2 datagram, but with a proper source hardware address.
> S>
> S> Actually our testing here showed both Windows and Linux worked as expected, and from my reading doing the GARP / UNA is actually expected in this situation, for this very reason.
>
> Is it possible for you to sniff the traffic and see what actually happens in there? My expectations are the same, but want to be sure.

Netops here did do that, which led them to conclude that the missing GARP/NA was the problem.

> S> I'd like to step back for a second and get your feedback on the changes that were reverted, which didn't have the DELAY in the callout. What were the issues as you saw them? So we don't spam people any more I've reopened the review so we can take this there: https://reviews.freebsd.org/D4111
>
> Before going into implementation, can we first settle on the protocol? Could be that GARP/NA is the only solution there, but let's be sure first.

I did try forcing traffic out from the backup interface using the console once the primary was down, and unfortunately that didn't help. net.link.lagg.failover_rx_all=1 helps in the converse test, but the only thing we found that fixed it fully in a timely manner was GARP/NA.

In the tests you can clearly see the impact of ARP timeouts, as sometimes it would converge quicker than others.

If you would like me to try something else, by all means LMK.

Regards
Steve
___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
Re: svn commit: r292379 - in head/sys: netinet netinet6
On 18/12/2015 9:55 AM, Steven Hartland wrote:
> On 17/12/2015 19:20, Gleb Smirnoff wrote:
>> Steven,
>>
>> another feasible solution for the design described in 156226 would be to run STP on the switches, and if_bridge(4) instead of if_lagg(4) on FreeBSD, also with STP enabled. Would work perfectly.
>>
>> Of course, if switches are dumb and cheap, and can't do STP, then a tiny bpf-writer is the right solution.
>>
>> P.S. When I was running the network in my university dormitory, we used a lot of cheap solutions, and a lot of dirty workarounds, but none of the latter made its way into the FreeBSD kernel. You can also ask Eugene Grosbein, he also has huge experience of living with not so pleasant workarounds, but not pushing them aggressively into the kernel.
>
> Last time I heard, STP is a bad word in networking, so I'm sure the network team would have me crucified for even suggesting it and start shouting MLAG for the rest of the day ;-)
>
> Regards
> Steve

FWIW, during my testing at $lastjob, STP-enabled switches didn't provide optimal failback times, even with portfast enabled.

This was for two FreeBSD machines providing HA network management services for multiple backends, with dual-port bge/em NICs configured with lagg(4) (bge0/em0, bge1/em1) in failover mode (no aggregation), and multiple carp(4) interfaces for public service IPs on top.
Re: svn commit: r292379 - in head/sys: netinet netinet6
Hey Gleb, sorry I didn't wait for your review, but I did ask around on what the timeout for this would be and was informed 2 weeks, and as it had been over a month, with quite a bit of feedback from others in the area, all of which had been addressed, I thought that was reasonable.

A simple reply of "I do intend to review but don't have time yet so please wait" would, I think, be useful in these types of situations, as I know everyone's busy but it's impossible to know what others are thinking.

Clearly the MTU thing was just a silly mistake, which I backed out instantly, so let's not be too harsh on that one ;-)

With regards to MAINTAINERS, is there any sort of automation which could alert on this, a pre-commit hook maybe? I'm sure that would be helpful as a reminder.

I would definitely like to understand more about your concerns and learn from your knowledge in this area, so thanks for that offer, and while it does sound unforgiving I totally understand where you're coming from.

Hopefully together we can bring this to a satisfactory conclusion, as I would hate for both carp and lagg to stay as broken, 2 years is long enough :D

Regards
Steve

On 17/12/2015 00:38, Gleb Smirnoff wrote:
> Steven,
>
> I'm sorry that I wasn't able to review D4111 in time, but I have very strong concerns against r292275. And r292379 doesn't improve the situation. I am asking you to back out both patches, and then we can get back together to the problem. The 156226 bug was sitting for 2 years in the bugzilla for a reason. It is not a "low hanging fruit" like koobs@ says.
>
> I'm sorry if I sound unforgiving, but you have a very bad commit record in this area. You committed r290403 to ip_carp.c which "added MTU support to carp interfaces", and that was after 4 YEARS of carp(4) being not an interface. So, I assume you don't have a good understanding of the current state of the stack, the direction it is developed in, and things that can be done and can not (including DELAY() in callout(9)).
>
> Note that the MAINTAINERS file still lists me for ip_carp.c, and you didn't wait for my review. Yet another reason to ask for backout.
>
> I understand that you have a product at work that needs to have the problem fixed. I'm glad that you have a patch that works around it. But that doesn't mean the patch should immediately be dumped in head with a threat of soon MFC.
Re: svn commit: r292379 - in head/sys: netinet netinet6
Steven,

another feasible solution for the design described in 156226 would be to run STP on the switches, and if_bridge(4) instead of if_lagg(4) on FreeBSD, also with STP enabled. Would work perfectly.

Of course, if switches are dumb and cheap, and can't do STP, then a tiny bpf-writer is the right solution.

P.S. When I was running the network in my university dormitory, we used a lot of cheap solutions, and a lot of dirty workarounds, but none of the latter made its way into the FreeBSD kernel. You can also ask Eugene Grosbein, he also has huge experience of living with not so pleasant workarounds, but not pushing them aggressively into the kernel.

--
Totus tuus, Glebius.
Re: svn commit: r292379 - in head/sys: netinet netinet6
Steven,

On Thu, Dec 17, 2015 at 02:29:02PM +, Steven Hartland wrote:
S> Hey Gleb, sorry I didn't wait for your review, but I did ask around on what the timeout for this would be and was informed 2 weeks, and as it had been over a month, with quite a bit of feedback from others in the area, all of which had been addressed, I thought that was reasonable.
S>
S> A simple reply of "I do intend to review but don't have time yet so please wait" would, I think, be useful in these types of situations, as I know everyone's busy but it's impossible to know what others are thinking.

Of course it is my fault that I don't review everything phabricator sends to me, which is a lot. Because it is so easy to add a bunch of subscribers to a revision.

However, in the good old pre-phabricator times, people used to send email directly to each other. Those emails arrive in my regular mailbox, which I check as often as I can. You could ping me via email.

--
Totus tuus, Glebius.
Re: svn commit: r292379 - in head/sys: netinet netinet6
On Thursday, December 17, 2015 02:29:02 PM Steven Hartland wrote:
> With regards MAINTAINERS, is there any sort of automation which could alert
> on this (pre-commit) hook maybe as I'm sure that would be helpful as a
> reminder.

To this point (and this point only): I would encourage anyone with a MAINTAINERS entry to set up a herald rule in phabricator to auto-subscribe to changes that affect files you care about. Uploading diffs to phab will then work as a sort of pre-commit hook.

--
John Baldwin
Re: svn commit: r292379 - in head/sys: netinet netinet6
On 17/12/2015 18:57, Gleb Smirnoff wrote:
> Steven,
>
> On Thu, Dec 17, 2015 at 02:29:02PM +, Steven Hartland wrote:
> S> Hey Gleb, sorry I didn't wait for your review, but I did ask around on what the timeout for this would be and was informed 2 weeks, and as it had been over a month, with quite a bit of feedback from others in the area, all of which had been addressed, I thought that was reasonable.
> S>
> S> A simple reply of "I do intend to review but don't have time yet so please wait" would, I think, be useful in these types of situations, as I know everyone's busy but it's impossible to know what others are thinking.
>
> Of course it is my fault that I don't review everything phabricator sends to me, which is a lot. Because it is so easy to add a bunch of subscribers to a revision.
>
> However, in the good old pre-phabricator times, people used to send email directly to each other. Those emails arrive in my regular mailbox, which I check as often as I can. You could ping me via email.

Ooo, do you not get the chase emails sent from phabricator then, as that's what I used?

My apologies if that's the case.
Re: svn commit: r292379 - in head/sys: netinet netinet6
On Thu, Dec 17, 2015 at 11:26:52PM +, Steven Hartland wrote:
S> You may not have read all the detail in the review, so you might not have noticed that I identified that carp IPv6 NA was broken by r251584, which was committed 2 1/2 years ago. I'm guessing not many people use it for IPv6.

My suggestion is to look at this regression separately from the lagg failover and fix it separately.

S> > The "link aggregation" itself refers to an aggregation of links between two logical devices. If you build a lagg(4) interface on top of two ports that are plugged into different switches, you are calling for trouble.
S>
S> While multiple switches complicate the matter, it's not the only issue, as you can reproduce this with a single switch and two NICs in LAGG failover mode with a simple ifconfig down. At this point any traffic entering the switch for the LAGG member will be black-holed instead of being received by the other NIC.
S>
S> It is much more common in networking now to have multiple physical switches configured as part of bigger logical devices using protocols such as MLAG, which is what we're using with Cisco and Arista, so not some cheapo network ;-)

Right, you are confirming what I said above. Multiple physical devices, but still one logical on each side of lagg.

S> > Nevertheless, someone wants to give a kick to this initially broken network design and run it somehow. And this "somehow" implies Layer2 upcalling into upper layers to do something, since there is no established standard layer2 heartbeat packet. I have chatted with networking gurus at my job, and they said that they don't know any decent network equipment that supports such a setup. However, they noticed that Windows is capable of such failover. I haven't yet learned how Windows solves the problem. Actually, those who pushed committing 156226 should have done these investigations. Probably Windows does exactly the same, sends gratuitous ARP or its IPv6 analog. Or maybe does something better, like sending a useless L2 datagram, but with a proper source hardware address.
S>
S> Actually our testing here showed both Windows and Linux worked as expected, and from my reading doing the GARP / UNA is actually expected in this situation, for this very reason.

Is it possible for you to sniff the traffic and see what actually happens in there? My expectations are the same, but want to be sure.

S> I'd like to step back for a second and get your feedback on the changes that were reverted, which didn't have the DELAY in the callout. What were the issues as you saw them? So we don't spam people any more I've reopened the review so we can take this there: https://reviews.freebsd.org/D4111

Before going into implementation, can we first settle on the protocol? Could be that GARP/NA is the only solution there, but let's be sure first.

--
Totus tuus, Glebius.
Re: svn commit: r292379 - in head/sys: netinet netinet6
On 17/12/2015 19:20, Gleb Smirnoff wrote:
> Steven,
>
> another feasible solution for the design described in 156226 would be to run STP on the switches, and if_bridge(4) instead of if_lagg(4) on FreeBSD, also with STP enabled. Would work perfectly.
>
> Of course, if switches are dumb and cheap, and can't do STP, then a tiny bpf-writer is the right solution.
>
> P.S. When I was running the network in my university dormitory, we used a lot of cheap solutions, and a lot of dirty workarounds, but none of the latter made its way into the FreeBSD kernel. You can also ask Eugene Grosbein, he also has huge experience of living with not so pleasant workarounds, but not pushing them aggressively into the kernel.

Last time I heard, STP is a bad word in networking, so I'm sure the network team would have me crucified for even suggesting it and start shouting MLAG for the rest of the day ;-)

Regards
Steve
Re: svn commit: r292379 - in head/sys: netinet netinet6
On 17/12/2015 19:16, Gleb Smirnoff wrote:
> Steven,
>
> On Thu, Dec 17, 2015 at 02:29:02PM +, Steven Hartland wrote:
> S> I would definitely like to understand more about your concerns and learn from your knowledge in this area, so thanks for that offer, and while it does sound unforgiving I totally understand where you're coming from.
> S>
> S> Hopefully together we can bring this to a satisfactory conclusion, as I would hate for both carp and lagg to stay as broken, 2 years is long enough :D
>
> Ok, let's get technical.
>
> CARP and LAGG were not broken for 2 years. They were working very well in the way they were designed to work. The setup in the bug 156226 was broken initially.

You may not have read all the detail in the review, so you might not have noticed that I identified that carp IPv6 NA was broken by r251584, which was committed 2 1/2 years ago. I'm guessing not many people use it for IPv6.

> The "link aggregation" itself refers to an aggregation of links between two logical devices. If you build a lagg(4) interface on top of two ports that are plugged into different switches, you are calling for trouble.

While multiple switches complicate the matter, it's not the only issue, as you can reproduce this with a single switch and two NICs in LAGG failover mode with a simple ifconfig down. At this point any traffic entering the switch for the LAGG member will be black-holed instead of being received by the other NIC.

It is much more common in networking now to have multiple physical switches configured as part of bigger logical devices using protocols such as MLAG, which is what we're using with Cisco and Arista, so not some cheapo network ;-)

> All comments in the 156226 from Eugene Grosbein are valid. I would not repeat them, but ask you to reread them in bugzilla. There was a good reason why for 2 years committers stayed away from this "bug" and related patch.

Yes, but let's not confuse the different types; we're talking specifically about failover mode here, which has no special configuration, hence it's reliant on the OS implementation only.

> Nevertheless, someone wants to give a kick to this initially broken network design and run it somehow. And this "somehow" implies Layer2 upcalling into upper layers to do something, since there is no established standard layer2 heartbeat packet. I have chatted with networking gurus at my job, and they said that they don't know any decent network equipment that supports such a setup. However, they noticed that Windows is capable of such failover. I haven't yet learned how Windows solves the problem. Actually, those who pushed committing 156226 should have done these investigations. Probably Windows does exactly the same, sends gratuitous ARP or its IPv6 analog. Or maybe does something better, like sending a useless L2 datagram, but with a proper source hardware address.

Actually our testing here showed both Windows and Linux worked as expected, and from my reading doing the GARP / UNA is actually expected in this situation, for this very reason.

> Okay, what if we want the same in FreeBSD as in Windows? Should we do the following list of evil things:
>
> - put DELAY in the context of a callout (or in the context of any network processing)
> - introduce new notions of a link state, or a new KPI for link handling. Note that the link handling KPI was stable for over 10 years and satisfied all the different types of interfaces we support
> - create new interface methods
> - call into address families supplying an ifnet that doesn't have this AF instantiated, and then, to fix the immediate panic, putting in a kludge of "if (foo == NULL) return;"
> - etc...
>
> Sorry, I'm putting "etc" here because I'm tired of details. You would agree that the whole process of fixing the "bug" was overcoming the problem that the network stack is not designed for the things that you want it to do. Won't you agree?

I am indeed trying to produce feature parity, to prevent the powers that be throwing FreeBSD out as the only OS which fails to work as expected in failover mode, even in the simple case described above. Yes, we could apply a userland workaround, but then everyone has to be aware of the need for it and set it up, which doesn't sound like the best solution.

> Or should we just write a tiny program that would observe the state of networking ports, and if a port changes state then send a tiny packet as a bpf(4) write?

This could be done, but it still means our lagg failover doesn't do what people would expect.

I'd like to step back for a second and get your feedback on the changes that were reverted, which didn't have the DELAY in the callout. What were the issues as you saw them? So we don't spam people any more I've reopened the review so we can take this there: https://reviews.freebsd.org/D4111

Apologies if these are very obvious to others, but clearly those involved with this didn't spot them, so it would be really nice to learn from this.

Regards
Steve
Re: svn commit: r292379 - in head/sys: netinet netinet6
Steven,

On Thu, Dec 17, 2015 at 02:29:02PM +, Steven Hartland wrote:
S> I would definitely like to understand more about your concerns and learn from your knowledge in this area, so thanks for that offer, and while it does sound unforgiving I totally understand where you're coming from.
S>
S> Hopefully together we can bring this to a satisfactory conclusion, as I would hate for both carp and lagg to stay as broken, 2 years is long enough :D

Ok, let's get technical.

CARP and LAGG were not broken for 2 years. They were working very well in the way they were designed to work. The setup in the bug 156226 was broken initially.

The "link aggregation" itself refers to an aggregation of links between two logical devices. If you build a lagg(4) interface on top of two ports that are plugged into different switches, you are calling for trouble.

All comments in the 156226 from Eugene Grosbein are valid. I would not repeat them, but ask you to reread them in bugzilla. There was a good reason why for 2 years committers stayed away from this "bug" and related patch.

Nevertheless, someone wants to give a kick to this initially broken network design and run it somehow. And this "somehow" implies Layer2 upcalling into upper layers to do something, since there is no established standard layer2 heartbeat packet. I have chatted with networking gurus at my job, and they said that they don't know any decent network equipment that supports such a setup. However, they noticed that Windows is capable of such failover. I haven't yet learned how Windows solves the problem. Actually, those who pushed committing 156226 should have done these investigations. Probably Windows does exactly the same, sends gratuitous ARP or its IPv6 analog. Or maybe does something better, like sending a useless L2 datagram, but with a proper source hardware address.

Okay, what if we want the same in FreeBSD as in Windows? Should we do the following list of evil things:

- put DELAY in the context of a callout (or in the context of any network processing)
- introduce new notions of a link state, or a new KPI for link handling. Note that the link handling KPI was stable for over 10 years and satisfied all the different types of interfaces we support
- create new interface methods
- call into address families supplying an ifnet that doesn't have this AF instantiated, and then, to fix the immediate panic, putting in a kludge of "if (foo == NULL) return;"
- etc...

Sorry, I'm putting "etc" here because I'm tired of details. You would agree that the whole process of fixing the "bug" was overcoming the problem that the network stack is not designed for the things that you want it to do. Won't you agree?

Or should we just write a tiny program that would observe the state of networking ports, and if a port changes state then send a tiny packet as a bpf(4) write?

--
Totus tuus, Glebius.
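[Editorial sketch, not part of the original message.] The "tiny program" idea above would, on a port state change, write one frame announcing the host's address from the newly active port. A minimal sketch of building such a gratuitous ARP frame follows, in Python rather than C for brevity. The function name `build_gratuitous_arp`, the example MAC address and the 192.0.2.10 address are hypothetical; detecting the link change and writing the frame to bpf(4) are left out, and nothing here is taken from the code in D4111.

```python
import ipaddress
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ETHERTYPE_IPV4 = 0x0800


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return bytes(int(part, 16) for part in parts)


def build_gratuitous_arp(src_mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP announcement.

    Sender and target protocol address are both our own IP, which is
    what makes the request "gratuitous": switches relearn the port for
    src_mac and neighbours refresh their ARP entry for ip.
    """
    mac = mac_to_bytes(src_mac)
    addr = ipaddress.IPv4Address(ip).packed

    # Ethernet header: broadcast destination, our MAC as source.
    ethernet = BROADCAST_MAC + mac + struct.pack("!H", ETHERTYPE_ARP)

    # ARP header: Ethernet (1), IPv4, hlen 6, plen 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, ETHERTYPE_IPV4, 6, 4, 1)
    arp += mac + addr            # sender hardware / protocol address
    arp += b"\x00" * 6 + addr    # target hardware (unused) / protocol address
    return ethernet + arp


if __name__ == "__main__":
    frame = build_gratuitous_arp("02:00:00:00:00:01", "192.0.2.10")
    print(len(frame), frame.hex())
```

The source hardware address in the Ethernet header is the part that matters for the failover case discussed here: it is what lets the switch move its forwarding entry to the port that is now active.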
Re: svn commit: r292379 - in head/sys: netinet netinet6
On 17/12/2015 11:38 AM, Gleb Smirnoff wrote:
> I'm sorry that wasn't able to review D4111 in time, but I have
> very strong concerns against r292275. And r292379 doesn't
> improve situation. I am asking you to back out both patches,
> and then we can get together back to the problem. The 156226
> bug was sitting for 2 years in the bugzilla for a reason. It
> is a not "low hanging fruit" like koobs@ says.

To clarify my comment on the review, I intended to refer to the *value* and concrete *definition* of the desired feature (i.e. the problem space), not the triviality of any solution or implementation.

I do my best not to presume where I don't have knowledge :)

./koobs
Re: svn commit: r292379 - in head/sys: netinet netinet6
Steven,

I'm sorry that I wasn't able to review D4111 in time, but I have very strong concerns against r292275. And r292379 doesn't improve the situation. I am asking you to back out both patches, and then we can get back together to the problem. The 156226 bug was sitting for 2 years in the bugzilla for a reason. It is not a "low hanging fruit" like koobs@ says.

I'm sorry if I sound unforgiving, but you have a very bad commit record in this area. You committed r290403 to ip_carp.c which "added MTU support to carp interfaces", and that was after 4 YEARS of carp(4) being not an interface. So, I assume you don't have a good understanding of the current state of the stack, the direction it is developed in, and things that can be done and can not (including DELAY() in callout(9)).

Note that the MAINTAINERS file still lists me for ip_carp.c, and you didn't wait for my review. Yet another reason to ask for backout.

I understand that you have a product at work that needs to have the problem fixed. I'm glad that you have a patch that works around it. But that doesn't mean the patch should immediately be dumped in head with a threat of soon MFC.

--
Totus tuus, Glebius.