Re: How to support QUIC with ipfw

2021-04-12 Thread Matt Joras
Hi Michael,

On Sun, Apr 11, 2021 at 2:27 PM Michael Sierchio  wrote:
>
> On Sun, Apr 11, 2021 at 2:20 PM Matt Joras  wrote:
>
> > Hi Michael,
> >
> > On Sun, Apr 11, 2021, 1:25 PM Michael Sierchio  wrote:
> >
> >> Hi, all.  I noticed my firewall was dropping what seemed to be unsolicited
> >> UDP connections from Google and Facebook, but this turned out to be QUIC
> >> traffic. The traffic can be initiated by the browser (or other supporting
> >> software) or the server.  The problem is that dynamic rules generally don't
> >> cut it – udp traffic here is predominantly NTP and DNS, and the dynamic
> >> rule lifetime for UDP is very short (3-6 s).  And of course they don't work
> >> at all for traffic initiated by the server side.
> >>
> >
> > QUIC connections aren't initiated by the server. The browser is initiating
> > these connections. I'm not an ipfw user, but the best generic firewall strategy
> > would be to have some sort of flow tracking for ~30s for UDP flows
> > associated with tuples originating on the client for remote port 443. 443
> > will cover the vast majority of Internet cases, as QUIC is only being used
> > at scale for HTTP/3.
> >
> >
> Hej, Matt. Thanks. That's a solution that occurred to me, but it means a
> ton of dynamic rules will get instantiated for ephemeral DNS lookups – 3
> seconds is a very long time for a conversation with a DNS server, because
> it has probably recursed from the root zone all the way to the A record in
> a fraction of that time.  30 seconds is forever – well, since UDP doesn't
> have an analogue to a FIN or RST, the rule doesn't go away when the
> conversation does.

Is it not possible to do the dynamic rule instantiation for select UDP
ports, i.e. 443? That may cause issues if DNS-over-HTTP/3 becomes a
thing, but at least for now it would exclude DNS.
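A minimal sketch of that idea, written as a toy Python model rather than in ipfw syntax (I don't know whether ipfw can express a per-port lifetime, so treat this purely as an illustration). The class and constant names are hypothetical; the 3 s and 30 s figures are the ones mentioned in this thread, and refreshing state on matching inbound traffic is an assumption:

```python
# Lifetimes in seconds. The 3 s and 30 s figures are taken from the thread;
# keying the lifetime on the remote port is the idea being illustrated.
DEFAULT_UDP_LIFETIME = 3
PER_PORT_UDP_LIFETIME = {443: 30}


class UdpStateTable:
    """Toy model of dynamic UDP state; not how ipfw implements it."""

    def __init__(self):
        # (local_addr, local_port, remote_addr, remote_port) -> expiry time
        self.expiry = {}

    def lifetime_for(self, flow):
        remote_port = flow[3]
        return PER_PORT_UDP_LIFETIME.get(remote_port, DEFAULT_UDP_LIFETIME)

    def outbound(self, flow, now):
        """Client-originated packet: create or refresh state for the flow."""
        self.expiry[flow] = now + self.lifetime_for(flow)

    def inbound_allowed(self, flow, now):
        """Reply packet: accepted only while matching state is still live."""
        deadline = self.expiry.get(flow)
        if deadline is None or now >= deadline:
            self.expiry.pop(flow, None)  # expired or never existed
            return False
        # Assumption: matching traffic refreshes the state.
        self.expiry[flow] = now + self.lifetime_for(flow)
        return True


table = UdpStateTable()
quic = ("192.0.2.10", 50000, "203.0.113.5", 443)
dns = ("192.0.2.10", 50001, "203.0.113.53", 53)
table.outbound(quic, now=0)
table.outbound(dns, now=0)
print(table.inbound_allowed(quic, now=10))  # True: inside the 30 s window
print(table.inbound_allowed(dns, now=10))   # False: the 3 s state has expired
```

The point of keying on the remote port is that DNS and NTP state still ages out quickly, while only client-originated flows to port 443 are held open longer.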

>
> I'll get some metrics on it. Thanks again.
>
>
> --
>
> "Well," Brahmā said, "even after ten thousand explanations, a fool is no
> wiser, but an intelligent person requires only two thousand five hundred."
>
> - The Mahābhārata

Matt Joras
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: How to support QUIC with ipfw

2021-04-11 Thread Matt Joras
Hi Michael,

On Sun, Apr 11, 2021, 1:25 PM Michael Sierchio  wrote:

> Hi, all.  I noticed my firewall was dropping what seemed to be unsolicited
> UDP connections from Google and Facebook, but this turned out to be QUIC
> traffic. The traffic can be initiated by the browser (or other supporting
> software) or the server.  The problem is that dynamic rules generally don't
> cut it – udp traffic here is predominantly NTP and DNS, and the dynamic
> rule lifetime for UDP is very short (3-6 s).  And of course they don't work
> at all for traffic initiated by the server side.
>

QUIC connections aren't initiated by the server. The browser is initiating
these connections. I'm not an ipfw user, but the best generic firewall strategy
would be to have some sort of flow tracking for ~30s for UDP flows
associated with tuples originating on the client for remote port 443. 443
will cover the vast majority of Internet cases, as QUIC is only being used
at scale for HTTP/3.


> My kludgy solution at present is to troll the dynamic rules, locate the TCP
> connections in them with 443 and 5228 as the target port, and add those
> addresses to a table that permits UDP traffic from those ports.  I only see
> QUIC on IPv6, by the way.  The cron job runs once per minute, adds the
> addresses seen, and deletes those older than N seconds.  I use time_t
> seconds since epoch as the table arg, so I know when it was added or
> refreshed.
>
> Any suggestions on a better solution?
>
> Thanks.
>
> – M
>
> --
>
> "Well," Brahmā said, "even after ten thousand explanations, a fool is no
> wiser, but an intelligent person requires only two thousand five hundred."
>
> - The Mahābhārata

Matt Joras
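
For readers following along, the bookkeeping behind the quoted cron-job workaround (stamp each address with the time it was last seen, then delete anything older than N seconds) can be sketched as below. This is only a model of the logic in Python; it does not touch ipfw tables, and the function name, the dictionary standing in for the table, and the example addresses are all hypothetical:

```python
def refresh_table(table, seen_addrs, now, max_age):
    """One cron pass over a {address: last_seen_epoch_seconds} mapping.

    Addresses seen on this pass are added or re-stamped with `now`;
    entries whose stamp is more than `max_age` seconds old are removed.
    """
    for addr in seen_addrs:
        table[addr] = now
    stale = [addr for addr, stamp in table.items() if now - stamp > max_age]
    for addr in stale:
        del table[addr]
    return table


table = {}
refresh_table(table, {"2001:db8::1", "2001:db8::2"}, now=1000, max_age=300)
refresh_table(table, {"2001:db8::2"}, now=1200, max_age=300)
refresh_table(table, set(), now=1400, max_age=300)
print(sorted(table))  # ['2001:db8::2']: the other entry is now 400 s old
```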



Re: Raw Sockets: Two Questions

2018-03-20 Thread Matt Joras
On Tue, Mar 20, 2018 at 8:43 PM, Eugene Grosbein  wrote:
> On 21.03.2018 08:03, Michael Tuexen wrote:
>
>>> On 21. Mar 2018, at 00:39, Eugene Grosbein  wrote:
>>>
>>> 21.03.2018 3:09, Ronald F. Guilmette wrote:
>>>
>>>> I'm going to be doing some stuff with raw sockets pretty soon, and
>>>> while scrounging around, looking for some nice coding examples, I
>>>> found the following very curious comment on one particular message
>>>> board:
>>>>
>>>> https://stackoverflow.com/questions/7048448/raw-sockets-on-bsd-operating-systems
>>>>
>>>>  "Using raw sockets isn't hard but it's not entirely portable. For
>>>>  instance, both in BSD and in Linux you can send whatever you want,
>>>>  but in BSD you can't receive anything that has a handler (like TCP
>>>>  and UDP)."
>>>>
>>>> So, first question:  Is the above comment actually true & accurate?
>>>
>>> Not for FreeBSD.
>> Are you saying that I can receive on a raw socket SCTP, TCP and UDP packets?
>
> No. I'm saying one can send/receive raw IP packets, whether they are SCTP,
> TCP, UDP or something else, by means of libdnet. It uses raw sockets and BPF
> internally but hides this complexity. nmap uses it just fine.
>
Saying "Not for FreeBSD" is needlessly confusing and not accurate. In
the common parlance "raw sockets" does not refer to libdnet, which is
not a part of the FreeBSD base system. You cannot use traditional raw
sockets on FreeBSD to receive traditional protocol packets. The only
way to do that in the base system is to use a BPF handle directly.

Matt


Re: Issues with bxe NIC

2018-02-01 Thread Matt Joras
On Thu, Feb 1, 2018 at 6:56 AM, Marius Halden <mariu...@lden.org> wrote:
> On Thu, Feb 1, 2018, at 15:17, Marius Halden wrote:
> [...]
> Is this a hardware or a driver issue?
This is one of many issues we have seen with the NetXtreme II device.
After a lot of back and forth (1.5 years+) with Qlogic, the issue was
never sufficiently root caused. This was with the help of instrumented
drivers, updated firmware, raw ASIC dumps, and hardware defect
analyses of the NICs exhibiting the issue.

The conclusion I've come to is that this is almost certainly not a
problem with the FreeBSD driver. The condition generally persists
across reboots. Often the problem would stop after the NIC was fully
re-seated within the chassis, so I would definitely try that. If that
doesn't work I do think the issue likely warrants an RMA of the device
from Qlogic. They should have a record of the issue internally.

Good luck.

Matt Joras


Re: Problem with igb NIC on fresh 11 Install

2017-09-30 Thread Matt Joras
On 09/30/2017 11:36, Lee Brown wrote:
> It looks like the driver is not passing packets from the VLAN layer to the
> NIC.  Switch counters verify this and netstat seems to indicate the same.
> ...
> root@rtr-net-r1: - # ifconfig igb0
> igb0: flags=8c02 metric 0 mtu 1500

igb0 looks to be OACTIVE, (though I'm not sure why all your A's are
swapped with R's). When it's OACTIVE it will drop packets and generally
behave as if the link is down. Does igb0 stay in OACTIVE indefinitely,
or does it go back to a normal state at any point?
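
For reference, the hex value after flags= can be decoded by hand. The sketch below assumes the interface flag bit values from FreeBSD's net/if.h and lists only a subset of them; the dictionary and function names are made up for this example:

```python
# Interface flag bits, assumed to match FreeBSD's <net/if.h> (subset only).
IFF_FLAGS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0008: "LOOPBACK",
    0x0010: "POINTOPOINT",
    0x0040: "RUNNING",
    0x0080: "NOARP",
    0x0100: "PROMISC",
    0x0200: "ALLMULTI",
    0x0400: "OACTIVE",
    0x0800: "SIMPLEX",
    0x8000: "MULTICAST",
}


def decode_if_flags(value):
    """Return the names of the known flag bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


print(decode_if_flags(0x8C02))
# ['BROADCAST', 'OACTIVE', 'SIMPLEX', 'MULTICAST']
```

Under that mapping, 8c02 has OACTIVE set and neither UP nor RUNNING.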

Matt



Re: Splitting Mellanox ConnectX-4 interface using breakout cables

2017-09-13 Thread Matt Joras
On Sep 13, 2017 6:43 AM, "David Horn"  wrote:

I was under the impression that these Mellanox 40G->4x10G breakout cables
were for the Mellanox switch side (not Mellanox NIC) to provide more
flexible utilization options for the switch.  I have never heard of doing
this from the 40G NIC (on any OS).

Please reply if you have seen this working on *any* OS from the Mellanox
40G NIC, as I am curious as well and have both ConnectX-3 and -4 HW.

-Dave

On Wed, Sep 13, 2017 at 7:00 AM, Andrey V. Elsukov 
wrote:

> Hi All,
>
> we are wondering, is it possible to use such configuration under
> FreeBSD? I.e. split one mce interface 40G => 4x10G or 100G => 4x25G?
>
> --
> WBR, Andrey V. Elsukov
>
>



I was also under the impression that they were exclusively used on the
switch for Mellanox. Mellanox's documentation seems to back this up:

"Important: The 40GbE split options is supported only on Mellanox
switches and not supported on Mellanox adapters (e.g. ConnectX-3) when
equipped with 40GbE ports. In case you wish to limit the 40GbE port on
the adapter to 10GbE you can use QSA or similar 40GbE-10GbE cables
(Refer to Mellanox.com - here)" [1]

That statement doesn't cover their newer NICs, though I don't see the
feature mentioned on their product sheets.

Matt

[1] https://community.mellanox.com/docs/DOC-1450


Request for comments/discussion on listen queue overflow logging

2017-08-03 Thread Matt Joras
Hello,

Some of you may have noticed a review I posted on Phabricator:
https://reviews.freebsd.org/D11725. The general problem this is trying
to solve is the lack of information about which application's listen
queue has overflowed. As mentioned in the review, there is no good way
to implement this where the message is currently logged, as it is at the
generic socket layer. The approach I take in the review is to construct
an additional, more verbose log message (including the TCP 4-tuple) at
the call point of sonewconn in the syncache code.

I'd appreciate any feedback on the review or suggestions of alternative
approaches. An approach I did think about, but discarded, was to add a
protocol function to struct protosw, something like int
pr_getsockdescr(struct socket *, char *, size_t) which would fill in a
"human-readable" description of the socket based on whatever is appropriate.

Thanks,

Matt Joras



Re: Sporadic TCP/RST sent to client

2017-06-27 Thread Matt Joras
On Tue, Jun 27, 2017 at 5:05 AM, Youssef  GHORBAL
 wrote:
>
> There is nothing in the 802.3ad that mandates stickiness of flows per NIC, 
> the only thing explicit is that hash algorithm needs to maintain packet 
> order. In this case, strictly speaking, it's : Packets do leave in "order" 
> and do arrive in "order".
I think the important point is that the ordering is not guaranteed in
this case, regardless of whether it's happening or not. As soon as you are
using a round-robin lagg on one end you've pretty much lost all
guarantees of ordering at the remote end. Unless the switch has some
way to know, which as Steinar noted is usually done through a
negotiated or statically-configured hash-based lagg, there's no way
for it to enforce the ordering you're expecting for proper behaviour.
So even if there was some notion of protocol ordering in netisr, the
fact that you're using round-robin on one endpoint opens up the
possibility for this kind of situation anyway.

Further, I would argue that round robin is not a valid 802.3ad/802.1AX
algorithm, per how it defines a frame distributor:

"This standard does not mandate any particular distribution
algorithm(s); however, any distribution algorithm shall ensure that,
when frames are received by a Frame Collector as specified in 5.2.3,
the algorithm shall not cause:
a) Misordering of frames that are part of any given conversation, or
b) Duplication of frames.

The above requirement to maintain frame ordering is met by ensuring
that all frames that compose a given conversation are transmitted on a
single link in the order that they are generated by the MAC Client;
hence, this requirement does not involve the addition (or
modification) of any information to the MAC frame, nor any buffering
or processing on the part of the corresponding Frame Collector in
order to reorder frames."
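
To make the difference concrete, here is a small Python sketch contrasting the two kinds of frame distributor. It is only an illustration: the CRC32-over-tuple hash is an arbitrary stand-in (not what lagg(4) or any particular switch uses), and the function names and addresses are hypothetical.

```python
import zlib
from itertools import count


def round_robin_distributor(num_links):
    """Pick links in strict rotation, ignoring which conversation a frame is in."""
    counter = count()
    return lambda flow: next(counter) % num_links


def hash_distributor(num_links):
    """Pick a link from a hash of the flow tuple, so one conversation
    always maps to the same link."""
    def pick(flow):
        key = "|".join(map(str, flow)).encode()
        return zlib.crc32(key) % num_links
    return pick


flow = ("198.51.100.7", 40000, "192.0.2.1", 8080)

rr = round_robin_distributor(2)
print([rr(flow) for _ in range(4)])           # [0, 1, 0, 1]: one flow, both links

hashed = hash_distributor(2)
print(len({hashed(flow) for _ in range(4)}))  # 1: the flow stays on a single link
```

Once consecutive frames of a single conversation leave on different links, nothing downstream is required to put them back in order, which is the situation described above.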

> Sure, I was just wondering if the FreeBSD network stack was built with the 
> fact that each flow needs to arrive on the same NIC and the system was 
> designed with this assumption in mind or not.
>
> I reported it here, thinking that maybe it's a subtle buggy corner case and
> maybe the community would be interested in knowing about it and maybe fixing it:
>
> - If the stack is working as expected and was built with the assumption that
> each incoming flow needs to stick to a NIC during its lifetime, maybe
> documentation needs to be more explicit regarding this situation. In that
> case I'll file a documentation enhancement bug report.
> - If the stack is misbehaving, maybe help the community identify the root
> cause and help fix it.
>
As far as I can tell, as Navdeep noted, there's no unexpected
behaviour in your case. "Flows" are a concept that the protocols, in
this case TCP, know about. The devices themselves (Ethernet cards)
usually have mechanisms to make packet delivery decisions based on flow
information (e.g. RSS hashing), but as far as I know that is generally
limited to a single port, so it doesn't really help in the general
case of a lagg.

Matt


Re: Sporadic TCP/RST sent to client

2017-06-26 Thread Matt Joras
Out of curiosity, what sort of lagg setup are you using that's causing
the TCP packets to be split across the two lagg interfaces?

Matt

On Mon, Jun 26, 2017 at 1:35 PM, Navdeep Parhar  wrote:
> On Thu, Jun 22, 2017 at 3:57 PM, Youssef  GHORBAL
>  wrote:
>> Hello,
>>
>> I'm having an issue with a FreeBSD 11 based system, sending 
>> sporadically TCP/RST to clients after initial TCP session correctly 
>> initiated.
>> The sequence goes this way :
>>
>> 1 Client -> Server : SYN
>> 2 Server -> Client : SYN/ACK
>> 3 Client -> Server : ACK
>> 4 Client -> Server : PSH/ACK (upper protocol data sending starts 
>> here)
>> 5 Server -> Client : RST
>>
>> - The problem happens sporadically, same client and same server can
>> communicate smoothly on the same service port. But from time to time
>> (hours, sometimes days) the previous sequence happens.
>> - The service running on server is not responsible for the RST sent. 
>> The service was deeply profiled and nothing happens to justify the RST.
>> - tcpdump on the server side assures that packet arrives timely 
>> ordered.
>> - the traffic is very light. Some TCP sessions per day.
>> - the server is connected using a lagg enslaving two cxgb interfaces.
>>
>> In my effort to diagnose the problem (try to have a reproducible
>> test case) I noticed that the issue is triggered most likely when those two
>> conditions are met :
>> - the ACK (in step 3) and the PSH/ACK (in step 4) arrive on 
>> different lagg NICs.
>> - the timing between those two packets is sub 10 microseconds.
>>
>> When searching the interwebs I came across a strangely similar issue 
>> reported here 7 years ago :
>> 
>> https://lists.freebsd.org/pipermail/freebsd-net/2010-August/026029.html
>>
>> (The OP seemed to have resolved his issue changing the netisr policy 
>> from direct to hybrid. but no reference of laggs being used)
>>
>> I'm pretty sure that I'm hitting some race condition, a scenario
>> where due to multithreading the PSH/ACK is somehow handled before the ACK,
>> making the kernel raise a TCP/RST since the initial TCP handshake didn't
>> finish yet.
>>
>> I've read about netisr work and I was under the impression that even
>> if it's SMP enabled it was made to keep protocol ordering.
>>
>> What's the expected behaviour in this scenario on the netisr side ?
>> How can I push the investigation further ?
>
> I think you've already figured out the situation here -- the PSH/ACK is likely
> being handled before the ACK for the SYN because they arrived on different
> interfaces.  There is nothing in netisr dispatch that will maintain protocol
> ordering in this case.
>
> Regards,
> Navdeep


Request for reviewers for vlan(4) locking improvements

2017-06-26 Thread Matt Joras
Hello,

I am looking for people to give feedback on a review I've opened to
improve the locking in vlan(4). Anyone who's done a fair amount of
destroying vlan interfaces on live systems has probably run into
panics in if_vlan. This is because there is no real synchronization to
prevent a vlan interface from being destroyed while there are mbufs in
the network going through its functions. Isilon's customers have hit
panics like this, so I've reworked the locking to make destroying
vlans safe on live systems, and fixed every instance of unsafe access
I could find.

If anyone has an interest in this work please review the revision:
https://reviews.freebsd.org/D11370

Thanks,
Matt Joras


Re: mbuf_jumbo_9k & iSCSI failing

2017-06-26 Thread Matt Joras
On Mon, Jun 26, 2017 at 6:36 AM, Andrey V. Elsukov  wrote:
> On 26.06.2017 16:29, Ben RUBSON wrote:
>>
>>> On 26 Jun 2017, at 15:25, Andrey V. Elsukov  wrote:
>>>
>>> On 26.06.2017 16:27, Ben RUBSON wrote:
>>>>
>>>>> On 26 Jun 2017, at 15:13, Andrey V. Elsukov  wrote:
>>>>>
>>>>> I think it is not mlxen specific problem, we have the same symptoms with
>>>>> ixgbe(4) driver too. To avoid the problem we have patches that are
>>>>> disable using of 9k mbufs, and instead only use 4k mbufs.
>>>>
>>>> Interesting feedback Andrey, thank you !
>>>> The problem may be then "general".
>>>> So you still use large MTU (>=9000) but only allocating 4k mbufs, as a
>>>> workaround ?
>>>
>>> Yes.
>>
>> Is it a kernel patch or a driver/ixgbe patch ?
>
> I attached it.
>
> --
> WBR, Andrey V. Elsukov

I didn't think that ixgbe(4) still suffered from this problem, and we
use it in the same situations rstone mentioned above. Indeed, ixgbe(4)
doesn't presently suffer from this problem (you can see that in your
patch, as it is only effectively changing the other drivers), though
it used to. It looks like this was first fixed in r280182.
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Fix dangling pointer left by ifmedia_removeall

2017-01-12 Thread Matt Joras
I thought I'd bring this review to people's attention:
https://reviews.freebsd.org/D9164

The problem is fairly straightforward, as is the fix. ixgbe(4) uses
ifmedia_removeall in a way that other drivers do not, which ends up
leaving a dangling pointer in the ifmedia struct leading to incorrect
information displayed by ifconfig(8).

Matt


Re: lagg(4): LOR, deadlock and panic

2016-06-16 Thread Matt Joras
On Tue, 2016-06-14 at 09:26 -0600, Alan Somers wrote:
> 
> I don't know the best answer either.  But while you're in there, are
> you interested in fixing any other lagg panics too?  I've written
> some
> ATF torture tests for lagg, but I haven't checked them into head yet
> because most of them quickly panic.
> 
> -Alan
We run into if_lagg and if_vlan panics at $WORK all the time in our
automation. I've fixed the if_vlan panics and I'm hoping to update this
review: https://reviews.freebsd.org/D5825 soon with something that
accommodates drivers sleeping in the vlan_*config event handlers (which
involves having a global rmlock _and_ sx in if_vlan).

I was planning on doing a similar audit/fixing of if_lagg soon when I
get the chance.

Matt Joras

Re: Question on sysctl tree handling

2016-04-11 Thread Matt Joras
Expanding on what mmacy said... I don't think the benefits of "easy
reconfiguration" are worth the headaches you're going to potentially
run into in production.

bxe(4) used to do this, and it caused us a lot of problems (i.e. panics)
at $DAY_JOB. For example, if a lagg was on top of bxe and then you
downed bxe you could very easily hit a use-after-free, since bxe freed
its rings while if_lagg was trying to transmit a packet.

Matt Joras

On Mon, Apr 11, 2016 at 2:03 PM, K. Macy <km...@freebsd.org> wrote:
> You do understand that init needs to be run every time interface
> settings are changed (TSO / PROMISC / CSUM/ etc)? Reallocating queues
> and interrupts every time is fragile (long running systems can run low
> on contiguous memory) and, in the common case that you're not actually
> changing the number, gratuitous.
>
> Cheers.
> -M
>
> On Fri, Apr 8, 2016 at 2:56 PM, Jack Vogel <jfvo...@gmail.com> wrote:
>> LOL, why does it seem that as soon as I ask the answer hits me in the nose
>> :)
>>
>> I found the sysctl_ctx_free call, sorry for the noise
>>
>> Jack
>>
>>
>> On Fri, Apr 8, 2016 at 2:51 PM, Jack Vogel <jfvo...@gmail.com> wrote:
>>
>>>
>>> I have a driver design where the queue/ring/irq layout is done in init
>>> rather
>>> than in attach, allowing easy reconfiguration. What I'm not sure about is
>>> how to handle the sysctl tree during a reinit, I don't see a procedure to
>>> free up things so I can restructure :(
>>>
>>> Am I missing something, any pointers or suggestions appreciated.
>>>
>>> Thanks,
>>>
>>> Jack
>>>
>>>


if_vlan locking fixes

2016-04-04 Thread Matt Joras
Hello,

At Isilon we end up creating/destroying vlan interfaces a lot more
than users typically do. Unfortunately this led to a myriad of panics
due to locking insufficiencies in if_vlan. I've fixed these internally
and I've submitted a review of the changes. I would appreciate if
anyone would like to review them.

https://reviews.freebsd.org/D5825

The essence of the changes is to make the global vlan lock an
rmlock(9) and expand its scope to synchronize reading/using ifvlans
with destruction events as well as ensure exclusivity in certain
configuration paths.

Thanks,
Matt Joras