Ah, my bad...  I kind of glossed over the CIFS bit.  ;-)
Have you compared a packet capture of client traffic while it's on the LAN
performing at 1 Gbps with a capture of the same traffic through pfSense?
The TCP Window Size could be a red herring...?
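One thing worth checking while comparing those captures: the Window Size field shown per-packet is only the raw 16-bit value. If window scaling (RFC 7323) was negotiated on the SYN, the effective window is that value shifted left by the scale factor, so a "~1444" on the wire may or may not be the real window. A minimal sketch of the arithmetic (plain Python, header offsets per the TCP spec; no capture library assumed):

```python
import struct

def tcp_window(tcp_header: bytes, window_scale: int = 0) -> int:
    """Effective TCP receive window from a raw TCP header.

    The on-the-wire Window field is the 16-bit value at bytes 14-15;
    the effective window is that value << scale, where scale is the
    Window Scale option negotiated on the SYN (RFC 7323).
    """
    raw = struct.unpack("!H", tcp_header[14:16])[0]
    return raw << window_scale

# A header advertising 1444 with a scale factor of 5 is really ~45 KB:
hdr = bytearray(20)
hdr[14:16] = struct.pack("!H", 1444)
print(tcp_window(bytes(hdr)))                  # 1444 (unscaled)
print(tcp_window(bytes(hdr), window_scale=5))  # 46208
```

The scale factor itself only appears in the SYN/SYN-ACK, so the capture has to include the handshake to interpret later packets correctly.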


On Thu, Nov 6, 2014 at 5:23 PM, Adam Thompson <[email protected]> wrote:

> Ok, recap again...
> - this affects multiple protocols, not just NFS. I've now confirmed it
> affects SSH as well.
> - this only occurs when the server is behind pfSense and the client is on
> the "outside" of the firewall.
> - this problem does not occur in the other direction through pfSense
> (LAN->WAN).
> - to repeat myself, NFS works fine at ~1 Gbps between the same client and
> server without pfSense in the middle.
>
> Ergo, I conclude it's something pfSense-related. Haven't had a chance to
> turn off scrub yet.
> -Adam
>
>
> On November 6, 2014 5:12:59 PM CST, Sean <[email protected]> wrote:
>>
>> I strongly recommend not tinkering with your MTU setting and instead
>> correcting the setting on the server side...
>>
>> I think you should start reading here:
>> http://nfs.sourceforge.net/nfs-howto/ar01s05.html
>>
>> Particularly this section:
>>
>>> 5.3. Overflow of Fragmented Packets
>>>
>>> Using an *rsize* or *wsize* larger than your network's MTU (often set
>>> to 1500, in many networks) will cause IP packet fragmentation when using
>>> NFS over UDP. IP packet fragmentation and reassembly require a significant
>>> amount of CPU resource at both ends of a network connection. In addition,
>>> packet fragmentation also exposes your network traffic to greater
>>> unreliability, since a complete RPC request must be retransmitted if a UDP
>>> packet fragment is dropped for any reason. Any increase of RPC
>>> retransmissions, along with the possibility of increased timeouts, are the
>>> single worst impediment to performance for NFS over UDP.
>>>
>>> Packets may be dropped for many reasons. If your network topography is
>>> complex, fragment routes may differ, and may not all arrive at the Server
>>> for reassembly. NFS Server capacity may also be an issue, since the kernel
>>> has a limit of how many fragments it can buffer before it starts throwing
>>> away packets. With kernels that support the /proc filesystem, you can
>>> monitor the files /proc/sys/net/ipv4/ipfrag_high_thresh and
>>> /proc/sys/net/ipv4/ipfrag_low_thresh. Once the number of unprocessed,
>>> fragmented packets reaches the number specified by *ipfrag_high_thresh* (in
>>> bytes), the kernel will simply start throwing away fragmented packets until
>>> the number of incomplete packets reaches the number specified by
>>> *ipfrag_low_thresh*.
>>>
>>> Another counter to monitor is *IP: ReasmFails* in the file
>>> /proc/net/snmp; this is the number of fragment reassembly failures. If
>>> it goes up too quickly during heavy file activity, you may have a problem.
>>>
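To put rough numbers on the section quoted above, here is a sketch of the fragment arithmetic (assuming IPv4 headers with no options) together with a small parser for the Ip line format of /proc/net/snmp, where the ReasmFails counter lives:

```python
import math

IP_HEADER = 20   # bytes, IPv4 with no options (assumption)
UDP_HEADER = 8   # bytes

def udp_fragments(payload: int, mtu: int = 1500) -> int:
    """How many IP fragments one UDP datagram of `payload` bytes needs.

    Fragment offsets are in 8-byte units, so every fragment but the
    last carries a multiple of 8 bytes of data (1480 at MTU 1500).
    """
    total = payload + UDP_HEADER              # UDP header rides in fragment 1
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(total / per_fragment)

def reasm_fails(snmp_text: str) -> int:
    """Pull Ip.ReasmFails out of the contents of /proc/net/snmp.

    The file carries pairs of lines per protocol: one line of field
    names and one of values, both prefixed with the protocol name.
    """
    ip_lines = [line.split() for line in snmp_text.splitlines()
                if line.startswith("Ip:")]
    names, values = ip_lines[0], ip_lines[1]
    return int(values[names.index("ReasmFails")])

# A 32 KB NFS-over-UDP read becomes ~23 on-the-wire fragments; losing
# any single one forces retransmission of the entire RPC.
print(udp_fragments(32768))  # 23
```

Reading the live counter on the server is then just `reasm_fails(open('/proc/net/snmp').read())`.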
>> Since this is not an NFS support list I suggest you let this die here
>> lest you incur the spite of the moderators. ;-)
>>
>>
>>
>> On Thu, Nov 6, 2014 at 4:58 PM, Sean <[email protected]> wrote:
>>
>>> Not a TCP expert but the MTU is nearly always 1500 (or just under) hence
>>> your limit.  Sending packets greater than the MTU will lead to
>>> fragmentation.  Fragmentation leads to re-transmissions (depends on the
>>> do-not-fragment bit?) and performance problems.  Performance problems lead
>>> to frustration and anger.  Anger leads to the dark side of the Force.
>>>
>>> You can increase the MTU to like 9000 or something if you enable jumbo
>>> frames but you'd need to support it across the board (pfSense, routers,
>>> switches?, servers, etc.).  It's a hassle probably not worth the effort in
>>> terms of gains.  Some people do it as a means to increase iSCSI traffic
>>> performance but others say the throughput gain is dubious at best.  I would
>>> make sure some doofus didn't enable jumbo frames on your NFS server and if
>>> so then turn it off and check the MTU setting in the network stack on the
>>> NFS server as well.
>>>
>>> I may not know what the hell I'm talking about though so someone else
>>> can feel free to jump in and tell me what an idiot I am.
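On the "check the MTU setting in the network stack" point: on Linux the kernel's idea of an interface's MTU can be read directly with the SIOCGIFMTU ioctl. A sketch (Linux-only; the struct offsets assume the usual ifreq layout, and the interface name is just an example):

```python
import fcntl
import socket
import struct

SIOCGIFMTU = 0x8921  # from linux/sockios.h: get interface MTU

def get_mtu(ifname: str) -> int:
    """Return the MTU of `ifname` via ioctl (Linux only)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # struct ifreq: 16-byte interface name, then a union whose
        # first int member holds ifr_mtu for this ioctl.
        ifreq = struct.pack("256s", ifname.encode()[:15])
        result = fcntl.ioctl(s.fileno(), SIOCGIFMTU, ifreq)
        return struct.unpack("i", result[16:20])[0]

print(get_mtu("lo"))  # loopback, typically 65536 on Linux
```

If a doofus did enable jumbo frames, this would show 9000 (or similar) on the server's NIC.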
>>>
>>>
>>>
>>> On Wed, Nov 5, 2014 at 6:47 PM, Adam Thompson <[email protected]>
>>> wrote:
>>>
>>>> Problem: really, really bad performance (<10Mbps) on both NFS (both tcp
>>>> and udp) and CIFS through pfSense.
>>>>
>>>> Proximate cause: running a packet capture on the Client shows one
>>>> smoking gun - the TCP window size on packets sent from the client is always
>>>> ~1444 bytes.  Packets arriving from the server show a TCP window size of
>>>> ~32k.
>>>>
>>>>
>>>> The Network:
>>>>                     +------+
>>>>                     |Router|
>>>>                     +--+---+
>>>>                        |
>>>>                 --+----+----+--
>>>>                   |         |
>>>>                +--+---+  +--+----+
>>>>                |Client|  |pfSense|
>>>>                +------+  +--+----+
>>>>                             |
>>>>                           --+---+--
>>>>                                 |
>>>>                              +--+---+
>>>>                              |Server|
>>>>                              +------+
>>>>
>>>>     - Client and pfSense both have Router as default gateway.
>>>>     - pfSense has custom outbound NAT rules preventing NAT between
>>>> Server subnet and Client subnet, but NAT'ing all other outbound
>>>> connections.
>>>>     - Router has static route pointing to Server subnet via pfSense.
>>>>
>>>> Hardware:
>>>>     Router is an OpenBSD system (a CARP cluster, actually) running on
>>>> silly-overpowered hardware.
>>>>     Client is actually multiple systems, ranging from laptops to
>>>> high-end servers.
>>>>     Server is a Xeon E3-1230v3 running Linux, exporting a filesystem
>>>> via both NFS (v2, v3 & v4) and CIFS (samba).
>>>>     pfSense is v2.1.5 (i386) on a dual P-III 1.1GHz, CPU usage
>>>> typically peaks at around 5%.
>>>>
>>>>
>>>> Performance on local Server subnet (i.e. from a same-subnet client) is
>>>> very good on all protocols, nearly saturating the gigabit link.
>>>> Traffic outbound from the server subnet to the internet (via Router)
>>>> moves at a decent pace; this firewall can typically handle ~400 Mbps
>>>> without any trouble, and IIRC synthetic benchmarks previously showed it
>>>> peaking at over 800 Mbps.
>>>>
>>>> Based on the FUBAR TCP window sizes I've observed, I assume pfSense is
>>>> doing something to my TCP connections... but why are only the non-NAT'd
>>>> connections affected?  I know there's an option to disable pf scrub, but
>>>> that's only supposed to affect NFSv3 (AFAIK), and this also affects
>>>> NFSv4-over-TCP and CIFS.
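One sanity check on whether a ~1444-byte window alone could produce this: TCP keeps at most one receive window in flight per round trip, so throughput is capped near window/RTT. With an assumed (not measured) 1 ms RTT, that lands right in the observed range:

```python
def window_limited_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput when the receive window is the
    bottleneck: at most one full window delivered per round trip."""
    return window_bytes * 8 / rtt_seconds

# A ~1444-byte window at an assumed 1 ms RTT caps out near 11.5 Mbps,
# the same ballpark as the observed <10 Mbps.
print(window_limited_bps(1444, 0.001) / 1e6)  # ~11.55 Mbps
```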
>>>>
>>>> --
>>>> -Adam Thompson
>>>>  [email protected]
>>>>
>>>> _______________________________________________
>>>> List mailing list
>>>> [email protected]
>>>> https://lists.pfsense.org/mailman/listinfo/list
>>>>
>>>
>>>
>>
>>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>