Ok, recap again...
- this affects multiple protocols, not just NFS.  I've now confirmed it affects 
SSH as well.
- this only occurs when the server is behind pfSense and the client is on the 
"outside" of the firewall.
- this problem does not occur in the other direction through pfSense (LAN->WAN).
- to repeat myself, NFS works fine at ~1gbps between the same client and server 
without pfSense in the middle.

Ergo, I conclude it's something pfSense-related.  Haven't had a chance
to turn off scrub yet.
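For what it's worth, the ~1444-byte client window from the original capture is by itself consistent with the <10Mbps figure: TCP can keep at most one receive window in flight per round trip, so throughput is capped at roughly window/RTT.  A quick sanity check (the RTT values here are guesses, not measurements):

```python
# Back-of-the-envelope: a TCP sender can have at most one receive window
# of unacknowledged data in flight per round trip, so
#   throughput <= window_bytes * 8 / RTT.
# The ~1444-byte window seen in the captures caps throughput near the
# observed <10 Mbps; the 1 ms RTT below is an assumption for a LAN hop.
def max_tcp_throughput_mbps(window_bytes, rtt_seconds):
    return window_bytes * 8 / rtt_seconds / 1e6

print(round(max_tcp_throughput_mbps(1444, 0.001), 1))   # tiny window  -> ~11.6 Mbps
print(round(max_tcp_throughput_mbps(32768, 0.001), 1))  # ~32k window  -> ~262.1 Mbps
```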
-Adam

On November 6, 2014 5:12:59 PM CST, Sean <[email protected]> wrote:
>I strongly recommend not tinkering with your MTU setting and instead
>correcting the setting on the server side...
>
>I think you should start reading here:
>http://nfs.sourceforge.net/nfs-howto/ar01s05.html
>
>Particularly this section:
>
>> 5.3. Overflow of Fragmented Packets
>>
>> Using an *rsize* or *wsize* larger than your network's MTU (often set
>> to 1500, in many networks) will cause IP packet fragmentation when
>> using NFS over UDP. IP packet fragmentation and reassembly require a
>> significant amount of CPU resource at both ends of a network
>> connection. In addition, packet fragmentation also exposes your
>> network traffic to greater unreliability, since a complete RPC request
>> must be retransmitted if a UDP packet fragment is dropped for any
>> reason. Any increase of RPC retransmissions, along with the
>> possibility of increased timeouts, are the single worst impediment to
>> performance for NFS over UDP.
>>
>> Packets may be dropped for many reasons. If your network topography is
>> complex, fragment routes may differ, and may not all arrive at the
>> Server for reassembly. NFS Server capacity may also be an issue, since
>> the kernel has a limit of how many fragments it can buffer before it
>> starts throwing away packets. With kernels that support the /proc
>> filesystem, you can monitor the files
>> /proc/sys/net/ipv4/ipfrag_high_thresh and
>> /proc/sys/net/ipv4/ipfrag_low_thresh. Once the number of unprocessed,
>> fragmented packets reaches the number specified by
>> *ipfrag_high_thresh* (in bytes), the kernel will simply start throwing
>> away fragmented packets until the number of incomplete packets reaches
>> the number specified by *ipfrag_low_thresh*.
>>
>> Another counter to monitor is *IP: ReasmFails* in the file
>> /proc/net/snmp; this is the number of fragment reassembly failures. If
>> it goes up too quickly during heavy file activity, you may have a
>> problem.
>>
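The *ReasmFails* counter mentioned above is easy to pull out of /proc/net/snmp programmatically.  A sketch (the sample text below is made up for illustration; the parsing assumes the standard layout where a header line and a value line share the same "Ip:" prefix):

```python
# Watch a counter from /proc/net/snmp.  The file pairs a header line
# ("Ip: Forwarding DefaultTTL ... ReasmFails ...") with a value line
# ("Ip: 1 64 ... 10 ...") for each protocol.
def snmp_counter(text, proto, field):
    """Extract one named counter from /proc/net/snmp-style text."""
    lines = [l for l in text.splitlines() if l.startswith(proto + ":")]
    header = lines[0].split()[1:]   # field names, minus the "Ip:" tag
    values = lines[1].split()[1:]   # matching values
    return int(dict(zip(header, values))[field])

# Made-up sample data, just to show the format:
sample = (
    "Ip: Forwarding DefaultTTL ReasmReqds ReasmOKs ReasmFails\n"
    "Ip: 1 64 5200 5190 10\n"
)
print(snmp_counter(sample, "Ip", "ReasmFails"))  # -> 10

# On a live Linux box you'd read the real file instead:
# snmp_counter(open("/proc/net/snmp").read(), "Ip", "ReasmFails")
```

If that number climbs during an NFS transfer, fragments are being dropped on reassembly, which matches the failure mode the HOWTO describes.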
>Since this is not an NFS support list, I suggest you let this die here
>lest you incur the spite of the moderators. ;-)
>
>
>
>On Thu, Nov 6, 2014 at 4:58 PM, Sean <[email protected]> wrote:
>
>> Not a TCP expert but the MTU is nearly always 1500 (or just under),
>> hence your limit.  Sending packets greater than the MTU will lead to
>> fragmentation.  Fragmentation leads to re-transmissions (depends on
>> the do-not-fragment bit?) and performance problems.  Performance
>> problems lead to frustration and anger.  Anger leads to the dark side
>> of the force.
>>
>> You can increase the MTU to something like 9000 if you enable jumbo
>> frames, but you'd need to support it across the board (pfSense,
>> routers, switches?, servers, etc.).  It's a hassle that's probably not
>> worth the effort in terms of gains.  Some people do it as a means to
>> increase iSCSI traffic performance, but others say the throughput gain
>> is dubious at best.  I would make sure some doofus didn't enable jumbo
>> frames on your NFS server, and if so, turn it off and check the MTU
>> setting in the network stack on the NFS server as well.
>>
>> I may not know what the hell I'm talking about, though, so someone
>> else can feel free to jump in and tell me what an idiot I am.
>>
>>
>>
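A quick back-of-the-envelope on the fragmentation Sean describes (the sizes here are illustrative assumptions: an rsize of 8192 on a stock 1500-byte Ethernet MTU, 20-byte IP header, 8-byte UDP header):

```python
import math

# Rough fragment count for an NFS-over-UDP reply.  Each non-final IP
# fragment carries (MTU - IP header) bytes of the UDP datagram, rounded
# down to a multiple of 8 because fragment offsets are in 8-byte units.
def ip_fragments(udp_payload, mtu=1500, ip_hdr=20, udp_hdr=8):
    """Number of IP fragments a UDP datagram of `udp_payload` bytes needs."""
    datagram = udp_payload + udp_hdr      # the UDP datagram being fragmented
    per_frag = mtu - ip_hdr               # bytes carried per fragment
    per_frag -= per_frag % 8              # offsets count in 8-byte units
    return math.ceil(datagram / per_frag)

print(ip_fragments(8192))   # rsize=8192 reply -> 6 fragments at MTU 1500
print(ip_fragments(1400))   # fits in one packet -> 1
```

Losing any one of those six fragments means the whole RPC gets retransmitted, which is exactly the penalty the NFS HOWTO warns about.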
>> On Wed, Nov 5, 2014 at 6:47 PM, Adam Thompson <[email protected]>
>> wrote:
>>
>>> Problem: really, really bad performance (<10Mbps) on both NFS (both
>>> tcp and udp) and CIFS through pfSense.
>>>
>>> Proximate cause: running a packet capture on the Client shows one
>>> smoking gun - the TCP window size on packets sent from the client is
>>> always ~1444 bytes.  Packets arriving from the server show a TCP
>>> window size of ~32k.
>>>
>>>
>>> The Network:
>>>                     +------+
>>>                     |Router|
>>>                     +--+---+
>>>                        |
>>>                 --+----+----+--
>>>                   |         |
>>>                +--+---+  +--+----+
>>>                |Client|  |pfSense|
>>>                +------+  +--+----+
>>>                             |
>>>                           --+---+--
>>>                                 |
>>>                              +--+---+
>>>                              |Server|
>>>                              +------+
>>>
>>>     - Client and pfSense both have Router as default gateway.
>>>     - pfSense has custom outbound NAT rules preventing NAT between
>>>       Server subnet and Client subnet, but NAT'ing all other
>>>       outbound connections.
>>>     - Router has static route pointing to Server subnet via pfSense.
>>>
>>> Hardware:
>>>     Router is an OpenBSD system (a CARP cluster, actually) running
>>> on silly-overpowered hardware.
>>>     Client is actually multiple systems, ranging from laptops to
>>> high-end servers.
>>>     Server is a Xeon E3-1230v3 running Linux, exporting a filesystem
>>> via both NFS (v2, v3 & v4) and CIFS (samba).
>>>     pfSense is v2.1.5 (i386) on a dual P-III 1.1GHz; CPU usage
>>> typically peaks at around 5%.
>>>
>>>
>>> Performance on local Server subnet (i.e. from a same-subnet client)
>is
>>> very good on all protocols, nearly saturating the gigabit link.
>>> Traffic outbound from the server subnet to the internet (via Router)
>>> moves at a decent pace, this firewall can typically handle ~400Mbps
>without
>>> any trouble, IIRC synthetic benchmarks previously showed it can peak
>at
>>> over 800Mbps.
>>>
>>> Based on the FUBAR TCP window sizes I've observed, I assume pfSense
>is
>>> doing something to my TCP connections... but why are only the
>non-NAT'd
>>> connections affected?  I know there's an option to disable pf scrub,
>but
>>> that's only supposed to affect NFSv3 (AFAIK), and this also affects
>>> NFSv4-over-TCP and CIFS.
>>>
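One hypothesis worth checking against the captures (an assumption on my part, not a diagnosis): TCP window scaling multiplies the 16-bit window field by 2^scale, negotiated via a SYN option, and scrub-style middlebox rewriting that loses the option in one direction makes both ends fall back to the raw field.  The arithmetic:

```python
# Hypothetical illustration of RFC 1323/7323 window scaling.  The scale
# factor (7 here) is an assumed value, chosen only to show the effect:
# with scaling intact the advertised window is raw << scale; if a
# middlebox strips the SYN option, the raw 16-bit value is used as-is.
def effective_window(raw_window, scale, scaling_negotiated):
    return raw_window << scale if scaling_negotiated else raw_window

print(effective_window(1444, 7, True))    # scaling intact  -> 184832 bytes
print(effective_window(1444, 7, False))   # option stripped -> 1444 bytes
```

If the client's SYN carried a window-scale option but the server's SYN-ACK (or vice versa) arrives without it, a window that the sender thinks is large would look like the tiny ~1444 bytes seen in the capture.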
>>> --
>>> -Adam Thompson
>>>  [email protected]
>>>
>>> _______________________________________________
>>> List mailing list
>>> [email protected]
>>> https://lists.pfsense.org/mailman/listinfo/list
>>>
>>
>>
>
>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
