Also, while searching the net for the "emX taskq" solution, I read that a few people are successfully running the modified em driver from Yandex.
This is their README:
RX queue is being processed w/more than one thread. Use "sysctl dev.em.X.rx_kthreads" to alter number of threads.
TX interrupts have been removed because they are not actually necessary. That's why the interrupt rate has been reduced by at least half.
TX queue cleaning moved to a separate kthread.
em_start uses mtx_trylock instead of mtx_lock. That's why em_start locks less.
+ RX queues' priority may be altered thru sysctl. The system seems to be more stable if RX is scheduled w/less priority.
+ RX interrupt stays masked if there is no thread ready to catch the interrupt. This hint reduces context switching under load.
NOTES:
1) do not forget to do "sysctl net.isr.direct=1" if you want to see more SMP.
2) turn off polling. We didn't touch this part of code yet.

So the question is, should I go for it? Will it help me in any way? I mean, if I have 2 Xeon CPUs and Hyper Threading enabled, I can actually divide it into 4 threads, right? And the biggest question is: will I be able to do it on pfSense and how would I go about it?
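
The thread-count arithmetic behind that question can be sketched in a few lines of Python. This is only an illustration: it assumes two single-core Xeons with Hyper-Threading enabled, the helper names are made up, and the only thing taken from the README is the dev.em.X.rx_kthreads sysctl name, which exists only with the modified Yandex driver.

```python
def logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Number of logical CPUs the scheduler would see."""
    return sockets * cores_per_socket * threads_per_core


def rx_kthreads_command(unit: int, threads: int) -> str:
    """Build the sysctl invocation named in the README for interface em<unit>.

    Hypothetical helper: it only formats a string and runs nothing.
    """
    if unit < 0 or threads < 1:
        raise ValueError("unit must be >= 0 and threads must be >= 1")
    return f"sysctl dev.em.{unit}.rx_kthreads={threads}"


if __name__ == "__main__":
    # Assumption: 2 sockets, 1 core each, 2 hardware threads per core.
    n = logical_cpus(sockets=2, cores_per_socket=1, threads_per_core=2)
    print(n)
    print(rx_kthreads_command(0, n))
```

Note that Hyper-Threading siblings share one core's execution units, so four logical CPUs should not be read as four times the packet-processing capacity.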

Thanks,

Lenny.


On Mar 16, 2009 5:37pm, Scott Ullrich <[email protected]> wrote:
On Mon, Mar 16, 2009 at 7:14 AM, Lenny <[email protected]> wrote:

> Hi again,
>
> So I did replace the server; I now have an IBM x336 instead of the x335. The
> NIC is identical, but not the same one.
> First of all, Chris, you were absolutely right - it was some sort of
> hardware compatibility glitch, as with this server I'm seeing
> completely different behavior. I started seeing interrupts taking some of the
> CPU (not too much though - about 8-10% when loaded), and I don't see an emX
> taskq at all now.
> But the thing is - the problem is still there - I had a relatively high load
> this weekend (15kpps is my high load, remember?) and once again I got some
> packet loss and a slow response time from the website.
>
> A couple of things I noticed though:
> When it happened, the quality RRD graph showed a spike of about 35-40ms (from the
> usual 1-2). It was at that time that I checked the "Disable Hardware Checksum
> Offloading" option, and it was back to normal within seconds. But I saw it
> climb a few other times afterwards... So maybe it was just a coincidence.
> Also, if I check the interface status when there is normal traffic, there
> are no errors (well, no additional errors), but the minute the load hits,
> I start seeing the counters climbing. On both interfaces, but only on
> the "In"; the "Out" is "0".
>
> And one last thing, I was thinking about maybe forcing the negotiation
> through the config.xml. So I went through it and I saw this:
>
> em0
> 100
> Mb
> XXXX
> 28
> YYYY
>
> em1
> OPTICAL
> ZZZZ
> 29
>
> Is this normal, I mean regarding the 100Mb bandwidth? I have everything set
> to autonegotiation and the interface status shows:
> Media 1000baseTX on both, so I assume I shouldn't touch it.
> But the 100Mb confuses me.
> Anyhow, this x336 server is a loaner and I have to return it or buy it
> within a day or two, so if you have any thoughts at all, please.

Now you may be hitting a sysctl limit. Quoting BillM from earlier in
this thread:

"Check sysctl net.inet.ip.intr_queue_drops and raise
net.inet.ip.intr_queue_maxlen if it's non-zero.

Also check net.isr.drop.

The Intel driver also has some debugging under the dev.em sysctl, I believe."
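
The checks BillM describes could be scripted roughly as below. This is a minimal sketch in Python, assuming the usual "name: value" lines that sysctl prints; the function names and the sample numbers are made up for illustration and do not come from any real box.

```python
def parse_sysctl(text: str) -> dict:
    """Parse 'name: value' lines into a dict, keeping only integer values."""
    values = {}
    for line in text.splitlines():
        name, sep, raw = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        try:
            values[name.strip()] = int(raw.strip())
        except ValueError:
            continue  # skip non-numeric values
    return values


def advise(values: dict) -> list:
    """Turn the counters named in the thread into short hints."""
    notes = []
    if values.get("net.inet.ip.intr_queue_drops", 0) > 0:
        notes.append("raise net.inet.ip.intr_queue_maxlen")
    if values.get("net.isr.drop", 0) > 0:
        notes.append("net.isr.drop is non-zero")
    return notes


# Hypothetical output, e.g. captured from:
#   sysctl net.inet.ip.intr_queue_drops net.inet.ip.intr_queue_maxlen net.isr.drop
SAMPLE = """\
net.inet.ip.intr_queue_drops: 1432
net.inet.ip.intr_queue_maxlen: 50
net.isr.drop: 0
"""

if __name__ == "__main__":
    for note in advise(parse_sysctl(SAMPLE)):
        print(note)
```

Running the same check before and during a load spike would show whether the drop counters climb together with the interface "In" errors.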



Scott



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Commercial support available - https://portal.pfsense.org


