Hello Yan,
It is very difficult to debug with only what you have described. What
is your setup? (Running PacketFence on a Raspberry Pi is not the same
thing as running it on a 32-CPU server.)
rlm_perl is only used to select the correct chroot based on the realm,
so it is something really fast.
First, check the RADIUS audit log, where you can see how many seconds
each request took to process (anything more than 1 is not normal and
means there is an issue).
After that, check the RADIUS authentication latency (around 15-20 ms
is OK) and also the NTLM call timing to see whether the AD is fast
enough.
Lastly, check the statsd graphs in Graphite; they will show you how
long it takes to complete, for example, an LDAP search.
As I said, if you don't want to beat around the bush, we offer support.
Regards
Fabrice
On 2018-01-31 at 20:39, Yan wrote:
Hi Fabrice,
I mean that the rlm_perl module takes too much time processing requests
and makes RADIUS very slow.
And I see: there is no need to log in, I only need to open
mgmt_ip:9000. But which graphs can show the cause of the issue?
Today we ran a stress test at 50 QPS (pf + AD authentication) and
found that FreeRADIUS in pf crashed every time, with symptoms very
similar to the issue we met recently. We tried adjusting the parameters
below and the result was always the same: FreeRADIUS crashed within
about 2 minutes. First it became slow, then it crashed and restarted,
and then we got "No EAP session match xxxxxx" and nearly all requests
were rejected. It is hard to believe that 50 QPS can DDoS
FreeRADIUS... So, any configuration suggestions?
We changed the parameters below, but the result was the same:
Parameters in radiusd.conf before the change:
1. max_request_time = 10
2. cleanup_delay = 5
3. max_requests = 20000
4. reject_delay = 1
5. max_servers = 512
--------
Parameters in radiusd.conf after the change:
1. max_request_time = 20
2. cleanup_delay = 10
3. max_requests = 1280000
4. reject_delay = 1
5. max_servers = 512
------------------ Original ------------------
*From:* Fabrice Durand <[email protected]>
*Date:* Feb 1, 2018 07:43
*To:* Yan <[email protected]>
*Subject:* Re: [PacketFence-users] All authentication failed with
error "No EAP session matching state xxxx"
Hello Yan,
There is no username or password.
Also, what is the doperl module?
Fabrice
On 2018-01-31 at 09:20, Yan wrote:
Hi Fabrice,
I have never logged in to the graph GUI. What username and password
does it use? I tried the admin GUI account, but it was rejected.
BTW, it seems there is a global lock in the doperl module, and
according to our stress test this is the hard bottleneck...
------------------ Original ------------------
*From:* Fabrice Durand <[email protected]>
*Date:* Jan 31, 2018 22:04
*To:* Yan <[email protected]>, packetfence-users
<[email protected]>
*Subject:* Re: [PacketFence-users] All authentication failed with
error "No EAP session matching state xxxx"
Hello Yan,
On 2018-01-31 at 00:28, Yan wrote:
Hi dear users,
After a whole night's analysis, we found that pf takes too much time
processing authentication requests when the QPS is too high, and all
later RADIUS requests then hang. The Aruba AC hits its RADIUS timeout
and re-sends the same Access-Request to pf. By then pf has just sent
the first Access-Accept; when it receives the same request again, it
responds with Accept a second time and then deletes the state ID. The
Aruba AC may have waited another 5 seconds and sends the request a
third time; this time pf finds no state ID matching the session and
simply responds with Reject. More and more Reject responses cause
users to reconnect to the wireless network, which pushes the QPS even
higher. It is a vicious circle.
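The retransmission part of that feedback loop can be approximated with a short Python sketch. It is only an illustration: the 5-second timeout is the figure mentioned in the message above, while `max_tries = 3` and the function names are assumptions made for the example, not Aruba or FreeRADIUS settings. It also ignores the extra load from users reconnecting after a Reject.

```python
def transmissions(latency_s: float, timeout_s: float = 5, max_tries: int = 3) -> int:
    """Packets the NAS sends for one request that is answered after latency_s seconds.

    timeout_s comes from the message above; max_tries is a hypothetical cap.
    """
    # One initial packet, plus one retransmission per elapsed timeout, capped.
    return min(max_tries, 1 + int(latency_s // timeout_s))


def offered_qps(base_qps: float, latency_s: float) -> float:
    """Packet rate the server sees once retransmissions are counted."""
    return base_qps * transmissions(latency_s)


for latency_s in (1, 7, 12):
    print(
        f"latency {latency_s:>2}s -> {transmissions(latency_s)} packet(s) per request, "
        f"{offered_qps(50, latency_s)} packets/s at 50 QPS"
    )
```

Under these assumptions, a server that answers within the timeout sees the nominal rate, while one that takes 12 seconds sees three times as many packets for the same number of users.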
We found that pf has at least the following bottlenecks, which lead to
the hang:
1. MySQL queries are too slow.
Most of the time this is because you receive too many accounting
packets (try disabling accounting) or because there is too much I/O.
2. "curl" keeps calling the httpd service and it is very slow.
Where do you see curl? FreeRADIUS uses the rest module to talk to
the web service.
3. "doperl" is too slow.
Not really; it depends on how you configured PacketFence. Say you
have an LDAP source and a search takes 600 ms: then the RADIUS
answer will be slow.
4. The "ntlm_auth" process is too slow.
Probably because the AD is too slow to answer. BTW, you can use the
NTLM cache for that.
5. A device will try to connect again if radiusd crashes, is restarted,
or reaches its max requests.
But we have not yet found which configuration will solve this issue.
Are there any suggestions on how to change the configuration to handle
this performance issue? Or any basic guidance on how to adjust the
parameters to handle 200 QPS, 500 QPS and 2000 QPS?
We have setups that handle millions of requests per day without any
issues. Check the graphs, such as RADIUS latency, and also have a look
at http://mgmt_ip:9000 to find where the time is spent.
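As a rough way to relate the latency figures in this thread to the QPS targets, Little's law (mean requests in flight = arrival rate x time per request) gives a back-of-the-envelope estimate. The Python sketch below is illustrative only: it assumes each in-flight request occupies one worker thread for its whole duration, and the function name is made up for the example. The 20 ms and 600 ms latencies, the QPS values, and `max_servers = 512` are the figures quoted in this thread.

```python
import math

MAX_SERVERS = 512  # max_servers value quoted in this thread


def requests_in_flight(qps: int, latency_ms: int) -> int:
    """Little's law: mean requests in flight = arrival rate x time per request."""
    return math.ceil(qps * latency_ms / 1000)


for qps in (50, 200, 500, 2000):
    for latency_ms in (20, 600):
        busy = requests_in_flight(qps, latency_ms)
        verdict = "fits within" if busy <= MAX_SERVERS else "exceeds"
        print(
            f"{qps:>4} QPS at {latency_ms:>3} ms -> {busy:>4} in flight "
            f"({verdict} {MAX_SERVERS} threads)"
        )
```

Under that assumption, per-request latency matters far more than the pool size: at 20 ms even 2000 QPS keeps only a few dozen threads busy, whereas at 600 ms the same rate would need more threads than the pool has.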
BTW, if you want us to check your setup, you can ask Inverse for
support, and it will be a pleasure to help you.
Regards
Fabrice
Any response is appreciated. Thank you very very much.
--
Fabrice [email protected] :: +1.514.447.4918 (x135) :: www.inverse.ca
Inverse inc. :: Leaders behind SOGo (http://www.sogo.nu) and
PacketFence (http://packetfence.org)
_______________________________________________
PacketFence-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/packetfence-users