On 28/11/2011 13:53, Alan DeKok wrote:
Alexandre Chapellon wrote:
This works as expected for most of my NASes. Unfortunately, I have some
NASes that are behind a satellite link, which is a very unreliable link
with regular packet loss. UDP retransmission of packets makes the system
work even with that kind of link, but I have one scenario that creates
errors:
This is common in RADIUS. Accounting is... awkward, to put it politely.
When a Stop ticket is transmitted once and correctly reaches the
FreeRADIUS servers (nas -> front -> back), the session record is deleted from
the "live acct" table, the packet is then proxied to the 2nd FreeRADIUS, and
the session in the Acct table is marked as stopped (acctstoptime=something). If
the front FreeRADIUS acks the Stop packet and that Ack is lost on the
link, the NAS retransmits the Stop.
As it should. It's the responsibility of the RADIUS server to deal
with retransmissions from the NAS.
The same thing then occurs:
- the front radius tries to delete the session using its acct_stop_query,
which results in no row being modified, and so it tries to run its
acct_stop_query_alt (which basically does the same thing: delete).
It really should delete the record ONLY if it exists. Or, UPDATE the
record to say "session stopped". After a suitable delay (10-20 min),
the "stopped" sessions can safely be deleted.
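The "mark as stopped, purge later" approach described above can be sketched as follows. This is only an illustration, not FreeRADIUS configuration: it uses Python's sqlite3 module as a stand-in for the real database, and the `live_acct` table, its columns, the function names and the 15-minute `PURGE_DELAY` are all assumptions made for the example.

```python
import sqlite3

# Assumed value, picked from the 10-20 minute window suggested above.
PURGE_DELAY = 15 * 60  # seconds


def make_db():
    """Create a hypothetical 'live acct' table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE live_acct ("
        " nas_ip TEXT NOT NULL,"
        " acct_session_id TEXT NOT NULL,"
        " stopped_at INTEGER,"
        " PRIMARY KEY (nas_ip, acct_session_id))"
    )
    return conn


def handle_start(conn, nas_ip, session_id):
    """Record a live session; a retransmitted Start is silently ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO live_acct (nas_ip, acct_session_id) VALUES (?, ?)",
        (nas_ip, session_id),
    )


def handle_stop(conn, nas_ip, session_id, now):
    """Mark the session as stopped instead of deleting it.

    Returns the number of rows changed: 1 for the first Stop, 0 for a
    retransmitted Stop or for a session that never existed. Neither
    case is treated as an error.
    """
    cur = conn.execute(
        "UPDATE live_acct SET stopped_at = ?"
        " WHERE nas_ip = ? AND acct_session_id = ? AND stopped_at IS NULL",
        (now, nas_ip, session_id),
    )
    return cur.rowcount


def purge_stopped(conn, now):
    """Delete sessions that have been stopped for at least PURGE_DELAY seconds."""
    cur = conn.execute(
        "DELETE FROM live_acct WHERE stopped_at IS NOT NULL AND stopped_at <= ?",
        (now - PURGE_DELAY,),
    )
    return cur.rowcount
```

Because the row survives for a while after the first Stop, a retransmission arriving a few seconds later still finds it and can be recognised as a duplicate rather than as an unknown session.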
The alt
query also modifies no rows, but no error is logged. The retransmitted packet
is then proxied to the back server, which in turn tries to run its
acct_stop_query (it tries to update a session with no acctstoptime). That
query fails because the previous Stop ticket for that session already updated
the record. It then tries to run the acct_stop_query_alt, which is
designed to insert a new session record based on the content of
the Stop ticket (this is done to deal with the case where the Start ticket
is lost and only the Stop ticket is received, I guess). In my case this last
query fails because of a uniqueness constraint in the Oracle database
(to prevent one session from being recorded multiple times), and an
error is logged in FreeRADIUS.
The solution is to fix the queries so that they deal with non-existent
sessions. This is no different than a NAS sending a STOP for sessions
that *never* existed.
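One possible shape for such a fix on the back server is sketched below. It is a hedged illustration only: sqlite3 stands in for the Oracle database, the `radacct` table is reduced to three columns, and `record_stop` and its return values are hypothetical names invented for the example. The idea is that the fallback insert runs only when the session is truly unknown, so a retransmitted Stop never hits the uniqueness constraint.

```python
import sqlite3


def make_db():
    """Create a simplified, hypothetical accounting table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE radacct ("
        " acctuniqueid TEXT PRIMARY KEY,"
        " acctstarttime INTEGER,"
        " acctstoptime INTEGER)"
    )
    return conn


def record_stop(conn, unique_id, stop_time):
    """Handle a Stop packet without ever violating the primary key.

    Returns one of:
      "closed"    - an open session was found and closed
      "duplicate" - the session is already closed (retransmitted Stop)
      "inserted"  - the session was unknown (Start lost), so a
                    stop-only record was created
    """
    # Normal case: close a session that is still open.
    cur = conn.execute(
        "UPDATE radacct SET acctstoptime = ?"
        " WHERE acctuniqueid = ? AND acctstoptime IS NULL",
        (stop_time, unique_id),
    )
    if cur.rowcount == 1:
        return "closed"

    # Nothing was updated: either a retransmission or an unknown session.
    exists = conn.execute(
        "SELECT 1 FROM radacct WHERE acctuniqueid = ?", (unique_id,)
    ).fetchone()
    if exists:
        return "duplicate"

    # Unknown session: record what the Stop packet tells us.
    conn.execute(
        "INSERT INTO radacct (acctuniqueid, acctstarttime, acctstoptime)"
        " VALUES (?, NULL, ?)",
        (unique_id, stop_time),
    )
    return "inserted"
```

All three outcomes let the server acknowledge the packet, so the NAS stops retransmitting and nothing is logged as an error.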
Does anybody have an idea how to deal with that (minor) problem, so that I
no longer get regular error messages?
I was thinking of maybe not proxying to the back server the packets
retransmitted due to Ack loss, but I can't really find out how to do that...
Thanks for reading this long post (I hope it's understandable enough).
It is.
There is no real solution other than building a smarter system to
handle accounting packets.
I suggest writing a detailed state machine describing what happens for
each session, and how each kind of packet is handled. Until that's
done, no good solution is possible.
I don't understand what you mean by "writing a detailed state
machine"... state machine?
We can take such a state machine and use it to update the handling of
accounting packets for 3.0.
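To illustrate what "state machine" means here: for each session, list the states it can be in, and for every (state, packet type) pair write down the next state and the action to take. The sketch below is a minimal example of that idea, not the design used by FreeRADIUS. The three states and every entry in the table are assumptions chosen for illustration; only the packet names (Start, Interim-Update, Stop) come from RADIUS accounting itself.

```python
from enum import Enum


class State(Enum):
    UNKNOWN = "unknown"   # no record of this session
    ACTIVE = "active"     # session is open
    STOPPED = "stopped"   # session has been closed


class Packet(Enum):
    START = "Start"
    INTERIM = "Interim-Update"
    STOP = "Stop"


# (current state, packet received) -> (next state, action to perform).
# Every entry is an illustrative assumption, not documented behaviour.
TRANSITIONS = {
    (State.UNKNOWN, Packet.START): (State.ACTIVE, "create session"),
    (State.UNKNOWN, Packet.INTERIM): (State.ACTIVE, "create session (Start was lost)"),
    (State.UNKNOWN, Packet.STOP): (State.STOPPED, "create closed session (Start was lost)"),
    (State.ACTIVE, Packet.START): (State.ACTIVE, "ignore retransmitted Start"),
    (State.ACTIVE, Packet.INTERIM): (State.ACTIVE, "update counters"),
    (State.ACTIVE, Packet.STOP): (State.STOPPED, "close session"),
    (State.STOPPED, Packet.START): (State.STOPPED, "ignore late Start"),
    (State.STOPPED, Packet.INTERIM): (State.STOPPED, "ignore late Interim-Update"),
    (State.STOPPED, Packet.STOP): (State.STOPPED, "ignore retransmitted Stop"),
}


def step(state, packet):
    """Return (next_state, action) for one received packet."""
    return TRANSITIONS[(state, packet)]


def replay(packets, state=State.UNKNOWN):
    """Feed a sequence of packets through the machine and collect the actions."""
    actions = []
    for packet in packets:
        state, action = step(state, packet)
        actions.append(action)
    return state, actions
```

Replaying `[Packet.START, Packet.STOP, Packet.STOP]` ends in `State.STOPPED`, and the last action is "ignore retransmitted Stop", which is the scenario from this thread. Writing the table out forces a decision for every combination, including the awkward ones such as a Stop for an unknown session.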
Alan DeKok.
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
--
<http://www.horoa.net>
Alexandre Chapellon
Open source systems and network engineering.
Follow me on twitter: @alxgomz <http://www.twitter.com/alxgomz>