In article <[EMAIL PROTECTED]>,
Simon Byrnand <[EMAIL PROTECTED]> wrote:
>We've had some problems with stop records going missing from time to time
>from a remote NAS, (over which we have no direct control) and when this
>happens, the user of course shows as online when they're not.
Is there a proxy in between? There are proxies that do not implement
the RADIUS protocol correctly: they just send the accounting packets
through once and don't retry if they don't get an ack back.
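
For illustration, here is a rough Python sketch of the retry behaviour a
well-behaved forwarder would need. It assumes the proxy itself is responsible
for retrying; `forward_accounting` and `send_once` are hypothetical names, not
part of any real RADIUS server or proxy.

```python
def forward_accounting(send_once, max_tries=5):
    """Forward one accounting packet, retrying until it is acknowledged.

    send_once is a hypothetical callable: it sends the packet upstream,
    waits for its own timeout, and returns True if an ack came back.
    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_tries + 1):
        if send_once():
            return attempt      # acked, safe to forget the packet
    return None                 # never acked: the record is lost upstream
```

A broken proxy behaves like max_tries=1, so a single dropped packet is enough
to lose the stop record.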
>There are two ways this incorrect state (eventually) gets resolved, one is
>that the user disconnects (or gets disconnected) and reconnects, and
>checkrad determines they're not really there anymore - and radiusd seems to
>reduce their previous session time to zero and write a logout entry in the
>radwtmp to that effect. (I see a 0 length session time with radlast, anyhow)
>
>However it does NOT write a stop record into any detail file as far as I
>can see, and to do so would be improper anyway, since it never received
>one.
Exactly.
>Now in this situation the times for a user reported by analyzing
>detail files and the radwtmp are clearly going to be different, and the
>radwtmp is the more conservative of the two. A program reading the detail
>files has no way of knowing that checkrad discovered a stuck session.
True.
>The other thing that happens is that somebody else logs into the same port
>on the NAS, and radiusd immediately notices the previous session must have
>been stuck, however in this case it doesn't zero the time in radwtmp, it
>just assumes they logged out at the same time the other user logged into
>the same port.
Yes. Most "last" programs treat this correctly - they notice that
a port was re-used and 'stop' the session that was active on that port.
>A program like sac analyzing the detail files _should_ be
>able to notice this apparent reuse of the same NAS port and deduce that a
>lost stop record occurred and give the same result as reading the radwtmp,
>however I have _not_ confirmed that sac 1.8 actually does this, so at this
>stage it is supposition.
It should indeed do that. Perhaps you can send a polite suggestion
to the author?
>I've done extensive comparisons of the times calculated from radwtmp, and
>those calculated from detail files (using sac) for users that haven't
>suffered lost stop records, and calculated times are _identical_ within
>about 1 second, and in every case where there were lost stop records, the
>radwtmp gives the more conservative time of the two. I'd rather err on the
>safe side when I know a NAS box is giving incomplete data...
On your safe side, not on the customer's. It is possible that a
customer logs in at 00:00 and logs out at 01:00, but the
stop packet gets lost. Then at 06:00 the same port gets reused
by another dial-in customer, and the first one gets billed for 6
hours of usage instead of one.
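
To make the port-reuse inference and its billing risk concrete, here is a
minimal Python sketch. The `Record` layout and `session_times` are made up
for illustration; this is not how radiusd, "last" or sac actually store or
process their data.

```python
from dataclasses import dataclass

@dataclass
class Record:
    kind: str   # "start" or "stop"
    time: int   # seconds since midnight, for this example
    user: str
    nas: str
    port: int

def session_times(records):
    """Return (user, seconds, inferred) tuples, closing stuck sessions on port reuse."""
    open_sessions = {}          # (nas, port) -> start Record
    result = []
    for rec in sorted(records, key=lambda r: r.time):
        key = (rec.nas, rec.port)
        if rec.kind == "start":
            stale = open_sessions.get(key)
            if stale is not None:
                # Port reused with no stop seen: assume the old session
                # ended when the new one began (an upper bound only).
                result.append((stale.user, rec.time - stale.time, True))
            open_sessions[key] = rec
        elif rec.kind == "stop":
            start = open_sessions.get(key)
            if start is not None and start.user == rec.user:
                del open_sessions[key]
                result.append((start.user, rec.time - start.time, False))
    return result

example = [
    Record("start", 0, "alice", "nas1", 7),         # logs in at 00:00
    # alice's stop at 01:00 is lost
    Record("start", 6 * 3600, "bob", "nas1", 7),    # port reused at 06:00
    Record("stop", 7 * 3600, "bob", "nas1", 7),
]
print(session_times(example))
```

Here alice is credited with 21600 seconds (6 hours) although she was only
online for one, which is why an inferred stop should be flagged rather than
billed blindly.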
If the NAS supports 'alive' packets (or are they called 'update' packets
now?) perhaps you can get it to send alive packets every, say, 5 minutes,
and when an alive packet hasn't been received in the last 15+1 minutes,
mark the session as 'dead' (and use the last received alive packet as the
STOP packet; it should have the most recent Acct-Session-Time etc).
This could get a bit busy if you have 2000 dial-in lines, though.
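
A rough sketch of that timeout logic in Python, assuming a 5 minute alive
interval and an in-memory table of sessions. The table layout and the name
`reap_dead_sessions` are invented for the example.

```python
ALIVE_INTERVAL = 5 * 60        # assumed: NAS sends an alive packet every 5 minutes
DEAD_AFTER = 15 * 60 + 60      # 15+1 minutes of silence means the session is dead

def reap_dead_sessions(sessions, now):
    """Remove silent sessions and return synthetic stop records for them.

    sessions maps a session id to a dict holding "last_alive" (time of the
    last alive packet, in seconds) and "session_time" (the Acct-Session-Time
    value that packet carried).
    """
    stops = []
    for sid, s in list(sessions.items()):
        if now - s["last_alive"] > DEAD_AFTER:
            # Use the last alive packet as the stop record.
            stops.append({"session_id": sid,
                          "session_time": s["session_time"],
                          "synthetic": True})
            del sessions[sid]
    return stops
```

With this the billing error is bounded by roughly one alive interval
instead of by however long the port stays idle.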
Mike.
--
"dselect has a user interface which scares small children"
-- Theodore Tso, on debian-devel