Re: Freeradius proxy code questions and proposed patch

2007-05-04 Thread Kostas Kalevras
Alan DeKok wrote:
> Kostas Zorbadelos wrote:
>> I have read on the list about the major cleanup that version 2.0 of the
>> server will be. While reading the code of versions 1.x I could see
>> that there is great room for improvement. I will take a look at the
>> 2.0 sources and I look forward to testing it when it becomes
>> available.
>
> Please test it now.  If everyone waits for 2.0 to be released before
> testing it, then everyone will discover little problems that they don't
> like.  Spend some time now to give feedback, and 2.0 will be that much
> more robust for everyone.
   
I think it's a good idea to start releasing 2.0preX versions. That 
should make a few more people interested in testing the code and get 
more comments.

>   Alan DeKok.
> --
>   http://deployingradius.com   - The web site of the book
>   http://deployingradius.com/blog/ - The blog
> -
> List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
   


-- 
Kostas Kalevras - Network Operations Center
National Technical University of Athens
http://kkalev.wordpress.com


Re: Freeradius proxy code questions and proposed patch

2007-05-04 Thread Peter Nixon
On Fri 04 May 2007, Kostas Kalevras wrote:
> Alan DeKok wrote:
>> Kostas Zorbadelos wrote:
>>> I have read on the list about the major cleanup that version 2.0 of the
>>> server will be. While reading the code of versions 1.x I could see
>>> that there is great room for improvement. I will take a look at the
>>> 2.0 sources and I look forward to testing it when it becomes
>>> available.
>>
>> Please test it now.  If everyone waits for 2.0 to be released before
>> testing it, then everyone will discover little problems that they don't
>> like.  Spend some time now to give feedback, and 2.0 will be that much
>> more robust for everyone.
>
> I think it's a good idea to start releasing 2.0preX versions. That
> should make a few more people interested in testing the code and get
> more comments.

I agree. While I have been rolling freeradius-server-snapshot rpms on a 
weekly basis, releasing freeradius-server-2.0preX rpms is likely to get a 
lot more people to upgrade. (Anyone using my repo will get the new version 
automatically) 

Cheers
-- 

Peter Nixon
http://www.peternixon.net/
PGP Key: http://www.peternixon.net/public.asc


Re: Freeradius proxy code questions and proposed patch

2007-05-04 Thread Alan DeKok
Kostas Kalevras wrote:
> I think it's a good idea to start releasing 2.0preX versions. That
> should make a few more people interested in testing the code and get
> more comments.

  I'm working on fixing the handling of the detail files right now, so
that one server will be able to read the detail files it writes.  Once
that's done, I think we're ready for 2.0-pre0.

  e.g. With the new code, the server will be able to:

 - proxy accounting to a home server
 - if that fails, write a detail file
 - read the detail file
 - try to proxy the packets again
 - if that fails, leave the data in the detail file

  This means that a server doing proxying can just be a pass-through
server when everything is OK.  Then, if something goes wrong, it can log
the accounting data to a file for later replay.  Once the home servers
come back up, the accounting data will be automatically sent there.
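
  For illustration only, the sequence above can be sketched in C roughly
as follows.  This is not the 2.0 code.  The names detail_file_t,
proxy_to_home, handle_accounting and replay_detail are made up for the
example, the "detail file" is modelled as a small in-memory array rather
than a file on disk, and the home server is reduced to an up/down flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DETAIL_MAX 16   /* arbitrary capacity for this sketch */

/* Hypothetical in-memory stand-in for the on-disk detail file. */
typedef struct {
    int packets[DETAIL_MAX];
    int count;
} detail_file_t;

/* Hypothetical transport: "succeeds" only while the home server is up. */
static bool proxy_to_home(bool home_up, int packet_id)
{
    (void)packet_id;
    return home_up;
}

/*
 * Pass-through when the home server answers; otherwise append the
 * packet to the detail file for later replay.  Returns true if the
 * packet reached the home server.  If the sketch's fixed-size buffer
 * is full, the packet is dropped (a real detail file would not have
 * this limit).
 */
static bool handle_accounting(detail_file_t *df, bool home_up, int packet_id)
{
    assert(df != NULL);
    if (proxy_to_home(home_up, packet_id)) {
        return true;
    }
    if (df->count < DETAIL_MAX) {
        df->packets[df->count++] = packet_id;
    }
    return false;
}

/*
 * Replay logged packets in order.  Stop at the first failure and leave
 * the unsent packets in the detail file.  Returns the number sent.
 */
static int replay_detail(detail_file_t *df, bool home_up)
{
    int sent = 0;

    assert(df != NULL);
    while (sent < df->count && proxy_to_home(home_up, df->packets[sent])) {
        sent++;
    }
    /* Shift whatever could not be sent to the front of the buffer. */
    for (int i = sent; i < df->count; i++) {
        df->packets[i - sent] = df->packets[i];
    }
    df->count -= sent;
    return sent;
}
```

  Calling handle_accounting() with the home server down leaves the packet
in the buffer, and a later replay_detail() with the home server up drains
it, which is the "log now, replay later" behaviour described above.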

  Alan DeKok.
--
  http://deployingradius.com   - The web site of the book
  http://deployingradius.com/blog/ - The blog


Re: Freeradius proxy code questions and proposed patch

2007-05-03 Thread Alan DeKok
Kostas Zorbadelos wrote:
> Precisely. But when we work in 'synchronous' mode we want the NAS to
> be in charge of the retransmission policy, not our proxy server. If the
> home server does not reply for any reason, we want the client (NAS) to
> notice it and retransmit. Eventually, the client will mark our proxy
> server dead, not because it is at fault, but because the home server
> is not responding.

  Have you tried using failover for home servers?  The whole point of
marking a home server dead is to remove it from the pool of home
servers.  Then, if another one in the same pool is alive, the proxy will
use it.

  If you don't mark the home server dead, then you can't do failover,
and your system becomes less robust.

>>   Which server?  All your patch does is make sure that the NAS marks the
>> proxying server as dead.
>
> Eventually, yes this is what the NAS will do. All that is due to the
> synchronous mode in proxy operation.

  The solution is not to patch the code to make the proxying server
dead.  The solution is to use more than one home server.
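
  As a rough sketch of why marking a home server dead is what makes
failover possible, consider the C fragment below.  It is not taken from
the FreeRADIUS sources: home_server_t, mark_dead and pick_home_server
are hypothetical names, and the rule that a dead server is retried once
the current time reaches its wakeup time is an assumption of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical home server record for this sketch. */
typedef struct {
    const char *name;
    int active;      /* 0 while the server is marked dead */
    time_t wakeup;   /* when a dead server becomes eligible again */
} home_server_t;

/* Take a server out of the pool for dead_time seconds. */
static void mark_dead(home_server_t *hs, time_t now, time_t dead_time)
{
    assert(hs != NULL);
    hs->active = 0;
    hs->wakeup = now + dead_time;
}

/*
 * Return the first usable server in the pool, reviving any server
 * whose dead time has expired.  Returns NULL when every server is
 * dead, in which case the request cannot be proxied.
 */
static home_server_t *pick_home_server(home_server_t *pool, size_t n, time_t now)
{
    assert(pool != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!pool[i].active && now >= pool[i].wakeup) {
            pool[i].active = 1;
        }
        if (pool[i].active) {
            return &pool[i];
        }
    }
    return NULL;
}
```

  With two servers in the pool, marking the first one dead makes
pick_home_server() return the second.  With only one server, the same
call returns NULL until the dead time expires, which corresponds to the
rejected requests described in the original report.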

> I have read on the list about the major cleanup that version 2.0 of the
> server will be. While reading the code of versions 1.x I could see
> that there is great room for improvement. I will take a look at the
> 2.0 sources and I look forward to testing it when it becomes
> available.

  Please test it now.  If everyone waits for 2.0 to be released before
testing it, then everyone will discover little problems that they don't
like.  Spend some time now to give feedback, and 2.0 will be that much
more robust for everyone.

  Alan DeKok.
--
  http://deployingradius.com   - The web site of the book
  http://deployingradius.com/blog/ - The blog


Freeradius proxy code questions and proposed patch

2007-04-30 Thread Kostas Zorbadelos
Hello to everyone.

In a previous thread 
http://www.mail-archive.com/freeradius-users@lists.freeradius.org/msg33354.html 
I had described a strange behavior in our large proxy setup. After
running the server in debug mode (radiusd -xxx) in our production
systems we found out what was causing our problems. The home server
in our proxy setup was marked dead quite often during the day and,
with a dead_time of 30 seconds, every request that arrived within
those 30 seconds was rejected.

Our proxy conf initially looked like the following:

proxy server {
    synchronous = yes
    retry_delay = 0
    retry_count = 0
    dead_time = 30
    default_fallback = yes
    post_proxy_authorize = no
}

###
#
#  Configuration for the proxy realms.
#
...

We first changed the dead_time to 0 so as to avoid marking the home
server dead in synchronous mode.
Additionally, we implemented the following patch (against version 1.1.6):

--- ./src/main/files.c.orig 2007-04-23 15:14:14.569932000 +0300
+++ ./src/main/files.c  2007-04-23 15:22:30.995686000 +0300
@@ -489,6 +489,15 @@
             if (cl->last_reply > (( now - mainconfig.proxy_retry_delay * mainconfig.proxy_retry_count ))) {
                 continue;
             }
+            /*
+             * If we are running in synchronous proxy mode, there's no point marking the target
+             * server(s) dead, since this should be done by the radius client
+             */
+            if (mainconfig.proxy_synchronous) {
+                radlog(L_PROXY, "authentication server %s:%d for realm %s seems unresponsive.",
+                       cl->server, port, cl->realm);
+                continue;
+            }
 
             cl->active = FALSE;
             cl->wakeup = now + mainconfig.proxy_dead_time;
@@ -498,6 +507,15 @@
             if (cl->last_reply > (( now - mainconfig.proxy_retry_delay * mainconfig.proxy_retry_count ))) {
                 continue;
             }
+            /*
+             * If we are running in synchronous proxy mode, there's no point marking the target
+             * server(s) dead, since this should be done by the radius client
+             */
+            if (mainconfig.proxy_synchronous) {
+                radlog(L_PROXY, "accounting server %s:%d for realm %s seems unresponsive.",
+                       cl->acct_server, port, cl->realm);
+                continue;
+            }
 
             cl->acct_active = FALSE;
             cl->acct_wakeup = now + mainconfig.proxy_dead_time;


The purpose of this patch is to not have the freeradius server mark
the home server dead when working in synchronous mode. We believe that
in synchronous operation it is a good idea to leave the job of marking
the server dead to the NAS client.

All the above actions solved our initial problems. However, after a
while we again noticed clients being rejected when they shouldn't be.

The following code in request_list.c caught my attention:

/*
 *  Refresh a request, by using proxy_retry_delay, cleanup_delay,
 *  max_request_time, etc.
 *
 *  When walking over the request list, all of the per-request
 *  magic is done here.
 */
static int refresh_request(REQUEST *request, void *data)
{
...
(around line 1264 version 1.1.6)

    } else if (request->proxy && !request->proxy_reply) {
        /*
         *  The request is NOT finished, but there is an
         *  outstanding proxy request, with no matching
         *  proxy reply.
         *
         *  Wake up when it's time to re-send
         *  the proxy request.
         *
         *  But in synchronous proxy, we don't retry but we update
         *  the next retry time as NAS has not resent the request
         *  in the given retry window.
         */
        if (mainconfig.proxy_synchronous) {
            /*
             *  If the retry_delay * count has passed,
             *  then mark the realm dead.
             */
            if (info->now > (request->timestamp +
                             (mainconfig.proxy_retry_delay * mainconfig.proxy_retry_count))) {
                rad_assert(request->child_pid == NO_SUCH_CHILD_PID);
                request_reject(request);

                realm_disable(request->proxy->dst_ipaddr,
                              request->proxy->dst_port);
                request->finished = TRUE;
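
To make the arithmetic of that check concrete, here is a small
stand-alone sketch. It assumes the comparison in the excerpt is a strict
"greater than" on one-second timestamps; retry_window_passed is a
made-up helper name and is not part of the server. With the
configuration shown earlier (retry_delay = 0 and retry_count = 0) the
product is zero, so under that assumption the branch would be taken as
soon as the clock moves one second past the request's timestamp.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper mirroring the comparison in the excerpt:
 * returns 1 once more than retry_delay * retry_count seconds have
 * passed since the request was received, 0 otherwise.
 */
static int retry_window_passed(time_t now, time_t timestamp,
                               int retry_delay, int retry_count)
{
    assert(retry_delay >= 0 && retry_count >= 0);
    return now > timestamp + (time_t)retry_delay * retry_count;
}
```

For example, retry_window_passed(1001, 1000, 0, 0) is true, while with
retry_delay = 5 and retry_count = 3 the same request would not cross
the window until more than 15 seconds had elapsed.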
  

Re: Freeradius proxy code questions and proposed patch

2007-04-30 Thread Alan DeKok
Kostas Zorbadelos wrote:
> I had described a strange behavior in our large proxy setup. After
> running the server in debug mode (radiusd -xxx) in our production
> systems we found out what was causing our problems. The problem was
> that the home server in our proxy setup was marked dead quite often
> during the day and with a dead_time of 30 secs every request that came
> within these 30 secs was rejected.

  Yes.  In 1.x, the proxy code does this.  It's fixed in 2.0, which
should be released real soon now.

> +   /*
> +    * If we are running in synchronous proxy mode, there's no point marking the target
> +    * server(s) dead, since this should be done by the radius client

  Uh, no.  The RADIUS client doesn't know about the home servers.  It
only knows about the server it's sending packets to.

> The purpose of this patch is to not have the freeradius server mark
> the home server dead when working in synchronous mode. We believe that
> in synchronous operation it is a good idea to leave the job of marking
> the server dead to the NAS client.

  Which server?  All your patch does is make sure that the NAS marks the
proxying server as dead.

...
> It seems that on some strange occasions the code enters the above
> path. A decision is made in case the current time is older than
> mainconfig.proxy_retry_delay * mainconfig.proxy_retry_count. If this
> is the case, the request is rejected and the code tries to disable the
> realm. However in the proxy.conf configuration file it is mentioned:

  All of that code is *gone* in 2.0.  The new code is so much better
that it's really quite hard to describe how much better it is.

> Please let me know your thoughts on these matters (also on the patch
> we provide)

  Take a look at the current CVS snapshot.  It should be pretty robust
with some recent bug fixes, and it will solve *all* of your proxying
problems.

  And I do mean ALL of the problems.

  Alan DeKok.
--
  http://deployingradius.com   - The web site of the book
  http://deployingradius.com/blog/ - The blog


Re: Freeradius proxy code questions and proposed patch

2007-04-30 Thread Kostas Zorbadelos
On Mon, Apr 30, 2007 at 05:41:06PM +0200, Alan DeKok wrote:
> Kostas Zorbadelos wrote:
>
>> I had described a strange behavior in our large proxy setup. After
>> running the server in debug mode (radiusd -xxx) in our production
>> systems we found out what was causing our problems. The problem was
>> that the home server in our proxy setup was marked dead quite often
>> during the day and with a dead_time of 30 secs every request that came
>> within these 30 secs was rejected.
>
>   Yes.  In 1.x, the proxy code does this.  It's fixed in 2.0, which
> should be released real soon now.
>
>> +   /*
>> +    * If we are running in synchronous proxy mode, there's no point marking the target
>> +    * server(s) dead, since this should be done by the radius client
>
>   Uh, no.  The RADIUS client doesn't know about the home servers.  It
> only knows about the server it's sending packets to.
 

Precisely. But when we work in 'synchronous' mode we want the NAS to
be in charge of the retransmission policy, not our proxy server. If the
home server does not reply for any reason, we want the client (NAS) to
notice it and retransmit. Eventually, the client will mark our proxy
server dead, not because it is at fault, but because the home server
is not responding.

>> The purpose of this patch is to not have the freeradius server mark
>> the home server dead when working in synchronous mode. We believe that
>> in synchronous operation it is a good idea to leave the job of marking
>> the server dead to the NAS client.
>
>   Which server?  All your patch does is make sure that the NAS marks the
> proxying server as dead.
 

Eventually, yes this is what the NAS will do. All that is due to the
synchronous mode in proxy operation.

> ...
>> It seems that on some strange occasions the code enters the above
>> path. A decision is made in case the current time is older than
>> mainconfig.proxy_retry_delay * mainconfig.proxy_retry_count. If this
>> is the case, the request is rejected and the code tries to disable the
>> realm. However in the proxy.conf configuration file it is mentioned:
>
>   All of that code is *gone* in 2.0.  The new code is so much better
> that it's really quite hard to describe how much better it is.
>
>> Please let me know your thoughts on these matters (also on the patch
>> we provide)
>
>   Take a look at the current CVS snapshot.  It should be pretty robust
> with some recent bug fixes, and it will solve *all* of your proxying
> problems.
>
>   And I do mean ALL of the problems.
 

I have read on the list about the major cleanup that version 2.0 of the
server will be. While reading the code of versions 1.x I could see
that there is great room for improvement. I will take a look at the
2.0 sources and I look forward to testing it when it becomes
available.

Thanks a lot Alan.

Kostas
