Hi,
The purpose of this mail is to share some things I've been trying out at work
(playing, some could argue) in case they are useful to any of you out there.
I won't describe all the issues I've had during compilation, configuration
and functional/performance testing, nor ask you for help. Instead I'll just
describe what I've done and document one of the last problems I had, a
segmentation fault which kept me awake for a few nights.
For the past 4 weeks I have been trying to evaluate whether FreeRADIUS can
be used as a AAA server in a UMTS network with a large number of subscribers
for the GPRS data services. By "can be used" I mean essentially whether it can
handle:
(1) Functionality: basic Authentication/Authorization/Accounting, IP Address
allocation and some GPRS attribute to IP Address mapping storage.
(2) High Availability (no single point of failure HW/SW)
(3) Distributed Architecture (performance target of 250 requests/second peak
hour at a reasonable HW/SW cost)
For this test I decided to use the following (all 32 bit, because of
problems getting a 64 bit build to compile on SPARC with the distributed
binaries from MySQL):
(a) Solaris 8 on SPARC (selected because these machines were pretty much idle
at my company; similar tests were run on x86 PCs running Fedora Core 4).
(b) MySQL 5.0.21 (MAX version) 32 bit SPARC binary distribution.
(c) FreeRADIUS 1.1.1 (I started with 1.1.0 but upgraded because of bugs in the
dictionary, following a recommendation from Alan DeKok in the mail archives).
(d) For IP allocation I'm using the rlm_sqlippool module, as per Alan DeKok's
recommendation (mail archives). It's hard to tell its version because it's not
version controlled as far as I could see; I got it from a Russian website. It
will require some customization, as I'd like to be able to define an IP pool
as several (not just one) start/end IP ranges.
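To make the multi-range idea in (d) concrete, here is a minimal in-memory
sketch in C. It is not taken from rlm_sqlippool, which keeps its pools in SQL
tables; the names ip_range_t, ip_pool_t, ipv4, pool_size, pool_contains and
pool_nth are all hypothetical, and addresses are assumed to be IPv4 values in
host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not part of rlm_sqlippool. */
typedef struct {
    uint32_t start;   /* first address of the range, inclusive */
    uint32_t end;     /* last address of the range, inclusive */
} ip_range_t;

typedef struct {
    const char       *name;        /* pool name */
    const ip_range_t *ranges;      /* one or more start/end ranges */
    size_t            num_ranges;
} ip_pool_t;

/* Build a host-byte-order IPv4 address from its four octets. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Total number of addresses across all ranges of the pool. */
uint64_t pool_size(const ip_pool_t *pool)
{
    uint64_t total = 0;
    for (size_t i = 0; i < pool->num_ranges; i++) {
        assert(pool->ranges[i].start <= pool->ranges[i].end);
        total += (uint64_t)pool->ranges[i].end - pool->ranges[i].start + 1;
    }
    return total;
}

/* Return 1 if addr lies inside any range of the pool, 0 otherwise. */
int pool_contains(const ip_pool_t *pool, uint32_t addr)
{
    for (size_t i = 0; i < pool->num_ranges; i++) {
        if (addr >= pool->ranges[i].start && addr <= pool->ranges[i].end)
            return 1;
    }
    return 0;
}

/*
 * Map a 0-based index, counted across the ranges in order, to an address.
 * Returns 1 and stores the address in *out, or 0 if n is past the pool end.
 */
int pool_nth(const ip_pool_t *pool, uint64_t n, uint32_t *out)
{
    for (size_t i = 0; i < pool->num_ranges; i++) {
        uint64_t len = (uint64_t)pool->ranges[i].end - pool->ranges[i].start + 1;
        if (n < len) {
            *out = pool->ranges[i].start + (uint32_t)n;
            return 1;
        }
        n -= len;
    }
    return 0;
}
```

With a pool made of 10.0.0.1-10.0.0.10 and 10.0.1.1-10.0.1.5, pool_size()
gives 15 and index 10 maps to the first address of the second range. In an SQL
backed module the same effect would more likely come from inserting the
addresses of every range under one pool name.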
The test bed is basically two physical nodes, each running the same
software, i.e. radiusd, mysqld and ndbd (the MySQL Cluster storage engine
process). The NAS (in UMTS these are called GGSNs) will load-balance the
requests, either directly or through an IP load balancer or even a FreeRADIUS
proxy; I haven't decided which yet.
This configuration allows vertical (bigger machines) and horizontal
(more machines) scalability, by adding more CPUs or extra nodes to the cluster
respectively, for improved performance. I have tested the vertical scalability
and it's linear with the CPU utilization. The horizontal case will be tested in
the coming days (it's hard to get hold of the required HW for the tests). I
will then publish some results, more quantitative than this email.
Last but not least (and in connection with the subject of this email), I
found one bug in my copy of rlm_sqlippool (as I mentioned, it's hard to tell
its version): during load testing and given the right circumstances (multiple
NAS, Solaris architecture, MySQL Cluster storage engine only, and high CPU
utilization) I was getting a core dump of the 'radiusd' process.
The problem occurred during the post-authorization phase of the sqlippool
module, while retrieving the result of the 'allocate-find' SQL statement. The
expected result row (a single row with a single field containing the IP
address to allocate) held invalid memory references. A row is modelled as an
array of references to result columns; the only reference was invalid, which
caused the segmentation fault.
After reading the code and debugging it for a while, I noticed that the
memory holding the result set was being released before it was used (although
a reference to the first and only row had been kept beforehand), hence
causing unpredictable results.
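The ownership problem can be sketched with a small mock in C. This is not the
MySQL C API and not the rlm_sql driver code: mock_result_t, mock_store_result,
mock_fetch_row, mock_free_result and copy_then_release are hypothetical names
made up for illustration. The point is only that a row is an array of pointers
into memory owned by the result set, so the value has to be copied out before
the result set is released.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Mock stand-ins for illustration; NOT the MySQL C API. */
typedef char **mock_row_t;

typedef struct {
    char *cell;         /* column data, owned by the result set */
    char *columns[1];   /* the row: an array of pointers to column data */
} mock_result_t;

/* Create a one-row, one-column result set holding a copy of value. */
mock_result_t *mock_store_result(const char *value)
{
    size_t n = strlen(value) + 1;
    mock_result_t *res = malloc(sizeof(*res));
    if (res == NULL)
        return NULL;
    res->cell = malloc(n);
    if (res->cell == NULL) {
        free(res);
        return NULL;
    }
    memcpy(res->cell, value, n);
    res->columns[0] = res->cell;
    return res;
}

/* The returned row points INTO the result set; nothing is copied. */
mock_row_t mock_fetch_row(mock_result_t *res)
{
    return res->columns;
}

/* After this call, every row obtained from res is a dangling pointer. */
void mock_free_result(mock_result_t *res)
{
    free(res->cell);
    free(res);
}

/*
 * Copy the first column into out, then release the result set.
 * Returns the string length, or 0 if out is too small.
 */
int copy_then_release(mock_result_t *res, char *out, size_t outlen)
{
    mock_row_t row = mock_fetch_row(res);
    size_t len = strlen(row[0]);
    int fits = (len < outlen);

    if (fits)
        memcpy(out, row[0], len + 1);   /* copy while the memory is valid */
    else if (outlen > 0)
        out[0] = '\0';

    mock_free_result(res);              /* row and row[0] dangle from here on */
    return fits ? (int)len : 0;
}
```

Releasing first and reading row[0] afterwards is undefined behaviour in C. It
can appear to work for a long time if the freed memory is not reused straight
away, which would be consistent with a crash that only shows up under load.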
Anyhow, the fix was simply to move the 'sql_finish_select_query'
function call (which indirectly calls the MySQL function 'mysql_free_result'
to release the memory allocated to the result set) a few lines down in the
'sqlippool_query1' function, which is the one that retrieves the IP address
to be allocated, in the 'rlm_sqlippool.c' file. See below for details:
 1  /*
 2   * Query the database expecting a single result row
 3   */
 4  static int sqlippool_query1(char * out, int outlen, const char * fmt, SQLSOCK * sqlsocket, void * instance,
 5                              REQUEST * request, char * param, int param_len)
 6  {
 7      rlm_sqlippool_t * data = (rlm_sqlippool_t *) instance;
 8      char expansion[MAX_STRING_LEN * 4];
 9      char query[MAX_STRING_LEN * 4];
10      SQL_ROW row;
11      int r;
12
13      sqlippool_expand(expansion, sizeof(expansion), fmt, instance, param, param_len);
14
15      /*
16       * Do an xlat on the provided string
17       */
18      if (request) {
19          if (!radius_xlat(query, sizeof(query), expansion, request, NULL)) {
20              radlog(L_ERR, "sqlippool_command: xlat failed.");
21              out[0] = '\0';
22              return 0;
23          }
24      }
25      else {
26          strcpy(query, expansion);
27      }
28
29  #if 0
30      DEBUG2("sqlippool_query1: '%s'", query);
31  #endif
32
33      if (rlm_sql_select_query(sqlsocket, data->sql_inst, query)){
34          radlog(L_ERR, "sqlippool_query1: database query error");
35          out[0] = '\0';
36          return 0;
37      }
38
39      r = rlm_sql_fetch_row(sqlsocket, data->sql_inst);
40
41      if (r) {
42          DEBUG("sqlippool_query1: SQL query did not succeed");
43          out[0] = '\0';
44          return 0;
45      }
46
47      row = sqlsocket->row;
48      if (row == NULL) {
49          DEBUG("sqlippool_query1: SQL query did not return any results");
50          out[0] = '\0';
51          return 0;
52      }
53
54      if (row[0] == NULL){
55          DEBUG("sqlippool_query1: row[0] returned NULL");
56          out[0] = '\0';
57          return 0;
58      }
59
60      r = strlen(row[0]);
61      if (r >= outlen){
62          DEBUG("sqlippool_query1: insufficient string space");
63          out[0] = '\0';
64          return 0;
65      }
66
67      strncpy(out, row[0], r);
68      out[r] = '\0';
69
70      (data->sql_inst->module->sql_finish_select_query)(sqlsocket, data->sql_inst->config);
71
72      return r;
73  }
The call on line 70 was originally right after line 39, that is, after a
reference to the first (and only) result row had been kept. The problem is that
the row is a reference to references to memory allocated by the MySQL C API,
and that memory is released whenever the 'mysql_free_result' function is
called. The crash, however, only showed up under certain conditions that are
hard to re-create.
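One thing worth noting about the listing above: with the finish call moved to
line 70, the early returns on lines 41 to 65 leave the function without
reaching it. A common C way to release the result set on every path is a
single exit label. The sketch below is a simplified mock, not a patch for
rlm_sqlippool: demo_sock_t, demo_fetch_row, demo_finish_select and demo_query1
are hypothetical names, and whether the real driver needs the finish call
after a failed fetch is an assumption I have not verified.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the SQL socket; for illustration only. */
typedef struct {
    const char **row;          /* current result row, or NULL if none */
    int          fetch_error;  /* non-zero makes the mock fetch fail */
    int          finish_calls; /* how many times the result was released */
} demo_sock_t;

/* Mock fetch: returns 0 on success, non-zero on failure. */
int demo_fetch_row(demo_sock_t *s)
{
    return s->fetch_error;
}

/* Mock release: drops the row and counts the call. */
void demo_finish_select(demo_sock_t *s)
{
    s->row = NULL;
    s->finish_calls++;
}

/*
 * Copy the first column of the single result row into out.
 * Returns the string length, or 0 on any failure. Every path, success
 * or failure, leaves through the 'finish' label, so the result set is
 * released exactly once and only after the copy has been made.
 */
int demo_query1(demo_sock_t *s, char *out, size_t outlen)
{
    const char **row;
    size_t n;
    int len = 0;

    assert(outlen > 0);
    out[0] = '\0';

    if (demo_fetch_row(s) != 0)
        goto finish;

    row = s->row;
    if (row == NULL || row[0] == NULL)
        goto finish;

    n = strlen(row[0]);
    if (n >= outlen)
        goto finish;

    memcpy(out, row[0], n + 1);   /* copy while the row is still valid */
    len = (int)n;

finish:
    demo_finish_select(s);
    return len;
}
```

The finish_calls counter exists only so that the behaviour can be checked: it
should be 1 after every call, whether or not an address was returned.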
I'm done for now; more details will come later. Meanwhile I have a
question: is the rlm_sqlippool module going to be part of a FreeRADIUS release
in the near future and, if not, what would be the procedure to follow for that
to happen?
Thanks, and I hope I didn't take too much of your time if you have read the
whole thing!
Cheers,
Alex.