Today I finished rewriting the hashtable locking logic. I've added the diff
below.
Still a few consideration to make:
For now I use an array of a fixed size which is used to store the elements to
be deleted.
This array is filled (with a read lock) through the do_all lhash function which
calls the t_old function.
If the array is full the do_all method can't be stopped and thus t_old just
returns. After do_all is finished, the items are deleted (with a write lock).
There are a few things that come to mind with this solution. The size of the
array is fixed. What if every minute more items are added than deleted? I
suggest that we make this size variable. Either in the config or calculate it.
Also I found a bug in the Makefile. I edited the pound.h file but the poundctl
binairy didn't rebuild. I think there's something wrong with the dependencies.
Here are two benchmarks. The table is cleaned every 5 seconds. The size of the
array is 512 (this may need some tweaking less can be better).
Configfile:
ListenHTTP
Address Address xxx.xxx.xxx.186
Port 88
Service
BackEnd
Address xxx.xxx.xxx.186
Port 40081
End
BackEnd
Address xxx.xxx.xxx.186
Port 40081
End
Session
Type Cookie
ID "pound"
TTL 100
End
End
End
Version 2.6b:
# ab -n 20000 -c 20 http://xxx.xxx.xxx.xxx/pound.mhtml
Document Path: /pound.mhtml
Document Length: 87 bytes
Concurrency Level: 20
Time taken for tests: 35.784378 seconds
Complete requests: 20000
Failed requests: 0
Write errors: 0
Total transferred: 5100255 bytes
HTML transferred: 1740087 bytes
Requests per second: 558.90 [#/sec] (mean)
Time per request: 35.784 [ms] (mean)
Time per request: 1.789 [ms] (mean, across all concurrent requests)
Transfer rate: 139.17 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 4 87.4 2 3001
Processing: 1 30 16.7 32 309
Waiting: 1 16 10.8 16 308
Total: 1 35 89.1 34 3050
Percentage of the requests served within a certain time (ms)
50% 34
66% 39
75% 42
80% 45
90% 51
95% 56
98% 62
99% 67
100% 3050 (longest request)
Modified version:
Document Path: /pound.mhtml
Document Length: 87 bytes
Concurrency Level: 20
Time taken for tests: 36.715796 seconds
Complete requests: 20000
Failed requests: 0
Write errors: 0
Total transferred: 5100255 bytes
HTML transferred: 1740087 bytes
Requests per second: 544.72 [#/sec] (mean)
Time per request: 36.716 [ms] (mean)
Time per request: 1.836 [ms] (mean, across all concurrent requests)
Transfer rate: 135.64 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 3 87.4 1 3002
Processing: 1 32 25.1 34 775
Waiting: 1 15 19.4 13 773
Total: 1 36 90.8 35 3059
Percentage of the requests served within a certain time (ms)
50% 35
66% 41
75% 45
80% 47
90% 55
95% 63
98% 74
99% 84
100% 3059 (longest request)
Almost no difference. (a little disapointed, I hoped mine version was faster
;-) )
Please take a look at the diff. Depending on your commends we'll try it on the
original (slower) hardware.
The diff:
diff Pound-2.6b/config.c Pound-2.6b-new/config.c
197c197,198
< pthread_mutex_init(&res->mut, NULL);
---
> // TODO: sets some attributes to this lock to regulate readers/writers
> priority
> pthread_rwlock_init(&res->lock, NULL);
462c463,464
< pthread_mutex_init(&res->mut, NULL);
---
> // TODO: sets some attributes to this lock to regulate readers/writers
> priority
> pthread_rwlock_init(&res->lock, NULL);
535c537,538
< pthread_mutex_init(& be->mut, NULL);
---
> // TODO: initialize with params which regulates rw-priorities
> pthread_rwlock_init(&be->lock, NULL);
561c564,565
< pthread_mutex_init(& be->mut, NULL);
---
> // TODO: initialize with params which regulates rw-priorities
> pthread_rwlock_init(&be->lock, NULL);
diff Pound-2.6b/pound.h Pound-2.6b-new/pound.h
282a283,285
> /* the size of the array which stores the items to be deleted*/
> #define REMOVE_MAX 512
>
311c314
< pthread_mutex_t mut; /* mutex for this back-end */
---
> pthread_rwlock_t lock; /* rw-lock for this back-end */
342c345
< pthread_mutex_t mut; /* mutex for this service */
---
> pthread_rwlock_t lock; /* rw-lock for this service */
diff Pound-2.6b/svc.c Pound-2.6b-new/svc.c
99a100
>
104a106,108
> TABNODE **remove;
> int remove_size;
> pthread_rwlock_t *lock;
106a111,115
>
> /*
> * Remove all expired sessions at one since this locks the table much shorter
> * Requires a readlock on the hashtable.
> */
108c117
< t_old(TABNODE *t, void *arg)
---
> t_remove_at_once(ALL_ARG *a)
110c119,121
< ALL_ARG *a;
---
> int down_load = a->tab->down_load;
> a->tab->down_load = 0;
>
113,115c124,126
< a = (ALL_ARG *)arg;
< if(t->last_acc < a->lim)
< if((res = lh_delete(a->tab, t)) != NULL) {
---
> int i = 0;
> for (;i < a->remove_size;++i) {
> if((res = lh_delete(a->tab, a->remove[i])) != NULL) {
119a131,157
> }
>
> a->tab->down_load = down_load;
> a->remove_size = 0;
>
> return;
> }
>
> /*
> * Check if a node is expired. If so add to the delete list.
> * If this list if full, delete the items in the list.
> */
> static void
> t_old(TABNODE *t, void *arg)
> {
> ALL_ARG *a;
>
> a = (ALL_ARG *)arg;
>
> if (a->remove_size == REMOVE_MAX) {
> /* don't compare if array is full */
> return;
> }
>
> if(t->last_acc < a->lim) {
> a->remove[a->remove_size++] = t;
> }
128c166
< t_expire(LHASH *const tab, const time_t lim)
---
> t_expire(SERVICE *svc)
131d168
< int down_load;
133c170,180
< a.tab = tab;
---
> /* create the array which stores pointers to nodes to be removed */
> if ((a.remove=malloc(sizeof(TABNODE*)*REMOVE_MAX)) == NULL) { // FIXME:
> stack or heap?
> logmsg(LOG_WARNING, "malloc()");
> return;
> }
> /* the number of nodes in the array */
> a.remove_size = 0;
>
> time_t lim = time(NULL) - svc->sess_ttl;
>
> a.tab = svc->sessions;
135,138c182,210
< down_load = tab->down_load;
< tab->down_load = 0;
< lh_doall_arg(tab, LHASH_DOALL_ARG_FN(t_old), &a);
< tab->down_load = down_load;
---
>
> int ret_val;
> if(ret_val = pthread_rwlock_rdlock(&svc->lock)) {
> logmsg(LOG_WARNING, "t_expire() lock: %s", strerror(ret_val));
> return;
> }
>
> lh_doall_arg(svc->sessions, LHASH_DOALL_ARG_FN(t_old), &a);
>
> if(ret_val = pthread_rwlock_unlock(&svc->lock)) {
> logmsg(LOG_WARNING, "t_expire() lock: %s", strerror(ret_val));
> return;
> }
>
>
> if (a.remove_size > 0) {
> /* writelock */
> if(ret_val = pthread_rwlock_wrlock(&svc->lock)) {
> logmsg(LOG_WARNING, "t_expire() lock: %s", strerror(ret_val));
> return;
> }
> /* remove expired elements */
> t_remove_at_once(&a);
> /* unlock */
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
> logmsg(LOG_WARNING, "t_expire() unlock: %s", strerror(ret_val));
> }
> /* free temporary array */
> free(a.remove);
549a622
> int to_add = 0;
551c624
< if(ret_val = pthread_mutex_lock(&svc->mut))
---
> if(ret_val = pthread_rwlock_rdlock(&svc->lock))
571c644
< t_add(svc->sessions, key, &res, sizeof(res));
---
> to_add = 1;
587c660
< t_add(svc->sessions, key, &res, sizeof(res));
---
> to_add = 1;
606c679
< t_add(svc->sessions, key, &res, sizeof(res));
---
> to_add = 1;
615c688
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
617a691,701
> /* adding item to the hashtable? */
> if (1 == to_add) {
> /* writelock */
> if(ret_val = pthread_rwlock_wrlock(&svc->lock))
> logmsg(LOG_WARNING, "get_backend() unlock: %s",
> strerror(ret_val));
> /* add to the table */
> t_add(svc->sessions, key, &res, sizeof(res));
> /* unlock */
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
> logmsg(LOG_WARNING, "get_backend() unlock: %s",
> strerror(ret_val));
> }
632c716,717
< if(ret_val = pthread_mutex_lock(&svc->mut))
---
> /* optain readlock */
> if(ret_val = pthread_rwlock_rdlock(&svc->lock)) {
634,635c719,731
< if(get_HEADERS(key, svc, headers))
< if(t_find(svc->sessions, key) == NULL)
---
> return;
> }
> if(get_HEADERS(key, svc, headers)) {
> if(t_find(svc->sessions, key) == NULL) {
> /* release readlock */
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
> logmsg(LOG_WARNING, "upd_session() unlock: %s",
> strerror(ret_val));
> /* optain writelock */
> if(ret_val = pthread_rwlock_rdlock(&svc->lock)) {
> logmsg(LOG_WARNING, "upd_session() lock: %s",
> strerror(ret_val));
> return;
> }
>
637c733,737
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
>
> }
> }
> /* release read- or writelock */
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
654,655c754,755
<
< if(ret_val = pthread_mutex_lock(&svc->mut))
---
>
> if(ret_val = pthread_rwlock_wrlock(&svc->lock))
684c784
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
698c798
< if(ret_val = pthread_mutex_lock(&be->mut))
---
> if(ret_val = pthread_rwlock_rdlock(&be->lock))
707c807
< if(ret_val = pthread_mutex_unlock(&be->mut))
---
> if(ret_val = pthread_rwlock_unlock(&be->lock))
1091c1191
< if(ret_val = pthread_mutex_lock(&svc->mut))
---
> if(ret_val = pthread_rwlock_wrlock(&svc->lock))
1103c1203
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
1160c1260
< if(ret_val = pthread_mutex_lock(&svc->mut))
---
> if(ret_val = pthread_rwlock_wrlock(&svc->lock))
1172c1272
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
1189,1193d1288
< time_t cur_time;
< int ret_val;
<
< /* remove stale sessions */
< cur_time = time(NULL);
1198,1204c1293
< if(ret_val = pthread_mutex_lock(&svc->mut)) {
< logmsg(LOG_WARNING, "do_expire() lock: %s",
strerror(ret_val));
< continue;
< }
< t_expire(svc->sessions, cur_time - svc->sess_ttl);
< if(ret_val = pthread_mutex_unlock(&svc->mut))
< logmsg(LOG_WARNING, "do_expire() unlock: %s",
strerror(ret_val));
---
> t_expire(svc);
1209,1215c1298
< if(ret_val = pthread_mutex_lock(&svc->mut)) {
< logmsg(LOG_WARNING, "do_expire() lock: %s",
strerror(ret_val));
< continue;
< }
< t_expire(svc->sessions, cur_time - svc->sess_ttl);
< if(ret_val = pthread_mutex_unlock(&svc->mut))
< logmsg(LOG_WARNING, "do_expire() unlock: %s",
strerror(ret_val));
---
> t_expire(svc);
1244c1327
< if(ret_val = pthread_mutex_lock(&be->mut))
---
> if(ret_val = pthread_rwlock_rdlock(&be->lock))
1248c1331
< if(ret_val = pthread_mutex_unlock(&be->mut))
---
> if(ret_val = pthread_rwlock_unlock(&be->lock))
1258c1341
< if(ret_val = pthread_mutex_lock(&svc->mut)) {
---
> if(ret_val = pthread_rwlock_rdlock(&svc->lock)) {
1267c1350
< if(ret_val = pthread_mutex_lock(&be->mut))
---
> if(ret_val = pthread_rwlock_rdlock(&be->lock))
1273c1356
< if(ret_val = pthread_mutex_unlock(&be->mut))
---
> if(ret_val = pthread_rwlock_unlock(&be->lock))
1279c1362
< if(ret_val = pthread_mutex_lock(&be->mut))
---
> if(ret_val = pthread_rwlock_rdlock(&be->lock))
1285c1368
< if(ret_val = pthread_mutex_unlock(&be->mut))
---
> if(ret_val = pthread_rwlock_unlock(&be->lock))
1290c1373
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
1302c1385
< if(ret_val = pthread_mutex_lock(&be->mut))
---
> if(ret_val = pthread_rwlock_rdlock(&be->lock))
1306c1389
< if(ret_val = pthread_mutex_unlock(&be->mut))
---
> if(ret_val = pthread_rwlock_unlock(&be->lock))
1316c1399
< if(ret_val = pthread_mutex_lock(&svc->mut)) {
---
> if(ret_val = pthread_rwlock_wrlock(&svc->lock)) {
1325c1408
< if(ret_val = pthread_mutex_lock(&be->mut))
---
> if(ret_val = pthread_rwlock_rdlock(&be->lock))
1331c1414
< if(ret_val = pthread_mutex_unlock(&be->mut))
---
> if(ret_val = pthread_rwlock_unlock(&be->lock))
1337c1420
< if(ret_val = pthread_mutex_lock(&be->mut))
---
> if(ret_val = pthread_rwlock_rdlock(&be->lock))
1343c1426
< if(ret_val = pthread_mutex_unlock(&be->mut))
---
> if(ret_val = pthread_rwlock_unlock(&be->lock))
1348c1431
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
1650c1733
< if(dummy = pthread_mutex_lock(&svc->mut))
---
> if(dummy = pthread_rwlock_rdlock(&svc->lock))
1654c1737
< if(dummy = pthread_mutex_unlock(&svc->mut))
---
> if(dummy = pthread_rwlock_unlock(&svc->lock))
1671c1754
< if(dummy = pthread_mutex_lock(&svc->mut))
---
> if(dummy = pthread_rwlock_rdlock(&svc->lock))
1675c1758
< if(dummy = pthread_mutex_unlock(&svc->mut))
---
> if(dummy = pthread_rwlock_unlock(&svc->lock))
1735c1818
< if(ret_val = pthread_mutex_lock(&svc->mut))
---
> if(ret_val = pthread_rwlock_wrlock(&svc->lock))
1738c1821
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
1746c1829
< if(ret_val = pthread_mutex_lock(&svc->mut))
---
> if(ret_val = pthread_rwlock_wrlock(&svc->lock))
1749c1832
< if(ret_val = pthread_mutex_unlock(&svc->mut))
---
> if(ret_val = pthread_rwlock_unlock(&svc->lock))
-----Oorspronkelijk bericht-----
Van: Joe Gooch [mailto:[email protected]]
Verzonden: donderdag 23 december 2010 17:41
Aan: [email protected]
Onderwerp: [Pound Mailing List] RE: Website stalls every 60 seconds
I've been known to set the TTL values low (say, 1 minute) and use JMeter or
similar to set up a fleet of zombie web browsers. It'll give you all the stats
of average response time, responses per second, etc.
As far as pound debugging I usually do the low-tech "add printfs to the code"
approach. :)
Joe
> -----Original Message-----
> From: Steven van der Vegt [mailto:[email protected]]
> Sent: Thursday, December 23, 2010 11:24 AM
> To: [email protected]
> Subject: [Pound Mailing List] RE: Website stalls every 60 seconds
>
> Since this problem is of somewhat a priority to us, I started to
> implement the delete-at-once functionality. It works, but I still need
> to implement read/write locks because during the execution of the
> doall() method we can't afford an insert.
>
> Do you guys have a good testing method?
>
> I'm looking forward to your opinions about this solution. If I'm not
> responding to email the next few days, that's because I'm busy cooking
> and celebrating Christmas :-)
>
> Steven
--
To unsubscribe send an email with subject unsubscribe to [email protected].
Please contact [email protected] for questions.
--
To unsubscribe send an email with subject unsubscribe to [email protected].
Please contact [email protected] for questions.