hi all,

i have two problems that i'm hoping someone will be able to help me investigate:

1) the other day, during a period of heavy activity, kannel died and daemontools was unable to restart it (i use daemontools to monitor kannel and restart it if it dies, which usually works fine). however, once i de-activated daemontools, i was able to start kannel myself from the command line. nothing in the log files indicated what was preventing the restart. i also can't figure out why it died in the first place. i have logrotate set to rotate when the logfile reaches a certain size; is it possible that a logrotate event, triggered during a period of heavy activity and sending a hangup signal to the pid recorded in the bearerbox pid file, could have caused it to die?
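
one thing worth ruling out here is a stale pid file, since a hangup sent to a pid that no longer belongs to bearerbox would go to the wrong process or nowhere. a minimal sketch of that check in python (unix only); the pid file path is a made-up example, not something taken from my config:

```python
import os


def read_pid(pid_file):
    """Return the integer pid stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def pid_alive(pid):
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single process
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        # the process exists but belongs to another user
        return True
    return True


if __name__ == "__main__":
    # hypothetical path; use whatever the bearerbox pid file setting points at
    pid = read_pid("/var/run/kannel/bearerbox.pid")
    if pid is None:
        print("no usable pid file")
    else:
        print(pid, "alive" if pid_alive(pid) else "stale")
```

this only says whether some process owns that pid, not that the process is bearerbox, so it narrows things down rather than proving anything.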

2) i've recently noticed a tremendous increase in dropped DLR messages. a report from my aggregator clearly shows that my kannel install is acknowledging almost all of the DLR messages that the aggregator submits back to it (see below). to rule out the web server, i ran a PHP script which looped through 10,000 socket connections writing data to the server that hosts the DLR URL; it ran in about 6 seconds without dropping any. (the DLR script on that server simply writes the information to a file for later processing by a daemon, which writes it to the database, so that high load doesn't affect other areas of operation.) so it appears that, somehow, DLR messages are being dropped between my kannel server and the web server that receives them. how is that possible? is there some way i can find out whether that is what's happening?
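
one way to narrow this down would be to compare message ids on both sides of that hop. a rough sketch in python, assuming (and this is only an assumption) that the ids can be extracted into two plain text files with one id per line: one from the kannel side, one from the file the DLR script writes. the file names and the `missing_ids` helper are hypothetical:

```python
def missing_ids(acked_lines, received_lines):
    """Return ids present in acked_lines but absent from received_lines, sorted."""
    acked = {line.strip() for line in acked_lines if line.strip()}
    received = {line.strip() for line in received_lines if line.strip()}
    return sorted(acked - received)


if __name__ == "__main__":
    import sys

    # usage: python dlr_diff.py acked_ids.txt received_ids.txt
    # (script name and both file names are placeholders)
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as acked, open(sys.argv[2]) as received:
            lost = missing_ids(acked, received)
        print(len(lost), "acknowledged DLRs never reached the DLR script")
        for msg_id in lost[:20]:
            print(msg_id)
```

if the list comes back empty, the drop is probably happening after the DLR script has written its file rather than between kannel and the web server.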

here are the reports from the aggregator indicating that we're acknowledging the majority of DLR messages correctly:

FIRST SMPP SERVER:
less 2009-06-16.txt | grep -ai 'submit_sm,' | wc -l
4163

less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:ACKED' | wc -l
4225

less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:FAILED' | wc -l
683

less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:DELIVRD' | wc -l
3497

less 2009-06-16.txt | grep -ai 'deliver_sm_resp' | wc -l
8444



SECOND SMPP SERVER:
less 2009-06-16.txt | grep -ai 'submit_sm,' | wc -l
4251

less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:ACKED' | wc -l
4176

less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:FAILED' | wc -l
629

less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:DELIVRD' | wc -l
3519

less 2009-06-16.txt | grep -ai 'deliver_sm_resp' | wc -l
8401
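
adding those figures up as a sanity check (python, using only the counts above, and assuming the three stat values are the only DLR states that matter here):

```python
# counts copied from the aggregator's figures for 2009-06-16
reports = {
    "first": {"ACKED": 4225, "FAILED": 683, "DELIVRD": 3497, "deliver_sm_resp": 8444},
    "second": {"ACKED": 4176, "FAILED": 629, "DELIVRD": 3519, "deliver_sm_resp": 8401},
}


def tally(counts):
    """Return (total DLRs across the three states, responses minus that total)."""
    dlrs = counts["ACKED"] + counts["FAILED"] + counts["DELIVRD"]
    return dlrs, counts["deliver_sm_resp"] - dlrs


for name, counts in reports.items():
    dlrs, surplus = tally(counts)
    print(f"{name}: {dlrs} DLRs, {counts['deliver_sm_resp']} responses, surplus {surplus}")

# first: 8405 DLRs, 8444 responses, surplus 39
# second: 8324 DLRs, 8401 responses, surplus 77
```

so on both servers the number of deliver_sm_resp lines is at least as large as the number of DLRs in those three states, which is what i mean by acknowledging almost all of them. these counts don't explain the small surplus.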

cheers
iain
