hi all,
i have two problems that i'm hoping someone will be able to help me
investigate:
1) the other day, during a period of heavy activity, kannel died and
daemontools (which i use to monitor kannel and restart it if it happens to
die - usually works fine) was unable to restart it. however, once i
de-activated daemontools i was able to start kannel myself from the
command line. nothing appeared in the log files to indicate what error
was preventing the restart. i also can't figure out
why it would have died; i have my logrotate set to rotate when the logfile
reaches a certain size. is it possible that a logrotate event firing
during a period of heavy activity, and sending a hangup signal to the
bearerbox process (via its pid file), could have caused it to die?
2) i've recently noticed a tremendous increase in dropped DLR messages.
i've received a report from my aggregator which clearly shows that my
kannel install is acknowledging almost all DLR messages that are being
submitted back by my aggregator (see below). I ran a PHP script which
looped through 10,000 socket connections to write data to my server that
holds the DLR URL, and it ran in about 6 seconds without dropping any (the
DLR script on my server simply writes the information to a file for later
processing by a daemon which writes the information to the database, so as
to avoid the potential for high load to affect other areas of operation).
so it appears that, somehow, DLR messages are being dropped between my
kannel server and the web server receiving the DLR messages; how is that
possible? is there some way i can find out if that is what's happening?
here are the reports from the aggregator indicating that we're
acknowledging the majority of DLR messages correctly:
FIRST SMPP SERVER:
less 2009-06-16.txt | grep -ai 'submit_sm,' | wc -l
4163
less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:ACKED' | wc -l
4225
less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:FAILED' | wc -l
683
less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:DELIVRD' | wc -l
3497
less 2009-06-16.txt | grep -ai 'deliver_sm_resp' | wc -l
8444
SECOND SMPP SERVER:
less 2009-06-16.txt | grep -ai 'submit_sm,' | wc -l
4251
less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:ACKED' | wc -l
4176
less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:FAILED' | wc -l
629
less 2009-06-16.txt | grep -ai 'deliver_sm,' | grep -ai 'stat:DELIVRD' | wc -l
3519
less 2009-06-16.txt | grep -ai 'deliver_sm_resp' | wc -l
8401
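for anyone wanting to reproduce these per-status counts, here is a rough
sketch that tallies every stat: value in one pass instead of one grep per
status. it assumes only what the commands above imply about the log format:
each DLR line contains 'deliver_sm,' and a 'stat:<VALUE>' token. the
function name count_dlr_stats is made up for illustration.

```shell
# count_dlr_stats FILE (hypothetical helper)
# Prints one line per status, e.g. "stat:ACKED 4225".
# Assumes DLR lines contain "deliver_sm," and a "stat:<VALUE>" token.
count_dlr_stats() {
    # 'deliver_sm,' (with the comma) does not match deliver_sm_resp lines
    grep -ai 'deliver_sm,' "$1" \
        | grep -aio 'stat:[A-Za-z]*' \
        | sort | uniq -c \
        | awk '{ print $2, $1 }'   # put the status first, count second
}
```

usage would be along the lines of: count_dlr_stats 2009-06-16.txt
statuses that differ only in letter case are counted separately.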
cheers
iain