Hi Alexander. Thank you for making the patch. I will try to find the time to test it tomorrow.
/Jacob

On Wed, 2011-09-28 at 12:32 +0200, Alexander Malysh wrote:
> Hi Jakob,
>
> Sorry for the delay. Please try the attached patch, which should fix the
> pid file issue on restart and the issue with smsbox/wapbox shutting down
> when bearerbox has crashed.
>
> Let me know how it works for you and I will commit this fix.
>
> Thanks,
> Alexander Malysh
>
> On 31.08.2011 at 10:38, Jacob Eiler wrote:
>
> > Hi.
> >
> > Thank you for the feedback.
> >
> > On Tue, 2011-08-30 at 23:00 +0300, Nikos Balkanas wrote:
> >
> >> 3. If bearerbox is in parachute mode and restarted for any reason,
> >> smsbox goes down and stays down. This is caused by failed heartbeats.
> >> It can be fixed by introducing a small delay with retry in smsbox, if
> >> it realizes that the bearerbox connection is dropped.
> >
> > Actually the cause is the closing of the bearerbox connection, causing
> > read_from_bearerbox_real to return -1 and in turn the smsbox to
> > shut down.
> >
> >> (1) Could you please post panic logs? I don't think it is due to the
> >> pid file. In my experience web restart always fails, because it
> >> doesn't preserve the original location and execvp() can't find the
> >> executable. Solution is to save the original launch directory (i.e.
> >> "/usr/local/bin") and then do a chdir() just before you call execvp.
> >> You see, after initialization, bearerbox runs from "/" and all
> >> relative paths are broken. This could include the pid file as well.
> >
> > This is only the case when combined with --daemonize/-d. At any rate I
> > start Kannel with absolute paths for all parameters (executable,
> > pid-file, configuration-file).
> >
> > Here are the panic logs:
> >
> > ...
> > 2011-08-31 10:31:47 [3755] [0] DEBUG: MO concatenated message handling cleaned up
> > 2011-08-31 10:31:47 [3755] [0] INFO: Total WDP messages: received 0, sent 0
> > 2011-08-31 10:31:47 [3755] [0] INFO: Total SMS messages: received 0, dlr 0, sent 0, dlr 0
> > 2011-08-31 10:31:47 [3755] [0] DEBUG: Immutable octet strings: 257.
> > 2011-08-31 10:31:47 [3755] [0] PANIC: Could not open pid-file `/tmp/core.pid'
> > 2011-08-31 10:31:47 [3755] [0] PANIC: System error 17: File exists
> > 2011-08-31 10:31:47 [3755] [0] PANIC: ./gw/bearerbox(gw_panic+0xbd) [0x80f1dcd]
> > 2011-08-31 10:31:47 [3755] [0] PANIC: ./gw/bearerbox(get_and_set_debugs+0xaeb) [0x80fe9cb]
> > 2011-08-31 10:31:47 [3755] [0] PANIC: ./gw/bearerbox(main+0x87) [0x80530a7]
> > 2011-08-31 10:31:47 [3755] [0] PANIC: /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0xb737be37]
> > 2011-08-31 10:31:47 [3755] [0] PANIC: ./gw/bearerbox() [0x8052a41]
> >
> >> (2). Same solution as in (3). Introduce a small delay to smsbox.
> >> However, for this you don't need to modify the sources, you can just
> >> delay it in the init script.
> >
> > Yes, provided I just use the parachute option and an init-script.
> >
> > In my opinion it makes little sense to provide a 'restart' command, when
> > in fact only the bearerbox is restarted and all other boxes simply shut
> > down.
> >
> > /Jacob
> >
> > --
> > Jacob Eiler
> > Apide ApS
> > e: [email protected]
> > t: +45 2374 0486
> > w: apide.com

-- 
Jacob Eiler
Apide ApS
e: [email protected]
t: +45 2374 0486
w: apide.com
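
For anyone following the thread: "System error 17: File exists" in the panic log is EEXIST, which is what open() returns when a file is created exclusively and a file from the previous run is still there. The C sketch below shows one way a restart could tolerate a leftover pid file. It is only an illustration under that assumption; it is not taken from the Kannel sources or from the attached patch, and the names write_pid_file and pid_file_in_use are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the pid recorded in `path` belongs to
 * another live process, 0 if the file is stale or our own, -1 if the file
 * cannot be read. */
static int pid_file_in_use(const char *path)
{
    FILE *f = fopen(path, "r");
    long pid = 0;
    int n;

    if (f == NULL)
        return -1;
    n = fscanf(f, "%ld", &pid);
    fclose(f);

    if (n != 1 || pid <= 0)
        return 0;                      /* empty or garbage: treat as stale */
    if ((pid_t) pid == getpid())
        return 0;                      /* our own pid is kept across execvp() */
    if (kill((pid_t) pid, 0) == 0 || errno == EPERM)
        return 1;                      /* another process still owns the file */
    return 0;                          /* no such process: owner is gone */
}

/* Hypothetical helper: create the pid file exclusively, replacing it only
 * when the existing one is stale. Returns 0 on success, -1 on failure. */
int write_pid_file(const char *path)
{
    char buf[32];
    int fd = -1;
    int len, attempt;

    for (attempt = 0; attempt < 2; attempt++) {
        fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd >= 0)
            break;
        if (errno != EEXIST || pid_file_in_use(path) != 0)
            return -1;                 /* real error, or file is in use */
        if (unlink(path) != 0 && errno != ENOENT)
            return -1;                 /* could not remove the stale file */
    }
    if (fd < 0)
        return -1;

    len = snprintf(buf, sizeof buf, "%ld\n", (long) getpid());
    if (write(fd, buf, (size_t) len) != (ssize_t) len) {
        close(fd);
        unlink(path);
        return -1;
    }
    return close(fd);
}
```

Calling write_pid_file("/tmp/core.pid") a second time from the same process succeeds instead of failing with EEXIST, because the recorded pid matches the caller's own.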
