Hi Alexander.

Thank you for making the patch. I will try to find the time to test it
tomorrow.

/Jacob

On Wed, 2011-09-28 at 12:32 +0200, Alexander Malysh wrote:
> Hi Jacob,
> 
> Sorry for the delay. Please try the attached patch, which should fix the
> pid-file issue on restart and the issue with smsbox/wapbox shutting down
> when bearerbox has crashed.
> 
> Let me know how it works for you and I will commit this fix.
> 
> Thanks,
> Alexander Malysh
> 
> On 31.08.2011 at 10:38, Jacob Eiler wrote:
> 
> > Hi.
> > 
> > Thank you for the feedback.
> > 
> > On Tue, 2011-08-30 at 23:00 +0300, Nikos Balkanas wrote:
> > 
> >> 3. If bearerbox is in parachute mode and restarted for any reason,
> >> smsbox goes down and stays down. This is caused by failed heartbeats.
> >> It can be fixed by introducing a small delay with retry in smsbox, if
> >> it realizes that the bearerbox connection is dropped.
> > 
> > Actually the cause is the closing of the bearerbox-connection, causing
> > read_from_bearerbox_real to return -1 and in turn the smsbox to
> > shutdown.
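
A minimal sketch of the retry-with-delay idea discussed above. It is not Kannel's code: `reconnect_with_retry`, the `connect_fn` callback type and `flaky_connect` are hypothetical names, and the real smsbox would have to hook something like this in where read_from_bearerbox_real reports -1.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success, -1 on failure. */
typedef int (*connect_fn)(void *ctx);

/*
 * Try to re-establish a connection up to max_attempts times, sleeping
 * delay_seconds between failed attempts. Returns the 1-based number of
 * the attempt that succeeded, or -1 if every attempt failed.
 */
static int reconnect_with_retry(connect_fn try_connect, void *ctx,
                                int max_attempts, unsigned int delay_seconds)
{
    int attempt;

    assert(try_connect != NULL && max_attempts > 0);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_connect(ctx) == 0)
            return attempt;
        fprintf(stderr, "reconnect attempt %d of %d failed\n",
                attempt, max_attempts);
        if (attempt < max_attempts)
            sleep(delay_seconds);   /* give the peer time to come back */
    }
    return -1;
}

/* Demonstration callback: fails until the counter in ctx reaches zero. */
static int flaky_connect(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return -1;
    }
    return 0;
}
```

With a counter of 2, `reconnect_with_retry(flaky_connect, &counter, 5, 1)` succeeds on the third attempt. Only when all attempts fail would the box go on to shut down as it does today.
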
> > 
> > 
> >> (1) Could you please post panic logs? I don't think it is due to the
> >> pid file. In my experience web restart fails always, because it
> >> doesn't preserve the original location and execvp() can't find the
> >> executable. Solution is to save original launch directory (i.e.
> >> "/usr/local/bin") and then do a chdir() just before you call execvp.
> >> You see, after initialization, bearerbox runs from "/" and all
> >> relative paths are broken. This could include the pid file as well.
> > 
> > This is only the case when combined with --daemonize/-d. At any rate I
> > start Kannel with absolute paths for all parameters (executable,
> > pid-file, configuration-file).
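
For reference, the chdir()-before-execvp() suggestion quoted above could look roughly like this. It is a sketch under assumptions, not Kannel's restart code: `save_launch_dir`, `restore_launch_dir`, `in_launch_dir`, `restart_self` and the 4096-byte buffer are all made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define LAUNCH_DIR_MAX 4096   /* assumed upper bound for the path length */

static char launch_dir[LAUNCH_DIR_MAX];

/* Call once at startup, before daemonizing changes the working directory. */
static int save_launch_dir(void)
{
    return getcwd(launch_dir, sizeof(launch_dir)) != NULL ? 0 : -1;
}

/* Return to the saved directory so relative paths resolve again. */
static int restore_launch_dir(void)
{
    if (launch_dir[0] == '\0')
        return -1;              /* save_launch_dir() was never called */
    return chdir(launch_dir);
}

/* Returns 1 if the current directory is the saved one, 0 if not, -1 on error. */
static int in_launch_dir(void)
{
    char cwd[LAUNCH_DIR_MAX];

    if (getcwd(cwd, sizeof(cwd)) == NULL)
        return -1;
    return strcmp(cwd, launch_dir) == 0;
}

/* Re-exec the program with its original arguments. Returns only on failure. */
static int restart_self(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    if (restore_launch_dir() == -1) {
        perror("chdir");
        return -1;
    }
    execvp(argv[0], argv);
    perror("execvp");
    return -1;
}
```

When every path is absolute, as in my setup, the chdir() makes no difference, which is why I don't think it explains the panic.
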
> > 
> > Here are the panic logs:
> > 
> > ...
> > 2011-08-31 10:31:47 [3755] [0] DEBUG: MO concatenated message handling
> > cleaned up
> > 2011-08-31 10:31:47 [3755] [0] INFO: Total WDP messages: received 0,
> > sent 0
> > 2011-08-31 10:31:47 [3755] [0] INFO: Total SMS messages: received 0, dlr
> > 0, sent 0, dlr 0
> > 2011-08-31 10:31:47 [3755] [0] DEBUG: Immutable octet strings: 257.
> > 2011-08-31 10:31:47 [3755] [0] PANIC: Could not open pid-file
> > `/tmp/core.pid'
> > 2011-08-31 10:31:47 [3755] [0] PANIC: System error 17: File exists
> > 2011-08-31 10:31:47 [3755] [0] PANIC: ./gw/bearerbox(gw_panic+0xbd)
> > [0x80f1dcd]
> > 2011-08-31 10:31:47 [3755] [0] PANIC: ./gw/bearerbox(get_and_set_debugs
> > +0xaeb) [0x80fe9cb]
> > 2011-08-31 10:31:47 [3755] [0] PANIC: ./gw/bearerbox(main+0x87)
> > [0x80530a7]
> > 2011-08-31 10:31:47 [3755] [0]
> > PANIC: /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xe7)
> > [0xb737be37]
> > 2011-08-31 10:31:47 [3755] [0] PANIC: ./gw/bearerbox() [0x8052a41]
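
System error 17 is EEXIST on Linux, which open() reports when O_CREAT and O_EXCL are both set and the file already exists. A process keeps its pid across execvp(), so after a restart the old pid file still names the running process. Below is a rough sketch of one way to tolerate that. It is an assumption about a possible fix, not the contents of any actual patch, and `read_pid_file`, `pid_file_is_stale` and `write_pid_file` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the pid stored in the file, or -1 if it cannot be read. */
static long read_pid_file(const char *path)
{
    FILE *f = fopen(path, "r");
    long pid = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &pid) != 1)
        pid = -1;
    fclose(f);
    return pid;
}

/*
 * A pid file counts as stale if it is unreadable, names this very process
 * (left over from before a re-exec), or names a process that no longer exists.
 */
static int pid_file_is_stale(const char *path)
{
    long pid = read_pid_file(path);

    if (pid <= 0)
        return 1;
    if (pid == (long)getpid())
        return 1;
    return kill((pid_t)pid, 0) == -1 && errno == ESRCH;
}

/* Create the pid file exclusively, replacing it only if it is stale. */
static int write_pid_file(const char *path)
{
    char buf[32];
    int fd, len;
    ssize_t written;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1 && errno == EEXIST && pid_file_is_stale(path)) {
        if (unlink(path) == -1 && errno != ENOENT)
            return -1;
        fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    }
    if (fd == -1)
        return -1;   /* another live process owns the file, or I/O error */

    len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
    written = write(fd, buf, (size_t)len);
    close(fd);
    return written == (ssize_t)len ? 0 : -1;
}
```

Calling `write_pid_file("/tmp/core.pid")` twice from the same process then succeeds both times, while a file owned by a different live process is still refused. The alternative is simply to remove the pid file just before the re-exec.
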
> > 
> > 
> > 
> >> (2). Same solution as in (3). Introduce a small delay to smsbox.
> >> However, for this you don't need to modify the sources, you can just
> >> delay it in the init script.
> > 
> > Yes, provided I just use the parachute option and an init-script.
> > 
> > In my opinion it makes little sense to provide a 'restart' command, when
> > in fact only the bearerbox is restarted and all other boxes simply shut
> > down. 
> > 
> > /Jacob
> > 
> > 
> > -- 
> > Jacob Eiler
> > Apide ApS
> > e: [email protected]
> > t: +45 2374 0486
> > w: apide.com
> > 
> > 
> > 
> 

-- 
Jacob Eiler
Apide ApS
e: [email protected]
t: +45 2374 0486
w: apide.com


