I remember that Alejandro Guerrieri was working on a DB-store solution a
while ago. Would spool still be a better approach?
Regards,
Dante

2009/9/11 Nikos Balkanas <[email protected]>

> Yes, but I still haven't tried Reiser! By all means stay away from ZFS
> (Solaris). Despite all the hype, it is much slower than plain UFS.
>
> Still, spool is much more efficient than file. Keep in mind that the file
> store has to lock the sms-list while it rewrites the store-file from
> scratch, and with long queues this can take some time.
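
A toy sketch of the difference described above, in Python rather than Kannel's C. It is not Kannel's actual code: the function names, the on-disk layout and the `queue_lock` are all made up for illustration. It only assumes what the paragraph states (the file store rewrites the whole store under a lock) plus the assumption that the spool store keeps one small file per message.

```python
import os
import tempfile
import threading

# Hypothetical stand-in for the lock that guards the in-memory sms-list.
queue_lock = threading.Lock()

def file_store_dump(path, queue):
    """Toy 'file' store: rewrite the whole store file on every dump."""
    with queue_lock:                      # list stays locked for the full rewrite
        tmp = path + ".new"
        with open(tmp, "w") as f:
            for msg_id, body in queue.items():
                f.write(f"{msg_id}\t{body}\n")
        os.replace(tmp, path)             # swap the fresh dump into place
        return len(queue)                 # records written for this one dump

def spool_store_save(spool_dir, msg_id, body):
    """Toy 'spool' store: one small file per message, no full rewrite."""
    with open(os.path.join(spool_dir, msg_id), "w") as f:
        f.write(body)
    return 1                              # records written for this one message

def demo(n):
    """Queue n messages, add one more, and report the write cost of each store."""
    queue = {f"msg{i:06d}": "hello" for i in range(n)}
    with tempfile.TemporaryDirectory() as root:
        spool_dir = os.path.join(root, "spool")
        os.mkdir(spool_dir)
        for msg_id, body in queue.items():
            spool_store_save(spool_dir, msg_id, body)
        queue["new"] = "hi"               # one more message arrives
        file_cost = file_store_dump(os.path.join(root, "store"), queue)
        spool_cost = spool_store_save(spool_dir, "new", "hi")
        return file_cost, spool_cost
```

In this model the cost of persisting one new message grows with the queue length for the file store and stays constant for the spool store, which is also why the spool store leans so heavily on the filesystem.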
>
> BR,
> Nikos
>
> ----- Original Message -----
> *From:* Alejandro Guerrieri <[email protected]>
> *To:* Dante Moreno <[email protected]>
> *Cc:* Nikos Balkanas <[email protected]> ; [email protected]
> *Sent:* Friday, September 11, 2009 4:43 PM
> *Subject:* Re: PANIC bearerbox cvs-20090902
>
> Yes, I've benchmarked ext3, ext2 and xfs, and ext2 is by far the best
> performing filesystem for the spool store.
> At least in my experience, ext3 gets very sluggish on stores over 50K.
> As for xfs, although it loads the store considerably faster than ext3, it
> increased the load on my system under heavy traffic.
>
> IMHO, ext2 is the way to go. However, if you're sustaining heavy traffic,
> the spool store stresses the filesystem a lot, so I'd recommend using a
> dedicated ext2 partition for the store: you'd still use the more reliable
> ext3 for the OS while getting the speed of ext2 where it's needed.
> Furthermore, if the ext2 partition crashes you'd be able to unmount it and
> repair it without rebooting the box.
>
> Regards,
>   --
> Alejandro Guerrieri
> [email protected]
>
>
>
>  On 11/09/2009, at 15:34, Dante Moreno wrote:
>
> Hi Nikos,
> I can't use the spool store-type right now since kannel runs on an ext3
> filesystem. I remember reading that there are performance issues when you
> combine a large queue, the spool store and ext3. If I have no other choice,
> I can repartition the system and create an ext2 or xfs partition just for
> the queue. However, I want to do that only as a last resort (and hope that
> the problem really doesn't happen again). On the other hand, there are a
> couple of good free SMPP SMSC simulators: SMPPSim (free and open source) or
> Logica's simulator, for example.
> Regards,
> Dante
>
> 2009/9/11 Nikos Balkanas <[email protected]>
>
>>  Hi,
>>
>> I cannot find anything wrong with the code at that point. However, it
>> looks like memory corruption. Could you please use spool instead of file? It
>> is safer, more efficient and faster than file. In addition it uses different
>> memory structures than file and you should get away with it.
>>
>> I will update and run valgrind on it over the weekend. Unfortunately, I
>> don't have smsc connections, but I hope I can catch the problem with fake
>> smsc. If not, someone else from the list will have to look at it.
>>
>> BR,
>> Nikos
>>
>>  ----- Original Message -----
>> *From:* Nikos Balkanas <[email protected]>
>> *To:* Dante Moreno <[email protected]>
>> *Cc:* [email protected]
>>   *Sent:* Thursday, September 10, 2009 10:44 PM
>> *Subject:* Re: PANIC bearerbox cvs-20090902
>>
>> Hi,
>>
>> How can you say they are the same? Even the problem is different this
>> time.
>>
>> Anyway, I'll have to look at it.
>>
>> BR,
>> Nikos
>>
>> ----- Original Message -----
>>
>> *From:* Dante Moreno <[email protected]>
>> *To:* Nikos Balkanas <[email protected]>
>> *Cc:* [email protected]
>> *Sent:* Thursday, September 10, 2009 8:12 PM
>> *Subject:* Re: PANIC bearerbox cvs-20090902
>>
>> Hi Nikos,
>> Thanks for answering. First of all, I'm using the file store-type option
>> and there is plenty of free disk space. I'm using the latest
>> CVS (cvs-20090902). Here are the logs and the addr2line output:
>>
>> 2009-09-10 09:10:40 [27966] [18] PANIC: gwlib/octstr.c:2484: seems_valid_real: Assertion `ostr != NULL' failed. (Called from gwlib/octstr.c:874:octstr_compare.)
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox(gw_panic+0x15b) [0x4833db]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x483c59]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox(octstr_compare+0x20) [0x488800]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x477292]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox(gwlist_search+0x54) [0x4811d4]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox(dict_get+0x35) [0x4772d5]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x4175f0]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x4177ef]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x417df5]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x47a2f5]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: /lib64/libpthread.so.0 [0x3781e06307]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: /lib64/libc.so.6(clone+0x6d) [0x37812d1ded]
>>
>>
>> and here the addr2line output:
>>
>> addr2line -e /gateway-1.4.3_cvs_20090902/gw/bearerbox 0x4833db 0x483c59 0x488800 0x477292 0x4811d4 0x4772d5 0x4175f0 0x4177ef 0x417df5 0x47a2f5 0x3781e06307 0x37812d1ded
>>
>> /gateway-1.4.3_cvs_20090902/gwlib/log.c:541
>> /gateway-1.4.3_cvs_20090902/gwlib/octstr.c:2483
>> /gateway-1.4.3_cvs_20090902/gwlib/octstr.c:875
>> /gateway-1.4.3_cvs_20090902/gwlib/dict.c:103
>> /gateway-1.4.3_cvs_20090902/gwlib/list.c:472
>> /gateway-1.4.3_cvs_20090902/gwlib/dict.c:298
>> /gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:196
>> /gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:571
>> /gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:236
>> /gateway-1.4.3_cvs_20090902/gwlib/gwthread-pthread.c:135
>> ??:0
>> ??:0
>>
>>
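
Collecting the addresses by hand is error-prone, so here is a small Python sketch that pulls them out of the PANIC lines and builds the addr2line command line. The helper names are hypothetical. It assumes only the log format shown above: frame addresses appear as `[0x...]`, possibly on a wrapped continuation line, while the pid and thread id in brackets are plain decimal numbers.

```python
import re

# Frame addresses look like [0x4833db]; pid/thread ids such as [27966]
# or [18] have no 0x prefix and are therefore not matched.
ADDR_RE = re.compile(r"\[(0x[0-9a-fA-F]+)\]")

def panic_addresses(log_text):
    """Return the frame addresses found in a PANIC backtrace, in order."""
    return ADDR_RE.findall(log_text)

def addr2line_command(binary, log_text):
    """Build the argument list for addr2line, using the -e option as above."""
    return ["addr2line", "-e", binary] + panic_addresses(log_text)

def same_trace(log_a, log_b):
    """True if two backtraces list exactly the same frame addresses."""
    return panic_addresses(log_a) == panic_addresses(log_b)
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)`. Note that `same_trace` is only meaningful when both logs come from the same binary, since addresses shift between builds.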
>> The line numbers seem to be the same as before.
>> Regards,
>> Dante
>>
>>
>> 2009/9/10 Nikos Balkanas <[email protected]>
>>
>>> Hi,
>>>
>>> No, this is the right place for debugger info.
>>>
>>> First make sure that your partition is not getting full and kannel has
>>> space to write the Q.
>>> Seems you are using spool type for Q storage and it runs out of unique
>>> hash strings. But I cannot be sure, since your addr2line output is from an
>>> older CVS and reports wrong line numbers.
>>>
>>> Please update to latest CVS and repost.
>>>
>>> BR,
>>> Nikos
>>>
>>> ----- Original Message -----
>>> *From:* Dante Moreno <[email protected]>
>>> *To:* [email protected]
>>> *Sent:* Thursday, September 10, 2009 5:20 PM
>>> *Subject:* Re: PANIC bearerbox cvs-20090902
>>>
>>> Maybe I should post this to the users list? We are now facing this
>>> problem on a daily basis. Any help would be greatly appreciated.
>>> Regards,
>>> Dante
>>>
>>> 2009/9/8 Dante Moreno <[email protected]>
>>>
>>>> Hi, we are using the latest CVS and have found these PANIC bugs. This
>>>> has happened to us 3 times in around two weeks. We are not able to
>>>> reproduce them. The only thing we know is that it seems to happen when
>>>> the store size is very large (100,000+ messages). We are using the
>>>> "file" store type. Below are the bug reports:
>>>>
>>>> The first one is:
>>>>
>>>> 2009-08-14 12:29:12 [4472] [15] DEBUG: boxc_receiver: sms received
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: gwlib/octstr.c:2505: seems_valid_real: Assertion `ostr->data[ostr->len] == '\0'' failed. (Called from gwlib/octstr.c:343:octstr_len.)
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox(gw_panic+0x15b) [0x4830db]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x4837a5]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox(octstr_len+0x1f) [0x483aef]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox(octstr_hash_key+0x2f) [0x483b8f]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x476e8c]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox(dict_get+0x1c) [0x476fbc]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x4175b0]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x4177af]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x417db5]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x479ff5]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: /lib64/libpthread.so.0 [0x3781e06307]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: /lib64/libc.so.6(clone+0x6d) [0x37812d1ded]
>>>>
>>>>
>>>> addr2line -e /gateway-1.4.3/gw/bearerbox 0x4830db 0x4837a5 0x483aef 0x483b8f 0x476e8c 0x476fbc 0x4175b0 0x4177af 0x417db5 0x479ff5 0x3781e06307 0x37812d1ded
>>>>
>>>> /gateway-1.4.3/gwlib/log.c:541
>>>> /gateway-1.4.3/gwlib/octstr.c:2507
>>>> /gateway-1.4.3/gwlib/octstr.c:344
>>>> /gateway-1.4.3/gwlib/octstr.c:2468
>>>> /gateway-1.4.3/gwlib/dict.c:139
>>>> /gateway-1.4.3/gwlib/dict.c:294
>>>> /gateway-1.4.3/gw/bb_store_file.c:196
>>>> /gateway-1.4.3/gw/bb_store_file.c:571
>>>> /gateway-1.4.3/gw/bb_store_file.c:236
>>>> /gateway-1.4.3/gwlib/gwthread-pthread.c:135
>>>> ??:0
>>>> ??:0
>>>>
>>>>
>>>> And the second one which happened today:
>>>>
>>>> 2009-09-08 09:56:01 [25766] [18] PANIC: gwlib/octstr.c:2484: seems_valid_real: Assertion `ostr != NULL' failed. (Called from gwlib/octstr.c:874:octstr_compare.)
>>>> 2009-09-08 09:56:02 [25766] [21] DEBUG: boxc_receiver: sms received
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(gw_panic+0x15b) [0x4833db]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x483c59]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(octstr_compare+0x20) [0x488800]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x477292]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(gwlist_search+0x54) [0x4811d4]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(dict_get+0x35) [0x4772d5]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x4175f0]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x4177ef]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x417df5]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x47a2f5]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: /lib64/libpthread.so.0 [0x3781e06307]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: /lib64/libc.so.6(clone+0x6d) [0x37812d1ded]
>>>>
>>>>
>>>>
>>>> addr2line -e gateway-1.4.3_cvs_20090902/gw/bearerbox 0x4833db 0x483c59 0x488800 0x477292 0x4811d4 0x4772d5 0x4175f0 0x4177ef 0x417df5 0x47a2f5 0x3781e06307 0x37812d1ded
>>>>
>>>> gateway-1.4.3_cvs_20090902/gwlib/log.c:541
>>>> gateway-1.4.3_cvs_20090902/gwlib/octstr.c:2483
>>>> gateway-1.4.3_cvs_20090902/gwlib/octstr.c:875
>>>> gateway-1.4.3_cvs_20090902/gwlib/dict.c:103
>>>> gateway-1.4.3_cvs_20090902/gwlib/list.c:472
>>>> gateway-1.4.3_cvs_20090902/gwlib/dict.c:298
>>>> gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:196
>>>> gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:571
>>>> gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:236
>>>> gateway-1.4.3_cvs_20090902/gwlib/gwthread-pthread.c:135
>>>> ??:0
>>>> ??:0
>>>>
>>>>
>>>> Also, for some strange reason, after the PANIC bearerbox restarts
>>>> itself (parachute) but smsbox doesn't.
>>>> Could anybody please give me a hint on how to solve these issues?
>>>>
>>>> Regards,
>>>> Dante
>>>>
>>>
>>>
>>
>
>
