I remember that Alejandro Guerrieri was working on a DB-store solution a
while ago. Would spool still be a better approach?

Regards,
Dante

2009/9/11 Nikos Balkanas <[email protected]>

> Yes, but I still haven't tried Reiser! By all means stay away from zfs
> (solaris). Despite all the hype, it is much slower than plain ufs.
>
> Still spool is much more efficient than file. Imagine that file store has
> to lock the sms-list while it is writing the store-file from scratch. And
> with long Qs this could take some time.
>
> BR,
> Nikos
>
> ----- Original Message -----
> *From:* Alejandro Guerrieri <[email protected]>
> *To:* Dante Moreno <[email protected]>
> *Cc:* Nikos Balkanas <[email protected]> ; [email protected]
> *Sent:* Friday, September 11, 2009 4:43 PM
> *Subject:* Re: PANIC bearerbox cvs-20090902
>
> Yes, I've benchmarked ext3, ext2 and xfs and ext2 is by far the best
> performing filesystem for spool store.
> At least in my experience, ext3 gets very sluggish on stores over 50K.
> Regarding xfs, despite being quite a bit faster than ext3 in loading the
> store, it increased the load on my system under heavy traffic.
>
> IMHO, ext2 is the way to go. However, if you're sustaining heavy traffic
> the spool store stresses the filesystem a lot so I'd recommend you use a
> dedicated ext2 partition for the store: you'd still use the more reliable
> ext3 for the OS while getting the speed of ext2 where it's needed.
> Furthermore, if the ext2 partition crashes you'd be able to unmount it and
> repair it without rebooting the box.
>
> Regards,
> --
> Alejandro Guerrieri
> [email protected]
>
> On 11/09/2009, at 15:34, Dante Moreno wrote:
>
> Hi Nikos,
> I can't use spool store-type right now since kannel runs on an ext3
> filesystem. I remember reading that there were performance issues if you
> have a large queue+spool+ext3. If I have no other choice, I can partition
> the system and create an ext2 or xfs partition just for the queue. However
> I want to do that only as a last resort (and hope that the problem really
> doesn't happen again).
> On the other hand, there are a couple of good free
> smpp smsc simulators. SMPPSim (free and open source) or Logica's simulator
> for example.
> Regards,
> Dante
>
> 2009/9/11 Nikos Balkanas <[email protected]>
>
>> Hi,
>>
>> I cannot find anything wrong with the code at that point. However, it
>> looks like memory corruption. Could you please use spool instead of file?
>> It is safer, more efficient and faster than file. In addition it uses
>> different memory structures than file and you should get away with it.
>>
>> I will update and run valgrind on it over the weekend. Unfortunately, I
>> don't have smsc connections, but I hope I can catch the problem with fake
>> smsc. If not, someone else from the list will have to look at it.
>>
>> BR,
>> Nikos
>>
>> ----- Original Message -----
>> *From:* Nikos Balkanas <[email protected]>
>> *To:* Dante Moreno <[email protected]>
>> *Cc:* [email protected]
>> *Sent:* Thursday, September 10, 2009 10:44 PM
>> *Subject:* Re: PANIC bearerbox cvs-20090902
>>
>> Hi,
>>
>> How can you say they are the same? Even the problem is different this
>> time.
>>
>> Anyway I'll have to look at it.
>>
>> BR,
>> Nikos
>>
>> ----- Original Message -----
>> *From:* Dante Moreno <[email protected]>
>> *To:* Nikos Balkanas <[email protected]>
>> *Cc:* [email protected]
>> *Sent:* Thursday, September 10, 2009 8:12 PM
>> *Subject:* Re: PANIC bearerbox cvs-20090902
>>
>> Hi Nikos,
>> Thanks for answering. First of all, I'm using the file store-type option
>> and there is plenty of free disk space. I'm using the latest
>> CVS (cvs-20090902). Here are the logs+addr2line output:
>>
>> 2009-09-10 09:10:40 [27966] [18] PANIC: gwlib/octstr.c:2484: seems_valid_real: Assertion `ostr != NULL' failed. (Called from gwlib/octstr.c:874:octstr_compare.)
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox(gw_panic+0x15b) [0x4833db]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x483c59]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox(octstr_compare+0x20) [0x488800]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x477292]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox(gwlist_search+0x54) [0x4811d4]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox(dict_get+0x35) [0x4772d5]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x4175f0]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x4177ef]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x417df5]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: bearerbox [0x47a2f5]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: /lib64/libpthread.so.0 [0x3781e06307]
>> 2009-09-10 09:10:40 [27966] [18] PANIC: /lib64/libc.so.6(clone+0x6d) [0x37812d1ded]
>>
>> and here the addr2line output:
>>
>> addr2line -e /gateway-1.4.3_cvs_20090902/gw/bearerbox 0x4833db 0x483c59 0x488800 0x477292 0x4811d4 0x4772d5 0x4175f0 0x4177ef 0x417df5 0x47a2f5 0x3781e06307 0x37812d1ded
>> /gateway-1.4.3_cvs_20090902/gwlib/log.c:541
>> /gateway-1.4.3_cvs_20090902/gwlib/octstr.c:2483
>> /gateway-1.4.3_cvs_20090902/gwlib/octstr.c:875
>> /gateway-1.4.3_cvs_20090902/gwlib/dict.c:103
>> /gateway-1.4.3_cvs_20090902/gwlib/list.c:472
>> /gateway-1.4.3_cvs_20090902/gwlib/dict.c:298
>> /gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:196
>> /gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:571
>> /gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:236
>> /gateway-1.4.3_cvs_20090902/gwlib/gwthread-pthread.c:135
>> ??:0
>> ??:0
>>
>> The line numbers seem to be the same as before.
>> Regards,
>> Dante
>>
>> 2009/9/10 Nikos Balkanas <[email protected]>
>>
>>> Hi,
>>>
>>> No, this is the right place for debugger info.
>>>
>>> First make sure that your partition is not getting full and kannel has
>>> space to write the Q.
>>> Seems you are using spool type for Q storage and it runs out of unique
>>> hash strings. But I cannot be sure, since your addr2line output is from an
>>> older CVS and reports wrong line numbers.
>>>
>>> Please update to latest CVS and repost.
>>>
>>> BR,
>>> Nikos
>>>
>>> ----- Original Message -----
>>> *From:* Dante Moreno <[email protected]>
>>> *To:* [email protected]
>>> *Sent:* Thursday, September 10, 2009 5:20 PM
>>> *Subject:* Re: PANIC bearerbox cvs-20090902
>>>
>>> Maybe I should post this to the users list? We are now facing this
>>> problem on a daily basis. Any help would be greatly appreciated.
>>> Regards,
>>> Dante
>>>
>>> 2009/9/8 Dante Moreno <[email protected]>
>>>
>>>> Hi, We are using the latest CVS and have found these PANIC bugs. This
>>>> has happened to us 3 times in around two weeks. We are not able to
>>>> reproduce them. The only thing we know is that it seems to happen when
>>>> the store size is very large (100,000+ messages). We are using the
>>>> "file" store type. Below are the bug reports:
>>>>
>>>> The first one is:
>>>>
>>>> 2009-08-14 12:29:12 [4472] [15] DEBUG: boxc_receiver: sms received
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: gwlib/octstr.c:2505: seems_valid_real: Assertion `ostr->data[ostr->len] == '\0'' failed. (Called from gwlib/octstr.c:343:octstr_len.)
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox(gw_panic+0x15b) [0x4830db]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x4837a5]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox(octstr_len+0x1f) [0x483aef]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox(octstr_hash_key+0x2f) [0x483b8f]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x476e8c]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox(dict_get+0x1c) [0x476fbc]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x4175b0]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x4177af]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x417db5]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: bearerbox [0x479ff5]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: /lib64/libpthread.so.0 [0x3781e06307]
>>>> 2009-08-14 12:29:13 [4472] [14] PANIC: /lib64/libc.so.6(clone+0x6d) [0x37812d1ded]
>>>>
>>>> addr2line -e /gateway-1.4.3/gw/bearerbox 0x4830db 0x4837a5 0x483aef 0x483b8f 0x476e8c 0x476fbc 0x4175b0 0x4177af 0x417db5 0x479ff5 0x3781e06307 0x37812d1ded
>>>> /gateway-1.4.3/gwlib/log.c:541
>>>> /gateway-1.4.3/gwlib/octstr.c:2507
>>>> /gateway-1.4.3/gwlib/octstr.c:344
>>>> /gateway-1.4.3/gwlib/octstr.c:2468
>>>> /gateway-1.4.3/gwlib/dict.c:139
>>>> /gateway-1.4.3/gwlib/dict.c:294
>>>> /gateway-1.4.3/gw/bb_store_file.c:196
>>>> /gateway-1.4.3/gw/bb_store_file.c:571
>>>> /gateway-1.4.3/gw/bb_store_file.c:236
>>>> /gateway-1.4.3/gwlib/gwthread-pthread.c:135
>>>> ??:0
>>>> ??:0
>>>>
>>>> And the second one which happened today:
>>>>
>>>> 2009-09-08 09:56:01 [25766] [18] PANIC: gwlib/octstr.c:2484: seems_valid_real: Assertion `ostr != NULL' failed. (Called from gwlib/octstr.c:874:octstr_compare.)
>>>> 2009-09-08 09:56:02 [25766] [21] DEBUG: boxc_receiver: sms received
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(gw_panic+0x15b) [0x4833db]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x483c59]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(octstr_compare+0x20) [0x488800]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x477292]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(gwlist_search+0x54) [0x4811d4]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(dict_get+0x35) [0x4772d5]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x4175f0]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x4177ef]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x417df5]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x47a2f5]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: /lib64/libpthread.so.0 [0x3781e06307]
>>>> 2009-09-08 09:56:02 [25766] [18] PANIC: /lib64/libc.so.6(clone+0x6d) [0x37812d1ded]
>>>>
>>>> addr2line -e gateway-1.4.3_cvs_20090902/gw/bearerbox 0x4833db 0x483c59 0x488800 0x477292 0x4811d4 0x4772d5 0x4175f0 0x4177ef 0x417df5 0x47a2f5 0x3781e06307 0x37812d1ded
>>>> gateway-1.4.3_cvs_20090902/gwlib/log.c:541
>>>> gateway-1.4.3_cvs_20090902/gwlib/octstr.c:2483
>>>> gateway-1.4.3_cvs_20090902/gwlib/octstr.c:875
>>>> gateway-1.4.3_cvs_20090902/gwlib/dict.c:103
>>>> gateway-1.4.3_cvs_20090902/gwlib/list.c:472
>>>> gateway-1.4.3_cvs_20090902/gwlib/dict.c:298
>>>> gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:196
>>>> gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:571
>>>> gateway-1.4.3_cvs_20090902/gw/bb_store_file.c:236
>>>> gateway-1.4.3_cvs_20090902/gwlib/gwthread-pthread.c:135
>>>> ??:0
>>>> ??:0
>>>>
>>>> Also, for some strange reason, after the PANIC bearerbox restarts
>>>> itself (parachute) but smsbox doesn't.
>>>> Could anybody please give me a hint on how to solve these issues?
>>>>
>>>> Regards,
>>>> Dante
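
For anyone reposting a backtrace like the ones quoted in this thread: the addresses passed to addr2line were copied by hand from the bracketed values at the end of each PANIC frame. A minimal sketch of automating that step follows. The function names and the sample text are only illustrative (they are not part of Kannel), and it assumes the log passed in contains a single panic, since otherwise the frames of several panics would be concatenated.

```python
import re
import subprocess

# Matches the bracketed return address that closes each backtrace frame,
# e.g. "[0x4833db]". Pid/thread fields such as "[27966]" carry no 0x prefix
# and are therefore skipped.
FRAME_ADDR = re.compile(r"\[(0x[0-9a-fA-F]+)\]")


def extract_addresses(log_text):
    """Return the frame addresses in the order they appear in the log."""
    return FRAME_ADDR.findall(log_text)


def addr2line_command(binary, addresses):
    """Build the same 'addr2line -e <binary> <addr>...' call used in the thread."""
    return ["addr2line", "-e", binary, *addresses]


def resolve(binary, log_text):
    """Run addr2line and return one 'file:line' string per frame.

    Needs binutils installed and the exact binary that produced the log;
    a binary from a different build yields misleading line numbers.
    """
    cmd = addr2line_command(binary, extract_addresses(log_text))
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# Three frames taken from the backtrace quoted above, plus a DEBUG line
# that must not contribute an address.
SAMPLE = """\
2009-09-08 09:56:02 [25766] [21] DEBUG: boxc_receiver: sms received
2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(gw_panic+0x15b) [0x4833db]
2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox [0x483c59]
2009-09-08 09:56:02 [25766] [18] PANIC: bearerbox(octstr_compare+0x20) [0x488800]
"""

if __name__ == "__main__":
    print(" ".join(addr2line_command("gw/bearerbox", extract_addresses(SAMPLE))))
```

Frames that belong to shared libraries (libpthread, libc) will still come back as ??:0 when resolved against the bearerbox binary, as they do in the outputs above.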
