The log message you posted earlier with the assert error contains an identity string that, among other things, reads i686 (as opposed to amd64/x86_64 or similar). That's how I can tell. And this is exactly why it was added to begin with :)
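As a rough sketch of that check, the architecture can be pulled out of the ident string by hand. The ident value below is copied from the panic message quoted further down; treating the third comma-separated field as the architecture is an assumption based on that one sample, not a documented format.

```shell
# ident string copied from the panic message quoted later in this thread
ident='Linux,2.6.18-128.4.1.el5PAE,i686,-spersistent,-hclassic,epoll'

# Assumption: the third comma-separated field is the machine architecture.
arch=$(printf '%s\n' "$ident" | cut -d, -f3)
echo "varnishd ident reports: $arch"

# Cross-check against what the running kernel reports for this machine.
echo "kernel reports: $(uname -m)"
```

A value such as i686 or i386 points at a 32-bit build, while x86_64 would indicate 64-bit.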
As for CentOS, that explains the logs. I must've mixed you up with someone else, because I could've sworn you said you were on Ubuntu. Oh well. We've used CentOS as a test platform though, so there is nothing fundamentally wrong with it as far as Varnish is concerned.

Let me know when/if you get a core dump.

- Kristian

2010/6/25, Ben Nowacky <[email protected]>:
> Nope, this is a dedicated server.. We're running CentOS... How do you know
> we're running 32-bit version? I had to compile from source on CentOS, so
> just grabbed the binaries from the site and did a build from them. How are
> you guessing it's 32-bit?
>
> Definitely not familiar with analyzing core-dumps or even getting them to
> run... I'm not a sys-admin, just the guy stuck trying to get our servers
> ready for an onslaught of traffic coming next week that I know we can not
> handle right now....
> On Jun 24, 2010, at 7:35 PM, Kristian Lyngstøl wrote:
>
>> If it's not the vm you will have to turn on core dumps to figure it
>> out. That involves setting ulimit -c unlimited in the startup script
>> (or running it manually on the shell you start varnish from). You also
>> likely want to set /proc/sys/vm/core_pattern to a path where you can
>> both fit the core dump and actually find it. If you're unfamiliar with
>> analyzing core dumps, you can gzip it and send it to me along with
>> your varnish binaries, if you want to.
>>
>> As for logging, I suppose it might have changed in Ubuntu. I'll have
>> to check that. You got the assert error though, so it's all there.
>>
>> Just out of curiosity though: why 32-bit? Is it by any chance a
>> virtual machine, or similar?
>>
>> -Kristian
>> PS: I'm not on a computer right now, so you will want to verify the
>> ulimit argument-name and core_pattern path.
>>
>> 2010/6/25, Ben Nowacky <[email protected]>:
>>> Thanks Kristian! Been reading your blog, and got some of these from your
>>> site... Guess I went overboard with some of them...
>>>
>>> - There is no /var/log/syslog so nothing else is being logged. This is the
>>> only location I've been able to get any debug info out of varnish. We're
>>> not tapping out VM or anything else it appears though.. Everything looks
>>> okay on that front, but I'm going to lower the max threads and see how
>>> that takes us.. maybe it'll be a simple solution.
>>>
>>> Appreciate the help!
>>> On Jun 24, 2010, at 7:00 PM, Kristian Lyngstøl wrote:
>>>
>>>> As Per says, it's likely you run out of vm space. You are also
>>>> specifying a great deal of parameters which I suspect are not actually
>>>> adjusted to your site. I would not recommend half of them unless you
>>>> actually know why.
>>>>
>>>> It looks like your log entries are from /var/log/messages. You will
>>>> likely find more in /var/log/syslog on Ubuntu.
>>>>
>>>> Also: 5000 threads is going to be far too many on a 32-bit system.
>>>> Using 64-bit is by far the simplest way to avoid hassle. If you insist
>>>> on 32-bit, you will need to reduce the maximum amount of threads, and
>>>> possibly adjust the stack size, though newer varnish packages might
>>>> try to do the latter. At any rate, closely monitor vm-usage.
>>>>
>>>> Also, signal 11 is a segfault. This means invalid or illegal memory
>>>> access, which could match the symptoms of a 32-bit
>>>> varnish-installation running out of virtual memory address space.
>>>>
>>>> - Kristian
>>>>
>>>> 2010/6/25, Ben Nowacky <[email protected]>:
>>>>> Here's the error I get consistently:
>>>>> Jun 24 23:35:31 srv860 varnishd[20605]: Child (21427) died signal=11
>>>>> Jun 24 23:35:31 srv860 varnishd[20605]: child (21660) Started
>>>>> Jun 24 23:35:31 srv860 varnishd[20605]: Child (21660) said
>>>>> Jun 24 23:35:31 srv860 varnishd[20605]: Child (21660) said Child starts
>>>>>
>>>>> Here's my config:
>>>>> "-f /usr/local/varnish-2.1.2/etc/default.vcl \
>>>>> -s malloc,1G \
>>>>> -p thread_pool_max=5000 \
>>>>> -p thread_pools=4 \
>>>>> -p thread_pool_min=200 \
>>>>> -p thread_pool_add_delay=1ms \
>>>>> -p cli_timeout=1000s \
>>>>> -p ping_interval=1 \
>>>>> -p cli_buffer=16384 \
>>>>> -p session_linger=20ms \
>>>>> -p lru_interval=360s \
>>>>> -p listen_depth=8192 \
>>>>> -h classic,500009 \
>>>>> -T localhost:2000 "
>>>>>
>>>>> Am I doing anything in here atrocious that would be causing the random
>>>>> resets? I've tried file and malloc storage to no avail.. Neither one
>>>>> fixed the issue. I've tried adjusting sess_timeout, sess_workspace,
>>>>> etc... also nothing.. Changed the hash from classic to critbit also,
>>>>> with no success. Bashing head against the wall, if anyone has any
>>>>> advice could really use it ! !
>>>>>
>>>>>
>>>>> On Jun 24, 2010, at 10:58 AM, Caunter, Stefan wrote:
>>>>>
>>>>>> Check dmesg too, child is probably dying. Problem with persistent I
>>>>>> found, I had to go back to file.
>>>>>>
>>>>>> Stefan Caunter :: Senior Systems Administrator :: TOPS
>>>>>> e: [email protected] :: m: (416) 561-4871
>>>>>> www.thestar.com www.topscms.com
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: [email protected]
>>>>>> [mailto:[email protected]] On Behalf Of Ben
>>>>>> Nowacky
>>>>>> Sent: June-24-10 1:51 PM
>>>>>> To: Flavio Torres
>>>>>> Cc: [email protected]
>>>>>> Subject: Re: Varnish restarting sporadically... losing entire cache...
>>>>>>
>>>>>> Thanks Flavio!
>>>>>> Here's the errors that I see in the /var/log/messages...
>>>>>> Is this what you were seeing?
>>>>>>
>>>>>> Jun 24 17:38:23 srv860 varnishd[15625]: Child (22165) Panic message:
>>>>>> Assert error in SMP_FreeObj(), storage_persistent.c line 802:
>>>>>> Condition(sg->nfixed > 0) not true. thread = (cache-timeout) ident =
>>>>>> Linux,2.6.18-128.4.1.el5PAE,i686,-spersistent,-hclassic,epoll
>>>>>> Backtrace:
>>>>>> 0x806ca7c: pan_ic+cc 0x808851e: SMP_FreeObj+13e 0x8064b5f:
>>>>>> HSH_Deref+21f 0x80618d1: exp_timer+321 0x806f1fd: wrk_bgthread+cd
>>>>>> 0x44249b: /lib/libpthread.so.0 [0x44249b] 0x39942e:
>>>>>> /lib/libc.so.6(clone+0x5e) [0x39942e]
>>>>>> Jun 24 17:38:23 srv860 varnishd[15625]: child (22984) Started
>>>>>> Jun 24 17:38:23 srv860 varnishd[15625]: Child (22984) said
>>>>>> Jun 24 17:38:23 srv860 varnishd[15625]: Child (22984) said Child starts
>>>>>> Jun 24 17:38:23 srv860 varnishd[15625]: Child (22984) said Dropped 0 segments to make free_reserve
>>>>>> Jun 24 17:38:23 srv860 varnishd[15625]: Child (22984) said Silo completely loaded
>>>>>> On Jun 24, 2010, at 10:51 AM, Flavio Torres wrote:
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> varnish-misc mailing list
>>>>> [email protected]
>>>>> http://lists.varnish-cache.org/mailman/listinfo/varnish-misc
>>>>>
>>>
>>>
>
>
_______________________________________________
varnish-misc mailing list
[email protected]
http://lists.varnish-cache.org/mailman/listinfo/varnish-misc
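
For reference, the core-dump setup discussed in the thread could look roughly like the sketch below. Note that on Linux the pattern file lives under /proc/sys/kernel/core_pattern, not under /proc/sys/vm as written from memory above. The /var/tmp target in the comment is only a hypothetical example; pick a location with enough free space for the dump.

```shell
# Raise the core-file size limit for this shell and anything started from it
# (for example varnishd). This can fail if the hard limit is lower.
if ulimit -c unlimited 2>/dev/null; then
    core_limit=$(ulimit -c)
else
    core_limit="unchanged ($(ulimit -c))"
fi
echo "core limit: $core_limit"

# Show where the kernel currently writes core dumps.
core_pattern_file=/proc/sys/kernel/core_pattern
if [ -r "$core_pattern_file" ]; then
    echo "core pattern: $(cat "$core_pattern_file")"
fi

# To change the location, run as root (hypothetical path; %e is the
# executable name, %p the PID of the dumped process):
#   echo '/var/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern
```

The limit only applies to processes started from the same shell afterwards, so it has to go into the startup script, or be run in the shell that launches varnishd.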
