Re: [squid-users] squid-3.5.0.2-20141031-r13657 crashes

2014-12-04 Thread James Harper
  It's possible that at one point I might have started 2 instances of
  squid running at once... could that cause corruption?
 
 Yes, very likely. More so the longer they were both running.
 
 I see you mention segfaults below, that can also cause it for any
 objects in use at the time of the crash.

The latter is more likely. This had happened several times before I patched it 
so I doubt I managed to run multiple squid instances on more than one occasion.

 
  addr2line -e /usr/local/squid/sbin/squid 0061a6f9
  /usr/local/src/squid-3.5.0.2-20141031-r13657/src/store.cc:962
 
 
 This looks like http://bugs.squid-cache.org/show_bug.cgi?id=4131. The
 two patches posted there seem to be getting good results.
 

I applied the patch a few days ago and haven't seen the problem since. I'm 
still unsure which problem was causing the other though.

Thanks

James
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] squid-3.5.0.2-20141031-r13657 crashes

2014-12-01 Thread Amos Jeffries
On 1/12/2014 9:44 a.m., James Harper wrote:
 I've been getting squid crashes with squid-3.5.0.2-20141031-r13657.
 Basically I think my cache got corrupt - started seeing
 TCP_SWAPFAIL_MISS and md5 mismatches. Config is cache_dir ufs
 /usr/local/squid/var/cache/squid 102400 16 256
 
 It's possible that at one point I might have started 2 instances of
 squid running at once... could that cause corruption?

Yes, very likely. More so the longer they were both running.

I see you mention segfaults below, that can also cause it for any
objects in use at the time of the crash.
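
On the point about two Squid processes sharing one cache_dir: a start-up wrapper can check the PID file before launching a second instance. The sketch below is only an illustration, not part of Squid. The function name running_pid is hypothetical, and it assumes a POSIX system where Squid records its PID in a plain-text file (the location is set by pid_filename in squid.conf).

```python
import os


def running_pid(pid_file):
    """Return the PID recorded in pid_file if that process is alive, else None.

    Hypothetical helper for illustration. POSIX only: it relies on
    signal 0, which checks for existence without delivering a signal.
    """
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return None  # missing, unreadable, or malformed PID file
    if pid <= 0:
        return None
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check only
    except ProcessLookupError:
        return None  # stale PID file, no such process
    except PermissionError:
        return pid  # process exists but is owned by another user
    return pid
```

A wrapper would call running_pid() with the path from its own configuration and refuse to start Squid if the result is not None. A live PID only shows that some process has that number, so this is a guard against accidents rather than a guarantee.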

 
 And if it happens again, what sort of things should I collect to
 better diagnose the problem?

Firstly, Squid is expected to cope with cache corruption by cleaning
up entries as it identifies them. The SWAPFAIL and MD5 details you
mention above are signs that the detection at least is occurring.

After any type of corruption these messages can be expected for a
while, with an exponential drop-off in frequency as the cache gets
fixed. Only if they start occurring with no known cause, or do not
decrease in frequency, is there a serious problem to attend to (usually
a disk dying, Squid crashing a lot, or a second Squid process started
by mistake).
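
One way to see whether the frequency really is dropping is to count TCP_SWAPFAIL_MISS entries per hour in access.log. The following is a minimal sketch, assuming the default native access.log format (field 1 is a Unix timestamp, field 4 is the result code followed by "/" and the HTTP status). The function name swapfail_counts_per_hour is made up for this example.

```python
from collections import Counter
from datetime import datetime, timezone


def swapfail_counts_per_hour(lines):
    """Count TCP_SWAPFAIL_MISS entries per UTC hour.

    Assumes Squid's native access.log layout:
    timestamp elapsed client code/status bytes method URL ...
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or truncated line
        if not fields[3].startswith("TCP_SWAPFAIL_MISS"):
            continue
        try:
            ts = float(fields[0])
        except ValueError:
            continue  # first field is not a Unix timestamp
        hour = datetime.fromtimestamp(ts, tz=timezone.utc)
        counts[hour.strftime("%Y-%m-%d %H:00")] += 1
    # Sort by hour so the trend can be read top to bottom.
    return dict(sorted(counts.items()))
```

Passing an open access.log file object to the function gives one count per hour. A steadily shrinking count matches the expected clean-up, while a flat or growing one points at the more serious causes listed above.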


Collect the basic things required for bug reports
(http://wiki.squid-cache.org/SquidFaq/BugReporting). That also
applies to investigating the segfaults mentioned below.

Plus, if you can, the URL, the HTTP request headers in full, and any
access.log entry you can match up with the issue.

 As I see it there are two problems:
 1. that the cache got corrupt in the first place
 2. that a corrupt cache can crash squid

These may in fact be reversed, with (2) as the cause and (1) as the
effect. When a segfault happens the details are not logged at all,
because the process doing the logging is the Squid which has died.

Only on an assertion failure or exception error is Squid still running
well enough to log the reason before exiting.


 
 Unfortunately I did the stupid thing and deleted the cache without
 taking a copy for post-mortem... the best I can do is:
 
 [31072.428922] squid[6317]: segfault at 58 ip 0061a6f9 sp 7fff8b9e2d40 error 4 in squid[40+4e9000]
 [31654.707792] squid[6329]: segfault at 58 ip 0061a6f9 sp 7fff54358fe0 error 4 in squid[40+4e9000]
 [31783.399832] squid[6465]: segfault at 58 ip 0061a6f9 sp 7fff82af0aa0 error 4 in squid[40+4e9000]
 [31984.470507] squid[6509]: segfault at 58 ip 0061a6f9 sp 7fff028a6640 error 4 in squid[40+4e9000]
 [32178.270298] squid[6576]: segfault at 58 ip 0061a6f9 sp 7fffe64a07e0 error 4 in squid[40+4e9000]
 [32789.635935] squid[6626]: segfault at 58 ip 0061a6f9 sp 76932960 error 4 in squid[40+4e9000]
 
 addr2line -e /usr/local/squid/sbin/squid 0061a6f9 
 /usr/local/src/squid-3.5.0.2-20141031-r13657/src/store.cc:962
 

This looks like http://bugs.squid-cache.org/show_bug.cgi?id=4131. The
two patches posted there seem to be getting good results.
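
For anyone reading kernel segfault lines like the ones quoted above, the fields can be pulled apart mechanically. This is a rough sketch, assuming the x86 Linux message layout seen in this thread and the x86 page-fault error code bits (bit 0: protection violation rather than a not-present page, bit 1: write, bit 2: user mode). The name parse_segfault is invented for the example.

```python
import re

# Matches e.g. "squid[6317]: segfault at 58 ip 0061a6f9 sp 7fff8b9e2d40 error 4"
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[\]]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def parse_segfault(line):
    """Parse one kernel segfault log line; return a dict, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        "ip": m.group("ip"),  # kept as text so it can be passed to addr2line
        "not_present": not (err & 1),  # bit 0 clear: page was not mapped
        "write": bool(err & 2),        # bit 1 set: access was a write
        "user_mode": bool(err & 4),    # bit 2 set: fault came from user space
    }
```

Read this way, "segfault at 58 ... error 4" is a user-mode read of an unmapped page at address 0x58. That is the usual signature of reading a member through a null pointer. The identical ip in all six lines suggests the same code path failed each time.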

Amos


Re: [squid-users] squid-3.5.0.2-20141031-r13657 crashes

2014-11-30 Thread James Harper
This has happened again a day or so after wiping the cache directory. Core dump 
this time:

#0  StoreEntry::checkCachable (this=this@entry=0x284c440) at store.cc:962
962             getReply()->content_length > store_maxobjsize) ||
(gdb) bt
#0  StoreEntry::checkCachable (this=this@entry=0x284c440) at store.cc:962
#1  0x00619edf in StoreEntry::memoryCachable (this=this@entry=0x284c440) at store.cc:1418
#2  0x006255d2 in StoreController::keepForLocalMemoryCache (this=<optimized out>, e=...) at store_dir.cc:798
#3  0x00625a75 in StoreController::handleIdleEntry (this=0x23a48d0, e=...) at store_dir.cc:891
#4  0x0061b091 in StoreEntry::unlock (this=0x284c440, context=context@entry=0x806224 "clientReplyContext::forgetHit") at store.cc:543
#5  0x00535356 in clientReplyContext::forgetHit (this=this@entry=0x28eb5e8) at client_side_reply.cc:1586
#6  0x005390be in clientReplyContext::identifyFoundObject (this=0x28eb5e8, newEntry=<optimized out>) at client_side_reply.cc:1675
#7  0x0053ed0d in ClientHttpRequest::httpStart (this=this@entry=0x28e9ed8) at client_side_request.cc:1517
#8  0x005402b7 in ClientHttpRequest::processRequest (this=this@entry=0x28e9ed8) at client_side_request.cc:1503
#9  0x005420d5 in ClientHttpRequest::doCallouts (this=0x28e9ed8) at client_side_request.cc:1818
#10 0x00545bd7 in ClientRequestContext::clientAccessCheckDone (this=this@entry=0x28603d8, answer=...) at client_side_request.cc:821
#11 0x00546801 in ClientRequestContext::clientAccessCheck2 (this=0x28603d8) at client_side_request.cc:718
#12 0x005427bc in ClientHttpRequest::doCallouts (this=0x28e9ed8) at client_side_request.cc:1711
#13 0x00545bd7 in ClientRequestContext::clientAccessCheckDone (this=this@entry=0x28603d8, answer=...) at client_side_request.cc:821
#14 0x005466c5 in clientAccessCheckDoneWrapper (answer=..., data=0x28603d8) at client_side_request.cc:730
#15 0x006c369b in ACLChecklist::checkCallback (this=0x28eda18, answer=...) at Checklist.cc:167
#16 0x006c3ea4 in ACLChecklist::completeNonBlocking (this=<optimized out>) at Checklist.cc:52
#17 0x006c43a3 in ACLChecklist::nonBlockingCheck (this=<optimized out>, callback_=callback_@entry=0x5466a0 <clientAccessCheckDoneWrapper(allow_t, void*)>, callback_data_=callback_data_@entry=0x28603d8) at Checklist.cc:255
#18 0x00546171 in ClientRequestContext::clientAccessCheck (this=0x28603d8) at client_side_request.cc:698
#19 0x005426ca in ClientHttpRequest::doCallouts (this=0x28e9ed8) at client_side_request.cc:1682
#20 0x00544b90 in ClientRequestContext::hostHeaderIpVerify (this=0x28603d8, ia=0x2860cc0, dns=...) at client_side_request.cc:526
#21 0x005ca5d4 in ipcacheCallback (i=i@entry=0x2860ca0, wait=wait@entry=330) at ipcache.cc:325
#22 0x005cae74 in ipcacheHandleReply (data=<optimized out>, answers=<optimized out>, na=<optimized out>, error_message=<optimized out>) at ipcache.cc:475
#23 0x0055b4a1 in idnsCallback (q=0x28561e8, q@entry=0x2862ff8, error=error@entry=0x0) at dns_internal.cc:1097
#24 0x0055f78f in idnsGrokReply (buf=buf@entry=0xb0d100 <idnsRead(int, void*)::rbuf> "\262\201\200", sz=sz@entry=156, from_ns=<optimized out>) at dns_internal.cc:1266
#25 0x005601b5 in idnsRead (fd=7, data=<optimized out>) at dns_internal.cc:1353
#26 0x00752223 in Comm::DoSelect (msec=<optimized out>) at ModEpoll.cc:277
#27 0x006d3f7f in CommSelectEngine::checkEvents (this=<optimized out>, timeout=<optimized out>) at comm.cc:1835
#28 0x0056893a in EventLoop::checkEngine (this=this@entry=0x7fffee8124b0, engine=engine@entry=0x7fffee812440, primary=primary@entry=true) at EventLoop.cc:35
#29 0x00568b27 in EventLoop::runOnce (this=this@entry=0x7fffee8124b0) at EventLoop.cc:114
#30 0x00568ce8 in EventLoop::run (this=this@entry=0x7fffee8124b0) at EventLoop.cc:82
#31 0x005d0753 in SquidMain (argc=<optimized out>, argv=<optimized out>) at main.cc:1508
#32 0x004e378c in SquidMainSafe (argv=<optimized out>, argc=<optimized out>) at main.cc:1240
#33 main (argc=<optimized out>, argv=<optimized out>) at main.cc:1233

James