On 03/07/2017 01:08 PM, Heiler Bemerguy wrote: > Some log from right now...
Here is my analysis: > 15:53:05.255| ipc/Queue.h findOldest: peeking from 7 to 6 at 1 Squid worker (kid6) is looking at the queue of disker (kid7) responses. There is just one response in the queue. > IpcIoFile.cc canWait: cannot wait: 1136930911 oldest: ipcIo6.381049w7 Squid worker is trying to estimate whether it has enough time to queue (and eventually perform) more disk I/O. The expected wait is 1'136'930'911 milliseconds or ~13 days. That is longer than the configured cache_dir swap-timeout (a few hundred milliseconds) so Squid refuses to queue this disk I/O request: > store_io.cc storeCreate: no swapdirs for e:=sw1p2RDV/0x206e17d0*4 The same story happens in your other log snippets AFAICT: > 15:53:05.302| ipc/Queue.h findOldest: peeking from 8 to 6 at 1 > IpcIoFile.cc canWait: cannot wait: 1136930958 oldest: ipcIo6.153009r8 > store_io.cc storeCreate: no swapdirs for e:=msw1DV/0x3a4d5870*2 and > 15:53:05.318| ipc/Queue.h findOldest: peeking from 7 to 6 at 1 > IpcIoFile.cc canWait: cannot wait: 1136930974 oldest: ipcIo6.381049w7 > store_io.cc storeCreate: no swapdirs for e:m381432=w1p2DV/0x3ace6740*4 and > 15:53:05.793| ipc/Queue.h findOldest: peeking from 7 to 6 at 1 > IpcIoFile.cc canWait: cannot wait: 1136931448 oldest: ipcIo6.381049w7 > store_io.cc storeCreate: no swapdirs for e:=sw1V/0x206e17d0*1 Focusing on one disker (kid7), we can see that the oldest response does not change: It is always ipcIo6.381049w7. This stuck response results in gradual increment of the expected wait time with every check, matching wall clock time increment: > 15:53:05.255| cannot wait: 1136930911 > 15:53:05.318| cannot wait: 1136930974 > 15:53:05.793| cannot wait: 1136931448 These stuck disker responses probably explain why your disks do not receive any traffic. It is potentially important that both disker responses shown in your logs got stuck at approximately the same absolute time ~13 days ago (around 2017-02-22, give or take a day; subtract 1136930911 milliseconds from 15:53:05.255 in your Squid time zone to know the "exact" time when those stuck requests were queued). How can a disker response get stuck? Most likely, something unusual happened ~13 days ago. This could be a Squid bug and/or a kid restart. * Do all currently running Squid kid processes have about the same start time? [1] * Do you see ipcIo6.381049w7 or ipcIo6.153009r8 mentioned in any old non-debugging messages/warnings? [1] http://stackoverflow.com/questions/5731234/how-to-get-the-start-time-of-a-long-running-linux-process Thank you, Alex. _______________________________________________ squid-dev mailing list [email protected] http://lists.squid-cache.org/listinfo/squid-dev
