Re: [squid-users] squid hangs and dies and can not be killed - needs system reboot

2023-12-19 Thread Amish

Hi Amos,

On 19/12/23 20:25, Amos Jeffries wrote:


On 19/12/23 16:29, Amish wrote:

Hi Alex,

Thank you for replying.

On 19/12/23 01:14, Alex Rousskov wrote:

On 2023-12-18 09:35, Amish wrote:

I use Arch Linux and today I updated squid from squid 5.7 to squid 6.6.


> Dec 18 13:01:24 mumbai squid[604]: kick abandoning conn199

I do not know whether the above problem is the primary problem in 
your setup, but it is a red flag. Transactions on the same 
connection may get stuck after that message; it is essentially a 
Squid bug.


I am not sure at all, but this bug might be related to Bug 5187 
workaround that went into Squid v6.2 (commit c44cfe7): 
https://bugs.squid-cache.org/show_bug.cgi?id=5187


Does Squid accept new TCP connections after it enters what you 
describe as a dead state? For example, does "telnet 127.0.0.1 8080" 
establish a connection if executed on the same machine as Squid?


Yes, it establishes a connection. But I do not know what to do next. 
The browser showed a "Connection timed out" message, but I believe the 
browser also connected and nothing happened afterwards.




Ah ... that port is an interception port. It should *not* connect.

Please ensure your firewall contains the "-t mangle" rules for each 
interception port you use. As shown at 



No, port 8080 is not an interception port, and the firewall is fine. 
Everything worked before the upgrade to 6.6.



> kill -9 does nothing

Is it possible that you are trying to kill the wrong process? You 
should be killing this process AFAICT:


> root 601  0.0  0.2  73816 22528 ?    Ss 12:59 0:02
> /usr/bin/squid -f /etc/squid/btnet/squid.btnet.conf --foreground -sYC


I did not clarify, but all processes needed SIGKILL and vanished, 
except the Dead squid process, which still remained.


# systemctl stop squid

Dec 19 08:46:38 mumbai systemd[1]: squid.service: State 
'stop-sigterm' timed out. Killing.


FWIW, Squid default shutdown grace period for clients to disconnect is 
longer than systemd typically is willing to wait for a service shutdown.


Please set "shutdown_lifetime 10 seconds" in your squid.conf.


Default shutdown_lifetime is 30 seconds.

I think systemd waits 90 seconds.
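The two timeouts discussed above can be compared with a short script. This is only a sketch: the 30-second default comes from this thread, the 90-second systemd stop timeout is the assumption stated above rather than a value read from the unit file, and the function names and the small set of time units handled are illustrative.

```python
import re

# Assumption: only these units are handled; squid.conf accepts more.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def shutdown_lifetime_seconds(conf_text, default=30):
    """Return the last uncommented shutdown_lifetime value, in seconds."""
    value = default
    pattern = r"\s*shutdown_lifetime\s+(\d+)\s+(second|minute|hour)s?\b"
    for line in conf_text.splitlines():
        match = re.match(pattern, line)
        if match:
            value = int(match.group(1)) * UNIT_SECONDS[match.group(2)]
    return value


def grace_exceeds_stop_timeout(conf_text, systemd_stop_timeout=90):
    """True if Squid's grace period alone could outlast systemd's wait."""
    return shutdown_lifetime_seconds(conf_text) >= systemd_stop_timeout


if __name__ == "__main__":
    sample = "shutdown_lifetime 10 seconds\n"
    print(shutdown_lifetime_seconds(sample))
    print(grace_exceeds_stop_timeout(sample))
```

If the grace period is shorter than the stop timeout, as with the values quoted here, the 'stop-sigterm' timeout in the log is unlikely to be explained by shutdown_lifetime alone.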



Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 601 
(squid) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 604 
(squid) with signal SIGKILL.


This is systemd running the command " kill -9 604 ".

Per the Squid code: "XXX: In SMP mode, uncatchable SIGKILL only kills 
the master process".


You can try SIGTERM instead, and repeat up to 3 times if the first 
does not close the process.


kill -9 is supposed to kill the process immediately, but it didn't. 601 
(the master process) got killed but 604 did not.


Even when powering off, systemd "finished" shutdown target, unmounted 
partitions and then killed all squid processes but could not kill 
process 604.


After waiting for 2-3 minutes, I forced power off the system and restarted.

Regards,

Amish



HTH
Amos

___
squid-users mailing list
squid-users@lists.squid-cache.org
https://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] squid hangs and dies and can not be killed - needs system reboot

2023-12-19 Thread Alex Rousskov

On 2023-12-18 22:29, Amish wrote:

On 19/12/23 01:14, Alex Rousskov wrote:

On 2023-12-18 09:35, Amish wrote:


I use Arch Linux and today I updated squid from squid 5.7 to squid 6.6.


> Dec 18 13:01:24 mumbai squid[604]: kick abandoning conn199

I do not know whether the above problem is the primary problem in your 
setup, but it is a red flag. Transactions on the same connection may 
get stuck after that message; it is essentially a Squid bug.


I am not sure at all, but this bug might be related to Bug 5187 
workaround that went into Squid v6.2 (commit c44cfe7): 
https://bugs.squid-cache.org/show_bug.cgi?id=5187


Does Squid accept new TCP connections after it enters what you 
describe as a dead state? For example, does "telnet 127.0.0.1 8080" 
establish a connection if executed on the same machine as Squid?


Yes, it establishes a connection. But I do not know what to do next. 


This tells us that your Squid is still listening for incoming 
connections. Most likely, it is not "dead" but running and just unable 
to make progress with those connections (for yet unknown reasons). That 
information is helpful but not sufficient (for me) to solve the problem 
you are describing.
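One way to take the telnet check a step further is to send a minimal request and see whether anything comes back. A minimal Python sketch, assuming port 8080 is a forward-proxy port as in this thread; the function name, the example.com URL, and the result labels are illustrative and not part of Squid:

```python
import socket


def probe_proxy(host, port, timeout=5.0):
    """Classify how a proxy port reacts to one minimal HTTP request."""
    request = b"GET http://example.com/ HTTP/1.1\r\nHost: example.com\r\n\r\n"
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or the TCP handshake itself timed out.
        return "no-connection"
    with sock:
        sock.settimeout(timeout)
        try:
            sock.sendall(request)
            data = sock.recv(4096)
        except socket.timeout:
            # TCP connected, but nothing answered within the timeout.
            return "connected-but-silent"
        except OSError:
            return "connection-error"
    return "responded" if data else "closed-without-reply"


if __name__ == "__main__":
    print(probe_proxy("127.0.0.1", 8080))
```

A "connected-but-silent" result would match the symptom described here: the kernel completes the TCP handshake on the listening socket, but the Squid process never reads or answers the request.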


The next step that I would recommend is to collect debugging information 
from the running process and share a pointer to the corresponding 
compressed cache.log file:


* Ideally, start collection when Squid starts and reproduce the problem 
while collecting full debugging information:

http://wiki.squid-cache.org/SquidFaq/BugReporting#full-debug-output

* If you have to, start collection after Squid is already in bad state 
and just before you use telnet or browser to tickle Squid:

http://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction

Do not use any secret information (e.g., production certificate keys) 
for these tests (unless you are going to share the logs privately with 
those you trust).


Do not downgrade to v5 for these tests.


HTH,

Alex.


The browser showed a "Connection timed out" message, but I believe the 
browser also connected and nothing happened afterwards.




> kill -9 does nothing

Is it possible that you are trying to kill the wrong process? You 
should be killing this process AFAICT:


> root 601  0.0  0.2  73816 22528 ?    Ss   12:59 0:02
> /usr/bin/squid -f /etc/squid/btnet/squid.btnet.conf --foreground -sYC


I did not clarify, but all processes needed SIGKILL and vanished, except 
the Dead squid process, which still remained.


# systemctl stop squid

Dec 19 08:46:38 mumbai systemd[1]: squid.service: State 'stop-sigterm' 
timed out. Killing.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 601 
(squid) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 604 
(squid) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 607 
(security_file_c) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 608 
(security_file_c) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 609 
(security_file_c) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 610 
(security_file_c) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 611 
(security_file_c) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 622 
(log_file_daemon) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Main process exited, 
code=killed, status=9/KILL
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 604 
(squid) with signal SIGKILL.


I waited for 2 minutes for squid to stop, then pressed Ctrl-C to abort 
the systemctl stop squid command.


As you can see, the last line shows that an attempt was made to kill the 
DEAD process with PID 604.


# ps aux |grep squid
proxy    604  0.0  0.0  0 0 ?    D    Dec18   0:03 [squid]

Now only DEAD squid process remains.
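A note on the ps output above: in the STAT column, D means uninterruptible sleep (typically a process blocked inside the kernel, often on I/O), not "dead". A process in that state does not act on signals, SIGKILL included, until the kernel call returns, which would be consistent with PID 604 surviving kill -9. A small Python sketch for spotting such processes in saved "ps aux" output; the function name is hypothetical and the column positions assume the standard "ps aux" layout:

```python
def uninterruptible(ps_lines):
    """Return (pid, command) for each line whose STAT starts with 'D'."""
    found = []
    for line in ps_lines:
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or truncated entry
        if fields[7].startswith("D"):
            found.append((int(fields[1]), fields[10]))
    return found


if __name__ == "__main__":
    sample = [
        "root 601  0.0  0.2  73816 22528 ?    Ss   12:59 0:02 /usr/bin/squid --foreground",
        "proxy    604  0.0  0.0  0 0 ?    D    Dec18   0:03 [squid]",
    ]
    for pid, command in uninterruptible(sample):
        print(pid, command)
```

On Linux, the contents of /proc/604/stack (readable as root) may show where in the kernel such a process is blocked, which can help narrow down the cause.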

What next? Should I downgrade to 5.9 and check?

Regards

Amish


Alex.


After the update from 5.7 to 6.6, squid starts but then reaches a Dead 
state within a minute or two.


# ps aux | grep squid
root 601  0.0  0.2  73816 22528 ?    Ss   12:59 0:02 
/usr/bin/squid -f /etc/squid/btnet/squid.btnet.conf --foreground -sYC

proxy    604  0.0  0.0  0 0 ?    D    12:59 0:03 [squid]
proxy    607  0.0  0.0  11976  7424 ?    S    12:59 0:00 
(security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
proxy    608  0.0  0.0  11976  7168 ?    S    12:59 0:00 
(security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
proxy    609  0.0  0.0  11712  5632 ?    S    12:59 0:00 
(security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
proxy    610  0.0  0.0  11712  5376 ?    S    12:59 0:00 
(security_file_certgen) -s /var/cache/squid/ssl_db -M 4MB
proxy    611  0.0  0.0  11712  5504 ?    S    12:59 0:00 
(security_file_certgen) 

Re: [squid-users] squid hangs and dies and can not be killed - needs system reboot

2023-12-19 Thread Amos Jeffries


On 19/12/23 16:29, Amish wrote:

Hi Alex,

Thank you for replying.

On 19/12/23 01:14, Alex Rousskov wrote:

On 2023-12-18 09:35, Amish wrote:


I use Arch Linux and today I updated squid from squid 5.7 to squid 6.6.


> Dec 18 13:01:24 mumbai squid[604]: kick abandoning conn199

I do not know whether the above problem is the primary problem in your 
setup, but it is a red flag. Transactions on the same connection may 
get stuck after that message; it is essentially a Squid bug.


I am not sure at all, but this bug might be related to Bug 5187 
workaround that went into Squid v6.2 (commit c44cfe7): 
https://bugs.squid-cache.org/show_bug.cgi?id=5187


Does Squid accept new TCP connections after it enters what you 
describe as a dead state? For example, does "telnet 127.0.0.1 8080" 
establish a connection if executed on the same machine as Squid?


Yes, it establishes a connection. But I do not know what to do next. 
The browser showed a "Connection timed out" message, but I believe the 
browser also connected and nothing happened afterwards.




Ah ... that port is an interception port. It should *not* connect.

Please ensure your firewall contains the "-t mangle" rules for each 
interception port you use. As shown at 






> kill -9 does nothing

Is it possible that you are trying to kill the wrong process? You 
should be killing this process AFAICT:


> root 601  0.0  0.2  73816 22528 ?    Ss   12:59 0:02
> /usr/bin/squid -f /etc/squid/btnet/squid.btnet.conf --foreground -sYC


I did not clarify, but all processes needed SIGKILL and vanished, except 
the Dead squid process, which still remained.


# systemctl stop squid

Dec 19 08:46:38 mumbai systemd[1]: squid.service: State 'stop-sigterm' 
timed out. Killing.



FWIW, Squid default shutdown grace period for clients to disconnect is 
longer than systemd typically is willing to wait for a service shutdown.


Please set "shutdown_lifetime 10 seconds" in your squid.conf.


Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 601 
(squid) with signal SIGKILL.
Dec 19 08:46:38 mumbai systemd[1]: squid.service: Killing process 604 
(squid) with signal SIGKILL.


This is systemd running the command " kill -9 604 ".

Per the Squid code: "XXX: In SMP mode, uncatchable SIGKILL only kills 
the master process".


You can try SIGTERM instead, and repeat up to 3 times if the first does 
not close the process.




HTH
Amos