Re: [squid-dev] File descriptor leak at ICAP reqmod rewrites of CONNECT requests

2020-12-11 Thread Alexey Sergin
Hi Eliezer!


> What version of Squid are you using?

I have reproduced the leak on squid versions: 4.8, 5.0.4, git master
38da9c24e.
In production deployment we use squid 4.x


> Can you provide a setup example for re-production?

Blow are the steps to reproduce the problem on the CentOS 7 system.

Test icap server is based on Example3 from
https://github.com/Peoplecantfly/icapserver/blob/master/TUTORIAL.md
The only difference is that implementations of example_REQMOD and
example_RESPMOD methods are swapped.

1. Put the following code to file server.py:
# 
# vvv cut here vvv
# 
import time
import threading

from icapserver import *

class ExampleICAPHandler(BaseICAPRequestHandler):

def example_OPTIONS(self):
self.set_icap_response(200)
self.set_icap_header('Methods', 'RESPMOD, REQMOD')
self.set_icap_header('Service',
'ICAP Server' + ' ' + self._server_version)
self.set_icap_header('Options-TTL', '3600')
self.set_icap_header('Preview', '0')
self.send_headers(False)

def example_RESPMOD(self):
self.no_adaptation_required()

def example_REQMOD(self):
if self.has_body:
while True:
if self.read_chunk() == '':
break
self.set_icap_response(200)
self.set_enc_status('HTTP/1.1 451 Unavailable For Legal Reasons')
self.send_headers(False)

class ExampleICAPServer():

def __init__(self, addr='', port=13440):
self.addr = addr
self.port = port

def start(self):
self.server = ICAPServer((self.addr, self.port), ExampleICAPHandler)
self.thread = threading.Thread(target=self.server.serve_forever)
self.thread.start()
return True

def stop(self):
self.server.shutdown()
self.server.server_close()
self.thread.join(2)
return True


try:
server = ExampleICAPServer()
server.start()
print 'Use Control-C to exit'
while True:
time.sleep(1)
except KeyboardInterrupt:
server.stop()
print "Finished"
# 
# ^^^ cut here ^^^
# 

2. Prepare environment for test icap server:
# yum install -y python2-virtualenv
# virtualenv-2 icapserver
# source icapserver/bin/activate
# pip install icapserver

3. Run the test icap server:
# source icapserver/bin/activate
# python server.py &

4. Install squid version 4 or above (built with --enable-icap-client).

5. Rewrite /etc/squid/squid.conf:
cache deny all
cache_mem 0
shutdown_lifetime 5 seconds
http_port 3128
acl localnet src 127.0.0.1
http_access deny !localnet
http_access allow all
icap_enable on
icap_service example_req reqmod_precache icap://127.0.0.1:13440/example
adaptation_access example_req allow all
icap_service_failure_limit -1

6. Apply squid configuration changes:
# systemctl restart squid

7. Initiate http request:
# curl -q -x http://127.0.0.1:3128 https://example.com

8. Output of curl should look like:
curl: (56) Received HTTP code 451 from proxy after CONNECT

9. Leaked descriptor is in a CLOSE-WAIT state:
# ss -natpo | grep CLOSE-WAIT


> If you can supply couple(2-3 or more) ICAP connections pcap I can try to
see what happens in the connection level.

Pcap file with 3 connections attached.


> >From my experience there is much differences between holding the ICAP
session open or closed after once request.
> The reason for this is that like HTTP/1.0 ICAP is a “blocking”(don’t
remember the exact word, Alex might remember).
> There for if the proxy has 800 requests per seconds it’s better for the
setup to open new connection per request to match the load.
> It will const memory and CPU in the short term but in the long term the
clients requests will bock less and..
> It will probably consume less then the ICAP connections memory leak.

Leak happens both with and without the load.
Http client connections do leak, but ICAP connections do not.


> There for if the proxy has 800 requests per seconds it’s better for the
setup to open new connection per request to match the load.

Production ICAP server reports "Max-Connections: 5000" in response to ICAP
OPTIONS request.
Is in sufficient? If not, what should ICAP server do, and what should be
done squid (and in squid.conf), to achieve the "open new connection per
request" behaviour?


Thanks!

On Fri, Dec 11, 2020 at 12:11 AM Eliezer Croitor 
wrote:

> Hey Alexey,
>
>
>
> What version of Squid are you using?
>
> Can you provide a setup example for re-production?
>
> I can write the relevant ICAP service however I am missing pcap file to
> understand the ICAP sessions.
>
> If you can supply couple(2-3 or more) ICAP connections pcap I can try to
> see what happens in the connection level.
>
>
>
> From my experience there is much differences between holding the ICAP
> session open or closed after once request.
>
> The reason for this is that like HTTP/1.0 ICAP is a “blocking”(don’t
> remember the exact 

Re: [squid-dev] File descriptor leak at ICAP reqmod rewrites of CONNECT requests

2020-12-11 Thread Alex Rousskov
On 12/10/20 3:33 PM, Alexey Sergin wrote:

> - Squid writes to cache.log a message like "kick abandoning <>";

These messages indicate a Squid bug, most likely in REQMOD request
satisfaction implementation specific to CONNECT use cases. The messages
are not prefixed with a "BUG" label, but they should be.


> - Squid does not close the file descriptor used for http client connection.

Yes, that is a likely side effect of the above-mentioned bug.


> client_side.cc. Closing clientConnection right after
> "debugs(<>abandoning<>)" fixes the leak.

> Is it ok to always close() clientConnection when "abandoning" thing
> happens? 

> Are there any known scenarios where this close() would be
> inappropriate?

Unknown. Such a closure (alone) is not a proper fix. If well-tested, it
may be considered to be a good-enough workaround, but no more than that.

What is currently missing (at least) is understanding of what is going
on. The correct fix, whatever it is, would be determined by that
understanding.

In general, responding with a 403 does not invalidate the client
connection, so it does not have to be closed. Said that, I am sure there
are use cases where such closure would be a good idea. I would not be
surprised if Squid closes the connection after denying regular requests.


> Could you please give me some advice on a better/proper fix, if close()
> at "abandoning" time is wrong?

Unfortunately, I cannot. Somebody needs to investigate the problem and
identify the bug in ConnStateData::kick() logic (or elsewhere).

Alex.
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev