You are the best,
We will launch a benchmark to see the difference.

On 07/02/2022 at 16:14, Eliezer Croitoru wrote:

Hey David,

Since handle_stdout runs in its own thread, its sole purpose is to send results to stdout.

If I run the following code in a simple program without the 0.5-second sleep:

     while RUNNING:
         if quit > 0:
             return
         while len(queue) > 0:
             item = queue.pop(0)
             sys.stdout.write(item)
             sys.stdout.flush()
         time.sleep(0.5)

what will happen is that the program will run at 100% CPU, looping over the queue-size check again and again,
while only occasionally writing some data to stdout.

Adding a small delay of 0.5 seconds allows some "idle" time for the CPU in the loop, preventing it from consuming
all the CPU time.

It's a very old technique, and there are more efficient ones, but it's enough to demonstrate that a simple threaded helper is much better than any PHP code that was not meant to run as a STDIN/STDOUT daemon/helper.
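One of those more efficient techniques is a blocking queue: instead of polling the list and sleeping, the output thread parks itself until an item actually arrives. A minimal sketch using Python's standard queue module (the names here are illustrative, not from the original helper):

```python
import queue
import sys
import threading

out_q = queue.Queue()  # thread-safe FIFO for result lines

def handle_stdout():
    # Blocking get() parks this thread until an item arrives, so it
    # uses no CPU while idle -- no polling loop and no fixed sleep.
    while True:
        item = out_q.get()
        if item is None:  # sentinel value signals shutdown
            break
        sys.stdout.write(item)
        sys.stdout.flush()
```

Worker threads simply call out_q.put(line), and the main thread puts None into the queue to stop the writer cleanly.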

All The Best,

Eliezer

----

Eliezer Croitoru

NgTech, Tech Support

Mobile: +972-5-28704261

Email: ngtech1...@gmail.com

*From:*David Touzeau <da...@articatech.com>
*Sent:* Monday, February 7, 2022 02:42
*To:* Eliezer Croitoru <ngtech1...@gmail.com>; squid-users@lists.squid-cache.org
*Subject:* Re: [squid-users] external helper development

Sorry Eliezer

It was a mistake... No, your code is clean.
Impressive for a first shot.
Many thanks for your example; we will run our stress tool to see the difference...

Just a question

Why did you add a 500-millisecond sleep in handle_stdout? Is it to let Squid close the pipe?


On 06/02/2022 at 11:46, Eliezer Croitoru wrote:

    Hey David,

    Not a fully completed helper, but it seems to work pretty nicely and
    might be better than what exists already:

    
https://gist.githubusercontent.com/elico/03938e3a796c53f7c925872bade78195/raw/21ff1bbc0cf3d91719db27d9d027652e8bd3de4e/threaded-helper-example.py

    #!/usr/bin/env python
    import sys
    import time
    import urllib.request
    import signal
    import threading

    # set debug mode to True or False
    debug = False
    # debug = True

    queue = []
    threads = []
    RUNNING = True
    quit = 0

    rand_api_url = "https://cloud1.ngtech.co.il/api/test.php"

    def sig_handler(signum, frame):
        sys.stderr.write("Signal is received:" + str(signum) + "\n")
        global quit
        quit = 1
        global RUNNING
        RUNNING = False

    def handle_line(line):
        if not RUNNING:
            return
        if not line:
            return
        if quit > 0:
            return
        arr = line.split()
        response = urllib.request.urlopen(rand_api_url)
        response_text = response.read()
        queue.append(arr[0] + " " + response_text.decode("utf-8"))

    def handle_stdout(n):
        while RUNNING:
            if quit > 0:
                return
            while len(queue) > 0:
                item = queue.pop(0)
                sys.stdout.write(item)
                sys.stdout.flush()
            time.sleep(0.5)

    def handle_stdin(n):
        while RUNNING:
            line = sys.stdin.readline()
            if not line:
                break
            if quit > 0:
                break
            line = line.strip()
            thread = threading.Thread(target=handle_line, args=(line,))
            thread.start()
            threads.append(thread)

    signal.signal(signal.SIGUSR1, sig_handler)
    signal.signal(signal.SIGUSR2, sig_handler)
    signal.signal(signal.SIGALRM, sig_handler)
    signal.signal(signal.SIGINT, sig_handler)
    signal.signal(signal.SIGQUIT, sig_handler)
    signal.signal(signal.SIGTERM, sig_handler)

    stdout_thread = threading.Thread(target=handle_stdout, args=(1,))
    stdout_thread.start()
    threads.append(stdout_thread)

    stdin_thread = threading.Thread(target=handle_stdin, args=(2,))
    stdin_thread.start()
    threads.append(stdin_thread)

    while RUNNING:
        time.sleep(3)

    print("Not RUNNING")

    for thread in threads:
        thread.join()

    print("All threads stopped.")

    ## END

    Eliezer

    ----

    Eliezer Croitoru

    NgTech, Tech Support

    Mobile: +972-5-28704261

    Email: ngtech1...@gmail.com

    *From:*squid-users <squid-users-boun...@lists.squid-cache.org>
    <mailto:squid-users-boun...@lists.squid-cache.org> *On Behalf Of
    *David Touzeau
    *Sent:* Friday, February 4, 2022 16:29
    *To:* squid-users@lists.squid-cache.org
    *Subject:* Re: [squid-users] external helper development

    Eliezer,

    Thanks for all this advice. Indeed, your arguments are valid:
    opening a socket, sending data, receiving data and closing the
    socket is costly compared to direct access to a regex or an
    in-memory entry, even when the result has already been computed.

    But what surprises me the most is that we have produced a threaded
    Python plugin, for which I provide the code below.
    The PHP code is like the example you mentioned (no threads, just a
    loop that outputs OK).

    The results: after 6k requests, Squid freezes and no surfing is
    possible, whereas with the PHP code we can reach up to 10K requests
    and Squid is happy. We really do not understand why Python performs so poorly.

    Here is the Python code using threads:

    #!/usr/bin/env python
    import os
    import sys
    import time
    import signal
    import locale
    import traceback
    import threading
    import select
    import traceback as tb

    class ClienThread():

        def __init__(self):
            self._exiting = False
            self._cache = {}

        def exit(self):
            self._exiting = True

        def stdout(self, lineToSend):
            try:
                sys.stdout.write(lineToSend)
                sys.stdout.flush()

            except IOError as e:
                if e.errno==32:
                    # Error Broken PIPE!"
                    pass
            except:
                # other exceptions
                pass

        def run(self):
            while not self._exiting:
                if sys.stdin in select.select([sys.stdin], [], [], 0.5)[0]:
                    line = sys.stdin.readline()
                    LenOfline=len(line)

                    if LenOfline==0:
                        self._exiting=True
                        break

                    if line[-1] == '\n':line = line[:-1]
                    channel = None
                    options = line.split()

                    try:
                        if options[0].isdigit(): channel = options.pop(0)
                    except IndexError:
                        self.stdout("0 OK first=ERROR\n")
                        continue

                    # Processing here

                    try:
                        self.stdout("%s OK\n" % channel)
                    except:
                        self.stdout("%s ERROR first=ERROR\n" % channel)




    class Main(object):
        def __init__(self):
            self._threads = []
            self._exiting = False
            self._reload = False
            self._config = ""

            for sig, action in (
                (signal.SIGINT, self.shutdown),
                (signal.SIGQUIT, self.shutdown),
                (signal.SIGTERM, self.shutdown),
                (signal.SIGHUP, lambda s, f: setattr(self, '_reload', True)),
                (signal.SIGPIPE, signal.SIG_IGN),
            ):
                try:
                    signal.signal(sig, action)
                except AttributeError:
                    pass



        def shutdown(self, sig = None, frame = None):
            self._exiting = True
            self.stop_threads()

        def start_threads(self):

            sThread = ClienThread()
            t = threading.Thread(target = sThread.run)
            t.start()
            self._threads.append((sThread, t))



        def stop_threads(self):
            for p, t in self._threads:
                p.exit()
            for p, t in self._threads:
                t.join(timeout =  1.0)
            self._threads = []

        def run(self):
            """ main loop """
            ret = 0
            self.start_threads()
            return ret


    if __name__ == '__main__':
        # set C locale
        locale.setlocale(locale.LC_ALL, 'C')
        os.environ['LANG'] = 'C'
        ret = 0
        try:
            main = Main()
            ret = main.run()
        except SystemExit:
            pass
        except KeyboardInterrupt:
            ret = 4
        except:
            # unexpected error: fall through to exit
            pass
        sys.exit(ret)
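The channel handling in the run() method above is the heart of Squid's helper concurrency protocol: when concurrency is enabled, each request line starts with a numeric channel ID, and the reply must echo that same ID back. The same parsing logic in isolation (the function name is ours, for illustration):

```python
def parse_helper_line(line):
    """Split a Squid helper request line into (channel, args).

    With concurrency enabled, Squid prefixes each line with a numeric
    channel ID; the helper's reply must start with the same ID.
    """
    options = line.strip().split()
    channel = None
    if options and options[0].isdigit():
        channel = options.pop(0)
    return channel, options
```

A reply for channel "0" would then be written back as "0 OK\n", which lets Squid match answers to outstanding requests even when they arrive out of order.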

    On 04/02/2022 at 07:06, Eliezer Croitoru wrote:

        And about each helper's cache: the memory cost of a per-helper
        cache is not much compared to some network access.

        Again, it's possible to test and verify this on a loaded system
        to get results. The delay itself can be seen from the Squid
        side in the cache manager statistics.

        You can also try to compare the next ruby helper:

        https://wiki.squid-cache.org/EliezerCroitoru/SessionHelper

        About a shared "base" which allows helpers to avoid computing
        the query: it's a good argument; however, it depends on the
        cost of pulling from the cache compared to calculating the answer.

        A very simple string comparison or regex match would probably
        be faster than reaching shared storage in many cases.
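To put rough numbers on that claim, here is a quick sketch timing a precompiled regex check in-process (the pattern and URL are made up for illustration). A single match typically costs well under a microsecond, whereas even a localhost round-trip to an external cache daemon usually costs tens of microseconds or more:

```python
import re
import timeit

# Illustrative pattern: a simple blacklist-style domain check.
blocked = re.compile(r"^https?://([a-z0-9.-]+\.)?ads\.example\.com/")

def check(url):
    return blocked.match(url) is not None

n = 100_000
t = timeit.timeit(lambda: check("http://ads.example.com/banner.gif"), number=n)
print("%.3f microseconds per regex check" % (t / n * 1e6))
```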

        Also take into account the “concurrency” support from the
        helper side.

        A helper that supports parallel processing of requests/lines
        can do better than many single helpers in more than one use case.

        In any case, I would suggest enabling request concurrency on
        the Squid side, since the STDIN buffer will emulate some level
        of concurrency by itself and will allow Squid to keep moving
        forward faster.
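As a sketch, enabling concurrency for an external ACL helper in squid.conf might look like the following (the helper name, path, and numbers are placeholders, not taken from this thread):

```
external_acl_type fast_check concurrency=50 children-max=5 %URI /usr/local/bin/helper.py
acl checked_url external fast_check
http_access allow checked_url
```

With concurrency=N set, Squid prefixes each request line with a channel ID and can keep up to N requests in flight per helper process.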

        Just to mention that SquidGuard has used a single-helper cache
        for a very long time, i.e. every single SquidGuard helper has
        its own copy of the whole configuration and database files in memory.

        And again, if you have the option to implement a server/service
        model in which the helpers contact a main service, you will be
        able to implement a much faster internal in-memory cache
        compared to a Redis/Memcached/other external daemon (needs to
        be tested).

        A good example of this is ufdbGuard, whose helpers are clients
        of the main service, which does all the heavy lifting and also
        holds a single copy of the DB.

        I have implemented SquidBlocker this way and have seen that it
        outperforms any other service I have tried so far.


_______________________________________________
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users
