Hi Joe,

When you send a request via LWP::UserAgent to the server that makes the long-running SFTP calls, I'm pretty sure you are hitting a timeout in the LWP::UserAgent code.

I'm pretty sure the client (LWP::UserAgent) is not waiting long enough for the answer: https://metacpan.org/pod/LWP::UserAgent#timeout
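
One way to check this from the calling side, as a minimal sketch: LWP::UserAgent marks responses that it generated itself (for example after a read timeout) with a "Client-Warning: Internal response" header, so a failure carrying that header did not come from Apache. The helper name is_client_side_failure and the command-line URL handling are hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# True when LWP built the response locally (timeout, connect failure, ...)
# instead of receiving it from the server. Works on any object that offers
# an HTTP::Response-style header() method.
sub is_client_side_failure {
    my ($res) = @_;
    my $warning = $res->header('Client-Warning');
    return (defined $warning && $warning eq 'Internal response') ? 1 : 0;
}

# Optional live check: perl check.pl <url>   (the URL is up to you)
if (@ARGV) {
    require LWP::UserAgent;
    my $ua      = LWP::UserAgent->new(timeout => 600);
    my $started = time;
    my $res     = $ua->get($ARGV[0]);
    printf "%s after %d s, generated by %s\n",
        $res->status_line, time - $started,
        is_client_side_failure($res) ? 'LWP (client side)' : 'the server';
}
```

Logging the elapsed time next to this flag should show whether the 300 or 600 second mark belongs to the client or to something on the server side.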

Once that timeout is long enough, you also have to make sure that the very first client, the one that sent the very first request, waits long enough to let the application server make several tries, i.e. n * timeout.
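
The n * timeout budget can be written down as a small calculation. This is only a sketch: the 4 tries and the 30-second sleep come from the original description, while the 120-second per-try SFTP timeout, the 30-second safety margin, and the function name worst_case_seconds are assumptions made for illustration.

```perl
use strict;
use warnings;

# Worst-case time the dispatch server may need: every try runs into its
# own timeout, and it sleeps between tries (not after the last one).
sub worst_case_seconds {
    my ($tries, $per_try_timeout, $sleep_between) = @_;
    return $tries * $per_try_timeout + ($tries - 1) * $sleep_between;
}

my $server_budget  = worst_case_seconds(4, 120, 30);   # 4*120 + 3*30 = 570
my $client_timeout = $server_budget + 30;              # assumed safety margin

print "dispatch server may need up to $server_budget s\n";
print "LWP::UserAgent timeout should be at least $client_timeout s\n";
```

With these assumed numbers the caller needs a timeout of 600 seconds or more, and any proxy or load balancer in between has to wait at least as long. Note that the LWP::UserAgent timeout is documented as an inactivity timeout, so it applies to the whole wait only while the server sends nothing back.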

Best wishes
Andreas


On 13.05.2025 at 16:46, Joseph He wrote:
Many thanks to you all.

I am still trying to figure out the issue. Let me re-explain the problem I experienced with some details.

The environment is Ubuntu 22.04, Apache2, ModPerl.
I send an HTTP::Request with LWP::UserAgent; the server receives the request and starts to process it. Processing takes much longer than usual because of a stalled SFTP call to the remote server, so the Apache server times out and sends back a failure, while *the server is actually still trying to process this request*. On the calling side, after receiving the failure status, the client issues another HTTP::Request, and the load balancer routes this call to another server for processing.
As a result, the same HTTP request is processed twice.

On my production server the timeout happens at the 300-second mark. On my QA and Dev servers, the timeout happens at 600 seconds. I have not changed anything on my production server yet. On my QA and Dev servers, however, I have tried changing Timeout in apache2.conf, adding Timeout to the virtual host config, and adding SetPerlEnv MOD_PERL_TIMEOUT to the virtual host config; none of these changed the timeout behavior.

So what exactly controls the Timeout? I am totally lost.

Cheers,
Joe


On Wed, Apr 23, 2025 at 5:17 PM Mithun Bhattacharya <mit...@gmail.com> wrote:

    Okay, agreed, that is a valid timeout. Basically it is saying that a
    client has established a TCP/IP connection but has not yet sent its
    request, whether a GET, PUT, or POST.

    On Wed, Apr 23, 2025, 3:38 PM Joseph He <joseph.he.2...@gmail.com>
    wrote:

        In the Apache2 docs, I found this. How does this timeout work? It
        looks like it can only wait for 300 seconds before failing a
        request.

        https://httpd.apache.org/docs/2.0/mod/core.html#timeout
        Description: Amount of time the server will wait for certain
                     events before failing a request
        Syntax:      |TimeOut seconds|
        Default:     |TimeOut 300|
        Context:     server config, virtual host
        Status:      Core
        Module:      core

        The |TimeOut| directive currently defines the amount of time
        Apache will wait for three things:

         1. The total amount of time it takes to receive a GET request.
         2. The amount of time between receipt of TCP packets on a
            POST or PUT request.
         3. The amount of time between ACKs on transmissions of TCP
            packets in responses.

        We plan on making these separately configurable at some point
        down the road. The timer used to default to 1200 before 1.2,
        but has been lowered to 300 which is still far more than
        necessary in most situations. It is not set any lower by
        default because there may still be odd places in the code
        where the timer is not reset when a packet is sent.


        On Wed, Apr 23, 2025 at 3:07 PM Mithun Bhattacharya
        <mit...@gmail.com> wrote:

            You configure the timeout on the client side. Apache is on
            the server side. The server doesn't have a concept of time;
            it could take days to run and not care.

            The mod_perl code is where you send the HTTP return
            status, to make sure the client doesn't time out waiting
            for the server to respond.


            On Wed, Apr 23, 2025, 2:19 PM Joseph He
            <joseph.he.2...@gmail.com> wrote:

                Thanks, all.
                Is that Apache timeout controlled by its
                configuration "Timeout"?
                I don't think it has anything to do with modPerl. Am
                I missing something?
                Thanks.

                On Wed, Apr 23, 2025 at 1:41 PM Mithun Bhattacharya
                <mit...@gmail.com> wrote:

                    The timeout happens because of how we handle the
                    request. A timeout basically means that no response
                    came back. That happens because we want to send a
                    correct response. Unfortunately, for long-running
                    requests the correct response shouldn't be conveyed
                    via the HTTP response code, or we face situations
                    like this. Instead, reply with a 200 OK immediately
                    and then provide the correct status in the message
                    body. Once a response code/header has been sent, the
                    timeout won't trigger and you could potentially hold
                    the connection for hours without a problem.

                    On Wed, Apr 23, 2025, 9:32 AM Andreas Mock
                    <andreas.m...@web.de> wrote:

                        Hi Joseph,

                        Your description is very vague, so I can only
                        answer based on some assumptions:

                        It sounds like a timeout is fired somewhere.

                        Best advice in these situations: log as many
                        steps as you can. Keep your eyes open for
                        TCP/IP and higher-level timeouts.

                        Declare only ONE instance responsible for a
                        retry: either the app server calling the
                        dispatcher with several tries, or the
                        dispatcher retrying by itself. Not both.

                        Best regards
                        Andreas


                        On 23.04.2025 at 16:21, Joseph He wrote:
                        > All, good day.
                        >
                        > Here is the issue I have.
                        > My entire application runs in a
                        > mod_perl/Apache environment.
                        > I send an HTTP::Request with a data payload from my
                        > app server to a dispatch server through
                        > LWP::UserAgent, with the timeout set to 600 seconds.
                        >
                        > The dispatch server is supposed to manipulate the
                        > data and send it to an external SFTP server. Because
                        > the SFTP connection can fail, it retries up to 4
                        > times, sleeping 30 seconds after each failure.
                        >
                        > Recently, I found that I sometimes uploaded the file
                        > twice. I figured out that the root cause is that my
                        > dispatch server returns 'failure' at 6 minutes while
                        > it keeps trying to do the SFTP upload. The app server
                        > received an HTTP::Response with an error status, so
                        > it issued another call to send the data. As a result,
                        > I uploaded the identical file twice.
                        >
                        > Has anybody had this sort of experience? Why does
                        > the dispatch server return 'error' while it is still
                        > processing the data?
                        >
                        > Thanks a lot,
                        > Joseph
                        >
