Andreas, thank you. On the client side, I set the LWP::UserAgent timeout to 600 seconds, and I can verify that it takes effect in my QA and DEV environments: if I change it to 120, the request times out at 120 seconds. On my production server, however, the client receives a timeout from the server after 5 minutes, so I still think the server-side Timeout plays a role here. I just don't know which configuration setting I can change to test that.
Joe

On Tue, May 13, 2025 at 10:07 AM Andreas Mock <andreas.m...@web.de> wrote:

> Hi Joe,
>
> when you send a request via LWP::UserAgent to the server which does the
> long-lasting SFTP calls, then I'm pretty sure that you get a timeout in the
> LWP::UserAgent code.
>
> I'm pretty sure the client (LWP::UserAgent) is not waiting long enough for
> the answer: https://metacpan.org/pod/LWP::UserAgent#timeout
>
> Once you have a long timeout here, you have to be sure that the very first
> client which sent the very first request also waits long enough to let the
> application server make several tries, therefore n * timeout.
>
> Best wishes
> Andreas
>
>
> On 13.05.2025 at 16:46, Joseph He wrote:
>
> Many thanks to you all.
>
> I am still trying to figure out the issue. Let me re-explain the problem I
> experienced in more detail.
>
> The environment is Ubuntu 22.04, Apache2, ModPerl.
> I run an HTTP::Request with LWP::UserAgent; the server receives the request
> and starts to process it.
> But it takes much longer due to a stalled SFTP call to the remote server,
> so the Apache server times out and sends back a failure while *the server
> is actually still trying to process this request*.
> On the calling side, after receiving the failure status, it initiates
> another HTTP::Request and the load balancer redirects this call to another
> server for processing.
> As a result, the same HTTP::Request is processed twice.
>
> On my production server the timeout happens at the 300-second mark. On my
> QA and Dev servers, the timeout happens at 600 seconds. I have not changed
> anything on my production server yet.
> But on my QA and DEV servers, I have tried changing Timeout in
> apache2.conf, adding Timeout to the virtualhost config, and also
> adding SetPerlEnv MOD_PERL_TIMEOUT to the virtualhost config;
> none of them changes the timeout behavior of my QA and DEV servers.
>
> So what exactly controls the Timeout? I am totally lost.
>
> Cheers,
> Joe
>
>
> On Wed, Apr 23, 2025 at 5:17 PM Mithun Bhattacharya <mit...@gmail.com>
> wrote:
>
>> Okay, agreed, that is a valid timeout. Basically it is saying that a
>> client has established a TCP/IP connection but has not sent its request,
>> whether a GET, PUT, or POST.
>>
>> On Wed, Apr 23, 2025, 3:38 PM Joseph He <joseph.he.2...@gmail.com> wrote:
>>
>>> In the Apache2 docs, I found this. How does this timeout work? It looks
>>> like it can only wait for 300 seconds before failing a request.
>>>
>>> https://httpd.apache.org/docs/2.0/mod/core.html#timeout
>>> Description: Amount of time the server will wait for certain events
>>>              before failing a request
>>> Syntax:      TimeOut seconds
>>> Default:     TimeOut 300
>>> Context:     server config, virtual host
>>> Status:      Core
>>> Module:      core
>>>
>>> The TimeOut directive currently defines the amount of time Apache will
>>> wait for three things:
>>>
>>> 1. The total amount of time it takes to receive a GET request.
>>> 2. The amount of time between receipt of TCP packets on a POST or
>>>    PUT request.
>>> 3. The amount of time between ACKs on transmissions of TCP packets
>>>    in responses.
>>>
>>> We plan on making these separately configurable at some point down the
>>> road. The timer used to default to 1200 before 1.2, but has been lowered to
>>> 300 which is still far more than necessary in most situations. It is not
>>> set any lower by default because there may still be odd places in the code
>>> where the timer is not reset when a packet is sent.
>>>
>>> On Wed, Apr 23, 2025 at 3:07 PM Mithun Bhattacharya <mit...@gmail.com>
>>> wrote:
>>>
>>>> You configure the timeout on the client side. Apache is on the server
>>>> side. The server doesn't have a concept of time; it could take days to
>>>> run and not care.
>>>>
>>>> The mod_perl code is where you send the HTTP return status to make
>>>> sure the client doesn't time out waiting for the server to respond.
>>>>
>>>> On Wed, Apr 23, 2025, 2:19 PM Joseph He <joseph.he.2...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks, all.
>>>>> Is that Apache timeout controlled by its configuration "Timeout"?
>>>>> I don't think it has anything to do with modPerl. Am I missing
>>>>> something?
>>>>> Thanks.
>>>>>
>>>>> On Wed, Apr 23, 2025 at 1:41 PM Mithun Bhattacharya <mit...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> A timeout happens because of how we handle the request. A timeout
>>>>>> basically means no response came back. That happens because we want
>>>>>> to wait until we can send a correct response. Unfortunately, for
>>>>>> long-running requests the correct status shouldn't be conveyed via
>>>>>> the HTTP response code, or we face situations like this. Instead,
>>>>>> reply with a 200 OK immediately and then provide the correct status
>>>>>> in the message body. Once a response code/header has been sent, the
>>>>>> timeout won't trigger and you could potentially hold the connection
>>>>>> for hours without a problem.
>>>>>>
>>>>>> On Wed, Apr 23, 2025, 9:32 AM Andreas Mock <andreas.m...@web.de>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Joseph,
>>>>>>>
>>>>>>> your description is very vague, so I can only answer based on some
>>>>>>> assumptions:
>>>>>>>
>>>>>>> It sounds like a timeout is fired somewhere.
>>>>>>>
>>>>>>> Best advice in these situations: Log as many steps as you can. Keep
>>>>>>> your eyes open for TCP/IP and higher-level timeouts.
>>>>>>>
>>>>>>> Declare only ONE instance responsible for a retry: either the app
>>>>>>> server calling the dispatcher with several tries, or the dispatcher
>>>>>>> retrying by itself. Not both.
>>>>>>>
>>>>>>> Best regards
>>>>>>> Andreas
>>>>>>>
>>>>>>>
>>>>>>> On 23.04.2025 at 16:21, Joseph He wrote:
>>>>>>> > All, good day.
>>>>>>> >
>>>>>>> > Here is the issue I have.
>>>>>>> > My entire application is running in a ModPerl/Apache environment.
>>>>>>> > I send an HTTP::Request with a data payload from my App server to
>>>>>>> > a dispatch server through LWP::UserAgent, with the timeout set to
>>>>>>> > 600 seconds.
>>>>>>> >
>>>>>>> > The dispatch server is supposed to manipulate the data and send
>>>>>>> > the data to an external SFTP server. Because the SFTP can fail, it
>>>>>>> > will keep trying up to 4 times, with a 30-second sleep in between,
>>>>>>> > in case the SFTP connection fails.
>>>>>>> >
>>>>>>> > Recently, I found that I sometimes uploaded the file twice. I
>>>>>>> > figured out the root cause is that my Dispatch server returns
>>>>>>> > 'failure' at 6 minutes while it keeps trying to do the SFTP. The
>>>>>>> > App server received an HTTP::Response with error status, so it
>>>>>>> > issued another call to send data. As a result, I uploaded the
>>>>>>> > identical file twice.
>>>>>>> >
>>>>>>> > Has anybody had this sort of experience? Why does the dispatch
>>>>>>> > server return 'error' while it still processes the data?
>>>>>>> >
>>>>>>> > Thanks a lot,
>>>>>>> > Joseph
>>>>>>> >
>>>>>>>
>>>>>>
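
The "n * timeout" rule of thumb from the thread can be checked with a little arithmetic. The following is only a sketch: the 4 tries and the 30-second sleep come from the original post, while the 120-second limit per SFTP attempt is an invented figure for illustration, and the helper names `worst_case_seconds` and `limit_is_sufficient` are hypothetical. It compares the worst-case processing time on the dispatch server with the 600-second and 300-second limits mentioned in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Worst-case time the dispatch server needs before it can answer:
# every attempt runs into its own timeout, with a sleep between attempts.
sub worst_case_seconds {
    my ($attempts, $per_attempt_timeout, $sleep_between) = @_;
    return $attempts * $per_attempt_timeout
         + ($attempts - 1) * $sleep_between;
}

# Returns 1 if a caller that waits $limit seconds outlasts the worst case,
# 0 otherwise.
sub limit_is_sufficient {
    my ($limit, $worst_case) = @_;
    return $limit > $worst_case ? 1 : 0;
}

# From the thread: up to 4 tries, 30 seconds of sleep between tries.
# ASSUMPTION: 120 seconds per SFTP attempt is a made-up value.
my $worst = worst_case_seconds(4, 120, 30);

printf "worst case: %d s\n", $worst;
printf "600 s limit sufficient: %s\n",
    limit_is_sufficient(600, $worst) ? 'yes' : 'no';
printf "300 s limit sufficient: %s\n",
    limit_is_sufficient(300, $worst) ? 'yes' : 'no';
```

Under these assumed numbers the worst case is 570 seconds, so a 600-second client timeout (for example the value passed as `timeout` to `LWP::UserAgent->new`) would be long enough, while any 300-second limit between the client and the dispatch server would cut the request off while the retries are still running.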