Andreas, I added a sleep 300 to the server code to simulate the stalled SFTP call, and I set the server Timeout to 150. I then ran the curl command with timeouts of 25, 50, 100, 150, 200, and 250 seconds, and curl always timed out accordingly. curl behaves exactly the same as LWP::UserAgent does when its timeout is changed, which verifies that the timeout option on the client side indeed works as desired. But no matter how I tweak the server-side Apache config, it has no effect.
Cheers,
Joe

On Tue, May 13, 2025 at 10:49 AM Andreas Mock <andreas.m...@web.de> wrote:

> Hello Joe,
>
> try to reduce the problem.
>
> Make the call to your SFTP service via curl or some other HTTP(S) client
> to see whether you get the same timeout. If yes, then the server side is
> closing the connection. If not, then you have to investigate the
> LWP::UserAgent part.
>
> Another hint in combination with SSL:
> https://stackoverflow.com/questions/9400068/make-timeout-work-for-lwpuseragent-https
>
> Best regards
> Andreas
>
>
> On 13.05.2025 at 17:22, Joseph He wrote:
>
> Andreas, thank you.
>
> On the client side, I set the timeout on the LWP::UserAgent request to 600,
> and I can verify that it indeed works in my QA and DEV environments. If I
> change it to 120, then it times out at 120.
> On my production server, however, the client side receives a timeout from
> the server after 5 minutes, so I still think the server Timeout plays a role
> here. I just don't know what config I can change to test it out.
>
> Joe
>
> On Tue, May 13, 2025 at 10:07 AM Andreas Mock <andreas.m...@web.de> wrote:
>
>> Hi Joe,
>>
>> when you send a request via LWP::UserAgent to the server which does the
>> long-lasting SFTP calls, then I'm pretty sure that you get a timeout in the
>> LWP::UserAgent code.
>>
>> I'm pretty sure the client (LWP::UserAgent) is not waiting long enough
>> for the answer: https://metacpan.org/pod/LWP::UserAgent#timeout
>>
>> Once you have a long timeout here, you have to be sure that the very first
>> client which sent the very first request also waits long enough to let the
>> application server make several tries, therefore n * timeout.
>>
>> Best wishes
>> Andreas
>>
>>
>> On 13.05.2025 at 16:46, Joseph He wrote:
>>
>> Many thanks to you all.
>>
>> I am still trying to figure out the issue. Let me re-explain the problem
>> I experienced in more detail.
>>
>> The environment is Ubuntu 22.04, Apache2, ModPerl.
>> I send an HTTP::Request with LWP::UserAgent; the server receives the
>> request and starts to process it.
>> But processing takes much longer due to a stalled SFTP call to the remote
>> server, so the Apache server times out and sends back a failure while *the
>> server is actually still trying to process this request*.
>> On the calling side, after receiving the failure status, the client
>> initiates another HTTP::Request and the load balancer redirects this call
>> to another server for processing.
>> As a result, the same HTTP::Request is processed twice.
>>
>> On my production server the timeout happens at the 300-second mark. On my
>> QA and DEV servers, the timeout happens at 600 seconds. I have not changed
>> anything on my production server yet.
>> But on my QA and DEV servers, I have tried changing Timeout in
>> apache2.conf, adding Timeout to the virtualhost config, and adding
>> SetPerlEnv MOD_PERL_TIMEOUT to the virtualhost config;
>> none of them changes the timeout behavior of my QA and DEV servers.
>>
>> So what exactly controls the Timeout? I am totally lost.
>>
>> Cheers,
>> Joe
>>
>>
>> On Wed, Apr 23, 2025 at 5:17 PM Mithun Bhattacharya <mit...@gmail.com>
>> wrote:
>>
>>> Okay, agreed, that is a valid timeout. Basically it is saying that a
>>> client has established a TCP/IP connection but has not sent its request,
>>> whether a GET, PUT, or POST.
>>>
>>> On Wed, Apr 23, 2025, 3:38 PM Joseph He <joseph.he.2...@gmail.com>
>>> wrote:
>>>
>>>> In the Apache2 docs, I found this. How does this timeout work? It looks
>>>> like it can only wait for 300 seconds before failing a request.
>>>>
>>>> https://httpd.apache.org/docs/2.0/mod/core.html#timeout
>>>> Description: <https://httpd.apache.org/docs/2.0/mod/directive-dict.html#Description>
>>>>     Amount of time the server will wait for certain events before failing a request
>>>> Syntax: <https://httpd.apache.org/docs/2.0/mod/directive-dict.html#Syntax>
>>>>     TimeOut seconds
>>>> Default: <https://httpd.apache.org/docs/2.0/mod/directive-dict.html#Default>
>>>>     TimeOut 300
>>>> Context: <https://httpd.apache.org/docs/2.0/mod/directive-dict.html#Context>
>>>>     server config, virtual host
>>>> Status: <https://httpd.apache.org/docs/2.0/mod/directive-dict.html#Status>
>>>>     Core
>>>> Module: <https://httpd.apache.org/docs/2.0/mod/directive-dict.html#Module>
>>>>     core
>>>>
>>>> The TimeOut directive currently defines the amount of time Apache will
>>>> wait for three things:
>>>>
>>>> 1. The total amount of time it takes to receive a GET request.
>>>> 2. The amount of time between receipt of TCP packets on a POST or
>>>>    PUT request.
>>>> 3. The amount of time between ACKs on transmissions of TCP packets
>>>>    in responses.
>>>>
>>>> We plan on making these separately configurable at some point down the
>>>> road. The timer used to default to 1200 before 1.2, but has been lowered to
>>>> 300 which is still far more than necessary in most situations. It is not
>>>> set any lower by default because there may still be odd places in the code
>>>> where the timer is not reset when a packet is sent.
>>>>
>>>> On Wed, Apr 23, 2025 at 3:07 PM Mithun Bhattacharya <mit...@gmail.com>
>>>> wrote:
>>>>
>>>>> You configure timeout at the client side. Apache is at the server
>>>>> side. Server doesn't have a concept of time it could take days to run and
>>>>> not care.
>>>>>
>>>>> mod_perl code is where you are sending the http return status to make
>>>>> sure the client doesn't timeout waiting for the server to respond.
>>>>>
>>>>> On Wed, Apr 23, 2025, 2:19 PM Joseph He <joseph.he.2...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks, all.
>>>>>> Is that Apache timeout controlled by its "Timeout" configuration?
>>>>>> I don't think it has anything to do with ModPerl. Am I missing
>>>>>> something?
>>>>>> Thanks.
>>>>>>
>>>>>> On Wed, Apr 23, 2025 at 1:41 PM Mithun Bhattacharya <mit...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> The timeout happens because of how we handle the request. A timeout
>>>>>>> basically means no response came back. That happens because we think
>>>>>>> we want to send a correct response. Unfortunately, for long-running
>>>>>>> requests the correct response shouldn't be delivered via the HTTP
>>>>>>> response code, or we face situations like this. Instead, reply with a
>>>>>>> 200 OK immediately and then provide the correct status in the message
>>>>>>> body. Once a response code/header has been sent, the timeout won't
>>>>>>> trigger and you could potentially hold the connection for hours
>>>>>>> without a problem.
>>>>>>>
>>>>>>> On Wed, Apr 23, 2025, 9:32 AM Andreas Mock <andreas.m...@web.de>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Joseph,
>>>>>>>>
>>>>>>>> your description is very vague, so I can only answer based on some
>>>>>>>> assumptions:
>>>>>>>>
>>>>>>>> It sounds like a timeout is fired somewhere.
>>>>>>>>
>>>>>>>> Best advice in these situations: Log as many steps as you can. Keep
>>>>>>>> your eyes open for TCP/IP and higher-level timeouts.
>>>>>>>>
>>>>>>>> Declare only ONE instance responsible for a retry: either the app
>>>>>>>> server calling the dispatcher with several tries, or the dispatcher
>>>>>>>> retrying by itself. Not both.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>>
>>>>>>>> On 23.04.2025 at 16:21, Joseph He wrote:
>>>>>>>> > All, good day.
>>>>>>>> >
>>>>>>>> > Here is the issue I have.
>>>>>>>> > My entire application is running in a ModPerl/Apache environment.
>>>>>>>> > I send an HTTP::Request with a data payload from my App server to a
>>>>>>>> > dispatch server through LWP::UserAgent, with the timeout set to
>>>>>>>> > 600 seconds.
>>>>>>>> >
>>>>>>>> > The dispatch server is supposed to manipulate the data and send
>>>>>>>> > it to an external SFTP server. Because the SFTP connection can
>>>>>>>> > fail, the dispatch server retries up to 4 times, sleeping 30
>>>>>>>> > seconds after each failed connection.
>>>>>>>> >
>>>>>>>> > Recently, I found that I sometimes uploaded the file twice. I
>>>>>>>> > figured out the root cause is that my dispatch server returns
>>>>>>>> > 'failure' at 6 minutes while it keeps trying to do the SFTP. The
>>>>>>>> > App server received an HTTP::Response with an error status, so it
>>>>>>>> > issued another call to send the data. As a result, I uploaded the
>>>>>>>> > identical file twice.
>>>>>>>> >
>>>>>>>> > Has anybody had this sort of experience? Why does the dispatch
>>>>>>>> > server return 'error' while it is still processing the data?
>>>>>>>> >
>>>>>>>> > Thanks a lot,
>>>>>>>> > Joseph
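
For anyone who wants to repeat the curl check described at the top of this thread, here is a minimal sketch, assuming bash and curl are available. The function name probe_timeouts and the URL in the usage comment are hypothetical placeholders, not values from the servers discussed here. It calls the same endpoint once per client-side timeout and prints curl's exit code, the HTTP status, and the elapsed time for each call.

```shell
#!/usr/bin/env bash
# Sketch only: probe a slow endpoint with a series of client-side timeouts.
# probe_timeouts is a hypothetical helper name; the URL is supplied by the caller.

probe_timeouts() {
    local url=$1; shift
    local t start end rc code
    for t in "$@"; do
        start=$(date +%s)
        # --max-time caps the whole transfer in seconds.
        # --write-out '%{http_code}' prints the HTTP status (000 if none arrived).
        code=$(curl --silent --output /dev/null --max-time "$t" \
                    --write-out '%{http_code}' "$url") && rc=0 || rc=$?
        end=$(date +%s)
        echo "max-time=${t}s exit=${rc} http=${code} elapsed=$((end - start))s"
    done
}

# Usage (hypothetical URL, timeouts taken from the test described above):
# probe_timeouts "https://dispatch.example.com/upload" 25 50 100 150 200 250
```

curl exits with code 28 when its own timeout fires, so a line showing exit=28 with elapsed roughly equal to max-time points at the client-side limit. A line where the call ends earlier than max-time, with a different exit code or an HTTP error status, would suggest that something on the server side (or in between) ended the request first.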