rusackas commented on issue #40047:
URL: https://github.com/apache/superset/issues/40047#issuecomment-4753542709

   Thanks for the detailed logs @adayush, they turned out to be the key here. 
Dug into this and I think it's an actual bug, not just an environment quirk: a 
few of the network calls in the report path had no socket timeout, so if the 
SMTP server (or the chart-data URL the worker hits for CSV) becomes 
unreachable, the socket blocks *forever* and leaves the schedule wedged in 
`WORKING`. That lines up with every format failing, resetting the state not 
helping, and no logs appearing after the run starts, and with the problem 
first showing up after an environment change without an image rebuild.
   
   Put up #41250 to bound those calls (SMTP, the CSV/dataframe fetch, and 
Selenium's page load) so a hang turns into a logged `ERROR` and a freed worker 
instead of a stuck schedule.
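
   For illustration only, here is a minimal sketch of the failure mode and the 
general shape of the fix. This is not the code in #41250; `read_with_timeout` 
and `start_silent_server` are hypothetical names, and the silent local server 
just stands in for an unreachable or unresponsive peer. Without a timeout the 
`recv()` below would block indefinitely; with one, it returns control to the 
caller. Python's `smtplib.SMTP` takes a `timeout` argument that bounds its 
socket in the same way.

```python
import socket


def start_silent_server():
    """Hypothetical stand-in for a hung peer: it listens but never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)               # the kernel completes connects; nobody answers
    return server


def read_with_timeout(host, port, timeout_s):
    """Connect and wait for one byte, giving up after timeout_s seconds.

    Returns the byte read, or None if the peer stayed silent.
    """
    # create_connection leaves the timeout set on the returned socket,
    # so the recv() below is bounded as well as the connect.
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        try:
            return sock.recv(1)
        except socket.timeout:
            # A hang becomes an ordinary, loggable error path.
            return None


if __name__ == "__main__":
    server = start_silent_server()
    host, port = server.getsockname()
    print(read_with_timeout(host, port, timeout_s=0.5))
    server.close()
```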
   
   In the meantime it's worth confirming this particular worker can actually 
reach your SMTP host and `WEBDRIVER_BASEURL`, since one of those going 
unroutable ~10 days ago is the likely trigger.
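
   One rough way to run that check from inside the worker container is a 
plain TCP connect with a short timeout. This is a sketch under assumptions: 
`can_reach` and `host_port_from_url` are hypothetical helpers, and the host 
names and ports in the usage section are placeholders to replace with your 
own SMTP settings and `WEBDRIVER_BASEURL`. It only tests that a TCP connection 
can be opened, not that SMTP auth or the page itself works.

```python
import socket
from urllib.parse import urlsplit


def can_reach(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False


def host_port_from_url(url):
    """Extract (host, port) from a URL, defaulting the port by scheme."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


if __name__ == "__main__":
    # Placeholder values; substitute the ones from your own config.
    smtp_host, smtp_port = "smtp.example.com", 587
    webdriver_baseurl = "http://superset:8088/"

    print("SMTP reachable:", can_reach(smtp_host, smtp_port))
    print("WEBDRIVER_BASEURL reachable:",
          can_reach(*host_port_from_url(webdriver_baseurl)))
```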
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

