rusackas commented on issue #40047: URL: https://github.com/apache/superset/issues/40047#issuecomment-4753542709
Thanks for the detailed logs @adayush, they turned out to be the key here. Dug into this and I think it's an actual bug, not just an environment quirk: a few of the network calls in the report path had no socket timeout, so if the SMTP server (or the chart-data URL the worker hits for CSV) becomes unreachable, the socket blocks *forever* and leaves the schedule wedged in `WORKING`. That lines up with every format breaking, with resetting the state not helping, with no logs showing up after the run starts, and with the problem appearing after an environment change without an image rebuild.

Put up #41250 to bound those calls (SMTP, the CSV/dataframe fetch, and Selenium's page load) so a hang turns into a logged `ERROR` and a freed worker instead of a stuck schedule.

In the meantime it's worth confirming that this particular worker can actually reach your SMTP host and `WEBDRIVER_BASEURL`, since one of those going unroutable ~10 days ago is the likely trigger.
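For anyone following along, here's a minimal sketch of the failure mode and the general shape of the fix, using only the Python standard library. It is not the code from #41250 or from Superset itself; `smtp_reachable` and `make_silent_listener` are hypothetical helper names made up for illustration. `smtplib.SMTP` has no socket timeout unless you pass one (or set a global default), so a server that accepts the connection but never answers will block the caller indefinitely. Passing `timeout=` turns that hang into an exception you can log.

```python
import smtplib
import socket


def smtp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the SMTP server greets us within `timeout` seconds.

    Hypothetical helper for illustration only, not part of Superset.
    """
    try:
        # Without timeout=..., a silent server would block this call forever.
        with smtplib.SMTP(host, port, timeout=timeout) as client:
            client.noop()
        return True
    except OSError as exc:
        # smtplib's exceptions and socket timeouts are all OSError subclasses.
        print(f"ERROR: SMTP check for {host}:{port} failed: {exc}")
        return False


def make_silent_listener() -> socket.socket:
    """Listen on a free local port but never accept or reply.

    This imitates a host that completes the TCP handshake and then goes quiet.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv


if __name__ == "__main__":
    listener = make_silent_listener()
    port = listener.getsockname()[1]
    # Returns False after roughly one second instead of hanging.
    print(smtp_reachable("127.0.0.1", port, timeout=1.0))
    listener.close()
```

You can point `smtp_reachable` at your real SMTP host and port from inside the worker container as a quick reachability check. The same idea applies to the other calls: `urllib.request.urlopen(url, timeout=...)` bounds an HTTP fetch, and Selenium's `driver.set_page_load_timeout(seconds)` bounds a page load.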
