argibbs commented on issue #34339:
URL: https://github.com/apache/airflow/issues/34339#issuecomment-1718286614

   > ++ some guess on my side - we recently had such an issue, also related to 
infrastructure instability. Logs showed that tasks executed successfully but 
updates in the DB failed due to network connection problems. We could find 
connectivity exceptions in the worker stdout. So, questions:
   > 
   > * Is it possible for you to reproduce this and capture the worker logs for 
that time?
   > * Can you share these logs and potentially the scheduler logs from the same 
timeframe?
   
   I have been able to reliably reproduce this since upgrading to 2.7.0. 
However, I have scoured the worker logs, and there are never any obvious 
errors. The same goes for the db logs. I haven't seen anything in the 
scheduler logs either, but they're noisy. The only obvious errors I've found 
so far are the DAG processor timeout errors.
   
   Also worth noting: I really only changed one thing, upgrading from 
2.3.3->2.7.0->2.7.1. If we'd been experiencing network issues, I'd have 
expected them to be version agnostic, rather than manifesting only once I'd 
upgraded to 2.7. I'm not discounting it (I'm working on removing some config 
errors that are causing log errors in the scheduler, so I can better grep for 
failures), but it's not my primary suspect right now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
