ofgrenudo commented on issue #57158:
URL: https://github.com/apache/airflow/issues/57158#issuecomment-3444791356

   hmmm so occassionally, we will have an issue where the logs will stop 
rendering. the resolution has always been to restart the scheduler. Im trying 
to find the code that causes this. 
   
   We are currently using a docker deployment. I have at times been able to 
reproduce it when im working locally, but i always figured it was my machine 
that was causing some sort of  state error instead of airflow having a bug. 
   
   I would appreciate some direction in where to look to see if airflow can 
provide any logs as to whats happenning.
   
   ## Some facts that make this more confusing
   
   When this happens, if you are at the web ui, and trigger a task, it is 
successfully ran on the worker. But the logs are not retrieved from the job. If 
i log into the web server, through docker exec, I can view the logs and see my 
jobs are running, but they are reported as failures. in the web ui. for jobs 
that are supposed to re run on failure, this can be an issue because now the 
job is successfully executing, but is reporting as failed causing it to re run. 
Our solution would probably be to write some more itempotent code, but either 
way, airflow running into this state error is a major issue for us. 
   
   again to recap, where should i begin to look to try and diagnose this issue. 
   
   Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to