nclaeys opened a new issue #20099:
URL: https://github.com/apache/airflow/issues/20099


   ### Apache Airflow version
   
   2.2.2 (latest released)
   
   ### What happened
   
   After I delete the DAG, the scheduler starts crash-looping and cannot 
recover. This means that a problem with a single DAG brings the whole 
environment down. 
   
   The stacktrace is as follows:
   
   `
   airflow-scheduler [2021-12-07 09:30:07,483] {kubernetes_executor.py:791} INFO - Shutting down Kubernetes executor
   airflow-scheduler [2021-12-07 09:30:08,509] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 1472
   airflow-scheduler [2021-12-07 09:30:08,681] {process_utils.py:66} INFO - Process psutil.Process(pid=1472, status='terminated', exitcode=0, started='09:28:37') (1472) terminated with exit
   airflow-scheduler [2021-12-07 09:30:08,681] {scheduler_job.py:655} INFO - Exited execute loop
   airflow-scheduler Traceback (most recent call last):
   airflow-scheduler   File "/home/airflow/.local/bin/airflow", line 8, in <module>
   airflow-scheduler     sys.exit(main())
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/__main__.py", line 48, in main
   airflow-scheduler     args.func(args)
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 48, in command
   airflow-scheduler     return func(*args, **kwargs)
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/cli.py", line 92, in wrapper
   airflow-scheduler     return f(*args, **kwargs)
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/scheduler_command.py", line 75, in scheduler
   airflow-scheduler     _run_scheduler_job(args=args)
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/scheduler_command.py", line 46, in _run_scheduler_job
   airflow-scheduler     job.run()
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/base_job.py", line 245, in run
   airflow-scheduler     self._execute()
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 628, in _execute
   airflow-scheduler     self._run_scheduler_loop()
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 709, in _run_scheduler_loop
   airflow-scheduler     num_queued_tis = self._do_scheduling(session)
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 820, in _do_scheduling
   airflow-scheduler     num_queued_tis = self._critical_section_execute_task_instances(session=session)
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 483, in _critical_section_execute_task_instances
   airflow-scheduler     queued_tis = self._executable_task_instances_to_queued(max_tis, session=session)
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", line 67, in wrapper
   airflow-scheduler     return func(*args, **kwargs)
   airflow-scheduler   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 366, in _executable_task_instances_to_queued
   airflow-scheduler     if serialized_dag.has_task(task_instance.task_id):
   airflow-scheduler AttributeError: 'NoneType' object has no attribute 'has_task'
   `
   
   
   ### What you expected to happen
   
   I expect the scheduler not to crash when a DAG gets deleted. The biggest 
issue, however, is that the whole environment goes down. It would be 
acceptable for the scheduler to have problems with that DAG (it is deleted 
after all), but this should not affect the other DAGs in the environment.
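
As a rough illustration of the behaviour I would expect, here is a minimal sketch in plain Python. It is not Airflow's code and not a proposed patch: `FakeTaskInstance`, `FakeSerializedDag` and `select_schedulable` are hypothetical stand-ins, and the dictionary lookup only imitates a DAG lookup that returns `None` for a deleted DAG. The point is that a missing DAG is skipped and the task instances of other DAGs are still processed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class FakeTaskInstance:
    """Hypothetical stand-in for a queued task instance."""
    dag_id: str
    task_id: str


class FakeSerializedDag:
    """Hypothetical stand-in for a serialized DAG; only has_task is modelled."""

    def __init__(self, task_ids: Set[str]) -> None:
        self._task_ids = task_ids

    def has_task(self, task_id: str) -> bool:
        return task_id in self._task_ids


def select_schedulable(
    task_instances: List[FakeTaskInstance],
    dags: Dict[str, FakeSerializedDag],
) -> List[FakeTaskInstance]:
    """Return the task instances whose DAG and task still exist."""
    schedulable = []
    for ti in task_instances:
        serialized_dag: Optional[FakeSerializedDag] = dags.get(ti.dag_id)
        if serialized_dag is None:
            # The DAG was deleted: skip this task instance instead of raising.
            print(f"DAG {ti.dag_id!r} not found, skipping task {ti.task_id!r}")
            continue
        if serialized_dag.has_task(ti.task_id):
            schedulable.append(ti)
    return schedulable


# Assumed scenario: "testnielsdev" has been deleted, "other_dag" still exists.
dags = {"other_dag": FakeSerializedDag({"load"})}
queued = [
    FakeTaskInstance("testnielsdev", "sample"),
    FakeTaskInstance("other_dag", "load"),
]
result = select_schedulable(queued, dags)
```

With this kind of guard, the deleted DAG only produces a log line, and `result` still contains the task instance of `other_dag`.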
   
   ### How to reproduce
   
   1. I created the following dag:
   
   `
   from airflow import DAG
   from datafy.operators import DatafyContainerOperatorV2
   from datetime import datetime, timedelta
   default_args = {
       "owner": "Datafy",
       "depends_on_past": False,
       "start_date": datetime(year=2021, month=12, day=1),
       "task_concurrency": 4,
       "retries": 2,
       "retry_delay": timedelta(minutes=5),
   }
   
   dag = DAG(
       "testnielsdev",
       default_args=default_args,
       max_active_runs=default_args["task_concurrency"] + 1,
       schedule_interval="0 1 * * *",
   )
   
   DatafyContainerOperatorV2(
       dag=dag,
       task_id="sample",
       cmds=["python"],
       arguments=["-m", "testnielsdev.sample", "--date", "{{ ds }}", "--env", "{{ macros.datafy.env() }}"],
       instance_type="mx_small",
       instance_life_cycle="spot",
   )
   `
   Judging from the Airflow code, the setting that matters most here, apart 
from the defaults, is task_concurrency.
   2. I enable the dag
   3. I delete it. When the file gets removed, the scheduler starts 
crashlooping.
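
To make the role of task_concurrency in these steps more concrete, here is a small sketch of how I read the failing path. It is an assumption on my side, not a copy of the scheduler code: I assume the serialized DAG is only looked up when the DAG has a task-level concurrency limit, and that the lookup returns `None` once the DAG is gone. `FakeSerializedDag`, `check_task` and the `store` dictionary are hypothetical names used only for illustration.

```python
from typing import Dict, Optional, Set


class FakeSerializedDag:
    """Hypothetical stand-in for a serialized DAG; only has_task is modelled."""

    def __init__(self, task_ids: Set[str]) -> None:
        self._task_ids = task_ids

    def has_task(self, task_id: str) -> bool:
        return task_id in self._task_ids


def check_task(
    dag_id: str,
    task_id: str,
    has_task_concurrency_limits: bool,
    store: Dict[str, FakeSerializedDag],
) -> str:
    """Unguarded check: the DAG is only looked up when a limit is configured."""
    if not has_task_concurrency_limits:
        # Assumed fast path: no task-level limit, so no DAG lookup at all.
        return "no lookup"
    serialized_dag = store.get(dag_id)  # None once the DAG has been deleted
    if serialized_dag.has_task(task_id):  # raises AttributeError on None
        return "limit applies"
    return "task missing"


store: Dict[str, FakeSerializedDag] = {}  # the DAG has already been removed

# Without a task-level limit the deleted DAG is never looked up.
without_limit = check_task("testnielsdev", "sample", False, store)

# With a task-level limit the lookup returns None and the call fails.
error: Optional[str] = None
try:
    check_task("testnielsdev", "sample", True, store)
except AttributeError as exc:
    error = str(exc)
```

Under these assumptions, only the call with a task-level limit reaches the `has_task` call on `None`, which would explain why the crash shows up for a DAG that sets task_concurrency.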
   
   ### Operating System
   
   We use the default airflow docker image
   
   ### Versions of Apache Airflow Providers
   
   Not relevant 
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   Not relevant 
   
   ### Anything else
   
   It occurred at one of our customers and I was quickly able to reproduce the 
issue.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   



