gbonazzoli opened a new issue #19343:
URL: https://github.com/apache/airflow/issues/19343
### Apache Airflow version
2.2.1 (latest released)
### Operating System
Ubuntu 20.04.3 LTS
### Versions of Apache Airflow Providers
```
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-microsoft-mssql==2.0.1
apache-airflow-providers-microsoft-winrm==2.0.1
apache-airflow-providers-openfaas==2.0.0
apache-airflow-providers-oracle==2.0.1
apache-airflow-providers-samba==3.0.0
apache-airflow-providers-sftp==2.1.1
apache-airflow-providers-sqlite==2.0.1
apache-airflow-providers-ssh==2.2.0
```
### Deployment
Virtualenv installation
### Deployment details
Airflow 2.2.1 on an LXD container, "all in one" (webserver, scheduler, database == postgres)
### What happened
I don't know if it is related to the daylight-saving time change we had in Italy, when the clocks moved back from 03:00 to 02:00 over the night of October 30th.
The result is that during that same day the scheduler was compromised.
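For context, here is a small sketch (using `pendulum`, the timezone library Airflow relies on) of why that night is ambiguous: the local time 02:30 occurred twice. The timestamps assume the standard EU changeover, which took effect in the early hours of October 31st; nothing below is taken from my environment.
```python
# Sketch only: shows the repeated local hour in Europe/Rome when DST ended.
# Assumes the standard 2021 EU changeover (03:00 CEST -> 02:00 CET on Oct 31).
import pendulum

rome = pendulum.timezone("Europe/Rome")

# The same local wall-clock time, 02:30, maps to two different UTC instants:
print(rome.convert(pendulum.datetime(2021, 10, 31, 0, 30, tz="UTC")))  # 2021-10-31 02:30:00+02:00 (CEST)
print(rome.convert(pendulum.datetime(2021, 10, 31, 1, 30, tz="UTC")))  # 2021-10-31 02:30:00+01:00 (CET)
```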
The output of the command `airflow scheduler` is:
```
root@new-airflow:~/airflow# airflow scheduler
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  / _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/  /_/  /_/  \____/____/|__/
[2021-11-01 02:00:41,181] {scheduler_job.py:596} INFO - Starting the scheduler
[2021-11-01 02:00:41,181] {scheduler_job.py:601} INFO - Processing each file at most -1 times
[2021-11-01 02:00:41,267] {manager.py:163} INFO - Launched DagFileProcessorManager with pid: 12284
[2021-11-01 02:00:41,268] {scheduler_job.py:1115} INFO - Resetting orphaned tasks for active dag runs
[2021-11-01 02:00:41,269] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
[2021-11-01 02:00:41,332] {celery_executor.py:493} INFO - Adopted the following 7 tasks from a dead executor
<TaskInstance: EXEC_SAVE_ORACLE_SOURCE.PSOFA_PSO_PKG_UTILITY_SP_SAVE_ORACLE_SOURCE scheduled__2021-10-30T17:30:00+00:00 [running]> in state STARTED
<TaskInstance: EXEC_MAIL_VENDUTO_LIMASTE-BALLETTA.PSO_SP_DATI_BB_V4 scheduled__2021-10-30T19:00:00+00:00 [running]> in state STARTED
<TaskInstance: EXEC_MAIL_TASSO_CONV.PSO_SP_DATI_BB_V6 scheduled__2021-10-30T20:35:00+00:00 [running]> in state STARTED
<TaskInstance: EXEC_MAIL_VENDUTO_UNICA_AM.PSO_SP_DATI_BB_V6 scheduled__2021-10-30T19:20:00+00:00 [running]> in state STARTED
<TaskInstance: EXEC_BI_ASYNC.bi_pkg_batch_carica_async_2 scheduled__2021-10-30T23:00:00+00:00 [running]> in state STARTED
<TaskInstance: EXEC_MAIL_INGRESSI_UNICA.PSO_SP_INGRESSI_BB_V4 scheduled__2021-10-30T20:15:00+00:00 [running]> in state STARTED
<TaskInstance: API_REFRESH_PSO_ANALISI_CONS_ORDINE_EXCEL.Refresh_Table scheduled__2021-10-31T07:29:00+00:00 [running]> in state STARTED
[2021-11-01 02:00:41,440] {dagrun.py:511} INFO - Marking run <DagRun EXEC_CALCOLO_FILTRO_RR_INCREMENTALE @ 2021-10-30 18:00:00+00:00: scheduled__2021-10-30T18:00:00+00:00, externally triggered: False> successful
[2021-11-01 02:00:41,441] {dagrun.py:556} INFO - DagRun Finished: dag_id=EXEC_CALCOLO_FILTRO_RR_INCREMENTALE, execution_date=2021-10-30 18:00:00+00:00, run_id=scheduled__2021-10-30T18:00:00+00:00, run_start_date=2021-10-31 09:00:00.440704+00:00, run_end_date=2021-11-01 01:00:41.441139+00:00, run_duration=57641.000435, state=success, external_trigger=False, run_type=scheduled, data_interval_start=2021-10-30 18:00:00+00:00, data_interval_end=2021-10-31 09:00:00+00:00, dag_hash=91db1a3fa29d7dba470ee53feddb124b
[2021-11-01 02:00:41,444] {scheduler_job.py:644} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 628, in _execute
    self._run_scheduler_loop()
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 709, in _run_scheduler_loop
    num_queued_tis = self._do_scheduling(session)
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 792, in _do_scheduling
    callback_to_run = self._schedule_dag_run(dag_run, session)
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 1044, in _schedule_dag_run
    self._update_dag_next_dagruns(dag, dag_model, active_runs)
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 935, in _update_dag_next_dagruns
    data_interval = dag.get_next_data_interval(dag_model)
  File "/usr/local/lib/python3.8/dist-packages/airflow/models/dag.py", line 629, in get_next_data_interval
    return self.infer_automated_data_interval(dag_model.next_dagrun)
  File "/usr/local/lib/python3.8/dist-packages/airflow/models/dag.py", line 667, in infer_automated_data_interval
    end = cast(CronDataIntervalTimetable, self.timetable)._get_next(start)
  File "/usr/local/lib/python3.8/dist-packages/airflow/timetables/interval.py", line 171, in _get_next
    naive = make_naive(current, self._timezone)
  File "/usr/local/lib/python3.8/dist-packages/airflow/utils/timezone.py", line 143, in make_naive
    if is_naive(value):
  File "/usr/local/lib/python3.8/dist-packages/airflow/utils/timezone.py", line 50, in is_naive
    return value.utcoffset() is None
AttributeError: 'NoneType' object has no attribute 'utcoffset'
[2021-11-01 02:00:42,459] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 12284
[2021-11-01 02:00:42,753] {process_utils.py:212} INFO - Waiting up to 5 seconds for processes to exit...
[2021-11-01 02:00:42,792] {process_utils.py:66} INFO - Process psutil.Process(pid=12342, status='terminated', started='02:00:41') (12342) terminated with exit code None
[2021-11-01 02:00:42,792] {process_utils.py:66} INFO - Process psutil.Process(pid=12284, status='terminated', exitcode=0, started='02:00:40') (12284) terminated with exit code 0
[2021-11-01 02:00:42,792] {process_utils.py:66} INFO - Process psutil.Process(pid=12317, status='terminated', started='02:00:41') (12317) terminated with exit code None
[2021-11-01 02:00:42,792] {scheduler_job.py:655} INFO - Exited execute loop
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/airflow/__main__.py", line 48, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 92, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/scheduler_command.py", line 75, in scheduler
    _run_scheduler_job(args=args)
  File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/scheduler_command.py", line 46, in _run_scheduler_job
    job.run()
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/base_job.py", line 245, in run
    self._execute()
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 628, in _execute
    self._run_scheduler_loop()
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 709, in _run_scheduler_loop
    num_queued_tis = self._do_scheduling(session)
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 792, in _do_scheduling
    callback_to_run = self._schedule_dag_run(dag_run, session)
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 1044, in _schedule_dag_run
    self._update_dag_next_dagruns(dag, dag_model, active_runs)
  File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 935, in _update_dag_next_dagruns
    data_interval = dag.get_next_data_interval(dag_model)
  File "/usr/local/lib/python3.8/dist-packages/airflow/models/dag.py", line 629, in get_next_data_interval
    return self.infer_automated_data_interval(dag_model.next_dagrun)
  File "/usr/local/lib/python3.8/dist-packages/airflow/models/dag.py", line 667, in infer_automated_data_interval
    end = cast(CronDataIntervalTimetable, self.timetable)._get_next(start)
  File "/usr/local/lib/python3.8/dist-packages/airflow/timetables/interval.py", line 171, in _get_next
    naive = make_naive(current, self._timezone)
  File "/usr/local/lib/python3.8/dist-packages/airflow/utils/timezone.py", line 143, in make_naive
    if is_naive(value):
  File "/usr/local/lib/python3.8/dist-packages/airflow/utils/timezone.py", line 50, in is_naive
    return value.utcoffset() is None
AttributeError: 'NoneType' object has no attribute 'utcoffset'
```
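If I read the traceback correctly, `dag_model.next_dagrun` was `None` when `get_next_data_interval` handed it down to `is_naive`, which then called `.utcoffset()` on `None`. A minimal sketch of that failure mode (my `is_naive` below just mirrors what `airflow/utils/timezone.py` does at line 50; it is not the real Airflow code):
```python
# Sketch of the crash: is_naive() assumes it receives a datetime,
# so passing None blows up exactly like the scheduler log above.
from datetime import datetime, timezone

def is_naive(value):
    # mirrors airflow.utils.timezone.is_naive
    return value.utcoffset() is None

print(is_naive(datetime(2021, 10, 31)))                       # True  (naive)
print(is_naive(datetime(2021, 10, 31, tzinfo=timezone.utc)))  # False (aware)
is_naive(None)  # AttributeError: 'NoneType' object has no attribute 'utcoffset'
```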
There was no way to get Airflow started!
I restored the previous day's backup in order to have Airflow up and running again.
Now it works, but at startup Airflow launched all the jobs it considered not yet executed, causing some problems on the database due to this unusual load.
Is there a way to avoid this behaviour at startup?
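For what it's worth, the only mitigation I know of is setting `catchup=False` per DAG (or `catchup_by_default = False` under `[scheduler]` in airflow.cfg), which makes the scheduler create only the most recent missed run instead of the whole backlog. A purely illustrative sketch, not one of my DAGs:
```python
# Illustrative only: a DAG declared with catchup=False so that missed
# schedule intervals are not backfilled when the scheduler restarts.
# The dag_id and schedule_interval are made up for this example.
from datetime import datetime
from airflow import DAG
from airflow.operators.dummy import DummyOperator

with DAG(
    dag_id="example_no_catchup",
    start_date=datetime(2021, 1, 1),
    schedule_interval="30 17 * * *",
    catchup=False,  # only the most recent missed run is scheduled, not the backlog
) as dag:
    DummyOperator(task_id="noop")
```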
### What you expected to happen
_No response_
### How to reproduce
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)