potiuk commented on issue #18041:
URL: https://github.com/apache/airflow/issues/18041#issuecomment-1159825606
> I have the same problem. I'm using Airflow 2.2.5 with SparkKubernetesOperator and SparkKubernetesSensor.
>
> The driver is running, but the sensor displays the following logs until the number of retries exceeds the threshold:
>
> ```
> [2022-06-17, 18:05:52 CST] {spark_kubernetes.py:104} INFO - Poking: load-customer-data-init-1655486757.7793136
> [2022-06-17, 18:05:52 CST] {spark_kubernetes.py:124} INFO - Spark application is still in state: RUNNING
> [2022-06-17, 18:06:49 CST] {local_task_job.py:211} WARNING - State of this instance has been externally set to up_for_retry. Terminating instance.
> [2022-06-17, 18:06:49 CST] {process_utils.py:120} INFO - Sending Signals.SIGTERM to group 84. PIDs of all processes in the group: [84]
> [2022-06-17, 18:06:49 CST] {process_utils.py:75} INFO - Sending the signal Signals.SIGTERM to group 84
> [2022-06-17, 18:06:49 CST] {taskinstance.py:1430} ERROR - Received SIGTERM. Terminating subprocesses.
> [2022-06-17, 18:06:49 CST] {taskinstance.py:1774} ERROR - Task failed with exception
> Traceback (most recent call last):
>   File "/home/airflow/.local/lib/python3.8/site-packages/airflow/sensors/base.py", line 249, in execute
>     time.sleep(self._get_next_poke_interval(started_at, run_duration, try_number))
>   File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1432, in signal_handler
>     raise AirflowException("Task received SIGTERM signal")
> airflow.exceptions.AirflowException: Task received SIGTERM signal
> [2022-06-17, 18:06:49 CST] {taskinstance.py:1278} INFO - Marking task as FAILED. dag_id=salesforecast-load-init, task_id=load-customer-data-init-sensor, execution_date=20220617T172033, start_date=20220617T175649, end_date=20220617T180649
> [2022-06-17, 18:06:49 CST] {standard_task_runner.py:93} ERROR - Failed to execute job 24 for task load-customer-data-init-sensor (Task received SIGTERM signal; 84)
> [2022-06-17, 18:06:49 CST] {process_utils.py:70} INFO - Process psutil.Process(pid=84, status='terminated', exitcode=1, started='17:56:48') (84) terminated with exit code 1
> ```
Did you try the earlier suggestions with dagrun_timeout? Do you know what is
sending SIGTERM to this task?
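
For context on why the traceback points at `time.sleep` in the sensor: the quoted log suggests the SIGTERM is a symptom, not the cause. The task state was changed externally first ("externally set to up_for_retry"), then the process group was signalled, and the signal handler raised while the sensor was sleeping between pokes. Here is a minimal, stdlib-only sketch of that mechanism. It is not Airflow's code; `TaskTerminated`, `poke_loop` and `run_demo` are hypothetical names used only for illustration, and it assumes a POSIX system with the code running in the main thread.

```python
import os
import signal
import threading
import time


class TaskTerminated(Exception):
    """Hypothetical stand-in for airflow.exceptions.AirflowException."""


def signal_handler(signum, frame):
    # Raising from a signal handler makes the exception surface wherever
    # the main thread happens to be, for example inside time.sleep().
    raise TaskTerminated("Task received SIGTERM signal")


def poke_loop(poke_interval: float) -> str:
    """Sleep between (omitted) pokes until the handler interrupts the loop."""
    try:
        while True:
            time.sleep(poke_interval)
    except TaskTerminated as exc:
        return str(exc)


def run_demo() -> str:
    previous = signal.signal(signal.SIGTERM, signal_handler)
    # Simulate an external actor signalling the task shortly after it starts.
    timer = threading.Timer(0.2, os.kill, args=(os.getpid(), signal.SIGTERM))
    timer.start()
    try:
        return poke_loop(poke_interval=1.0)
    finally:
        timer.cancel()
        # Restore whatever SIGTERM handler was installed before the demo.
        signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    print(run_demo())
```

The sleeping loop itself is healthy here, so the useful thing to find out is which component changed the task state before the signal was sent.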