potiuk commented on issue #18041:
URL: https://github.com/apache/airflow/issues/18041#issuecomment-1159825606

   > I have the same problem. I'm using Airflow 2.2.5 with SparkKubernetesOperator and SparkKubernetesSensor.
   > 
   > The driver is running, but the sensor displays the following logs until the number of retries exceeds the threshold:
   > 
   > ```
   > [2022-06-17, 18:05:52 CST] {spark_kubernetes.py:104} INFO - Poking: load-customer-data-init-1655486757.7793136
   > [2022-06-17, 18:05:52 CST] {spark_kubernetes.py:124} INFO - Spark application is still in state: RUNNING
   > [2022-06-17, 18:06:49 CST] {local_task_job.py:211} WARNING - State of this instance has been externally set to up_for_retry. Terminating instance.
   > [2022-06-17, 18:06:49 CST] {process_utils.py:120} INFO - Sending Signals.SIGTERM to group 84. PIDs of all processes in the group: [84]
   > [2022-06-17, 18:06:49 CST] {process_utils.py:75} INFO - Sending the signal Signals.SIGTERM to group 84
   > [2022-06-17, 18:06:49 CST] {taskinstance.py:1430} ERROR - Received SIGTERM. Terminating subprocesses.
   > [2022-06-17, 18:06:49 CST] {taskinstance.py:1774} ERROR - Task failed with exception
   > Traceback (most recent call last):
   >   File "/home/airflow/.local/lib/python3.8/site-packages/airflow/sensors/base.py", line 249, in execute
   >     time.sleep(self._get_next_poke_interval(started_at, run_duration, try_number))
   >   File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1432, in signal_handler
   >     raise AirflowException("Task received SIGTERM signal")
   > airflow.exceptions.AirflowException: Task received SIGTERM signal
   > [2022-06-17, 18:06:49 CST] {taskinstance.py:1278} INFO - Marking task as FAILED. dag_id=salesforecast-load-init, task_id=load-customer-data-init-sensor, execution_date=20220617T172033, start_date=20220617T175649, end_date=20220617T180649
   > [2022-06-17, 18:06:49 CST] {standard_task_runner.py:93} ERROR - Failed to execute job 24 for task load-customer-data-init-sensor (Task received SIGTERM signal; 84)
   > [2022-06-17, 18:06:49 CST] {process_utils.py:70} INFO - Process psutil.Process(pid=84, status='terminated', exitcode=1, started='17:56:48') (84) terminated with exit code 1
   > ```
   
   Did you try the earlier suggestions with `dagrun_timeout`? Do you know what is sending SIGTERM to this task?
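
One quick check that may help narrow this down: the `start_date` and `end_date` in the "Marking task as FAILED" line of the quoted log can be compared to see how long the sensor ran before it was terminated. The sketch below is only an illustration; the helper name `elapsed_seconds` is made up for this example, and the two timestamps are copied from the log above.

```python
from datetime import datetime

# Compact timestamp format used in the "Marking task as FAILED" log line.
FMT = "%Y%m%dT%H%M%S"

# Values copied from the quoted log (start_date= / end_date=).
start_date = "20220617T175649"
end_date = "20220617T180649"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two compact log timestamps.

    Hypothetical helper for this example only; it is not part of Airflow.
    """
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


seconds = elapsed_seconds(start_date, end_date)
print(f"Sensor ran for {seconds:.0f} s ({seconds / 60:.1f} min) before SIGTERM")
```

If the runtime comes out as a round number, that would be consistent with a fixed time limit of some kind (possibly a `dagrun_timeout`), although the log alone does not show what sent the signal.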

