trlopes1974 commented on issue #39717:
URL: https://github.com/apache/airflow/issues/39717#issuecomment-2263227474

   @potiuk .
   I have managed to get  some debug into this. Maybe this can help???
   
   So we are talking about the problem where the CELERY task is marked as 
failed but the AIRFLOW task is successful
   
   Celery Task:
   
![image](https://github.com/user-attachments/assets/b6d5b2f1-4c1d-415e-bb4e-f0a7752c0ce8)
   
   Airflow Task.
   
![image](https://github.com/user-attachments/assets/fbc20f9c-93dd-4bca-b036-eba0e4895e02)
   
   Airflow Task LOG:
   ```
   1
     
   *** Found local files:
   ***   * 
/opt/tkapp/airflow/logs/dag_id=CS00007002_Correcao_Dados_Oracle/run_id=CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00/task_id=remote_actions.ssh_command_remove_operator/attempt=1.log
   [2024-08-01, 12:35:46 WEST] {hashicorpvaultapi.py:57} DEBUG - 
Vault:[https://VAULT.SERVER.NAME], namespace:[admin/SOMENAMESPACE], 
proxy:[{"http": "http://PROXYSERVER:443"; ,"https": "http://PROXYSERVER:443"}]
   [2024-08-01, 12:35:46 WEST] {hashicorpvaultapi.py:85} WARNING - a 
inicializar com memcache: cachehost:memcache.SERVERHOSTNAME.knb , 
cacheport:11211 , cachettl:2592000 ,  
cachecrt:/opt/tkapp/ssl/SERVERHOSTNAME_knb_combined.pem
   [2024-08-01, 12:35:46 WEST] {hashicorpvaultapi.py:144} DEBUG - performing 
authentication to: https://VAULT.SERVER.NAME/v1/auth/approle/login  -> role 
autentication: roleid:36 secret:36 result:640
   [2024-08-01, 12:35:46 WEST] {hashicorpvaultapi.py:164} DEBUG - Got token:110
   [2024-08-01, 12:35:46 WEST] {hashicorpvaultapi.py:192} DEBUG - Cached secret 
config/sql_alchemy_conn: 2
   [2024-08-01, 12:35:46 WEST] {hashicorpvaultapi.py:57} DEBUG - 
Vault:[https://VAULT.SERVER.NAME], namespace:[admin/SOMENAMESPACE], 
proxy:[{"http": "http://PROXYSERVER:443"; ,"https": "http://PROXYSERVER:443"}]
   [2024-08-01, 12:35:46 WEST] {hashicorpvaultapi.py:85} WARNING - a 
inicializar com memcache: cachehost:memcache.SERVERHOSTNAME.knb , 
cacheport:11211 , cachettl:2592000 ,  
cachecrt:/opt/tkapp/ssl/SERVERHOSTNAME_knb_combined.pem
   [2024-08-01, 12:35:47 WEST] {hashicorpvaultapi.py:144} DEBUG - performing 
authentication to: https://VAULT.SERVER.NAME/v1/auth/approle/login  -> role 
autentication: roleid:36 secret:36 result:640
   [2024-08-01, 12:35:47 WEST] {hashicorpvaultapi.py:164} DEBUG - Got token:110
   [2024-08-01, 12:35:47 WEST] {hashicorpvaultapi.py:192} DEBUG - Cached secret 
config/result_backend: 2
   [2024-08-01, 12:35:47 WEST] {hashicorpvaultapi.py:57} DEBUG - 
Vault:[https://VAULT.SERVER.NAME], namespace:[admin/SOMENAMESPACE], 
proxy:[{"http": "http://PROXYSERVER:443"; ,"https": "http://PROXYSERVER:443"}]
   [2024-08-01, 12:35:47 WEST] {hashicorpvaultapi.py:85} WARNING - a 
inicializar com memcache: cachehost:memcache.SERVERHOSTNAME.knb , 
cacheport:11211 , cachettl:2592000 ,  
cachecrt:/opt/tkapp/ssl/SERVERHOSTNAME_knb_combined.pem
   [2024-08-01, 12:35:47 WEST] {hashicorpvaultapi.py:144} DEBUG - performing 
authentication to: https://VAULT.SERVER.NAME/v1/auth/approle/login  -> role 
autentication: roleid:36 secret:36 result:640
   [2024-08-01, 12:35:47 WEST] {hashicorpvaultapi.py:164} DEBUG - Got token:110
   [2024-08-01, 12:35:47 WEST] {hashicorpvaultapi.py:192} DEBUG - Cached secret 
config/broker_url: 2
   [2024-08-01, 12:35:47 WEST] {local_task_job_runner.py:120} ▼ Pre task 
execution logs
   [2024-08-01, 12:35:47 WEST] {taskinstance.py:2076} INFO - Dependencies all 
met for dep_context=non-requeueable deps ti=<TaskInstance: 
CS00007002_Correcao_Dados_Oracle.remote_actions.ssh_command_remove_operator 
CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00
 [queued]>
   [2024-08-01, 12:35:47 WEST] {taskinstance.py:2076} INFO - Dependencies all 
met for dep_context=requeueable deps ti=<TaskInstance: 
CS00007002_Correcao_Dados_Oracle.remote_actions.ssh_command_remove_operator 
CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00
 [queued]>
   [2024-08-01, 12:35:47 WEST] {taskinstance.py:2306} INFO - Starting attempt 1 
of 1
   [2024-08-01, 12:35:47 WEST] {taskinstance.py:2330} INFO - Executing 
<Task(SSHOperator): remote_actions.ssh_command_remove_operator> on 2024-08-01 
11:34:08.393962+00:00
   [2024-08-01, 12:35:47 WEST] {standard_task_runner.py:63} INFO - Started 
process 2987081 to run task
   [2024-08-01, 12:35:47 WEST] {standard_task_runner.py:96} INFO - Running: 
['airflow', 'tasks', 'run', 'CS00007002_Correcao_Dados_Oracle', 
'remote_actions.ssh_command_remove_operator', 
'CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00',
 '--job-id', '515545', '--raw', '--subdir', 
'DAGS_FOLDER/CS00007002_Correcao_Dados_Oracle.py', '--cfg-path', 
'/tmp/tmpdx2hhjj6']
   [2024-08-01, 12:35:47 WEST] {standard_task_runner.py:97} INFO - Job 515545: 
Subtask remote_actions.ssh_command_remove_operator
   [2024-08-01, 12:35:47 WEST] {task_command.py:426} INFO - Running 
<TaskInstance: 
CS00007002_Correcao_Dados_Oracle.remote_actions.ssh_command_remove_operator 
CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00
 [running]> on host SERVERHOSTNAME
   [2024-08-01, 12:35:48 WEST] {taskinstance.py:2648} INFO - Exporting env 
vars: AIRFLOW_CTX_DAG_EMAIL='cs-notificacoes-criticas-prod - PT Tribo2 
Novobanco <[email protected]>' AIRFLOW_CTX_DAG_OWNER='ttauto' 
AIRFLOW_CTX_DAG_ID='CS00007002_Correcao_Dados_Oracle' 
AIRFLOW_CTX_TASK_ID='remote_actions.ssh_command_remove_operator' 
AIRFLOW_CTX_EXECUTION_DATE='2024-08-01T11:34:08.393962+00:00' 
AIRFLOW_CTX_TRY_NUMBER='1' 
AIRFLOW_CTX_DAG_RUN_ID='CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00'
   [2024-08-01, 12:35:48 WEST] {taskinstance.py:430} ▲▲▲ Log group end
   [2024-08-01, 12:35:48 WEST] {ssh.py:151} INFO - Creating ssh_client
   [2024-08-01, 12:35:48 WEST] {ssh.py:302} WARNING - No Host Key Verification. 
This won't protect against Man-In-The-Middle attacks
   [2024-08-01, 12:35:48 WEST] {transport.py:1909} INFO - Connected (version 
2.0, client OpenSSH_8.0)
   [2024-08-01, 12:35:48 WEST] {transport.py:1909} INFO - Auth banner: 
b"################################### AVISO 
#########################################\n\nOs sistemas internos SOME NAME so' 
devem ser usados para realizar atividades de \nnegocio do SOME NAME ou outros 
fins autorizados pela Direcao do SOME NAME \xe2\x80\x93 \nDepartamento de 
Sistemas de Informacao 
(DSI)\n\n===============================================================================\n=
             O DIREITO DE ACESSO A ESTE SISTEMA E' RESERVADO !!              
=\n===============================================================================\n=
                                                                             
=\n=      Este sistema deve ser utilizado apenas em actividades de negocio      
 =\n=                       autorizadas pela Gestao do SOME NAME (DSI).         
 =\n=                                                                           
  =\n=====================================================
 ==========================\n=    Este sistema esta' sujeito a auditorias 
efectuadas a qualquer momento.   
=\n===============================================================================\n"
   [2024-08-01, 12:35:48 WEST] {transport.py:1909} INFO - Authentication 
(publickey) successful!
   [2024-08-01, 12:35:48 WEST] {ssh.py:483} INFO - Running command: 
/usr/bin/sudo -s --  eval '/usr/bin/su - oradb -c 
"/home/oradb/correcao_dados/correcao_dados.sh WO0000000813598 REMOVE"'
   [2024-08-01, 12:35:48 WEST] {ssh.py:529} INFO - a remover 
/tmp/correcao_dados/WO0000000813598
   [2024-08-01, 12:35:48 WEST] {ssh.py:529} INFO - ACCAO 'REMOVE' EXECUTADA 
PARA A WORKORDERID:'WO0000000813598'
   [2024-08-01, 12:35:48 WEST] {taskinstance.py:441} ▼ Post task execution logs
   [2024-08-01, 12:35:48 WEST] {taskinstance.py:1206} INFO - Marking task as 
SUCCESS. dag_id=CS00007002_Correcao_Dados_Oracle, 
task_id=remote_actions.ssh_command_remove_operator, 
run_id=CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00,
 execution_date=20240801T113408, start_date=20240801T113547, 
end_date=20240801T113548
   [2024-08-01, 12:35:48 WEST] {logging_mixin.py:188} INFO - log_config - 
locating: LOGLEVEL / modules.dagutililies
   [2024-08-01, 12:35:48 WEST] {dagutililies.py:5} WARNING - current log level 
for modules.dagutililies is: DEBUG
   [2024-08-01, 12:35:48 WEST] {hashicorpvaultapi.py:192} DEBUG - Cached secret 
variables/RABBIT: 9
   [2024-08-01, 12:35:48 WEST] {dagutililies.py:159} INFO - *********** SUCCESS 
CALLBACK START ***********
   [2024-08-01, 12:35:48 WEST] {hashicorpvaultapi.py:192} DEBUG - Cached secret 
variables/TKAPP: 10
   [2024-08-01, 12:35:48 WEST] {dagutililies.py:170} INFO - *********** SUCCESS 
CALLBACK END ***********
   [2024-08-01, 12:35:48 WEST] {local_task_job_runner.py:240} INFO - Task 
exited with return code 0
   [2024-08-01, 12:35:48 WEST] {local_task_job_runner.py:222} ▲▲▲ Log group end
   
   ```
   
   **Extra debug info on **celery_executor_utils.py****
   
   ```
   [2ce8b927-3f7b-42b9-831f-8c44b13b5eaa] Executing command in Celery: 
['airflow', 'tasks', 'run', 'CS00007002_Correcao_Dados_Oracle', 
'remote_actions.ssh_command_remove_operator', 
'CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00',
 '--local', '--subdir', 'DAGS_FOLDER/CS00007002_Correcao_Dados_Oracle.py']
   Execute command: ['airflow', 'tasks', 'run', 
'CS00007002_Correcao_Dados_Oracle', 
'remote_actions.ssh_command_remove_operator', 
'CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00',
 '--local', '--subdir', 'DAGS_FOLDER/CS00007002_Correcao_Dados_Oracle.py'] for 
celery_task_id:2ce8b927-3f7b-42b9-831f-8c44b13b5eaa
   Execute _execute_in_fork: ['airflow', 'tasks', 'run', 
'CS00007002_Correcao_Dados_Oracle', 
'remote_actions.ssh_command_remove_operator', 
'CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00',
 '--local', '--subdir', 'DAGS_FOLDER/CS00007002_Correcao_Dados_Oracle.py'] for 
celery_task_id:2ce8b927-3f7b-42b9-831f-8c44b13b5eaa got PID:2987051
   fork pid: 2987051  - celery_task_id:2ce8b927-3f7b-42b9-831f-8c44b13b5eaa
   Execute _execute_in_fork: ['airflow', 'tasks', 'run', 
'CS00007002_Correcao_Dados_Oracle', 
'remote_actions.ssh_command_remove_operator', 
'CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00',
 '--local', '--subdir', 'DAGS_FOLDER/CS00007002_Correcao_Dados_Oracle.py'] for 
celery_task_id:2ce8b927-3f7b-42b9-831f-8c44b13b5eaa got PID:0
   calling func 'task_run' with args Namespace(subcommand='run', 
dag_id='CS00007002_Correcao_Dados_Oracle', 
task_id='remote_actions.ssh_command_remove_operator', 
execution_date_or_run_id='CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00',
 cfg_path=None, depends_on_past='check', force=False, 
ignore_all_dependencies=False, ignore_dependencies=False, 
ignore_depends_on_past=False, interactive=False, job_id=None, local=True, 
map_index=-1, mark_success=False, shut_down_logging=False, pickle=None, 
pool=None, raw=False, read_from_db=False, ship_dag=False, 
subdir='DAGS_FOLDER/CS00007002_Correcao_Dados_Oracle.py', verbose=False, 
func=<function lazy_load_command.<locals>.command at 0x7fb74870b550>, 
external_executor_id='2ce8b927-3f7b-42b9-831f-8c44b13b5eaa')
   Execute _execute_in_fork: calling func 'task_run' with args 
Namespace(subcommand='run', dag_id='CS00007002_Correcao_Dados_Oracle', 
task_id='remote_actions.ssh_command_remove_operator', 
execution_date_or_run_id='CS00007002_Correcao_Dados_Oracle_WO0000000813598_2024-08-01T12:34:08.391542+01:00',
 cfg_path=None, depends_on_past='check', force=False, 
ignore_all_dependencies=False, ignore_dependencies=False, 
ignore_depends_on_past=False, interactive=False, job_id=None, local=True, 
map_index=-1, mark_success=False, shut_down_logging=False, pickle=None, 
pool=None, raw=False, read_from_db=False, ship_dag=False, 
subdir='DAGS_FOLDER/CS00007002_Correcao_Dados_Oracle.py', verbose=False, 
func=<function lazy_load_command.<locals>.command at 0x7fb74870b550>, 
external_executor_id='2ce8b927-3f7b-42b9-831f-8c44b13b5eaa')
   [2ce8b927-3f7b-42b9-831f-8c44b13b5eaa] Failed to execute task.
   [2ce8b927-3f7b-42b9-831f-8c44b13b5eaa] Failed to execute task.
   Traceback (most recent call last):
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/providers/celery/executors/celery_executor_utils.py",
 line 194, in _execute_in_fork
       args.func(args)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/cli/cli_config.py",
 line 49, in command
       return func(*args, **kwargs)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/utils/cli.py", 
line 114, in wrapper
       return f(*args, **kwargs)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/cli/commands/task_command.py",
 line 441, in task_run
       task_return_code = _run_task_by_selected_method(args, _dag, ti)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/cli/commands/task_command.py",
 line 219, in _run_task_by_selected_method
       return _run_task_by_local_task_job(args, ti)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/cli/commands/task_command.py",
 line 281, in _run_task_by_local_task_job
       ret = run_job(job=job_runner.job, execute_callable=job_runner._execute)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/utils/session.py",
 line 79, in wrapper
       return func(*args, session=session, **kwargs)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/jobs/job.py", 
line 402, in run_job
       return execute_job(job, execute_callable=execute_callable)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/jobs/job.py", 
line 431, in execute_job
       ret = execute_callable()
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/jobs/local_task_job_runner.py",
 line 201, in _execute
       self.handle_task_exit(return_code)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/jobs/local_task_job_runner.py",
 line 244, in handle_task_exit
       
self.task_instance.schedule_downstream_tasks(max_tis_per_query=self.job.max_tis_per_query)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/utils/session.py",
 line 79, in wrapper
       return func(*args, session=session, **kwargs)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/models/taskinstance.py",
 line 3523, in schedule_downstream_tasks
       return TaskInstance._schedule_downstream_tasks(
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/api_internal/internal_api_call.py",
 line 115, in wrapper
       return func(*args, **kwargs)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/utils/session.py",
 line 76, in wrapper
       return func(*args, **kwargs)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/models/taskinstance.py",
 line 3473, in _schedule_downstream_tasks
       partial_dag = task.dag.partial_subset(
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/models/dag.py", 
line 2550, in partial_subset
       dag.task_dict = {
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/models/dag.py", 
line 2551, in <dictcomp>
       t.task_id: _deepcopy_task(t)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/models/dag.py", 
line 2548, in _deepcopy_task
       return copy.deepcopy(t, memo)
     File "/usr/lib64/python3.9/copy.py", line 153, in deepcopy
       y = copier(memo)
     File 
"/opt/tkapp/env_airflow/lib64/python3.9/site-packages/airflow/models/baseoperator.py",
 line 1307, in __deepcopy__
       setattr(result, k, copy.deepcopy(v, memo))
     File "/usr/lib64/python3.9/copy.py", line 146, in deepcopy
       y = copier(x, memo)
     File "/usr/lib64/python3.9/copy.py", line 230, in _deepcopy_dict
       y[deepcopy(key, memo)] = deepcopy(value, memo)
     File "/usr/lib64/python3.9/copy.py", line 172, in deepcopy
       y = _reconstruct(x, memo, *rv)
     File "/usr/lib64/python3.9/copy.py", line 270, in _reconstruct
       state = deepcopy(state, memo)
     File "/usr/lib64/python3.9/copy.py", line 146, in deepcopy
       y = copier(x, memo)
     File "/usr/lib64/python3.9/copy.py", line 230, in _deepcopy_dict
       y[deepcopy(key, memo)] = deepcopy(value, memo)
     File "/usr/lib64/python3.9/copy.py", line 172, in deepcopy
       y = _reconstruct(x, memo, *rv)
     File "/usr/lib64/python3.9/copy.py", line 270, in _reconstruct
       state = deepcopy(state, memo)
     File "/usr/lib64/python3.9/copy.py", line 146, in deepcopy
       y = copier(x, memo)
     File "/usr/lib64/python3.9/copy.py", line 230, in _deepcopy_dict
       y[deepcopy(key, memo)] = deepcopy(value, memo)
     File "/usr/lib64/python3.9/copy.py", line 172, in deepcopy
       y = _reconstruct(x, memo, *rv)
     File "/usr/lib64/python3.9/copy.py", line 270, in _reconstruct
       state = deepcopy(state, memo)
     File "/usr/lib64/python3.9/copy.py", line 146, in deepcopy
       y = copier(x, memo)
     File "/usr/lib64/python3.9/copy.py", line 230, in _deepcopy_dict
       y[deepcopy(key, memo)] = deepcopy(value, memo)
       y[deepcopy(key, memo)] = deepcopy(value, memo)
     File "/usr/lib64/python3.9/copy.py", line 172, in deepcopy
       y = _reconstruct(x, memo, *rv)
     File "/usr/lib64/python3.9/copy.py", line 270, in _reconstruct
       state = deepcopy(state, memo)
     File "/usr/lib64/python3.9/copy.py", line 146, in deepcopy
       y = copier(x, memo)
     File "/usr/lib64/python3.9/copy.py", line 230, in _deepcopy_dict
       y[deepcopy(key, memo)] = deepcopy(value, memo)
     File "/usr/lib64/python3.9/copy.py", line 172, in deepcopy
       y = _reconstruct(x, memo, *rv)
     File "/usr/lib64/python3.9/copy.py", line 270, in _reconstruct
       state = deepcopy(state, memo)
     File "/usr/lib64/python3.9/copy.py", line 146, in deepcopy
       y = copier(x, memo)
     File "/usr/lib64/python3.9/copy.py", line 230, in _deepcopy_dict
       y[deepcopy(key, memo)] = deepcopy(value, memo)
     File "/usr/lib64/python3.9/copy.py", line 161, in deepcopy
       rv = reductor(4)
   TypeError: cannot pickle '_cffi_backend.FFI' object
   waitpid PID/RET: 2987051/256
   Execute _execute_in_fork: waitpid PID/RET: 2987051/256
   Execute _execute_in_fork: Celery command failed on host: slautop02 with 
celery_task_id 2ce8b927-3f7b-42b9-831f-8c44b13b5eaa (PID: 2987051, Return Code: 
256)
   Execute command ERROR:Celery command failed on host: slautop02 with 
celery_task_id 2ce8b927-3f7b-42b9-831f-8c44b13b5eaa (PID: 2987051, Return Code: 
256)
   ERROR executing: Celery command failed on host: slautop02 with 
celery_task_id 2ce8b927-3f7b-42b9-831f-8c44b13b5eaa (PID: 2987051, Return Code: 
256)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to