paramjeet01 opened a new issue, #39267:
URL: https://github.com/apache/airflow/issues/39267

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.8.3
   
   ### What happened?
   
   We are facing intermittent json error but on next retry it works. 
   ```
   [2024-04-26, 00:21:32 IST] {pod_manager.py:718} INFO - Checking if xcom 
sidecar container is started.
   [2024-04-26, 00:21:32 IST] {pod_manager.py:721} INFO - The xcom sidecar 
container is started.
   [2024-04-26, 00:21:32 IST] {pod_manager.py:798} INFO - Running command... if 
[ -s /airflow/xcom/return.json ]; then cat /airflow/xcom/return.json; else echo 
__airflow_xcom_result_empty__; fi
   [2024-04-26, 00:21:36 IST] {pod_manager.py:798} INFO - Running command... if 
[ -s /airflow/xcom/return.json ]; then cat /airflow/xcom/return.json; else echo 
__airflow_xcom_result_empty__; fi
   [2024-04-26, 00:21:40 IST] {pod_manager.py:798} INFO - Running command... if 
[ -s /airflow/xcom/return.json ]; then cat /airflow/xcom/return.json; else echo 
__airflow_xcom_result_empty__; fi
   [2024-04-26, 00:21:44 IST] {pod_manager.py:798} INFO - Running command... if 
[ -s /airflow/xcom/return.json ]; then cat /airflow/xcom/return.json; else echo 
__airflow_xcom_result_empty__; fi
   [2024-04-26, 00:21:52 IST] {pod_manager.py:798} INFO - Running command... if 
[ -s /airflow/xcom/return.json ]; then cat /airflow/xcom/return.json; else echo 
__airflow_xcom_result_empty__; fi
   [2024-04-26, 00:21:52 IST] {pod_manager.py:798} INFO - Running command... 
kill -s SIGINT 1
   [2024-04-26, 00:21:52 IST] {pod.py:909} INFO - Deleting pod: 
hpaev5-hevc-smp-manifest-generation-9lqygp1s
   [2024-04-26, 00:21:52 IST] {taskinstance.py:2731} ERROR - Task failed with 
exception
   Traceback (most recent call last):
     File "/opt/airflow/plugins/operators/kubernetes_pod_operator.py", line 
200, in execute
       result = self.extract_xcom(pod=self.pod)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py",
 line 557, in extract_xcom
       result = self.pod_manager.extract_xcom(pod)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py",
 line 730, in extract_xcom
       result = self.extract_xcom_json(pod)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 
289, in wrapped_f
       return self(f, *args, **kw)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 
379, in __call__
       do = self.iter(retry_state=retry_state)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 
325, in iter
       raise retry_exc.reraise()
     File 
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 
158, in reraise
       raise self.last_attempt.result()
     File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in 
result
       return self.__get_result()
     File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in 
__get_result
       raise self._exception
     File 
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 
382, in __call__
       result = fn(*args, **kwargs)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py",
 line 765, in extract_xcom_json
       json.loads(result)
     File "/usr/local/lib/python3.10/json/__init__.py", line 346, in loads
       return _default_decoder.decode(s)
     File "/usr/local/lib/python3.10/json/decoder.py", line 337, in decode
       obj, end = self.raw_decode(s, idx=_w(s, 0).end())
     File "/usr/local/lib/python3.10/json/decoder.py", line 355, in raw_decode
       raise JSONDecodeError("Expecting value", s, err.value) from None
   json.decoder.JSONDecodeError: Expecting value: line 1 column 16385 (char 
16384)
   During handling of the above exception, another exception occurred:
   Traceback (most recent call last):
     File 
"/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py",
 line 439, in _execute_task
       result = _execute_callable(context=context, **execute_callable_kwargs)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py",
 line 414, in _execute_callable
       return execute_callable(context=context, **execute_callable_kwargs)
     File "/opt/airflow/plugins/operators/kubernetes_pod_operator.py", line 
215, in execute
       self.cleanup(
     File 
"/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py",
 line 839, in cleanup
       raise AirflowException
   ```
   
   ### What you think should happen instead?
   
   The task should not fail when xcom data.
   
   ### How to reproduce
   
   This can be reproduced by having a 20k char in json file for xcom and it'll 
fail intermittently while taking the data.
   I'll investigate on the reason for the xcom json issue.
   
   ### Operating System
   
   Amazon Linux 2
   
   ### Versions of Apache Airflow Providers
   
   pytest>=6.2.5
   docker>=5.0.0
   crypto>=1.4.1
   cryptography>=3.4.7
   pyOpenSSL>=20.0.1
   ndg-httpsclient>=0.5.1
   boto3>=1.34.0
   sqlalchemy
   redis>=3.5.3
   requests>=2.26.0
   pysftp>=0.2.9
   werkzeug>=1.0.1
   apache-airflow-providers-cncf-kubernetes==8.0.0
   apache-airflow-providers-amazon>=8.13.0
   psycopg2>=2.8.5
   grpcio>=1.37.1
   grpcio-tools>=1.37.1
   protobuf>=3.15.8,<=3.21
   python-dateutil>=2.8.2
   jira>=3.1.1
   confluent_kafka>=1.7.0
   pyarrow>=10.0.1,<10.1.0
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   Official helm chart deployment.
   
   ### Anything else?
   
   I think , we are facing similar issue : 
https://github.com/apache/airflow/issues/32111
   And it's fixed here : https://github.com/apache/airflow/pull/32113/files , 
we might need to increase the retry count.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to