amoghrajesh commented on code in PR #58992:
URL: https://github.com/apache/airflow/pull/58992#discussion_r2598055295


##########
airflow-core/src/airflow/api_fastapi/core_api/routes/public/xcom.py:
##########
@@ -101,16 +101,25 @@ def get_xcom_entry(
     item = copy.copy(result)
 
     if deserialize:
-        # We use `airflow.serialization.serde` for deserialization here because custom XCom backends (with their own
-        # serializers/deserializers) are only used on the worker side during task execution.
-
-        # However, the XCom value is *always* stored in the metadata database as a valid JSON object.
-        # Therefore, for purposes such as UI display or returning API responses, deserializing with
-        # `airflow.serialization.serde` is safe and recommended.
-        from airflow.serialization.serde import deserialize as serde_deserialize
-
-        # full=False ensures that the `item` is deserialized without loading the classes, and it returns a stringified version
-        item.value = serde_deserialize(XComModel.deserialize_value(item), full=False)
+        # Custom XCom backends may store references (eg: object storage paths) in the database.
+        # The custom XCom backend's deserialize_value() resolves these to actual values, but that is only
+        # used on workers during task execution. The API reads directly from the database and uses
+        # stringify() to convert DB values (references or serialized data) to human readable
+        # format for UI display or for API users.
+        import json
+
+        from airflow.serialization.stringify import stringify as stringify_xcom
+
+        try:
+            parsed_value = json.loads(result.value)
+        except (ValueError, TypeError):
+            # Already deserialized (e.g., set via Task Execution API)
+            parsed_value = result.value

Review Comment:
   I thought about this a bit more. The task execution path cannot store a bad 
value in the database, because the value goes through the serde filter 
beforehand and anything wrong is caught early. So we can be sure that values 
from the SDK path are always reliably JSON-serializable and will enter the 
"except" path:
   
   ```python
   (airflow) ➜  airflow git:(move-serde-to-task-sdk) ✗ python             
   Python 3.13.3 (main, Apr  8 2025, 13:54:08) [Clang 17.0.0 (clang-1700.0.13.3)] on darwin
   Type "help", "copyright", "credits" or "license" for more information.
   >>> 
   >>> 
   >>> import json
   >>> json.loads({1: 2})
   Traceback (most recent call last):
     File "<python-input-3>", line 1, in <module>
       json.loads({1: 2})
       ~~~~~~~~~~^^^^^^^^
     File "/opt/homebrew/Cellar/python@3.13/3.13.3_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/json/__init__.py", line 339, in loads
       raise TypeError(f'the JSON object must be str, bytes or bytearray, '
                       f'not {s.__class__.__name__}')
   TypeError: the JSON object must be str, bytes or bytearray, not dict
   >>> json.loads([1,2])
   Traceback (most recent call last):
     File "<python-input-4>", line 1, in <module>
       json.loads([1,2])
       ~~~~~~~~~~^^^^^^^
     File "/opt/homebrew/Cellar/python@3.13/3.13.3_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/json/__init__.py", line 339, in loads
       raise TypeError(f'the JSON object must be str, bytes or bytearray, '
                       f'not {s.__class__.__name__}')
   TypeError: the JSON object must be str, bytes or bytearray, not list
   >>> json.loads("abcd")
   Traceback (most recent call last):
     File "<python-input-5>", line 1, in <module>
       json.loads("abcd")
       ~~~~~~~~~~^^^^^^^^
     File "/opt/homebrew/Cellar/python@3.13/3.13.3_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/json/__init__.py", line 346, in loads
       return _default_decoder.decode(s)
              ~~~~~~~~~~~~~~~~~~~~~~~^^^
     File "/opt/homebrew/Cellar/python@3.13/3.13.3_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/json/decoder.py", line 345, in decode
       obj, end = self.raw_decode(s, idx=_w(s, 0).end())
                  ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
     File "/opt/homebrew/Cellar/python@3.13/3.13.3_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/json/decoder.py", line 363, in raw_decode
       raise JSONDecodeError("Expecting value", s, err.value) from None
   json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
   
   >>> json.loads("{invalid json")
   Traceback (most recent call last):
     File "<python-input-6>", line 1, in <module>
       json.loads("{invalid json")
       ~~~~~~~~~~^^^^^^^^^^^^^^^^^
     File "/opt/homebrew/Cellar/python@3.13/3.13.3_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/json/__init__.py", line 346, in loads
       return _default_decoder.decode(s)
              ~~~~~~~~~~~~~~~~~~~~~~~^^^
     File "/opt/homebrew/Cellar/python@3.13/3.13.3_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/json/decoder.py", line 345, in decode
       obj, end = self.raw_decode(s, idx=_w(s, 0).end())
                  ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
     File "/opt/homebrew/Cellar/python@3.13/3.13.3_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/json/decoder.py", line 361, in raw_decode
       obj, end = self.scan_once(s, idx)
                  ~~~~~~~~~~~~~~^^^^^^^^
   json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
   
   ```
   The Core API, on the other hand, never stores deserialized Python objects, 
so it will always enter the "try" part and should almost never fail to 
deserialize, unless someone has written a value directly to the DB.
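
To make the two branches concrete, here is a minimal standalone sketch of the same try/except, run against the inputs from the REPL session above. `parse_stored_value` is a hypothetical name used only for illustration; it is not a function in Airflow, and the sketch leaves out the `stringify()` step entirely.

```python
import json
from typing import Any


def parse_stored_value(raw: Any) -> Any:
    """Hypothetical helper mirroring the try/except in the diff above."""
    try:
        # A JSON string (what the Core API path stores) parses cleanly.
        return json.loads(raw)
    except (ValueError, TypeError):
        # TypeError: raw is already a Python object (dict, list, ...).
        # ValueError: raw is a str that is not valid JSON
        # (json.JSONDecodeError is a subclass of ValueError).
        return raw


# JSON string: takes the "try" branch and is parsed.
print(parse_stored_value('{"a": 1}'))
# Already-deserialized objects: TypeError, returned unchanged.
print(parse_stored_value({1: 2}))
print(parse_stored_value([1, 2]))
# Non-JSON strings: JSONDecodeError, returned unchanged.
print(parse_stored_value("abcd"))
print(parse_stored_value("{invalid json"))
```

Catching `ValueError` is enough to cover `json.JSONDecodeError`, so the `except (ValueError, TypeError)` clause in the diff handles every failure shown in the tracebacks.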


