Re: [PR] [SPARK-5192] Fix type handling of namedTuple for transfromWithState [spark]

via GitHub Wed, 03 Dec 2025 16:23:31 -0800


zeruibao commented on code in PR #53314:
URL: https://github.com/apache/spark/pull/53314#discussion_r2587006200



##########
python/pyspark/sql/streaming/stateful_processor_api_client.py:
##########
@@ -501,6 +501,11 @@ def normalize_value(v: Any) -> Any:
                 # Convert NumPy types to Python primitive types.
                 if isinstance(v, np.generic):
                     return v.tolist()
+                # Named tuples (collections.namedtuple or typing.NamedTuple) have a
+                # _fields attribute. Spark Row has __fields__. Both require positional
+                # arguments and cannot be instantiated with a generator expression.
+                if hasattr(v, '_fields') or hasattr(v, '__fields__'):

Review Comment:
   Hey @gaogaotiantian, I use
   ```
                   if (
                       isinstance(v, Row) or
                       (isinstance(v, tuple) and hasattr(v, "_fields"))
                   ):
   ```
   instead. `isinstance(v, NamedTuple)` won't work because `typing.NamedTuple` is a class factory, not a runtime base class of the instances it produces. Checking `isinstance(v, tuple)` together with `hasattr(v, "_fields")` is the reliable way to detect named tuples. Please take another look. Thanks!
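
A minimal, dependency-free sketch of the check described above. It is not the PR's code: the helper names `is_namedtuple_instance`, `naive_check`, and `rebuild`, and the sample types `Point` and `Pair`, are hypothetical, and pyspark's `Row` is left out so the snippet runs without Spark. Depending on the Python version, `isinstance(v, NamedTuple)` either raises `TypeError` or returns `False`, so the sketch treats both outcomes as "not detected".

```python
from collections import namedtuple
from typing import Any, Callable, NamedTuple

# Hypothetical sample types, used only for illustration.
Point = namedtuple("Point", ["x", "y"])


class Pair(NamedTuple):
    a: int
    b: int


def naive_check(v: Any) -> bool:
    """Try the isinstance check against typing.NamedTuple directly.

    typing.NamedTuple is not a runtime base class of its instances, so
    this either raises TypeError or returns False, depending on the
    Python version.
    """
    try:
        return isinstance(v, NamedTuple)  # type: ignore[arg-type]
    except TypeError:
        return False


def is_namedtuple_instance(v: Any) -> bool:
    """Duck-typed check: a tuple subclass that exposes _fields."""
    return isinstance(v, tuple) and hasattr(v, "_fields")


def rebuild(v: Any, convert: Callable[[Any], Any]) -> Any:
    """Convert each element of a sequence and rebuild the same type.

    Named tuples take their fields as separate positional arguments,
    so the converted items must be unpacked with *. Plain tuples and
    lists accept a single iterable.
    """
    items = [convert(e) for e in v]
    if is_namedtuple_instance(v):
        return type(v)(*items)
    return type(v)(items)
```

With these definitions, `rebuild(Point(1, 2), str)` yields a `Point` whose fields are strings, whereas passing a generator straight to `Point(...)` fails because the constructor expects one argument per field.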



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

