HeartSaVioR commented on code in PR #53122:
URL: https://github.com/apache/spark/pull/53122#discussion_r2592232421
##########
python/pyspark/sql/pandas/serializers.py:
##########
@@ -2236,15 +2237,15 @@ def row_iterator():
for batch in batches:
# Detect which column has data - each batch contains only one type
input_result = extract_rows(batch, "inputData", self.key_offsets)
+ init_result = extract_rows(batch, "initState", self.init_key_offsets)
if input_result is not None:
Review Comment:
nit: same here, assert with XOR that exactly one of input_result and init_result is not None?
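For illustration, a minimal sketch of what such a check might look like. It assumes `extract_rows` returns `None` when the batch has no data for the requested column (inferred from the `is not None` test in the diff); `exactly_one_present` is a hypothetical helper name, not something in the PR.

```python
def exactly_one_present(input_result, init_result):
    """Return True when exactly one of the two results is not None.

    Hypothetical helper: the arguments stand in for the values returned by
    extract_rows(batch, "inputData", ...) and extract_rows(batch, "initState", ...).
    """
    return (input_result is not None) ^ (init_result is not None)


# Stand-in values; real results would come from extract_rows.
assert exactly_one_present(["row"], None)
assert exactly_one_present(None, ["row"])
assert not exactly_one_present(None, None)
assert not exactly_one_present(["row"], ["row"])
```

Comparing with `is not None` first yields plain booleans, so `^` acts as a logical XOR rather than a bitwise operation on arbitrary objects.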
##########
python/pyspark/sql/pandas/serializers.py:
##########
@@ -2009,20 +2009,21 @@ def row_stream():
for i, c in enumerate(flatten_state_table.itercolumns())
]
+ flatten_init_table = flatten_columns(batch, "initState")
+ init_data_pandas = [
+ self.arrow_to_pandas(c, i)
+ for i, c in enumerate(flatten_init_table.itercolumns())
+ ]
+
if bool(data_pandas):
Review Comment:
nit: Probably just assert with XOR here to confirm that exactly one of
bool(data_pandas) and bool(init_data_pandas) is True and the other is False?
```
>>> True^True
False
>>> True^False
True
>>> False^True
True
>>> False^False
False
```
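Roughly, the assertion could be sketched as below. This is only an assumption about how it might be wired in: `check_single_source` and its return values are hypothetical, and plain lists stand in for the lists of pandas Series built in the diff.

```python
def check_single_source(data_pandas, init_data_pandas):
    """Assert that exactly one of the two lists is non-empty.

    Hypothetical helper; returns which side carries data so the caller
    can branch on it.
    """
    has_data = bool(data_pandas)
    has_init = bool(init_data_pandas)
    # XOR is True only when exactly one operand is True.
    assert has_data ^ has_init, (
        "expected exactly one of data_pandas / init_data_pandas to be non-empty"
    )
    return "data" if has_data else "init"


# Plain lists stand in for the converted Arrow columns.
print(check_single_source(["col0", "col1"], []))
print(check_single_source([], ["col0"]))
```

Calling it with both lists empty, or both non-empty, raises `AssertionError`, which is the case the assert is meant to catch.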
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]