gene-db commented on code in PR #49487:
URL: https://github.com/apache/spark/pull/49487#discussion_r1915787864


##########
python/pyspark/sql/connect/conversion.py:
##########
@@ -104,6 +104,7 @@ def _need_converter(
     def _create_converter(
         dataType: DataType,
         nullable: bool = True,
+        variants_as_dicts = False  # some code paths may require python interal types

Review Comment:
   ```suggestion
           variants_as_dicts = False  # some code paths may require python internal types
   ```



##########
python/pyspark/sql/connect/conversion.py:
##########
@@ -333,6 +340,7 @@ def convert(data: Sequence[Any], schema: StructType) -> "pa.Table":
             LocalDataToArrowConversion._create_converter(
                 field.dataType,
                 field.nullable,
+                variants_as_dicts = True

Review Comment:
   How do we know when to set this to true or false? It is not clear to me.



##########
python/pyspark/sql/connect/conversion.py:
##########
@@ -303,8 +307,11 @@ def convert_variant(value: Any) -> Any:
                     isinstance(value, dict)
                     and all(key in value for key in ["value", "metadata"])
                     and all(isinstance(value[key], bytes) for key in ["value", "metadata"])
+                    and not variants_as_dicts
                 ):
                     return VariantVal(value["value"], value["metadata"])
+                elif isinstance(value, VariantVal) and variants_as_dicts:

Review Comment:
   Isn't there a matrix of inputs we could get?
   - `value` is `VariantVal` & `variants_as_dicts` is `False`: not handled?
   - `value` is `VariantVal` & `variants_as_dicts` is `True`: handled here, returns `dict`
   - `value` is `dict` & `variants_as_dicts` is `False`: handled above, returns `VariantVal`
   - `value` is `dict` & `variants_as_dicts` is `True`: not handled?
   
   What do we do for the cases we are not handling?
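
One way to make the four cases in the matrix above explicit is sketched below. This is only an illustration, not the PR's implementation: the `VariantVal` class is a hypothetical stand-in so the snippet runs without PySpark, `_is_variant_dict` is a made-up helper name, and passing the two "not handled?" cases through unchanged is an assumption about the intended behavior.

```python
from typing import Any


class VariantVal:
    """Hypothetical stand-in for pyspark.sql.types.VariantVal (illustration only)."""

    def __init__(self, value: bytes, metadata: bytes) -> None:
        self.value = value
        self.metadata = metadata


def _is_variant_dict(value: Any) -> bool:
    # Same shape check as in the diff: both keys present and both values are bytes.
    return (
        isinstance(value, dict)
        and all(key in value for key in ["value", "metadata"])
        and all(isinstance(value[key], bytes) for key in ["value", "metadata"])
    )


def convert_variant(value: Any, variants_as_dicts: bool = False) -> Any:
    if value is None:
        return None
    if isinstance(value, VariantVal):
        if variants_as_dicts:
            # VariantVal & True: convert to the dict form.
            return {"value": value.value, "metadata": value.metadata}
        # VariantVal & False: ASSUMPTION, already in the requested form.
        return value
    if _is_variant_dict(value):
        if variants_as_dicts:
            # dict & True: ASSUMPTION, already in the requested form.
            return value
        # dict & False: convert to VariantVal.
        return VariantVal(value["value"], value["metadata"])
    # Anything else is rejected explicitly instead of falling through.
    raise TypeError(f"Unsupported variant input: {type(value).__name__}")
```

With this shape every combination of input type and flag has a defined result, and an unexpected input fails loudly rather than silently.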



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

