zero323 commented on a change in pull request #34354:
URL: https://github.com/apache/spark/pull/34354#discussion_r779096230



##########
File path: python/pyspark/sql/functions.py
##########
@@ -1670,7 +1671,19 @@ def expr(str: str) -> Column:
     return Column(sc._jvm.functions.expr(str))
 
 
+@overload
 def struct(*cols: "ColumnOrName") -> Column:
+    ...
+
+
+@overload
+def struct(__cols: Union[List["ColumnOrName_"], Tuple["ColumnOrName_", ...]]) 
-> Column:
+    ...

Review comment:
   We still want to support calls like the following (this was a common user request in the past):
   
   ```python
   struct(("foo", "bar"))
   ```
   
   which the type checker won't accept without `Tuple` (or some supertype) in the annotation.
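
As a rough standalone illustration of the overload pattern being discussed, here is a minimal sketch. It is not the PySpark implementation: `struct_sketch` and the `Column` class are hypothetical stand-ins, `ColumnOrName` is simplified to a plain `Union` instead of the `ColumnOrName_` type variable, and the unpacking of a single list or tuple argument is this sketch's own convention.

```python
from typing import List, Tuple, Union, overload


class Column:
    """Hypothetical stand-in for pyspark.sql.Column (assumption for this sketch)."""

    def __init__(self, name: str) -> None:
        self.name = name


# Simplified alias; the real code uses a TypeVar-based "ColumnOrName_".
ColumnOrName = Union[Column, str]


@overload
def struct_sketch(*cols: ColumnOrName) -> List[str]:
    ...


@overload
def struct_sketch(__cols: Union[List[ColumnOrName], Tuple[ColumnOrName, ...]]) -> List[str]:
    ...


def struct_sketch(*cols):
    # Sketch-only convention: a single list or tuple argument is unpacked.
    if len(cols) == 1 and isinstance(cols[0], (list, tuple)):
        cols = tuple(cols[0])
    # Return plain names so the example needs no Spark session.
    return [c.name if isinstance(c, Column) else c for c in cols]


names_from_tuple = struct_sketch(("foo", "bar"))
names_from_varargs = struct_sketch("foo", Column("bar"))
```

With the `Tuple[...]` arm present in the second overload, a checker such as mypy should accept the `struct_sketch(("foo", "bar"))` call; dropping that arm leaves no overload that matches a tuple argument.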
   
   If you try
   
   ```diff
   diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
   index 006d10c9fc..caf17a84b3 100644
   --- a/python/pyspark/sql/functions.py
   +++ b/python/pyspark/sql/functions.py
   @@ -1677,13 +1677,11 @@ def struct(*cols: "ColumnOrName") -> Column:
    
    
    @overload
   -def struct(__cols: Union[List["ColumnOrName_"], Tuple["ColumnOrName_", ...]]) -> Column:
   +def struct(__cols: Union[List["ColumnOrName_"]]) -> Column:
        ...
    
    
   -def struct(
   -    *cols: Union["ColumnOrName", Union[List["ColumnOrName_"], Tuple["ColumnOrName_", ...]]]
   -) -> Column:
   +def struct(*cols: Union["ColumnOrName", Union[List["ColumnOrName_"]]]) -> Column:
        """Creates a new struct column.
    
        .. versionadded:: 1.4.0
   diff --git a/python/pyspark/sql/tests/typing/test_functions.yml b/python/pyspark/sql/tests/typing/test_functions.yml
   index efb3293472..f5f2f13c4f 100644
   --- a/python/pyspark/sql/tests/typing/test_functions.yml
   +++ b/python/pyspark/sql/tests/typing/test_functions.yml
   @@ -66,6 +66,8 @@
        create_map(col_objs)
        map_concat(col_objs)
    
   +    struct(("foo", "bar"))
   +
      out: |
        main:29: error: No overload variant of "array" matches argument types "List[Column]", "List[Column]"  [call-overload]
        main:29: note: Possible overload variant:
   
   ```
   
   you should see an error in the data tests:
   
   ```
   ___________________________ varargFunctionsOverloads ___________________________
   /path/to/spark/python/pyspark/sql/tests/typing/test_functions.yml:19: 
   E   pytest_mypy_plugins.utils.TypecheckAssertionError: Invalid output: 
   E   Actual:
   E     main:50: error: No overload variant of "struct" matches argument type "Tuple[str, str]"  [call-overload] (diff)
   E     main:50: note: Possible overload variants:    (diff)
   E     main:50: note:     def struct(*cols: Union[Column, str]) -> Column (diff)
   E     main:50: note:     def [ColumnOrName_] struct(List[ColumnOrName_]) -> Column (diff)
   E   Expected:
   E     (empty)
   =========================== short test summary info ============================
   ```
   
   
   > And could I ask if what is the `...` mean in `["ColumnOrName_", ...]` ??
   
   Tuples are typed as product types, so `Tuple[ColumnOrName_]` matches a tuple with exactly one element that is a column or a `str`. In contrast, `Tuple["ColumnOrName_", ...]` matches tuples of arbitrary length, as long as all elements are columns or strings (there is a [mypy doc section](https://mypy.readthedocs.io/en/stable/kinds_of_types.html?highlight=namedtuple#tuple-types) that discusses this syntax further).
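
To make the difference concrete, here is a small sketch. `Name`, `first_of_one`, and `count_any` are hypothetical names used only for illustration, with `Name` as a simplified stand-in for `ColumnOrName_`. Note that Python does not enforce these annotations at runtime; the distinction only matters to a type checker such as mypy.

```python
from typing import Tuple

# Simplified stand-in for "ColumnOrName_" (assumption for this sketch).
Name = str


def first_of_one(names: Tuple[Name]) -> Name:
    # Tuple[Name]: a tuple with exactly one element of type Name.
    return names[0]


def count_any(names: Tuple[Name, ...]) -> int:
    # Tuple[Name, ...]: a tuple of any length (including empty),
    # as long as every element is of type Name.
    return len(names)


single = first_of_one(("foo",))
count = count_any(("foo", "bar", "baz"))

# A type checker is expected to reject the call below, because a
# two-element tuple does not match Tuple[Name]:
# first_of_one(("foo", "bar"))
```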




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


