[GitHub] [spark] zero323 commented on a change in pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-11-24 Thread GitBox


zero323 commented on a change in pull request #34354:
URL: https://github.com/apache/spark/pull/34354#discussion_r756336353



##
File path: python/pyspark/sql/functions.py
##
@@ -3514,7 +3538,19 @@ def map_from_arrays(col1: "ColumnOrName", col2: "ColumnOrName") -> Column:
     return Column(sc._jvm.functions.map_from_arrays(_to_java_column(col1), _to_java_column(col2)))
 
 
+@overload
 def array(*cols: "ColumnOrName") -> Column:
+    ...
+
+
+@overload
+def array(__cols: Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]) -> Column:
+    ...
+
+
+def array(
+    *cols: Union["ColumnOrName", Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]]

Review comment:
   In general, nested `Union`s are flattened automatically:
   
   ```python
   >>> Union["ColumnOrName", Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]]
   typing.Union[ForwardRef('ColumnOrName'), typing.List[ForwardRef('ColumnOrName')], typing.Tuple[ForwardRef('ColumnOrName'), ...]]
   ```
   
   Personally, I find a structure that clearly groups similar categories useful 
(for a similar problem see 
https://github.com/apache/spark/pull/34671#discussion_r754532969), but I am not 
very attached to this idea.
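
The flattening can be checked with the standard `typing` module alone. This is only a sketch: `ColumnOrName` is left as an unresolved forward reference, and the names `ListOrTuple`, `nested` and `flat` are made up for illustration; they do not come from the PR.

```python
from typing import List, Tuple, Union, get_args

# Hypothetical alias that groups the "collection" variants together.
ListOrTuple = Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]

# Nested spelling (grouped) versus the flat spelling.
nested = Union["ColumnOrName", ListOrTuple]
flat = Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]]

# typing flattens the inner Union, so both spellings describe the same type
# and expose the same three members.
print(nested == flat)
print(len(get_args(nested)))
```

So the nesting matters only to the human reader: a type checker should see the same annotation either way.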




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zero323 commented on a change in pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox


zero323 commented on a change in pull request #34354:
URL: https://github.com/apache/spark/pull/34354#discussion_r734849802



##
File path: python/pyspark/sql/functions.py
##
@@ -1652,7 +1652,19 @@ def expr(str: str) -> Column:
     return Column(sc._jvm.functions.expr(str))
 
 
+@overload
 def struct(*cols: "ColumnOrName") -> Column:
+    ...
+
+
+@overload
+def struct(__cols: Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]) -> Column:

Review comment:
   > I know it backfires in some contexts, but maybe not here.
   
   But we'd need explicit checks for strings, like
   
   ```python
   if len(cols) == 1 and isinstance(cols[0], Sequence) and not isinstance(cols[0], str):
       cols = cols[0]
   ...
   ```
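
A self-contained sketch of that check, for illustration only. The helper `unpack_single_sequence` and the stand-in function `demo` are hypothetical names; they are not part of PySpark or of this PR.

```python
from collections.abc import Sequence
from typing import Any, Tuple


def unpack_single_sequence(cols: Tuple[Any, ...]) -> Tuple[Any, ...]:
    """Hypothetical helper: unwrap f([a, b]) into (a, b), but leave f("ab") alone."""
    if len(cols) == 1 and isinstance(cols[0], Sequence) and not isinstance(cols[0], str):
        return tuple(cols[0])
    return cols


def demo(*cols: Any) -> Tuple[Any, ...]:
    # Stand-in for a variadic function such as struct(*cols).
    return unpack_single_sequence(cols)


print(demo("a", "b"))    # two names passed as varargs
print(demo(["a", "b"]))  # one list, unwrapped
print(demo("ab"))        # one name, must not be split into characters
```

Without the `str` exclusion, the last call would treat the single column name as a sequence of one-character names.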







[GitHub] [spark] zero323 commented on a change in pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox


zero323 commented on a change in pull request #34354:
URL: https://github.com/apache/spark/pull/34354#discussion_r734825579



##
File path: python/pyspark/sql/functions.py
##
@@ -1652,7 +1652,19 @@ def expr(str: str) -> Column:
     return Column(sc._jvm.functions.expr(str))
 
 
+@overload
 def struct(*cols: "ColumnOrName") -> Column:
+    ...
+
+
+@overload
+def struct(__cols: Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]) -> Column:

Review comment:
   In general, I think we have a bigger problem with current aliases 
outliving their usefulness, but that's a topic for a longer discussion and 
maybe a formal design document. Sigh







[GitHub] [spark] zero323 commented on a change in pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox


zero323 commented on a change in pull request #34354:
URL: https://github.com/apache/spark/pull/34354#discussion_r734824096



##
File path: python/pyspark/sql/functions.py
##
@@ -1652,7 +1652,19 @@ def expr(str: str) -> Column:
     return Column(sc._jvm.functions.expr(str))
 
 
+@overload
 def struct(*cols: "ColumnOrName") -> Column:
+    ...
+
+
+@overload
+def struct(__cols: Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]) -> Column:

Review comment:
   In general, I think we have a bigger problem with current aliases 
outliving their usefulness, but that's a topic for a longer discussion and 
maybe a formal design document. Sigh







[GitHub] [spark] zero323 commented on a change in pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox


zero323 commented on a change in pull request #34354:
URL: https://github.com/apache/spark/pull/34354#discussion_r734824096



##
File path: python/pyspark/sql/functions.py
##
@@ -1652,7 +1652,19 @@ def expr(str: str) -> Column:
     return Column(sc._jvm.functions.expr(str))
 
 
+@overload
 def struct(*cols: "ColumnOrName") -> Column:
+    ...
+
+
+@overload
+def struct(__cols: Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]) -> Column:

Review comment:
   In general, I think we have a bigger problem with current aliases 
outliving their usability, but that's a topic for a longer discussion and maybe 
a formal design document. Sigh







[GitHub] [spark] zero323 commented on a change in pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox


zero323 commented on a change in pull request #34354:
URL: https://github.com/apache/spark/pull/34354#discussion_r734823494



##
File path: python/pyspark/sql/functions.py
##
@@ -1652,7 +1652,19 @@ def expr(str: str) -> Column:
     return Column(sc._jvm.functions.expr(str))
 
 
+@overload
 def struct(*cols: "ColumnOrName") -> Column:
+    ...
+
+
+@overload
+def struct(__cols: Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]) -> Column:

Review comment:
   > How about using more general type, like `Sequence` or `Iterable`?
   
   Yeah, this is something that has been bothering me for a couple of days now. 
A more general type would be great (assuming we'd modify the code, which wouldn't 
be a bad idea anyway), if it weren't for the fact that `str` is recursively 
`Sequence[str]` / `Iterable[str]`.
   
   
   ```python
   from typing import Sequence, Iterable
   
   x: Sequence[str] = "abc"
   y: Iterable[str] = "abc"
   ```
   
   I know it backfires in some contexts, but maybe not here.
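
The same point can be seen at runtime. A minimal sketch using only `collections.abc`; the variable `name` is an arbitrary example value.

```python
from collections.abc import Iterable, Sequence

name = "abc"

# A str is itself a Sequence and an Iterable ...
print(isinstance(name, Sequence), isinstance(name, Iterable))

# ... and each of its elements is again a str, hence "recursively".
print(all(isinstance(ch, str) for ch in name))

# So a parameter annotated as Sequence[str] would accept a bare column
# name, and iterating over it yields single characters.
print(list(name))
```

This is why a `Sequence["ColumnOrName"]` or `Iterable["ColumnOrName"]` annotation cannot, by itself, tell a single column name apart from a collection of names.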







[GitHub] [spark] zero323 commented on a change in pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox


zero323 commented on a change in pull request #34354:
URL: https://github.com/apache/spark/pull/34354#discussion_r734483147



##
File path: python/pyspark/sql/functions.py
##
@@ -3455,7 +3469,19 @@ def translate(srcCol: "ColumnOrName", matching: str, replace: str) -> Column:
 
 # -- Collection functions --
 
+@overload
 def create_map(*cols: "ColumnOrName") -> Column:
+    ...
+
+
+@overload
+def create_map(__cols: Union[List["ColumnOrName"], Tuple["ColumnOrName", ...]]) -> Column:

Review comment:
   The leading double underscore in `__cols` indicates that the argument should be treated as positional-only (the PEP 484 convention).
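
A rough illustration of what that convention does and does not do. The functions below are hypothetical stand-ins, not PySpark code. The double-underscore form is a hint that type checkers honour, while the `/` marker from PEP 570 (Python 3.8+) is enforced by the interpreter.

```python
def create_map_like(__cols):
    # Hypothetical stand-in; the leading double underscore is only a hint
    # for type checkers (PEP 484), the interpreter does not enforce it.
    return list(__cols)


def create_map_strict(cols, /):
    # PEP 570 syntax (Python 3.8+): enforced by the interpreter itself.
    return list(cols)


def accepts_keyword(func, name):
    """Return True if func can be called with its argument passed by keyword."""
    try:
        func(**{name: ["a", "b"]})
        return True
    except TypeError:
        return False


print(accepts_keyword(create_map_like, "__cols"))
print(accepts_keyword(create_map_strict, "cols"))
```

The first call succeeds at runtime even though a type checker would reject it; the second raises `TypeError` inside the helper, so the helper reports `False`.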



