EnricoMi commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r972871859
##########
python/pyspark/sql/dataframe.py:
##########
@@ -3064,7 +3064,7 @@ def cube(self, *cols: "ColumnOrName") -> "GroupedData":
# type: ignore[misc]
def unpivot(
self,
-     ids: Optional[Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]]],
+     ids: Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]],
Review Comment:
However, `pyspark.pandas.frame.melt` allows `None` for `ids`, which is treated as `[]`, while `None` for `values` implicitly means "take all non-id columns":
```python
def melt(
self,
id_vars: Optional[Union[Name, List[Name]]] = None,
value_vars: Optional[Union[Name, List[Name]]] = None,
var_name: Optional[Union[str, List[str]]] = None,
value_name: str = "value",
) -> "DataFrame":
"""
...
Parameters
----------
frame : DataFrame
id_vars : tuple, list, or ndarray, optional
Column(s) to use as identifier variables.
value_vars : tuple, list, or ndarray, optional
Column(s) to unpivot. If not specified, uses all columns that
are not set as `id_vars`.
```
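For illustration, the `None` semantics described above could be sketched as a toy pure-Python helper (a hypothetical function over lists of dicts, not part of either API; names `ids`/`values`/`var_name`/`value_name` mirror the proposed signature):

```python
from typing import Dict, List, Optional


def unpivot_sketch(
    rows: List[Dict[str, object]],
    ids: Optional[List[str]] = None,
    values: Optional[List[str]] = None,
    var_name: str = "variable",
    value_name: str = "value",
) -> List[Dict[str, object]]:
    """Toy unpivot mirroring the discussed None semantics:
    ids=None behaves like ids=[], and values=None means
    "all columns that are not id columns"."""
    id_cols = ids if ids is not None else []
    out: List[Dict[str, object]] = []
    for row in rows:
        # values=None: magically take all non-id columns
        value_cols = (
            values if values is not None
            else [c for c in row if c not in id_cols]
        )
        for col in value_cols:
            rec: Dict[str, object] = {c: row[c] for c in id_cols}
            rec[var_name] = col
            rec[value_name] = row[col]
            out.append(rec)
    return out
```

With `ids=["id"]` each id column is repeated per output row; with `ids=None` and `values=None` every column becomes a variable/value pair, matching the pandas-style defaults quoted above.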
Should the Python API be consistent with the Scala API or with the PySpark Pandas API?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]