EnricoMi commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r972871859
##########
python/pyspark/sql/dataframe.py:
##########
@@ -3064,7 +3064,7 @@ def cube(self, *cols: "ColumnOrName") -> "GroupedData":
# type: ignore[misc]
def unpivot(
self,
-     ids: Optional[Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]]],
+     ids: Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]],
Review Comment:
However, `pyspark.pandas.frame.melt` allows `None` for `ids`, which is treated as `[]`, while `None` for `values` implicitly means "take all non-id columns":
```python
def melt(
self,
id_vars: Optional[Union[Name, List[Name]]] = None,
value_vars: Optional[Union[Name, List[Name]]] = None,
var_name: Optional[Union[str, List[str]]] = None,
value_name: str = "value",
) -> "DataFrame":
"""
...
Parameters
----------
frame : DataFrame
id_vars : tuple, list, or ndarray, optional
Column(s) to use as identifier variables.
value_vars : tuple, list, or ndarray, optional
Column(s) to unpivot. If not specified, uses all columns that
are not set as `id_vars`.
```
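For illustration, the `None` semantics described above could be sketched as a toy pure-Python helper (a hypothetical function over lists of dicts, not part of either API; names `ids`/`values`/`var_name`/`value_name` mirror the proposed signature):

```python
from typing import Dict, List, Optional


def unpivot_sketch(
    rows: List[Dict[str, object]],
    ids: Optional[List[str]] = None,
    values: Optional[List[str]] = None,
    var_name: str = "variable",
    value_name: str = "value",
) -> List[Dict[str, object]]:
    """Toy unpivot mirroring the discussed None semantics:
    ids=None behaves like ids=[], and values=None means
    "all columns that are not id columns"."""
    id_cols = ids if ids is not None else []
    out: List[Dict[str, object]] = []
    for row in rows:
        # values=None: magically take all non-id columns
        value_cols = (
            values if values is not None
            else [c for c in row if c not in id_cols]
        )
        for col in value_cols:
            rec: Dict[str, object] = {c: row[c] for c in id_cols}
            rec[var_name] = col
            rec[value_name] = row[col]
            out.append(rec)
    return out
```

With `ids=["id"]` each id column is repeated per output row; with `ids=None` and `values=None` every column becomes a variable/value pair, matching the pandas-style defaults quoted above.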
Should the Python API be consistent with the Scala API or with the PySpark Pandas API?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]