amaliujia commented on code in PR #38318:
URL: https://github.com/apache/spark/pull/38318#discussion_r1015762416
##########
python/pyspark/sql/connect/dataframe.py:
##########
@@ -323,6 +323,14 @@ def unionByName(self, other: "DataFrame",
allowMissingColumns: bool = False) ->
def where(self, condition: Expression) -> "DataFrame":
return self.filter(condition)
+ def summary(self, *statistics: str) -> "DataFrame":
+ _statistics: List[str] = list(statistics)
+ assert all(isinstance(s, str) for s in _statistics)
Review Comment:
In contrast, I guess the assert in `plan.py` is ok because that is internal
API and we can implement it right so assert on unexpected things (and developer
can fix when it really happens).
```
def __init__(self, child: Optional["LogicalPlan"], function: str,
**kwargs: Any) -> None:
super().__init__(child)
assert function in ["summary"]
self.function = function
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]