gaogaotiantian commented on code in PR #54009:
URL: https://github.com/apache/spark/pull/54009#discussion_r2735019871
##########
python/pyspark/pandas/frame.py:
##########
@@ -11361,9 +11361,13 @@ def _result_aggregated(
# dtype: bool
return first_series(DataFrame(internal))
- # TODO(SPARK-46167): add axis, pct, na_option parameter
+ # TODO(SPARK-46167): add pct, na_option parameter
def rank(
- self, method: str = "average", ascending: bool = True, numeric_only: bool = False
+ self,
+ method: str = "average",
+ ascending: bool = True,
+ numeric_only: bool = False,
+ axis: Axis = 0,
Review Comment:
We need to decide where `axis` should go. `pandas` has it as the very first
parameter, while this change appends it at the end. That means a user who
passes arguments positionally gets a different result here than in `pandas`.
On the other hand, users who already pass the existing parameters
positionally would see their code break if we moved `axis` to the front.
On a side note, `pandas` is moving towards keyword-only APIs very eagerly.
We could consider doing the same here, so that users cannot send the wrong
argument by position. We are already incompatible with `pandas` on this
signature, so this might be a good chance to fix that and have users take the
hit early.
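
To make the positional concern concrete, here is a minimal sketch. The three
functions are hypothetical stand-ins that only report how their arguments
bind; they are not the real `pandas` or `pyspark.pandas` implementations, and
`keyword_only_rank` is just one possible shape for a keyword-only signature,
not something this PR implements.

```python
def pandas_like_rank(axis=0, method="average", numeric_only=False,
                     na_option="keep", ascending=True, pct=False):
    # Stand-in using the pandas parameter order: axis comes first.
    return {"axis": axis, "method": method, "ascending": ascending}


def current_rank(method="average", ascending=True, numeric_only=False, axis=0):
    # Stand-in using the order in this diff: axis appended at the end.
    return {"axis": axis, "method": method, "ascending": ascending}


def keyword_only_rank(*, axis=0, method="average", ascending=True,
                      numeric_only=False):
    # Hypothetical keyword-only variant: the bare * rejects positional use.
    return {"axis": axis, "method": method, "ascending": ascending}


# The same positional call binds to different parameters in each order.
print(pandas_like_rank("min"))  # "min" lands in axis
print(current_rank("min"))      # "min" lands in method

# With keyword-only parameters the ambiguous call fails loudly instead.
try:
    keyword_only_rank("min")
except TypeError as exc:
    print("rejected:", exc)

# Explicit keywords mean the same thing regardless of parameter order.
print(keyword_only_rank(axis=1, method="min"))
```

With a keyword-only signature the parameter order stops mattering, at the
cost of breaking any caller that currently passes `method` or `ascending`
positionally.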
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.