Github user citoubest commented on the issue:
https://github.com/apache/spark/pull/15135
@davies, what do you think about this patch? Can you give me some advice?
Thanks
Github user citoubest commented on the issue:
https://github.com/apache/spark/pull/15135
With pandas, the parameter passed to agg is the function itself, not a str (a function name).
```
In [13]: df
Out[13]:
          a         b         c  d
0  0.068300  0.263883  0.237335  1
1  0.226992
```
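For reference, a minimal pandas sketch of the call style this comment describes; the DataFrame contents and column names here are made-up assumptions, not taken from the patch:
```
import numpy as np
import pandas as pd

# illustrative DataFrame; the values and column names are made up
df = pd.DataFrame(np.random.rand(2, 3), columns=["a", "b", "c"])
df["d"] = 1

# passing the function object itself to agg, as the comment describes
df.groupby("d").agg(np.sum)
```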
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/15135
Pandas doesn't support this, does it?
```
>>> pd.read_csv('test.csv').groupby('a').agg('sum', 'avg')
Traceback (most recent call last):
File "", line 1, in
File "/Library/Py
Github user citoubest commented on the issue:
https://github.com/apache/spark/pull/15135
OK. Because the pandas DataFrame supports this approach to agg, I supposed the Spark DataFrame should support it as well, but it does not, so I tried to add this patch. If you think this patch is not necessary, please let me know.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/15135
I understand the reasons why you want to add this -- but I feel this is too
esoteric and if we add this one, there are also a lot of other cases that can
be added and I don't know where we would stop.
Github user citoubest commented on the issue:
https://github.com/apache/spark/pull/15135
@rxin @davies @srowen
Github user citoubest commented on the issue:
https://github.com/apache/spark/pull/15135
@petermaxlee
In my opinion, a list comprehension can reduce the code length to some extent, but it would be better if the agg method supported this shortcut directly at the API level.
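A hedged sketch of the contrast being drawn here; the DataFrame and column names are assumptions, and the string-based shorthand on the final commented line is only the proposal discussed in this PR, not part of Spark's API:
```
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
# illustrative DataFrame; the column names are assumptions
df = spark.createDataFrame([("a", 1, 2), ("a", 3, 4)], ["key", "x", "y"])

# existing API: build each aggregate Column explicitly
cols = [c for c in df.columns if c != "key"]
df.groupby("key").agg(*([F.min(c) for c in cols] + [F.max(c) for c in cols])).show()

# shorthand discussed in this thread (hypothetical, not part of Spark's API):
# df.groupby("key").agg("min", "max")
```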
Github user petermaxlee commented on the issue:
https://github.com/apache/spark/pull/15135
Isn't it as simple as
```
# assumes the usual import: from pyspark.sql import functions as F
cols = [x for x in df.columns if x != "key"]
df.groupby("key").agg(*([F.min(x) for x in cols] + [F.max(x) for x in cols]))
```
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15135
Can one of the admins verify this patch?