[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065790#comment-16065790
]
Han Xu commented on SPARK-10915:
I'm currently traveling without access to my email. To get in touch
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065787#comment-16065787
]
Erik Erlandson commented on SPARK-10915:
This would be great for exposing {{TDigest}} aggregation
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592839#comment-15592839
]
Reynold Xin commented on SPARK-10915:
The current implementation of collect_list isn't going to work
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592831#comment-15592831
]
Jason White commented on SPARK-10915:
At the moment, we use .repartitionAndSortWithinPartitions to
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592534#comment-15592534
]
Jason White commented on SPARK-10915:
That's unfortunate. Materializing a list somewhere is exactly
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592544#comment-15592544
]
Reynold Xin commented on SPARK-10915:
But if you need strict ordering guarantees, materializing them
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592514#comment-15592514
]
Davies Liu commented on SPARK-10915:
[~jason.white] When an aggregate function is applied, the order
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591706#comment-15591706
]
Jason White commented on SPARK-10915:
We would also very much like Python UDAFs. In particular, we
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586392#comment-15586392
]
Tobi Bosede commented on SPARK-10915:
Oh ok, thanks.
> Add support for UDAFs in Python
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586364#comment-15586364
]
Reynold Xin commented on SPARK-10915:
BTW percentile on large data is very expensive and can easily
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586362#comment-15586362
]
Reynold Xin commented on SPARK-10915:
They don't have Python API yet (nor Scala API), but you can
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586348#comment-15586348
]
Tobi Bosede commented on SPARK-10915:
Can you please provide where these functions are documented?
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586330#comment-15586330
]
Reynold Xin commented on SPARK-10915:
There is percentile and approximate percentile.
> Add
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586325#comment-15586325
]
Tobi Bosede commented on SPARK-10915:
Hmm...I don't think percent_rank does what I thought. It looks
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586282#comment-15586282
]
Tobi Bosede commented on SPARK-10915:
Probably, haha. Holden mentioned I should ping "Michael" (no
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586172#comment-15586172
]
Reynold Xin commented on SPARK-10915:
Hm I don't think [~mgummelt] knows SQL at all. Not sure how
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586159#comment-15586159
]
Tobi Bosede commented on SPARK-10915:
Ah ok. Well maybe we can implement an interquartile (IQR)
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583638#comment-15583638
]
Davies Liu commented on SPARK-10915:
Currently all the aggregate functions are implemented in Scala,
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583180#comment-15583180
]
Tobi Bosede commented on SPARK-10915:
Thanks Davies. Someone also mentioned collect on the mailing
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582918#comment-15582918
]
Davies Liu commented on SPARK-10915:
Python UDF is executed in batch mode to have reasonable
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581167#comment-15581167
]
Reynold Xin commented on SPARK-10915:
It is indeed very complicated to implement UDAF in Python.
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581127#comment-15581127
]
Tobi Bosede commented on SPARK-10915:
It is complicated to implement a UDAF in Python? If you read
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581031#comment-15581031
]
Reynold Xin commented on SPARK-10915:
What's the use case? Is it not possible to just run
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580917#comment-15580917
]
Tobi Bosede commented on SPARK-10915:
Thoughts [~davies] and [~mgummelt]? Refer to
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15058987#comment-15058987
]
Tristan commented on SPARK-10915:
Would the analogy to UDAF support in Python be lambdas, as mentioned
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059055#comment-15059055
]
Justin Uang commented on SPARK-10915:
An abstract base class would be fine, or something like
26 matches