[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2017-06-27 Thread Han Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065790#comment-16065790 ] Han Xu commented on SPARK-10915: I'm currently traveling without access to my email. To get in touch

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2017-06-27 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065787#comment-16065787 ] Erik Erlandson commented on SPARK-10915: This would be great for exposing {{TDigest}} aggregation

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592839#comment-15592839 ] Reynold Xin commented on SPARK-10915: The current implementation of collect_list isn't going to work

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-20 Thread Jason White (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592831#comment-15592831 ] Jason White commented on SPARK-10915: At the moment, we use .repartitionAndSortWithinPartitions to

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-20 Thread Jason White (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592534#comment-15592534 ] Jason White commented on SPARK-10915: That's unfortunate. Materializing a list somewhere is exactly

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592544#comment-15592544 ] Reynold Xin commented on SPARK-10915: But if you need strict ordering guarantees, materializing them

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-20 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592514#comment-15592514 ] Davies Liu commented on SPARK-10915: [~jason.white] When an aggregate function is applied, the order

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-20 Thread Jason White (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591706#comment-15591706 ] Jason White commented on SPARK-10915: We would also very much like Python UDAFs. In particular, we

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Tobi Bosede (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586392#comment-15586392 ] Tobi Bosede commented on SPARK-10915: Oh ok, thanks.

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586364#comment-15586364 ] Reynold Xin commented on SPARK-10915: BTW percentile on large data is very expensive and can easily

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586362#comment-15586362 ] Reynold Xin commented on SPARK-10915: They don't have Python API yet (nor Scala API), but you can

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Tobi Bosede (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586348#comment-15586348 ] Tobi Bosede commented on SPARK-10915: Can you please provide where these functions are documented?

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586330#comment-15586330 ] Reynold Xin commented on SPARK-10915: There is percentile and approximate percentile.

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Tobi Bosede (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586325#comment-15586325 ] Tobi Bosede commented on SPARK-10915: Hmm... I don't think percent_rank does what I thought. It looks

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Tobi Bosede (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586282#comment-15586282 ] Tobi Bosede commented on SPARK-10915: Probably, haha. Holden mentioned I should ping "Michael" (no

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586172#comment-15586172 ] Reynold Xin commented on SPARK-10915: Hm I don't think [~mgummelt] knows SQL at all. Not sure how

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-18 Thread Tobi Bosede (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586159#comment-15586159 ] Tobi Bosede commented on SPARK-10915: Ah ok. Well maybe we can implement an interquartile (IQR)

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583638#comment-15583638 ] Davies Liu commented on SPARK-10915: Currently all the aggregate functions are implemented in Scala,

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-17 Thread Tobi Bosede (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583180#comment-15583180 ] Tobi Bosede commented on SPARK-10915: Thanks Davies. Someone also mentioned collect on the mailing

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582918#comment-15582918 ] Davies Liu commented on SPARK-10915: Python UDF is executed in batch mode to have reasonable

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581167#comment-15581167 ] Reynold Xin commented on SPARK-10915: It is indeed very complicated to implement UDAF in Python.

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-16 Thread Tobi Bosede (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581127#comment-15581127 ] Tobi Bosede commented on SPARK-10915: It is complicated to implement a UDAF in Python? If you read

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581031#comment-15581031 ] Reynold Xin commented on SPARK-10915: What's the use case? Is it not possible to just run

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-16 Thread Tobi Bosede (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580917#comment-15580917 ] Tobi Bosede commented on SPARK-10915: Thoughts [~davies] and [~mgummelt]? Refer to

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2015-12-15 Thread Tristan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15058987#comment-15058987 ] Tristan commented on SPARK-10915: Would the analogy to UDAF support in Python be lambdas, as mentioned

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2015-12-15 Thread Justin Uang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059055#comment-15059055 ] Justin Uang commented on SPARK-10915: An abstract base class would be fine, or something like