[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-08-29 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16537 Thanks @HyukjinKwon --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-08-22 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16537 Thanks @HyukjinKwon :D

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-08-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16537 Let me take over this one and credit to @zero323.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-08-22 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16537 I see people run into this kind of thing quite a bit. Sounds like this is important to have. How about reviving some form of this?

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build finished. Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78330/ Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #78330 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78330/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #78330 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78330/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/16537 Jenkins, retest this please.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78315/ Test FAILed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build finished. Test FAILed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #78315 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78315/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16537 I cannot reproduce this locally, but do we really use `pypy-2.0.2`?

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #78315 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78315/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78313/ Test FAILed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #78313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78313/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build finished. Test FAILed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #78313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78313/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16537 @holdenk I'll try to reproduce this problem but it looks a bit awkward: > AttributeError: 'function' object has no attribute '__closure__' Doesn't look like something related to

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16537 If you have the time to update/fix this @zero323 I'm happy to merge it pending jenkins, otherwise I'll just close the issue at the end of the month.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78307/ Test FAILed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build finished. Test FAILed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #78307 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78307/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #78307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78307/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/16537 Jenkins, retest this please.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16537 So it seems there isn't a solid reason not to merge this provided we aren't going to go down the rabbit hole we've been talking about. Let's make sure everything is still ok with Jenkins still

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-06-20 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/16537 @zero323 Hi, are you still working on this?

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-03-15 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16537 @holdenk If you don't see this merged could you resolve the [JIRA ticket](https://issues.apache.org/jira/browse/SPARK-19165) and I'll just close the PR? No reason to keep this open ad infinitum :)

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-03-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build finished. Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-03-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #73936 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73936/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-03-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73936/ Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-03-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #73936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73936/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build finished. Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72953/ Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #72953 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72953/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #72953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72953/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build finished. Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72884/ Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #72884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72884/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #72884 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72884/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build started.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build triggered.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16537 Putting this particular PR and the scalability of the improvement process aside, Spark is heavily underdocumented. This is something that hits Python and R users way more than everyone else. In

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16537 Ah perhaps then we are simply agreeing with each other. I'm fine with adding these types of fixes - but doing it one function at a time is just going to be too time consuming and distracting from

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/16537 Maybe we're in an agree-to-disagree situation, but I think we may be talking about different things. If you're saying that we should try to keep these together to make reviews easier, I'd agree. I

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16537 I think the overhead of doing this piecemeal removes review time available for more important changes (like places where users are actively encountering confusing error messages, incorrect

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/16537 Yeah, I thought this was the other PR that validates the function is callable. Still, I don't agree that it's okay for python to be less friendly as long as we don't think people will hit the

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16537 @rdblue I think we may be talking about different type checks. My understanding is in this case the error is already thrown right away. It's also not that the user needs to pass a callable here,

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/16537 Sorry, my example was for validating the object passed to `udf` was callable, not for the use of the UDF. I still think it's a good idea not to make assumptions about how a user makes a mistake.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16537 I explore an alternative approach, adding type hints (https://github.com/zero323/pyspark-stubs), but I doubt it'll become particularly popular, and I won't even try to push it to the main

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16537 And there is of course a matter of user experience. Even if failure is cheap, something like this: ```python In [4]: from pyspark.sql.functions import udf In [5]: udf(lambda

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/16537 I don't think it is a good idea to think that this has little use because it is a dumb mistake to pass something that isn't callable. In this case, it's easy to accidentally reuse a name for a

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16537 I definitely think moving errors earlier is important (nothing is worse than a 9 hour job that fails in the middle because of the wrong type). That being said in this case the error isn't caught

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16537 For me it is all about the bigger picture. I've been working with Python for quite a while right now (probably too long for my own good) and I am used to two things: - Language is

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72826/ Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16537 Merged build finished. Test PASSed.

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #72826 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72826/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16537 **[Test build #72826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72826/testReport)** for PR 16537 at commit

[GitHub] spark issue #16537: [SPARK-19165][PYTHON][SQL] UserDefinedFunction.__call__ ...

2017-02-13 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16537 So I'm curious what the motivation is for adding these checks - looking in the mailing list archives this doesn't seem like a common error (and the only Stack Overflow post about this I saw was