Github user zero323 commented on the issue:
https://github.com/apache/spark/pull/16537
Thanks @HyukjinKwon
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16537
Thanks @HyukjinKwon :D
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/16537
Let me take over this one and credit @zero323.
---
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16537
I see people run into this kind of thing quite a bit.
Sounds like this is important to have. How about reviving some form of
this?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78330/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #78330 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78330/testReport)**
for PR 16537 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #78330 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78330/testReport)**
for PR 16537 at commit
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16537
Jenkins, retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78315/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #78315 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78315/testReport)**
for PR 16537 at commit
Github user zero323 commented on the issue:
https://github.com/apache/spark/pull/16537
I cannot reproduce this locally, but do we really use `pypy-2.0.2`?
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #78315 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78315/testReport)**
for PR 16537 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78313/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #78313 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78313/testReport)**
for PR 16537 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #78313 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78313/testReport)**
for PR 16537 at commit
Github user zero323 commented on the issue:
https://github.com/apache/spark/pull/16537
@holdenk I'll try to reproduce this problem but it looks a bit awkward:
> AttributeError: 'function' object has no attribute '__closure__'
Doesn't look like something related to
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16537
If you have the time to update/fix this @zero323 I'm happy to merge it
pending jenkins, otherwise I'll just close the issue at the end of the month.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78307/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #78307 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78307/testReport)**
for PR 16537 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #78307 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78307/testReport)**
for PR 16537 at commit
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16537
Jenkins, retest this please.
---
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16537
So it seems there isn't a solid reason not to merge this provided we aren't
going to go down the rabbit hole we've been talking about. Let's make sure
everything is still ok with Jenkins still
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16537
@zero323 Hi, are you still working on this?
---
Github user zero323 commented on the issue:
https://github.com/apache/spark/pull/16537
@holdenk If you don't see this merged could you resolve the [JIRA
ticket](https://issues.apache.org/jira/browse/SPARK-19165) and I'll just close
the PR? No reason to keep this open ad infinitum :)
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #73936 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73936/testReport)**
for PR 16537 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73936/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #73936 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73936/testReport)**
for PR 16537 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72953/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #72953 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72953/testReport)**
for PR 16537 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #72953 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72953/testReport)**
for PR 16537 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72884/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #72884 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72884/testReport)**
for PR 16537 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #72884 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72884/testReport)**
for PR 16537 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build started.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build triggered.
---
Github user zero323 commented on the issue:
https://github.com/apache/spark/pull/16537
Putting this particular PR and the scalability of the improvement process
aside, Spark is heavily underdocumented. This is something that hits Python
and R users way more than everyone else. In
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16537
Ah, perhaps then we are simply agreeing with each other. I'm fine with
adding these types of fixes - but doing it one function at a time is just going
to be too time consuming and distracting from
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/16537
Maybe we're in an agree-to-disagree situation, but I think we may be
talking about different things. If you're saying that we should try to keep
these together to make reviews easier, I'd agree. I
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16537
I think the overhead of doing this piecemeal removes review time available
for more important changes (like places where users are actively encountering
confusing error messages, incorrect
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/16537
Yeah, I thought this was the other PR that validates the function is
callable. Still, I don't agree that it's okay for python to be less friendly as
long as we don't think people will hit the
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16537
@rdblue I think we're maybe understanding different type checks. My
understanding is in this case the error is already thrown right away. It's also
not that the user needs to pass a callable here,
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/16537
Sorry, my example was for validating the object passed to `udf` was
callable, not for the use of the UDF. I still think it's a good idea not to
make assumptions about how a user makes a mistake.
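A minimal sketch of the kind of eager check under discussion, assuming plain Python with no Spark dependency. The names `require_callable` and `make_wrapper` are hypothetical stand-ins for illustration; they are not the code in this PR and not part of PySpark's API:

```python
import functools


def require_callable(f):
    """Raise TypeError immediately if ``f`` cannot be called."""
    if not callable(f):
        raise TypeError(
            "Expected a callable, got {!r} of type {}".format(f, type(f).__name__)
        )
    return f


def make_wrapper(f):
    """Hypothetical stand-in for a UDF factory: validate first, then wrap."""
    require_callable(f)

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        # Delegate to the validated callable unchanged.
        return f(*args, **kwargs)

    return wrapper
```

With a check like this, passing a non-callable (for example, a name accidentally rebound to a string) fails at definition time with a message naming the offending object, rather than later when the wrapped function is first used.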
Github user zero323 commented on the issue:
https://github.com/apache/spark/pull/16537
I explored an alternative approach, adding type hints
(https://github.com/zero323/pyspark-stubs), but I doubt it'll become
particularly popular, and I won't even try to push it to the main
Github user zero323 commented on the issue:
https://github.com/apache/spark/pull/16537
And there is of course a matter of user experience. Even if failure is
cheap, something like this:
```python
In [4]: from pyspark.sql.functions import udf
In [5]: udf(lambda
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/16537
I don't think it is a good idea to think that this has little use because
it is a dumb mistake to pass something that isn't callable. In this case, it's
easy to accidentally reuse a name for a
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16537
I definitely think moving errors earlier is important (nothing is worse
than a 9 hour job that fails in the middle because of the wrong type). That
being said in this case the error isn't caught
Github user zero323 commented on the issue:
https://github.com/apache/spark/pull/16537
For me it is all about the bigger picture. I've been working with Python
for quite a while now (probably too long for my own good) and I am used to
two things:
- Language is
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72826/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16537
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #72826 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72826/testReport)**
for PR 16537 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16537
**[Test build #72826 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72826/testReport)**
for PR 16537 at commit
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16537
So I'm curious what the motivation is for adding these checks - looking in
the mailing list archives this doesn't seem like a common error (and the only
stack overflow post about this I saw was