[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2018-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16534 Recently we hit some problems while extending python udf, to support `asNondeterministic`, `asNonNullable`, etc. It's really confusing if the return type is just a python function.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2018-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16534 Is this still a problem? Now `UserDefinedFunction` defines `returnType` as a property.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-24 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 I agree, just in case someone does have an isinstance check (or similar) we should document the change in the release notes.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-24 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 Thanks @holdenk. I think it should be mentioned as a change of behavior in the release notes. We don't change API, and `UserDefinedFunction` is hardly public (it is not even included in the docs),

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-24 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 Merged to master, thanks @zero323

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-24 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 Great! Thanks for doing this, will merge to master :)

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-24 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 Don't worry, I get it :) The point is to make user experience better not worse, right? In practice: - These changes are pretty far from data, so overall impact is negligible and constant.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-23 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 Yes pydoc.help does depend on looking at the docstring on the type rather than the object :( Too bad the IPython magic isn't used in pydoc too. Sorry for all the back and forth, I'm just

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-23 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 `update_wrapper` works the same way as `wraps` - it will be useful for IPython, which uses relatively complex inspection rules, but will be useless anywhere when one depends on `pydoc.help`.
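
A minimal sketch of the `update_wrapper` behaviour discussed here, assuming a callable wrapper class. `CallableUdf` and `add_one` are hypothetical names for illustration, not Spark's `UserDefinedFunction`. It shows that `functools.update_wrapper` copies the docstring onto the instance while the class docstring stays unchanged, which is why tools that inspect the type rather than the object see only the generic text. How `pydoc.help` renders such an instance may differ between Python versions, so the sketch makes no claim about its output.

```python
import functools


class CallableUdf:
    """Generic docstring of the hypothetical wrapper class."""

    def __init__(self, func):
        self.func = func
        # Copy __name__, __doc__, __module__, etc. from func onto this
        # instance and set __wrapped__ to func.
        functools.update_wrapper(self, func)

    def __call__(self, *args):
        return self.func(*args)


def add_one(x):
    """Return x incremented by one."""
    return x + 1


udf_like = CallableUdf(add_one)

# The instance carries the wrapped function's docstring ...
print(udf_like.__doc__)
# ... but the type still carries the generic class docstring.
print(type(udf_like).__doc__)
```

Tools that follow the instance attributes (or `__wrapped__`) can recover the original docstring and signature; tools that document `type(obj)` cannot.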

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-23 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 I'm not sure about `wraps` but with `update_wrapper`, I tested it in a Jupyter kernel and it seems to give all of the docstring and signature information without adding another function dispatch

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-22 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 To a very limited extent. It can bring some useful information in IPython / Jupyter (maybe some other tools as well) but won't work with built-in `help` / `pydoc.help`. You can compare:

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-22 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 So it feels like we are adding an extra layer of indirection unnecessarily, could you use update_wrapper from functools directly on the udf object?

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-16 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 Sure, I'll take another closer look.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-16 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16534 Change looks good to me but I didn't look super carefully. @holdenk can you take a look at this?

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72966/

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Merged build finished. Test PASSed.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #72966 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72966/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #72966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72966/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Build finished. Test PASSed.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72949/

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #72949 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72949/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Merged build finished. Test PASSed.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72951/

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #72951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72951/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #72951 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72951/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #72949 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72949/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Merged build finished. Test PASSed.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #72242 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72242/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #72242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72242/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-26 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 @rxin I am not aware of any straightforward way of separating these two, but I focused on the docstrings anyway. The rationale is simple - I want to be able to: - Create packages

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-26 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16534 Is the goal to change the doc or the repl string? It might be useful to change the repl string but I'm not sure if it is worth changing the doc.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-26 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 Thanks @holdenk! Let's wait for another opinion (maybe @rxin) and if it is not acceptable I'll just close this and ask for closing the ticket. Theoretically we could define a constructor with

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-26 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 So I'm not super comfortable changing the return type (what about if user code has `isinstance` checks with `UserDefinedFunction`?) That being said if @davies or one of the other committers thinks
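
The `isinstance` concern can be illustrated with a small sketch, under the assumption that the factory starts returning a plain function instead of a class instance. `UdfObject`, `as_function`, and `return_type` are hypothetical names chosen for illustration; they are not Spark's API.

```python
import functools


class UdfObject:
    """Hypothetical stand-in for a UDF class."""

    def __init__(self, func, return_type="string"):
        self.func = func
        self.return_type = return_type

    def __call__(self, *args):
        return self.func(*args)


def as_function(udf_obj):
    """Wrap a UdfObject in a plain function that keeps the user's docstring."""

    @functools.wraps(udf_obj.func)
    def wrapper(*args):
        return udf_obj(*args)

    # Re-expose selected attributes so attribute access keeps working.
    wrapper.return_type = udf_obj.return_type
    return wrapper


def shout(s):
    """Upper-case a string."""
    return s.upper()


obj = UdfObject(shout)
fn = as_function(obj)

print(isinstance(obj, UdfObject))  # the object passes the type check
print(isinstance(fn, UdfObject))   # the wrapper function does not
print(fn.__doc__)                  # docstring of the user's function
```

Calls and attribute access behave the same for both forms, so only code that checks the concrete type would notice the change.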

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71723/

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Merged build finished. Test PASSed.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #71723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71723/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-20 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 @holdenk I used function arguments to make sure that public API, though not types, is preserved. Please let me know what you think.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #71723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71723/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Merged build finished. Test PASSed.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71685/

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #71685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71685/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #71685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71685/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Merged build finished. Test FAILed.

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #71680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71680/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16534 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71680/

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16534 **[Test build #71680 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71680/testReport)** for PR 16534 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-12 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 @holdenk Indeed. Not the most fortunate moment for making a bunch of connected PRs :)

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-12 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 @holdenk I don't think it should go to the point release at all (same as https://github.com/apache/spark/pull/16533 which, depending on the resolution, may introduce new functionality or breaking

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-12 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 It's a bit hard to follow up with those during the JIRA maintenance window - I'll follow up after JIRA comes back online :)

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-12 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 Improving UDF Docstrings for Python seems like a good idea, but at the cost of breaking the public API in a point release I think it might make sense for us to do the more work approach unless