[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19027 Sure - I think there are a number of different situations reported in the JIRA that could be separated into different fixes. Let me know what I can help with! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19027 Merged to master.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19027 Will merge this one BTW. Sounds like we are fine.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19027 That's fine, @ueshin and @felixcheung. Adding a few tests with `numpy` types might be an extra and (possibly) unrelated bit; on the other hand, such a test is easy to add and might cover a (possibly) common case that users would try first. Of course, supporting `numpy` types properly should be orthogonal.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19027 @felixcheung I'm sorry if I'm missing something but it sounds like it's a different problem from this pr?
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19027 It's not specific to it, but it is fairly common when people call numpy in a UDF and return its scalar type as-is. These scalars "look" like Python native types (numpy.float_ vs float). That's the case reported in JIRA and what I've run into.
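To illustrate how a NumPy scalar can stand in for a native value until something checks its exact type, here is a minimal sketch. The helper name `to_native` is hypothetical and is not part of PySpark or of this PR; it assumes only that NumPy scalars expose an `.item()` method returning the matching built-in value. The NumPy part is skipped when NumPy is not installed.

```python
def to_native(value):
    """Convert a NumPy-style scalar to a built-in Python scalar.

    Hypothetical helper for illustration only; not a PySpark API.
    """
    # Values whose exact type is already a built-in pass through untouched.
    if type(value) in (bool, int, float, str, bytes, type(None)):
        return value
    # NumPy scalars expose .item(), which returns the matching built-in value.
    item = getattr(value, "item", None)
    return item() if callable(item) else value


try:
    import numpy as np
except ImportError:  # NumPy is optional for this sketch
    np = None

if np is not None:
    x = np.float64(1.5)
    # The scalar behaves like a float, but its exact type is a NumPy type.
    print(type(x).__name__, "->", type(to_native(x)).__name__)
    flag = np.bool_(True)
    print(type(flag).__name__, "->", type(to_native(flag)).__name__)
```

A UDF that returns `to_native(result)` instead of `result` hands back a plain Python value; whether that is the right place to convert is a design choice outside the scope of this thread.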
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19027 LGTM. Btw, I'm just curious why we need tests with `numpy` here.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19027 I will probably look into the problem in the near future, including hard dependencies and so on. I took a quick look and I think I need more time, but yes, it apparently looks like a valid point.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19027 I'm OK without the test since this is unlikely to break in the future. We do have tests that (optionally) depend on numpy (and Arrow) - it seems we should be able to take on dependencies more formally so we could test them properly?
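A test that only optionally depends on numpy can be sketched with the standard `unittest` skip decorator, as below. This is a minimal illustration, not the project's actual test code; the class name `NumpyScalarTest`, the flag `HAVE_NUMPY`, and the helper `run_suite` are hypothetical.

```python
import unittest

try:
    import numpy as np
    HAVE_NUMPY = True
except ImportError:  # the test below is skipped in this case
    np = None
    HAVE_NUMPY = False


class NumpyScalarTest(unittest.TestCase):  # hypothetical test case
    @unittest.skipIf(not HAVE_NUMPY, "NumPy is not installed")
    def test_float64_subclasses_float(self):
        # numpy.float64 is a subclass of the built-in float.
        self.assertIsInstance(np.float64(1.0), float)


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NumpyScalarTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

With this pattern the suite stays green on machines without NumPy, at the cost that the NumPy path goes untested there, which is the trade-off raised in the comment above.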
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19027 Merged build finished. Test PASSed.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19027 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81056/ Test PASSed.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19027 **[Test build #81056 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81056/testReport)** for PR 19027 at commit [`4abaef7`](https://github.com/apache/spark/commit/4abaef78087a3b2ee6c86f7ea720ea356fe80353). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19027 **[Test build #81056 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81056/testReport)** for PR 19027 at commit [`4abaef7`](https://github.com/apache/spark/commit/4abaef78087a3b2ee6c86f7ea720ea356fe80353).
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19027 Oops, looks like I need to check if numpy is available. Let me rather take this one out here, as I am trying to whitelist `basestring`, if you don't mind. I tested it with numpy locally for your concern, @felixcheung, and it looks fine.
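The whitelisting idea can be sketched roughly as follows: accept only column objects and strings, and reject everything else with a clear error. This is an illustration under assumptions, not the code from this PR; the `Column` class here is a stand-in (not `pyspark.sql.Column`), `to_column` is a hypothetical name, and `str` is used as the Python 3 counterpart of Python 2's `basestring`.

```python
class Column:
    """Stand-in for a column object (hypothetical; not pyspark.sql.Column)."""

    def __init__(self, name):
        self.name = name


def to_column(arg):
    """Accept only Column instances or column-name strings (hypothetical helper)."""
    if isinstance(arg, Column):
        return arg
    if isinstance(arg, str):  # Python 3 counterpart of Python 2's basestring
        return Column(arg)
    # Anything else (floats, bools, lists, ...) is rejected up front.
    raise TypeError(
        "Invalid argument, not a string or column: {!r} of type {}".format(
            arg, type(arg).__name__
        )
    )


if __name__ == "__main__":
    print(to_column("age").name)
    try:
        to_column(1.5)
    except TypeError as exc:
        print("rejected:", exc)
```

Rejecting unknown types at the call boundary turns a confusing downstream failure into an immediate `TypeError` that names the offending value.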
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19027 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81053/ Test FAILed.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19027 **[Test build #81053 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81053/testReport)** for PR 19027 at commit [`5e21a7e`](https://github.com/apache/spark/commit/5e21a7ed7fc409a94e0c5589962761d95c342a27). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19027 Merged build finished. Test FAILed.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19027 **[Test build #81053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81053/testReport)** for PR 19027 at commit [`5e21a7e`](https://github.com/apache/spark/commit/5e21a7ed7fc409a94e0c5589962761d95c342a27).
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19027 Thanks @felixcheung and @holdenk. I just added a simple test with numpy.float.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/19027 I like this approach @HyukjinKwon :D!
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19027 Cool, this looks to me like a very reasonable fix. Could we perhaps add a test for numpy.bool_ or numpy.float_ (that it should fail)?
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19027 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81030/ Test PASSed.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19027 Merged build finished. Test PASSed.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19027 **[Test build #81030 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81030/testReport)** for PR 19027 at commit [`d14c2cc`](https://github.com/apache/spark/commit/d14c2cc9aabfbfa2294f7e4937704fc63717e321). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19027 cc @zero323, @rdblue, @nchammas, @holdenk, @ueshin and @felixcheung. Could you take a look please? I think it is a small fix but the advantage is quite large.
[GitHub] spark issue #19027: [SPARK-19165][PYTHON][SQL] PySpark APIs using columns as...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19027 **[Test build #81030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81030/testReport)** for PR 19027 at commit [`d14c2cc`](https://github.com/apache/spark/commit/d14c2cc9aabfbfa2294f7e4937704fc63717e321).