[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-07-18 Thread sun-rui
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/12836 no, go ahead to submit one:) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wi

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-07-18 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK Not as far as I know

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-07-18 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 @shivaram, @sun-rui, I was wondering if someone created a JIRA for the issue described here: https://github.com/apache/spark/pull/12836#issuecomment-225403054

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-19 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Thanks for the quick response. I'll create one.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-19 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK I am not quite sure. Maybe you could create a new JIRA for gapply's programming guide.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-19 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 @vectorijk, should I do the pull request for the same JIRA - https://issues.apache.org/jira/browse/SPARK-15672, or should I create a new JIRA for the programming guide?

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-17 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK Cool~ I think it is better to open a separate PR to track the `gapply` programming guide.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-17 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Hi @vectorijk, thanks for asking. I think in a separate PR. Do you think including it in #13660 would be better?

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-17 Thread vectorijk
Github user vectorijk commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK How do you want to include the programming guide for `gapply`: in a separate PR or in #13660?

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 Merging this to master and branch-2.0

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60621/ Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60621 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60621/consoleFull)** for PR 12836 at commit [`fe36d24`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60621/consoleFull)** for PR 12836 at commit [`fe36d24`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread sun-rui
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/12836 @shivaram, LGTM

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Thanks, @shivaram and @sun-rui. Yes, I can work on the programming guide for gapply.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK Thanks again for the updates to this PR and thanks @sun-rui for reviewing. The code changes LGTM -- the refactoring of worker.R is especially useful for readability. I just had a

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60574/ Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60574 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60574/consoleFull)** for PR 12836 at commit [`4d1cc6b`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60574/consoleFull)** for PR 12836 at commit [`4d1cc6b`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread sun-rui
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK, there is one comment left unaddressed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-14 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 I addressed your comments, @sun-rui. Please let me know if you have any others.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60392/ Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60392 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60392/consoleFull)** for PR 12836 at commit [`91e1944`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60391/ Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60391 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60391/consoleFull)** for PR 12836 at commit [`1aa368d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60392/consoleFull)** for PR 12836 at commit [`91e1944`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60391/consoleFull)** for PR 12836 at commit [`1aa368d`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-12 Thread sun-rui
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/12836 @shivaram, I think we are reaching the final version :). It would be good if you could do a detailed review of the examples and test cases.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-12 Thread sun-rui
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/12836 Yes, let's do it in a separate PR.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-11 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/12836 We can do it in a separate pr -- it'd be great to move all Python and R methods over to a single class. Otherwise it has two major problems: 1. Those methods are public for Java. 2. It

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60348/ Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60348/consoleFull)** for PR 12836 at commit [`d51441f`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-11 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 @rxin I think in this case we need access to grouping expression and DataFrame from within the RelationalGroupedDataset class. One solution could be to move the function `flatMapGroupsInR` to the h

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60348/consoleFull)** for PR 12836 at commit [`d51441f`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-11 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Thanks @liancheng and @rxin! With respect to your point, @rxin, about "private[sql] signature in public APIs": dapply added that signature to `Dataset.scala` and gapply adds it to `Rela

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/12836 The SQL part of the changes looks generally good except for a few styling issues.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/12836 Can we avoid adding private[sql] methods in public APIs? Those have no effect in Java. Maybe create a helper method for all the R methods.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 Thanks @liancheng for the clarification and @NarineK for implementing the override. I just had one minor comment. @sun-rui Can you take one final look? Since we still have not cut RC1, we mi

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK @shivaram Sorry for the late reply. Overriding `stringArgs` is the correct solution for this issue.
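Editor's illustration (not part of the thread): a minimal Python sketch of the idea behind the `stringArgs` fix discussed here. It is a toy model, not Spark's Scala code. The names `PlanNode`, `RUdfNode` and `string_args` are hypothetical stand-ins, and the payload is an assumed placeholder for a serialized R function. It shows how a plan node's string form can balloon when a byte array is printed, and how overriding the arguments used for printing keeps the output small.

```python
class PlanNode:
    """Toy stand-in for a query plan node (hypothetical, not Spark's TreeNode)."""

    def __init__(self, name, **fields):
        self.name = name
        self.fields = fields

    def string_args(self):
        # Default behaviour: every field takes part in the string form.
        return list(self.fields.values())

    def __str__(self):
        return self.name + " " + ", ".join(str(arg) for arg in self.string_args())


class RUdfNode(PlanNode):
    """Node that carries a serialized function as a large byte payload."""

    def string_args(self):
        # Override: leave byte payloads out so printing the plan stays cheap.
        return [
            value
            for value in self.fields.values()
            if not isinstance(value, (bytes, bytearray))
        ]


# Assumed placeholder for a serialized R closure.
payload = bytes(1_000_000)

naive = PlanNode("MapPartitionsInR", func=payload, output="[a, b]")
fixed = RUdfNode("MapPartitionsInR", func=payload, output="[a, b]")

print(len(str(naive)) > len(payload))  # the naive string is larger than the payload itself
print(str(fixed))                      # only the non-binary fields are printed
```

In this toy model the naive node's string grows with the payload, which mirrors how building a plan string could exhaust the heap, while the overriding node prints only its small fields.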

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60281/ Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60281/consoleFull)** for PR 12836 at commit [`0ca74fd`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60276/ Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test PASSed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60276 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60276/consoleFull)** for PR 12836 at commit [`20a1c37`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60281/consoleFull)** for PR 12836 at commit [`0ca74fd`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60276 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60276/consoleFull)** for PR 12836 at commit [`20a1c37`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-09 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Hi @sun-rui, hi @shivaram, I've overridden `stringArgs` and pushed my changes to the following branch. I haven't created a JIRA yet. https://github.com/apache/spark/commit/939dbd

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60223/ Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60223 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60223/consoleFull)** for PR 12836 at commit [`afa385d`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-08 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Sure, let me try to override stringArgs and give it a try.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-08 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 I think I found the commit which causes this problem - https://github.com/apache/spark/commit/6dde27404cb3d921d75dd6afca4b383f9df5976a added toString to include arrays and the output we get is from

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-07 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Thank you for the quick responses @sun-rui and @shivaram. Here is how the `dataframe.queryExecution.toString` printout starts: == Parsed Logical Plan == 'SerializeFromObject

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread sun-rui
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/12836 I guess the byte array of the serialized R function is dumped. Let me find which commit caused this. I guess something like overriding toString may solve this.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 I don't know what could cause this - Do we have the beginning of the string ? My guess is `MapPartitions` or one of the nodes in the plan is calling `toString` on a byte Array that contains some R

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Do you know what exactly caused this?

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Hi @shivaram, hi @sun-rui, surprisingly the `dataframe.queryExecution.toString` output for both dapply and gapply is prepended by a huge array, which I'm not able to understand. It seems that recent co

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 I can print out the query plan on the Scala side and see what it looks like for that example.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 The out of memory error seems to be coming from trying to form a `.toString` of the query plan. Is there something in the query plan for R UDFs that might create a very large string ? Als

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60028/ Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60028 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60028/consoleFull)** for PR 12836 at commit [`00a091e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60028/consoleFull)** for PR 12836 at commit [`00a091e`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Not sure why it fails. It fails for my new test case on the iris dataset. The resulting DataFrame has 35x2 dimensions.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Locally, run-tests.sh runs successfully, but it fails on Jenkins ...

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60020/ Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60020 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60020/consoleFull)** for PR 12836 at commit [`e4fa8e6`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60020 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60020/consoleFull)** for PR 12836 at commit [`e4fa8e6`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 @shivaram, I didn't change the code, but merged with master, because before that the build was failing due to some pyspark tests not passing. After today's merge, when I run gapply tes

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 The error was ``` 1. Error: gapply() on a DataFrame -- java.lang.OutOfMemoryError: Java heap space ``` @NarineK Do you think there was a

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60013/ Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60013/consoleFull)** for PR 12836 at commit [`249568e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 Yeah I think we can still make this to 2.0 -- Are there any other comments @sun-rui ? Also pinging @davies / @rxin again for a SQL reviewer to take a look at this

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60013/consoleFull)** for PR 12836 at commit [`249568e`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59998/ Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #59998 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59998/consoleFull)** for PR 12836 at commit [`0a22042`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #59998 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59998/consoleFull)** for PR 12836 at commit [`0a22042`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59996/ Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #59996 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59996/consoleFull)** for PR 12836 at commit [`afa7e4e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #59996 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59996/consoleFull)** for PR 12836 at commit [`afa7e4e`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59993/ Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #59993 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59993/consoleFull)** for PR 12836 at commit [`46df2ee`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #59993 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59993/consoleFull)** for PR 12836 at commit [`46df2ee`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #59992 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59992/consoleFull)** for PR 12836 at commit [`cbde29a`](https://github.com/apache/spark/commit/

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59992/ Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #59992 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59992/consoleFull)** for PR 12836 at commit [`cbde29a`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Merged build finished. Test FAILed.

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-04 Thread NarineK
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/12836 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59991/ Test FAILed.
