[GitHub] spark pull request: [SPARK-5569] [STREAMING] fix ObjectInputStream...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8955#issuecomment-144644388 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5569] [STREAMING] fix ObjectInputStream...
GitHub user maxwellzdm opened a pull request: https://github.com/apache/spark/pull/8955

[SPARK-5569] [STREAMING] fix ObjectInputStreamWithLoader for supporting load array classes.

When using the Kafka DirectStream API to create a checkpoint and then restoring the saved checkpoint on restart, a ClassNotFoundException occurs. The reason is that ObjectInputStreamWithLoader extends ObjectInputStream and overrides its resolveClass method, but instead of using Class.forName(desc, false, loader), Spark uses loader.loadClass(desc) to load the class, which does not work with array classes. For example, Class.forName("[Lorg.apache.spark.streaming.kafka.OffsetRange;", false, loader) works well, while loader.loadClass("[Lorg.apache.spark.streaming.kafka.OffsetRange;") throws a ClassNotFoundException. Details of the difference between Class.forName and loader.loadClass can be found here: http://bugs.java.com/view_bug.do?bug_id=6446627

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maxwellzdm/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8955.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #8955

commit 929fec4445d8857e1d7833c9c848ad18226f60e9
Author: DEMING ZHU
Date: 2015-10-01T07:24:38Z

fix ObjectInputStreamWithLoader for supporting load array classes.
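The difference described in the pull request above can be reproduced outside Spark. The following is a minimal sketch, not the patch itself: the class `ArrayClassLoadingDemo`, the nested `LoaderAwareObjectInputStream`, and the helper methods are hypothetical names chosen for illustration, and `String[]` stands in for the `OffsetRange` array. It assumes a JDK where `ClassLoader.loadClass` rejects array names, which is the behavior the linked bug report describes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

class ArrayClassLoadingDemo {

    /** Hypothetical stream that resolves classes through a given loader via Class.forName. */
    static class LoaderAwareObjectInputStream extends ObjectInputStream {
        private final ClassLoader loader;

        LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
            super(in);
            this.loader = loader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            try {
                // Class.forName understands array names such as "[Ljava.lang.String;".
                return Class.forName(desc.getName(), false, loader);
            } catch (ClassNotFoundException e) {
                // Fall back to the default lookup, which also handles primitive types.
                return super.resolveClass(desc);
            }
        }
    }

    /** Returns true if Class.forName can resolve the name with the given loader. */
    static boolean loadsViaForName(String name, ClassLoader loader) {
        try {
            Class.forName(name, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Returns true if loader.loadClass can resolve the name. */
    static boolean loadsViaLoadClass(String name, ClassLoader loader) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Serializes a value and reads it back through the loader-aware stream. */
    static Object roundTrip(Object value, ClassLoader loader)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new LoaderAwareObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()), loader)) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader loader = ArrayClassLoadingDemo.class.getClassLoader();
        String arrayName = "[Ljava.lang.String;";
        System.out.println("Class.forName resolves array name: " + loadsViaForName(arrayName, loader));
        System.out.println("loadClass resolves array name: " + loadsViaLoadClass(arrayName, loader));
        String[] restored = (String[]) roundTrip(new String[] {"a", "b"}, loader);
        System.out.println("Restored array length: " + restored.length);
    }
}
```

Note the trailing semicolon in the array class name: the JVM spells an array of reference type as `[L<binary name>;`, so a name without it would fail with either lookup.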
[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8922#issuecomment-144645064 Merged build started.
[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8922#issuecomment-144645045 Merged build triggered.
[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8922#issuecomment-144646240 [Test build #43147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43147/consoleFull) for PR 8922 at commit [`388de88`](https://github.com/apache/spark/commit/388de88069f335a3db55aae604918e52d26a4071).
[GitHub] spark pull request: [SPARK-7012][SQL] Add support for NOT NULL mod...
Github user smola commented on the pull request: https://github.com/apache/spark/pull/8746#issuecomment-144693706 @sabhyankar Great! The implementation looks good. Could you add a test case for it?
[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8956#issuecomment-144701972 [Test build #43149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43149/consoleFull) for PR 8956 at commit [`f27288e`](https://github.com/apache/spark/commit/f27288e29211d47e24767cf7731914cdf9865bc1).
[GitHub] spark pull request: [SPARK-9570][Docs][YARN]Consistent recommendat...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8385#issuecomment-144707850 @nssalian I'd like to resolve this at last. This has outstanding comments and needs a rebase. Do you want to do that or should I take over?
[GitHub] spark pull request: [SPARK-10724] [SQL] SQL's floor() returns DOUB...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8893#issuecomment-144707576 It looks like @chenghao-intel is farther along towards a simpler fix in https://github.com/apache/spark/pull/8933/files Do you mind closing this PR?
[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8956#issuecomment-144726375 [Test build #43149 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43149/console) for PR 8956 at commit [`f27288e`](https://github.com/apache/spark/commit/f27288e29211d47e24767cf7731914cdf9865bc1).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class StringFilter(`
[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/8956

[SPARK-10895][SQL] Push down string filters to Parquet

JIRA: https://issues.apache.org/jira/browse/SPARK-10895

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 parquet-stringfilter-pushdown

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8956.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #8956

commit f27288e29211d47e24767cf7731914cdf9865bc1
Author: Liang-Chi Hsieh
Date: 2015-10-01T11:14:14Z

Push down string filters to Parquet.
[GitHub] spark pull request: SPARK-10883: use relative location of scalasty...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8949#issuecomment-144708587 Per comments in the JIRA, I think this is unnecessary as you can use the standard Maven syntax to build modules correctly. Do you mind closing this PR?
[GitHub] spark pull request: [SPARK-10886] [Documentation] Random RDD creat...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8951#issuecomment-144708911 Do you mind closing this PR?
[GitHub] spark pull request: [SPARK-9570][Docs][YARN]Consistent recommendat...
Github user nssalian commented on the pull request: https://github.com/apache/spark/pull/8385#issuecomment-144723093 @srowen please go ahead. Thank you.
[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8956#issuecomment-144726522 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43149/ Test PASSed.
[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8956#issuecomment-144726518 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8922#issuecomment-144697959 ping @liancheng @marmbrus
[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8956#issuecomment-144700733 Merged build started.
[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8758#issuecomment-144706624 Yeah, the current script does blindly 'pass' the first arg as an environment variable. It never passed any args at all to the arg-parsing code, which seems like an oversight. Instead it sent a dummy (?) argument 1 for some reason -- is this historical? So I generally agree with plumbing through the arguments to the argument parsing code. By the way, don't we need to remove that "1" argument then? From there it seemed straightforward to attempt to retain backwards compatibility for the script, such that passing just a dir as the single arg still works (and generates a warning). This formulation would in fact stop at the first such argument. I would not be against just removing support for this naked argument, as it has long since been deprecated, if anyone felt strongly about it. Aside from the "1" issue, this LGTM.
[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8956#issuecomment-144700717 Merged build triggered.
[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8668#issuecomment-144707186 @KaiXinXiaoLei do you mind closing this PR?
[GitHub] spark pull request: [SPARK-10772][Streaming][Scala]: NullPointerEx...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8881#issuecomment-144707086 [Test build #1833 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1833/consoleFull) for PR 8881 at commit [`cba60ed`](https://github.com/apache/spark/commit/cba60ed77e1c4812617667f5d1d3e73e588e9f96).
[GitHub] spark pull request: [SPARK-10582] using dynamic-executor-allocatio...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8737#issuecomment-144707252 @KaiXinXiaoLei are you working on this, or else do you mind closing this PR? I'm also not clear if it's the same thing as https://github.com/apache/spark/pull/8945
[GitHub] spark pull request: [SPARK-10772][Streaming][Scala]: NullPointerEx...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8881#issuecomment-144707377 [Test build #1833 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1833/console) for PR 8881 at commit [`cba60ed`](https://github.com/apache/spark/commit/cba60ed77e1c4812617667f5d1d3e73e588e9f96).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8954#issuecomment-144732292 [Test build #43150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43150/consoleFull) for PR 8954 at commit [`a72d3ec`](https://github.com/apache/spark/commit/a72d3ec526898f41b88ad907c468c843962bb965).
[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8758#issuecomment-144738730 Aha you're right about "1". It can stay: ``` usage="Usage: spark-daemon.sh [--config ] (start|stop|submit|status) " ``` Yes, it doesn't pass args now, but isn't that a problem? Clearly `HistoryServer` has code to parse args and I don't see how those are plumbed through.
[GitHub] spark pull request: [SPARK-10058][Core][Tests]Fix the flaky tests ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/8946
[GitHub] spark pull request: [SPARK-10889] [Streaming] Bump KCL to add Mill...
GitHub user akatz opened a pull request: https://github.com/apache/spark/pull/8957

[SPARK-10889] [Streaming] Bump KCL to add MillisBehindLatest metric

I don't believe the API changed at all.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/akatz/spark kcl-upgrade

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8957.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #8957

commit 6abc571a1a1d0a06565a0ec9fefa7bcb1ce69cfe
Author: Avrohom Katz
Date: 2015-09-30T22:26:36Z

Bump KCL to add MillisBehindLatest metric
[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8954#issuecomment-144734249 Merged build finished. Test FAILed.
[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8954#issuecomment-144734253 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43150/ Test FAILed.
[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/8958

[SPARK-10900][Streaming]Add output operation events to StreamingListener

Add output operation events to StreamingListener so as to implement the following UI features:

1. Progress bar of a batch in the batch list.
2. Be able to display output operation `description` and `duration` when there is no spark job in a Streaming job.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark output-operation-events

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8958.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #8958

commit 1ffae9302bc699bba750693ba3d08327e0b62f57
Author: zsxwing
Date: 2015-10-01T14:42:58Z

Add output operation events to StreamingListener
[GitHub] spark pull request: [SPARK-10889] [Streaming] Bump KCL to add Mill...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8957#issuecomment-144731855 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/8758#issuecomment-144732338 You guys are misinterpreting the script. That `1` is not an argument to the HistoryServer, it's an argument to `spark-daemon.sh`. The script never passes any arguments to the HistoryServer itself. This change is *breaking* the command line parsing in `HistoryServerArguments`.
[GitHub] spark pull request: [SPARK-10300] [build] [tests] Add support for ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/8437#issuecomment-144733008 @JoshRosen #8775
[GitHub] spark pull request: [SPARK-10058][Core][Tests]Fix the flaky tests ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/8946#issuecomment-144738827 Merged to master and branch-1.5, thanks!
[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/8758#issuecomment-144740121 The change in `start-history-server.sh` adds the command line parameters, and those are propagated to the java process. That part of the change is fine. The broken part is the one I commented on. It's breaking command line parsing, because if you provide an invalid argument, it treats it as the log directory and stops parsing the rest of the command line. That behavior has never existed and is actually broken.
[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8758#issuecomment-144741977 Good, yes passing through the args is right. Right now if you run `start-history-server.sh foo` you will successfully set the log directory to foo because of what the script does. That's what I'm trying to preserve. Or am I also missing something there? Right now there is no other arg parsing to break, right? Nothing is passed or parsed otherwise. I tend to agree it's a little janky, but hey, it generates a warning. Compatibility is good if it's cheap and it seems easy here. But then again it's been deprecated forever. So I don't feel strongly about keeping it if there are strong feelings for retiring this.
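One way to reconcile the two positions in this thread (keep `start-history-server.sh foo` working, but never let an invalid argument be swallowed as the log directory) might look like the sketch below. It is an illustration only and not the code under review: the class `HistoryArgsSketch`, its `Parsed` result type, and the flag names `--dir` and `--properties-file` are assumptions made for this example, not taken from the thread.

```java
class HistoryArgsSketch {

    /** Parse result; all field names are illustrative. */
    static final class Parsed {
        final String logDir;
        final String propertiesFile;
        final boolean usedDeprecatedForm;

        Parsed(String logDir, String propertiesFile, boolean usedDeprecatedForm) {
            this.logDir = logDir;
            this.propertiesFile = propertiesFile;
            this.usedDeprecatedForm = usedDeprecatedForm;
        }
    }

    static Parsed parse(String[] args) {
        // Legacy form: accept a bare log directory only when it is the sole argument.
        if (args.length == 1 && !args[0].startsWith("-")) {
            System.err.println("Warning: passing the log directory as a bare argument is deprecated.");
            return new Parsed(args[0], null, true);
        }

        String logDir = null;
        String propertiesFile = null;
        for (int i = 0; i < args.length; i++) {
            String flag = args[i];
            switch (flag) {
                case "--dir":
                    logDir = valueAfter(args, i++);
                    break;
                case "--properties-file":
                    propertiesFile = valueAfter(args, i++);
                    break;
                default:
                    // Unknown input is an error; it is never reinterpreted as the log directory.
                    throw new IllegalArgumentException("Unrecognized argument: " + flag);
            }
        }
        return new Parsed(logDir, propertiesFile, false);
    }

    /** Returns the value following the flag at flagIndex, or fails if it is missing. */
    private static String valueAfter(String[] args, int flagIndex) {
        if (flagIndex + 1 >= args.length) {
            throw new IllegalArgumentException("Missing value for " + args[flagIndex]);
        }
        return args[flagIndex + 1];
    }
}
```

With this shape, `parse(new String[] {"foo"})` still yields `foo` as the log directory with a warning, while `parse(new String[] {"bogus", "--dir", "/tmp/logs"})` fails instead of silently stopping at `bogus`.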
[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8958#issuecomment-144754459 Merged build triggered.
[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8958#issuecomment-144754489 Merged build started.
[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/8958#issuecomment-14475 /cc @tdas
[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8958#issuecomment-144756156 [Test build #43151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43151/consoleFull) for PR 8958 at commit [`ba8f9b8`](https://github.com/apache/spark/commit/ba8f9b8a9aa53a88a87592cc833180ee4aaf6ee8).
[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8954#issuecomment-144730502 Merged build started.
[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8954#issuecomment-144730473 Merged build triggered.
[GitHub] spark pull request: [SPARK-10058][Core][Tests]Fix the flaky tests ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/8946#issuecomment-144732736 LGTM.
[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8954#issuecomment-144734228 [Test build #43150 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43150/console) for PR 8954 at commit [`a72d3ec`](https://github.com/apache/spark/commit/a72d3ec526898f41b88ad907c468c843962bb965).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class KinesisBackedBlockRDD[T](`
[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/8758#issuecomment-144742060 Hmm, I think I get what you're trying to do. You're trying to implement the script's command line handling in `HistoryServerArguments` (the old `if [ $# != 0 ]; then` code you're removing). I don't think you should do that. Instead, the script itself should handle this backwards compatibility, much like it did before. Instead of setting an env variable, it can add command line arguments to the history server. But the change in `HistoryServerArguments` is broken the way it is now.
[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/8758#issuecomment-144749562 Hi, me again. Looking at the code once more, I think it's ok if you want to make this change in the scala code and not the shell script (so, e.g., you can unit test it), but it cannot be done in the current spot. Basically, you need something like this:

    if (args.length == 1) {
      // Print deprecation warning, set log dir.
    } else {
      parse(args.toList)
    }

That way the existing command-line parsing is not broken. With your change, something like "--properties-file foo blah --dir /path" will successfully be parsed, and the log directory will be set to "blah", which is not what should happen, since that's either an invalid command line, or the directory should be set to "/path". (I vote for invalid command line.)
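The suggestion above can be sketched as a small self-contained Scala object. This is only an illustration, not the code in the PR: `HistoryArgsSketch`, `Parsed`, and `parseOptions` are hypothetical names, and the only options modelled are `--dir` and `--properties-file`, taken from the example command line in the comment. A lone argument is treated as the deprecated log-directory form; anything else goes through a strict parser that rejects stray tokens.

```scala
import scala.annotation.tailrec

// Hypothetical sketch; none of these names come from the Spark code base.
object HistoryArgsSketch {
  // Values collected from the command line.
  final case class Parsed(logDir: Option[String] = None,
                          propertiesFile: Option[String] = None)

  // Strict option parser: any token that is not a known option (or its value)
  // makes the whole command line invalid.
  @tailrec
  def parseOptions(args: List[String], acc: Parsed = Parsed()): Either[String, Parsed] =
    args match {
      case "--dir" :: value :: tail =>
        parseOptions(tail, acc.copy(logDir = Some(value)))
      case "--properties-file" :: value :: tail =>
        parseOptions(tail, acc.copy(propertiesFile = Some(value)))
      case Nil =>
        Right(acc)
      case other :: _ =>
        Left(s"Unrecognized argument: $other")
    }

  def parse(args: Array[String]): Either[String, Parsed] =
    if (args.length == 1) {
      // Deprecated form: a single bare argument is taken as the log directory.
      Console.err.println("WARNING: passing the log directory as the only argument is deprecated")
      Right(Parsed(logDir = Some(args(0))))
    } else {
      parseOptions(args.toList)
    }
}
```

Under these assumptions, `HistoryArgsSketch.parse(Array("--properties-file", "foo", "blah", "--dir", "/path"))` is rejected at `blah` instead of silently using it as the log directory, which matches the "invalid command line" outcome voted for above.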
[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8958#issuecomment-144771166 [Test build #43151 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43151/console) for PR 8958 at commit [`ba8f9b8`](https://github.com/apache/spark/commit/ba8f9b8a9aa53a88a87592cc833180ee4aaf6ee8).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class StreamingListenerOutputOperationStarted(`
   * `case class StreamingListenerOutputOperationCompleted(`
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144788137 Yeah that would be great.
[GitHub] spark pull request: [SPARK-10886] [Documentation] Random RDD creat...
Github user jayantshekhar closed the pull request at: https://github.com/apache/spark/pull/8951
[GitHub] spark pull request: [SPARK-10886] [Documentation] Random RDD creat...
Github user jayantshekhar commented on the pull request: https://github.com/apache/spark/pull/8951#issuecomment-144787366 Thanks Sean! Closing it.
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144790511 Merged build triggered.
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144782174 Thanks @NarineK - I left some inline comments regarding the roxygen docs. Regarding the sqlContext reuse, I think we should do that in a separate JIRA. Could you file one for that?
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user NarineK commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144787827 Hi Shivaram, should I change the example for createDataFrame to use iris too? Thanks
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144790493 Merged build triggered.
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144790574 Merged build started.
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144790538 Merged build started.
[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8922#issuecomment-144776282 I will post performance comparison later.
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/8952#discussion_r40933250

    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -149,6 +149,26 @@ createDataFrame <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0
       dataFrame(sdf)
     }
    +#' Create a DataFrame from an RDD
    +#'
    +#' Converts an RDD to a DataFrame by infer the types.
    +#'
    +#' @param sqlContext A SQLContext
    +#' @param data An RDD or list or data.frame
    +#' @param schema a list of column names or named list (StructType), optional
    +#' @return an DataFrame
    +#' @export
    +#' @examples
    +#'\dontrun{
    +#' sc <- sparkR.init()
    +#' sqlContext <- sparkRSQL.init(sc)
    +#' rdd <- lapply(parallelize(sc, 1:10), function(x) list(a=x, b=as.character(x)))
    +#' df <- as.DataFrame(sqlContext, rdd)
    +#' }
    +as.DataFrame <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0){
    --- End diff --

Yeah I think we can do that, but let's do this change in a separate JIRA / PR
[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/8922#issuecomment-144772135 I'm a little skeptical that this is worth the complexity. Do you have real workloads that this speeds up significantly?
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/8952#discussion_r40933374

    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -149,6 +149,26 @@ createDataFrame <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0
       dataFrame(sdf)
     }
    +#' Create a DataFrame from an RDD
    +#'
    +#' Converts an RDD to a DataFrame by infer the types.
    --- End diff --

Could you remove the reference to RDD in the comments here (and in createDataFrame)? You could just make it `Converts R data.frame or list into DataFrame`. In the same spirit, could you use `iris` or something like that in the example (instead of the lapply)?
[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8958#issuecomment-144771352 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43151/ Test PASSed.
[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8958#issuecomment-144771351 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144792949 [Test build #43152 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43152/consoleFull) for PR 8952 at commit [`336b3a9`](https://github.com/apache/spark/commit/336b3a9d94dec34db4e5eea3a21a7d2ac1d0ce1a).
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user NarineK commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144794849 Here is the jira for making sqlContext global. https://issues.apache.org/jira/browse/SPARK-10903
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144796830 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10261][Documentation, ML] Fixed @Since ...
Github user tijoparacka commented on the pull request: https://github.com/apache/spark/pull/8554#issuecomment-144803543 yu-iskw, could you please review this?
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144803339 Merged build triggered.
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144816142 looks good.
[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/8959#issuecomment-144818093 Hi @tgravescs, This fixes the problem but I think it's actually just masking a subtle bug elsewhere. In `getUserClasspath`, there's this code:

    val mainUri = mainJar.orElse(Some(APP_JAR)).map(new URI(_))

That is not actually doing the right thing in certain cases. When invoked from a `Client` instance, the `mainJar` argument comes from `ClientArguments.userJar`, so it's never going to be `None` (and thus it always returns the name of the original jar instead of `APP_JAR`). The "cleanest" thing would be to have just a single version of `getUserClasspath` that gets things from `SparkConf`, but that runs into the problem that the conf has not yet been updated when `populateClasspath` is called. I think changing that `map` call to something like the following would fix the source of the problem:

    .map { path =>
      val uri = new URI(path)
      if (uri.getScheme == LOCAL_SCHEME) new URI(uri.getPath()) else uri
    }

(Not tested.) Does that make sense? Could you try that out?
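For readers who want to try the proposed `map` change in isolation, here is a minimal sketch using only `java.net.URI`. The object name `UserClasspathSketch` and the method `mainUri` are hypothetical, and the values given to `LocalScheme` and `AppJar` are assumptions standing in for `LOCAL_SCHEME` and `APP_JAR`; like the snippet in the comment, it has not been run against Spark itself.

```scala
import java.net.URI

// Hypothetical stand-in for the logic discussed above; not Spark's actual code.
object UserClasspathSketch {
  // Assumed values, used only for illustration.
  val LocalScheme = "local"
  val AppJar = "__app__.jar"

  // Resolves the main jar to a URI, dropping the "local" scheme so that only
  // the plain filesystem path remains. Other schemes are left untouched.
  def mainUri(mainJar: Option[String]): Option[URI] =
    mainJar.orElse(Some(AppJar)).map { path =>
      val uri = new URI(path)
      // getScheme is null for scheme-less paths; Scala's == handles null safely.
      if (uri.getScheme == LocalScheme) new URI(uri.getPath) else uri
    }
}
```

With these assumptions, `mainUri(Some("local:/opt/app.jar"))` yields the URI `/opt/app.jar`, while `mainUri(Some("hdfs://nn/app.jar"))` is returned unchanged. An opaque URI such as `local:app.jar` has no path component, so a real fix would need to decide how to handle that case.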
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144792979 [Test build #43153 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43153/consoleFull) for PR 8941 at commit [`2b2c643`](https://github.com/apache/spark/commit/2b2c643436584203357405e8921fe65be1af9286).
[GitHub] spark pull request: [SPARK-10889] [Streaming] Bump KCL to add Mill...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8957#issuecomment-144793932 [Test build #1834 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1834/consoleFull) for PR 8957 at commit [`6abc571`](https://github.com/apache/spark/commit/6abc571a1a1d0a06565a0ec9fefa7bcb1ce69cfe).
[GitHub] spark pull request: [SPARK-10709] [SQL] When loading a json datase...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8899#issuecomment-144794227 @navis can you follow up on this PR or close it?
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144796688 [Test build #43152 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43152/console) for PR 8952 at commit [`336b3a9`](https://github.com/apache/spark/commit/336b3a9d94dec34db4e5eea3a21a7d2ac1d0ce1a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8758#issuecomment-144796916 I like that better since it focuses narrowly on supporting one arg. @rekhajoshm what do you think? The rest looks good.
[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8952#issuecomment-144796833 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43152/ Test PASSed.
[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144798668 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8959#issuecomment-144810614 Merged build started.
[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8959#issuecomment-144810576 Merged build triggered.
[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144812918 Merged build started.
[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144812883 Merged build triggered.
[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/8959#issuecomment-144821230 Where are you suggesting putting this .map, so it's clear? In getUserPath on mainUri? I'm not seeing how your map call fixes anything, so I'm guessing I'm missing the context.
[GitHub] spark pull request: [SPARK-10889] [Streaming] Bump KCL to add Mill...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/8957#issuecomment-144793663 Seems OK to me, but the usual question is simply: does it introduce any potential problems too? Changed dependencies, incompatible behavior, etc.? You might skim the release notes and commits, if you can, as a sanity check.
[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144797066 Merged build started.
[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144797039 Merged build triggered.
[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144799890 Merged build triggered.
[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144799919 Merged build started.
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144803367 Merged build started.
[GitHub] spark pull request: [SPARK-10261][Documentation, ML] Fixed @Since ...
Github user tijoparacka commented on the pull request: https://github.com/apache/spark/pull/8554#issuecomment-144803700 @yu-iskw could you please review this?
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144805650 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43153/ Test FAILed.
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144805491 [Test build #43153 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43153/console) for PR 8941 at commit [`2b2c643`](https://github.com/apache/spark/commit/2b2c643436584203357405e8921fe65be1af9286).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144805646 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8959#issuecomment-144812185 [Test build #43157 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43157/consoleFull) for PR 8959 at commit [`08f9382`](https://github.com/apache/spark/commit/08f93822de809946e565793adab7c30c8c8c3430).
[GitHub] spark pull request: [SPARK-10264][Documentation, ML] Added Since a...
Github user tijoparacka commented on the pull request: https://github.com/apache/spark/pull/8532#issuecomment-144802854 Could any of you review this and merge it? I may lose track of it if it is delayed further.
[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8941#issuecomment-144804910 [Test build #43156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43156/consoleFull) for PR 8941 at commit [`666dd63`](https://github.com/apache/spark/commit/666dd630dd8882b16b864f2dcf3b994b70894ef8).
[GitHub] spark pull request: [SPARK-9570][Docs][YARN]Consistent recommendat...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/8385#discussion_r40947175
--- Diff: docs/running-on-yarn.md ---
@@ -16,37 +16,51 @@ containers used by the application use the same configuration. If the configurat
 Java system properties or environment variables not managed by YARN, they should also be set in the Spark application's configuration (driver, executors, and the AM when running in client mode).
-There are two deploy modes that can be used to launch Spark applications on YARN. In `yarn-cluster` mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In `yarn-client` mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
+There are two deploy modes that can be used to launch Spark applications on YARN. In `cluster` mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In `client` mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
-Unlike in Spark standalone and Mesos mode, in which the master's address is specified in the `--master` parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the `--master` parameter is `yarn-client` or `yarn-cluster`.
-To launch a Spark application in `yarn-cluster` mode:
+Unlike in Spark standalone and Mesos mode, in which the master's address is specified in the `--master` parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the `--master` parameter is `yarn` and `--deploy-mode` can be `client` or `cluster` to select the YARN deployment mode.
+To launch a Spark application in YARN in `cluster` mode:
- `$ ./bin/spark-submit --class path.to.your.Class --master yarn-cluster [options] [app options]`
-
+ `$ ./bin/spark-submit --class path.to.your.Class --master yarn --deploy-mode cluster [options] [app options]`
+
 For example:
 $ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
---master yarn-cluster \
+--master yarn \
+--deploy-mode cluster
 --num-executors 3 \
 --driver-memory 4g \
 --executor-memory 2g \
 --executor-cores 1 \
 --queue thequeue \
 lib/spark-examples*.jar \
-10
-The above starts a YARN client program which starts the default Application Master. Then SparkPi will be run as a child thread of Application Master. The client will periodically poll the Application Master for status updates and display them in the console. The client will exit once your application has finished running. Refer to the "Debugging your Application" section below for how to see driver and executor logs.
+The above example starts a YARN client program which starts the default Application Master. Then SparkPi will be run as a child thread of Application Master. The client will periodically poll the Application Master for status updates and display them in the console. The client will exit once your application has finished running. Refer to the "Debugging your Application" section below for how to see driver and executor logs.
+
+To launch a Spark application in `client` mode, do the same, but replace `cluster` with `client` in the `--deploy-mode` argument.
+To run spark-shell:
-To launch a Spark application in `yarn-client` mode, do the same, but replace `yarn-cluster` with `yarn-client`.
 To run spark-shell:
+$ ./bin/spark-shell --master yarn --deploy-mode client
-$ ./bin/spark-shell --master yarn-client
+For example:
+$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
+--master yarn-cluster \
--- End diff --
still using yarn-cluster instead of deploy-mode
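The review comment above asks for the remaining `--master yarn-cluster` to be written as `--master yarn` plus `--deploy-mode`, as the rest of the diff already does. A minimal sketch of that mapping, assuming a hypothetical shell helper named `to_deploy_mode_args` (it is not part of Spark, and the command is only echoed, never run):

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): rewrite a legacy YARN master value
# into the `--master yarn --deploy-mode <mode>` form used in the updated docs.
to_deploy_mode_args() {
  case "$1" in
    yarn-cluster) echo "--master yarn --deploy-mode cluster" ;;
    yarn-client)  echo "--master yarn --deploy-mode client" ;;
    *)            echo "--master $1" ;;  # any other value is passed through unchanged
  esac
}

# Show what the flagged example line would become. The command is echoed
# rather than executed, so no Spark installation is required.
# The unquoted substitution is deliberate: the helper's output must split
# into separate arguments.
echo ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  $(to_deploy_mode_args yarn-cluster) \
  'lib/spark-examples*.jar' 10
```

With the legacy value `yarn-client`, the same helper yields the flags shown for the spark-shell example in the diff.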
[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...
Github user olarayej commented on the pull request: https://github.com/apache/spark/pull/8920#issuecomment-144809819 @shivaram @felixcheung @sun-rui Thanks for your feedback! I totally see your point with the naming (sort vs. arrange), but @NarineK's implementation has two advantages:
1) It supports string column names in both asc and desc order. In SparkR's current implementation of arrange(), I couldn't do that: arrange(df, desc("Species")) # fails
2) The Boolean parameter 'decreasing' is useful. Right now, if you were to sort by 100 columns, all of them in descending order, you would need to write desc() 100 times, once for each column: desc(data$col1), ..., desc(data$col100), whereas in @NarineK's implementation it suffices to specify decreasing=T.
I'm aware that plyr also takes functions asc/desc, probably because R was not designed with big data in mind. We've seen customer use cases with hundreds of thousands of columns. Bottom line: I think these are two valid additions to SparkR, and since the code is ready and tested, it won't hurt. Let the user decide which function to use.
[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144816852 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144798664 [Test build #43154 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43154/console) for PR 8909 at commit [`2e365ad`](https://github.com/apache/spark/commit/2e365ada232e42d88692c89634e5ed2ceb741beb).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class HyperLogLogPlusPlus(child: Expression, relativeSD: Double = 0.05)`
[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8909#issuecomment-144798673 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43154/ Test FAILed.