[GitHub] spark pull request: [SPARK-12331][ML] R^2 for regression through t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10384#issuecomment-166400018 **[Test build #48122 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48122/consoleFull)** for PR 10384 at commit [`526d156`](https://github.com/apache/spark/commit/526d1562ed3bc9a0870ca542f6831868db5d8633). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12331][ML] R^2 for regression through t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10384#issuecomment-166400134 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10407#issuecomment-166406625 Good catch, this change looks fine.
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10407#issuecomment-166408094 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10407#issuecomment-166408093 **[Test build #48125 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48125/consoleFull)** for PR 10407 at commit [`7959c1f`](https://github.com/apache/spark/commit/7959c1f75cd34e46ceda011ec11ce56e8e166fd1). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12231][SQL]create a combineFilters' pro...
Github user kevinyu98 commented on the pull request: https://github.com/apache/spark/pull/10388#issuecomment-166407905 @marmbrus: Can you help take a look at this PR? Thanks for your review.
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10407#issuecomment-166408095 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48125/ Test FAILed.
[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10402#discussion_r48186892 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala --- @@ -421,6 +460,19 @@ case class CumeDist() extends RowNumberLike with SizeBasedWindowFunction { override val evaluateExpression = Divide(Cast(rowNumber, DoubleType), Cast(n, DoubleType)) } +/** + * The NTile function divides the rows for each window partition into 'n' buckets ranging from 1 to + * at most 'n'. Bucket values will differ by at most 1. If the number of rows in the partition does + * not divide evenly into the number of buckets, then the remainder values are distributed one per + * bucket, starting with the first bucket. + * + * The NTile function is particularly useful for the calculation of tertiles, quartiles, deciles and + * other common summary statistics + * + * @param buckets number of buckets to divide the rows in. + */ +@ExpressionDescription(usage = "_FUNC_(x) - The NTILE(n) function divides the rows for each " + "window partition into 'n' buckets ranging from 1 to at most 'n'.") case class NTile(buckets: Expression) extends RowNumberLike with SizeBasedWindowFunction { --- End diff -- Add some comments to explain the implementation of NTile?
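The bucketing rule described in the doc comment above can be illustrated with a small standalone sketch (plain Python, not Spark's actual implementation): each of the `n` buckets gets `rows // n` rows, and the remainder is handed out one extra row per bucket starting from the first, so bucket sizes differ by at most 1.

```python
def ntile_buckets(num_rows, num_buckets):
    """Assign rows (in partition order) to buckets, NTILE-style.

    Each bucket gets num_rows // num_buckets rows; the first
    (num_rows % num_buckets) buckets each receive one extra row.
    """
    base, extra = divmod(num_rows, num_buckets)
    assignment = []
    for bucket in range(1, num_buckets + 1):
        size = base + (1 if bucket <= extra else 0)
        assignment.extend([bucket] * size)
    return assignment

# 10 rows into 4 buckets: bucket sizes are 3, 3, 2, 2
print(ntile_buckets(10, 4))  # [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]
```

Note that with fewer rows than buckets the result ranges from 1 to at most `n`, matching the "ranging from 1 to at most 'n'" wording: `ntile_buckets(2, 4)` yields `[1, 2]`.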
[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10418#issuecomment-166408025 **[Test build #48124 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48124/consoleFull)** for PR 10418 at commit [`e49b036`](https://github.com/apache/spark/commit/e49b03649981688a60076489bd52a5232f4c2f0d).
[GitHub] spark pull request: [SPARK-12463][SPARK-12464][SPARK-12465][SPARK-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10057#issuecomment-166413509 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48121/ Test PASSed.
[GitHub] spark pull request: [SPARK-12321][SQL] JSON format for TreeNode (u...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10311#issuecomment-166413616 Thanks, merging to master.
[GitHub] spark pull request: [SPARK-12398] Smart truncation of DataFrame / ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10373
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user ajbozarth commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-166418393 And here are screenshots of completed only being colored when compared to active or failed, with examples of both color choices. ![screen shot 2015-12-21 at 12 43 58 pm](https://cloud.githubusercontent.com/assets/13952758/11941166/7700c7f4-a7e2-11e5-8df6-19e0b1de4760.png) ![screen shot 2015-12-21 at 12 45 05 pm](https://cloud.githubusercontent.com/assets/13952758/11941169/7729ecba-a7e2-11e5-92d2-d6e894019bac.png) ![screen shot 2015-12-21 at 12 57 11 pm](https://cloud.githubusercontent.com/assets/13952758/11941167/772571b2-a7e2-11e5-88f9-642b18537874.png) ![screen shot 2015-12-21 at 12 57 19 pm](https://cloud.githubusercontent.com/assets/13952758/11941168/7727e7ee-a7e2-11e5-8c0c-2685bac3beb9.png) Personally I don't know how I feel about this: I think it looks better visually, but I'm not sure about usability.
[GitHub] spark pull request: [SPARK-12463][SPARK-12464][SPARK-12465][SPARK-...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/10057#discussion_r48191846 --- Diff: docs/spark-standalone.md --- @@ -341,23 +341,8 @@ Learn more about getting started with ZooKeeper [here](http://zookeeper.apache.o **Configuration** -In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spark-env using this configuration: - - - System propertyMeaning - -spark.deploy.recoveryMode -Set to ZOOKEEPER to enable standby Master recovery mode (default: NONE). - - -spark.deploy.zookeeper.url -The ZooKeeper cluster url (e.g., 192.168.1.100:2181,192.168.1.101:2181). - - -spark.deploy.zookeeper.dir -The directory in ZooKeeper to store recovery state (default: /spark). - - +In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spark-env by configuring `spark.deploy.recoveryMode` and related spark.deploy.zookeeper.* configurations. +For more information about these configurations please refer to the configurations (doc)[configurations.html#deploy] --- End diff -- we should also briefly mention this in the `running-on-mesos.md` docs right?
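For reference, the SPARK_DAEMON_JAVA_OPTS setting that the removed doc table described would look roughly like this in `conf/spark-env.sh`; the property names and defaults come from the quoted diff, while the ZooKeeper addresses are the example values from that table:

```shell
# conf/spark-env.sh -- enable standby Master recovery via ZooKeeper
# (addresses and directory are example values, not a working cluster)
export SPARK_DAEMON_JAVA_OPTS="$SPARK_DAEMON_JAVA_OPTS \
  -Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=192.168.1.100:2181,192.168.1.101:2181 \
  -Dspark.deploy.zookeeper.dir=/spark"
```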
[GitHub] spark pull request: [SPARK-12415] Do not use closure serializer to...
Github user tedyu commented on the pull request: https://github.com/apache/spark/pull/10368#issuecomment-166387628 @andrewor14 @zsxwing Please take another look.
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user ajbozarth commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-166393950 I'll switch the green and blue and post some screenshots to see the difference. As for the completed column always being blue: I could make it conditional so it's only blue if there are either active or failed tasks to compare to, but I feel like that would end up looking odd when some executors' completed tasks are colored and some aren't. Or I could remove the coloring from completed tasks; I'll post some screenshots of this as well before pushing any changes.
[GitHub] spark pull request: [SPARK-12429][Streaming][Doc]Add Accumulator a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10385#issuecomment-166393941 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48190846 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -317,6 +319,7 @@ class DatasetSuite extends QueryTest with SharedSQLContext { val ds = Seq(("a", 10), ("a", 20), ("b", 1), ("b", 2), ("c", 1)).toDS() val grouped = ds.groupBy($"_1").keyAs[String] val agged = grouped.mapGroups { case (g, iter) => (g, iter.map(_._2).sum) } +assert(agged.queryExecution.executedPlan.missingInput.isEmpty) --- End diff -- Maybe we should just have `checkAnswer` assert that there are no physical or logical nodes that have missing input (after analysis)?
[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10418#issuecomment-166416601 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48128/ Test FAILed.
[GitHub] spark pull request: [SPARK-12463][SPARK-12464][SPARK-12465][SPARK-...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/10057#discussion_r48192495 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterPersistenceEngine.scala --- @@ -53,9 +53,12 @@ private[spark] trait MesosClusterPersistenceEngine { * all of them reuses the same connection pool. */ private[spark] class ZookeeperMesosClusterPersistenceEngineFactory(conf: SparkConf) - extends MesosClusterPersistenceEngineFactory(conf) { + extends MesosClusterPersistenceEngineFactory(conf) with Logging { - lazy val zk = SparkCuratorUtil.newClient(conf, "spark.mesos.deploy.zookeeper.url") + // TODO(tnachen): Remove support for spark.mesos.deploy.zookeeper.url in 0.28. --- End diff -- same here, no need to do this later. Just do it now. (also this is Spark so you should use Spark versions in comments, not Mesos versions)
[GitHub] spark pull request: [SPARK-12453] [Streaming] Spark Streaming Kine...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10416#issuecomment-166429971 This isn't vs master, and is a duplicate. Please close this PR.
[GitHub] spark pull request: [SPARK-12247] [ML] [DOC] Documentation for spa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10411#issuecomment-166430031 **[Test build #48130 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48130/consoleFull)** for PR 10411 at commit [`ab0f301`](https://github.com/apache/spark/commit/ab0f301cc6d9cfa5b4a9f1da733859db52ef7f83).
[GitHub] spark pull request: [SPARK-12453] [Streaming] Spark Streaming Kine...
GitHub user Schadix opened a pull request: https://github.com/apache/spark/pull/10416 [SPARK-12453] [Streaming] Spark Streaming Kinesis Example broken due to wrong AWS Java SDK version Fix successfully tested by me. Maybe the tests have to be improved, as they passed with the previous version as well. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Schadix/spark fix-aws-java-sdk-version Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10416.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10416 commit d6eaf460ac549544c90d97a1ed87682d83906eb7 Author: Martin Schade Date: 2015-12-21T18:33:54Z [SPARK-12453] [Streaming] Spark Streaming Kinesis Example broken due to wrong AWS Java SDK version
[GitHub] spark pull request: [SPARK-12331][ML] R^2 for regression through t...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10384#issuecomment-166388345 ok to test
[GitHub] spark pull request: [SPARK-12429][Streaming][Doc]Add Accumulator a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10385#issuecomment-166393943 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48120/ Test PASSed.
[GitHub] spark pull request: [SPARK-12429][Streaming][Doc]Add Accumulator a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10385#issuecomment-166393679 **[Test build #48120 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48120/consoleFull)** for PR 10385 at commit [`9e241e7`](https://github.com/apache/spark/commit/9e241e791e5ce2cb0cc3dd7c609339ca8ea11129). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class JavaWordBlacklist` * `class JavaDroppedWordsCounter`
[GitHub] spark pull request: [SPARK-12331][ML] R^2 for regression through t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10384#issuecomment-166400137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48122/ Test PASSed.
[GitHub] spark pull request: [SPARK-7727] [SQL] Avoid inner classes in Rule...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/10174#discussion_r48183908 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala --- @@ -59,7 +59,7 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] extends Logging { protected case class Batch(name: String, strategy: Strategy, rules: Rule[TreeType]*) /** Defines a sequence of rule batches, to be overridden by the implementation. */ - protected val batches: Seq[Batch] + protected def batches: Seq[Batch] --- End diff -- I don't think that's true. You can't call `super` on a val.
[GitHub] spark pull request: [SPARK-12102][SQL] Cast a non-nullable struct ...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/10156#issuecomment-166403008 @cloud-fan thanks.. rebasing to newer code helps :-)
[GitHub] spark pull request: [SPARK-7727] [SQL] Avoid inner classes in Rule...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/10174#discussion_r48183993 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -69,6 +71,13 @@ object DefaultOptimizer extends Optimizer { } /** + * Non-abstract representation of the standard Spark optimizing strategies + */ +object DefaultOptimizer extends Optimizer { + override def batches: Seq[Batch] = super.batches --- End diff -- I agree, this looks like a no-op.
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10407#issuecomment-166407814 **[Test build #48125 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48125/consoleFull)** for PR 10407 at commit [`7959c1f`](https://github.com/apache/spark/commit/7959c1f75cd34e46ceda011ec11ce56e8e166fd1).
[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10418#issuecomment-166407785 I am wondering if we only need to show the SQL usage. For the DF version, we have API docs (maybe that is good enough for now).
[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10418#issuecomment-166411445 I see. Thank you!
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10393#issuecomment-166415241 I think it's fine to have this stuff in physical plans. I actually added the `!` when I was debugging problems with the query planner (i.e. it was producing spark plans with the wrong attributes).
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-166381528 LGTM. Ping @tdas to take a final look.
[GitHub] spark pull request: [SPARK-10647][MESOS] Fix zookeeper dir with me...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/10057#discussion_r48175719 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcher.scala --- @@ -50,7 +50,11 @@ private[mesos] class MesosClusterDispatcher( extends Logging { private val publicAddress = Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(args.host) - private val recoveryMode = conf.get("spark.mesos.deploy.recoveryMode", "NONE").toUpperCase() + private val recoveryMode = conf.getOption("spark.mesos.deploy.recoverMode").map { mode => + logWarning("spark.mesos.deploy.recoverMode is deprecated. Please configure " + --- End diff -- I don't think it's even worth logging a warning. It wasn't documented, so any user who was using the old config found out about it from the code and knows it's not officially supported. I'd rather keep the code simpler than add a warning that I doubt will be useful to many.
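For context, the pattern under discussion, reading a new config key while falling back to a deprecated one with a warning, can be sketched in Python. The helper name and the dict-based conf are assumptions for illustration, not Spark API:

```python
import warnings


def get_with_fallback(conf, new_key, deprecated_key, default):
    """Hypothetical helper: prefer new_key; fall back to a deprecated
    key (warning when it is used); otherwise return the default."""
    if new_key in conf:
        return conf[new_key]
    if deprecated_key in conf:
        warnings.warn(f"{deprecated_key} is deprecated; use {new_key} instead",
                      DeprecationWarning)
        return conf[deprecated_key]
    return default
```

The reviewer's point is that the middle branch (the warning) may not be worth its complexity when the deprecated key was never documented.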
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user ajbozarth commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-166408671 ![screen shot 2015-12-21 at 12 20 30 pm](https://cloud.githubusercontent.com/assets/13952758/11940489/e423c4c6-a7dd-11e5-99ce-6eab80c6c36c.png) ![screen shot 2015-12-21 at 12 10 57 pm](https://cloud.githubusercontent.com/assets/13952758/11940491/e4449d72-a7dd-11e5-99c2-5e890d24631c.png) ![screen shot 2015-12-21 at 12 04 21 pm](https://cloud.githubusercontent.com/assets/13952758/11940490/e43fd85a-a7dd-11e5-8850-57a2c1c9f7d7.png) Here are three screenshots of what switching the blue and green would look like; I actually think I like it better.
[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10402#issuecomment-166408658 @hvanhovell Regarding the doc, we can acknowledge them (Hive and Presto) in the scala doc.
[GitHub] spark pull request: [SPARK-12463][SPARK-12464][SPARK-12465][SPARK-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10057#issuecomment-166413505 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user ajbozarth commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-166419895 I didn't do screenshots of no color for completed. Overall, I'm personally a fan of switching the green to completed and making the completed coloring conditional on having an active or failed task to compare against. If you agree, @srowen, then I'll push that code change.
[GitHub] spark pull request: [SPARK-12468] [Pyspark]
GitHub user ZacharySBrown opened a pull request: https://github.com/apache/spark/pull/10419 [SPARK-12468] [Pyspark] This addresses an issue where `extractParamMap()` method for a model that has been fit returns an empty dictionary, e.g. (from the [Pyspark ML API Documentation](http://spark.apache.org/docs/latest/ml-guide.html#example-estimator-transformer-and-param)): ```python from pyspark.mllib.linalg import Vectors from pyspark.ml.classification import LogisticRegression from pyspark.ml.param import Param, Params # Prepare training data from a list of (label, features) tuples. training = sqlContext.createDataFrame([ (1.0, Vectors.dense([0.0, 1.1, 0.1])), (0.0, Vectors.dense([2.0, 1.0, -1.0])), (0.0, Vectors.dense([2.0, 1.3, 1.0])), (1.0, Vectors.dense([0.0, 1.2, -0.5]))], ["label", "features"]) # Create a LogisticRegression instance. This instance is an Estimator. lr = LogisticRegression(maxIter=10, regParam=0.01) # Print out the parameters, documentation, and any default values. print "LogisticRegression parameters:\n" + lr.explainParams() + "\n" # Learn a LogisticRegression model. This uses the parameters stored in lr. model1 = lr.fit(training) # Since model1 is a Model (i.e., a transformer produced by an Estimator), # we can view the parameters it used during fit(). # This prints the parameter (name: value) pairs, where names are unique IDs for this # LogisticRegression instance. 
print "Model 1 was fit using parameters: " print model1.extractParamMap() ``` You can merge this pull request into a Git repository by running: $ git pull https://github.com/ZacharySBrown/spark Pyspark_extractParamMap_Fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10419.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10419 commit 6e7c80b6805fc4b6ef4b60c18cb699385ed3bc2e Author: Zak Brown Date: 2015-12-21T20:28:51Z Updated _fit() method of JavaEstimator Class to update paramMap for the returned model commit 1c5f4998b0bb88dfb7a650525e46853ddbd65ea8 Author: Zak Brown Date: 2015-12-21T21:15:04Z Removed extra spaces in modifications to wrapper.py to conform with PEP8 standards
[GitHub] spark pull request: Doc typo: ltrim = trim from left end, not righ...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10414#issuecomment-166431476 LGTM
[GitHub] spark pull request: [SPARK-12062] [CORE] Change Master to asyc reb...
Github user tedyu commented on the pull request: https://github.com/apache/spark/pull/10284#issuecomment-166386063 See https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48119/consoleFull
[GitHub] spark pull request: [SPARK-12331][ML] R^2 for regression through t...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/10384#discussion_r48178585 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/evaluation/RegressionMetrics.scala --- @@ -105,6 +112,14 @@ class RegressionMetrics @Since("1.2.0") ( */ @Since("1.2.0") def r2: Double = { -1 - SSerr / SStot +// In case of regression through the origin (biased case), the definition of R^2 is +// to be modified. Here is a review paper which explains why: +// J. G. Eisenhauer, Regression through the Origin. Teaching Statistics 25, 76-80 (2003) +// https://online.stat.psu.edu/~ajw13/stat501/SpecialTopics/Reg_thru_origin.pdf --- End diff -- @iyounus Can you put this comment into the scaladoc instead? [This example](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala#L27) might be of use.
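The modification being discussed can be illustrated outside Spark: for regression through the origin, the total sum of squares is taken about zero rather than about the mean. A minimal Python sketch of the formula (an illustration only, not the MLlib implementation):

```python
def r2(y, pred, through_origin=False):
    """Coefficient of determination, with the modified SStot for the
    no-intercept (regression-through-the-origin) case."""
    # SSerr: residual sum of squares, same in both cases.
    ss_err = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    if through_origin:
        # Uncentered total sum of squares: deviations from zero.
        ss_tot = sum(yi ** 2 for yi in y)
    else:
        mean = sum(y) / len(y)
        ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - ss_err / ss_tot
```

With the uncentered SStot, R^2 stays in [0, 1] for a no-intercept fit, which is the motivation given in the Eisenhauer paper cited in the diff.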
[GitHub] spark pull request: SPARK-11882: Custom scheduler support
Github user jacek-lewandowski commented on the pull request: https://github.com/apache/spark/pull/10292#issuecomment-166396719 @ScrapCodes - yes, we are building Spark on top of this change and it is working correctly.
[GitHub] spark pull request: [SPARK-12062] [CORE] Change Master to asyc reb...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10284#issuecomment-166401444 I've opened #10417 to fix it.
[GitHub] spark pull request: [SPARK-12466] Fix harmless NPE in tests
GitHub user andrewor14 opened a pull request: https://github.com/apache/spark/pull/10417 [SPARK-12466] Fix harmless NPE in tests ``` [info] ReplayListenerSuite: [info] - Simple replay (58 milliseconds) java.lang.NullPointerException at org.apache.spark.deploy.master.Master$$anonfun$asyncRebuildSparkUI$1.applyOrElse(Master.scala:982) at org.apache.spark.deploy.master.Master$$anonfun$asyncRebuildSparkUI$1.applyOrElse(Master.scala:980) ``` https://amplab.cs.berkeley.edu/jenkins/view/Spark-QA-Test/job/Spark-Master-SBT/4316/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=spark-test/consoleFull This was introduced in #10284. It's harmless because the NPE is caused by a race that occurs mainly in `local-cluster` tests (and doesn't actually fail the tests). You can merge this pull request into a Git repository by running: $ git pull https://github.com/andrewor14/spark fix-harmless-npe Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10417.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10417 commit 42334e082673d3ccff374c47111068f27dd8ddff Author: Andrew Or Date: 2015-12-21T19:44:48Z Do null check on `self`
[GitHub] spark pull request: [SPARK-6624][SQL]Add CNF Normalization as part...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/8200#discussion_r48183738 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -464,6 +465,27 @@ object OptimizeIn extends Rule[LogicalPlan] { } /** + * Convert an expression into its conjunctive normal form (CNF), i.e. AND of ORs. + * For example, a && b || c is normalized to (a || c) && (b || c) by this method. + * + * Refer to https://en.wikipedia.org/wiki/Conjunctive_normal_form for more information + */ +object CNFNormalization extends Rule[LogicalPlan] { + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case q: LogicalPlan => q transformExpressionsUp { + case or @ Or(left, right) => (left, right) match { --- End diff -- I would not make this a nested match as I think it makes it unnecessarily hard to read (and if you just use transform directly you won't have to manually handle the default case)
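The rewrite rule in this diff, distributing OR over AND so that `a && b || c` becomes `(a || c) && (b || c)`, can be sketched in Python over a tuple-based expression tree. The representation is an assumption for illustration; the real rule operates on Catalyst `Expression` nodes:

```python
# Expressions are either an atom (a string) or a tuple
# ("and", left, right) / ("or", left, right).
def to_cnf(e):
    """Recursively push ORs below ANDs until the tree is an AND of ORs."""
    if isinstance(e, str):
        return e
    op, l, r = e
    l, r = to_cnf(l), to_cnf(r)
    if op == "and":
        return ("and", l, r)
    # op == "or": distribute OR over an AND child, then re-normalize.
    if isinstance(l, tuple) and l[0] == "and":
        return ("and", to_cnf(("or", l[1], r)), to_cnf(("or", l[2], r)))
    if isinstance(r, tuple) and r[0] == "and":
        return ("and", to_cnf(("or", l, r[1])), to_cnf(("or", l, r[2])))
    return ("or", l, r)
```

Note this naive distribution can blow up exponentially in the worst case, which is one reason such a rule needs care before landing in an optimizer.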
[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10399#issuecomment-166403849 Decimal is a private class. We should not expose it to users until we have audited the API.
[GitHub] spark pull request: [SPARK-12371][SQL] Runtime nullability check f...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10331#discussion_r48184650 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -83,6 +84,16 @@ class Dataset[T] private[sql]( */ private[sql] val boundTEncoder = resolvedTEncoder.bind(logicalPlan.output) + logTrace( +s""" + |# unresolvedTEncoder.fromRowExpression + |${unresolvedTEncoder.fromRowExpression.treeString} + |# resolvedTEncoder.fromRowExpression + |${resolvedTEncoder.fromRowExpression.treeString} + |# boundTEncoder.fromRowExpression + |${boundTEncoder.fromRowExpression.treeString} + """.stripMargin) --- End diff -- What does this do?
[GitHub] spark pull request: [SPARK-12440][Core] - Avoid setCheckpoint warn...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10392#issuecomment-166407388 ok to test
[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10418#discussion_r48186455 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -44,6 +46,9 @@ case class Size(child: Expression) extends UnaryExpression with ExpectsInputType * Sorts the input array in ascending / descending order according to the natural ordering of * the array elements and returns it. */ +@ExpressionDescription( +usage = "_FUNC_(array, order) - Sorts the input array for the given column in " + + "ascending / descending order, according to the natural ordering of the array elements") --- End diff -- Maybe it is better to use `_FUNC_(array, ascendingOrder)` and mention that `ascendingOrder` is a boolean?
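The suggested usage string `_FUNC_(array, ascendingOrder)` with a boolean second argument behaves like this Python analog. This is a sketch of the documented contract only; it ignores SQL null-ordering details:

```python
def sort_array(arr, ascending_order=True):
    # Sort by natural ordering; reversed when ascending_order is False.
    return sorted(arr, reverse=not ascending_order)
```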
[GitHub] spark pull request: [SPARK-12231][SQL]create a combineFilters' pro...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10388#issuecomment-166408282 ok to test
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10407#issuecomment-166423976 @echoTomei it's failing style tests because there's trailing whitespace at the end of the line
[GitHub] spark pull request: [SPARK-12468] [Pyspark]
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10419#issuecomment-166428318 Can one of the admins verify this patch?
[GitHub] spark pull request: [Spark-10625] [SQL] Spark SQL JDBC read/write ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/8785#discussion_r48194955 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/UnserializableDriverHelper.scala --- @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.jdbc + +import java.sql.{DriverManager, Connection} +import java.util.Properties +import java.util.logging.Logger + +object UnserializableDriverHelper { + + def replaceDriverDuring[T](f: => T): T = { +import scala.collection.JavaConverters._ --- End diff -- It's not necessary to make this local
[GitHub] spark pull request: [Spark-10625] [SQL] Spark SQL JDBC read/write ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/8785#discussion_r48194924 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/UnserializableDriverHelper.scala --- @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.jdbc + +import java.sql.{DriverManager, Connection} +import java.util.Properties +import java.util.logging.Logger + +object UnserializableDriverHelper { + + def replaceDriverDuring[T](f: => T): T = { +import scala.collection.JavaConverters._ + +object UnserializableH2Driver extends org.h2.Driver { + + override def connect(url: String, info: Properties): Connection = { + +val result = super.connect(url, info) +info.put("unserializableDriver", this) +result + } + + override def getParentLogger: Logger = null +} + +val oldDrivers = DriverManager.getDrivers.asScala.filter(_.acceptsURL("jdbc:h2:")) +oldDrivers.foreach{ DriverManager.deregisterDriver } +DriverManager.registerDriver(UnserializableH2Driver) + +val result = try { f } +finally { --- End diff -- This still isn't normal try-finally formatting ``` try { } finally { } ```
[GitHub] spark pull request: [Spark-10625] [SQL] Spark SQL JDBC read/write ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/8785#discussion_r48194995 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/UnserializableDriverHelper.scala --- @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.jdbc + +import java.sql.{DriverManager, Connection} +import java.util.Properties +import java.util.logging.Logger + +object UnserializableDriverHelper { + + def replaceDriverDuring[T](f: => T): T = { +import scala.collection.JavaConverters._ + +object UnserializableH2Driver extends org.h2.Driver { + + override def connect(url: String, info: Properties): Connection = { + +val result = super.connect(url, info) +info.put("unserializableDriver", this) +result + } + + override def getParentLogger: Logger = null +} + +val oldDrivers = DriverManager.getDrivers.asScala.filter(_.acceptsURL("jdbc:h2:")) +oldDrivers.foreach{ DriverManager.deregisterDriver } --- End diff -- Still not right w.r.t spaces: `foo.foreach(bar)`
[GitHub] spark pull request: [SPARK-12371][SQL] Runtime nullability check f...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166380661 **[Test build #48117 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48117/consoleFull)** for PR 10331 at commit [`9f42052`](https://github.com/apache/spark/commit/9f420523ba01e9922fce06c81bd837661ec34107). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class AssertNotNull(`
[GitHub] spark pull request: [SPARK-8641][SQL] Native Spark Window function...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10402#issuecomment-166384678 @hvanhovell I just created https://issues.apache.org/jira/browse/SPARK-12455 for adding doc. Can you also put SPARK-12455 in the jira title?
[GitHub] spark pull request: [SPARK-12311][CORE] Restore previous value of ...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10289#issuecomment-166388741 Yeah, this looks fine. I wonder if there's something safer that we can do in the future. What I want is the following but I don't think scalastyle is powerful enough to support it: ``` // Allow override def beforeAll(): Unit = { super.beforeAll() ... } // Disallow override def beforeAll(): Unit = { // no super.beforeAll() here ... } ```
[GitHub] spark pull request: [SPARK-3369] [CORE] [STREAMING] Java mapPartit...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10413#issuecomment-166388703 @srowen I've been thinking about all the RDD APIs last night, and one thought I have is that if the goal is to push more people towards DataFrame/Dataset, then maybe it wouldn't be crazy to not change any of the existing RDD APIs to ensure maximum compatibility.
[GitHub] spark pull request: [SPARK-12429][Streaming][Doc]Add Accumulator a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10385#issuecomment-166388556 **[Test build #48120 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48120/consoleFull)** for PR 10385 at commit [`9e241e7`](https://github.com/apache/spark/commit/9e241e791e5ce2cb0cc3dd7c609339ca8ea11129).
[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10402#issuecomment-166400507 +1
[GitHub] spark pull request: [SPARK-7727] [SQL] Avoid inner classes in Rule...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/10174#discussion_r48184082 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -28,11 +28,13 @@ import org.apache.spark.sql.catalyst.plans.logical._ import org.apache.spark.sql.catalyst.rules._ import org.apache.spark.sql.types._ -abstract class Optimizer extends RuleExecutor[LogicalPlan] - -object DefaultOptimizer extends Optimizer { - val batches = -// SubQueries are only needed for analysis and can be removed before execution. +/** + * Abstract class all optimizers should inherit of, contains the standard batches (extending --- End diff -- closing parentheses?
[GitHub] spark pull request: [SPARK-12438][SQL] Add SQLUserDefinedType supp...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10390#issuecomment-166403465 I think we need more design work before we implement this. I'm not a huge fan of the current UDT API and I think we will replace it in Spark 2.0.
[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/10418 [SPARK-12457] [SQL] Add ExpressionDescription to collection functions. Feeling a little guilty for picking the one with the fewest functions. : ) One question for @yhuai: when users describe these functions, I think it might be nice to provide them with multiple useful examples. For example, for the function `sort_array`, I am wondering how to put the following example in `extended`? ``` > val df = Seq((Array(2, 1, 3), Array("b", "c", "a"))).toDF("col1", "col2") > df.select(sort_array($"col1", false), sort_array($"col2", false)).collect() Array[org.apache.spark.sql.Row] = Array([WrappedArray(3, 2, 1),WrappedArray(c, b, a)]) ``` So far, the example in `Upper` is not very clear when we add complex use cases. Could you show us more examples? Thank you! You can merge this pull request into a Git repository by running: $ git pull https://github.com/gatorsmile/spark ExpDesc4C Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10418.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10418 commit e49b03649981688a60076489bd52a5232f4c2f0d Author: gatorsmile Date: 2015-12-21T19:53:38Z Add ExpressionDescription to collection functions.
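For context, a multi-line example like the one above can in principle be carried in the annotation's `extended` field as a triple-quoted string. The following is only a hedged sketch: the `usage`/`extended` field names follow Spark's `ExpressionDescription` annotation as used by `Upper`, the SQL example is illustrative, and the case class body is elided.

```scala
// Hedged sketch: embedding a multi-line example in the `extended` field.
// The SQL-style example mirrors the DataFrame snippet in the PR description;
// the expression implementation itself is elided.
@ExpressionDescription(
  usage = "_FUNC_(array, ascendingOrder) - Sorts the input array in ascending or descending order.",
  extended = """
    > SELECT _FUNC_(array('b', 'c', 'a'), false);
      ["c","b","a"]
  """)
case class SortArray(base: Expression, ascendingOrder: Expression) { /* ... */ }
```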
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10407#issuecomment-166406368 ok to test
[GitHub] spark pull request: [SPARK-12463][SPARK-12464][SPARK-12465][SPARK-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10057#issuecomment-166413261 **[Test build #48121 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48121/consoleFull)** for PR 10057 at commit [`2c29939`](https://github.com/apache/spark/commit/2c29939cd7ea3724d1bbb3defbdc1988afad85c8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12398] Smart truncation of DataFrame / ...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10373#issuecomment-166413378 Thanks, merging to master.
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48191121 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala --- @@ -307,6 +307,7 @@ case class MapPartitions[T, U]( uEncoder: ExpressionEncoder[U], output: Seq[Attribute], child: SparkPlan) extends UnaryNode { + override def missingInput: AttributeSet = AttributeSet.empty --- End diff -- This fixes the problem, but I think it might be better to add something like `def producedAttributes: AttributeSet = AttributeSet.empty` to `QueryPlan`. This can be automatically subtracted from `missingInput`, and nodes that produce attributes can override this function. I think that will be clearer. What do you think?
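The suggestion above can be sketched as follows. This is a hedged, self-contained illustration, not the actual Catalyst code: `Attribute`/`AttributeSet` are stand-ins for the real Catalyst types, and the member names follow the comment.

```scala
// Hedged sketch of the proposed refactoring: QueryPlan exposes
// producedAttributes (empty by default), and missingInput subtracts it,
// so generating nodes like MapPartitions override one small member
// instead of redefining missingInput.
case class Attribute(name: String)
abstract class QueryPlan {
  def references: Set[Attribute]   // attributes this node reads
  def inputSet: Set[Attribute]     // attributes supplied by children
  // Attributes the node generates itself; overridden by generating nodes.
  def producedAttributes: Set[Attribute] = Set.empty
  // Inputs that are referenced but neither supplied nor produced here.
  final def missingInput: Set[Attribute] =
    references -- inputSet -- producedAttributes
}
```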
[GitHub] spark pull request: [SPARK-12463][SPARK-12464][SPARK-12465][SPARK-...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/10057#discussion_r48192141 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkCuratorUtil.scala --- @@ -35,8 +35,11 @@ private[spark] object SparkCuratorUtil extends Logging { def newClient( conf: SparkConf, zkUrlConf: String = "spark.deploy.zookeeper.url"): CuratorFramework = { --- End diff -- @tnachen did you miss this one? I don't think we need this param anymore since it's always going to be the same one.
[GitHub] spark pull request: [SPARK-12331][ML] R^2 for regression through t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10384#issuecomment-166389342 **[Test build #48122 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48122/consoleFull)** for PR 10384 at commit [`526d156`](https://github.com/apache/spark/commit/526d1562ed3bc9a0870ca542f6831868db5d8633).
[GitHub] spark pull request: [SPARK-3369] [CORE] [STREAMING] Java mapPartit...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10413#issuecomment-166389536 I think it probably cuts the other way: if you want people to use DataFrames, then keeping the RDD APIs exactly as-is is not that important. I don't see RDDs going away, and fixing this API seems natural for 2.x.
[GitHub] spark pull request: [SPARK-12466] Fix harmless NPE in tests
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10417#issuecomment-166404483 **[Test build #48123 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48123/consoleFull)** for PR 10417 at commit [`42334e0`](https://github.com/apache/spark/commit/42334e082673d3ccff374c47111068f27dd8ddff).
[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10418#issuecomment-166423844 **[Test build #48129 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48129/consoleFull)** for PR 10418 at commit [`afdcb1c`](https://github.com/apache/spark/commit/afdcb1c6e0e8282f189935ce388a4687387a5816).
[GitHub] spark pull request: [SPARK-12463][SPARK-12464][SPARK-12465][SPARK-...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/10057#discussion_r48192593 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkCuratorUtil.scala --- @@ -35,8 +35,11 @@ private[spark] object SparkCuratorUtil extends Logging { def newClient( conf: SparkConf, zkUrlConf: String = "spark.deploy.zookeeper.url"): CuratorFramework = { -val ZK_URL = conf.get(zkUrlConf) -val zk = CuratorFrameworkFactory.newClient(ZK_URL, +newClient(conf.get(zkUrlConf)) + } + + def newClient(zkUrl: String): CuratorFramework = { --- End diff -- actually we probably don't need this method now that we support only one config
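If the overload is dropped as suggested, the remaining method could collapse to something like the following hedged sketch. The timeout and retry constants are assumed to be the ones already defined in `SparkCuratorUtil`; this is an illustration of the review suggestion, not the merged code.

```scala
// Hedged sketch: a single newClient reading the one remaining config key.
// ZK_SESSION_TIMEOUT_MILLIS, ZK_CONNECTION_TIMEOUT_MILLIS, RETRY_WAIT_MILLIS,
// and MAX_RECONNECT_ATTEMPTS are assumed to be the existing constants.
def newClient(conf: SparkConf): CuratorFramework = {
  val zkUrl = conf.get("spark.deploy.zookeeper.url")
  val zk = CuratorFrameworkFactory.newClient(zkUrl,
    ZK_SESSION_TIMEOUT_MILLIS, ZK_CONNECTION_TIMEOUT_MILLIS,
    new ExponentialBackoffRetry(RETRY_WAIT_MILLIS, MAX_RECONNECT_ATTEMPTS))
  zk.start()
  zk
}
```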
[GitHub] spark pull request: [SPARK-12429][Streaming][Doc]Add Accumulator a...
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/10385#issuecomment-166427396 lgtm
[GitHub] spark pull request: [SPARK-12371][SQL] Runtime nullability check f...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166430547 LGTM
[GitHub] spark pull request: [Spark-10625] [SQL] Spark SQL JDBC read/write ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/8785#discussion_r48195057 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala --- @@ -75,6 +78,16 @@ private[sql] object JDBCRelation { } ans.toArray } + + def getEffectiveProperties( + connectionProperties: Properties, + extraOptions: scala.collection.Map[String, String] = Map()): Properties = { --- End diff -- Still need not be qualified; then I think this need not wrap?
[GitHub] spark pull request: [SPARK-10647][MESOS] Fix zookeeper dir with me...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/10057#discussion_r48172947 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcher.scala --- @@ -50,7 +50,11 @@ private[mesos] class MesosClusterDispatcher( extends Logging { private val publicAddress = Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(args.host) - private val recoveryMode = conf.get("spark.mesos.deploy.recoveryMode", "NONE").toUpperCase() + private val recoveryMode = conf.getOption("spark.mesos.deploy.recoverMode").map { mode => +logWarning("spark.mesos.deploy.recoverMode is deprecated. Please configure " + --- End diff -- I see, should we still honor the setting as a deprecation cycle? I'm going to just remove the warning for now but add a TODO.
[GitHub] spark pull request: [SPARK-12371][SQL] Runtime nullability check f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166380809 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48117/ Test PASSed.
[GitHub] spark pull request: [SPARK-12371][SQL] Runtime nullability check f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166380806 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/10402#issuecomment-166385561 Done.
[GitHub] spark pull request: [SPARK-12062] [CORE] Change Master to asyc reb...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10284#issuecomment-166385160 @tedyu would you mind pointing me to the jenkins page?
[GitHub] spark pull request: [Spark-10625] [SQL] Spark SQL JDBC read/write ...
Github user tribbloid commented on the pull request: https://github.com/apache/spark/pull/8785#issuecomment-166392614 Sorry, I misunderstood your intention in the previous comments. Both problems should now be fixed: `toSeq` is removed, and the single line of code in brackets is pulled up.
[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10402#issuecomment-166392842 I guess you also need to add `extended`, like the `Upper` example given by @yhuai?
[GitHub] spark pull request: [SPARK-12439][SQL] Fix toCatalystArray and Map...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/10391#issuecomment-166404830 +1 to moving the test
[GitHub] spark pull request: [SPARK-12231][SQL]create a combineFilters' pro...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10388#issuecomment-166410817 **[Test build #48127 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48127/consoleFull)** for PR 10388 at commit [`305739f`](https://github.com/apache/spark/commit/305739f872ba90ba9ef4f3ef6c4f812b4024d8e9).
[GitHub] spark pull request: [SPARK-12440][Core] - Avoid setCheckpoint warn...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10392#issuecomment-166410688 **[Test build #48126 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48126/consoleFull)** for PR 10392 at commit [`bace5b9`](https://github.com/apache/spark/commit/bace5b98d20a9f7d6876b65120e6668aa064c948).
[GitHub] spark pull request: [SPARK-12321][SQL] JSON format for TreeNode (u...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10311
[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10418#issuecomment-166416596 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48191985 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala --- @@ -307,6 +307,7 @@ case class MapPartitions[T, U]( uEncoder: ExpressionEncoder[U], output: Seq[Attribute], child: SparkPlan) extends UnaryNode { + override def missingInput: AttributeSet = AttributeSet.empty --- End diff -- That is a good idea. Will add it soon. Thanks!
[GitHub] spark pull request: [SPARK-12463][SPARK-12464][SPARK-12465][SPARK-...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/10057#discussion_r48192082 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcher.scala --- @@ -50,7 +50,10 @@ private[mesos] class MesosClusterDispatcher( extends Logging { private val publicAddress = Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(args.host) - private val recoveryMode = conf.get("spark.mesos.deploy.recoveryMode", "NONE").toUpperCase() + // TODO(tnachen): Remove support for spark.mesos.deploy.recoverMode in 0.28. + private val recoveryMode = conf.getOption("spark.mesos.deploy.recoverMode").map { mode => --- End diff -- I would just remove support for this right now since we're not even documenting it. The next version will be Spark 2.0 so this is the right time to remove support for things that we don't want to maintain.
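For reference, the alternative being discussed (a deprecation cycle rather than immediate removal) would typically look like the following hedged sketch: the old key is honored with a warning before falling back to the current key. The replacement key name `spark.deploy.recoveryMode` is an assumption based on the diff, not confirmed by the thread.

```scala
// Hedged sketch: honor the deprecated key with a warning, then fall back
// to the current key. Key names follow the diff above and may differ.
private val recoveryMode =
  conf.getOption("spark.mesos.deploy.recoverMode").map { mode =>
    logWarning("spark.mesos.deploy.recoverMode is deprecated; use " +
      "spark.deploy.recoveryMode instead.")
    mode
  }.getOrElse(conf.get("spark.deploy.recoveryMode", "NONE")).toUpperCase()
```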
[GitHub] spark pull request: [SPARK-12429][Streaming][Doc]Add Accumulator a...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/10385#issuecomment-166420976 Added Java and Python examples.
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-166430919 I think green makes more sense for completed, and blue for active, yes. My hesitation is just that the completed column will always stick out, since it will always end up colored. Failed too, of course, but those do deserve attention. Any other opinions out there? After all, it's mostly a matter of taste at this point.
[GitHub] spark pull request: [SPARK-12453] [Streaming] Spark Streaming Kine...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10416#issuecomment-166387366 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-12415] Do not use closure serializer to...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10368#issuecomment-166387458 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48119/ Test PASSed.
[GitHub] spark pull request: [SPARK-12415] Do not use closure serializer to...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10368#issuecomment-166387455 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12415] Do not use closure serializer to...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10368#issuecomment-166387289 **[Test build #48119 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48119/consoleFull)** for PR 10368 at commit [`bc7699c`](https://github.com/apache/spark/commit/bc7699cc2793aeaa74f6a51bd8518dc91f327999). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.