[GitHub] spark issue #11867: [SPARK-14049] [CORE] Add functionality in spark history ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11867 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #11867: [SPARK-14049] [CORE] Add functionality in spark history ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11867 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71699/ Test PASSed.
[GitHub] spark issue #11867: [SPARK-14049] [CORE] Add functionality in spark history ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11867

**[Test build #71699 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71699/testReport)** for PR 11867 at commit [`38ebece`](https://github.com/apache/spark/commit/38ebece49f0313c7fa9553309da85b67af4398ec).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16593 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71701/ Test PASSed.
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16593 Merged build finished. Test PASSed.
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16593

**[Test build #71701 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71701/testReport)** for PR 16593 at commit [`acca991`](https://github.com/apache/spark/commit/acca991d3d92116ce3a88918b3798d14d32849f8).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class ReorderHivePartitionedTableSchema(sparkSession: SparkSession)`
[GitHub] spark issue #16656: [SPARK-18116][DStream] Report stream input information a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16656 **[Test build #71708 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71708/testReport)** for PR 16656 at commit [`547ecb3`](https://github.com/apache/spark/commit/547ecb338fa086deb86edf93b091ea6fdf2836f2).
[GitHub] spark pull request #16656: [SPARK-18116][DStream] Report stream input inform...
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/16656

[SPARK-18116][DStream] Report stream input information after recover from checkpoint

## What changes were proposed in this pull request?

Run a streaming application that sources from Kafka, with many batches queued in the job list, and then stop the application. After restarting it from the checkpoint file, the Spark UI reports the size of the queued batches stored in the checkpoint file as 0.

## How was this patch tested?

Updated unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uncleGen/spark SPARK-18116

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16656.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16656

commit b8d09457448ca0f47238c8d5afb3db3d3e0cb3dc
Author: uncleGen
Date: 2017-01-20T02:55:54Z
Report stream input information after recover from checkpoint

commit 547ecb338fa086deb86edf93b091ea6fdf2836f2
Author: uncleGen
Date: 2017-01-20T07:29:36Z
add unit test
[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16654 **[Test build #71707 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71707/testReport)** for PR 16654 at commit [`29bda3f`](https://github.com/apache/spark/commit/29bda3f136b9766decf87d8452f30dc40871441d).
[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16654 Merged build finished. Test FAILed.
[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16654 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71698/ Test FAILed.
[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16654

**[Test build #71698 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71698/testReport)** for PR 16654 at commit [`bb01219`](https://github.com/apache/spark/commit/bb01219acb8195c56bd76a25daec8952fba7631a).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16586: [SPARK-19117][SPARK-18922][TESTS] Fix the rest of flaky,...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16586

Hi @srowen, I think it is ready for a second look. In short, the current status is:

- there are some test failures (https://github.com/apache/spark/pull/16586#issuecomment-273437565) when running at the package level, which look possibly flaky
- these failures were individually tested and passed via `test-only` (https://github.com/apache/spark/pull/16586#issuecomment-273952379)
- `local metrics` still seems flaky, but less so in individual tests, judging from the build results in https://github.com/apache/spark/pull/16586#discussion_r97022356
[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16645 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71696/ Test PASSed.
[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16645 Merged build finished. Test PASSed.
[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16645

**[Test build #71696 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71696/testReport)** for PR 16645 at commit [`b1028ad`](https://github.com/apache/spark/commit/b1028ad573301ae4d351678a6e6b3b66392e32d3).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #16586: [WIP][SPARK-19117][SPARK-18922][TESTS] Fix the re...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16586#discussion_r97022356

--- Diff: core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala ---
@@ -229,7 +229,7 @@ class SparkListenerSuite extends SparkFunSuite with LocalSparkContext with Match
     }

     val numSlices = 16
-    val d = sc.parallelize(0 to 1e3.toInt, numSlices).map(w)
+    val d = sc.parallelize(0 to 1, numSlices).map(w)
--- End diff --

I am pretty sure the deserialization time test is less flaky now, judging from the individual tests below:

**Before** - 9 failures out of 10. [1 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/546-windows-complete/job/ktdmdxkdi4ni4ier) [2 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/549-windows-complete/job/b4mqgyt72g6he7e7) [3 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/551-windows-complete/job/j0ywrgv8d733yqb4) [4 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/553-windows-complete/job/yqoapee3og5x46wk) [5 (passed)](https://ci.appveyor.com/project/spark-test/spark/build/554-windows-complete/job/g3hhdl5s8odu9ir0) [6 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/555-windows-complete/job/9utyo2glowuf3ulc) [7 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/541-windows-test/job/4gtm26hcm5327aa1) [8 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/542-windows-test/job/166i4xiljy7iof8l) [9 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/540-windows-test/job/39v7nwuq598p3rtm) [10 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/539-windows-test/job/how9cbsj5i5cykeh)

**After** - 1 failure out of 7.
[1 (passed)](https://ci.appveyor.com/project/spark-test/spark/build/576-windows-complete/job/9sfx150cp38ofttn) [2 (passed)](https://ci.appveyor.com/project/spark-test/spark/build/577-windows-complete/job/nrjgs7emtlnj6y5f) [3 (passed)](https://ci.appveyor.com/project/spark-test/spark/build/578-windows-complete/job/qwgsuc5uas8mk0o7) [4 (passed)](https://ci.appveyor.com/project/spark-test/spark/build/579-windows-complete/job/sf1sspisb4ai4j7r) [5 (failed)](https://ci.appveyor.com/project/spark-test/spark/build/580-windows-complete/job/808c08fvnm26w3uh) [6 (passed)](https://ci.appveyor.com/project/spark-test/spark/build/581-windows-complete/job/y7o97qq18my44dvo) [7 (passed)](https://ci.appveyor.com/project/spark-test/spark/branch/68031366-45EE-45B4-867A-40A4D9B1AD07)
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16593 Merged build finished. Test PASSed.
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16593 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71702/ Test PASSed.
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16593

**[Test build #71702 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71702/testReport)** for PR 16593 at commit [`21f113a`](https://github.com/apache/spark/commit/21f113a85ae2df46c93dd57384a01955f394188b).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r97021451

--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.clustering
+
+import org.apache.spark.annotation.{Experimental, Since}
+import org.apache.spark.ml.Transformer
+import org.apache.spark.ml.linalg.{Vector}
+import org.apache.spark.ml.param._
+import org.apache.spark.ml.param.shared._
+import org.apache.spark.ml.util._
+import org.apache.spark.mllib.clustering.{PowerIterationClustering => MLlibPowerIterationClustering}
+import org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.{DataFrame, Dataset, Row}
+import org.apache.spark.sql.functions.{col}
+import org.apache.spark.sql.types.{IntegerType, LongType, StructField, StructType}
+
+/**
+ * Common params for PowerIterationClustering
+ */
+private[clustering] trait PowerIterationClusteringParams extends Params with HasMaxIter
+  with HasFeaturesCol with HasPredictionCol {
+
+  /**
+   * The number of clusters to create (k). Must be > 1. Default: 2.
+   * @group param
+   */
+  @Since("2.2.0")
+  final val k = new IntParam(this, "k", "The number of clusters to create. " +
+    "Must be > 1.", ParamValidators.gt(1))
+
+  /** @group getParam */
+  @Since("2.2.0")
+  def getK: Int = $(k)
+
+  /**
+   * Param for the initialization algorithm. This can be either "random" to use a random vector
+   * as vertex properties, or "degree" to use normalized sum similarities. Default: random.
+   */
+  @Since("2.2.0")
+  final val initMode = new Param[String](this, "initMode", "The initialization algorithm. " +
+    "Supported options: 'random' and 'degree'.",
+    (value: String) => validateInitMode(value))
--- End diff --

What about using the validator `ParamValidators.inArray[String](...)` instead?
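To illustrate the suggestion: `ParamValidators.inArray` builds a membership check over a fixed set of allowed values. A plain-Scala stand-in for what such a validator does (the `allowedInitModes`/`validateInitMode` names here are illustrative, not the actual Spark API) might look like this:

```scala
// Stand-in for ParamValidators.inArray[String](...): a validator that
// checks a param value belongs to a fixed set of supported options.
val allowedInitModes: Array[String] = Array("random", "degree")
val validateInitMode: String => Boolean = allowedInitModes.contains(_)
```

The upside of the array-based validator over a hand-written `validateInitMode` method is that the supported options live in one place, so the error message and the check cannot drift apart.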
[GitHub] spark issue #16586: [WIP][SPARK-19117][SPARK-18922][TESTS] Fix the rest of f...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16586

They all pass in individual tests with `test-only` (please check the logs above).

```
org.apache.spark.scheduler.SparkListenerSuite:
- local metrics (8 seconds, 656 milliseconds)

org.apache.spark.sql.hive.execution.HiveQuerySuite:
- constant null testing (531 milliseconds)

org.apache.spark.sql.hive.execution.AggregationQuerySuite:
- udaf with all data types (4 seconds, 285 milliseconds)

org.apache.spark.sql.hive.StatisticsSuite:
- verify serialized column stats after analyzing columns (2 seconds, 844 milliseconds)

org.apache.spark.sql.hive.execution.SQLQuerySuite:
- dynamic partition value test (1 second, 407 milliseconds)
- SPARK-6785: HiveQuerySuite - Date cast (188 milliseconds)
```

Although I am wondering how/why those tests seem more flaky (judging from observations in the builds), I think it is possible to say that, at least, Spark tests (the way I run them) are able to pass on Windows. Let me remove `[WIP]` and try to make the tests stable on Windows in the future, if this sounds reasonable.
[GitHub] spark issue #16619: [WIP][SPARK-19257][SQL]CatalogStorageFormat.locationUri ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16619 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71700/ Test FAILed.
[GitHub] spark issue #16619: [WIP][SPARK-19257][SQL]CatalogStorageFormat.locationUri ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16619

**[Test build #71700 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71700/testReport)** for PR 16619 at commit [`66dc4de`](https://github.com/apache/spark/commit/66dc4de3cd466e1fc6897b5034967a8c01bc8867).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16619: [WIP][SPARK-19257][SQL]CatalogStorageFormat.locationUri ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16619 Merged build finished. Test FAILed.
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97019304

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -91,7 +92,10 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
       .setStages(Array(rFormulaModel, gm))
       .fit(data)

-    new GaussianMixtureWrapper(pipeline, dim)
+    val gmm: GaussianMixtureModel = pipeline.stages(1).asInstanceOf[GaussianMixtureModel]
--- End diff --

they are the same when the pipeline has 1 stage. I prefer `stages.last` because if we later add a stage to transform the input data it will break `stages(1)`
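A quick sketch of why `stages.last` is more robust than `stages(1)`, using plain Scala with strings standing in for the fitted pipeline stages (the stage names below are illustrative, not the wrapper's actual code):

```scala
// Two-stage pipeline as in the diff: formula transform, then the model.
val stages = Array("rFormulaModel", "gaussianMixtureModel")
// For the current pipeline shape, both selections pick the model stage.
// But if a preprocessing stage is inserted later, stages(1) silently
// stops pointing at the model, while stages.last still works:
val extended = Array("rFormulaModel", "featureTransformer", "gaussianMixtureModel")
val byIndex = extended(1)   // "featureTransformer" - no longer the model
val byLast = extended.last  // "gaussianMixtureModel" - still the model
```

The fragility is silent because `asInstanceOf` on the wrong stage would only fail at runtime, which is exactly the kind of breakage the reviewer is flagging.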
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97019071

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -124,7 +129,8 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
     val rMetadataStr = sc.textFile(rMetadataPath, 1).first()
     val rMetadata = parse(rMetadataStr)
     val dim = (rMetadata \ "dim").extract[Int]
-    new GaussianMixtureWrapper(pipeline, dim, isLoaded = true)
+    val logLikelihood = (rMetadata \ "logLikelihood").extract[Double]
+    new GaussianMixtureWrapper(pipeline, dim, logLikelihood, isLoaded = true)
--- End diff --

it may not be a big deal right now, since spark.gmm is relatively new, but I think we should come up with a plan for model persistence compatibility, not only between R and the JVM but also across versions of Spark. It might also be useful to link this JIRA to SPARK-18864.
[GitHub] spark pull request #16582: [SPARK-19220][UI] Make redirection to HTTPS apply...
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16582#discussion_r97018717

--- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala ---
@@ -306,23 +311,31 @@ private[spark] object JettyUtils extends Logging {
     httpConnector.setPort(currentPort)
     connectors += httpConnector

-    sslOptions.createJettySslContextFactory().foreach { factory =>
-      // If the new port wraps around, do not try a privileged port.
-      val securePort =
-        if (currentPort != 0) {
-          (currentPort + 400 - 1024) % (65536 - 1024) + 1024
-        } else {
-          0
-        }
-      val scheme = "https"
-      // Create a connector on port securePort to listen for HTTPS requests
-      val connector = new ServerConnector(server, factory)
-      connector.setPort(securePort)
-
-      connectors += connector
-
-      // redirect the HTTP requests to HTTPS port
-      collection.addHandler(createRedirectHttpsHandler(securePort, scheme))
+    val httpsConnector = sslOptions.createJettySslContextFactory() match {
+      case Some(factory) =>
+        // If the new port wraps around, do not try a privileged port.
+        val securePort =
+          if (currentPort != 0) {
+            (currentPort + 400 - 1024) % (65536 - 1024) + 1024
+          } else {
+            0
+          }
+        val scheme = "https"
+        // Create a connector on port securePort to listen for HTTPS requests
+        val connector = new ServerConnector(server, factory)
+        connector.setPort(securePort)
+        connector.setName(SPARK_CONNECTOR_NAME)
+        connectors += connector
+
+        // redirect the HTTP requests to HTTPS port
+        httpConnector.setName(REDIRECT_CONNECTOR_NAME)
+        collection.addHandler(createRedirectHttpsHandler(securePort, scheme))
--- End diff --

I noticed one point. If a port is already used, `collection.addHandler` will be called more than once, so redirection doesn't work properly. Of course, it's not your fault. If you fix it in this PR together, that's good, but since it's a separate issue I'll fix it in another PR otherwise.
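For reference, the wrap-around arithmetic quoted in the diff maps the HTTP port to an unprivileged HTTPS port. A standalone sketch of that calculation (the `securePortFor` helper name is mine, not Spark's):

```scala
// Mirrors the securePort arithmetic from the JettyUtils diff above:
// offset the HTTP port by 400, wrapping within the unprivileged
// range [1024, 65535] so the result never lands on a privileged port.
def securePortFor(currentPort: Int): Int =
  if (currentPort != 0) {
    (currentPort + 400 - 1024) % (65536 - 1024) + 1024
  } else {
    0 // port 0 means "pick any free port"; preserve that behavior
  }
```

For example, `securePortFor(4040)` (the default Spark UI port) yields `4440`, while a port near the top of the range, such as `65300`, wraps back to `1188` instead of a privileged port below 1024.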
[GitHub] spark pull request #16631: [SPARK-19271] [SQL] Change non-cbo estimation of ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16631
[GitHub] spark issue #16631: [SPARK-19271] [SQL] Change non-cbo estimation of aggrega...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16631 Thanks! Merging to master
[GitHub] spark issue #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper in Spar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71706 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71706/testReport)** for PR 16566 at commit [`b25fc83`](https://github.com/apache/spark/commit/b25fc832c79714db20e2c79e95253919a36714f1).
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97018334

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -91,7 +92,10 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
   .setStages(Array(rFormulaModel, gm))
   .fit(data)

-new GaussianMixtureWrapper(pipeline, dim)
+val gmm: GaussianMixtureModel = pipeline.stages(1).asInstanceOf[GaussianMixtureModel]
--- End diff --

I have a question: I saw that some wrappers use `pipeline.stages.last` and some use `pipeline.stages(1)`. What is the difference between the two? I tried using them interchangeably and the tests still passed.
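A quick illustration of the question above, in plain Python list terms (an analogy only, not the SparkR/Scala API): `stages.last` and `stages(1)` pick the same element only because these wrappers happen to build two-stage pipelines (the RFormula model followed by the estimator). They would diverge if a stage were ever inserted in between.

```python
# Analogy: a Pipeline's stages as a plain list. Stage names are illustrative.
stages = ["rFormulaModel", "gm"]        # the 2-stage pipeline in this wrapper
print(stages[-1] == stages[1])          # True: "last" and "index 1" coincide

stages3 = ["rFormulaModel", "scaler", "gm"]  # hypothetical 3-stage pipeline
print(stages3[-1] == stages3[1])        # False: they pick different stages
```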
[GitHub] spark pull request #16631: [SPARK-19271] [SQL] Change non-cbo estimation of ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16631#discussion_r97017980

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -344,7 +344,8 @@ abstract class UnaryNode extends LogicalPlan {
   sizeInBytes = 1
 }

-child.stats(conf).copy(sizeInBytes = sizeInBytes)
+// Don't propagate rowCount and attributeStats, since they are not estimated here.
--- End diff --

Sure. Please submit the PR to fix the other cases. Thanks!
[GitHub] spark pull request #16631: [SPARK-19271] [SQL] Change non-cbo estimation of ...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16631#discussion_r97017902

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -344,7 +344,8 @@ abstract class UnaryNode extends LogicalPlan {
   sizeInBytes = 1
 }

-child.stats(conf).copy(sizeInBytes = sizeInBytes)
+// Don't propagate rowCount and attributeStats, since they are not estimated here.
--- End diff --

If we remove this, the estimation result of aggregate still has a wrong rowCount and attributeStats. Shall we merge this first? I'll then test the other unary nodes and fix them if something still goes wrong.
[GitHub] spark pull request #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper ...
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/16566#discussion_r97017872

--- Diff: R/pkg/R/mllib_clustering.R ---
@@ -38,6 +45,149 @@ setClass("KMeansModel", representation(jobj = "jobj"))
 #' @note LDAModel since 2.1.0
 setClass("LDAModel", representation(jobj = "jobj"))

+#' Bisecting K-Means Clustering Model
+#'
+#' Fits a bisecting k-means clustering model against a Spark DataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#'
+#' @param data a SparkDataFrame for training.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
+#'        operators are supported, including '~', '.', ':', '+', and '-'.
+#'        Note that the response variable of formula is empty in spark.bisectingKmeans.
+#' @param k the desired number of leaf clusters. Must be > 1.
+#'        The actual number could be smaller if there are no divisible leaf clusters.
+#' @param maxIter maximum iteration number.
+#' @param seed the random seed.
+#' @param minDivisibleClusterSize The minimum number of points (if greater than or equal to 1.0)
+#'        or the minimum proportion of points (if less than 1.0) of a divisible cluster.
+#'        Note that it is an advanced. The default value should be enough
--- End diff --

In Scala, it uses `@group expertParam` in the document and the API document shows `(expert-only) Parameters`. I will change it to `it is an expert parameter`.
[GitHub] spark issue #16655: [SPARK-19305][SQL] partitioned table should always put p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16655 **[Test build #71705 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71705/testReport)** for PR 16655 at commit [`9ec7d36`](https://github.com/apache/spark/commit/9ec7d36a198560441e3c3e96fa59789bdd36751b).
[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r97017743

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala ---
@@ -45,6 +46,18 @@ case class CreateHiveTableAsSelectCommand(
   override def innerChildren: Seq[LogicalPlan] = Seq(query)

   override def run(sparkSession: SparkSession): Seq[Row] = {
+    // when create a partitioned table, we should reorder the columns
+    // to put the partition columns at the end
+    val partitionAttrs = tableDesc.partitionColumnNames.map { p =>
+      query.output.find(_.name == p).getOrElse(
+        new AnalysisException(s"Partition column[$p] does not exist " +
+          s"in query output partition").asInstanceOf[NamedExpression]
+      )
+    }
+    val partitionSet = AttributeSet(partitionAttrs)
+    val dataAttrs = query.output.filterNot(partitionSet.contains)
+    val reorderedOutputQuery = Project(dataAttrs ++ partitionAttrs, query)
--- End diff --

we can revert this after https://github.com/apache/spark/pull/16655
[GitHub] spark issue #16655: [SPARK-19305][SQL] partitioned table should always put p...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16655 cc @yhuai @gatorsmile @windpiger
[GitHub] spark pull request #16655: [SPARK-19305][SQL] partitioned table should alway...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/16655

[SPARK-19305][SQL] partitioned table should always put partition columns at the end of table schema

## What changes were proposed in this pull request?

For data source tables, we will always reorder the specified table schema, or the query in CTAS, to put partition columns at the end. e.g. `CREATE TABLE t(a int, b int, c int, d int) USING parquet PARTITIONED BY (d, b)` will create a table with schema ``

Hive serde tables didn't have this problem before, because their CREATE TABLE syntax specifies the data schema and partition schema individually. However, after we unified the CREATE TABLE syntax, Hive serde tables also need to do the reorder. This PR puts the reorder logic in an analyzer rule, which works with both data source tables and Hive serde tables.

## How was this patch tested?

new regression test

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark schema

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16655.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16655

commit 9ec7d36a198560441e3c3e96fa59789bdd36751b
Author: Wenchen Fan
Date: 2017-01-20T06:10:36Z

partitioned table should always put partition columns at the end of table schema
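The reordering the PR description talks about (data columns first, then partition columns in their PARTITIONED BY order) can be sketched as follows. This is an illustration only, not the analyzer rule added in the PR:

```python
def reorder_schema(columns, partition_cols):
    """Move partition columns to the end of the schema, keeping the
    order they were listed in PARTITIONED BY."""
    data_cols = [c for c in columns if c not in partition_cols]
    return data_cols + list(partition_cols)

# CREATE TABLE t(a int, b int, c int, d int) USING parquet PARTITIONED BY (d, b)
print(reorder_schema(["a", "b", "c", "d"], ["d", "b"]))  # ['a', 'c', 'd', 'b']
```

Note that the non-partition columns keep their declared order, while `d` and `b` appear at the end in PARTITIONED BY order, not in declaration order.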
[GitHub] spark issue #16347: [SPARK-18934][SQL] Writing to dynamic partitions does no...
Github user junegunn commented on the issue: https://github.com/apache/spark/pull/16347 Rebased to current master. The patch is simpler thanks to the refactoring made in [SPARK-18243](https://issues.apache.org/jira/browse/SPARK-18243). Anyway, I can understand your rationale for wanting an explicit API on the writer side, but then make sure that the sort specification from `sortWithinPartitions` is automatically propagated to the writer; otherwise the method is no longer compatible with `SORT BY` in Hive, and [the documentation](https://github.com/apache/spark/blob/v2.1.0/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L990) should be corrected accordingly. Care should also be taken with the `INSERT OVERWRITE TABLE ... DISTRIBUTE BY ... SORT BY ...` statement in Spark SQL so that it stays compatible with the same Hive SQL.
[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16642 **[Test build #71704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71704/testReport)** for PR 16642 at commit [`3a5ebd7`](https://github.com/apache/spark/commit/3a5ebd7ee5ead531bc9a778703faebc4807b8611).
[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16642 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71703/ Test FAILed.
[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16642 **[Test build #71703 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71703/testReport)** for PR 16642 at commit [`095d421`](https://github.com/apache/spark/commit/095d421a05f985785964c2fae0e7c4f84fc1752a).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16642 Merged build finished. Test FAILed.
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97016294

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -124,7 +129,8 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
 val rMetadataStr = sc.textFile(rMetadataPath, 1).first()
 val rMetadata = parse(rMetadataStr)
 val dim = (rMetadata \ "dim").extract[Int]
-new GaussianMixtureWrapper(pipeline, dim, isLoaded = true)
+val logLikelihood = (rMetadata \ "logLikelihood").extract[Double]
+new GaussianMixtureWrapper(pipeline, dim, logLikelihood, isLoaded = true)
--- End diff --

Yeah, it will break existing persisted models, but I think we don't guarantee model persistence compatibility between different versions for SparkR. We are planning to make model persistence consistent between SparkR and MLlib; then there will be no SparkR-specific handling, and MLlib will handle all model persistence issues. However, if we want model persistence compatibility for SparkR now, I can add code here to handle different versions, but that will make maintenance more complicated. What's your opinion?
[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16642 **[Test build #71703 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71703/testReport)** for PR 16642 at commit [`095d421`](https://github.com/apache/spark/commit/095d421a05f985785964c2fae0e7c4f84fc1752a).
[GitHub] spark issue #16643: [SPARK-17724][Streaming][WebUI] Unevaluated new lines in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16643 Merged build finished. Test PASSed.
[GitHub] spark issue #16643: [SPARK-17724][Streaming][WebUI] Unevaluated new lines in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16643 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71695/ Test PASSed.
[GitHub] spark issue #16643: [SPARK-17724][Streaming][WebUI] Unevaluated new lines in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16643 **[Test build #71695 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71695/testReport)** for PR 16643 at commit [`d1c16e2`](https://github.com/apache/spark/commit/d1c16e2f17190e6d227a9d062a54ffb75687ce68).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16653: [SPARK-19302][DOC][MINOR] Fix the wrong item format in s...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16653 LGTM
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16593

Thanks all, let's make a summary:

1. no CTAS
`create table t(a int, b int, c string, d string) using $provider partitioned by(d, c)`
The schema order of the table in the catalog should be `a, b, d, c`.
a) For a data source table, this is ensured by `DataSource.getOrInferFileFormatSchema`: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L182
b) For a Hive table, as @lins05 commented, we currently do not handle this situation; as suggested, we should add a new rule for it.

2. CTAS
`create table t using $provider partitioned by(d, c) select 1 as b, 2 as a, 'x' as c, 'y' as d`
The schema order of the table in the catalog should be `b, a, d, c`.
a) For a data source table, this is ensured by creating the table with the updated schema: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala#L159
b) For a Hive table, this PR puts the logic in `CreateHiveTableAsSelectCommand`; if we add a new rule, we can merge this logic with the non-CTAS Hive case.

Above all, to ensure the schema order in the catalog is as expected, we need to add a new rule for Hive tables. This is the test branch implementing the new rule: https://github.com/windpiger/spark/commit/acca991d3d92116ce3a88918b3798d14d32849f8#diff-73bd90660f41c12a87ee9fe8d35d856aR463 But before implementing the new rule, we should first merge PR #16642, so that we can get a `tableDesc` with a non-empty schema, and then use it here: https://github.com/windpiger/spark/commit/acca991d3d92116ce3a88918b3798d14d32849f8#diff-73bd90660f41c12a87ee9fe8d35d856aR470

@cloud-fan @lins05 is this ok?
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97015657

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -91,7 +92,10 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
   .setStages(Array(rFormulaModel, gm))
   .fit(data)

-new GaussianMixtureWrapper(pipeline, dim)
+val gmm: GaussianMixtureModel = pipeline.stages(1).asInstanceOf[GaussianMixtureModel]
+val logLikelihood: Double = gmm.summary.logLikelihood
--- End diff --

Both are OK; giving the type explicitly makes it clearer to developers what it means.
[GitHub] spark pull request #15353: [SPARK-17724][WebUI][Streaming] Unevaluated new l...
Github user keypointt closed the pull request at: https://github.com/apache/spark/pull/15353
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16593 **[Test build #71702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71702/testReport)** for PR 16593 at commit [`21f113a`](https://github.com/apache/spark/commit/21f113a85ae2df46c93dd57384a01955f394188b).
[GitHub] spark issue #11867: [SPARK-14049] [CORE] Add functionality in spark history ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11867 **[Test build #71699 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71699/testReport)** for PR 11867 at commit [`38ebece`](https://github.com/apache/spark/commit/38ebece49f0313c7fa9553309da85b67af4398ec).
[GitHub] spark pull request #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper ...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16566#discussion_r97014380

--- Diff: R/pkg/R/mllib_clustering.R ---
@@ -38,6 +45,149 @@ setClass("KMeansModel", representation(jobj = "jobj"))
 #' @note LDAModel since 2.1.0
 setClass("LDAModel", representation(jobj = "jobj"))

+#' Bisecting K-Means Clustering Model
+#'
+#' Fits a bisecting k-means clustering model against a Spark DataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#'
+#' @param data a SparkDataFrame for training.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
+#'        operators are supported, including '~', '.', ':', '+', and '-'.
+#'        Note that the response variable of formula is empty in spark.bisectingKmeans.
+#' @param k the desired number of leaf clusters. Must be > 1.
+#'        The actual number could be smaller if there are no divisible leaf clusters.
+#' @param maxIter maximum iteration number.
+#' @param seed the random seed.
+#' @param minDivisibleClusterSize The minimum number of points (if greater than or equal to 1.0)
+#'        or the minimum proportion of points (if less than 1.0) of a divisible cluster.
+#'        Note that it is an advanced. The default value should be enough
--- End diff --

As far as I recall, the term used in the spark.ml doc is "expert parameter" - you might want to check how it is explained there.
[GitHub] spark pull request #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper ...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16566#discussion_r97014257

--- Diff: R/pkg/R/mllib_clustering.R ---
@@ -38,6 +45,149 @@ setClass("KMeansModel", representation(jobj = "jobj"))
 #' @note LDAModel since 2.1.0
 setClass("LDAModel", representation(jobj = "jobj"))

+#' Bisecting K-Means Clustering Model
+#'
+#' Fits a bisecting k-means clustering model against a Spark DataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#'
+#' @param data a SparkDataFrame for training.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
+#'        operators are supported, including '~', '.', ':', '+', and '-'.
+#'        Note that the response variable of formula is empty in spark.bisectingKmeans.
+#' @param k the desired number of leaf clusters. Must be > 1.
+#'        The actual number could be smaller if there are no divisible leaf clusters.
+#' @param maxIter maximum iteration number.
+#' @param seed the random seed.
+#' @param minDivisibleClusterSize The minimum number of points (if greater than or equal to 1.0)
+#'        or the minimum proportion of points (if less than 1.0) of a divisible cluster.
+#'        Note that it is an advanced. The default value should be enough
--- End diff --

`Note that it is an advanced.` do you mean to say `Note that it is an advanced option.`?
[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16593 **[Test build #71701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71701/testReport)** for PR 16593 at commit [`acca991`](https://github.com/apache/spark/commit/acca991d3d92116ce3a88918b3798d14d32849f8).
[GitHub] spark issue #16619: [WIP][SPARK-19257][SQL]CatalogStorageFormat.locationUri ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16619 **[Test build #71700 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71700/testReport)** for PR 16619 at commit [`66dc4de`](https://github.com/apache/spark/commit/66dc4de3cd466e1fc6897b5034967a8c01bc8867).
[GitHub] spark issue #16653: [SPARK-19302][DOC][MINOR] Fix the wrong item format in s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16653 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71697/
[GitHub] spark pull request #16457: [SPARK-19057][ML] Instances' weight must be non-n...
Github user zhengruifeng closed the pull request at: https://github.com/apache/spark/pull/16457
[GitHub] spark issue #16457: [SPARK-19057][ML] Instances' weight must be non-negative
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16457 I think it's better to discuss this in the JIRA. Once we reach an agreement, I will reopen this PR.
[GitHub] spark issue #16653: [SPARK-19302][DOC][MINOR] Fix the wrong item format in s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16653 Merged build finished. Test PASSed.
[GitHub] spark issue #16653: [SPARK-19302][DOC][MINOR] Fix the wrong item format in s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16653 **[Test build #71697 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71697/testReport)** for PR 16653 at commit [`f337dc3`](https://github.com/apache/spark/commit/f337dc33be85374296b43b1f25435521be63b782).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16654 **[Test build #71698 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71698/testReport)** for PR 16654 at commit [`bb01219`](https://github.com/apache/spark/commit/bb01219acb8195c56bd76a25daec8952fba7631a).
[GitHub] spark pull request #16631: [SPARK-19271] [SQL] Change non-cbo estimation of ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16631#discussion_r97013889

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -344,7 +344,8 @@ abstract class UnaryNode extends LogicalPlan {
      sizeInBytes = 1
    }

    -child.stats(conf).copy(sizeInBytes = sizeInBytes)
    +// Don't propagate rowCount and attributeStats, since they are not estimated here.
--- End diff --

How about removing this and fixing all the similar issues in a separate PR?
[GitHub] spark issue #16631: [SPARK-19271] [SQL] Change non-cbo estimation of aggrega...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16631 LGTM except one comment
[GitHub] spark pull request #16654: [SPARK-19303][ML][WIP] Add evaluate method in clu...
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/16654

[SPARK-19303][ML][WIP] Add evaluate method in clustering models

## What changes were proposed in this pull request?

1. Add an evaluation metric in summary.
2. Add an evaluate() method which returns a summary.

## How was this patch tested?

Added tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhengruifeng/spark clustering_model_evaluate

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16654.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16654

commit bb01219acb8195c56bd76a25daec8952fba7631a
Author: Zheng RuiFeng
Date: 2017-01-20T05:12:29Z

    create pr
[GitHub] spark issue #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture supports...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16646 Looks good. I just have some questions, not specific to this PR.
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97013742

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -124,7 +129,8 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
      val rMetadataStr = sc.textFile(rMetadataPath, 1).first()
      val rMetadata = parse(rMetadataStr)
      val dim = (rMetadata \ "dim").extract[Int]
    - new GaussianMixtureWrapper(pipeline, dim, isLoaded = true)
    + val logLikelihood = (rMetadata \ "logLikelihood").extract[Double]
    + new GaussianMixtureWrapper(pipeline, dim, logLikelihood, isLoaded = true)
--- End diff --

Would this break with any existing persisted model (one that is missing a double here for logLikelihood)? Is there a way to mitigate that?
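One mitigation for the backward-compatibility concern raised above is to treat the new metadata field as optional and fall back to a sentinel when it is absent. The following is a minimal Python sketch of that idea using the standard `json` module; the actual wrapper is Scala/json4s, and the metadata layout (`dim`, `logLikelihood`) is taken from the quoted diff, so treat this purely as an illustration, not Spark's implementation.

```python
import json
import math

def read_gmm_metadata(r_metadata_str):
    """Parse wrapper metadata, tolerating models persisted before
    the logLikelihood field was added (illustrative sketch)."""
    meta = json.loads(r_metadata_str)
    dim = meta["dim"]
    # Fall back to NaN when an old model has no logLikelihood entry,
    # instead of failing the whole load.
    log_likelihood = meta.get("logLikelihood", float("nan"))
    return dim, log_likelihood

# Old model: no logLikelihood field, still loads.
dim, ll = read_gmm_metadata('{"dim": 3}')
# New model: field present.
dim2, ll2 = read_gmm_metadata('{"dim": 3, "logLikelihood": -12.5}')
```

In json4s terms, the analogous change would be using an optional extraction with a default rather than an unconditional `extract[Double]`.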
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97013542

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -91,7 +92,10 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
      .setStages(Array(rFormulaModel, gm))
      .fit(data)

    - new GaussianMixtureWrapper(pipeline, dim)
    + val gmm: GaussianMixtureModel = pipeline.stages(1).asInstanceOf[GaussianMixtureModel]
    + val logLikelihood: Double = gmm.summary.logLikelihood
--- End diff --

For this line and the one above it, why do we need to explicitly give it a type (i.e., `Double` or `GaussianMixtureModel`)?
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97013459

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -91,7 +92,10 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
      .setStages(Array(rFormulaModel, gm))
      .fit(data)

    - new GaussianMixtureWrapper(pipeline, dim)
    + val gmm: GaussianMixtureModel = pipeline.stages(1).asInstanceOf[GaussianMixtureModel]
--- End diff --

hmm, I see what you are saying
[GitHub] spark issue #16653: [SPARK-19302][DOC][MINOR] Fix the wrong item format in s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16653 **[Test build #71697 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71697/testReport)** for PR 16653 at commit [`f337dc3`](https://github.com/apache/spark/commit/f337dc33be85374296b43b1f25435521be63b782).
[GitHub] spark pull request #16653: [SPARK-19302][DOC][MINOR] Fix the wrong item form...
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/16653

[SPARK-19302][DOC][MINOR] Fix the wrong item format in security.md

## What changes were proposed in this pull request?

In docs/security.md, there is a description as follows.

```
steps to configure the key-stores and the trust-store for the standalone deployment mode is as follows:
* Generate a keys pair for each node
* Export the public key of the key pair to a file on each node
* Import all exported public keys into a single trust-store
```

According to markdown format, the first item should follow a blank line.

## How was this patch tested?

Manually tested.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sarutak/spark SPARK-19302

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16653.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16653

commit f337dc33be85374296b43b1f25435521be63b782
Author: sarutak
Date: 2017-01-20T04:52:38Z

    Fixed item format in security.md to abide by the Markdown format
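The fix described in this PR amounts to inserting a blank line between the introductory sentence and the first `*` item, so that Markdown renderers recognize the items as a list rather than as a continuation of the paragraph. A sketch of the corrected snippet (based on the text quoted in the PR description):

```
steps to configure the key-stores and the trust-store for the standalone deployment mode is as follows:

* Generate a keys pair for each node
* Export the public key of the key pair to a file on each node
* Import all exported public keys into a single trust-store
```

Without the blank line, the Kramdown renderer used by the Spark docs site treats the `*` lines as ordinary paragraph text.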
[GitHub] spark issue #16652: [SPARK-19234][MLLib] AFTSurvivalRegression should fail f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16652 Can one of the admins verify this patch?
[GitHub] spark pull request #16652: [SPARK-19234][MLLib] AFTSurvivalRegression should...
GitHub user admackin opened a pull request: https://github.com/apache/spark/pull/16652

[SPARK-19234][MLLib] AFTSurvivalRegression should fail fast when any labels are zero

## What changes were proposed in this pull request?

If any labels of 0.0 (which are invalid) are supplied, AFTSurvivalRegression gives an error straight away rather than hard-to-interpret warnings and zero-valued coefficients in the output.

## How was this patch tested?

Verified against the current test suite. (One test needed to be updated, as it was providing values of zero for labels and so was failing after this patch.)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/admackin/spark master

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16652.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16652

commit ab6d4148c4aa721898733b14eec5068652ca1085
Author: Andy MacKinlay
Date: 2017-01-20T01:56:45Z

    Addresses SPARK-19234 - make sure label is positive

commit b07c281c378d68d86b81498ca247c7346719973e
Author: Andy MacKinlay
Date: 2017-01-20T04:02:54Z

    Addresses SPARK-19234 - fix test suite to ensure no zero-labels get passed in test cases as they now throw errors
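The fail-fast behavior this PR proposes can be sketched in a few lines: since AFT survival regression works with log(label), any label that is not strictly positive is invalid and should be rejected up front. The Python below is a hypothetical illustration of that check (the function name and error message are mine, not the PR's Scala code).

```python
def check_aft_labels(labels):
    """Fail fast if any survival-regression label is non-positive.

    AFT regression models log(label), so labels of 0.0 (or below)
    are invalid; without an early check they surface only as
    hard-to-interpret warnings and zero-valued coefficients.
    """
    for i, label in enumerate(labels):
        if label <= 0.0:
            raise ValueError(
                "AFTSurvivalRegression requires positive labels, "
                "but got %r at index %d" % (label, i))

check_aft_labels([1.2, 0.5, 3.0])   # valid labels: no error
try:
    check_aft_labels([1.2, 0.0, 3.0])
except ValueError as e:
    print(e)                         # invalid label reported immediately
```

Raising at training time moves the failure to where the bad data enters, which is the "fail fast" design choice the PR title describes.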
[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16645 **[Test build #71696 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71696/testReport)** for PR 16645 at commit [`b1028ad`](https://github.com/apache/spark/commit/b1028ad573301ae4d351678a6e6b3b66392e32d3).
[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16645 retest this please
[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16645 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71693/
[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16645 Merged build finished. Test FAILed.
[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16645 **[Test build #71693 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71693/testReport)** for PR 16645 at commit [`b1028ad`](https://github.com/apache/spark/commit/b1028ad573301ae4d351678a6e6b3b66392e32d3).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/16633

Hi @viirya, the main concern of @scwf is that we can't afford a performance regression in any customer scenario. I think you can understand that :) I went through the discussion above, and it seems we have solutions for both cases you mentioned [here](https://github.com/apache/spark/pull/16633#issuecomment-273963150), so the talking points become the following two:
1. how to decide the threshold between the two cases;
2. the rdd chain is broken.

Let's wait for @rxin's comment on the second point. Here I'm just interested in the first one. One possible way to get the number is to modify the map output statistics, as suggested by @scwf. For CBO, if the computing logic before the limit is complex, it's hard to get an accurate estimation. E.g. joins from filtered tables, where join keys and filter keys are probably different (that would need column correlation info). As you mentioned we can get an estimated number and a confidence, can you describe how?
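The threshold question in the thread above can be framed as a simple decision rule: only pick the no-shuffle limit when the estimated row count clears the limit with enough confidence, and fall back to the conservative shuffle-based plan otherwise. The sketch below is a hypothetical heuristic to make that framing concrete; the function name, the confidence cutoff, and the strategy labels are all illustrative and not part of Spark's planner.

```python
def choose_limit_strategy(estimated_rows, confidence, limit,
                          min_confidence=0.9):
    """Decide between a local (no-shuffle) limit and a shuffled
    global limit, given an estimated row count and how much we
    trust that estimate. Purely an illustrative heuristic."""
    if confidence >= min_confidence and estimated_rows >= limit:
        # Estimate is trustworthy and there are likely enough rows:
        # take the limit without a shuffle.
        return "local-limit"
    # Low confidence or too few estimated rows: use the
    # conservative shuffle-based plan to avoid a regression.
    return "shuffled-limit"

print(choose_limit_strategy(1_000_000, 0.95, 100))  # local-limit
print(choose_limit_strategy(1_000_000, 0.30, 100))  # shuffled-limit
```

The hard part, as the comment notes, is producing `estimated_rows` and `confidence` at all when the plan below the limit involves joins on keys uncorrelated with the filter keys.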
[GitHub] spark pull request #16647: [SPARK-19292][SQL] filter with partition columns ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16647
[GitHub] spark issue #16647: [SPARK-19292][SQL] filter with partition columns should ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16647 Thanks! Merging to master
[GitHub] spark pull request #16631: [SPARK-19271] [SQL] Change non-cbo estimation of ...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16631#discussion_r97009203

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -344,7 +344,8 @@ abstract class UnaryNode extends LogicalPlan {
      sizeInBytes = 1
    }

    -child.stats(conf).copy(sizeInBytes = sizeInBytes)
    +// Don't propagate rowCount and attributeStats, since they are not estimated here.
--- End diff --

Yes.
[GitHub] spark pull request #16631: [SPARK-19271] [SQL] Change non-cbo estimation of ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16631#discussion_r97008849

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -344,7 +344,8 @@ abstract class UnaryNode extends LogicalPlan {
      sizeInBytes = 1
    }

    -child.stats(conf).copy(sizeInBytes = sizeInBytes)
    +// Don't propagate rowCount and attributeStats, since they are not estimated here.
--- End diff --

This sounds like a general bug. We have multiple `UnaryNode`s doing the same thing. Is my understanding right?
[GitHub] spark issue #16028: [SPARK-18518][ML] HasSolver supports override
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16028 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71694/
[GitHub] spark issue #16630: [SPARK-19270][ML] Add summary table to GLM summary
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/16630 Jenkins add to whitelist
[GitHub] spark issue #16630: [SPARK-19270][ML] Add summary table to GLM summary
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/16630 Jenkins test this please
[GitHub] spark issue #16028: [SPARK-18518][ML] HasSolver supports override
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16028 Merged build finished. Test PASSed.
[GitHub] spark issue #16028: [SPARK-18518][ML] HasSolver supports override
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16028 **[Test build #71694 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71694/testReport)** for PR 16028 at commit [`a95`](https://github.com/apache/spark/commit/a959a0b9a98dda2f45ce4843ed8595024e58).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15219: [SPARK-14098][SQL] Generate Java code to build CachedCol...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15219 **[Test build #71692 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71692/testReport)** for PR 15219 at commit [`b15d9d5`](https://github.com/apache/spark/commit/b15d9d5724936f5946d99acc40b75754e8583aa6).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15219: [SPARK-14098][SQL] Generate Java code to build CachedCol...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15219 Merged build finished. Test FAILed.
[GitHub] spark issue #15219: [SPARK-14098][SQL] Generate Java code to build CachedCol...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15219 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71692/
[GitHub] spark issue #16630: [SPARK-19270][ML] Add summary table to GLM summary
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/16630 jenkins test this please
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97007956 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -91,7 +92,10 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
       .setStages(Array(rFormulaModel, gm))
       .fit(data)

-    new GaussianMixtureWrapper(pipeline, dim)
+    val gmm: GaussianMixtureModel = pipeline.stages(1).asInstanceOf[GaussianMixtureModel]
--- End diff --
Here we need to explicitly get ```logLikelihood``` and make it a member of the wrapper, since ```summary``` is not saved with the pipeline model, so we can't get it (after L40) from a persisted R Gaussian mixture model.
[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16646#discussion_r97007694 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala ---
@@ -91,7 +92,10 @@ private[r] object GaussianMixtureWrapper extends MLReadable[GaussianMixtureWrapp
       .setStages(Array(rFormulaModel, gm))
       .fit(data)

-    new GaussianMixtureWrapper(pipeline, dim)
+    val gmm: GaussianMixtureModel = pipeline.stages(1).asInstanceOf[GaussianMixtureModel]
+    val logLikelihood: Double = gmm.summary.logLikelihood
+
+    new GaussianMixtureWrapper(pipeline, dim, logLikelihood)
--- End diff --
We can't, since ```summary``` is not saved with the pipeline model, so we need to save it into the wrapper explicitly.
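The pattern these two review comments describe — a training-time summary that exists only on the freshly fitted model and so must be copied into the wrapper eagerly, before persistence — can be sketched in a Spark-independent way. All names below (`TrainingSummary`, `FittedModel`, `Wrapper`) are illustrative stand-ins, not Spark API:

```scala
// Illustrative sketch only: none of these types are Spark classes.
// A summary computed during fit() is not written out by save(), so any
// value needed after reload must be captured into the wrapper at fit time.
case class TrainingSummary(logLikelihood: Double)

// After a save/load round trip the summary is gone, hence Option.
class FittedModel(val summary: Option[TrainingSummary])

// The wrapper stores the scalar itself, which can be persisted.
class Wrapper(val model: FittedModel, val logLikelihood: Double)

object Wrapper {
  def fit(): Wrapper = {
    val model = new FittedModel(Some(TrainingSummary(-123.4)))
    // Capture now: on a reloaded model, summary would be None.
    val ll = model.summary.get.logLikelihood
    new Wrapper(model, ll)
  }
}
```

This mirrors why the diff above adds `logLikelihood` as a constructor argument of `GaussianMixtureWrapper` instead of reading it lazily from the pipeline.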
[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...
Github user scwf commented on the issue: https://github.com/apache/spark/pull/16633 @viirya I suggest fixing the second issue in this PR; let's wait for some comments on the first. /cc @rxin and @wzhfy, who may comment on the first case.
[GitHub] spark issue #11867: [SPARK-14049] [CORE] Add functionality in spark history ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11867 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71690/
[GitHub] spark issue #11867: [SPARK-14049] [CORE] Add functionality in spark history ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11867 Merged build finished. Test FAILed.