[spark] branch master updated (27ed89b7be5 -> be5c85cffee)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 27ed89b7be5 [SPARK-38775][ML] cleanup validation functions
     add be5c85cffee [SPARK-36979][SQL][TESTS][FOLLOWUP] Move the test from `SQLQuerySuite` to `SQLQueryTestSuite`

No new revisions were added by this update.

Summary of changes:
 .../resources/sql-tests/inputs/non-excludable-rule.sql  |  4
 .../sql-tests/results/non-excludable-rule.sql.out       | 16
 .../test/scala/org/apache/spark/sql/SQLQuerySuite.scala |  7 ---
 3 files changed, 20 insertions(+), 7 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-38775][ML] cleanup validation functions
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 27ed89b7be5 [SPARK-38775][ML] cleanup validation functions
27ed89b7be5 is described below

commit 27ed89b7be5ebb91e4a0b106b1669a7867a6012d
Author: Ruifeng Zheng
AuthorDate: Sat Jun 18 21:51:50 2022 -0700

    [SPARK-38775][ML] cleanup validation functions

    ### What changes were proposed in this pull request?
    1. Remove the unused `extractInstances` and `extractLabeledPoints` from `Predictor`;
    2. Remove the unused `checkNonNegativeWeight` from `functions`;
    3. Move `getNumClasses` from `Classifier` to `DatasetUtils`;
    4. Move `getNumFeatures` from `MetadataUtils` to `DatasetUtils`.

    ### Why are the changes needed?
    To unify the methods.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Existing test suites.

    Closes #36049 from zhengruifeng/validate_cleanup.
    Authored-by: Ruifeng Zheng
    Signed-off-by: Dongjoon Hyun
---
 .../spark/examples/ml/DeveloperApiExample.scala    |   7 +-
 .../main/scala/org/apache/spark/ml/Predictor.scala |  51 +-
 .../spark/ml/classification/Classifier.scala       | 106 +
 .../ml/classification/DecisionTreeClassifier.scala |   3 +-
 .../spark/ml/classification/FMClassifier.scala     |   2 +-
 .../spark/ml/classification/GBTClassifier.scala    |  20 +---
 .../ml/classification/RandomForestClassifier.scala |   2 +-
 .../spark/ml/clustering/GaussianMixture.scala      |   2 +-
 .../evaluation/BinaryClassificationEvaluator.scala |   7 +-
 .../spark/ml/evaluation/ClusteringEvaluator.scala  |  21 ++--
 .../spark/ml/evaluation/ClusteringMetrics.scala    |   6 +-
 .../MulticlassClassificationEvaluator.scala        |   8 +-
 .../spark/ml/evaluation/RegressionEvaluator.scala  |  16 ++--
 .../scala/org/apache/spark/ml/feature/LSH.scala    |   2 +-
 .../org/apache/spark/ml/feature/RobustScaler.scala |   2 +-
 .../org/apache/spark/ml/feature/Selector.scala     |   2 +-
 .../ml/feature/UnivariateFeatureSelector.scala     |   2 +-
 .../apache/spark/ml/feature/VectorIndexer.scala    |   2 +-
 .../main/scala/org/apache/spark/ml/functions.scala |   6 --
 .../apache/spark/ml/regression/FMRegressor.scala   |   2 +-
 .../apache/spark/ml/regression/GBTRegressor.scala  |  20 +---
 .../regression/GeneralizedLinearRegression.scala   |   2 +-
 .../spark/ml/regression/LinearRegression.scala     |   2 +-
 .../org/apache/spark/ml/util/DatasetUtils.scala    |  82 +++-
 .../org/apache/spark/ml/util/MetadataUtils.scala   |  14 +--
 .../spark/ml/classification/ClassifierSuite.scala  |  44 +
 project/MimaExcludes.scala                         |  16 +++-
 27 files changed, 152 insertions(+), 297 deletions(-)

diff --git a/examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala b/examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala
index 487cb27b93f..bfee3301f8e 100644
--- a/examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala
+++ b/examples/src/main/scala/org/apache/spark/examples/ml/DeveloperApiExample.scala
@@ -24,6 +24,7 @@ import org.apache.spark.ml.linalg.{BLAS, Vector, Vectors}
 import org.apache.spark.ml.param.{IntParam, ParamMap}
 import org.apache.spark.ml.util.Identifiable
 import org.apache.spark.sql.{Dataset, Row, SparkSession}
+import org.apache.spark.sql.functions.col
 
 /**
  * A simple example demonstrating how to write your own learning algorithm using Estimator,
@@ -120,8 +121,10 @@ private class MyLogisticRegression(override val uid: String)
   // This method is used by fit()
   override protected def train(dataset: Dataset[_]): MyLogisticRegressionModel = {
-    // Extract columns from data using helper method.
-    val oldDataset = extractLabeledPoints(dataset)
+    // Extract columns from data.
+    val oldDataset = dataset.select(col($(labelCol)).cast("double"), col($(featuresCol)))
+      .rdd
+      .map { case Row(l: Double, f: Vector) => LabeledPoint(l, f) }
 
     // Do learning to estimate the coefficients vector.
     val numFeatures = oldDataset.take(1)(0).features.size
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala b/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
index e0b128e3698..9c6eb880c80 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
@@ -18,14 +18,11 @@
 package org.apache.spark.ml
 
 import org.apache.spark.annotation.Since
-import org.apache.spark.ml.feature.{Instance, LabeledPoint}
-import org.apache.spark.ml.functions.checkNonNegativeWeight
-import org.apache.spark.ml.linalg.{Vector, VectorUDT}
+import org.apache.spark.ml.linalg.VectorUDT
 import org.apache.spark.ml.param._
 import
[spark] branch master updated (a859dd25019 -> 362f27f38e9)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from a859dd25019 [SPARK-39509][INFRA] Support `DEFAULT_ARTIFACT_REPOSITORY` in `check-license`
     add 362f27f38e9 [SPARK-39507][CORE] `SocketAuthServer` should respect Java IPv6 options

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [spark-website] srowen commented on a diff in pull request #400: [SPARK-39512] Document docker image release steps
srowen commented on code in PR #400:
URL: https://github.com/apache/spark-website/pull/400#discussion_r901012063

## site/sitemap.xml: ##
@@ -941,27 +941,27 @@
 weekly
- https://spark.apache.org/graphx/
+ https://spark.apache.org/news/

Review Comment:
I don't know which ordering is correct, but maybe revert this change?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [spark-website] holdenk opened a new pull request, #400: [SPARK-39512] Document docker image release steps
holdenk opened a new pull request, #400:
URL: https://github.com/apache/spark-website/pull/400

Document the docker image release steps for the release manager to follow when finalizing the release.
[spark] branch master updated: [SPARK-39509][INFRA] Support `DEFAULT_ARTIFACT_REPOSITORY` in `check-license`
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new a859dd25019 [SPARK-39509][INFRA] Support `DEFAULT_ARTIFACT_REPOSITORY` in `check-license`
a859dd25019 is described below

commit a859dd25019715165ddb0defe3ddfd8e3cba866e
Author: Dongjoon Hyun
AuthorDate: Sat Jun 18 10:05:24 2022 -0700

    [SPARK-39509][INFRA] Support `DEFAULT_ARTIFACT_REPOSITORY` in `check-license`

    ### What changes were proposed in this pull request?

    This PR makes the `check-license` script support IPv6 environments via `DEFAULT_ARTIFACT_REPOSITORY`.

    ### Why are the changes needed?

    The Apache Maven Central repository has two separate URLs.
    - https://repo.maven.apache.org/maven2/ (IPv4)
    - https://ipv6.repo1.maven.org/maven2/ (IPv6)

    `DEFAULT_ARTIFACT_REPOSITORY` allows IPv6 users to use `ipv6.repo1.maven.org` or the Google Maven Central Mirror according to their needs.

    ### Does this PR introduce _any_ user-facing change?

    No. This is a dev-only change.

    ### How was this patch tested?

    Pass the CIs.

    Closes #36907 from dongjoon-hyun/SPARK-39509.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
---
 dev/check-license | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/check-license b/dev/check-license
index bd255954d6d..f1cd5a5f1d4 100755
--- a/dev/check-license
+++ b/dev/check-license
@@ -20,7 +20,7 @@
 acquire_rat_jar () {
-  URL="https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat/${RAT_VERSION}/apache-rat-${RAT_VERSION}.jar";
+  URL="${DEFAULT_ARTIFACT_REPOSITORY:-https://repo1.maven.org/maven2/}org/apache/rat/apache-rat/${RAT_VERSION}/apache-rat-${RAT_VERSION}.jar";
   JAR="$rat_jar"
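
For readers unfamiliar with the `${VAR:-default}` form used in the changed `URL=` line, the following is a minimal sketch of how the fallback behaves. It is not the actual `dev/check-license` script: the helper name `build_rat_url` and the version string `0.13` are assumptions made up for illustration, while the two repository URLs are the ones quoted in the commit above.

```shell
#!/usr/bin/env bash
# Sketch only: build_rat_url is a hypothetical helper, not part of dev/check-license.
# The version passed in ("0.13") is an arbitrary illustrative value.
build_rat_url () {
  local rat_version="$1"
  # ${VAR:-default} expands to the default when VAR is unset or empty.
  local repo="${DEFAULT_ARTIFACT_REPOSITORY:-https://repo1.maven.org/maven2/}"
  # The path is appended directly, so the repository value must end with "/".
  echo "${repo}org/apache/rat/apache-rat/${rat_version}/apache-rat-${rat_version}.jar"
}

# Without an override, the default repository is used.
unset DEFAULT_ARTIFACT_REPOSITORY
build_rat_url 0.13
# prints https://repo1.maven.org/maven2/org/apache/rat/apache-rat/0.13/apache-rat-0.13.jar

# With an override (run in a subshell so the setting does not leak).
( DEFAULT_ARTIFACT_REPOSITORY="https://ipv6.repo1.maven.org/maven2/"; build_rat_url 0.13 )
# prints https://ipv6.repo1.maven.org/maven2/org/apache/rat/apache-rat/0.13/apache-rat-0.13.jar
```

Because the artifact path is concatenated without a separator, an override that omits the trailing slash would produce a malformed URL; that detail follows from the diff, where the default value itself ends in `/`.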