(spark) branch master updated: [SPARK-44953][CORE] Log a warning when shuffle tracking is enabled along side another DA supported mechanism
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a101c48dd965 [SPARK-44953][CORE] Log a warning when shuffle tracking is enabled along side another DA supported mechanism a101c48dd965 is described below commit a101c48dd9650d2bca2047b91f9e2a3ba90f142d Author: zwangsheng AuthorDate: Mon May 13 13:33:34 2024 -0700 [SPARK-44953][CORE] Log a warning when shuffle tracking is enabled along side another DA supported mechanism ### What changes were proposed in this pull request? Log a warning when shuffle tracking is enabled along side another DA supported mechanism ### Why are the changes needed? Some users enable both shuffle tracking and another mechanism (like migration) and then are confused when their jobs don't scale down. https://issues.apache.org/jira/browse/SPARK-44953 ### Does this PR introduce _any_ user-facing change? Yes, user can find the warning log when enabled both shuffle tracking and another DA supported mechanism(shuffle decommission). ### How was this patch tested? No ### Was this patch authored or co-authored using generative AI tooling? NO Closes #45454 from zwangsheng/SPARK-44953. Authored-by: zwangsheng Signed-off-by: Holden Karau --- .../scala/org/apache/spark/ExecutorAllocationManager.scala | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala index 3bfa1ae0d4dc..1fe02eec3a07 100644 --- a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala +++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala @@ -206,11 +206,13 @@ private[spark] class ExecutorAllocationManager( throw new SparkException( s"s${DYN_ALLOCATION_SUSTAINED_SCHEDULER_BACKLOG_TIMEOUT.key} must be > 0!") } +val shuffleTrackingEnabled = conf.get(config.DYN_ALLOCATION_SHUFFLE_TRACKING_ENABLED) +val shuffleDecommissionEnabled = decommissionEnabled && + conf.get(config.STORAGE_DECOMMISSION_SHUFFLE_BLOCKS_ENABLED) if (!conf.get(config.SHUFFLE_SERVICE_ENABLED) && !reliableShuffleStorage) { - if (conf.get(config.DYN_ALLOCATION_SHUFFLE_TRACKING_ENABLED)) { + if (shuffleTrackingEnabled) { logInfo("Dynamic allocation is enabled without a shuffle service.") - } else if (decommissionEnabled && - conf.get(config.STORAGE_DECOMMISSION_SHUFFLE_BLOCKS_ENABLED)) { + } else if (shuffleDecommissionEnabled) { logInfo("Shuffle data decommission is enabled without a shuffle service.") } else if (!testing) { throw new SparkException("Dynamic allocation of executors requires one of the " + @@ -224,6 +226,12 @@ private[spark] class ExecutorAllocationManager( } } +if (shuffleTrackingEnabled && (shuffleDecommissionEnabled || reliableShuffleStorage)) { + logWarning("You are enabling both shuffle tracking and other DA supported mechanism, " + +"which will cause idle executors not to be released in a timely, " + +"please check the configurations.") +} + if (executorAllocationRatio > 1.0 || executorAllocationRatio <= 0.0) { throw new SparkException( s"${DYN_ALLOCATION_EXECUTOR_ALLOCATION_RATIO.key} must be > 0 and <= 1.0") - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated: [SPARK-45807][SQL] Add createOrReplaceView(..) / replaceView(..) to ViewCatalog
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 89673da0beb2 [SPARK-45807][SQL] Add createOrReplaceView(..) / replaceView(..) to ViewCatalog 89673da0beb2 is described below commit 89673da0beb2b64434d29a94e07fa9c6fb4a93e8 Author: Eduard Tudenhoefner AuthorDate: Tue Dec 5 10:29:32 2023 -0800 [SPARK-45807][SQL] Add createOrReplaceView(..) / replaceView(..) to ViewCatalog ### What changes were proposed in this pull request? ViewCatalog API improvements described in [SPIP](https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing) that didn't make it into the codebase as part of #37556 ### Why are the changes needed? Required for DataSourceV2 view support. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? N/A ### Was this patch authored or co-authored using generative AI tooling? N/A Closes #43677 from nastra/SPARK-45807. Authored-by: Eduard Tudenhoefner Signed-off-by: Holden Karau --- .../spark/sql/connector/catalog/ViewCatalog.java | 86 ++ 1 file changed, 86 insertions(+) diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java index eb67b9904869..eef348928b2c 100644 --- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java +++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java @@ -140,6 +140,92 @@ public interface ViewCatalog extends CatalogPlugin { String[] columnComments, Map properties) throws ViewAlreadyExistsException, NoSuchNamespaceException; + /** + * Replace a view in the catalog. + * + * The default implementation has a race condition. + * Catalogs are encouraged to implement this operation atomically. + * + * @param ident a view identifier + * @param sql the SQL text that defines the view + * @param currentCatalog the current catalog + * @param currentNamespace the current namespace + * @param schema the view query output schema + * @param queryColumnNames the query column names + * @param columnAliases the column aliases + * @param columnComments the column comments + * @param properties the view properties + * @throws NoSuchViewException If the view doesn't exist or is a table + * @throws NoSuchNamespaceException If the identifier namespace does not exist (optional) + */ + default void replaceView( + Identifier ident, + String sql, + String currentCatalog, + String[] currentNamespace, + StructType schema, + String[] queryColumnNames, + String[] columnAliases, + String[] columnComments, + Map properties) throws NoSuchViewException, NoSuchNamespaceException { +if (viewExists(ident)) { + dropView(ident); + try { +createView(ident, sql, currentCatalog, currentNamespace, schema, +queryColumnNames, columnAliases, columnComments, properties); + } + catch (ViewAlreadyExistsException e) { +throw new RuntimeException("Race condition when dropping and creating view", e); + } +} else { + throw new NoSuchViewException(ident); +} + } + + /** + * Create or replace a view in the catalog. + * + * The default implementation has race conditions. + * Catalogs are encouraged to implement this operation atomically. + * + * @param ident a view identifier + * @param sql the SQL text that defines the view + * @param currentCatalog the current catalog + * @param currentNamespace the current namespace + * @param schema the view query output schema + * @param queryColumnNames the query column names + * @param columnAliases the column aliases + * @param columnComments the column comments + * @param properties the view properties + * @throws NoSuchNamespaceException If the identifier namespace does not exist (optional) + */ + default void createOrReplaceView( + Identifier ident, + String sql, + String currentCatalog, + String[] currentNamespace, + StructType schema, + String[] queryColumnNames, + String[] columnAliases, + String[] columnComments, + Map properties) throws NoSuchNamespaceException { +if (viewExists(ident)) { + try { +replaceView(ident, sql, currentCatalog, currentNamespace, schema, +queryColumnNames, columnAliases, columnComments, properties); + } catch (NoSuchViewException e) { +throw new RuntimeException("Rac
[spark] branch master updated: [SPARK-44735][SQL] Add warning msg when inserting columns with the same name by row that don't match up
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ae100267c28b [SPARK-44735][SQL] Add warning msg when inserting columns with the same name by row that don't match up ae100267c28b is described below commit ae100267c28bc6fd2c2f9c880ed3df1999423992 Author: Jia Fan AuthorDate: Mon Oct 23 10:51:31 2023 -0700 [SPARK-44735][SQL] Add warning msg when inserting columns with the same name by row that don't match up ### What changes were proposed in this pull request? This PR add a warning msg when inserting columns name with the same name by row but order not matched. Tell user can use `INSERT INTO BY NAME` to reorder columns to match with table schema. It will be like: ![image](https://github.com/apache/spark/assets/32387433/18e57125-8a2e-407c-a3fd-93a9cbf122a1) ### Why are the changes needed? Optimize user usage scenarios. ### Does this PR introduce _any_ user-facing change? Yes, sometimes will show some warning. ### How was this patch tested? Test in local ### Was this patch authored or co-authored using generative AI tooling? No Closes #42763 from Hisoka-X/SPARK-44735_add_warning_for_by_name. Authored-by: Jia Fan Signed-off-by: Holden Karau --- .../apache/spark/sql/catalyst/analysis/Analyzer.scala | 2 ++ .../sql/catalyst/analysis/TableOutputResolver.scala | 17 - .../apache/spark/sql/execution/datasources/rules.scala | 5 - 3 files changed, 22 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index 0469fb29a6fc..06d949ece262 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -3421,6 +3421,8 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor case v2Write: V2WriteCommand if v2Write.table.resolved && v2Write.query.resolved && !v2Write.outputResolved => validateStoreAssignmentPolicy() +TableOutputResolver.suitableForByNameCheck(v2Write.isByName, + expected = v2Write.table.output, queryOutput = v2Write.query.output) val projection = TableOutputResolver.resolveOutputColumns( v2Write.table.name, v2Write.table.output, v2Write.query, v2Write.isByName, conf) if (projection != v2Write.query) { diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala index d41757725771..1398552399cd 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala @@ -19,7 +19,9 @@ package org.apache.spark.sql.catalyst.analysis import scala.collection.mutable +import org.apache.spark.internal.Logging import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.catalyst.SQLConfHelper import org.apache.spark.sql.catalyst.expressions._ import org.apache.spark.sql.catalyst.expressions.objects.AssertNotNull import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project} @@ -34,7 +36,7 @@ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy import org.apache.spark.sql.types.{ArrayType, DataType, DecimalType, IntegralType, MapType, StructType} -object TableOutputResolver { +object TableOutputResolver extends SQLConfHelper with Logging { def resolveVariableOutputColumns( expected: Seq[VariableReference], @@ -470,6 +472,19 @@ object TableOutputResolver { } } + def suitableForByNameCheck( + byName: Boolean, + expected: Seq[Attribute], + queryOutput: Seq[Attribute]): Unit = { +if (!byName && expected.size == queryOutput.size && + expected.forall(e => queryOutput.exists(p => conf.resolver(p.name, e.name))) && + expected.zip(queryOutput).exists(e => !conf.resolver(e._1.name, e._2.name))) { + logWarning("The query columns and the table columns have same names but different " + +"orders. You can use INSERT [INTO | OVERWRITE] BY NAME to reorder the query columns to " + +"align with the table columns.") +} + } + private def containsIntegralOrDecimalType(dt: DataType): Boolean = dt match { case _: IntegralType | _: DecimalType => true case a: ArrayType =
[spark] branch master updated: [SPARK-33782][K8S][CORE] Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new af8dd411aa9 [SPARK-33782][K8S][CORE] Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode af8dd411aa9 is described below commit af8dd411aa9d15bc59d09cf9959d4a57debc9635 Author: pralabhkumar AuthorDate: Mon Dec 12 21:16:54 2022 -0800 [SPARK-33782][K8S][CORE] Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode ### What changes were proposed in this pull request? This PR will place spark.files , spark.jars and spark.pyfiles to the current working directory on the driver in K8s cluster mode ### Why are the changes needed? This mimics the behaviour of Yarn and also helps user to access files from PWD . Also as mentioned in the jira By doing this, users can, for example, leverage PEX to manage Python dependences in Apache Spark: ``` pex pyspark==3.0.1 pyarrow==0.15.1 pandas==0.25.3 -o myarchive.pex PYSPARK_PYTHON=./myarchive.pex spark-submit --files myarchive.pex ``` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Tested via unit test cases and also ran on local K8s cluster. Closes #37417 from pralabhkumar/rk_k8s_local_resource. Authored-by: pralabhkumar Signed-off-by: Holden Karau --- .../org/apache/spark/deploy/SparkSubmit.scala | 55 +- .../org/apache/spark/deploy/SparkSubmitSuite.scala | 35 ++ 2 files changed, 69 insertions(+), 21 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala index 783cf47df16..73acfedd8bc 100644 --- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala +++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala @@ -20,6 +20,7 @@ package org.apache.spark.deploy import java.io._ import java.lang.reflect.{InvocationTargetException, UndeclaredThrowableException} import java.net.{URI, URL} +import java.nio.file.Files import java.security.PrivilegedExceptionAction import java.text.ParseException import java.util.{ServiceLoader, UUID} @@ -383,43 +384,55 @@ private[spark] class SparkSubmit extends Logging { }.orNull if (isKubernetesClusterModeDriver) { -// Replace with the downloaded local jar path to avoid propagating hadoop compatible uris. -// Executors will get the jars from the Spark file server. -// Explicitly download the related files here -args.jars = localJars -val filesLocalFiles = Option(args.files).map { - downloadFileList(_, targetDir, sparkConf, hadoopConf) -}.orNull -val archiveLocalFiles = Option(args.archives).map { uris => +// SPARK-33748: this mimics the behaviour of Yarn cluster mode. If the driver is running +// in cluster mode, the archives should be available in the driver's current working +// directory too. +// SPARK-33782 : This downloads all the files , jars , archiveFiles and pyfiles to current +// working directory +def downloadResourcesToCurrentDirectory(uris: String, isArchive: Boolean = false): +String = { val resolvedUris = Utils.stringToSeq(uris).map(Utils.resolveURI) - val localArchives = downloadFileList( + val localResources = downloadFileList( resolvedUris.map( UriBuilder.fromUri(_).fragment(null).build().toString).mkString(","), targetDir, sparkConf, hadoopConf) - - // SPARK-33748: this mimics the behaviour of Yarn cluster mode. If the driver is running - // in cluster mode, the archives should be available in the driver's current working - // directory too. - Utils.stringToSeq(localArchives).map(Utils.resolveURI).zip(resolvedUris).map { -case (localArchive, resolvedUri) => - val source = new File(localArchive.getPath) + Utils.stringToSeq(localResources).map(Utils.resolveURI).zip(resolvedUris).map { +case (localResources, resolvedUri) => + val source = new File(localResources.getPath) val dest = new File( ".", if (resolvedUri.getFragment != null) resolvedUri.getFragment else source.getName) logInfo( -s"Unpacking an archive $resolvedUri " + +s"Files $resolvedUri " + s"from ${source.getAbsolutePath} to ${dest.getAbsolutePath}") Utils.deleteRecursively(dest) -
[spark] branch master updated: [SPARK-39799][SQL] DataSourceV2: View catalog interface
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d12abff7e83 [SPARK-39799][SQL] DataSourceV2: View catalog interface d12abff7e83 is described below commit d12abff7e83705fb0e187a79f86b45f99e4b7abb Author: John Zhuge AuthorDate: Wed Nov 16 09:05:14 2022 -0800 [SPARK-39799][SQL] DataSourceV2: View catalog interface ### What changes were proposed in this pull request? ViewCatalog API described in [SPIP](https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing). ### Why are the changes needed? First step towards DataSourceV2 view support. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? N/A Closes #37556 from jzhuge/SPARK-39799. Authored-by: John Zhuge Signed-off-by: Holden Karau --- core/src/main/resources/error/error-classes.json | 15 ++ .../apache/spark/sql/connector/catalog/View.java | 74 .../spark/sql/connector/catalog/ViewCatalog.java | 188 + .../spark/sql/connector/catalog/ViewChange.java| 79 + .../catalyst/analysis/AlreadyExistException.scala | 10 ++ .../spark/sql/catalyst/analysis/Analyzer.scala | 2 +- .../catalyst/analysis/NoSuchItemException.scala| 8 + 7 files changed, 375 insertions(+), 1 deletion(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index d5d6e938ad1..c96cb11874c 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1428,6 +1428,21 @@ "3. set \"spark.sql.legacy.allowUntypedScalaUDF\" to \"true\" and use this API with caution" ] }, + "VIEW_ALREADY_EXISTS" : { +"message" : [ + "Cannot create view because it already exists.", + "Choose a different name, drop or replace the existing object, or add the IF NOT EXISTS clause to tolerate pre-existing objects." +], +"sqlState" : "42000" + }, + "VIEW_NOT_FOUND" : { +"message" : [ + "The view cannot be found. Verify the spelling and correctness of the schema and catalog.", + "If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.", + "To tolerate the error on drop use DROP VIEW IF EXISTS." +], +"sqlState" : "42000" + }, "_LEGACY_ERROR_TEMP_0001" : { "message" : [ "Invalid InsertIntoContext" diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java new file mode 100644 index 000..a4dc5f2f2d2 --- /dev/null +++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.connector.catalog; + +import java.util.Map; + +import org.apache.spark.annotation.DeveloperApi; +import org.apache.spark.sql.types.StructType; + +/** + * An interface representing a persisted view. + */ +@DeveloperApi +public interface View { + /** + * A name to identify this view. + */ + String name(); + + /** + * The view query SQL text. + */ + String query(); + + /** + * The current catalog when the view is created. + */ + String currentCatalog(); + + /** + * The current namespace when the view is created. + */ + String[] currentNamespace(); + + /** + * The schema for the view when the view is created after applying column aliases. + */ + StructType schema(); + + /** + * The output column names of the query that creates this view. + */ + String[] queryColumnNames(); + + /** + * The view colu
[spark] branch master updated: [SPARK-40428][CORE] Fix shutdown hook in the CoarseGrainedSchedulerBackend
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 505a8a04ad6 [SPARK-40428][CORE] Fix shutdown hook in the CoarseGrainedSchedulerBackend 505a8a04ad6 is described below commit 505a8a04ad64a6732bf9fec03c28bfbd514d109d Author: Holden Karau AuthorDate: Tue Oct 4 17:26:34 2022 -0700 [SPARK-40428][CORE] Fix shutdown hook in the CoarseGrainedSchedulerBackend ### What changes were proposed in this pull request? Fix the shutdown hook call through to CoarseGrainedSchedulerBackend ### Why are the changes needed? Sometimes if the driver shuts down abnormally resources may be left dangling. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests. Closes #37885 from holdenk/shutdownhook-for-k8s. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- core/src/main/scala/org/apache/spark/SparkContext.scala | 4 +++- .../scala/org/apache/spark/scheduler/DAGScheduler.scala | 16 .../org/apache/spark/scheduler/TaskSchedulerImpl.scala | 16 3 files changed, 27 insertions(+), 9 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala index f101dc8e083..e2c6a912bc2 100644 --- a/core/src/main/scala/org/apache/spark/SparkContext.scala +++ b/core/src/main/scala/org/apache/spark/SparkContext.scala @@ -2119,7 +2119,9 @@ class SparkContext(config: SparkConf) extends Logging { Utils.tryLogNonFatalError { _plugins.foreach(_.shutdown()) } -FallbackStorage.cleanUp(_conf, _hadoopConfiguration) +Utils.tryLogNonFatalError { + FallbackStorage.cleanUp(_conf, _hadoopConfiguration) +} Utils.tryLogNonFatalError { _eventLogger.foreach(_.stop()) } diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala index ab05409aebb..c5529851382 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala @@ -2890,10 +2890,18 @@ private[spark] class DAGScheduler( } def stop(): Unit = { -messageScheduler.shutdownNow() -shuffleMergeFinalizeScheduler.shutdownNow() -eventProcessLoop.stop() -taskScheduler.stop() +Utils.tryLogNonFatalError { + messageScheduler.shutdownNow() +} +Utils.tryLogNonFatalError { + shuffleMergeFinalizeScheduler.shutdownNow() +} +Utils.tryLogNonFatalError { + eventProcessLoop.stop() +} +Utils.tryLogNonFatalError { + taskScheduler.stop() +} } eventProcessLoop.start() diff --git a/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala b/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala index a6735f380f1..5004262a71c 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala @@ -972,15 +972,23 @@ private[spark] class TaskSchedulerImpl( } override def stop(): Unit = { -speculationScheduler.shutdown() +Utils.tryLogNonFatalError { + speculationScheduler.shutdown() +} if (backend != null) { - backend.stop() + Utils.tryLogNonFatalError { +backend.stop() + } } if (taskResultGetter != null) { - taskResultGetter.stop() + Utils.tryLogNonFatalError { +taskResultGetter.stop() + } } if (barrierCoordinator != null) { - barrierCoordinator.stop() + Utils.tryLogNonFatalError { +barrierCoordinator.stop() + } } starvationTimer.cancel() abortTimer.cancel() - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-38969][K8S] Fix Decom reporting
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 451ac4c4d96 [SPARK-38969][K8S] Fix Decom reporting 451ac4c4d96 is described below commit 451ac4c4d96f08df8176d9b7c872ab8968623b69 Author: Holden Karau AuthorDate: Fri Aug 12 16:29:47 2022 -0700 [SPARK-38969][K8S] Fix Decom reporting ### What changes were proposed in this pull request? Change how we account for executor loss reasons. ### Why are the changes needed? Race condition in executors which decommission quickly. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing core tests. Closes #36434 from holdenk/SPARK-38969-decom-reporting. Authored-by: Holden Karau Signed-off-by: Holden Karau --- .../apache/spark/scheduler/dynalloc/ExecutorMonitor.scala | 10 -- .../kubernetes/docker/src/main/dockerfiles/spark/decom.sh | 15 ++- 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala b/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala index defef5bfcf2..9132d0e46d3 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala @@ -355,13 +355,11 @@ private[spark] class ExecutorMonitor( val removed = executors.remove(event.executorId) if (removed != null) { decrementExecResourceProfileCount(removed.resourceProfileId) - if (removed.decommissioning) { -if (event.reason == ExecutorLossMessage.decommissionFinished || -event.reason == ExecutorDecommission().message) { - metrics.gracefullyDecommissioned.inc() -} else { + if (event.reason == ExecutorLossMessage.decommissionFinished || +event.reason == ExecutorDecommission().message) { +metrics.gracefullyDecommissioned.inc() + } else if (removed.decommissioning) { metrics.decommissionUnfinished.inc() -} } else if (removed.pendingRemoval) { metrics.driverKilled.inc() } else { diff --git a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/decom.sh b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/decom.sh index cd973df257f..4c25b42e964 100755 --- a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/decom.sh +++ b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/decom.sh @@ -18,17 +18,22 @@ # -set -ex +set +e +set -x echo "Asked to decommission" # Find the pid to signal date | tee -a ${LOG} -WORKER_PID=$(ps -o pid -C java | tail -n 1| awk '{ sub(/^[ \t]+/, ""); print }') +WORKER_PID=$(ps -o pid,cmd -C java |grep Executor \ + | tail -n 1| awk '{ sub(/^[ \t]+/, ""); print }' \ + | cut -f 1 -d " ") echo "Using worker pid $WORKER_PID" kill -s SIGPWR ${WORKER_PID} -# For now we expect this to timeout, since we don't start exiting the backend. +# If the worker does exit stop blocking K8s cleanup. Note this is a "soft" +# block since the pod it's self will have a maximum decommissioning time which will +# overload this. echo "Waiting for worker pid to exit" -# If the worker does exit stop blocking the cleanup. -timeout 60 tail --pid=${WORKER_PID} -f /dev/null +tail --pid=${WORKER_PID} -f /dev/null +sleep 1 date echo "Done" date - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-36664][CORE] Log time waiting for cluster resources
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 148918b5469 [SPARK-36664][CORE] Log time waiting for cluster resources 148918b5469 is described below commit 148918b5469eb972b5d8e22d9b1cea5ca0721045 Author: Holden Karau AuthorDate: Mon Apr 25 20:42:37 2022 -0700 [SPARK-36664][CORE] Log time waiting for cluster resources ### What changes were proposed in this pull request? Keep track of and communicate to the listener bus how long we are waiting for execs to be allocated from the underlying cluster manager. Replaces previous PR with GHA issues ( #35172 ) and WIP PR https://github.com/apache/spark/pull/34650 and #35881 ### What changes were proposed in this pull request? Keep track of and communicate to the listener bus how long we are waiting for execs to be allocated from the underlying cluster manager. Replaces previous PR with GHA issues ( #35172 ) and WIP PR https://github.com/apache/spark/pull/34650 ### Why are the changes needed? Sometimes the cluster manager may choke or otherwise not be able to allocate resources and we don't have a good way of detecting this situation making it difficult for the user to debug and tell apart from Spark not scaling up correctly. ### Does this PR introduce _any_ user-facing change? New field in the listener bus message for when a executor is allocated. ### How was this patch tested? New unit test in the listener suite. ### Why are the changes needed? Sometimes the cluster manager may choke or otherwise not be able to allocate resources and we don't have a good way of detecting this situation making it difficult for the user to debug and tell apart from Spark not scaling up correctly. ### Does this PR introduce _any_ user-facing change? New field in the listener bus message for when a executor is allocated. ### How was this patch tested? New unit test in the listener suite. Closes #36185 from holdenk/SPARK-36664-Log-time-waiting-for-cluster-resources-r4. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- .../cluster/CoarseGrainedSchedulerBackend.scala| 67 ++- .../spark/scheduler/cluster/ExecutorData.scala | 6 +- .../spark/scheduler/cluster/ExecutorInfo.scala | 20 +++- .../scala/org/apache/spark/util/JsonProtocol.scala | 13 +- .../CoarseGrainedSchedulerBackendSuite.scala | 131 + .../scheduler/SparkListenerWithClusterSuite.scala | 1 + .../scheduler/dynalloc/ExecutorMonitorSuite.scala | 2 +- .../org/apache/spark/util/JsonProtocolSuite.scala | 38 ++ 8 files changed, 266 insertions(+), 12 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala index 13a7183a29d..61d67765c8c 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala @@ -21,7 +21,7 @@ import java.util.concurrent.{ScheduledExecutorService, TimeUnit} import java.util.concurrent.atomic.{AtomicInteger, AtomicReference} import javax.annotation.concurrent.GuardedBy -import scala.collection.mutable.{HashMap, HashSet} +import scala.collection.mutable.{HashMap, HashSet, Queue} import scala.concurrent.Future import org.apache.hadoop.security.UserGroupInformation @@ -82,6 +82,12 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp @GuardedBy("CoarseGrainedSchedulerBackend.this") private val requestedTotalExecutorsPerResourceProfile = new HashMap[ResourceProfile, Int] + // Profile IDs to the times that executors were requested for. + // The operations we do on queue are all amortized constant cost + // see https://www.scala-lang.org/api/2.13.x/scala/collection/mutable/ArrayDeque.html + @GuardedBy("CoarseGrainedSchedulerBackend.this") + private val execRequestTimes = new HashMap[Int, Queue[(Int, Long)]] + private val listenerBus = scheduler.sc.listenerBus // Executors we have requested the cluster manager to kill that have not died yet; maps @@ -260,9 +266,27 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp .resourceProfileFromId(resourceProfileId).getNumSlotsPerAddress(rName, conf) (info.name, new ExecutorResourceInfo(info.name, info.addresses, numParts)) } + // If we've requested the executor figure out when we did
[spark] branch master updated: [SPARK-38561][K8S][DOCS] Add doc for `Customized Kubernetes Schedulers`
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 4ff40c1 [SPARK-38561][K8S][DOCS] Add doc for `Customized Kubernetes Schedulers` 4ff40c1 is described below commit 4ff40c10f02f6e0735ce6554f7338489d8555bce Author: Yikun Jiang AuthorDate: Wed Mar 16 11:12:54 2022 -0700 [SPARK-38561][K8S][DOCS] Add doc for `Customized Kubernetes Schedulers` ### What changes were proposed in this pull request? This is PR to doc for basic framework capability for Customized Kubernetes Schedulers. ### Why are the changes needed? Guide user how to use spark on kubernetes custom scheduler ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed Closes #35869 from Yikun/SPARK-38561. Authored-by: Yikun Jiang Signed-off-by: Holden Karau --- docs/running-on-kubernetes.md | 19 +++ 1 file changed, 19 insertions(+) diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md index de37e22..d1b2fcd 100644 --- a/docs/running-on-kubernetes.md +++ b/docs/running-on-kubernetes.md @@ -1713,6 +1713,25 @@ spec: image: will-be-overwritten ``` + Customized Kubernetes Schedulers for Spark on Kubernetes + +Spark allows users to specify a custom Kubernetes schedulers. + +1. Specify scheduler name. + + Users can specify a custom scheduler using spark.kubernetes.scheduler.name or + spark.kubernetes.{driver/executor}.scheduler.name configuration. + +2. Specify scheduler related configurations. + + To configure the custom scheduler the user can use [Pod templates](#pod-template), add labels (spark.kubernetes.{driver,executor}.label.*) and/or annotations (spark.kubernetes.{driver/executor}.annotation.*). + +3. Specify scheduler feature step. + + Users may also consider to use spark.kubernetes.{driver/executor}.pod.featureSteps to support more complex requirements, including but not limited to: + - Create additional Kubernetes custom resources for driver/executor scheduling. + - Set scheduler hints according to configuration or existing Pod info dynamically. + ### Stage Level Scheduling Overview Stage level scheduling is supported on Kubernetes when dynamic allocation is enabled. This also requires spark.dynamicAllocation.shuffleTracking.enabled to be enabled since Kubernetes doesn't support an external shuffle service at this time. The order in which containers for different profiles is requested from Kubernetes is not guaranteed. Note that since dynamic allocation on Kubernetes requires the shuffle tracking feature, this means that executors from previous stages t [...] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] branch asf-site updated: Fix dockerhub link
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git The following commit(s) were added to refs/heads/asf-site by this push: new d0d3def Fix dockerhub link d0d3def is described below commit d0d3defa2e20ec29ffae4a48702131a2953885ea Author: Holden Karau AuthorDate: Mon Feb 21 14:01:42 2022 -0800 Fix dockerhub link --- downloads.md| 2 +- site/downloads.html | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/downloads.md b/downloads.md index 8a6aca7..bf5a98e 100644 --- a/downloads.md +++ b/downloads.md @@ -43,7 +43,7 @@ Spark artifacts are [hosted in Maven Central](https://search.maven.org/search?q= ### Convenience Docker Container Images -[Spark Docker Container images are available from DockerHub](https://hub.docker.com/repository/docker/apache/spark), these images contain non-ASF software and may be subject to different license terms. +[Spark Docker Container images are available from DockerHub](https://hub.docker.com/r/apache/spark-py/tags), these images contain non-ASF software and may be subject to different license terms. ### Release notes for stable releases diff --git a/site/downloads.html b/site/downloads.html index e8ecb9c..441eb85 100644 --- a/site/downloads.html +++ b/site/downloads.html @@ -187,7 +187,7 @@ version: 3.2.1 Convenience Docker Container Images -https://hub.docker.com/repository/docker/apache/spark";>Spark Docker Container images are available from DockerHub, these images contain non-ASF software and may be subject to different license terms. +https://hub.docker.com/r/apache/spark-py/tags";>Spark Docker Container images are available from DockerHub, these images contain non-ASF software and may be subject to different license terms. Release notes for stable releases - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] branch asf-site updated (08c07ea -> e631332)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git. from 08c07ea Regenerate site. new a5c41b0 Fix 3.1.3 doc link new e631332 Fix detailed change link for Spark 3.1.3 The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: documentation.md | 2 +- releases/_posts/2022-02-18-spark-release-3-1-3.md | 2 +- site/documentation.html | 2 +- site/releases/spark-release-3-1-3.html| 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] 01/02: Fix 3.1.3 doc link
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git commit a5c41b0d1f491585dea3d89c3aca660efa954446 Author: Holden Karau AuthorDate: Mon Feb 21 11:58:03 2022 -0800 Fix 3.1.3 doc link --- documentation.md| 2 +- site/documentation.html | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/documentation.md b/documentation.md index 3a443cd..c5ebb64 100644 --- a/documentation.md +++ b/documentation.md @@ -14,7 +14,7 @@ navigation: Spark 3.2.1 Spark 3.2.0 - Spark 3.1.3 + Spark 3.1.3 Spark 3.1.2 Spark 3.1.1 Spark 3.0.3 diff --git a/site/documentation.html b/site/documentation.html index c1f98a6..07eefb1 100644 --- a/site/documentation.html +++ b/site/documentation.html @@ -151,7 +151,7 @@ Spark 3.2.1 Spark 3.2.0 - Spark 3.1.3 + Spark 3.1.3 Spark 3.1.2 Spark 3.1.1 Spark 3.0.3 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] 02/02: Fix detailed change link for Spark 3.1.3
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git commit e631332c8ac7ee45b0e1cd4d3ad34320a7bc9124 Author: Holden Karau AuthorDate: Mon Feb 21 12:01:20 2022 -0800 Fix detailed change link for Spark 3.1.3 --- releases/_posts/2022-02-18-spark-release-3-1-3.md | 2 +- site/releases/spark-release-3-1-3.html| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/releases/_posts/2022-02-18-spark-release-3-1-3.md b/releases/_posts/2022-02-18-spark-release-3-1-3.md index b48b19b..25a0417 100644 --- a/releases/_posts/2022-02-18-spark-release-3-1-3.md +++ b/releases/_posts/2022-02-18-spark-release-3-1-3.md @@ -36,6 +36,6 @@ While being a maintence release we did still upgrade some dependencies in this r - [[SPARK-36129]](https://issues.apache.org/jira/browse/SPARK-36129): Upgrade commons-compress to 1.21 -You can consult JIRA for the [detailed changes](https://s.apache.org/spark-3.2.1). +You can consult JIRA for the [detailed changes](https://s.apache.org/spark-3.1.3). We would like to acknowledge all community members for contributing patches to this release. diff --git a/site/releases/spark-release-3-1-3.html b/site/releases/spark-release-3-1-3.html index 1ca1fac..1176ff6 100644 --- a/site/releases/spark-release-3-1-3.html +++ b/site/releases/spark-release-3-1-3.html @@ -175,7 +175,7 @@ https://issues.apache.org/jira/browse/SPARK-36129";>[SPARK-36129]: Upgrade commons-compress to 1.21 -You can consult JIRA for the https://s.apache.org/spark-3.2.1";>detailed changes. +You can consult JIRA for the https://s.apache.org/spark-3.1.3";>detailed changes. We would like to acknowledge all community members for contributing patches to this release. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] 01/03: Initial 3.1.3 addition
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git commit cce59bf813570795e1d25d42a98e7c06de413ec5 Author: Holden Karau AuthorDate: Fri Feb 18 17:24:39 2022 -0800 Initial 3.1.3 addition --- documentation.md | 1 + downloads.md | 5 +++ js/downloads.js | 2 +- news/_posts/2022-02-18-3-1-3-released.md | 14 releases/_posts/2022-02-18-spark-release-3-1-3.md | 41 +++ 5 files changed, 62 insertions(+), 1 deletion(-) diff --git a/documentation.md b/documentation.md index eeab8c4..3a443cd 100644 --- a/documentation.md +++ b/documentation.md @@ -14,6 +14,7 @@ navigation: Spark 3.2.1 Spark 3.2.0 + Spark 3.1.3 Spark 3.1.2 Spark 3.1.1 Spark 3.0.3 diff --git a/downloads.md b/downloads.md index e5d1db3..8a6aca7 100644 --- a/downloads.md +++ b/downloads.md @@ -40,6 +40,11 @@ Spark artifacts are [hosted in Maven Central](https://search.maven.org/search?q= ### Installing with PyPi https://pypi.org/project/pyspark/";>PySpark is now available in pypi. To install just run `pip install pyspark`. + +### Convenience Docker Container Images + +[Spark Docker Container images are available from DockerHub](https://hub.docker.com/repository/docker/apache/spark), these images contain non-ASF software and may be subject to different license terms. + ### Release notes for stable releases diff --git a/js/downloads.js b/js/downloads.js index 2d40312..babc72a 100644 --- a/js/downloads.js +++ b/js/downloads.js @@ -27,7 +27,7 @@ var packagesV11 = [hadoop3p2, hadoop2p7, hadoopFree, sources]; var packagesV12 = [hadoop3p3, hadoop3p3scala213, hadoop2p7, hadoopFree, sources]; addRelease("3.2.1", new Date("01/26/2022"), packagesV12, true); -addRelease("3.1.2", new Date("06/01/2021"), packagesV11, true); +addRelease("3.1.3", new Date("02/18/2022"), packagesV11, true); addRelease("3.0.3", new Date("06/23/2021"), packagesV10, true); function append(el, contents) { diff --git a/news/_posts/2022-02-18-3-1-3-released.md b/news/_posts/2022-02-18-3-1-3-released.md new file mode 100644 index 000..d9a826f --- /dev/null +++ b/news/_posts/2022-02-18-3-1-3-released.md @@ -0,0 +1,14 @@ +--- +layout: post +title: Spark 3.1.3 released +categories: +- News +tags: [] +status: publish +type: post +published: true +meta: + _edit_last: '4' + _wpas_done_all: '1' +--- +We are happy to announce the availability of Spark 3.1.3! Visit the release notes to read about the new features, or download the release today. diff --git a/releases/_posts/2022-02-18-spark-release-3-1-3.md b/releases/_posts/2022-02-18-spark-release-3-1-3.md new file mode 100644 index 000..b48b19b --- /dev/null +++ b/releases/_posts/2022-02-18-spark-release-3-1-3.md @@ -0,0 +1,41 @@ +--- +layout: post +title: Spark Release 3.1.3 +categories: [] +tags: [] +status: publish +type: post +published: true +meta: +_edit_last: '4' +_wpas_done_all: '1' +--- + +Spark 3.1.3 is a maintenance release containing stability fixes. This release is based on the branch-3.1 maintenance branch of Spark. We strongly recommend all 3.1.3 users to upgrade to this stable release. + +### Notable changes + + - [[SPARK-38075]](https://issues.apache.org/jira/browse/SPARK-38075): Hive script transform with order by and limit will return fake rows + - [[SPARK-37784]](https://issues.apache.org/jira/browse/SPARK-37784): CodeGenerator.addBufferedState() does not properly handle UDTs + - [[SPARK-37079]](https://issues.apache.org/jira/browse/SPARK-37079): Fix DataFrameWriterV2.partitionedBy to send the arguments to JVM properly + - [[SPARK-37049]](https://issues.apache.org/jira/browse/SPARK-37049): executorIdleTimeout is not working for pending pods on K8s + - [[SPARK-36783]](https://issues.apache.org/jira/browse/SPARK-36783): ScanOperation should not push Filter through nondeterministic Project + - [[SPARK-36782]](https://issues.apache.org/jira/browse/SPARK-36782): Deadlock between map-output-dispatcher and dispatcher-BlockManagerMaster upon migrating shuffle blocks + - [[SPARK-36704]](https://issues.apache.org/jira/browse/SPARK-36704): InaccessibleObjectException in Java 9+ on startup + - [[SPARK-36740]](https://issues.apache.org/jira/browse/SPARK-36740): Various NaN issues with array operations. + - [[SPARK-36532]](https://issues.apache.org/jira/browse/SPARK-36532): Deadlock in CoarseGrainedExecutorBackend.onDisconnected + - [[SPARK-36489]](https://issues.apache.org/jira/browse/SPARK-36489): Aggregate functions over no grouping keys, on tables with a single bucket, return multiple rows + - [[SPARK-36339]](https://issues.apache.org/jira/browse/SPARK-36339):
svn commit: r52683 - in /dev/spark: v3.1.3-rc1-bin/ v3.1.3-rc2-bin/ v3.1.3-rc2-docs/ v3.1.3-rc3-bin/ v3.1.3-rc3-docs/ v3.1.3-rc4-docs/
Author: holden Date: Mon Feb 21 19:39:18 2022 New Revision: 52683 Log: Removing RC artifacts. Removed: dev/spark/v3.1.3-rc1-bin/ dev/spark/v3.1.3-rc2-bin/ dev/spark/v3.1.3-rc2-docs/ dev/spark/v3.1.3-rc3-bin/ dev/spark/v3.1.3-rc3-docs/ dev/spark/v3.1.3-rc4-docs/ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] tag v3.1.3 created (now d1f8a50)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to tag v3.1.3 in repository https://gitbox.apache.org/repos/asf/spark.git. at d1f8a50 (commit) No new revisions were added by this update. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r52631 - /dev/spark/v3.1.3-rc4-bin/ /release/spark/spark-3.1.3/
Author: holden Date: Fri Feb 18 22:56:11 2022 New Revision: 52631 Log: Release Spark 3.1.3 Added: release/spark/spark-3.1.3/ - copied from r52630, dev/spark/v3.1.3-rc4-bin/ Removed: dev/spark/v3.1.3-rc4-bin/ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r52419 - in /dev/spark/v3.1.3-rc4-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/parqu
Author: holden Date: Mon Feb 7 00:26:19 2022 New Revision: 52419 Log: Apache Spark v3.1.3-rc4 docs [This commit notification would consist of 2264 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r52418 - /dev/spark/v3.1.3-rc4-bin/
Author: holden Date: Sun Feb 6 23:22:00 2022 New Revision: 52418 Log: Apache Spark v3.1.3-rc4 Added: dev/spark/v3.1.3-rc4-bin/ dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz (with props) dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz.asc dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz.sha512 dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz (with props) dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz.asc dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz.sha512 dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-hadoop2.7.tgz (with props) dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-hadoop2.7.tgz.asc dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-hadoop2.7.tgz.sha512 dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-hadoop3.2.tgz (with props) dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-hadoop3.2.tgz.asc dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-hadoop3.2.tgz.sha512 dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-without-hadoop.tgz (with props) dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-without-hadoop.tgz.asc dev/spark/v3.1.3-rc4-bin/spark-3.1.3-bin-without-hadoop.tgz.sha512 dev/spark/v3.1.3-rc4-bin/spark-3.1.3.tgz (with props) dev/spark/v3.1.3-rc4-bin/spark-3.1.3.tgz.asc dev/spark/v3.1.3-rc4-bin/spark-3.1.3.tgz.sha512 Added: dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz == Binary file - no diff available. Propchange: dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz.asc == --- dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz.asc (added) +++ dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz.asc Sun Feb 6 23:22:00 2022 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJGBAABCgAwFiEEyhnizCAbZSLToqBdxBq2dEJPw/sFAmIAVCcSHGhvbGRlbkBh +cGFjaGUub3JnAAoJEMQatnRCT8P7UjoP/3BqiMyJ6hihcN1vhi1no2/IXeE2xCPF +uOeqZCiTtKzHK6mPSFwwCysZJCdsnbLVZDs/ZKncN9azJS5tc626HUKxD6CTXOk6 +F6U/BcH9WXacWZDPNqriZ00ZqT5UYU4MEUeqnu5TDMvRN4lF9Bcl0qqKAUFC/naA +eZ/cYxnLWUKmmPU93oVFX0JxMlIW9KJCX+UI3S5zzNGq5vbX8cwTx7CE4ks+10xh +V3RvI6KDyJ3Iimt62dpkAROtioqgI3d9x72KgcUYpQaI6J8ZvOIuAEDfRSY+yAYA +/qNXhZuvXQHxpu2A6ubMD4h9aLijApqH3LMPWW4lfIi81NKLlfkK6P2847cTXRvG +BNq9EYC8ANu64eFeItxXb3vWw2fcR0PcGzyRfTwkJgycunzj2mmS744xmKltZNjf +FGWVxInmBOC3j3C1zgVIX2moZu8ywyUF3HWWb9Jgc/hSYl/wtTJBTd99NHDz+Axh +czPSvt3195JRh0JhtW8WECcNl08aa7hoQ68r24tWfjEVvwMTRqr4t161mQs2iyje ++aHx8bg9u9oKYy18ScADUOb2YKRkFpx9U5UWvgyKmWNVU8w0d8FBgu73J1dVLQXQ +aXWO1f2Uflu+gsmO6yNxzRGHXMP+a1/kUmqrFhWP7z9zX8VIxXnb035YPxjqWGpP +VRqqVJEMXl1Y +=/Qwl +-END PGP SIGNATURE- Added: dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz.sha512 == --- dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz.sha512 (added) +++ dev/spark/v3.1.3-rc4-bin/SparkR_3.1.3.tar.gz.sha512 Sun Feb 6 23:22:00 2022 @@ -0,0 +1,3 @@ +SparkR_3.1.3.tar.gz: 6F4FC494 8CCFCC27 20AD34BE 5F30FF5A 1035B48E 7934E955 + C2A00A66 37262DB3 BEE34E9C 34B6DE09 7B41E85E 197AE593 + 3692F598 BDA6D3F6 137F6DBE 780454DE Added: dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz == Binary file - no diff available. Propchange: dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz.asc == --- dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz.asc (added) +++ dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz.asc Sun Feb 6 23:22:00 2022 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJGBAABCgAwFiEEyhnizCAbZSLToqBdxBq2dEJPw/sFAmIAVCgSHGhvbGRlbkBh +cGFjaGUub3JnAAoJEMQatnRCT8P7uBQP/iKk+W1vtQMlAeYXY/WyoiHjyPNc+kCj +C7MAgDt/rlj0VBoxSJ9R6l6aw+5EYIv/vZDo52tvpAdxk69w9zxj4I+8U/HnwbHw +8DoPbMqfhl4cWLFLH5HXadGjuDRfs7ULeXv6Sz0KMnj+UtISVMBu9KhUeHOt/tyw +ZwM8d53XzgsrjcqR6drEBgbzUM3C1M3qLAt2hoftcYQ4OlNKBixJ3/o6t5nVGPbQ +PCRbNL/iBNGntPNEO3XRjyuDDkqvj3h2VM5miCk8E1mUKIoJfj4zBx4K48qjKDPk +4e6fMZocehdFQ1l1Z03hH/9ZPSRcJIrItMWlacmtfi35FASoKLPLwh88UwMazFCU +NqSHq/n8iZX9tj9F8uSSg4ME8uEr0Uo+nmifkTWLMWGfWXU4FdvhjkjqnWyS+JFx +iJHPtmJq98XJFehdcPTh+3dWpPhLL2nRD/bJwlHF2kGMOGUPGqHp5S3XgjuebYw0 +Ofr9kKs9LoA8Rd9XAFNsc2U1KyUdFz2Sq+lYwivXv1f1MtvAMyc2/rBsD89S7xZF +plcDWGqAiYQZAI47VQmQk9KAOBIIT6gtirHNtFvRDTTqOtn49tsKOXk4tLiFOA+8 +bJ93YLmDStTMZWBdPqUUqQreN96dCazfk+sjWEeT9HUx2dl+5ZBuTnlIOcaoJNuh +T5RG6gWmVldT +=HtZc +-END PGP SIGNATURE- Added: dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3.tar.gz.sha512 == --- dev/spark/v3.1.3-rc4-bin/pyspark-3.1.3
[spark] 01/01: Preparing development version 3.1.4-SNAPSHOT
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git commit 0b35c2464f805970d2a62ccc757616332287eb2b Author: Holden Karau AuthorDate: Sun Feb 6 22:17:37 2022 + Preparing development version 3.1.4-SNAPSHOT --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 39 files changed, 40 insertions(+), 40 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index d42841c..db689ea 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 3.1.3 +Version: 3.1.4 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = "aut", diff --git a/assembly/pom.xml b/assembly/pom.xml index 95309ee..8e2cded 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 80f6630..baf20e6 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index ead7a4f..1a9c791 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 232abd7..8a188a3 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index 79068a6..d53f21d 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index 2030f85..53b7312 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff
[spark] branch branch-3.1 updated (7540421 -> 0b35c24)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from 7540421 Preparing development version 3.1.4-SNAPSHOT add d1f8a50 Preparing Spark release v3.1.3-rc4 new 0b35c24 Preparing development version 3.1.4-SNAPSHOT The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/01: Preparing Spark release v3.1.3-rc4
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to tag v3.1.3-rc4 in repository https://gitbox.apache.org/repos/asf/spark.git commit d1f8a503a26bcfb4e466d9accc5fa241a7933667 Author: Holden Karau AuthorDate: Sun Feb 6 22:17:33 2022 + Preparing Spark release v3.1.3-rc4 --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 39 files changed, 40 insertions(+), 40 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index db689ea..d42841c 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 3.1.4 +Version: 3.1.3 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = "aut", diff --git a/assembly/pom.xml b/assembly/pom.xml index 8e2cded..95309ee 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index baf20e6..80f6630 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index 1a9c791..ead7a4f 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 8a188a3..232abd7 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index d53f21d..79068a6 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index 53b7312..2030f85 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/t
[spark] tag v3.1.3-rc4 created (now d1f8a50)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to tag v3.1.3-rc4 in repository https://gitbox.apache.org/repos/asf/spark.git. at d1f8a50 (commit) This tag includes the following new commits: new d1f8a50 Preparing Spark release v3.1.3-rc4 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r52357 - in /dev/spark/v3.1.3-rc3-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/parqu
Author: holden Date: Tue Feb 1 22:41:57 2022 New Revision: 52357 Log: Apache Spark v3.1.3-rc3 docs [This commit notification would consist of 2264 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r52356 - /dev/spark/v3.1.3-rc3-bin/
Author: holden Date: Tue Feb 1 22:17:13 2022 New Revision: 52356 Log: Apache Spark v3.1.3-rc3 Added: dev/spark/v3.1.3-rc3-bin/ dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz (with props) dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz.asc dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz.sha512 dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz (with props) dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz.asc dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz.sha512 dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-hadoop2.tgz (with props) dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-hadoop2.tgz.asc dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-hadoop2.tgz.sha512 dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-hadoop3.tgz (with props) dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-hadoop3.tgz.asc dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-hadoop3.tgz.sha512 dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-without-hadoop.tgz (with props) dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-without-hadoop.tgz.asc dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-without-hadoop.tgz.sha512 dev/spark/v3.1.3-rc3-bin/spark-3.1.3.tgz (with props) dev/spark/v3.1.3-rc3-bin/spark-3.1.3.tgz.asc dev/spark/v3.1.3-rc3-bin/spark-3.1.3.tgz.sha512 Added: dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz == Binary file - no diff available. Propchange: dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz.asc == --- dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz.asc (added) +++ dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz.asc Tue Feb 1 22:17:13 2022 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJGBAABCgAwFiEEyhnizCAbZSLToqBdxBq2dEJPw/sFAmH5sR8SHGhvbGRlbkBh +cGFjaGUub3JnAAoJEMQatnRCT8P7xdsP/iSa4EDdX7WLWwv4dnJB7UhFAjey4Vvw +5OygAu7C9BKaUwTvJ1ywmO82Qca6ze5doORpIGWfzj/cUkmZd3Qb/T5pU0gQkUAr +7ovVT2DGClTdwgbfhMNEHW9nj5Y5hclKzeGxKaGT5qwOc5PDd9fKtgzIjQbuMGpy +SLwcoikHpwUTjClWw9IMB+n3NQbOGTUpBZJDtcjUwe1BhAwOCL2B85eTj6XYA2LN +SoARz1MwHyrH41Elh7ITdd1S9ynqbT00HqEPYMRDFN9DOzv2nWwobTdxpSmpLAJq +Lz+wSDWJjS3x4spMJzNX8yvFAOlrPwiRGsRziNGm01XVd6vbDMTNj+ykaw9hkdnu +xzjHmOtU/IKCc8YIOpobC00/tDrrgSktA44/7DPRC8UOreoxZkeDZWw0v942vns2 +YYFylq6zok9Fcqx5rLvDYiaQ1KLe4fXbdqXoqBYv2g0TKpZ55qdWtlc6In+Amui3 +YDAsAV1dHbPrM4C0JhZScWkHQLGq4ELnyt5LRRmOrkCE4+RpXw8EZEnSxgsxeDSn +MJHtHg9QEm/HnuucjQn9Gko6nBqlj/wNjKFOA/0tF+o/1uYfDhNSdHTBdnf2/iaM +uJ4Z27U06Ukp5aJ/Hi8iT2WJjHNZUViHbjAar3R83IlpDZ7S9bwjoDTJSEscilDM +35YJ3kDXxEJp +=pReb +-END PGP SIGNATURE- Added: dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz.sha512 == --- dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz.sha512 (added) +++ dev/spark/v3.1.3-rc3-bin/SparkR_3.1.3.tar.gz.sha512 Tue Feb 1 22:17:13 2022 @@ -0,0 +1,3 @@ +SparkR_3.1.3.tar.gz: 5DAF9715 DD2967CD A65922B1 B5E07284 E9F042D4 235D944A + A44E11A3 3BDE7F1B 703B0D0F BE8B5CAB F812E021 2FD7A806 + EF944261 C497B696 363D1958 E7BA4754 Added: dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz == Binary file - no diff available. Propchange: dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz.asc == --- dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz.asc (added) +++ dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz.asc Tue Feb 1 22:17:13 2022 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJGBAABCgAwFiEEyhnizCAbZSLToqBdxBq2dEJPw/sFAmH5sSASHGhvbGRlbkBh +cGFjaGUub3JnAAoJEMQatnRCT8P7jrQP/15vJX02cBXhu9lEvO6YBSvLX7y2QNMJ +gkwrKs99km0kiinkhqfZYPQC9kRwT4W+psIP1+yplPMyOAfDCqzS9SAXJF2KsBhJ +cfi4oKrK5R/6reBjq6bJfCOs68bHhND5u+GGm1zdkA7DHVQ9VUKMtENp6cvo4/xQ +NU9TCP7YfLyKu1dY/N1csh0FyljpnoSbDgy8ZfO4q2dqcl/l2xQuFhYL7yE9MP35 +l/ReuWBfuAM8B8WQmVDTb1r74a1SBARyKj/u1PXyiKFxZEVYRniua/Di+uuAxz0A +bT1RGG+2DrX+OuFBw2udSB0L6kLLXL7Oa3DBVakAahVFGDwHemLhclCVxdQHr11z +yKRIExOR5GMy/80egx0R/X0AFgdatCQRebdDds3UuMeFCInRaBtcsxIPhFBm+D9u +SRZft8NacQX1EZW35ZCEJQWNX+QOAbXoX+OoAvs8OZkN6V8wrDi2M+/H8d0HkjL+ +Ujts3vfnpc9F41B1/hR0NeR0LIQo0xjVMlEwab1HBy8j0nu4EX9Aks+nPAERTjM6 +wKdK4TUKUyr/jWw+q9A55ocBYGMr2kB0amfp6sPvzuD4CxQP3PoJiq1qryuWZmIG +de1n4jp8BTP6LTq9yUidKc1NhissR6M7z9ZfCaNr6iVE4K+Ds/9U+dshmInreOx5 +lQqcn+AeHFjE +=ihZ6 +-END PGP SIGNATURE- Added: dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz.sha512 == --- dev/spark/v3.1.3-rc3-bin/pyspark-3.1.3.tar.gz.sha512
[spark] 01/01: Preparing development version 3.1.4-SNAPSHOT
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git commit 7540421a2f26154be22852e876b9acfd004a85d3 Author: Holden Karau AuthorDate: Tue Feb 1 21:15:31 2022 + Preparing development version 3.1.4-SNAPSHOT --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 6 +++--- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 39 files changed, 41 insertions(+), 41 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index d42841c..db689ea 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 3.1.3 +Version: 3.1.4 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = "aut", diff --git a/assembly/pom.xml b/assembly/pom.xml index 95309ee..8e2cded 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 80f6630..baf20e6 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index ead7a4f..1a9c791 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 232abd7..8a188a3 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index 79068a6..d53f21d 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index 2030f85..53b7312 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff
[spark] branch branch-3.1 updated (0a7eda3 -> 7540421)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from 0a7eda3 [SPARK-38075][SQL][3.1] Fix `hasNext` in `HiveScriptTransformationExec`'s process output iterator add b8c0799 Preparing Spark release v3.1.3-rc3 new 7540421 Preparing development version 3.1.4-SNAPSHOT The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/01: Preparing Spark release v3.1.3-rc3
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to tag v3.1.3-rc3 in repository https://gitbox.apache.org/repos/asf/spark.git commit b8c0799a8cef22c56132d94033759c9f82b0cc86 Author: Holden Karau AuthorDate: Tue Feb 1 21:15:27 2022 + Preparing Spark release v3.1.3-rc3 --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 6 +++--- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 39 files changed, 41 insertions(+), 41 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index db689ea..d42841c 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 3.1.4 +Version: 3.1.3 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = "aut", diff --git a/assembly/pom.xml b/assembly/pom.xml index 8e2cded..95309ee 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index baf20e6..80f6630 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index 1a9c791..ead7a4f 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 8a188a3..232abd7 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index d53f21d..79068a6 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index 53b7312..2030f85 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/t
[spark] tag v3.1.3-rc3 created (now b8c0799)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to tag v3.1.3-rc3 in repository https://gitbox.apache.org/repos/asf/spark.git. at b8c0799 (commit) This tag includes the following new commits: new b8c0799 Preparing Spark release v3.1.3-rc3 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r52259 - in /dev/spark/v3.1.3-rc2-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/parqu
Author: holden Date: Tue Jan 25 08:02:31 2022 New Revision: 52259 Log: Apache Spark v3.1.3-rc2 docs [This commit notification would consist of 2264 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r52233 - /dev/spark/v3.1.3-rc2-bin/
Author: holden Date: Sun Jan 23 20:03:24 2022 New Revision: 52233 Log: Apache Spark v3.1.3-rc2 Added: dev/spark/v3.1.3-rc2-bin/ dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz (with props) dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz.asc dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz.sha512 dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz (with props) dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz.asc dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz.sha512 dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-hadoop2.tgz (with props) dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-hadoop2.tgz.asc dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-hadoop2.tgz.sha512 dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-hadoop3.tgz (with props) dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-hadoop3.tgz.asc dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-hadoop3.tgz.sha512 dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-without-hadoop.tgz (with props) dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-without-hadoop.tgz.asc dev/spark/v3.1.3-rc2-bin/spark-3.1.3-bin-without-hadoop.tgz.sha512 dev/spark/v3.1.3-rc2-bin/spark-3.1.3.tgz (with props) dev/spark/v3.1.3-rc2-bin/spark-3.1.3.tgz.asc dev/spark/v3.1.3-rc2-bin/spark-3.1.3.tgz.sha512 Added: dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz == Binary file - no diff available. Propchange: dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz.asc == --- dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz.asc (added) +++ dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz.asc Sun Jan 23 20:03:24 2022 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJFBAABCgAwFiEEyhnizCAbZSLToqBdxBq2dEJPw/sFAmHttEsSHGhvbGRlbkBh +cGFjaGUub3JnAAoJEMQatnRCT8P7U+YP9A7mD4KFkJxrGgVFcjWn0Ll5aG0aDpPP +jXb3hFeEEkUALFLzqS6Nnms03cgRNIysIl8RU5JPVkxmQa7FWc3wMkqrzGPKusDP +v842qO6uGLxYqg2EM9PKpz0bV+Lzo2/CqJ8hZKfiJGoAuLv2upPoz80iuV+Ey29a +XKa4AwNiQFQAzMHfG/ydHx2to9nxx3w2Y2PNgsYKrTT2zN/X1i5bCxinCasX5Hm1 +gP8CyfQwTZJnmzg7U1iAnSoOIsf6vG63Vfnb3aKP5+Ee96ny1vSkT29YJVPaFD+r +/ItbDZbCI2rhm0AAWZGF4O4BEP7Qq5HAnaohwXA9P0A07flNT7rdL/bTuGS6+8Sq +vRlqpMvS3N+kxH/c1l36hvoZ1V/uVv8v85ygaWZ2ufIFCPaApufeI4Oc8Xf+ujVF +6zFKH0eNtDVXf0sL0Cs9jezBtA1+n0a3z3cigzw6qVy3nZF69SSzbEEEOHmToLgB +q243s4Mk4B3rOZThW5crHSYFLOS7YraVf6Ocv7jVOl47CN6jCS47Q5CkIS3spW1p +6mLixT47QdgGBDoqUvwW9afWjsfjeRYbUtXFO/loPQteIRnQtVWOGhIxEEGkXZYU +Yb/fnvYpwvlJT1x28OQ1FXts0uuUgS7Cfj891t1mwQm5f0uibgt0bgBxa7ZblkOh +M/eBzY91vOE= +=jyno +-END PGP SIGNATURE- Added: dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz.sha512 == --- dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz.sha512 (added) +++ dev/spark/v3.1.3-rc2-bin/SparkR_3.1.3.tar.gz.sha512 Sun Jan 23 20:03:24 2022 @@ -0,0 +1,3 @@ +SparkR_3.1.3.tar.gz: 2520DA4C 77648843 B0F0F4AF 81930F82 0DC3B8A6 DD474285 + 28FC3DD4 7CDF7EFF 6CFD0982 4AC4D12B 0A8A0633 968EF8E1 + 1B92F48B 7D87DC3B 39BF18DB 67645A7D Added: dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz == Binary file - no diff available. Propchange: dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz.asc == --- dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz.asc (added) +++ dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz.asc Sun Jan 23 20:03:24 2022 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJGBAABCgAwFiEEyhnizCAbZSLToqBdxBq2dEJPw/sFAmHttE0SHGhvbGRlbkBh +cGFjaGUub3JnAAoJEMQatnRCT8P7kvEP/inJR1yX1EAGXPFkqa89AYd4bf4uQw78 +pYPGMPxxGW7/HjwuIlzxiFcMUwQl/r0n9iinuF9Hs/z0nzwMp2JzL9fmiZDVNV+8 ++wyt7kke3VvXBJzzO7XYJ+bQw3xPTbdmmkfdgL+OPalNVOXueamf+0siANdXaFRL +XNqWWMsp2ujXk+PiQulipumJ4hG3p/LkQAvTPfgpgvr1yqh25TZQUY+o6DB1RN+b +mlkTT47oPWSixS9OdfQZbVBq5YjFa4iSr77I9Bv9BxJDJE13ASHoJ81+yrK3g2eF +a+1UTTTxk9tVL0TAl14TIeC2T0wzDQHcHaSvNYfykeSIbn+0Ou74diWrSCArWRPE +wyThcUfwE971HdB8+l/5qh3D/DNCAQGELQSw+cw8eDG9lpMZqwYzGCSKk2W7xTRi +sEbj2UKd8wbW+JIb0XYrsAukKkJiB3Dv30h4o1cFaSKTXmiVOR/SY7ixs8AEG48m +EmpHN766yS9u6uu/AOJd7vQ2wzB2egNMt8z/z/20jUOt3D4rNde6jN0DcXbhbCEA +wUxvqjqD8Cwm5sVLIBd+nNYKv6FNB5pLDOeakPZJri6EGhgGiWZD3J53mSf0QFLX +dLBmDchZqVE9rF1KbskWysNrWcxOJL41BYAbtW8YrdibjN9gSKIL+h95A6CbkDkL +mqeVOFxNrskc +=RXYl +-END PGP SIGNATURE- Added: dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz.sha512 == --- dev/spark/v3.1.3-rc2-bin/pyspark-3.1.3.tar.gz.sha512
[spark] branch branch-3.1 updated (830d5b6 -> 91db9a3)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from 830d5b6 [SPARK-37860][UI] Fix taskindex in the stage page task event timeline add fd2d694 Preparing Spark release v3.1.3-rc2 new 91db9a3 Preparing development version 3.1.4-SNAPSHOT The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/01: Preparing development version 3.1.4-SNAPSHOT
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git commit 91db9a36a9ed74845908f14d21227d5267591653 Author: Holden Karau AuthorDate: Thu Jan 13 17:42:17 2022 + Preparing development version 3.1.4-SNAPSHOT --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 6 +++--- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 39 files changed, 41 insertions(+), 41 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index d42841c..db689ea 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 3.1.3 +Version: 3.1.4 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = "aut", diff --git a/assembly/pom.xml b/assembly/pom.xml index 95309ee..8e2cded 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 80f6630..baf20e6 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index ead7a4f..1a9c791 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 232abd7..8a188a3 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index 79068a6..d53f21d 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index 2030f85..53b7312 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff
[spark] 01/01: Preparing Spark release v3.1.3-rc2
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to tag v3.1.3-rc2 in repository https://gitbox.apache.org/repos/asf/spark.git commit fd2d694d783637e496a72166445b1853625c4cbd Author: Holden Karau AuthorDate: Thu Jan 13 17:42:13 2022 + Preparing Spark release v3.1.3-rc2 --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 6 +++--- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 39 files changed, 41 insertions(+), 41 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index db689ea..d42841c 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 3.1.4 +Version: 3.1.3 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = "aut", diff --git a/assembly/pom.xml b/assembly/pom.xml index 8e2cded..95309ee 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index baf20e6..80f6630 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index 1a9c791..ead7a4f 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 8a188a3..232abd7 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index d53f21d..79068a6 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index 53b7312..2030f85 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.4-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/t
[spark] tag v3.1.3-rc2 created (now fd2d694)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to tag v3.1.3-rc2 in repository https://gitbox.apache.org/repos/asf/spark.git. at fd2d694 (commit) This tag includes the following new commits: new fd2d694 Preparing Spark release v3.1.3-rc2 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r52050 - /dev/spark/v3.1.3-rc1-bin/
Author: holden Date: Wed Jan 12 22:19:54 2022 New Revision: 52050 Log: Apache Spark v3.1.3-rc1 Added: dev/spark/v3.1.3-rc1-bin/ dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz (with props) dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz.asc dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz.sha512 dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz (with props) dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz.asc dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz.sha512 dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-hadoop2.7.tgz (with props) dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-hadoop2.7.tgz.asc dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-hadoop2.7.tgz.sha512 dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-hadoop3.2.tgz (with props) dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-hadoop3.2.tgz.asc dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-hadoop3.2.tgz.sha512 dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-without-hadoop.tgz (with props) dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-without-hadoop.tgz.asc dev/spark/v3.1.3-rc1-bin/spark-3.1.3-bin-without-hadoop.tgz.sha512 dev/spark/v3.1.3-rc1-bin/spark-3.1.3.tgz (with props) dev/spark/v3.1.3-rc1-bin/spark-3.1.3.tgz.asc dev/spark/v3.1.3-rc1-bin/spark-3.1.3.tgz.sha512 Added: dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz == Binary file - no diff available. Propchange: dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz.asc == --- dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz.asc (added) +++ dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz.asc Wed Jan 12 22:19:54 2022 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJGBAABCgAwFiEEyhnizCAbZSLToqBdxBq2dEJPw/sFAmHfSr0SHGhvbGRlbkBh +cGFjaGUub3JnAAoJEMQatnRCT8P73cgP/j9op+sFsqEzSUZ+VYTOA+ttS3wIDNnw +wvzuaeJzpYmImJj51v6zACD48z5fYY3zjlK0yNwlw0Zho0L0aofjOVaG0eS6RCC3 +sk7dcLj31vrPkt75aSswe2tLtWlVZ94glua1FwuIWckJW9hiogL93Zpm3RsoDxWY +/r7JluXJ1cyab4o8mXsKLczzC3GULST609Q6h/CNHK627SCqsfM6dBWG+zFxqEtT +NaoYHvSGC0CWlfVYlbGfCK7J9yb8+KcaxBOCm38axveXjJF1NOb4ptfdRh54RRc4 +Ekrk6WxyGDMYqpfjY7ICfGNUfynX5MNk79OkccZU0OneikVDFpy+do6HK0G2W09r +ViG2C+ccW5UjzoiNAqK+4p7NKRHtfH6RW9sCe+V6HYuZOGBRwlwSofsGxy4Mk1Z/ +ngWpVSrSs7r1cSY7js4uXCWSycLTx+O7e9u9nU1/cvWtiXTUIaFrDQ2pahP1ug9E +i2hFWq87XfsNNkLmrjpmcmxedBNiDhkix7J3q2wiawIWo8V82puPODbnADUlIDeg +E/vuW3wfOHeVc3qwDehXVstWEzdwmBwNxfVEjLu5q0x8enERbj3Ud9JyqjDXwA34 +59eOXtPpt1Ebc4Xmv6kDVZi1MNicF34r9JklgkWOeplqP9BJpUr56IvZDbVw1O+B +Ca/cRHiS7BNt +=RALG +-END PGP SIGNATURE- Added: dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz.sha512 == --- dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz.sha512 (added) +++ dev/spark/v3.1.3-rc1-bin/SparkR_3.1.3.tar.gz.sha512 Wed Jan 12 22:19:54 2022 @@ -0,0 +1,3 @@ +SparkR_3.1.3.tar.gz: 60FA6C02 865717E3 DDF94831 5B8A2C24 B22EAAF5 2D8D35D2 + A0F569A2 50DA6021 CBCE131F C336011B A5D2A8E5 74F5A85F + 653AC05A D3684B0C CB663932 B1E950E2 Added: dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz == Binary file - no diff available. Propchange: dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz.asc == --- dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz.asc (added) +++ dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz.asc Wed Jan 12 22:19:54 2022 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJGBAABCgAwFiEEyhnizCAbZSLToqBdxBq2dEJPw/sFAmHfSr4SHGhvbGRlbkBh +cGFjaGUub3JnAAoJEMQatnRCT8P73WwP/3hS1LoVx4LD9sC15cagBRvm95AgySzO +3Quf6791doVk8tsyd65WmVITmF1Yv66rhOgW29IpoGq7mJmjxIkPwmW8qbhCHi8S +wIs88/DRqF+bL7M9+jS9c0quE7/5gE3PgRIhKnunUQfnekj48k0mQM0Za6lq+vhN +E/nTvQvdxznFHoN1j9wCmYRyTB7h4MOGClDIutMvX4yERmfwTgicUxfTlhxuaETd +ZS/v8wvcZU91WHntyxAkwwWTHG2NO03vH5qWVnLePY6IrZCa2mCM3EpxLXuB3HwX +jWLQx08Mb39yhdDxO8zrlsj6W3C7rkqN87bKU7m9oG7N/5lDEuYlMTu6ZU6WC2Af ++G5ChOfxVSa6s1JiFK2h8DzszksOHqnZGLFcDuBKsda5NU/Bgrwd2lKrXTaurXvE +NtT0Qa3LTNs0ivevI1pohWqc08iV20mh7ij8RzQRNP+vilxLI9kj/hwUn8EZpbif +gHuZUFw/UxhEO+fenS9gkYlhvIxECfioJ7ZoEygPx1XeenBJY2+fbcrlPWafdA1o +FK7wzdf/dSko7Z4e7TvH74xKn1iNG9xY6dYpH50MI+flzkj4eL2U2sGQ7cwdIA/A +pAsKjZXTknl3il+S5VT6tiGPZd270asF/EcSQJCEv4UPSa/tQt/2Ac1VpiMZMNlY +Vut5eqb+vmxB +=kHOB +-END PGP SIGNATURE- Added: dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3.tar.gz.sha512 == --- dev/spark/v3.1.3-rc1-bin/pyspark-3.1.3
[spark] 01/01: Preparing development version 3.1.4-SNAPSHOT
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git commit 1a275396da176d9b0436066f0173ec24bc128ab9 Author: Holden Karau AuthorDate: Mon Jan 10 21:23:44 2022 + Preparing development version 3.1.4-SNAPSHOT --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 6 +++--- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 39 files changed, 41 insertions(+), 41 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index d42841c..db689ea 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 3.1.3 +Version: 3.1.4 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = "aut", diff --git a/assembly/pom.xml b/assembly/pom.xml index 95309ee..8e2cded 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 80f6630..baf20e6 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index ead7a4f..1a9c791 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 232abd7..8a188a3 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index 79068a6..d53f21d 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index 2030f85..53b7312 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3 +3.1.4-SNAPSHOT ../../pom.xml diff
[spark] branch branch-3.1 updated (94a69ff -> 1a27539)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from 94a69ff Revert "[SPARK-37779][SQL] Make ColumnarToRowExec plan canonicalizable after (de)serialization" add df89eb2 Preparing Spark release v3.1.3-rc1 new 1a27539 Preparing development version 3.1.4-SNAPSHOT The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 6 +++--- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 39 files changed, 41 insertions(+), 41 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] tag v3.1.3-rc1 created (now df89eb2)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to tag v3.1.3-rc1 in repository https://gitbox.apache.org/repos/asf/spark.git. at df89eb2 (commit) This tag includes the following new commits: new df89eb2 Preparing Spark release v3.1.3-rc1 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/01: Preparing Spark release v3.1.3-rc1
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to tag v3.1.3-rc1 in repository https://gitbox.apache.org/repos/asf/spark.git commit df89eb2d748c99665256d3ef297fef774db1014a Author: Holden Karau AuthorDate: Mon Jan 10 21:23:40 2022 + Preparing Spark release v3.1.3-rc1 --- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 2 +- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 38 files changed, 38 insertions(+), 38 deletions(-) diff --git a/assembly/pom.xml b/assembly/pom.xml index bf75946..95309ee 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.12 -3.1.3-SNAPSHOT +3.1.3 ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 777477b..80f6630 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index 2233100..ead7a4f 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index aa5b2d4..232abd7 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index d3cc674..79068a6 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index c54d485..2030f85 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/tags/pom.xml b/common/tags/pom.xml index a02b306..798c88e 100644 --- a/common/tags/pom.xml +++ b/common/tags/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3-SNAPSHOT +3.1.3 ../../pom.xml diff --git a/common/unsafe/pom.xml b/common/unsafe/pom.xml index 64ee6ba..bd829c0 100644 --- a/common/unsafe/pom.xml +++ b/common/unsafe/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.12 -3.1.3-SNAPSHOT +3.1.3 ../../pom.xml diff --git a
[spark] branch master updated: [SPARK-36058][K8S] Add support for statefulset APIs in K8s
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ff3f3c4 [SPARK-36058][K8S] Add support for statefulset APIs in K8s ff3f3c4 is described below commit ff3f3c45668364da9bd10992791b5ee9a46fea21 Author: Holden Karau AuthorDate: Wed Aug 25 17:38:57 2021 -0700 [SPARK-36058][K8S] Add support for statefulset APIs in K8s ### What changes were proposed in this pull request? Generalize the pod allocator and add support for statefulsets. ### Why are the changes needed? Allocating individual pods in Spark can be not ideal for some clusters and using higher level operators like statefulsets and replicasets can be useful. ### Does this PR introduce _any_ user-facing change? Yes new config options. ### How was this patch tested? Completed: New unit & basic integration test PV integration tests Closes #33508 from holdenk/SPARK-36058-support-replicasets-or-job-api-like-things. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- .../executor/CoarseGrainedExecutorBackend.scala| 2 +- .../scala/org/apache/spark/storage/DiskStore.scala | 9 +- .../apache/spark/examples/MiniReadWriteTest.scala | 139 + .../scala/org/apache/spark/deploy/k8s/Config.scala | 10 + .../org/apache/spark/deploy/k8s/Constants.scala| 1 + .../k8s/features/BasicExecutorFeatureStep.scala| 15 +- .../cluster/k8s/AbstractPodsAllocator.scala| 59 ++ .../cluster/k8s/ExecutorPodsAllocator.scala| 18 +- .../cluster/k8s/ExecutorPodsSnapshot.scala | 11 +- .../cluster/k8s/KubernetesClusterManager.scala | 38 +++- .../cluster/k8s/KubernetesClusterMessage.scala | 21 ++ .../k8s/KubernetesClusterSchedulerBackend.scala| 41 +++- .../cluster/k8s/KubernetesExecutorBackend.scala| 228 + .../cluster/k8s/StatefulsetPodsAllocator.scala | 201 ++ .../features/BasicExecutorFeatureStepSuite.scala | 6 +- .../cluster/k8s/ExecutorPodsAllocatorSuite.scala | 31 ++- .../k8s/KubernetesClusterManagerSuite.scala| 58 ++ .../KubernetesClusterSchedulerBackendSuite.scala | 13 +- .../cluster/k8s/StatefulsetAllocatorSuite.scala| 153 ++ .../src/main/dockerfiles/spark/entrypoint.sh | 3 +- .../k8s/integrationtest/BasicTestsSuite.scala | 17 ++ .../k8s/integrationtest/KubernetesSuite.scala | 33 ++- .../deploy/k8s/integrationtest/PVTestsSuite.scala | 88 +++- 23 files changed, 1148 insertions(+), 47 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala b/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala index 76d01f8..c87e61a 100644 --- a/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala +++ b/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala @@ -61,7 +61,7 @@ private[spark] class CoarseGrainedExecutorBackend( private implicit val formats = DefaultFormats - private[executor] val stopping = new AtomicBoolean(false) + private[spark] val stopping = new AtomicBoolean(false) var executor: Executor = null @volatile var driver: Option[RpcEndpointRef] = None diff --git a/core/src/main/scala/org/apache/spark/storage/DiskStore.scala b/core/src/main/scala/org/apache/spark/storage/DiskStore.scala index 7269913..f0334c5 100644 --- a/core/src/main/scala/org/apache/spark/storage/DiskStore.scala +++ b/core/src/main/scala/org/apache/spark/storage/DiskStore.scala @@ -59,7 +59,14 @@ private[spark] class DiskStore( */ def put(blockId: BlockId)(writeFunc: WritableByteChannel => Unit): Unit = { if (contains(blockId)) { - throw new IllegalStateException(s"Block $blockId is already present in the disk store") + logWarning(s"Block $blockId is already present in the disk store") + try { +diskManager.getFile(blockId).delete() + } catch { +case e: Exception => + throw new IllegalStateException( +s"Block $blockId is already present in the disk store and could not delete it $e") + } } logDebug(s"Attempting to put block $blockId") val startTimeNs = System.nanoTime() diff --git a/examples/src/main/scala/org/apache/spark/examples/MiniReadWriteTest.scala b/examples/src/main/scala/org/apache/spark/examples/MiniReadWriteTest.scala new file mode 100644 index 000..5a74e1c --- /dev/null +++ b/examples/src/main/scala/org/apache/spark/examples/MiniReadWriteTest.scala @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license
[spark] branch master updated: [SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 01cf6f4 [SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache 01cf6f4 is described below commit 01cf6f4c6b2a593a2a8717fd2cda13725424120e Author: yangjie01 AuthorDate: Wed Aug 4 12:01:44 2021 -0700 [SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache ### What changes were proposed in this pull request? There are 3 ways to use Guava cache in spark code: 1. `Loadingcache` is the main way to use Guava cache in spark code and the key usages are as follows: a. `LoadingCache` with `maximumsize` data eviction policy, such as `appCache` in `ApplicationCache`, `cache` in `Codegenerator` b. `LoadingCache` with `maximumWeight` data eviction policy, such as `shuffleIndexCache` in `ExternalShuffleBlockResolver` c. `LoadingCache` with 'expireAfterWrite' data eviction policy, such as `tableRelationCache` in `SessionCatalog` 2. `ManualCache` is another way to use Guava cache in spark code and the key usage is `cache` in `SharedInMemoryCache`, it use to caches partition file statuses in memory 3. The last use way is `hadoopJobMetadata` in `SparkEnv`, it uses Guava Cache to build a `soft-reference map`. The goal of this pr is use `Caffeine` instead of `Guava Cache` because `Caffeine` is faster than `Guava Cache` from benchmarks, the main changes as follows: 1. Add `Caffeine` deps to maven `pom.xml` 2. Use `Caffeine` instead of Guava `LoadingCache`, `ManualCache` and soft-reference map in `SparkEnv` 3. Add `LocalCacheBenchmark` to compare performance of `Loadingcache` between `Guava Cache` and `Caffeine` ### Why are the changes needed? `Caffeine` is faster than `Guava Cache` from benchmarks ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass the Jenkins or GitHub Action - Add `LocalCacheBenchmark` to compare performance of `Loadingcache` between `Guava Cache` and `Caffeine` Closes #31517 from LuciferYang/guava-cache-to-caffeine. Authored-by: yangjie01 Signed-off-by: Holden Karau --- common/network-shuffle/pom.xml | 4 + .../shuffle/ExternalShuffleBlockResolver.java | 32 +++- .../network/shuffle/RemoteBlockPushResolver.java | 24 +++--- .../LocalCacheBenchmark-jdk11-results.txt | 12 +++ core/benchmarks/LocalCacheBenchmark-results.txt| 12 +++ core/pom.xml | 4 + .../src/main/scala/org/apache/spark/SparkEnv.scala | 4 +- .../spark/deploy/history/ApplicationCache.scala| 37 + .../apache/spark/rdd/ReliableCheckpointRDD.scala | 24 +++--- .../org/apache/spark/storage/BlockManager.scala| 4 +- .../org/apache/spark/storage/BlockManagerId.scala | 15 ++-- .../spark/storage/BlockManagerMasterEndpoint.scala | 4 +- .../main/scala/org/apache/spark/util/Utils.scala | 19 +++-- .../org/apache/spark/LocalCacheBenchmark.scala | 93 ++ .../org/apache/spark/executor/ExecutorSuite.scala | 18 ++--- dev/deps/spark-deps-hadoop-2.7-hive-2.3| 3 + dev/deps/spark-deps-hadoop-3.2-hive-2.3| 3 + pom.xml| 6 ++ resource-managers/kubernetes/core/pom.xml | 5 ++ .../cluster/k8s/ExecutorPodsLifecycleManager.scala | 9 ++- sql/catalyst/pom.xml | 4 + .../sql/catalyst/catalog/SessionCatalog.scala | 9 +-- .../expressions/SubExprEvaluationRuntime.scala | 21 ++--- .../expressions/codegen/CodeGenerator.scala| 6 +- .../catalyst/util/DateTimeFormatterHelper.scala| 4 +- ...CodeGeneratorWithInterpretedFallbackSuite.scala | 4 +- .../SubExprEvaluationRuntimeSuite.scala| 20 ++--- sql/core/pom.xml | 5 +- .../execution/datasources/FileStatusCache.scala| 13 ++- .../spark/sql/execution/metric/SQLMetrics.scala| 4 +- 30 files changed, 283 insertions(+), 139 deletions(-) diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index d3d78f2..1b78182 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -59,6 +59,10 @@ provided + com.github.ben-manes.caffeine + caffeine + + com.google.guava guava diff --git a/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java b/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java index 73d4e6c..650f33e 100644 --- a/common/network-shuffle/src/main/java/org/apache/spark/netwo
[spark] branch master updated: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new bee2799 [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods bee2799 is described below commit bee279997f2115af6b15e3dbb7433dccef7f14af Author: Holden Karau AuthorDate: Fri Jul 23 15:21:38 2021 -0700 [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods ### What changes were proposed in this pull request? Add a new configuration flag to allow Spark to provide hints to the scheduler when we are decommissioning or exiting a pod that this pod will have the least impact for a pre-emption event. ### Why are the changes needed? Kubernetes added the concepts of pod disruption budgets (which can have selectors based on labels) as well pod deletion for providing hints to the scheduler as to what we would prefer to have pre-empted. ### Does this PR introduce _any_ user-facing change? New configuration flag ### How was this patch tested? The deletion unit test was extended. Closes #33270 from holdenk/SPARK-35956-support-auto-assigning-labels-to-decommissioning-pods. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- docs/running-on-kubernetes.md | 28 +--- .../scala/org/apache/spark/deploy/k8s/Config.scala | 18 .../k8s/KubernetesClusterSchedulerBackend.scala| 50 +- .../KubernetesClusterSchedulerBackendSuite.scala | 47 +++- .../k8s/integrationtest/DecommissionSuite.scala| 40 ++--- 5 files changed, 168 insertions(+), 15 deletions(-) diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md index 09f7d2ab..b30d61d 100644 --- a/docs/running-on-kubernetes.md +++ b/docs/running-on-kubernetes.md @@ -8,9 +8,9 @@ license: | The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. @@ -422,7 +422,7 @@ Your Kubernetes config file typically lives under `.kube/config` in your home di ### Contexts -Kubernetes configuration files can contain multiple contexts that allow for switching between different clusters and/or user identities. By default Spark on Kubernetes will use your current context (which can be checked by running `kubectl config current-context`) when doing the initial auto-configuration of the Kubernetes client. +Kubernetes configuration files can contain multiple contexts that allow for switching between different clusters and/or user identities. By default Spark on Kubernetes will use your current context (which can be checked by running `kubectl config current-context`) when doing the initial auto-configuration of the Kubernetes client. In order to use an alternative context users can specify the desired context via the Spark configuration property `spark.kubernetes.context` e.g. `spark.kubernetes.context=minikube`. @@ -1038,7 +1038,7 @@ See the [configuration page](configuration.html) for information on Spark config spark.kubernetes.executor.secrets.ENV_VAR=spark-secret:key. 2.4.0 - + spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.path (none) @@ -1270,7 +1270,7 @@ See the [configuration page](configuration.html) for information on Spark config 3.0.0 - + spark.kubernetes.appKillPodDeletionGracePeriod (none) @@ -1288,6 +1288,24 @@ See the [configuration page](configuration.html) for information on Spark config 3.0.0 + + spark.kubernetes.executor.decommmissionLabel + (none) + +Label to be applied to pods which are exiting or being decommissioned. Intended for use +with pod disruption budgets, deletion costs, and similar. + + 3.3.0 + + + spark.kubernetes.executor.decommmissionLabelValue + (none) + +Value to be applied with the label when +spark.kubernetes.executor.decommmissionLabel is enabled. + + 3.3.0 + Pod template properties diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala index 49c0a42..33370b7 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala +++
[spark] branch master updated: [SPARK-34764][CORE][K8S][UI] Propagate reason for exec loss to Web UI
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 160b3be [SPARK-34764][CORE][K8S][UI] Propagate reason for exec loss to Web UI 160b3be is described below commit 160b3bee71ac5c4b5587fa16c8eef753e9a4ad91 Author: Holden Karau AuthorDate: Thu May 13 16:02:31 2021 -0700 [SPARK-34764][CORE][K8S][UI] Propagate reason for exec loss to Web UI ### What changes were proposed in this pull request? Adds the exec loss reason to the Spark web UI & in doing so also fix the Kube integration to pass exec loss reason into core. UI change: ![image](https://user-images.githubusercontent.com/59893/117045762-b975ba80-acc4-11eb-9679-8edab3cfadc2.png) ### Why are the changes needed? Debugging Spark jobs is *hard*, making it clearer why executors have exited could help. ### Does this PR introduce _any_ user-facing change? Yes a new column on the executor page. ### How was this patch tested? K8s unit test updated to validate exec loss reasons are passed through regardless of exec alive state, manual testing to validate the UI. Closes #32436 from holdenk/SPARK-34764-propegate-reason-for-exec-loss. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- .../spark/ui/static/executorspage-template.html| 1 + .../org/apache/spark/ui/static/executorspage.js| 15 +- .../cluster/k8s/ExecutorPodsLifecycleManager.scala | 34 -- .../k8s/KubernetesClusterSchedulerBackend.scala| 4 +-- .../k8s/ExecutorPodsLifecycleManagerSuite.scala| 21 - .../KubernetesClusterSchedulerBackendSuite.scala | 4 +-- 6 files changed, 63 insertions(+), 16 deletions(-) diff --git a/core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html b/core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html index be6d7bc..37d56a0 100644 --- a/core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html +++ b/core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html @@ -128,6 +128,7 @@ limitations under the License. Shuffle Write Logs Thread Dump + Exec Loss Reason diff --git a/core/src/main/resources/org/apache/spark/ui/static/executorspage.js b/core/src/main/resources/org/apache/spark/ui/static/executorspage.js index 2055c8f..5cc2868 100644 --- a/core/src/main/resources/org/apache/spark/ui/static/executorspage.js +++ b/core/src/main/resources/org/apache/spark/ui/static/executorspage.js @@ -31,6 +31,14 @@ function getThreadDumpEnabled() { return threadDumpEnabled; } +function formatLossReason(removeReason, type, row) { +if (removeReason) { + return removeReason +} else { + return "" +} +} + function formatStatus(status, type, row) { if (row.isExcluded) { return "Excluded"; @@ -132,7 +140,7 @@ function totalDurationColor(totalGCTime, totalDuration) { } var sumOptionalColumns = [3, 4]; -var execOptionalColumns = [5, 6, 7, 8, 9, 10, 13, 14]; +var execOptionalColumns = [5, 6, 7, 8, 9, 10, 13, 14, 15]; var execDataTable; var sumDataTable; @@ -543,6 +551,10 @@ $(document).ready(function () { data: 'id', render: function (data, type) { return type === 'display' ? ("Thread Dump" ) : data; } +}, +{ +data: 'removeReason', +render: formatLossReason } ], "order": [[0, "asc"]], @@ -709,6 +721,7 @@ $(document).ready(function () { " Peak Pool Memory Direct / Mapped" + " Resources" + " Resource Profile Id" + + " Exec Loss Reason" + ""); reselectCheckboxesBasedOnTaskTableState(); diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala index 67e8f7e..e255de4 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala @@ -221,10 +221,17 @@ private[spark] class ExecutorPodsLifecycleManager( val pod = podState.pod val reason = Option(pod.getStatus.getReason) val message = Option(pod.getStatus.getMessage) +val
[spark] branch branch-3.1 updated: [SPARK-34361][K8S] In case of downscaling avoid killing of executors already known by the scheduler backend in the pod allocator
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 28f2d51 [SPARK-34361][K8S] In case of downscaling avoid killing of executors already known by the scheduler backend in the pod allocator 28f2d51 is described below commit 28f2d51e5950a8b99a37814c9396b23315880fe5 Author: “attilapiros” AuthorDate: Tue Mar 2 16:58:29 2021 -0800 [SPARK-34361][K8S] In case of downscaling avoid killing of executors already known by the scheduler backend in the pod allocator ### What changes were proposed in this pull request? This PR modifies the POD allocator to use the scheduler backend to get the known executors and remove those from the pending and newly created list. This is different from the normal `ExecutorAllocationManager` requested killing of executors where the `spark.dynamicAllocation.executorIdleTimeout` is used. In this case POD allocator kills the executors which should be only responsible for terminating not satisfied POD allocations (new requests where no POD state is received yet and PODs in pending state). ### Why are the changes needed? Because there is race between executor POD allocator and cluster scheduler backend. Running several experiment during downscaling we experienced a lot of killed fresh executors wich has already running task on them. The pattern in the log was the following (see executor 312 and TID 2079): ``` 21/02/01 15:12:03 INFO ExecutorMonitor: New executor 312 has registered (new total is 138) ... 21/02/01 15:12:03 INFO TaskSetManager: Starting task 247.0 in stage 4.0 (TID 2079, 100.100.18.138, executor 312, partition 247, PROCESS_LOCAL, 8777 bytes) 21/02/01 15:12:03 INFO ExecutorPodsAllocator: Deleting 3 excess pod requests (408,312,307). ... 21/02/01 15:12:04 ERROR TaskSchedulerImpl: Lost executor 312 on 100.100.18.138: The executor with id 312 was deleted by a user or the framework. 21/02/01 15:12:04 INFO TaskSetManager: Task 2079 failed because while it was being computed, its executor exited for a reason unrelated to the task. Not counting this failure towards the maximum number of failures for the task. ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually With this change there was no executor lost with running task on it. # With unit test A new test is added and existing test is modified to check these cases. Closes #31513 from attilapiros/SPARK-34361. Authored-by: “attilapiros” Signed-off-by: Holden Karau (cherry picked from commit 6c5322de6176726955b4bc941f92ecaa54a7f539) Signed-off-by: Holden Karau --- .../scala/org/apache/spark/deploy/k8s/Config.scala | 3 +- .../cluster/k8s/ExecutorPodsAllocator.scala| 66 ++--- .../k8s/KubernetesClusterSchedulerBackend.scala| 2 +- .../cluster/k8s/ExecutorPodsAllocatorSuite.scala | 155 - .../KubernetesClusterSchedulerBackendSuite.scala | 2 +- 5 files changed, 204 insertions(+), 24 deletions(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala index 68dcef1..bec29a9 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala @@ -237,7 +237,8 @@ private[spark] object Config extends Logging { val KUBERNETES_ALLOCATION_EXECUTOR_TIMEOUT = ConfigBuilder("spark.kubernetes.allocation.executor.timeout") - .doc("Time to wait before considering a pending executor timedout.") + .doc("Time to wait before a newly created executor POD request, which does not reached " + +"the POD pending state yet, considered timedout and will be deleted.") .version("3.1.0") .timeConf(TimeUnit.MILLISECONDS) .checkValue(value => value > 0, "Allocation executor timeout must be a positive time value.") diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala index eb35de8..5fc81a6 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala @@ -18,7 +18,7 @@ package
[spark] branch master updated: [SPARK-34361][K8S] In case of downscaling avoid killing of executors already known by the scheduler backend in the pod allocator
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6c5322d [SPARK-34361][K8S] In case of downscaling avoid killing of executors already known by the scheduler backend in the pod allocator 6c5322d is described below commit 6c5322de6176726955b4bc941f92ecaa54a7f539 Author: “attilapiros” AuthorDate: Tue Mar 2 16:58:29 2021 -0800 [SPARK-34361][K8S] In case of downscaling avoid killing of executors already known by the scheduler backend in the pod allocator ### What changes were proposed in this pull request? This PR modifies the POD allocator to use the scheduler backend to get the known executors and remove those from the pending and newly created list. This is different from the normal `ExecutorAllocationManager` requested killing of executors where the `spark.dynamicAllocation.executorIdleTimeout` is used. In this case POD allocator kills the executors which should be only responsible for terminating not satisfied POD allocations (new requests where no POD state is received yet and PODs in pending state). ### Why are the changes needed? Because there is race between executor POD allocator and cluster scheduler backend. Running several experiment during downscaling we experienced a lot of killed fresh executors wich has already running task on them. The pattern in the log was the following (see executor 312 and TID 2079): ``` 21/02/01 15:12:03 INFO ExecutorMonitor: New executor 312 has registered (new total is 138) ... 21/02/01 15:12:03 INFO TaskSetManager: Starting task 247.0 in stage 4.0 (TID 2079, 100.100.18.138, executor 312, partition 247, PROCESS_LOCAL, 8777 bytes) 21/02/01 15:12:03 INFO ExecutorPodsAllocator: Deleting 3 excess pod requests (408,312,307). ... 21/02/01 15:12:04 ERROR TaskSchedulerImpl: Lost executor 312 on 100.100.18.138: The executor with id 312 was deleted by a user or the framework. 21/02/01 15:12:04 INFO TaskSetManager: Task 2079 failed because while it was being computed, its executor exited for a reason unrelated to the task. Not counting this failure towards the maximum number of failures for the task. ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually With this change there was no executor lost with running task on it. # With unit test A new test is added and existing test is modified to check these cases. Closes #31513 from attilapiros/SPARK-34361. Authored-by: “attilapiros” Signed-off-by: Holden Karau --- .../scala/org/apache/spark/deploy/k8s/Config.scala | 3 +- .../cluster/k8s/ExecutorPodsAllocator.scala| 66 ++--- .../k8s/KubernetesClusterSchedulerBackend.scala| 2 +- .../cluster/k8s/ExecutorPodsAllocatorSuite.scala | 155 - .../KubernetesClusterSchedulerBackendSuite.scala | 2 +- 5 files changed, 204 insertions(+), 24 deletions(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala index d8fa9e4..8ac2ee4 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala @@ -272,7 +272,8 @@ private[spark] object Config extends Logging { val KUBERNETES_ALLOCATION_EXECUTOR_TIMEOUT = ConfigBuilder("spark.kubernetes.allocation.executor.timeout") - .doc("Time to wait before considering a pending executor timedout.") + .doc("Time to wait before a newly created executor POD request, which does not reached " + +"the POD pending state yet, considered timedout and will be deleted.") .version("3.1.0") .timeConf(TimeUnit.MILLISECONDS) .checkValue(value => value > 0, "Allocation executor timeout must be a positive time value.") diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala index eb35de8..5fc81a6 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala @@ -18,7 +18,7 @@ package org.apache.spark.scheduler.cluster.k8s import java.time.Instant import java.util.concurrent.ConcurrentHash
[spark] branch master updated: [SPARK-32617][K8S][TESTS] Configure kubernetes client based on kubeconfig settings in kubernetes integration tests
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new b17754a [SPARK-32617][K8S][TESTS] Configure kubernetes client based on kubeconfig settings in kubernetes integration tests b17754a is described below commit b17754a8cbd2593eb2b1952e95a7eeb0f8e09cdb Author: “attilapiros” AuthorDate: Wed Feb 24 11:46:27 2021 -0800 [SPARK-32617][K8S][TESTS] Configure kubernetes client based on kubeconfig settings in kubernetes integration tests ### What changes were proposed in this pull request? From [minikube version v1.1.0](https://github.com/kubernetes/minikube/blob/v1.1.0/CHANGELOG.md) kubectl is available as a command. So the kubeconfig settings can be accessed like: ``` $ minikube kubectl config view apiVersion: v1 clusters: - cluster: certificate-authority: /Users/attilazsoltpiros/.minikube/ca.crt server: https://127.0.0.1:32788 name: minikube contexts: - context: cluster: minikube namespace: default user: minikube name: minikube current-context: minikube kind: Config preferences: {} users: - name: minikube user: client-certificate: /Users/attilazsoltpiros/.minikube/profiles/minikube/client.crt client-key: /Users/attilazsoltpiros/.minikube/profiles/minikube/client.key ``` Here the vm-driver was docker and the server port (https://127.0.0.1:32788) is different from the hardcoded 8443. So the main part of this PR is introducing kubernetes client configuration based on the kubeconfig (output of `minikube kubectl config view`) in case of minikube versions after v1.1.0 and the old legacy way of configuration is also kept as minikube version should be supported back to v0.34.1 . Moreover as the old style of config parsing pattern wasn't sufficient in my case as when the `minikube kubectl config view` is called kubectl downloading message might be included before the first key I changed it even for the existent keys to be a consistent pattern in this file. The old parsing in an example: ``` private val HOST_PREFIX = "host:" val hostString = statusString.find(_.contains(s"$HOST_PREFIX ")) val status1 = hostString.get.split(HOST_PREFIX)(1) ``` The new parsing: ``` private val HOST_PREFIX = "host: " val hostString = statusString.find(_.contains(HOST_PREFIX)) hostString.get.split(HOST_PREFIX)(1) ``` So the PREFIX is extended with the extra space at the declaration (this way the two separate string operation are more safe and consistent with each other) and the replace is changed to split and getting the 2nd string from the result (which is guaranteed to contain only the text after the PREFIX when the PREFIX is a contained substring). Finally there is tiny change in `dev-run-integration-tests.sh` to introduce `--skip-building-dependencies` which switchs off building of maven dependencies of `kubernetes-integration-tests` from the Spark project. This could be used when only the `kubernetes-integration-tests` should be rebuilded as only the tests are modified. ### Why are the changes needed? Kubernetes client configuration based on kubeconfig settings is more reliable and provides a solution which is minikube version independent. ### Does this PR introduce _any_ user-facing change? No. This is only test code. ### How was this patch tested? tested manually on two minikube versions. Minikube v0.34.1: ``` $ minikube version minikube version: v0.34.1 $ grep "version\|building" resource-managers/kubernetes/integration-tests/target/integration-tests.log 20/12/12 12:52:25.135 ScalaTest-main-running-DiscoverySuite INFO Minikube: minikube version: v0.34.1 20/12/12 12:52:25.761 ScalaTest-main-running-DiscoverySuite INFO Minikube: building kubernetes config with apiVersion: v1, masterUrl: https://192.168.99.103:8443, caCertFile: /Users/attilazsoltpiros/.minikube/ca.crt, clientCertFile: /Users/attilazsoltpiros/.minikube/apiserver.crt, clientKeyFile: /Users/attilazsoltpiros/.minikube/apiserver.key ``` Minikube v1.15.1 ``` $ minikube version minikube version: v1.15.1 commit: 23f40a012abb52eff365ff99a709501a61ac5876 $ grep "version\|building" resource-managers/kubernetes/integration-tests/target/integration-tests.log 20/12/13 06:25:55.086 ScalaTest-main-running-DiscoverySuite INFO Minikube: minikube version: v1.15.1 20/12/13 06:25:55.597 ScalaTest-main-running-DiscoverySuite INFO Minikube: building kubernetes config with apiVersion
[spark] branch master updated: [SPARK-33763] Add metrics for better tracking of dynamic allocation
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 76e5d75 [SPARK-33763] Add metrics for better tracking of dynamic allocation 76e5d75 is described below commit 76e5d75e369609b96248a34bc6015ec61936e652 Author: “attilapiros” AuthorDate: Wed Feb 17 13:44:36 2021 -0800 [SPARK-33763] Add metrics for better tracking of dynamic allocation ### What changes were proposed in this pull request? This PR adds the following metrics to track executor remove reasons during dynamic allocation: - `numberExecutorsGracefullyDecommissioned`: number of executors which reached the finished decommissioning state and shut itself down cleanly - `numberExecutorsDecommissionUnfinished`: executors which requested to decommission but they stopped without reaching the finished decommissioning state - `numberExecutorsKilledByDriver`: executors killed by the driver (requested to stop) - `numberExecutorsExitedUnexpectedly`: executors exited without driver request ### Why are the changes needed? For supporting monitoring of dynamic allocation better with these metrics. ### Does this PR introduce _any_ user-facing change? Yes. The new metrics will be available for monitoring. ### How was this patch tested? With unit and integration tests. Finally manually checked the new metrics in jconsole: https://user-images.githubusercontent.com/2017933/107458686-de8adf00-6b54-11eb-86f7-41faf2fb638f.png";> Closes #31450 from attilapiros/SPARK-33763-final. Authored-by: “attilapiros” Signed-off-by: Holden Karau --- .../apache/spark/ExecutorAllocationManager.scala | 80 +- .../executor/CoarseGrainedExecutorBackend.scala| 6 +- .../spark/scheduler/ExecutorLossReason.scala | 4 ++ .../spark/scheduler/dynalloc/ExecutorMonitor.scala | 20 +- .../spark/ExecutorAllocationManagerSuite.scala | 54 ++- .../scheduler/dynalloc/ExecutorMonitorSuite.scala | 22 -- .../k8s/integrationtest/DecommissionSuite.scala| 11 ++- 7 files changed, 150 insertions(+), 47 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala index bdb768e..822a0a5 100644 --- a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala +++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala @@ -23,7 +23,7 @@ import scala.collection.mutable import scala.collection.mutable.ArrayBuffer import scala.util.control.{ControlThrowable, NonFatal} -import com.codahale.metrics.{Gauge, MetricRegistry} +import com.codahale.metrics.{Counter, Gauge, MetricRegistry} import org.apache.spark.internal.{config, Logging} import org.apache.spark.internal.config._ @@ -135,14 +135,14 @@ private[spark] class ExecutorAllocationManager( validateSettings() // Number of executors to add for each ResourceProfile in the next round - private val numExecutorsToAddPerResourceProfileId = new mutable.HashMap[Int, Int] + private[spark] val numExecutorsToAddPerResourceProfileId = new mutable.HashMap[Int, Int] numExecutorsToAddPerResourceProfileId(defaultProfileId) = 1 // The desired number of executors at this moment in time. If all our executors were to die, this // is the number of executors we would immediately want from the cluster manager. // Note every profile will be allowed to have initial number, // we may want to make this configurable per Profile in the future - private val numExecutorsTargetPerResourceProfileId = new mutable.HashMap[Int, Int] + private[spark] val numExecutorsTargetPerResourceProfileId = new mutable.HashMap[Int, Int] numExecutorsTargetPerResourceProfileId(defaultProfileId) = initialNumExecutors // A timestamp of when an addition should be triggered, or NOT_SET if it is not set @@ -155,14 +155,15 @@ private[spark] class ExecutorAllocationManager( // Listener for Spark events that impact the allocation policy val listener = new ExecutorAllocationListener - val executorMonitor = new ExecutorMonitor(conf, client, listenerBus, clock) - // Executor that handles the scheduling task. private val executor = ThreadUtils.newDaemonSingleThreadScheduledExecutor("spark-dynamic-executor-allocation") // Metric source for ExecutorAllocationManager to expose internal status to MetricsSystem. - val executorAllocationManagerSource = new ExecutorAllocationManagerSource + val executorAllocationManagerSource = new ExecutorAllocationManagerSource(this) + + val executorMonitor = +new ExecutorMonitor(conf, client, listenerBus, clock, executorAllocationManagerSource) // Whether we are sti
[spark] branch master updated: [SPARK-34104][SPARK-34105][CORE][K8S] Maximum decommissioning time & allow decommissioning for excludes
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 5248ecb [SPARK-34104][SPARK-34105][CORE][K8S] Maximum decommissioning time & allow decommissioning for excludes 5248ecb is described below commit 5248ecb5ab0c9fdbd5f333e751d1f5fb3c514d14 Author: Holden Karau AuthorDate: Tue Feb 9 18:16:09 2021 -0800 [SPARK-34104][SPARK-34105][CORE][K8S] Maximum decommissioning time & allow decommissioning for excludes ### What changes were proposed in this pull request? Allow users to have Spark attempt to decommission excluded executors. Since excluded executors may be flaky, this also adds the ability for users to specify a time limit after which a decommissioning executor will be killed by Spark. ### Why are the changes needed? This may help prevent fetch failures from excluded executors, and also handle the situation in which executors ### Does this PR introduce _any_ user-facing change? Yes, two new configuration flags for the behaviour. ### How was this patch tested? Extended unit and integration tests. Closes #31539 from holdenk/re=enable-SPARK-34104-SPARK-34105. Authored-by: Holden Karau Signed-off-by: Holden Karau --- .../apache/spark/ExecutorAllocationClient.scala| 6 +++ .../org/apache/spark/internal/config/package.scala | 19 +++- .../org/apache/spark/scheduler/HealthTracker.scala | 35 +++--- .../cluster/CoarseGrainedClusterMessage.scala | 3 ++ .../cluster/CoarseGrainedSchedulerBackend.scala| 56 -- .../spark/scheduler/HealthTrackerSuite.scala | 45 + .../k8s/integrationtest/DecommissionSuite.scala| 32 + .../k8s/integrationtest/KubernetesSuite.scala | 5 +- 8 files changed, 187 insertions(+), 14 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala b/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala index cdba1c4..5b587d7 100644 --- a/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala +++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala @@ -129,6 +129,12 @@ private[spark] trait ExecutorAllocationClient { decommissionedExecutors.nonEmpty && decommissionedExecutors(0).equals(executorId) } + /** + * Request that the cluster manager decommission every executor on the specified host. + * + * @return whether the request is acknowledged by the cluster manager. + */ + def decommissionExecutorsOnHost(host: String): Boolean /** * Request that the cluster manager kill every executor on the specified host. diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala index 7aeb51d..3101bb6 100644 --- a/core/src/main/scala/org/apache/spark/internal/config/package.scala +++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala @@ -827,6 +827,13 @@ package object config { .booleanConf .createWithDefault(false) + private[spark] val EXCLUDE_ON_FAILURE_DECOMMISSION_ENABLED = +ConfigBuilder("spark.excludeOnFailure.killExcludedExecutors.decommission") + .doc("Attempt decommission of excluded nodes instead of going directly to kill") + .version("3.2.0") + .booleanConf + .createWithDefault(false) + private[spark] val EXCLUDE_ON_FAILURE_LEGACY_TIMEOUT_CONF = ConfigBuilder("spark.scheduler.executorTaskExcludeOnFailureTime") .internal() @@ -1958,7 +1965,8 @@ package object config { private[spark] val EXECUTOR_DECOMMISSION_KILL_INTERVAL = ConfigBuilder("spark.executor.decommission.killInterval") - .doc("Duration after which a decommissioned executor will be killed forcefully." + + .doc("Duration after which a decommissioned executor will be killed forcefully " + +"*by an outside* (e.g. non-spark) service. " + "This config is useful for cloud environments where we know in advance when " + "an executor is going to go down after decommissioning signal i.e. around 2 mins " + "in aws spot nodes, 1/2 hrs in spot block nodes etc. This config is currently " + @@ -1967,6 +1975,15 @@ package object config { .timeConf(TimeUnit.SECONDS) .createOptional + private[spark] val EXECUTOR_DECOMMISSION_FORCE_KILL_TIMEOUT = +ConfigBuilder("spark.executor.decommission.forceKillTimeout") + .doc("Duration after which a Spark will force a decommissioning executor to exit." + +" this should be set to a high value in most
[spark] branch master updated: [SPARK-34104][SPARK-34105][CORE][K8S] Maximum decommissioning time & allow decommissioning for excludes
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 50641d2 [SPARK-34104][SPARK-34105][CORE][K8S] Maximum decommissioning time & allow decommissioning for excludes 50641d2 is described below commit 50641d2e3d659f51432aa2c0e6b9af76d71a5796 Author: Holden Karau AuthorDate: Tue Feb 9 14:21:24 2021 -0800 [SPARK-34104][SPARK-34105][CORE][K8S] Maximum decommissioning time & allow decommissioning for excludes ### What changes were proposed in this pull request? Allow users to have Spark attempt to decommission excluded executors. Since excluded executors may be flaky, this also adds the ability for users to specify a time limit after which a decommissioning executor will be killed by Spark. ### Why are the changes needed? This may help prevent fetch failures from excluded executors, and also handle the situation in which executors ### Does this PR introduce _any_ user-facing change? Yes, two new configuration flags for the behaviour. ### How was this patch tested? Extended unit and integration tests. Closes #31249 from holdenk/configure-inaccessibleList-kill-to-use-decommissioning. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- .../apache/spark/ExecutorAllocationClient.scala| 6 +++ .../org/apache/spark/internal/config/package.scala | 19 +++- .../org/apache/spark/scheduler/HealthTracker.scala | 35 +++--- .../cluster/CoarseGrainedClusterMessage.scala | 3 ++ .../cluster/CoarseGrainedSchedulerBackend.scala| 56 -- .../spark/scheduler/HealthTrackerSuite.scala | 45 + .../k8s/integrationtest/DecommissionSuite.scala| 32 + .../k8s/integrationtest/KubernetesSuite.scala | 5 +- 8 files changed, 187 insertions(+), 14 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala b/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala index cdba1c4..5b587d7 100644 --- a/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala +++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala @@ -129,6 +129,12 @@ private[spark] trait ExecutorAllocationClient { decommissionedExecutors.nonEmpty && decommissionedExecutors(0).equals(executorId) } + /** + * Request that the cluster manager decommission every executor on the specified host. + * + * @return whether the request is acknowledged by the cluster manager. + */ + def decommissionExecutorsOnHost(host: String): Boolean /** * Request that the cluster manager kill every executor on the specified host. diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala index 7aeb51d..3101bb6 100644 --- a/core/src/main/scala/org/apache/spark/internal/config/package.scala +++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala @@ -827,6 +827,13 @@ package object config { .booleanConf .createWithDefault(false) + private[spark] val EXCLUDE_ON_FAILURE_DECOMMISSION_ENABLED = +ConfigBuilder("spark.excludeOnFailure.killExcludedExecutors.decommission") + .doc("Attempt decommission of excluded nodes instead of going directly to kill") + .version("3.2.0") + .booleanConf + .createWithDefault(false) + private[spark] val EXCLUDE_ON_FAILURE_LEGACY_TIMEOUT_CONF = ConfigBuilder("spark.scheduler.executorTaskExcludeOnFailureTime") .internal() @@ -1958,7 +1965,8 @@ package object config { private[spark] val EXECUTOR_DECOMMISSION_KILL_INTERVAL = ConfigBuilder("spark.executor.decommission.killInterval") - .doc("Duration after which a decommissioned executor will be killed forcefully." + + .doc("Duration after which a decommissioned executor will be killed forcefully " + +"*by an outside* (e.g. non-spark) service. " + "This config is useful for cloud environments where we know in advance when " + "an executor is going to go down after decommissioning signal i.e. around 2 mins " + "in aws spot nodes, 1/2 hrs in spot block nodes etc. This config is currently " + @@ -1967,6 +1975,15 @@ package object config { .timeConf(TimeUnit.SECONDS) .createOptional + private[spark] val EXECUTOR_DECOMMISSION_FORCE_KILL_TIMEOUT = +ConfigBuilder("spark.executor.decommission.forceKillTimeout") + .doc("Duration after which a Spark will force a decommissioning executor to exit." + +
[spark] branch master updated: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 2b51843 [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks 2b51843 is described below commit 2b51843ca41236f8cec29c406ea35ce1088364cf Author: Holden Karau AuthorDate: Tue Feb 9 10:21:56 2021 -0800 [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks ### What changes were proposed in this pull request? Allow users to configure a maximum amount of shuffle blocks to be stored and reject remote shuffle blocks when this threshold is exceeded. ### Why are the changes needed? In disk constrained environments with large amount of shuffle data, migrations may result in excessive disk pressure on the nodes. On Kube nodes this can result in cascading failures when combined with `emptyDir`. ### Does this PR introduce _any_ user-facing change? Yes, new configuration parameter. ### How was this patch tested? New unit tests. Closes #31493 from holdenk/SPARK-34337-reject-disk-blocks-when-under-disk-pressure. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- .../org/apache/spark/internal/config/package.scala | 11 ++ .../spark/shuffle/IndexShuffleBlockResolver.scala | 20 +- .../apache/spark/shuffle/MigratableResolver.scala | 1 + .../apache/spark/storage/BlockManagerSuite.scala | 46 -- 4 files changed, 65 insertions(+), 13 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala index 1afad30..7aeb51d 100644 --- a/core/src/main/scala/org/apache/spark/internal/config/package.scala +++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala @@ -488,6 +488,17 @@ package object config { .booleanConf .createWithDefault(false) + private[spark] val STORAGE_DECOMMISSION_SHUFFLE_MAX_DISK_SIZE = +ConfigBuilder("spark.storage.decommission.shuffleBlocks.maxDiskSize") + .doc("Maximum disk space to use to store shuffle blocks before rejecting remote " + +"shuffle blocks. Rejecting remote shuffle blocks means that an executor will not receive " + +"any shuffle migrations, and if there are no other executors available for migration " + +"then shuffle blocks will be lost unless " + +s"${STORAGE_DECOMMISSION_FALLBACK_STORAGE_PATH.key} is configured.") + .version("3.2.0") + .bytesConf(ByteUnit.BYTE) + .createOptional + private[spark] val STORAGE_REPLICATION_TOPOLOGY_FILE = ConfigBuilder("spark.storage.replication.topologyFile") .version("2.1.0") diff --git a/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala b/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala index 5f0bb42..d30b73a 100644 --- a/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala +++ b/core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala @@ -22,8 +22,8 @@ import java.nio.ByteBuffer import java.nio.channels.Channels import java.nio.file.Files -import org.apache.spark.{SparkConf, SparkEnv} -import org.apache.spark.internal.Logging +import org.apache.spark.{SparkConf, SparkEnv, SparkException} +import org.apache.spark.internal.{config, Logging} import org.apache.spark.io.NioBufferedFileInputStream import org.apache.spark.network.buffer.{FileSegmentManagedBuffer, ManagedBuffer} import org.apache.spark.network.client.StreamCallbackWithID @@ -56,6 +56,8 @@ private[spark] class IndexShuffleBlockResolver( private val transportConf = SparkTransportConf.fromSparkConf(conf, "shuffle") + private val remoteShuffleMaxDisk: Option[Long] = +conf.get(config.STORAGE_DECOMMISSION_SHUFFLE_MAX_DISK_SIZE) def getDataFile(shuffleId: Int, mapId: Long): File = getDataFile(shuffleId, mapId, None) @@ -72,6 +74,13 @@ private[spark] class IndexShuffleBlockResolver( } } + private def getShuffleBytesStored(): Long = { +val shuffleFiles: Seq[File] = getStoredShuffles().map { + si => getDataFile(si.shuffleId, si.mapId) +} +shuffleFiles.map(_.length()).sum + } + /** * Get the shuffle data file. * @@ -173,6 +182,13 @@ private[spark] class IndexShuffleBlockResolver( */ override def putShuffleBlockAsStream(blockId: BlockId, serializerManager: SerializerManager): StreamCallbackWithID = { +// Throw an exception if we have exceeded maximum shuffle files stored +remoteShuffleMaxDisk.foreach { maxBytes
[spark] branch master updated: [SPARK-34209][SQL] Delegate table name validation to the session catalog
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new cf7a13c [SPARK-34209][SQL] Delegate table name validation to the session catalog cf7a13c is described below commit cf7a13c363ef5d56556c9d70e7811bf6a40de55f Author: Holden Karau AuthorDate: Tue Feb 9 10:15:16 2021 -0800 [SPARK-34209][SQL] Delegate table name validation to the session catalog ### What changes were proposed in this pull request? Delegate table name validation to the session catalog ### Why are the changes needed? Queerying of tables with nested namespaces. ### Does this PR introduce _any_ user-facing change? SQL queries of nested namespace queries ### How was this patch tested? Unit tests updated. Closes #31427 from holdenk/SPARK-34209-delegate-table-name-validation-to-the-catalog. Authored-by: Holden Karau Signed-off-by: Holden Karau --- .../sql/connector/catalog/LookupCatalog.scala | 5 -- .../spark/sql/connector/DataSourceV2SQLSuite.scala | 58 ++ .../spark/sql/execution/command/DDLSuite.scala | 2 +- 3 files changed, 38 insertions(+), 27 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala index 16416fa..af951a0 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala @@ -57,14 +57,9 @@ private[sql] trait LookupCatalog extends Logging { * Extract session catalog and identifier from a multi-part identifier. */ object SessionCatalogAndIdentifier { -import org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper def unapply(parts: Seq[String]): Option[(CatalogPlugin, Identifier)] = parts match { case CatalogAndIdentifier(catalog, ident) if CatalogV2Util.isSessionCatalog(catalog) => -if (ident.namespace.length != 1) { - throw new AnalysisException( -s"The namespace in session catalog must have exactly one name part: ${parts.quoted}") -} Some(catalog, ident) case _ => None } diff --git a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala index 4d41020..7d67e1c 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala @@ -2085,8 +2085,7 @@ class DataSourceV2SQLSuite val e1 = intercept[AnalysisException] { sql("DESCRIBE FUNCTION default.ns1.ns2.fun") } -assert(e1.message.contains( - "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun")) +assert(e1.message.contains("Unsupported function name 'default.ns1.ns2.fun'")) } test("SHOW FUNCTIONS not valid v1 namespace") { @@ -2107,8 +2106,7 @@ class DataSourceV2SQLSuite val e1 = intercept[AnalysisException] { sql("DROP FUNCTION default.ns1.ns2.fun") } -assert(e1.message.contains( - "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun")) +assert(e1.message.contains("Unsupported function name 'default.ns1.ns2.fun'")) } test("CREATE FUNCTION: only support session catalog") { @@ -2120,8 +2118,7 @@ class DataSourceV2SQLSuite val e1 = intercept[AnalysisException] { sql("CREATE FUNCTION default.ns1.ns2.fun as 'f'") } -assert(e1.message.contains( - "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun")) +assert(e1.message.contains("Unsupported function name 'default.ns1.ns2.fun'")) } test("REFRESH FUNCTION: only support session catalog") { @@ -2134,7 +2131,7 @@ class DataSourceV2SQLSuite sql("REFRESH FUNCTION default.ns1.ns2.fun") } assert(e1.message.contains( - "The namespace in session catalog must have exactly one name part: default.ns1.ns2.fun")) + "Unsupported function name 'default.ns1.ns2.fun'")) } test("global temp view should not be masked by v2 catalog") { @@ -2172,7 +2169,7 @@ class DataSourceV2SQLSuite sql(s"CREATE TABLE $globalTempDB.ns1.ns2.tbl (id bigint, data string) USING json") } assert(e.message.contains( - &q
[spark] branch branch-3.1 updated: [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 5d0c84a [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request 5d0c84a is described below commit 5d0c84adaf10e2e946b988b51a95cd8a11e795c6 Author: “attilapiros” AuthorDate: Tue Feb 9 10:06:55 2021 -0800 [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request ### What changes were proposed in this pull request? Fixing identification of timed-out pending pod requests as excess requests to delete when the excess is higher than the newly created timed out requests and there is some non-timed out newly created requests too. ### Why are the changes needed? After https://github.com/apache/spark/pull/29981 only timed out newly created requests and timed out pending requests are taken as excess request. But there is small bug when the excess is higher than the newly created timed out requests and there is some non-timed out newly created requests as well. Because all the newly created requests are counted as excess request when items are chosen from the timed out pod pending requests. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? There is new unit test added: `SPARK-34334: correctly identify timed out pending pod requests as excess`. Closes #31445 from attilapiros/SPARK-34334. Authored-by: “attilapiros” Signed-off-by: Holden Karau (cherry picked from commit b2dc38b6546552cf3fcfdcd466d7d04d9aa3078c) Signed-off-by: Holden Karau --- .../cluster/k8s/ExecutorPodsAllocator.scala| 11 +++--- .../cluster/k8s/ExecutorPodsAllocatorSuite.scala | 45 ++ 2 files changed, 51 insertions(+), 5 deletions(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala index f4cd2d0..eb35de8 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala @@ -223,14 +223,15 @@ private[spark] class ExecutorPodsAllocator( if (knownPodCount > targetNum) { val excess = knownPodCount - targetNum +val newlyCreatedToDelete = newlyCreatedExecutorsForRpId + .filter { case (_, (_, createTime)) => +currentTime - createTime > executorIdleTimeout + }.keys.take(excess).toList val knownPendingToDelete = currentPendingExecutors .filter(x => isExecutorIdleTimedOut(x._2, currentTime)) + .take(excess - newlyCreatedToDelete.size) .map { case (id, _) => id } - .take(excess - newlyCreatedExecutorsForRpId.size) -val toDelete = newlyCreatedExecutorsForRpId - .filter { case (_, (_, createTime)) => -currentTime - createTime > executorIdleTimeout - }.keys.take(excess).toList ++ knownPendingToDelete +val toDelete = newlyCreatedToDelete ++ knownPendingToDelete if (toDelete.nonEmpty) { logInfo(s"Deleting ${toDelete.size} excess pod requests (${toDelete.mkString(",")}).") diff --git a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala index 8401f71..c7e2f1a 100644 --- a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala +++ b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala @@ -216,6 +216,51 @@ class ExecutorPodsAllocatorSuite extends SparkFunSuite with BeforeAndAfter { assert(!podsAllocatorUnderTest.isDeleted("4")) } + test("SPARK-34334: correctly identify timed out pending pod requests as excess") { +when(podOperations + .withField("status.phase", "Pending")) + .thenReturn(podOperations) +when(podOperations + .withLabel(SPARK_APP_ID_LABEL, TEST_SPARK_APP_ID)) + .thenReturn(podOperations) +when(podOperations + .withLabel(SPARK_ROLE_LABEL, SPARK_POD_EXECUTOR_ROLE)) + .thenReturn(podOperations) +when(podOperations + .withLabelIn(meq(SPARK_EXECUTOR_ID_LABEL), any())) + .thenReturn(podOperations) + +val sta
[spark] branch master updated: [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new b2dc38b [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request b2dc38b is described below commit b2dc38b6546552cf3fcfdcd466d7d04d9aa3078c Author: “attilapiros” AuthorDate: Tue Feb 9 10:06:55 2021 -0800 [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request ### What changes were proposed in this pull request? Fixing identification of timed-out pending pod requests as excess requests to delete when the excess is higher than the newly created timed out requests and there is some non-timed out newly created requests too. ### Why are the changes needed? After https://github.com/apache/spark/pull/29981 only timed out newly created requests and timed out pending requests are taken as excess request. But there is small bug when the excess is higher than the newly created timed out requests and there is some non-timed out newly created requests as well. Because all the newly created requests are counted as excess request when items are chosen from the timed out pod pending requests. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? There is new unit test added: `SPARK-34334: correctly identify timed out pending pod requests as excess`. Closes #31445 from attilapiros/SPARK-34334. Authored-by: “attilapiros” Signed-off-by: Holden Karau --- .../cluster/k8s/ExecutorPodsAllocator.scala| 11 +++--- .../cluster/k8s/ExecutorPodsAllocatorSuite.scala | 45 ++ 2 files changed, 51 insertions(+), 5 deletions(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala index f4cd2d0..eb35de8 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala @@ -223,14 +223,15 @@ private[spark] class ExecutorPodsAllocator( if (knownPodCount > targetNum) { val excess = knownPodCount - targetNum +val newlyCreatedToDelete = newlyCreatedExecutorsForRpId + .filter { case (_, (_, createTime)) => +currentTime - createTime > executorIdleTimeout + }.keys.take(excess).toList val knownPendingToDelete = currentPendingExecutors .filter(x => isExecutorIdleTimedOut(x._2, currentTime)) + .take(excess - newlyCreatedToDelete.size) .map { case (id, _) => id } - .take(excess - newlyCreatedExecutorsForRpId.size) -val toDelete = newlyCreatedExecutorsForRpId - .filter { case (_, (_, createTime)) => -currentTime - createTime > executorIdleTimeout - }.keys.take(excess).toList ++ knownPendingToDelete +val toDelete = newlyCreatedToDelete ++ knownPendingToDelete if (toDelete.nonEmpty) { logInfo(s"Deleting ${toDelete.size} excess pod requests (${toDelete.mkString(",")}).") diff --git a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala index d4d8980..eaf5fd8 100644 --- a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala +++ b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala @@ -216,6 +216,51 @@ class ExecutorPodsAllocatorSuite extends SparkFunSuite with BeforeAndAfter { assert(!podsAllocatorUnderTest.isDeleted("4")) } + test("SPARK-34334: correctly identify timed out pending pod requests as excess") { +when(podOperations + .withField("status.phase", "Pending")) + .thenReturn(podOperations) +when(podOperations + .withLabel(SPARK_APP_ID_LABEL, TEST_SPARK_APP_ID)) + .thenReturn(podOperations) +when(podOperations + .withLabel(SPARK_ROLE_LABEL, SPARK_POD_EXECUTOR_ROLE)) + .thenReturn(podOperations) +when(podOperations + .withLabelIn(meq(SPARK_EXECUTOR_ID_LABEL), any())) + .thenReturn(podOperations) + +val startTime = Instant.now.toEpochMilli +waitForExecutorPodsClock.setTime(startTime) + +podsAllocatorUnderTest.setT
[spark] branch branch-3.1 updated: [SPARK-32866][K8S] Fix docker cross-build
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 697bdca [SPARK-32866][K8S] Fix docker cross-build 697bdca is described below commit 697bdca292105a9e6efaa1f44e8aec3478bbb3ae Author: Holden Karau AuthorDate: Thu Jan 28 11:57:42 2021 -0800 [SPARK-32866][K8S] Fix docker cross-build ### What changes were proposed in this pull request? Add `--push` to the docker script for buildx ### Why are the changes needed? Doing a separate docker push with `buildx` images doesn't work. ### Does this PR introduce _any_ user-facing change? Automatically pushes work when cross-building. ### How was this patch tested? cross-built docker containers Closes #31299 from holdenk/SPARK-32866-docker-buildx-update. Authored-by: Holden Karau Signed-off-by: Holden Karau (cherry picked from commit 497f599a37d2250c14a7bd2699bd3ac65bd08a58) Signed-off-by: Holden Karau --- bin/docker-image-tool.sh | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh index 2ec1ab8..83b13b8 100755 --- a/bin/docker-image-tool.sh +++ b/bin/docker-image-tool.sh @@ -181,7 +181,7 @@ function build { error "Failed to build Spark JVM Docker image, please refer to Docker build output for details." fi if [ "${CROSS_BUILD}" != "false" ]; then - (cd $(img_ctx_dir base) && docker buildx build $ARCHS $NOCACHEARG "${BUILD_ARGS[@]}" \ + (cd $(img_ctx_dir base) && docker buildx build $ARCHS $NOCACHEARG "${BUILD_ARGS[@]}" --push \ -t $(image_ref spark) \ -f "$BASEDOCKERFILE" .) fi @@ -194,7 +194,7 @@ function build { error "Failed to build PySpark Docker image, please refer to Docker build output for details." fi if [ "${CROSS_BUILD}" != "false" ]; then -(cd $(img_ctx_dir pyspark) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ +(cd $(img_ctx_dir pyspark) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" --push \ -t $(image_ref spark-py) \ -f "$PYDOCKERFILE" .) fi @@ -208,7 +208,7 @@ function build { error "Failed to build SparkR Docker image, please refer to Docker build output for details." fi if [ "${CROSS_BUILD}" != "false" ]; then - (cd $(img_ctx_dir sparkr) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ + (cd $(img_ctx_dir sparkr) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" --push \ -t $(image_ref spark-r) \ -f "$RDOCKERFILE" .) fi - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-32866][K8S] Fix docker cross-build
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 497f599 [SPARK-32866][K8S] Fix docker cross-build 497f599 is described below commit 497f599a37d2250c14a7bd2699bd3ac65bd08a58 Author: Holden Karau AuthorDate: Thu Jan 28 11:57:42 2021 -0800 [SPARK-32866][K8S] Fix docker cross-build ### What changes were proposed in this pull request? Add `--push` to the docker script for buildx ### Why are the changes needed? Doing a separate docker push with `buildx` images doesn't work. ### Does this PR introduce _any_ user-facing change? Automatically pushes work when cross-building. ### How was this patch tested? cross-built docker containers Closes #31299 from holdenk/SPARK-32866-docker-buildx-update. Authored-by: Holden Karau Signed-off-by: Holden Karau --- bin/docker-image-tool.sh | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh index 2ec1ab8..83b13b8 100755 --- a/bin/docker-image-tool.sh +++ b/bin/docker-image-tool.sh @@ -181,7 +181,7 @@ function build { error "Failed to build Spark JVM Docker image, please refer to Docker build output for details." fi if [ "${CROSS_BUILD}" != "false" ]; then - (cd $(img_ctx_dir base) && docker buildx build $ARCHS $NOCACHEARG "${BUILD_ARGS[@]}" \ + (cd $(img_ctx_dir base) && docker buildx build $ARCHS $NOCACHEARG "${BUILD_ARGS[@]}" --push \ -t $(image_ref spark) \ -f "$BASEDOCKERFILE" .) fi @@ -194,7 +194,7 @@ function build { error "Failed to build PySpark Docker image, please refer to Docker build output for details." fi if [ "${CROSS_BUILD}" != "false" ]; then -(cd $(img_ctx_dir pyspark) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ +(cd $(img_ctx_dir pyspark) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" --push \ -t $(image_ref spark-py) \ -f "$PYDOCKERFILE" .) fi @@ -208,7 +208,7 @@ function build { error "Failed to build SparkR Docker image, please refer to Docker build output for details." fi if [ "${CROSS_BUILD}" != "false" ]; then - (cd $(img_ctx_dir sparkr) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ + (cd $(img_ctx_dir sparkr) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" --push \ -t $(image_ref spark-r) \ -f "$RDOCKERFILE" .) fi - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-33711][K8S] Avoid race condition between POD lifecycle manager and scheduler backend
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 7da0164 [SPARK-33711][K8S] Avoid race condition between POD lifecycle manager and scheduler backend 7da0164 is described below commit 7da0164824dc6d99d48dcddfcbaef987fe09765e Author: “attilapiros” AuthorDate: Mon Jan 11 14:25:12 2021 -0800 [SPARK-33711][K8S] Avoid race condition between POD lifecycle manager and scheduler backend ### What changes were proposed in this pull request? Missing POD detection is extended by timestamp (and time limit) based check to avoid wrongfully detection of missing POD detection. The two new timestamps: - `fullSnapshotTs` is introduced for the `ExecutorPodsSnapshot` which only updated by the pod polling snapshot source - `registrationTs` is introduced for the `ExecutorData` and it is initialized at the executor registration at the scheduler backend Moreover a new config `spark.kubernetes.executor.missingPodDetectDelta` is used to specify the accepted delta between the two. ### Why are the changes needed? Watching a POD (`ExecutorPodsWatchSnapshotSource`) only inform about single POD changes. This could wrongfully lead to detecting of missing PODs (PODs known by scheduler backend but missing from POD snapshots) by the executor POD lifecycle manager. A key indicator of this error is seeing this log message: > "The executor with ID [some_id] was not found in the cluster but we didn't get a reason why. Marking the executor as failed. The executor may have been deleted but the driver missed the deletion event." So one of the problem is running the missing POD detection check even when a single POD is changed without having a full consistent snapshot about all the PODs (see `ExecutorPodsPollingSnapshotSource`). The other problem could be the race between the executor POD lifecycle manager and the scheduler backend: so even in case of a having a full snapshot the registration at the scheduler backend could precede the snapshot polling (and processing of those polled snapshots). ### Does this PR introduce _any_ user-facing change? Yes. When the POD is missing then the reason message explaining the executor's exit is extended with both timestamps (the polling time and the executor registration time) and even the new config is mentioned. ### How was this patch tested? The existing unit tests are extended. Closes #30675 from attilapiros/SPARK-33711. Authored-by: “attilapiros” Signed-off-by: Holden Karau (cherry picked from commit 6bd7a6200f8beaab1c68b2469df05870ea788d49) Signed-off-by: Holden Karau --- .../cluster/CoarseGrainedSchedulerBackend.scala| 6 ++- .../spark/scheduler/cluster/ExecutorData.scala | 4 +- .../scala/org/apache/spark/deploy/k8s/Config.scala | 11 + .../cluster/k8s/ExecutorPodsAllocator.scala| 2 +- .../cluster/k8s/ExecutorPodsLifecycleManager.scala | 50 +++--- .../cluster/k8s/ExecutorPodsSnapshot.scala | 12 +++--- .../k8s/ExecutorPodsSnapshotsStoreImpl.scala | 8 +++- .../DeterministicExecutorPodsSnapshotsStore.scala | 8 ++-- .../k8s/ExecutorPodsLifecycleManagerSuite.scala| 22 +++--- .../cluster/k8s/ExecutorPodsSnapshotSuite.scala| 4 +- .../k8s/ExecutorPodsSnapshotsStoreSuite.scala | 33 -- 11 files changed, 101 insertions(+), 59 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala index 2bd0b4c..ccb5eb1 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala @@ -239,7 +239,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp } val data = new ExecutorData(executorRef, executorAddress, hostname, 0, cores, logUrlHandler.applyPattern(logUrls, attributes), attributes, -resourcesInfo, resourceProfileId) +resourcesInfo, resourceProfileId, registrationTs = System.currentTimeMillis()) // This must be synchronized because variables mutated // in this block are read when requesting executors CoarseGrainedSchedulerBackend.this.synchronized { @@ -629,6 +629,10 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp executorDataMap.keySet.toSeq } + def getExecutorsWithRegistrationTs(): Map[String, Long] = synchronized { +executorDataMap.mapValues(v =&g
[spark] branch master updated: [SPARK-33711][K8S] Avoid race condition between POD lifecycle manager and scheduler backend
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6bd7a62 [SPARK-33711][K8S] Avoid race condition between POD lifecycle manager and scheduler backend 6bd7a62 is described below commit 6bd7a6200f8beaab1c68b2469df05870ea788d49 Author: “attilapiros” AuthorDate: Mon Jan 11 14:25:12 2021 -0800 [SPARK-33711][K8S] Avoid race condition between POD lifecycle manager and scheduler backend ### What changes were proposed in this pull request? Missing POD detection is extended by timestamp (and time limit) based check to avoid wrongfully detection of missing POD detection. The two new timestamps: - `fullSnapshotTs` is introduced for the `ExecutorPodsSnapshot` which only updated by the pod polling snapshot source - `registrationTs` is introduced for the `ExecutorData` and it is initialized at the executor registration at the scheduler backend Moreover a new config `spark.kubernetes.executor.missingPodDetectDelta` is used to specify the accepted delta between the two. ### Why are the changes needed? Watching a POD (`ExecutorPodsWatchSnapshotSource`) only inform about single POD changes. This could wrongfully lead to detecting of missing PODs (PODs known by scheduler backend but missing from POD snapshots) by the executor POD lifecycle manager. A key indicator of this error is seeing this log message: > "The executor with ID [some_id] was not found in the cluster but we didn't get a reason why. Marking the executor as failed. The executor may have been deleted but the driver missed the deletion event." So one of the problem is running the missing POD detection check even when a single POD is changed without having a full consistent snapshot about all the PODs (see `ExecutorPodsPollingSnapshotSource`). The other problem could be the race between the executor POD lifecycle manager and the scheduler backend: so even in case of a having a full snapshot the registration at the scheduler backend could precede the snapshot polling (and processing of those polled snapshots). ### Does this PR introduce _any_ user-facing change? Yes. When the POD is missing then the reason message explaining the executor's exit is extended with both timestamps (the polling time and the executor registration time) and even the new config is mentioned. ### How was this patch tested? The existing unit tests are extended. Closes #30675 from attilapiros/SPARK-33711. Authored-by: “attilapiros” Signed-off-by: Holden Karau --- .../cluster/CoarseGrainedSchedulerBackend.scala| 6 ++- .../spark/scheduler/cluster/ExecutorData.scala | 4 +- .../scala/org/apache/spark/deploy/k8s/Config.scala | 11 + .../cluster/k8s/ExecutorPodsAllocator.scala| 2 +- .../cluster/k8s/ExecutorPodsLifecycleManager.scala | 50 +++--- .../cluster/k8s/ExecutorPodsSnapshot.scala | 12 +++--- .../k8s/ExecutorPodsSnapshotsStoreImpl.scala | 8 +++- .../DeterministicExecutorPodsSnapshotsStore.scala | 8 ++-- .../k8s/ExecutorPodsLifecycleManagerSuite.scala| 22 +++--- .../cluster/k8s/ExecutorPodsSnapshotSuite.scala| 4 +- .../k8s/ExecutorPodsSnapshotsStoreSuite.scala | 33 -- 11 files changed, 101 insertions(+), 59 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala index 2bd0b4c..ccb5eb1 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala @@ -239,7 +239,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp } val data = new ExecutorData(executorRef, executorAddress, hostname, 0, cores, logUrlHandler.applyPattern(logUrls, attributes), attributes, -resourcesInfo, resourceProfileId) +resourcesInfo, resourceProfileId, registrationTs = System.currentTimeMillis()) // This must be synchronized because variables mutated // in this block are read when requesting executors CoarseGrainedSchedulerBackend.this.synchronized { @@ -629,6 +629,10 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp executorDataMap.keySet.toSeq } + def getExecutorsWithRegistrationTs(): Map[String, Long] = synchronized { +executorDataMap.mapValues(v => v.registrationTs).toMap + } + override def isExecutorActive(id: String): Boolean = synchronized { e
[spark] branch master updated: [SPARK-33261][K8S] Add a developer API for custom feature steps
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 5885cc1 [SPARK-33261][K8S] Add a developer API for custom feature steps 5885cc1 is described below commit 5885cc15cae9c9780530e235d2bd4bd6beda5dbb Author: Holden Karau AuthorDate: Mon Dec 14 12:05:28 2020 -0800 [SPARK-33261][K8S] Add a developer API for custom feature steps ### What changes were proposed in this pull request? Add a developer API for custom driver & executor feature steps. ### Why are the changes needed? While we allow templates for the basis of pod creation, some deployments need more flexibility in how the pods are configured. This adds a developer API for custom deployments. ### Does this PR introduce _any_ user-facing change? New developer API. ### How was this patch tested? Extended tests to verify custom step is applied when configured. Closes #30206 from holdenk/SPARK-33261-allow-people-to-extend-pod-feature-steps. Authored-by: Holden Karau Signed-off-by: Holden Karau --- .../scala/org/apache/spark/deploy/k8s/Config.scala | 20 ++ .../org/apache/spark/deploy/k8s/SparkPod.scala | 11 +++- .../k8s/features/KubernetesFeatureConfigStep.scala | 7 +- .../k8s/submit/KubernetesDriverBuilder.scala | 8 ++- .../cluster/k8s/KubernetesExecutorBuilder.scala| 8 ++- .../apache/spark/deploy/k8s/PodBuilderSuite.scala | 76 ++ .../k8s/submit/KubernetesDriverBuilderSuite.scala | 5 +- .../k8s/KubernetesExecutorBuilderSuite.scala | 4 ++ 8 files changed, 134 insertions(+), 5 deletions(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala index c28d6fd..40609ae 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala @@ -219,6 +219,26 @@ private[spark] object Config extends Logging { .stringConf .createOptional + val KUBERNETES_DRIVER_POD_FEATURE_STEPS = +ConfigBuilder("spark.kubernetes.driver.pod.featureSteps") + .doc("Class names of an extra driver pod feature step implementing " + +"KubernetesFeatureConfigStep. This is a developer API. Comma separated. " + +"Runs after all of Spark internal feature steps.") + .version("3.2.0") + .stringConf + .toSequence + .createWithDefault(Nil) + + val KUBERNETES_EXECUTOR_POD_FEATURE_STEPS = +ConfigBuilder("spark.kubernetes.executor.pod.featureSteps") + .doc("Class name of an extra executor pod feature step implementing " + +"KubernetesFeatureConfigStep. This is a developer API. Comma separated. " + +"Runs after all of Spark internal feature steps.") + .version("3.2.0") + .stringConf + .toSequence + .createWithDefault(Nil) + val KUBERNETES_ALLOCATION_BATCH_SIZE = ConfigBuilder("spark.kubernetes.allocation.batch.size") .doc("Number of pods to launch at once in each round of executor allocation.") diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkPod.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkPod.scala index fd11963..c2298e7 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkPod.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkPod.scala @@ -18,7 +18,16 @@ package org.apache.spark.deploy.k8s import io.fabric8.kubernetes.api.model.{Container, ContainerBuilder, Pod, PodBuilder} -private[spark] case class SparkPod(pod: Pod, container: Container) { +import org.apache.spark.annotation.{DeveloperApi, Unstable} + +/** + * :: DeveloperApi :: + * + * Represents a SparkPod consisting of pod and the container within the pod. + */ +@Unstable +@DeveloperApi +case class SparkPod(pod: Pod, container: Container) { /** * Convenience method to apply a series of chained transformations to a pod. diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KubernetesFeatureConfigStep.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KubernetesFeatureConfigStep.scala index 58cdaa3..3fec926 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KubernetesFeatureConfigStep.scala +++ b/resource-managers/kuber
[spark] branch master updated: [SPARK-33716][K8S] Fix potential race condition during pod termination
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 15486fa [SPARK-33716][K8S] Fix potential race condition during pod termination 15486fa is described below commit 15486fa970aa104e285cae0379a110f3795f3eaa Author: Holden Karau AuthorDate: Fri Dec 11 14:43:57 2020 -0800 [SPARK-33716][K8S] Fix potential race condition during pod termination ### What changes were proposed in this pull request? Check that the pod state is not pending or running even if there is a deletion timestamp. ### Why are the changes needed? This can occur when the pod state and deletion timestamp are not updated by etcd in sync & we get a pod snapshot during an inconsistent view. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Manual testing with local version of Minikube on an overloaded computer that caused out of sync updates. Closes #30693 from holdenk/SPARK-33716-decommissioning-race-condition-during-pod-snapshot. Authored-by: Holden Karau Signed-off-by: Holden Karau --- .../org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala index be75311..e81d213 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsSnapshot.scala @@ -93,7 +93,8 @@ object ExecutorPodsSnapshot extends Logging { ( pod.getStatus == null || pod.getStatus.getPhase == null || -pod.getStatus.getPhase.toLowerCase(Locale.ROOT) != "terminating" + (pod.getStatus.getPhase.toLowerCase(Locale.ROOT) != "terminating" && + pod.getStatus.getPhase.toLowerCase(Locale.ROOT) != "running") )) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-32381][CORE][SQL][FOLLOWUP] More cleanup on HadoopFSUtils
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 27cd945 [SPARK-32381][CORE][SQL][FOLLOWUP] More cleanup on HadoopFSUtils 27cd945 is described below commit 27cd945c151dccb5ac863e6bc2c4f5b2c6a6d996 Author: Chao Sun AuthorDate: Wed Nov 18 12:39:00 2020 -0800 [SPARK-32381][CORE][SQL][FOLLOWUP] More cleanup on HadoopFSUtils ### What changes were proposed in this pull request? This PR is a follow-up of #29471 and does the following improvements for `HadoopFSUtils`: 1. Removes the extra `filterFun` from the listing API and combines it with the `filter`. 2. Removes `SerializableBlockLocation` and `SerializableFileStatus` given that `BlockLocation` and `FileStatus` are already serializable. 3. Hides the `isRootLevel` flag from the top-level API. ### Why are the changes needed? Main purpose is to simplify the logic within `HadoopFSUtils` as well as cleanup the API. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing unit tests (e.g., `FileIndexSuite`) Closes #29959 from sunchao/hadoop-fs-utils-followup. Authored-by: Chao Sun Signed-off-by: Holden Karau --- .../org/apache/spark/util/HadoopFSUtils.scala | 104 - .../spark/sql/execution/command/CommandUtils.scala | 2 +- .../execution/datasources/InMemoryFileIndex.scala | 19 ++-- 3 files changed, 31 insertions(+), 94 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala b/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala index c0a135e..a3a528c 100644 --- a/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala +++ b/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala @@ -27,7 +27,6 @@ import org.apache.hadoop.fs.viewfs.ViewFileSystem import org.apache.hadoop.hdfs.DistributedFileSystem import org.apache.spark._ -import org.apache.spark.annotation.Private import org.apache.spark.internal.Logging import org.apache.spark.metrics.source.HiveCatalogMetrics @@ -45,8 +44,6 @@ private[spark] object HadoopFSUtils extends Logging { * @param paths Input paths to list * @param hadoopConf Hadoop configuration * @param filter Path filter used to exclude leaf files from result - * @param isRootLevel Whether the input paths are at the root level, i.e., they are the root - *paths as opposed to nested paths encountered during recursive calls of this. * @param ignoreMissingFiles Ignore missing files that happen during recursive listing * (e.g., due to race conditions) * @param ignoreLocality Whether to fetch data locality info when listing leaf files. If false, @@ -57,11 +54,22 @@ private[spark] object HadoopFSUtils extends Logging { * @param parallelismMax The maximum parallelism for listing. If the number of input paths is * larger than this value, parallelism will be throttled to this value * to avoid generating too many tasks. - * @param filterFun Optional predicate on the leaf files. Files who failed the check will be - * excluded from the results * @return for each input path, the set of discovered files for the path */ def parallelListLeafFiles( +sc: SparkContext, +paths: Seq[Path], +hadoopConf: Configuration, +filter: PathFilter, +ignoreMissingFiles: Boolean, +ignoreLocality: Boolean, +parallelismThreshold: Int, +parallelismMax: Int): Seq[(Path, Seq[FileStatus])] = { +parallelListLeafFilesInternal(sc, paths, hadoopConf, filter, isRootLevel = true, + ignoreMissingFiles, ignoreLocality, parallelismThreshold, parallelismMax) + } + + private def parallelListLeafFilesInternal( sc: SparkContext, paths: Seq[Path], hadoopConf: Configuration, @@ -70,8 +78,7 @@ private[spark] object HadoopFSUtils extends Logging { ignoreMissingFiles: Boolean, ignoreLocality: Boolean, parallelismThreshold: Int, - parallelismMax: Int, - filterFun: Option[String => Boolean] = None): Seq[(Path, Seq[FileStatus])] = { + parallelismMax: Int): Seq[(Path, Seq[FileStatus])] = { // Short-circuits parallel listing when serial listing is likely to be faster. if (paths.size <= parallelismThreshold) { @@ -85,8 +92,7 @@ private[spark] object HadoopFSUtils extends Logging { ignoreLocality = ignoreLocality, isRootPath = isRootLevel, parallelismThreshold = parallelismThreshold, - parallelismMax = parallelismMax, - filterFun = filterFun) + parallelismMax = parallelismMax) (path, leafFiles) } } @@ -
[spark] branch branch-3.0 updated: [SPARK-30821][K8S] Handle executor failure with multiple containers
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new f7c7f4f8 [SPARK-30821][K8S] Handle executor failure with multiple containers f7c7f4f8 is described below commit f7c7f4f8b6fe7ed55c1aed5494311ba23844d019 Author: Shiqi Sun AuthorDate: Sat Oct 24 09:55:57 2020 -0700 [SPARK-30821][K8S] Handle executor failure with multiple containers Handle executor failure with multiple containers Added a spark property spark.kubernetes.executor.checkAllContainers, with default being false. When it's true, the executor snapshot will take all containers in the executor into consideration when deciding whether the executor is in "Running" state, if the pod restart policy is "Never". Also, added the new spark property to the doc. ### What changes were proposed in this pull request? Checking of all containers in the executor pod when reporting executor status, if the `spark.kubernetes.executor.checkAllContainers` property is set to true. ### Why are the changes needed? Currently, a pod remains "running" as long as there is at least one running container. This prevents Spark from noticing when a container has failed in an executor pod with multiple containers. With this change, user can configure the behavior to be different. Namely, if any container in the executor pod has failed, either the executor process or one of its sidecars, the pod is considered to be failed, and it will be rescheduled. ### Does this PR introduce _any_ user-facing change? Yes, new spark property added. User is now able to choose whether to turn on this feature using the `spark.kubernetes.executor.checkAllContainers` property. ### How was this patch tested? Unit test was added and all passed. I tried to run integration test by following the instruction [here](https://spark.apache.org/developer-tools.html) (section "Testing K8S") and also [here](https://github.com/apache/spark/blob/master/resource-managers/kubernetes/integration-tests/README.md), but I wasn't able to run it smoothly as it fails to talk with minikube cluster. Maybe it's because my minikube version is too new (I'm using v1.13.1)...? Since I've been trying it for two days and still can't make it work, I decide [...] Closes #29924 from huskysun/exec-sidecar-failure. Authored-by: Shiqi Sun Signed-off-by: Holden Karau (cherry picked from commit f65952772702f0a8772c93b79f562f35c337f5a5) Signed-off-by: Holden Karau --- docs/running-on-kubernetes.md | 8 .../scala/org/apache/spark/deploy/k8s/Config.scala | 8 .../cluster/k8s/ExecutorPodsSnapshot.scala | 16 ++- .../cluster/k8s/KubernetesClusterManager.scala | 3 ++ .../DeterministicExecutorPodsSnapshotsStore.scala | 2 + .../cluster/k8s/ExecutorLifecycleTestUtils.scala | 32 - .../cluster/k8s/ExecutorPodsSnapshotSuite.scala| 56 +++--- .../k8s/ExecutorPodsSnapshotsStoreSuite.scala | 1 + 8 files changed, 108 insertions(+), 18 deletions(-) diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md index e6a000d..f49f953 100644 --- a/docs/running-on-kubernetes.md +++ b/docs/running-on-kubernetes.md @@ -1158,6 +1158,14 @@ See the [configuration page](configuration.html) for information on Spark config 3.0.0 + spark.kubernetes.executor.checkAllContainers + false + + Specify whether executor pods should be check all containers (including sidecars) or only the executor container when determining the pod status. + + 3.1.0 + + spark.kubernetes.submission.connectionTimeout 1 diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala index 22f4c75..b08875b 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala @@ -409,6 +409,14 @@ private[spark] object Config extends Logging { .stringConf .createOptional + val KUBERNETES_EXECUTOR_CHECK_ALL_CONTAINERS = +ConfigBuilder("spark.kubernetes.executor.checkAllContainers") + .doc("If set to true, all containers in the executor pod will be checked when reporting" + +"executor status.") + .version("3.1.0") + .booleanConf + .createWithDefault(false) + val KUBERNETES_DRIVER_LABEL_PREFIX = "spark.kubernetes.driver.label." val KUBERNE
[spark] branch master updated: [SPARK-30821][K8S] Handle executor failure with multiple containers
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new f659527 [SPARK-30821][K8S] Handle executor failure with multiple containers f659527 is described below commit f65952772702f0a8772c93b79f562f35c337f5a5 Author: Shiqi Sun AuthorDate: Sat Oct 24 09:55:57 2020 -0700 [SPARK-30821][K8S] Handle executor failure with multiple containers Handle executor failure with multiple containers Added a spark property spark.kubernetes.executor.checkAllContainers, with default being false. When it's true, the executor snapshot will take all containers in the executor into consideration when deciding whether the executor is in "Running" state, if the pod restart policy is "Never". Also, added the new spark property to the doc. ### What changes were proposed in this pull request? Checking of all containers in the executor pod when reporting executor status, if the `spark.kubernetes.executor.checkAllContainers` property is set to true. ### Why are the changes needed? Currently, a pod remains "running" as long as there is at least one running container. This prevents Spark from noticing when a container has failed in an executor pod with multiple containers. With this change, user can configure the behavior to be different. Namely, if any container in the executor pod has failed, either the executor process or one of its sidecars, the pod is considered to be failed, and it will be rescheduled. ### Does this PR introduce _any_ user-facing change? Yes, new spark property added. User is now able to choose whether to turn on this feature using the `spark.kubernetes.executor.checkAllContainers` property. ### How was this patch tested? Unit test was added and all passed. I tried to run integration test by following the instruction [here](https://spark.apache.org/developer-tools.html) (section "Testing K8S") and also [here](https://github.com/apache/spark/blob/master/resource-managers/kubernetes/integration-tests/README.md), but I wasn't able to run it smoothly as it fails to talk with minikube cluster. Maybe it's because my minikube version is too new (I'm using v1.13.1)...? Since I've been trying it for two days and still can't make it work, I decide [...] Closes #29924 from huskysun/exec-sidecar-failure. Authored-by: Shiqi Sun Signed-off-by: Holden Karau --- docs/running-on-kubernetes.md | 8 .../scala/org/apache/spark/deploy/k8s/Config.scala | 8 .../cluster/k8s/ExecutorPodsSnapshot.scala | 16 ++- .../cluster/k8s/KubernetesClusterManager.scala | 3 ++ .../DeterministicExecutorPodsSnapshotsStore.scala | 2 + .../cluster/k8s/ExecutorLifecycleTestUtils.scala | 32 - .../cluster/k8s/ExecutorPodsSnapshotSuite.scala| 56 +++--- .../k8s/ExecutorPodsSnapshotsStoreSuite.scala | 1 + 8 files changed, 108 insertions(+), 18 deletions(-) diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md index 3bd1c41..4714e35 100644 --- a/docs/running-on-kubernetes.md +++ b/docs/running-on-kubernetes.md @@ -1182,6 +1182,14 @@ See the [configuration page](configuration.html) for information on Spark config 3.0.0 + spark.kubernetes.executor.checkAllContainers + false + + Specify whether executor pods should be check all containers (including sidecars) or only the executor container when determining the pod status. + + 3.1.0 + + spark.kubernetes.submission.connectionTimeout 1 diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala index 00eaff4..d399f66 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala @@ -417,6 +417,14 @@ private[spark] object Config extends Logging { .stringConf .createOptional + val KUBERNETES_EXECUTOR_CHECK_ALL_CONTAINERS = +ConfigBuilder("spark.kubernetes.executor.checkAllContainers") + .doc("If set to true, all containers in the executor pod will be checked when reporting" + +"executor status.") + .version("3.1.0") + .booleanConf + .createWithDefault(false) + val KUBERNETES_DRIVER_LABEL_PREFIX = "spark.kubernetes.driver.label." val KUBERNETES_DRIVER_ANNOTATION_PREFIX = "spark.kubernetes.driver.annotation." val KUBERNETES_DRIVER_SERVICE_ANNOTATION_PREFIX = "
[spark] branch master updated: [SPARK-32381][CORE][SQL] Move and refactor parallel listing & non-location sensitive listing to core
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 8ccfbc1 [SPARK-32381][CORE][SQL] Move and refactor parallel listing & non-location sensitive listing to core 8ccfbc1 is described below commit 8ccfbc114e3e8d9fc919bf05602e02a506566e31 Author: Chao Sun AuthorDate: Thu Sep 24 10:58:52 2020 -0700 [SPARK-32381][CORE][SQL] Move and refactor parallel listing & non-location sensitive listing to core ### What changes were proposed in this pull request? This moves and refactors the parallel listing utilities from `InMemoryFileIndex` to Spark core so it can be reused by modules beside SQL. Along the process this also did some cleanups/refactorings: - Created a `HadoopFSUtils` class under core - Moved `InMemoryFileIndex.bulkListLeafFiles` into `HadoopFSUtils.parallelListLeafFiles`. It now depends on a `SparkContext` instead of `SparkSession` in SQL. Also added a few parameters which used to be read from `SparkSession.conf`: `ignoreMissingFiles`, `ignoreLocality`, `parallelismThreshold`, `parallelismMax ` and `filterFun` (for additional filtering support but we may be able to merge this with `filter` parameter in future). - Moved `InMemoryFileIndex.listLeafFiles` into `HadoopFSUtils.listLeafFiles` with similar changes above. ### Why are the changes needed? Currently the locality-aware parallel listing mechanism only applies to `InMemoryFileIndex`. By moving this to core, we can potentially reuse the same mechanism for other code paths as well. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Since this is mostly a refactoring, it relies on existing unit tests such as those for `InMemoryFileIndex`. Closes #29471 from sunchao/SPARK-32381. Lead-authored-by: Chao Sun Co-authored-by: Holden Karau Co-authored-by: Chao Sun Signed-off-by: Holden Karau --- .../org/apache/spark/util/HadoopFSUtils.scala | 352 - .../spark/sql/execution/command/CommandUtils.scala | 2 +- .../execution/datasources/InMemoryFileIndex.scala | 297 + 3 files changed, 150 insertions(+), 501 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala b/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala similarity index 51% copy from sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala copy to core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala index a488ed1..c0a135e 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala +++ b/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala @@ -15,7 +15,7 @@ * limitations under the License. */ -package org.apache.spark.sql.execution.datasources +package org.apache.spark.util import java.io.FileNotFoundException @@ -25,175 +25,68 @@ import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs._ import org.apache.hadoop.fs.viewfs.ViewFileSystem import org.apache.hadoop.hdfs.DistributedFileSystem -import org.apache.hadoop.mapred.{FileInputFormat, JobConf} -import org.apache.spark.SparkContext +import org.apache.spark._ +import org.apache.spark.annotation.Private import org.apache.spark.internal.Logging import org.apache.spark.metrics.source.HiveCatalogMetrics -import org.apache.spark.sql.SparkSession -import org.apache.spark.sql.execution.streaming.FileStreamSink -import org.apache.spark.sql.types.StructType -import org.apache.spark.util.SerializableConfiguration - /** - * A [[FileIndex]] that generates the list of files to process by recursively listing all the - * files present in `paths`. - * - * @param rootPathsSpecified the list of root table paths to scan (some of which might be - * filtered out later) - * @param parameters as set of options to control discovery - * @param userSpecifiedSchema an optional user specified schema that will be use to provide - *types for the discovered partitions + * Utility functions to simplify and speed-up file listing. */ -class InMemoryFileIndex( -sparkSession: SparkSession, -rootPathsSpecified: Seq[Path], -parameters: Map[String, String], -userSpecifiedSchema: Option[StructType], -fileStatusCache: FileStatusCache = NoopCache, -userSpecifiedPartitionSpec: Option[PartitionSpec] = None, -override val metadataOpsTimeNs: Option[Long] = None) - extends PartitioningAwareFileIndex( -sparkSession, parameters, userSpecifiedSchema, fileStatusCache) { - - // Filter out streaming metadata dirs or files such as "/.../_spark_m
[spark] branch master updated: [SPARK-32643][CORE][K8S] Consolidate state decommissioning in the TaskSchedulerImpl realm
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new b786f31 [SPARK-32643][CORE][K8S] Consolidate state decommissioning in the TaskSchedulerImpl realm b786f31 is described below commit b786f31a42180523b0baa8113e26b2ddee445498 Author: Devesh Agrawal AuthorDate: Wed Aug 26 15:16:47 2020 -0700 [SPARK-32643][CORE][K8S] Consolidate state decommissioning in the TaskSchedulerImpl realm ### What changes were proposed in this pull request? The decommissioning state is a bit fragment across two places in the TaskSchedulerImpl: https://github.com/apache/spark/pull/29014/ stored the incoming decommission info messages in TaskSchedulerImpl.executorsPendingDecommission. While https://github.com/apache/spark/pull/28619/ was storing just the executor end time in the map TaskSetManager.tidToExecutorKillTimeMapping (which in turn is contained in TaskSchedulerImpl). While the two states are not really overlapping, it's a bit of a code hygiene concern to save this state in two places. With https://github.com/apache/spark/pull/29422, TaskSchedulerImpl is emerging as the place where all decommissioning book keeping is kept within the driver. So consolidate the information in _tidToExecutorKillTimeMapping_ into _executorsPendingDecommission_. However, in order to do so, we need to walk away from keeping the raw ExecutorDecommissionInfo messages and instead keep another class ExecutorDecommissionState. This decoupling will allow the RPC message class ExecutorDecommissionInfo to evolve independently from the book keeping ExecutorDecommissionState. ### Why are the changes needed? This is just a code cleanup. These two features were added independently and its time to consolidate their state for good hygiene. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests. Closes #29452 from agrawaldevesh/consolidate_decom_state. Authored-by: Devesh Agrawal Signed-off-by: Holden Karau --- .../apache/spark/ExecutorAllocationClient.scala| 2 +- .../org/apache/spark/scheduler/DAGScheduler.scala | 2 +- .../spark/scheduler/ExecutorDecommissionInfo.scala | 14 ++- .../org/apache/spark/scheduler/TaskScheduler.scala | 2 +- .../apache/spark/scheduler/TaskSchedulerImpl.scala | 52 ++ .../apache/spark/scheduler/TaskSetManager.scala| 34 +++ .../apache/spark/scheduler/DAGSchedulerSuite.scala | 8 +- .../scheduler/ExternalClusterManagerSuite.scala| 4 +- .../spark/scheduler/TaskSchedulerImplSuite.scala | 112 ++--- .../spark/scheduler/TaskSetManagerSuite.scala | 33 +++--- 10 files changed, 167 insertions(+), 96 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala b/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala index 079340a..ce47f3f 100644 --- a/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala +++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala @@ -88,7 +88,7 @@ private[spark] trait ExecutorAllocationClient { * Default implementation delegates to kill, scheduler must override * if it supports graceful decommissioning. * - * @param executorsAndDecominfo identifiers of executors & decom info. + * @param executorsAndDecomInfo identifiers of executors & decom info. * @param adjustTargetNumExecutors whether the target number of executors will be adjusted down * after these executors have been decommissioned. * @return the ids of the executors acknowledged by the cluster manager to be removed. diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala index ae0387e..18cd241 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala @@ -1825,7 +1825,7 @@ private[spark] class DAGScheduler( if (bmAddress != null) { val externalShuffleServiceEnabled = env.blockManager.externalShuffleServiceEnabled val isHostDecommissioned = taskScheduler - .getExecutorDecommissionInfo(bmAddress.executorId) + .getExecutorDecommissionState(bmAddress.executorId) .exists(_.isHostDecommissioned) // Shuffle output of all executors on host `bmAddress.host` may be lost if: diff --git a/core/src/main/scala/org/apache/spark/scheduler/ExecutorDecommissionInfo.scala b/core/src/main/scala/org/apache/spark/scheduler/ExecutorDecommissionInfo.scala index a82b5d3..48ae879 100644 --- a/core/s
[spark-website] branch asf-site updated: Update committer guide
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git The following commit(s) were added to refs/heads/asf-site by this push: new a3f618b Update committer guide a3f618b is described below commit a3f618bec99e5b4132c6586f12c1313b34cf5b13 Author: Holden Karau AuthorDate: Mon Aug 24 10:59:16 2020 -0700 Update committer guide Update the committer guide with the new policy we voted on for when to commit. Author: Holden Karau Closes #284 from holdenk/update-committer-guide. --- committers.md| 21 + site/committers.html | 42 +- 2 files changed, 50 insertions(+), 13 deletions(-) diff --git a/committers.md b/committers.md index 2665e0a..fc6958c 100644 --- a/committers.md +++ b/committers.md @@ -132,6 +132,27 @@ In particular, if you are working on an area of the codebase you are unfamiliar Git history for that code to see who reviewed patches before. You can do this using `git log --format=full `, by examining the "Commit" field to see who committed each patch. +When to commit/merge a pull request + +PRs shall not be merged during active, on-topic discussion unless they address issues such as critical security fixes of a public vulnerability. Under extenuating circumstances, PRs may be merged during active, off-topic discussion and the discussion directed to a more appropriate venue. Time should be given prior to merging for those involved with the conversation to explain if they believe they are on-topic. + +Lazy consensus requires giving time for discussion to settle while understanding that people may not be working on Spark as their full-time job and may take holidays. It is believed that by doing this, we can limit how often people feel the need to exercise their veto. + + +All -1s with justification merit discussion. A -1 from a non-committer can be overridden only with input from multiple committers, and suitable time must be offered for any committer to raise concerns. A -1 from a committer who cannot be reached requires a consensus vote of the PMC under ASF voting rules to determine the next steps within the [ASF guidelines for code vetoes](https://www.apache.org/foundation/voting.html). + + +These policies serve to reiterate the core principle that code must not be merged with a pending veto or before a consensus has been reached (lazy or otherwise). + + +It is the PMC’s hope that vetoes continue to be infrequent, and when they occur, that all parties will take the time to build consensus prior to additional feature work. + + +Being a committer means exercising your judgement while working in a community of people with diverse views. There is nothing wrong in getting a second (or third or fourth) opinion when you are uncertain. Thank you for your dedication to the Spark project; it is appreciated by the developers and users of Spark. + + +It is hoped that these guidelines do not slow down development; rather, by removing some of the uncertainty, the goal is to make it easier for us to reach consensus. If you have ideas on how to improve these guidelines or other Spark project operating procedures, you should reach out on the dev@ list to start the discussion. + How to Merge a Pull Request Changes pushed to the master branch on Apache cannot be removed; that is, we can't force-push to diff --git a/site/committers.html b/site/committers.html index 91bc57b..ff09913 100644 --- a/site/committers.html +++ b/site/committers.html @@ -565,7 +565,23 @@ who have shown they understand and can help with these activities. Contributing to Spark. In particular, if you are working on an area of the codebase you are unfamiliar with, look at the Git history for that code to see who reviewed patches before. You can do this using -git log --format=full <filename>, by examining the “Commit” field to see who committed each patch. +git log --format=full <filename>, by examining the “Commit” field to see who committed each patch. + +When to commit/merge a pull request + +PRs shall not be merged during active, on-topic discussion unless they address issues such as critical security fixes of a public vulnerability. Under extenuating circumstances, PRs may be merged during active, off-topic discussion and the discussion directed to a more appropriate venue. Time should be given prior to merging for those involved with the conversation to explain if they believe they are on-topic. + +Lazy consensus requires giving time for discussion to settle while understanding that people may not be working on Spark as their full-time job and may take holidays. It is believed that by doing this, we can limit how often people feel the need to exercise their veto. + +All -1s with justification merit discussion. A -1 from a non-c
[spark] branch master updated: [SPARK-31198][CORE] Use graceful decommissioning as part of dynamic scaling
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 548ac7c [SPARK-31198][CORE] Use graceful decommissioning as part of dynamic scaling 548ac7c is described below commit 548ac7c4af2270a6bdbf7a6f29f4846eecdc0171 Author: Holden Karau AuthorDate: Wed Aug 12 17:07:18 2020 -0700 [SPARK-31198][CORE] Use graceful decommissioning as part of dynamic scaling ### What changes were proposed in this pull request? If graceful decommissioning is enabled, Spark's dynamic scaling uses this instead of directly killing executors. ### Why are the changes needed? When scaling down Spark we should avoid triggering recomputes as much as possible. ### Does this PR introduce _any_ user-facing change? Hopefully their jobs run faster or at the same speed. It also enables experimental shuffle service free dynamic scaling when graceful decommissioning is enabled (using the same code as the shuffle tracking dynamic scaling). ### How was this patch tested? For now I've extended the ExecutorAllocationManagerSuite for both core & streaming. Closes #29367 from holdenk/SPARK-31198-use-graceful-decommissioning-as-part-of-dynamic-scaling. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- .../apache/spark/ExecutorAllocationClient.scala| 40 + .../apache/spark/ExecutorAllocationManager.scala | 30 +++- .../cluster/CoarseGrainedSchedulerBackend.scala| 190 - .../cluster/StandaloneSchedulerBackend.scala | 3 +- .../spark/scheduler/dynalloc/ExecutorMonitor.scala | 61 ++- .../org/apache/spark/storage/BlockManager.scala| 2 +- .../apache/spark/storage/BlockManagerMaster.scala | 6 +- .../spark/ExecutorAllocationManagerSuite.scala | 71 +++- .../WorkerDecommissionExtendedSuite.scala | 3 +- .../spark/scheduler/WorkerDecommissionSuite.scala | 4 +- .../BlockManagerDecommissionIntegrationSuite.scala | 7 +- project/SparkBuild.scala | 2 + .../docker/src/main/dockerfiles/spark/decom.sh | 2 +- .../k8s/integrationtest/KubernetesSuite.scala | 27 ++- .../integration-tests/tests/decommissioning.py | 5 - .../scheduler/ExecutorAllocationManager.scala | 10 +- .../scheduler/ExecutorAllocationManagerSuite.scala | 51 -- 17 files changed, 380 insertions(+), 134 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala b/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala index 00bd006..079340a 100644 --- a/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala +++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala @@ -17,6 +17,8 @@ package org.apache.spark +import org.apache.spark.scheduler.ExecutorDecommissionInfo + /** * A client that communicates with the cluster manager to request or kill executors. * This is currently supported only in YARN mode. @@ -82,6 +84,44 @@ private[spark] trait ExecutorAllocationClient { force: Boolean = false): Seq[String] /** + * Request that the cluster manager decommission the specified executors. + * Default implementation delegates to kill, scheduler must override + * if it supports graceful decommissioning. + * + * @param executorsAndDecominfo identifiers of executors & decom info. + * @param adjustTargetNumExecutors whether the target number of executors will be adjusted down + * after these executors have been decommissioned. + * @return the ids of the executors acknowledged by the cluster manager to be removed. + */ + def decommissionExecutors( +executorsAndDecomInfo: Array[(String, ExecutorDecommissionInfo)], +adjustTargetNumExecutors: Boolean): Seq[String] = { +killExecutors(executorsAndDecomInfo.map(_._1), + adjustTargetNumExecutors, + countFailures = false) + } + + + /** + * Request that the cluster manager decommission the specified executor. + * Delegates to decommissionExecutors. + * + * @param executorId identifiers of executor to decommission + * @param decommissionInfo information about the decommission (reason, host loss) + * @param adjustTargetNumExecutors if we should adjust the target number of executors. + * @return whether the request is acknowledged by the cluster manager. + */ + final def decommissionExecutor(executorId: String, + decommissionInfo: ExecutorDecommissionInfo, + adjustTargetNumExecutors: Boolean): Boolean = { +val decommissionedExecutors = decommissionExecutors( + Array((executorId, decommissionInfo)), + adjustTargetNumExecutors = adjust
[spark] branch master updated: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 375d348 [SPARK-31197][CORE] Shutdown executor once we are done decommissioning 375d348 is described below commit 375d348a83e6ffa38dfaece5047633f67aee1da5 Author: Holden Karau AuthorDate: Wed Aug 5 16:28:14 2020 -0700 [SPARK-31197][CORE] Shutdown executor once we are done decommissioning ### What changes were proposed in this pull request? Exit the executor when it has been asked to decommission and there is nothing left for it to do. This is a rebase of https://github.com/apache/spark/pull/28817 ### Why are the changes needed? If we want to use decommissioning in Spark's own scale down we should terminate the executor once finished. Furthermore, in graceful shutdown it makes sense to release resources we no longer need if we've been asked to shutdown by the cluster manager instead of always holding the resources as long as possible. ### Does this PR introduce _any_ user-facing change? The decommissioned executors will exit and the end of decommissioning. This is sort of a user facing change, however decommissioning hasn't been in any releases yet. ### How was this patch tested? I changed the unit test to not send the executor exit message and still wait on the executor exited message. Closes #29211 from holdenk/SPARK-31197-exit-execs-redone. Authored-by: Holden Karau Signed-off-by: Holden Karau --- .../org/apache/spark/deploy/DeployMessage.scala| 2 - .../org/apache/spark/deploy/worker/Worker.scala| 2 +- .../executor/CoarseGrainedExecutorBackend.scala| 58 - .../cluster/CoarseGrainedClusterMessage.scala | 3 + .../cluster/CoarseGrainedSchedulerBackend.scala| 10 ++ .../org/apache/spark/storage/BlockManager.scala| 8 ++ .../spark/storage/BlockManagerDecommissioner.scala | 96 +++--- .../spark/scheduler/WorkerDecommissionSuite.scala | 19 ++- .../BlockManagerDecommissionIntegrationSuite.scala | 17 ++- .../BlockManagerDecommissionUnitSuite.scala| 139 - 10 files changed, 310 insertions(+), 44 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala b/core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala index c8c6e5a..b7a64d75 100644 --- a/core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala +++ b/core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala @@ -165,8 +165,6 @@ private[deploy] object DeployMessages { case object ReregisterWithMaster // used when a worker attempts to reconnect to a master - case object DecommissionSelf // Mark as decommissioned. May be Master to Worker in the future. - // AppClient to Master case class RegisterApplication(appDescription: ApplicationDescription, driver: RpcEndpointRef) diff --git a/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala b/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala index aa8c46f..862e685 100755 --- a/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala +++ b/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala @@ -668,7 +668,7 @@ private[deploy] class Worker( finishedApps += id maybeCleanupApplication(id) -case DecommissionSelf => +case WorkerDecommission(_, _) => decommissionSelf() } diff --git a/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala b/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala index def125b..55fb76b 100644 --- a/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala +++ b/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala @@ -64,7 +64,6 @@ private[spark] class CoarseGrainedExecutorBackend( private[this] val stopping = new AtomicBoolean(false) var executor: Executor = null - @volatile private var decommissioned = false @volatile var driver: Option[RpcEndpointRef] = None // If this CoarseGrainedExecutorBackend is changed to support multiple threads, then this may need @@ -80,6 +79,8 @@ private[spark] class CoarseGrainedExecutorBackend( */ private[executor] val taskResources = new mutable.HashMap[Long, Map[String, ResourceInformation]] + @volatile private var decommissioned = false + override def onStart(): Unit = { logInfo("Registering PWR handler.") SignalUtils.register("PWR", "Failed to register SIGPWR handler - " + @@ -214,6 +215,10 @@ private[spark] class CoarseGrainedExecutorBackend( case UpdateDelegationTokens(tokenBytes) => logInfo(s"Received tokens of ${tokenBytes.len
[spark] branch master updated: [SPARK-32417] Fix flakyness of BlockManagerDecommissionIntegrationSuite
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6032c5b [SPARK-32417] Fix flakyness of BlockManagerDecommissionIntegrationSuite 6032c5b is described below commit 6032c5b0320fe70455586f4ce863d5d9361b5e07 Author: Devesh Agrawal AuthorDate: Thu Jul 30 12:00:19 2020 -0700 [SPARK-32417] Fix flakyness of BlockManagerDecommissionIntegrationSuite ### What changes were proposed in this pull request? This test tries to fix the flakyness of BlockManagerDecommissionIntegrationSuite. ### Description of the problem Make the block manager decommissioning test be less flaky An interesting failure happens when migrateDuring = true (and persist or shuffle is true): - We schedule the job with tasks on executors 0, 1, 2. - We wait 300 ms and decommission executor 0. - If the task is not yet done on executor 0, it will now fail because the block manager won't be able to save the block. This condition is easy to trigger on a loaded machine where the github checks run. - The task with retry on a different executor (1 or 2) and its shuffle blocks will land there. - No actual block migration happens here because the decommissioned executor technically failed before it could even produce a block. To remove the above race, this change replaces the fixed wait for 300 ms to wait for an actual task to succeed. When a task has succeeded, we know its blocks would have been written for sure and thus its executor would certainly be forced to migrate those blocks when it is decommissioned. The change always decommissions an executor on which a real task finished successfully instead of picking the first executor. Because the system may choose to schedule nothing on the first executor and instead run the two tasks on one executor. ### Why are the changes needed? I have had bad luck with BlockManagerDecommissionIntegrationSuite and it has failed several times on my PRs. So fixing it. ### Does this PR introduce _any_ user-facing change? No, unit test only change. ### How was this patch tested? Github checks. Ran this test 100 times, 10 at a time in parallel in a script. Closes #29226 from agrawaldevesh/block-manager-decom-flaky. Authored-by: Devesh Agrawal Signed-off-by: Holden Karau --- .../BlockManagerDecommissionIntegrationSuite.scala | 147 +++-- 1 file changed, 103 insertions(+), 44 deletions(-) diff --git a/core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala b/core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala index 5741010..6a52f72 100644 --- a/core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala +++ b/core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala @@ -17,8 +17,9 @@ package org.apache.spark.storage -import java.util.concurrent.Semaphore +import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue, Semaphore} +import scala.collection.JavaConverters._ import scala.collection.mutable.ArrayBuffer import scala.concurrent.duration._ @@ -28,29 +29,40 @@ import org.apache.spark._ import org.apache.spark.internal.config import org.apache.spark.scheduler._ import org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend -import org.apache.spark.util.{ResetSystemProperties, ThreadUtils} +import org.apache.spark.util.{ResetSystemProperties, SystemClock, ThreadUtils} class BlockManagerDecommissionIntegrationSuite extends SparkFunSuite with LocalSparkContext with ResetSystemProperties with Eventually { val numExecs = 3 val numParts = 3 + val TaskStarted = "TASK_STARTED" + val TaskEnded = "TASK_ENDED" + val JobEnded = "JOB_ENDED" test(s"verify that an already running task which is going to cache data succeeds " + -s"on a decommissioned executor") { -runDecomTest(true, false, true) +s"on a decommissioned executor after task start") { +runDecomTest(true, false, TaskStarted) } - test(s"verify that shuffle blocks are migrated") { -runDecomTest(false, true, false) + test(s"verify that an already running task which is going to cache data succeeds " + +s"on a decommissioned executor after one task ends but before job ends") { +runDecomTest(true, false, TaskEnded) } - test(s"verify that both migrations can work at the same time.") { -runDecomTest(true, true, false) + test(s"verify that shuffle blocks are migrated") { +runDecomTest(false, true, JobEnded)
[spark] branch master updated: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 366a178 [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning 366a178 is described below commit 366a1789333bac97643159857a206bcd773761a4 Author: Devesh Agrawal AuthorDate: Thu Jul 30 11:58:11 2020 -0700 [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning ### What changes were proposed in this pull request? This PR reduces the prospect of a job loss during decommissioning. It fixes two holes in the current decommissioning framework: - (a) Loss of decommissioned executors is not treated as a job failure: We know that the decommissioned executor would be dying soon, so its death is clearly not caused by the application. - (b) Shuffle files on the decommissioned host are cleared when the first fetch failure is detected from a decommissioned host: This is a bit tricky in terms of when to clear the shuffle state ? Ideally you want to clear it the millisecond before the shuffle service on the node dies (or the executor dies when there is no external shuffle service) -- too soon and it could lead to some wastage and too late would lead to fetch failures. The approach here is to do this clearing when the very first fetch failure is observed on the decommissioned block manager, without waiting for other blocks to also signal a failure. ### Why are the changes needed? Without them decommissioning a lot of executors at a time leads to job failures. ### Code overview The task scheduler tracks the executors that were decommissioned along with their `ExecutorDecommissionInfo`. This information is used by: (a) For handling a `ExecutorProcessLost` error, or (b) by the `DAGScheduler` when handling a fetch failure. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added a new unit test `DecommissionWorkerSuite` to test the new behavior by exercising the Master-Worker decommissioning. I chose to add a new test since the setup logic was quite different from the existing `WorkerDecommissionSuite`. I am open to changing the name of the newly added test suite :-) ### Questions for reviewers - Should I add a feature flag to guard these two behaviors ? They seem safe to me that they should only get triggered by decommissioning, but you never know :-). Closes #29014 from agrawaldevesh/decom_harden. Authored-by: Devesh Agrawal Signed-off-by: Holden Karau --- .../org/apache/spark/scheduler/DAGScheduler.scala | 19 +- .../spark/scheduler/ExecutorLossReason.scala | 7 +- .../org/apache/spark/scheduler/TaskScheduler.scala | 5 + .../apache/spark/scheduler/TaskSchedulerImpl.scala | 37 +- .../apache/spark/scheduler/TaskSetManager.scala| 1 + .../spark/deploy/DecommissionWorkerSuite.scala | 424 + .../apache/spark/scheduler/DAGSchedulerSuite.scala | 4 + .../scheduler/ExternalClusterManagerSuite.scala| 2 + .../spark/scheduler/TaskSchedulerImplSuite.scala | 47 +++ 9 files changed, 539 insertions(+), 7 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala index 2503ae0..6b376cd 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala @@ -1821,10 +1821,19 @@ private[spark] class DAGScheduler( // TODO: mark the executor as failed only if there were lots of fetch failures on it if (bmAddress != null) { -val hostToUnregisterOutputs = if (env.blockManager.externalShuffleServiceEnabled && - unRegisterOutputOnHostOnFetchFailure) { - // We had a fetch failure with the external shuffle service, so we - // assume all shuffle data on the node is bad. +val externalShuffleServiceEnabled = env.blockManager.externalShuffleServiceEnabled +val isHostDecommissioned = taskScheduler + .getExecutorDecommissionInfo(bmAddress.executorId) + .exists(_.isHostDecommissioned) + +// Shuffle output of all executors on host `bmAddress.host` may be lost if: +// - External shuffle service is enabled, so we assume that all shuffle data on node is +// bad. +// - Host is decommissioned, thus all executors on that host will die. +val shuffleOutputOfEntireHostLost = externalShuffleServiceEnabled || + isHostDecommissioned +val hostToUnregisterOutp
[spark] branch master updated: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new f8d29d3 [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor f8d29d3 is described below commit f8d29d371cdf0b8be6a48a9124ffbc3c0794f32a Author: Devesh Agrawal AuthorDate: Wed Jul 22 21:04:06 2020 -0700 [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor ### What changes were proposed in this pull request? This PR is a giant plumbing PR that plumbs an `ExecutorDecommissionInfo` along with the DecommissionExecutor message. ### Why are the changes needed? The primary motivation is to know whether a decommissioned executor would also be loosing shuffle files -- and thus it is important to know whether the host would also be decommissioned. In the absence of this PR, the existing code assumes that decommissioning an executor does not loose the whole host with it, and thus does not clear the shuffle state if external shuffle service is enabled. While this may hold in some cases (like K8s decommissioning an executor pod, or YARN container preemption), it does not hold in others like when the cluster is managed by a Standalone Scheduler (Master). This is similar to the existing `workerLost` field in the `ExecutorProcessLost [...] In the future, this `ExecutorDecommissionInfo` can be embellished for knowing how long the executor has to live for scenarios like Cloud spot kills (or Yarn preemption) and the like. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Tweaked an existing unit test in `AppClientSuite` Closes #29032 from agrawaldevesh/plumb_decom_info. Authored-by: Devesh Agrawal Signed-off-by: Holden Karau --- .../spark/deploy/client/StandaloneAppClient.scala | 4 +++- .../client/StandaloneAppClientListener.scala | 4 +++- .../org/apache/spark/deploy/master/Master.scala| 5 +++- .../executor/CoarseGrainedExecutorBackend.scala| 16 + .../spark/scheduler/ExecutorDecommissionInfo.scala | 28 ++ .../org/apache/spark/scheduler/TaskScheduler.scala | 2 +- .../apache/spark/scheduler/TaskSchedulerImpl.scala | 3 ++- .../cluster/CoarseGrainedClusterMessage.scala | 4 +++- .../cluster/CoarseGrainedSchedulerBackend.scala| 22 + .../cluster/StandaloneSchedulerBackend.scala | 6 ++--- .../spark/deploy/client/AppClientSuite.scala | 15 .../apache/spark/scheduler/DAGSchedulerSuite.scala | 8 +-- .../scheduler/ExternalClusterManagerSuite.scala| 4 +++- .../WorkerDecommissionExtendedSuite.scala | 2 +- .../spark/scheduler/WorkerDecommissionSuite.scala | 2 +- .../BlockManagerDecommissionIntegrationSuite.scala | 2 +- 16 files changed, 92 insertions(+), 35 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala b/core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala index eedf5e9..a6da839 100644 --- a/core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala +++ b/core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala @@ -31,6 +31,7 @@ import org.apache.spark.deploy.DeployMessages._ import org.apache.spark.deploy.master.Master import org.apache.spark.internal.Logging import org.apache.spark.rpc._ +import org.apache.spark.scheduler.ExecutorDecommissionInfo import org.apache.spark.util.{RpcUtils, ThreadUtils} /** @@ -181,7 +182,8 @@ private[spark] class StandaloneAppClient( if (ExecutorState.isFinished(state)) { listener.executorRemoved(fullId, message.getOrElse(""), exitStatus, workerLost) } else if (state == ExecutorState.DECOMMISSIONED) { - listener.executorDecommissioned(fullId, message.getOrElse("")) + listener.executorDecommissioned(fullId, +ExecutorDecommissionInfo(message.getOrElse(""), isHostDecommissioned = workerLost)) } case WorkerRemoved(id, host, message) => diff --git a/core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClientListener.scala b/core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClientListener.scala index 2e38a68..e72f7e9 100644 --- a/core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClientListener.scala +++ b/core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClientListener.scala @@ -17,6 +17,8 @@ package org.apache.spark.deploy.client +import org.apache.spark.scheduler.ExecutorDecommissionInfo + /** * Callbacks invoked by deploy client when various events happen. There are currently f
[spark] branch master updated: [SPARK-24266][K8S] Restart the watcher when we receive a version changed from k8s
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 0432379 [SPARK-24266][K8S] Restart the watcher when we receive a version changed from k8s 0432379 is described below commit 0432379f9923768a767566e9ac5a4021cfe8d052 Author: Stijn De Haes AuthorDate: Tue Jul 21 16:34:30 2020 -0700 [SPARK-24266][K8S] Restart the watcher when we receive a version changed from k8s ### What changes were proposed in this pull request? Restart the watcher when it failed with a HTTP_GONE code from the kubernetes api. Which means a resource version has changed. For more relevant information see here: https://github.com/fabric8io/kubernetes-client/issues/1075 ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Running spark-submit to a k8s cluster. Not sure how to make an automated test for this. If someone can help me out that would be great. Closes #28423 from stijndehaes/bugfix/k8s-submit-resource-version-change. Authored-by: Stijn De Haes Signed-off-by: Holden Karau --- .../k8s/submit/KubernetesClientApplication.scala | 52 ++ .../k8s/submit/LoggingPodStatusWatcher.scala | 35 +++ .../spark/deploy/k8s/submit/ClientSuite.scala | 1 + 3 files changed, 61 insertions(+), 27 deletions(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala index b4155fe..713d35d 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala @@ -21,9 +21,11 @@ import java.util.{Collections, UUID} import java.util.Properties import io.fabric8.kubernetes.api.model._ -import io.fabric8.kubernetes.client.KubernetesClient +import io.fabric8.kubernetes.client.{KubernetesClient, Watch} +import io.fabric8.kubernetes.client.Watcher.Action import scala.collection.mutable import scala.util.control.NonFatal +import util.control.Breaks._ import org.apache.spark.SparkConf import org.apache.spark.deploy.SparkApplication @@ -127,25 +129,37 @@ private[spark] class Client( .endSpec() .build() val driverPodName = resolvedDriverPod.getMetadata.getName -Utils.tryWithResource( - kubernetesClient -.pods() -.withName(driverPodName) -.watch(watcher)) { _ => - val createdDriverPod = kubernetesClient.pods().create(resolvedDriverPod) - try { -val otherKubernetesResources = - resolvedDriverSpec.driverKubernetesResources ++ Seq(configMap) -addDriverOwnerReference(createdDriverPod, otherKubernetesResources) -kubernetesClient.resourceList(otherKubernetesResources: _*).createOrReplace() - } catch { -case NonFatal(e) => - kubernetesClient.pods().delete(createdDriverPod) - throw e - } - val sId = Seq(conf.namespace, driverPodName).mkString(":") - watcher.watchOrStop(sId) +var watch: Watch = null +val createdDriverPod = kubernetesClient.pods().create(resolvedDriverPod) +try { + val otherKubernetesResources = resolvedDriverSpec.driverKubernetesResources ++ Seq(configMap) + addDriverOwnerReference(createdDriverPod, otherKubernetesResources) + kubernetesClient.resourceList(otherKubernetesResources: _*).createOrReplace() +} catch { + case NonFatal(e) => +kubernetesClient.pods().delete(createdDriverPod) +throw e +} +val sId = Seq(conf.namespace, driverPodName).mkString(":") +breakable { + while (true) { +val podWithName = kubernetesClient + .pods() + .withName(driverPodName) +// Reset resource to old before we start the watch, this is important for race conditions +watcher.reset() +watch = podWithName.watch(watcher) + +// Send the latest pod state we know to the watcher to make sure we didn't miss anything +watcher.eventReceived(Action.MODIFIED, podWithName.get()) + +// Break the while loop if the pod is completed or we don't want to wait +if(watcher.watchOrStop(sId)) { + watch.close() + break +} + } } } diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/LoggingPodStatusWatcher.scala b/resource-managers/kubernetes/core/src/main/s
[spark] branch master updated: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a4ca355 [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown a4ca355 is described below commit a4ca355af8556e8c5948e492ef70ef0b48416dc4 Author: Holden Karau AuthorDate: Sun Jul 19 21:33:13 2020 -0700 [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown ### What is changed? This pull request adds the ability to migrate shuffle files during Spark's decommissioning. The design document associated with this change is at https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE . To allow this change the `MapOutputTracker` has been extended to allow the location of shuffle files to be updated with `updateMapOutput`. When a shuffle block is put, a block update message will be sent which triggers the `updateMapOutput`. Instead of rejecting remote puts of shuffle blocks `BlockManager` delegates the storage of shuffle blocks to it's shufflemanager's resolver (if supported). A new, experimental, trait is added for shuffle resolvers to indicate they handle remote putting of blocks. The existing block migration code is moved out into a separate file, and a producer/consumer model is introduced for migrating shuffle files from the host as quickly as possible while not overwhelming other executors. ### Why are the changes needed? Recomputting shuffle blocks can be expensive, we should take advantage of our decommissioning time to migrate these blocks. ### Does this PR introduce any user-facing change? This PR introduces two new configs parameters, `spark.storage.decommission.shuffleBlocks.enabled` & `spark.storage.decommission.rddBlocks.enabled` that control which blocks should be migrated during storage decommissioning. ### How was this patch tested? New unit test & expansion of the Spark on K8s decom test to assert that decommisioning with shuffle block migration means that the results are not recomputed even when the original executor is terminated. This PR is a cleaned-up version of the previous WIP PR I made https://github.com/apache/spark/pull/28331 (thanks to attilapiros for his very helpful reviewing on it :)). Closes #28708 from holdenk/SPARK-20629-copy-shuffle-data-when-nodes-are-being-shutdown-cleaned-up. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Co-authored-by: “attilapiros” Co-authored-by: Attila Zsolt Piros Signed-off-by: Holden Karau --- .../scala/org/apache/spark/MapOutputTracker.scala | 38 ++- .../src/main/scala/org/apache/spark/SparkEnv.scala | 3 +- .../org/apache/spark/internal/config/package.scala | 23 ++ .../network/netty/NettyBlockTransferService.scala | 5 +- .../org/apache/spark/scheduler/MapStatus.scala | 15 +- .../cluster/StandaloneSchedulerBackend.scala | 2 +- .../spark/shuffle/IndexShuffleBlockResolver.scala | 99 ++- .../apache/spark/shuffle/MigratableResolver.scala | 48 +++ .../apache/spark/shuffle/ShuffleBlockInfo.scala| 28 ++ .../scala/org/apache/spark/storage/BlockId.scala | 5 +- .../org/apache/spark/storage/BlockManager.scala| 154 +++--- .../spark/storage/BlockManagerDecommissioner.scala | 330 + .../spark/storage/BlockManagerMasterEndpoint.scala | 26 +- ...avedOnDecommissionedBlockManagerException.scala | 21 ++ .../spark/scheduler/WorkerDecommissionSuite.scala | 2 +- .../sort/IndexShuffleBlockResolverSuite.scala | 3 +- .../org/apache/spark/storage/BlockIdSuite.scala| 4 +- .../BlockManagerDecommissionIntegrationSuite.scala | 229 ++ .../storage/BlockManagerDecommissionSuite.scala| 106 --- .../BlockManagerDecommissionUnitSuite.scala| 92 ++ .../storage/BlockManagerReplicationSuite.scala | 2 +- .../apache/spark/storage/BlockManagerSuite.scala | 93 +- .../k8s/integrationtest/DecommissionSuite.scala| 13 +- .../k8s/integrationtest/KubernetesSuite.scala | 35 ++- .../integration-tests/tests/decommissioning.py | 27 +- .../streaming/ReceivedBlockHandlerSuite.scala | 2 +- 26 files changed, 1150 insertions(+), 255 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/MapOutputTracker.scala b/core/src/main/scala/org/apache/spark/MapOutputTracker.scala index 32251df..64102cc 100644 --- a/core/src/main/scala/org/apache/spark/MapOutputTracker.scala +++ b/core/src/main/scala/org/apache/spark/MapOutputTracker.scala @@ -49,7 +49,7 @@ import org.apache.spark.util._ * * All public methods of this class are thread-safe. */ -private class ShuffleStatus(numPartitions: Int) { +private class ShuffleSt
[spark] branch master updated: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 0678afe [SPARK-21040][CORE] Speculate tasks which are running on decommission executors 0678afe is described below commit 0678afe393b1e4f65b70470483fe0cdb1fe139dc Author: Prakhar Jain AuthorDate: Fri Jul 17 16:11:02 2020 -0700 [SPARK-21040][CORE] Speculate tasks which are running on decommission executors ### What changes were proposed in this pull request? This PR adds functionality to consider the running tasks on decommission executors based on some config. In spark-on-cloud , we sometimes already know that an executor won't be alive for more than fix amount of time. Ex- In AWS Spot nodes, once we get the notification, we know that a node will be gone in 120 seconds. So if the running tasks on the decommissioning executors may run beyond currentTime+120 seconds, then they are candidate for speculation. ### Why are the changes needed? Currently when an executor is decommission, we stop scheduling new tasks on those executors but the already running tasks keeps on running on them. Based on the cloud, we might know beforehand that an executor won't be alive for more than a preconfigured time. Different cloud providers gives different timeouts before they take away the nodes. For Ex- In case of AWS spot nodes, an executor won't be alive for more than 120 seconds. We can utilize this information in cloud environments a [...] ### Does this PR introduce _any_ user-facing change? Yes. This PR adds a new config "spark.executor.decommission.killInterval" which they can explicitly set based on the cloud environment where they are running. ### How was this patch tested? Added UT. Closes #28619 from prakharjain09/SPARK-21040-speculate-decommission-exec-tasks. Authored-by: Prakhar Jain Signed-off-by: Holden Karau --- .../org/apache/spark/internal/config/package.scala | 11 +++ .../apache/spark/scheduler/TaskSetManager.scala| 28 +- .../spark/scheduler/TaskSetManagerSuite.scala | 106 + 3 files changed, 141 insertions(+), 4 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala index ca75a19..f0b292b 100644 --- a/core/src/main/scala/org/apache/spark/internal/config/package.scala +++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala @@ -1843,6 +1843,17 @@ package object config { .timeConf(TimeUnit.MILLISECONDS) .createOptional + private[spark] val EXECUTOR_DECOMMISSION_KILL_INTERVAL = +ConfigBuilder("spark.executor.decommission.killInterval") + .doc("Duration after which a decommissioned executor will be killed forcefully." + +"This config is useful for cloud environments where we know in advance when " + +"an executor is going to go down after decommissioning signal i.e. around 2 mins " + +"in aws spot nodes, 1/2 hrs in spot block nodes etc. This config is currently " + +"used to decide what tasks running on decommission executors to speculate.") + .version("3.1.0") + .timeConf(TimeUnit.SECONDS) + .createOptional + private[spark] val STAGING_DIR = ConfigBuilder("spark.yarn.stagingDir") .doc("Staging directory used while submitting applications.") .version("2.0.0") diff --git a/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala b/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala index a302f68..4b31ff0 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala @@ -19,7 +19,7 @@ package org.apache.spark.scheduler import java.io.NotSerializableException import java.nio.ByteBuffer -import java.util.concurrent.ConcurrentLinkedQueue +import java.util.concurrent.{ConcurrentLinkedQueue, TimeUnit} import scala.collection.immutable.Map import scala.collection.mutable.{ArrayBuffer, HashMap, HashSet} @@ -102,6 +102,8 @@ private[spark] class TaskSetManager( } numTasks <= slots } + val executorDecommissionKillInterval = conf.get(EXECUTOR_DECOMMISSION_KILL_INTERVAL).map( +TimeUnit.SECONDS.toMillis) // For each task, tracks whether a copy of the task has succeeded. A task will also be // marked as "succeeded" if it failed with a fetch failure, in which case it should not @@ -165,6 +167,7 @@ private[spark] class TaskSetManager( // Task index, start and finish time for each task attempt (indexed by task ID) private[scheduler
[spark-website] branch asf-site updated: Fix 2-4-6 web build
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git The following commit(s) were added to refs/heads/asf-site by this push: new 3d9740f Fix 2-4-6 web build 3d9740f is described below commit 3d9740f38beca3b8609b8650409edb93a70c1aec Author: Holden Karau AuthorDate: Wed Jun 10 18:36:12 2020 -0700 Fix 2-4-6 web build Fix the 2.4.6 web build, the jekyll serve wrote some localhost values in the sitemap we don't want and add the generated release files. Author: Holden Karau Closes #266 from holdenk/spark-2-4-6-rebuild. --- site/mailing-lists.html| 2 +- site/{mailing-lists.html => news/spark-2-4-6.html} | 16 +- .../spark-release-2-4-6.html} | 56 +++- site/sitemap.xml | 370 ++--- 4 files changed, 248 insertions(+), 196 deletions(-) diff --git a/site/mailing-lists.html b/site/mailing-lists.html index 2f4a88f..f6686f9 100644 --- a/site/mailing-lists.html +++ b/site/mailing-lists.html @@ -12,7 +12,7 @@ -http://localhost:4000/community.html"; /> +https://spark.apache.org/community.html"; /> diff --git a/site/mailing-lists.html b/site/news/spark-2-4-6.html similarity index 94% copy from site/mailing-lists.html copy to site/news/spark-2-4-6.html index 2f4a88f..53d1399 100644 --- a/site/mailing-lists.html +++ b/site/news/spark-2-4-6.html @@ -6,14 +6,11 @@ - Mailing Lists | Apache Spark + Spark 2.4.6 released | Apache Spark - -http://localhost:4000/community.html"; /> - @@ -203,7 +200,16 @@ - +Spark 2.4.6 released + + +We are happy to announce the availability of Spark 2.4.6! Visit the release notes to read about the new features, or download the release today. + + + + +Spark News Archive + diff --git a/site/mailing-lists.html b/site/releases/spark-release-2-4-6.html similarity index 68% copy from site/mailing-lists.html copy to site/releases/spark-release-2-4-6.html index 2f4a88f..299cf58 100644 --- a/site/mailing-lists.html +++ b/site/releases/spark-release-2-4-6.html @@ -6,14 +6,11 @@ - Mailing Lists | Apache Spark + Spark Release 2.4.6 | Apache Spark - -http://localhost:4000/community.html"; /> - @@ -203,7 +200,56 @@ - +Spark Release 2.4.6 + + +Spark 2.4.6 is a maintenance release containing stability, correctness, and security fixes. This release is based on the branch-2.4 maintenance branch of Spark. We strongly recommend all 2.4 users to upgrade to this stable release. + +Notable changes + + https://issues.apache.org/jira/browse/SPARK-29419";>[SPARK-29419]: Seq.toDS / spark.createDataset(Seq) is not thread-safe + https://issues.apache.org/jira/browse/SPARK-31519";>[SPARK-31519]: Cast in having aggregate expressions returns the wrong result + https://issues.apache.org/jira/browse/SPARK-26293";>[SPARK-26293]: Cast exception when having python udf in subquery + https://issues.apache.org/jira/browse/SPARK-30826";>[SPARK-30826]: LIKE returns wrong result from external table using parquet + https://issues.apache.org/jira/browse/SPARK-30857";>[SPARK-30857]: Wrong truncations of timestamps before the epoch to hours and days + https://issues.apache.org/jira/browse/SPARK-31256";>[SPARK-31256]: Dropna doesn’t work for struct columns + https://issues.apache.org/jira/browse/SPARK-31312";>[SPARK-31312]: Transforming Hive simple UDF (using JAR) expression may incur CNFE in later evaluation + https://issues.apache.org/jira/browse/SPARK-31420";>[SPARK-31420]: Infinite timeline redraw in job details page + https://issues.apache.org/jira/browse/SPARK-31485";>[SPARK-31485]: Barrier stage can hang if only partial tasks launched + https://issues.apache.org/jira/browse/SPARK-31500";>[SPARK-31500]: collect_set() of BinaryType returns duplicate elements + https://issues.apache.org/jira/browse/SPARK-31503";>[SPARK-31503]: fix the SQL string of the TRIM functions + https://issues.apache.org/jira/browse/SPARK-31663";>[SPARK-31663]: Grouping sets with having clause returns the wrong result + https://issues.apache.org/jira/browse/SPARK-26908";>[SPARK-26908]: Fix toMilis + https://issues.apache.org/jira/browse/SPARK-31563";>[SPARK-31563]: Failure of Inset.sql for UTF8String collection + + +Dependency Changes + +While being a maintence release we did still upgrade some dependencies in this release they are: + + netty-all to 4.1.47.Final (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-20445";>[CVE-2019-20445]) + Janino to 3.0.16 (SQL Generated code) + aws-java-sdk-sts to 1.11.655 (required for k
[spark] branch branch-3.0 updated: [SPARK-31934][BUILD] Remove set -x from docker image tool
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new bd0f5f2 [SPARK-31934][BUILD] Remove set -x from docker image tool bd0f5f2 is described below commit bd0f5f2c57ae954d29b0940c1cff77cd2fff8ed7 Author: Holden Karau AuthorDate: Mon Jun 8 16:03:13 2020 -0700 [SPARK-31934][BUILD] Remove set -x from docker image tool ### What changes were proposed in this pull request? Remove `set -x` from the docker image tool. ### Why are the changes needed? The image tool prints out information that may confusing. ### Does this PR introduce _any_ user-facing change? Less information is displayed by the docker image tool. ### How was this patch tested? Ran docker image tool locally. Closes #28759 from holdenk/SPARK-31934-remove-extranious-info-from-the-docker-image-tool. Authored-by: Holden Karau Signed-off-by: Holden Karau (cherry picked from commit 06959ebc399e4fa6a90c30e4f0c897cad1f6a496) Signed-off-by: Holden Karau --- bin/docker-image-tool.sh | 2 -- 1 file changed, 2 deletions(-) diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh index 8a01b80..6d74f83 100755 --- a/bin/docker-image-tool.sh +++ b/bin/docker-image-tool.sh @@ -19,8 +19,6 @@ # This script builds and pushes docker images when run from a release of Spark # with Kubernetes support. -set -x - function error { echo "$@" 1>&2 exit 1 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31934][BUILD] Remove set -x from docker image tool
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 06959eb [SPARK-31934][BUILD] Remove set -x from docker image tool 06959eb is described below commit 06959ebc399e4fa6a90c30e4f0c897cad1f6a496 Author: Holden Karau AuthorDate: Mon Jun 8 16:03:13 2020 -0700 [SPARK-31934][BUILD] Remove set -x from docker image tool ### What changes were proposed in this pull request? Remove `set -x` from the docker image tool. ### Why are the changes needed? The image tool prints out information that may confusing. ### Does this PR introduce _any_ user-facing change? Less information is displayed by the docker image tool. ### How was this patch tested? Ran docker image tool locally. Closes #28759 from holdenk/SPARK-31934-remove-extranious-info-from-the-docker-image-tool. Authored-by: Holden Karau Signed-off-by: Holden Karau --- bin/docker-image-tool.sh | 2 -- 1 file changed, 2 deletions(-) diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh index 8a01b80..6d74f83 100755 --- a/bin/docker-image-tool.sh +++ b/bin/docker-image-tool.sh @@ -19,8 +19,6 @@ # This script builds and pushes docker images when run from a release of Spark # with Kubernetes support. -set -x - function error { echo "$@" 1>&2 exit 1 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r39955 - /dev/spark/v2.4.6-rc8-docs/
Author: holden Date: Sat Jun 6 00:09:52 2020 New Revision: 39955 Log: Remove RC artifacts Removed: dev/spark/v2.4.6-rc8-docs/ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] tag v2.4.6 created (now 807e0a4)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to tag v2.4.6 in repository https://gitbox.apache.org/repos/asf/spark.git. at 807e0a4 (commit) No new revisions were added by this update. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r39950 - /release/spark/KEYS
Author: holden Date: Fri Jun 5 18:03:18 2020 New Revision: 39950 Log: Update KEYS Modified: release/spark/KEYS Modified: release/spark/KEYS == --- release/spark/KEYS (original) +++ release/spark/KEYS Fri Jun 5 18:03:18 2020 @@ -1167,3 +1167,188 @@ rMA+YcuC9o2K7dKjVv3KinQ2Tiv4TVxyTjcyZurg 0TbepIdiQlc= =wdlY -END PGP PUBLIC KEY BLOCK- + +pub rsa4096 2020-03-30 [SC] + 4A8BDA48E6E212A734632502DEA963E2E9347D66 +uid [ultimate] Reynold Xin (CODE SIGNING KEY) +sub rsa4096 2020-03-30 [E] + +-BEGIN PGP PUBLIC KEY BLOCK- + +mQINBF6BkJkBEACmRKcV6c575E6jOyZBwLteV7hJsETNYx9jMkENiyeyTFJ3A8Hg ++gPAmoU6jvzugR98qgVSH0uj/HZH1zEkJx049+OHwBcZ48mGJakIaKcg3k1CPRTL +VDRWg7M4P7nQisMHsPHrdGPJFVBE7Mn6pafuRZ46gtnXf2Ec1EsvMBOYjRNt6nSg +GvoQdiv5SjUuwxfrw7CICj1agxwLarBcWpIF6PMU7yG+XjTIrSM63KuuV+fOZvKM +AdjwwUNNj2aOkprPHfmFIgSnEMsxvoJQNqYTaWzwT8WAyW1qTd0LhYYDTnb4J+j2 +BxgG5ASHYpsLQ1Moy+lYsTxWsoZMvqTqv/h+Mlb8fiUTiYppeMnLzxtI/t8Trvt8 +rXNGSkNd8dM5uqJ9Ba2MS6UB6EZUd5e7aPy8z5ThlhygRjLk0527O4BYAWlZw5F8 +egq/X0liCeRHoFUsyNnuQYSqo2spdTIV2ExKo/hEF1FgbXF6s1v/TcfzS0PkSYEH +5yhKYoEkYOXIneIjUasy8xM9O2578NsVu1GH0n+E29KDA0w+QKwpbjgb9VWKCjk1 +CPvK7oi3DKA4A28w/h5jI9Xzb343L0gb+IhdgL5lNWp2HoSy+y7Smnbz6IchjAP7 +zCtQ9ZJCLdXgCtDlXUeF+TXzEfKUYwa0jnha/fArM3PVGvQlWdpVhe/oLQARAQAB +tDBSZXlub2xkIFhpbiAoQ09ERSBTSUdOSU5HIEtFWSkgPHJ4aW5AYXBhY2hlLm9y +Zz6JAk4EEwEIADgWIQRKi9pI5uISpzRjJQLeqWPi6TR9ZgUCXoGQmQIbAwULCQgH +AgYVCgkICwIEFgIDAQIeAQIXgAAKCRDeqWPi6TR9ZrBJEACW92VdruNL+dYYH0Cu +9oxZx0thCE1twc/6rvgvIj//0kZ4ZA6RoDId8vSmKSkB0GwMT7daIoeIvRTiEdMQ +Wai7zqvNEdT1qdNn7MfN1rveN1tBNVndzbZ8S8Nz4sqZ/8R3wG90c2XLwno3joXA +FhFRfVa+TWI1Ux84/ZXuzD14f54dorVo0CT51CnU67ERBAijl7UugPM3Fs7ApU/o +SWCMq7ScPde81jmgMqBDLcj/hueCOTU5m8irOGGY439qEF+H41I+IB60yzAS4Gez +xZl55Mv7ZKdwWtCcwtUYIm4R8NNu4alTxUpxw4ttRW3Kzue78TOIMTWTwRKrP5t2 +yq9bMT1fSO7h/Ntn8dXUL0EM/h+6k5py5Kr0+mrV/s0Z530Fit6AC/ReWV6hSGdk +F1Z1ECa4AoUHqtoQKL+CNgO2qlJn/sKj3g10NiSwqUdUuxCSOpsY72udRLG9tfkB +OwW3lTKLp66gYYE3nYaHzJKGdRs7aJ8RRALMQkadsyqpdVMp+Yvbj/3Hn3uB3jTt +S+RolH545toeuhXaiIWlm2434oHW6QjzpPwaNp5AiWm+vMfPkhhCX6WT0jv9nEtM +kJJVgwlWNKYEW9nLaIRMWWONSy9aJapZfLW0XDiKidibPHqNFih9z49eDVLobi5e +mzmOFkKFxs9D4sg9oVmId6Y9SbkCDQRegZCZARAA5ZMv1ki5mKJVpASRGfTHVH5o +9HixwJOinkHjSK3zFpuvh0bs+rKZL2+TUXci9Em64xXuYbiGH3YgH061H9tgAMaN +iSIFGPlbBPbduJjdiUALqauOjjCIoWJLyuAC25zSGCeAwzQiRXN6VJUYwjQnDMDG +8iUyL+IdXjq2T6vFVZGR/uVteRqqvEcg9km6IrFmXefqfry4hZ5a7SbmThCHqGxx +5Oy+VkWw1IP7fHIUdC9ie45X6n08yC2BfWI4+RBny8906pSXEN/ag0Yw7vWkiyuK +wZsoe0pRczV8mx6QF2+oJjRMtziKYW72jKE9a/DXXzQ3Luq5gyZeq0cluYNGHVdj +ijA2ORNLloAfGjVGRKVznUFN8LMkcxm4jiiHKRkZEcjgm+1tRzGPufFidyhQIYO2 +YCOpnPQh5IXznb3RZ0JqJcXdne+7Nge85URTEMmMyx5kXvD03ZmUObshDL12YoM3 +bGzObo6jYg+h38Xlx9+9QAwGkf+gApIPI8KqPAVyP6s60AR4iR6iehEOciz7h6/b +T9bKMw0w9cvyJzY1IJsy2sQYFwNyHYWQkyDciRAmIwriHhBDfXdBodF95V3uGbIp +DZw3jVxcgJWKZ3y65N1aCguEI1fyy9JU12++GMBa+wuv9kdhSoj2qgInFB1VXGC7 +bBlRnHB44tsFTBEqqOcAEQEAAYkCNgQYAQgAIBYhBEqL2kjm4hKnNGMlAt6pY+Lp +NH1mBQJegZCZAhsMAAoJEN6pY+LpNH1mwIYQAIRqbhEjL6uMxM19OMPDydbhiWoI +8BmoqzsvRNF9VidjPRicYJ5JL5FFvvTyT6g87L8aRhiAdX/la92PdJ9DTS3sfIKF +pIcUDFybKgk4pmGWl0fNIwEjHewf6HlndCFmVuPe32V/ZkCwb58dro15xzxblckB +kgsqb0Xbfz/3Iwlqr5eTKH5iPrDFcYKy1ODcFmXS+udMm5uwn+d/RNmj8B3kgwrw +brs53264qdWbfsxGPC1ZkDNNSRyIy6wGvc/diRm4TSV/Lmd5OoDX4UkPJ++JhGoO +cYKxc2KzrEZxzMgJ3xFRs3zeymOwtgXUU1GBCuD7uxr1vacFwUV+9ymTeyUdTxB3 ++/DzxYOJGQL/3IXlyQ2azoCWUpCjW0MFM1OolragOFJeQ+V0xrlOiXXAFfHo0KPG +y0QdK810Ok+XYR6U9Y7yb6tYDgi+w9r46XjurdiZnUxxLUpFG++tSgBQ5X4y2UGw +C4n0T8/jn6KIUZ0kx51ZZ6CEChjBt+AU+HCnw2sZfgq8Nlos95tw2MT6kn8BrY68 +n297ev/1T6B0OasQaw3Itw29+T+FdzdU4c6XW/rC6VAlBikWIS5zCT//vAeBacxL +HYoqwKL52HzG121lfWXhx5vNF4bg/fKrFEOy2Wp1fMG6nRcuUUROvieD6ZU4ZrLA +NjpTIP+lOkfxRwUi +=rggH +-END PGP PUBLIC KEY BLOCK- + +pub rsa4096 2018-07-01 [SC] + CA19E2CC201B6522D3A2A05DC41AB674424FC3FB +uid [ultimate] Holden Karau (CODE SIGNING KEY) +sub rsa4096 2018-07-01 [E] + +-BEGIN PGP PUBLIC KEY BLOCK- + +mQINBFs5MrIBEAC1jGSgkZd+HjtAt5Mlh9LKOucaHRN9ItLxk4rrYklxt3YI20ft +DIrLiAQCotiYYhmhBaXZRTfgGgusRa6MhdmX+5t/+yKZfoLqfeZeKawDsvKhGgod +Yl5iCxB0OmwdZjjOaZPDW2zVlJkjuB24SnSNwsBnd0kkDAEoGMN3bUflvnzL2EJI +zzvmR0HGK0aWRFzj7fvBu5sN4RM03ZN/3CCCVHlznH8ILvFDz8PnLf+nMhBFIQOo +OBiiLq20Ag19vj1rXjfe3U0YT6L4+SbLAt6vde8YpXs/mpkWvVPH4OtEATKsLqvJ +5pP8YGxOkPfls6lfziyLV/JMUO6f9BxqgXXWLPvjAuhRWKJ+yyFfWa4Ju8PTKEO6 +Z344tl+1FNNXNkBGTVN+k264lKe/He8ywYZKgtyF9iGvOf8HgUMsjg6DSFOMaZ9I +oFfe1G1zYtDCwV4rkf+AS/SlxaMcQpr96wxhtnBL6zD92gK8Of1+EN4yrfVSp8Gj +cnRC3jYC4iUtyWDZcUpHkSajxH2IbYQMty8fHhYLA6mq8HEhPEtsSE3eX1KQEZEX +O+mcjHaC6XuvC7aQozNkYw19S6vK9l7Dwcq7+0WYpHFkQ5RgTmE7ns36/lq/4s/B +9TTC9m8eXAjBtjNjlnax7qXBshru6dYt9kfI4PM6SV0WHZtSHldYkqy/GwARAQAB +tDNIb2xkZW4gS2FyYXUgKENPREUgU0lHTklORyBLRVkpIDxob2xkZW5AYXBhY2hl +Lm9yZz6JAk4EEwEKADgWIQTKGeLMIBtlItOioF3EGrZ0Qk/D+wUCWzkysgIbAwUL +CQgHAgYVCgkICwIEFgIDAQIeAQIXgAAKCRDEGrZ0Qk/D+4pwEACUTTtP2xxp7oE/ +1V7ozsvrhs0TwaqVVpBnhgvtj8iW8towVRPiBw+f
svn commit: r39949 - /dev/spark/v2.4.6-rc8-bin/ /release/spark/spark-2.4.6/
Author: holden Date: Fri Jun 5 18:01:51 2020 New Revision: 39949 Log: Release Spark 2.4.6 RC8 :) Added: release/spark/spark-2.4.6/ - copied from r39948, dev/spark/v2.4.6-rc8-bin/ Removed: dev/spark/v2.4.6-rc8-bin/ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31778][K8S][BUILD] Support cross-building docker images
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 1292bf2 [SPARK-31778][K8S][BUILD] Support cross-building docker images 1292bf2 is described below commit 1292bf22bf0baa60c902a8c6985bc508cbdbef1b Author: Holden Karau AuthorDate: Tue Jun 2 11:11:23 2020 -0700 [SPARK-31778][K8S][BUILD] Support cross-building docker images ### What changes were proposed in this pull request? Add cross build support to our docker image script using the new dockerx extension. ### Why are the changes needed? We have a CI for Spark on ARM, we should support building images for ARM and AMD64. ### Does this PR introduce _any_ user-facing change? Yes, a new flag is added to the docker image build script to cross-build ### How was this patch tested? Manually ran build script & pushed to https://hub.docker.com/repository/registry-1.docker.io/holdenk/spark/tags?page=1 verified amd64 & arm64 listed. Closes #28615 from holdenk/cross-build. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau (cherry picked from commit 25702281dc0c7cc333978f51a15ebf9fd02cc684) Signed-off-by: Holden Karau --- bin/docker-image-tool.sh | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh index 57b8625..8a01b80 100755 --- a/bin/docker-image-tool.sh +++ b/bin/docker-image-tool.sh @@ -19,6 +19,8 @@ # This script builds and pushes docker images when run from a release of Spark # with Kubernetes support. +set -x + function error { echo "$@" 1>&2 exit 1 @@ -172,6 +174,7 @@ function build { local BASEDOCKERFILE=${BASEDOCKERFILE:-"kubernetes/dockerfiles/spark/Dockerfile"} local PYDOCKERFILE=${PYDOCKERFILE:-false} local RDOCKERFILE=${RDOCKERFILE:-false} + local ARCHS=${ARCHS:-"--platform linux/amd64,linux/arm64"} (cd $(img_ctx_dir base) && docker build $NOCACHEARG "${BUILD_ARGS[@]}" \ -t $(image_ref spark) \ @@ -179,6 +182,11 @@ function build { if [ $? -ne 0 ]; then error "Failed to build Spark JVM Docker image, please refer to Docker build output for details." fi + if [ "${CROSS_BUILD}" != "false" ]; then + (cd $(img_ctx_dir base) && docker buildx build $ARCHS $NOCACHEARG "${BUILD_ARGS[@]}" \ +-t $(image_ref spark) \ +-f "$BASEDOCKERFILE" .) + fi if [ "${PYDOCKERFILE}" != "false" ]; then (cd $(img_ctx_dir pyspark) && docker build $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ @@ -187,6 +195,11 @@ function build { if [ $? -ne 0 ]; then error "Failed to build PySpark Docker image, please refer to Docker build output for details." fi + if [ "${CROSS_BUILD}" != "false" ]; then +(cd $(img_ctx_dir pyspark) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ + -t $(image_ref spark-py) \ + -f "$PYDOCKERFILE" .) + fi fi if [ "${RDOCKERFILE}" != "false" ]; then @@ -196,6 +209,11 @@ function build { if [ $? -ne 0 ]; then error "Failed to build SparkR Docker image, please refer to Docker build output for details." fi +if [ "${CROSS_BUILD}" != "false" ]; then + (cd $(img_ctx_dir sparkr) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ +-t $(image_ref spark-r) \ +-f "$RDOCKERFILE" .) +fi fi } @@ -227,6 +245,8 @@ Options: -nBuild docker image with --no-cache -u uidUID to use in the USER directive to set the user the main Spark process runs as inside the resulting container + -XUse docker buildx to cross build. Automatically pushes. +See https://docs.docker.com/buildx/working-with-buildx/ for steps to setup buildx. -b argBuild arg to build or push the image. For multiple build args, this option needs to be used separately for each build arg. @@ -252,6 +272,12 @@ Examples: - Build and push JDK11-based image with tag "v3.0.0" to docker.io/myrepo $0 -r docker.io/myrepo -t v3.0.0 -b java_image_tag=11-jre-slim build $0 -r docker.io/myrepo -t v3.0.0 push + + - Build and push JDK11-based image for multiple archs to docker.io/myrepo +$0 -r docker.io/myrepo -t v3.0.0 -X -b java_image_tag=11-jre-slim build +# Note: buildx, which does
[spark] branch master updated: [SPARK-31778][K8S][BUILD] Support cross-building docker images
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 2570228 [SPARK-31778][K8S][BUILD] Support cross-building docker images 2570228 is described below commit 25702281dc0c7cc333978f51a15ebf9fd02cc684 Author: Holden Karau AuthorDate: Tue Jun 2 11:11:23 2020 -0700 [SPARK-31778][K8S][BUILD] Support cross-building docker images ### What changes were proposed in this pull request? Add cross build support to our docker image script using the new dockerx extension. ### Why are the changes needed? We have a CI for Spark on ARM, we should support building images for ARM and AMD64. ### Does this PR introduce _any_ user-facing change? Yes, a new flag is added to the docker image build script to cross-build ### How was this patch tested? Manually ran build script & pushed to https://hub.docker.com/repository/registry-1.docker.io/holdenk/spark/tags?page=1 verified amd64 & arm64 listed. Closes #28615 from holdenk/cross-build. Lead-authored-by: Holden Karau Co-authored-by: Holden Karau Signed-off-by: Holden Karau --- bin/docker-image-tool.sh | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh index 57b8625..8a01b80 100755 --- a/bin/docker-image-tool.sh +++ b/bin/docker-image-tool.sh @@ -19,6 +19,8 @@ # This script builds and pushes docker images when run from a release of Spark # with Kubernetes support. +set -x + function error { echo "$@" 1>&2 exit 1 @@ -172,6 +174,7 @@ function build { local BASEDOCKERFILE=${BASEDOCKERFILE:-"kubernetes/dockerfiles/spark/Dockerfile"} local PYDOCKERFILE=${PYDOCKERFILE:-false} local RDOCKERFILE=${RDOCKERFILE:-false} + local ARCHS=${ARCHS:-"--platform linux/amd64,linux/arm64"} (cd $(img_ctx_dir base) && docker build $NOCACHEARG "${BUILD_ARGS[@]}" \ -t $(image_ref spark) \ @@ -179,6 +182,11 @@ function build { if [ $? -ne 0 ]; then error "Failed to build Spark JVM Docker image, please refer to Docker build output for details." fi + if [ "${CROSS_BUILD}" != "false" ]; then + (cd $(img_ctx_dir base) && docker buildx build $ARCHS $NOCACHEARG "${BUILD_ARGS[@]}" \ +-t $(image_ref spark) \ +-f "$BASEDOCKERFILE" .) + fi if [ "${PYDOCKERFILE}" != "false" ]; then (cd $(img_ctx_dir pyspark) && docker build $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ @@ -187,6 +195,11 @@ function build { if [ $? -ne 0 ]; then error "Failed to build PySpark Docker image, please refer to Docker build output for details." fi + if [ "${CROSS_BUILD}" != "false" ]; then +(cd $(img_ctx_dir pyspark) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ + -t $(image_ref spark-py) \ + -f "$PYDOCKERFILE" .) + fi fi if [ "${RDOCKERFILE}" != "false" ]; then @@ -196,6 +209,11 @@ function build { if [ $? -ne 0 ]; then error "Failed to build SparkR Docker image, please refer to Docker build output for details." fi +if [ "${CROSS_BUILD}" != "false" ]; then + (cd $(img_ctx_dir sparkr) && docker buildx build $ARCHS $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \ +-t $(image_ref spark-r) \ +-f "$RDOCKERFILE" .) +fi fi } @@ -227,6 +245,8 @@ Options: -nBuild docker image with --no-cache -u uidUID to use in the USER directive to set the user the main Spark process runs as inside the resulting container + -XUse docker buildx to cross build. Automatically pushes. +See https://docs.docker.com/buildx/working-with-buildx/ for steps to setup buildx. -b argBuild arg to build or push the image. For multiple build args, this option needs to be used separately for each build arg. @@ -252,6 +272,12 @@ Examples: - Build and push JDK11-based image with tag "v3.0.0" to docker.io/myrepo $0 -r docker.io/myrepo -t v3.0.0 -b java_image_tag=11-jre-slim build $0 -r docker.io/myrepo -t v3.0.0 push + + - Build and push JDK11-based image for multiple archs to docker.io/myrepo +$0 -r docker.io/myrepo -t v3.0.0 -X -b java_image_tag=11-jre-slim build +# Note: buildx, which does cross building, needs to do the push during build +# So there is no seperate push step with -X + EOF } @@
[spark] branch branch-3.0 updated: [SPARK-31860][BUILD] only push release tags on succes
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new b814cbc [SPARK-31860][BUILD] only push release tags on succes b814cbc is described below commit b814cbc8121c8d02a81efbae3e3cf1b6a6219262 Author: Holden Karau AuthorDate: Tue Jun 2 11:04:07 2020 -0700 [SPARK-31860][BUILD] only push release tags on succes ### What changes were proposed in this pull request? Only push the release tag after the build has finished. ### Why are the changes needed? If the build fails we don't need a release tag. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Running locally with a fake user in https://github.com/apache/spark/pull/28667 Closes #28700 from holdenk/SPARK-31860-build-master-only-push-tags-on-success. Authored-by: Holden Karau Signed-off-by: Holden Karau (cherry picked from commit 69ba9b662e2ace592380e3cc5de041031dec2254) Signed-off-by: Holden Karau --- dev/create-release/do-release.sh| 14 +++--- dev/create-release/release-build.sh | 7 +-- dev/create-release/release-tag.sh | 16 +++- 3 files changed, 19 insertions(+), 18 deletions(-) diff --git a/dev/create-release/do-release.sh b/dev/create-release/do-release.sh index 4f18a55..64fba8a 100755 --- a/dev/create-release/do-release.sh +++ b/dev/create-release/do-release.sh @@ -17,6 +17,8 @@ # limitations under the License. # +set -e + SELF=$(cd $(dirname $0) && pwd) . "$SELF/release-util.sh" @@ -52,9 +54,6 @@ function should_build { if should_build "tag" && [ $SKIP_TAG = 0 ]; then run_silent "Creating release tag $RELEASE_TAG..." "tag.log" \ "$SELF/release-tag.sh" - echo "It may take some time for the tag to be synchronized to github." - echo "Press enter when you've verified that the new tag ($RELEASE_TAG) is available." - read else echo "Skipping tag creation for $RELEASE_TAG." fi @@ -79,3 +78,12 @@ if should_build "publish"; then else echo "Skipping publish step." fi + +if should_build "tag" && [ $SKIP_TAG = 0 ]; then + git push origin $RELEASE_TAG + if [[ $RELEASE_TAG != *"preview"* ]]; then +git push origin HEAD:$GIT_BRANCH + else +echo "It's preview release. We only push $RELEASE_TAG to remote." + fi +fi diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh index e3bcb72..66c5184 100755 --- a/dev/create-release/release-build.sh +++ b/dev/create-release/release-build.sh @@ -92,9 +92,12 @@ BASE_DIR=$(pwd) init_java init_maven_sbt -rm -rf spark -git clone "$ASF_REPO" +# Only clone repo fresh if not present, otherwise use checkout from the tag step +if [ ! -d spark ]; then + git clone "$ASF_REPO" +fi cd spark +git fetch git checkout $GIT_REF git_hash=`git rev-parse --short HEAD` echo "Checked out Spark git hash $git_hash" diff --git a/dev/create-release/release-tag.sh b/dev/create-release/release-tag.sh index 39856a9..e37aa27 100755 --- a/dev/create-release/release-tag.sh +++ b/dev/create-release/release-tag.sh @@ -25,6 +25,7 @@ function exit_with_usage { cat << EOF usage: $NAME Tags a Spark release on a particular branch. +You must push the tags after. Inputs are specified with the following environment variables: ASF_USERNAME - Apache Username @@ -105,19 +106,8 @@ sed -i".tmp7" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$R_NEXT_VERSION" git commit -a -m "Preparing development version $NEXT_VERSION" -if ! is_dry_run; then - # Push changes - git push origin $RELEASE_TAG - if [[ $RELEASE_VERSION != *"preview"* ]]; then -git push origin HEAD:$GIT_BRANCH - else -echo "It's preview release. We only push $RELEASE_TAG to remote." - fi - - cd .. - rm -rf spark -else - cd .. +cd .. +if is_dry_run; then mv spark spark.tag echo "Clone with version changes and tag available as spark.tag in the output directory." fi - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31860][BUILD] only push release tags on succes
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 69ba9b6 [SPARK-31860][BUILD] only push release tags on succes 69ba9b6 is described below commit 69ba9b662e2ace592380e3cc5de041031dec2254 Author: Holden Karau AuthorDate: Tue Jun 2 11:04:07 2020 -0700 [SPARK-31860][BUILD] only push release tags on succes ### What changes were proposed in this pull request? Only push the release tag after the build has finished. ### Why are the changes needed? If the build fails we don't need a release tag. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Running locally with a fake user in https://github.com/apache/spark/pull/28667 Closes #28700 from holdenk/SPARK-31860-build-master-only-push-tags-on-success. Authored-by: Holden Karau Signed-off-by: Holden Karau --- dev/create-release/do-release.sh| 14 +++--- dev/create-release/release-build.sh | 7 +-- dev/create-release/release-tag.sh | 16 +++- 3 files changed, 19 insertions(+), 18 deletions(-) diff --git a/dev/create-release/do-release.sh b/dev/create-release/do-release.sh index 4f18a55..64fba8a 100755 --- a/dev/create-release/do-release.sh +++ b/dev/create-release/do-release.sh @@ -17,6 +17,8 @@ # limitations under the License. # +set -e + SELF=$(cd $(dirname $0) && pwd) . "$SELF/release-util.sh" @@ -52,9 +54,6 @@ function should_build { if should_build "tag" && [ $SKIP_TAG = 0 ]; then run_silent "Creating release tag $RELEASE_TAG..." "tag.log" \ "$SELF/release-tag.sh" - echo "It may take some time for the tag to be synchronized to github." - echo "Press enter when you've verified that the new tag ($RELEASE_TAG) is available." - read else echo "Skipping tag creation for $RELEASE_TAG." fi @@ -79,3 +78,12 @@ if should_build "publish"; then else echo "Skipping publish step." fi + +if should_build "tag" && [ $SKIP_TAG = 0 ]; then + git push origin $RELEASE_TAG + if [[ $RELEASE_TAG != *"preview"* ]]; then +git push origin HEAD:$GIT_BRANCH + else +echo "It's preview release. We only push $RELEASE_TAG to remote." + fi +fi diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh index e3bcb72..66c5184 100755 --- a/dev/create-release/release-build.sh +++ b/dev/create-release/release-build.sh @@ -92,9 +92,12 @@ BASE_DIR=$(pwd) init_java init_maven_sbt -rm -rf spark -git clone "$ASF_REPO" +# Only clone repo fresh if not present, otherwise use checkout from the tag step +if [ ! -d spark ]; then + git clone "$ASF_REPO" +fi cd spark +git fetch git checkout $GIT_REF git_hash=`git rev-parse --short HEAD` echo "Checked out Spark git hash $git_hash" diff --git a/dev/create-release/release-tag.sh b/dev/create-release/release-tag.sh index 39856a9..e37aa27 100755 --- a/dev/create-release/release-tag.sh +++ b/dev/create-release/release-tag.sh @@ -25,6 +25,7 @@ function exit_with_usage { cat << EOF usage: $NAME Tags a Spark release on a particular branch. +You must push the tags after. Inputs are specified with the following environment variables: ASF_USERNAME - Apache Username @@ -105,19 +106,8 @@ sed -i".tmp7" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$R_NEXT_VERSION" git commit -a -m "Preparing development version $NEXT_VERSION" -if ! is_dry_run; then - # Push changes - git push origin $RELEASE_TAG - if [[ $RELEASE_VERSION != *"preview"* ]]; then -git push origin HEAD:$GIT_BRANCH - else -echo "It's preview release. We only push $RELEASE_TAG to remote." - fi - - cd .. - rm -rf spark -else - cd .. +cd .. +if is_dry_run; then mv spark spark.tag echo "Clone with version changes and tag available as spark.tag in the output directory." fi - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-31860][BUILD][2.4] only push release tags on success
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new bf01280 [SPARK-31860][BUILD][2.4] only push release tags on success bf01280 is described below commit bf012809eb28a7feb5f5e20788550a39e1dbe8ad Author: Holden Karau AuthorDate: Mon Jun 1 17:46:13 2020 -0700 [SPARK-31860][BUILD][2.4] only push release tags on success ### What changes were proposed in this pull request? Only push the release tag after the build has finished. ### Why are the changes needed? If the build fails we don't need a release tag. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Running locally with a fake user Closes #28667 from holdenk/SPARK-31860-only-push-release-tags-on-success. Authored-by: Holden Karau Signed-off-by: Holden Karau --- dev/create-release/do-release.sh| 11 --- dev/create-release/release-build.sh | 7 +-- dev/create-release/release-tag.sh | 11 ++- 3 files changed, 15 insertions(+), 14 deletions(-) diff --git a/dev/create-release/do-release.sh b/dev/create-release/do-release.sh index f1d4f3a..9ff 100755 --- a/dev/create-release/do-release.sh +++ b/dev/create-release/do-release.sh @@ -17,6 +17,8 @@ # limitations under the License. # +set -e + SELF=$(cd $(dirname $0) && pwd) . "$SELF/release-util.sh" @@ -52,9 +54,6 @@ function should_build { if should_build "tag" && [ $SKIP_TAG = 0 ]; then run_silent "Creating release tag $RELEASE_TAG..." "tag.log" \ "$SELF/release-tag.sh" - echo "It may take some time for the tag to be synchronized to github." - echo "Press enter when you've verified that the new tag ($RELEASE_TAG) is available." - read else echo "Skipping tag creation for $RELEASE_TAG." fi @@ -79,3 +78,9 @@ if should_build "publish"; then else echo "Skipping publish step." fi + +if should_build "tag" && [ $SKIP_TAG = 0 ]; then + # Push the tag after success + git push origin "$RELEASE_TAG" + git push origin "HEAD:$GIT_BRANCH" +fi diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh index 1fd8a30..3c287fd 100755 --- a/dev/create-release/release-build.sh +++ b/dev/create-release/release-build.sh @@ -92,9 +92,12 @@ BASE_DIR=$(pwd) init_java init_maven_sbt -rm -rf spark -git clone "$ASF_REPO" +# Only clone the repo fresh when not present, otherwise use checkout +if [ ! -d spark ]; then + git clone "$ASF_REPO" +fi cd spark +git fetch git checkout $GIT_REF git_hash=`git rev-parse --short HEAD` echo "Checked out Spark git hash $git_hash" diff --git a/dev/create-release/release-tag.sh b/dev/create-release/release-tag.sh index 8024440..517c7f7 100755 --- a/dev/create-release/release-tag.sh +++ b/dev/create-release/release-tag.sh @@ -24,7 +24,7 @@ function exit_with_usage { local NAME=$(basename $0) cat << EOF usage: $NAME -Tags a Spark release on a particular branch. +Tags a Spark release on a particular branch. Must push after Inputs are specified with the following environment variables: ASF_USERNAME - Apache Username @@ -101,14 +101,7 @@ sed -i".tmp7" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$R_NEXT_VERSION" git commit -a -m "Preparing development version $NEXT_VERSION" -if ! is_dry_run; then - # Push changes - git push origin $RELEASE_TAG - git push origin HEAD:$GIT_BRANCH - - cd .. - rm -rf spark -else +if is_dry_run; then cd .. mv spark spark.tag echo "Clone with version changes and tag available as spark.tag in the output directory." - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new ac59a8c [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish ac59a8c is described below commit ac59a8ccee0f3121fd87477b99223c6d3690a936 Author: Holden Karau AuthorDate: Mon Jun 1 15:49:17 2020 -0700 [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish ### What changes were proposed in this pull request? Allow overriding the zinc options in the docker release and set a higher so the publish step can succeed consistently. ### Why are the changes needed? The publish step experiences memory pressure. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Running test locally with fake user to see if publish step (besides svn part) succeeds Closes #28698 from holdenk/SPARK-31889-docker-release-script-does-not-allocate-enough-memory-to-reliably-publish. Authored-by: Holden Karau Signed-off-by: Holden Karau (cherry picked from commit ab9e5a2fe9a0e9e93301096a027d7409fc3c9b64) Signed-off-by: Holden Karau --- dev/create-release/do-release-docker.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/dev/create-release/do-release-docker.sh b/dev/create-release/do-release-docker.sh index f643c06..8ea75b8 100755 --- a/dev/create-release/do-release-docker.sh +++ b/dev/create-release/do-release-docker.sh @@ -128,6 +128,7 @@ ASF_PASSWORD=$ASF_PASSWORD GPG_PASSPHRASE=$GPG_PASSPHRASE RELEASE_STEP=$RELEASE_STEP USER=$USER +ZINC_OPTS=${RELEASE_ZINC_OPTS:-"-Xmx4g -XX:ReservedCodeCacheSize=2g"} EOF JAVA_VOL= - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 387f57c [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish 387f57c is described below commit 387f57c574beb726db4d305f3c1da7d8255ad103 Author: Holden Karau AuthorDate: Mon Jun 1 15:49:17 2020 -0700 [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish ### What changes were proposed in this pull request? Allow overriding the zinc options in the docker release and set a higher so the publish step can succeed consistently. ### Why are the changes needed? The publish step experiences memory pressure. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Running test locally with fake user to see if publish step (besides svn part) succeeds Closes #28698 from holdenk/SPARK-31889-docker-release-script-does-not-allocate-enough-memory-to-reliably-publish. Authored-by: Holden Karau Signed-off-by: Holden Karau (cherry picked from commit ab9e5a2fe9a0e9e93301096a027d7409fc3c9b64) Signed-off-by: Holden Karau --- dev/create-release/do-release-docker.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/dev/create-release/do-release-docker.sh b/dev/create-release/do-release-docker.sh index 2f794c0..8f53f4a 100755 --- a/dev/create-release/do-release-docker.sh +++ b/dev/create-release/do-release-docker.sh @@ -128,6 +128,7 @@ ASF_PASSWORD=$ASF_PASSWORD GPG_PASSPHRASE=$GPG_PASSPHRASE RELEASE_STEP=$RELEASE_STEP USER=$USER +ZINC_OPTS=${RELEASE_ZINC_OPTS:-"-Xmx4g -XX:ReservedCodeCacheSize=2g"} EOF JAVA_VOL= - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ab9e5a2 [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish ab9e5a2 is described below commit ab9e5a2fe9a0e9e93301096a027d7409fc3c9b64 Author: Holden Karau AuthorDate: Mon Jun 1 15:49:17 2020 -0700 [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish ### What changes were proposed in this pull request? Allow overriding the zinc options in the docker release and set a higher so the publish step can succeed consistently. ### Why are the changes needed? The publish step experiences memory pressure. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Running test locally with fake user to see if publish step (besides svn part) succeeds Closes #28698 from holdenk/SPARK-31889-docker-release-script-does-not-allocate-enough-memory-to-reliably-publish. Authored-by: Holden Karau Signed-off-by: Holden Karau --- dev/create-release/do-release-docker.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/dev/create-release/do-release-docker.sh b/dev/create-release/do-release-docker.sh index 2f794c0..8f53f4a 100755 --- a/dev/create-release/do-release-docker.sh +++ b/dev/create-release/do-release-docker.sh @@ -128,6 +128,7 @@ ASF_PASSWORD=$ASF_PASSWORD GPG_PASSPHRASE=$GPG_PASSPHRASE RELEASE_STEP=$RELEASE_STEP USER=$USER +ZINC_OPTS=${RELEASE_ZINC_OPTS:-"-Xmx4g -XX:ReservedCodeCacheSize=2g"} EOF JAVA_VOL= - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r39857 - in /dev/spark/v2.4.6-rc8-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/spark
Author: holden Date: Sat May 30 01:27:54 2020 New Revision: 39857 Log: Apache Spark v2.4.6-rc8 docs [This commit notification would consist of 1458 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r39856 - /dev/spark/v2.4.6-rc8-bin/
Author: holden Date: Sat May 30 00:54:32 2020 New Revision: 39856 Log: Apache Spark v2.4.6-rc8 Added: dev/spark/v2.4.6-rc8-bin/ dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz (with props) dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz.asc dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz.sha512 dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz (with props) dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz.asc dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz.sha512 dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-hadoop2.6.tgz (with props) dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-hadoop2.6.tgz.asc dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-hadoop2.6.tgz.sha512 dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-hadoop2.7.tgz (with props) dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-hadoop2.7.tgz.asc dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-hadoop2.7.tgz.sha512 dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-without-hadoop-scala-2.12.tgz (with props) dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-without-hadoop-scala-2.12.tgz.asc dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-without-hadoop-scala-2.12.tgz.sha512 dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-without-hadoop.tgz (with props) dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-without-hadoop.tgz.asc dev/spark/v2.4.6-rc8-bin/spark-2.4.6-bin-without-hadoop.tgz.sha512 dev/spark/v2.4.6-rc8-bin/spark-2.4.6.tgz (with props) dev/spark/v2.4.6-rc8-bin/spark-2.4.6.tgz.asc dev/spark/v2.4.6-rc8-bin/spark-2.4.6.tgz.sha512 Added: dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz == Binary file - no diff available. Propchange: dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz.asc == --- dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz.asc (added) +++ dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz.asc Sat May 30 00:54:32 2020 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- +Version: GnuPG v1 + +iQIcBAABAgAGBQJe0abqAAoJEMQatnRCT8P7kGEP/2bc9UCJnW49GL1LGgXGca+a +/88T3iO1FEAeJtukLYjK90nx/iAa0Sf7hwNWYm+tycbueJq5+SIwQYbe97jrs65T +8NZeVasGhPflZ3W4beZqQohipPgIApPcdQIk40Uwj0pTEXsOZn9AiZwyRQ/HRUaI +0alkUQc+18hm32dGIQPYAl3aOclxh9pSR2Nf4bpVYnkJWNXP/YIpWD69KGZHoRcS +Cib4OS+VSG7FtpHV45W6g3Z0sjf9Ls8N1x0/14lWgMkuzwkpC6s0FRUcQ7n6LjWb +ytYEupdLar0T2YGuL5gbJheF2SdjdB02jAdKB3utAKD0pIvYvHYJy+vZoYkt3hsG +YjWWrnKMuP0k6jBUd3VSgsiSFCRe5OiZi6qUyg1va6ag7Y7qEarUT4SyLmH1EHG0 +kwcbNUW4D26KkUgxeXqBEUX9rtmVGyOGhy9T2BDKm1MlDPvCHLimPdCAumU6kLcy +HSS3S2Pd2EX6znIdJlHASbTs7HGuhJ38CtYbGWTuf/X1OwqDwWBA1cA9IOMWcPx0 +xZCBDgIjtrtuB7IdxFN+yZcLD5peTDcI5/e7q8FEbovKdO4pKhQkDqvDoWSbVVC7 +JqWMG7oYgN3IOO7B8gpxbr/n/KWJZ8UWRrhJj7aRHdSSx45NjLVXfJWBGic8FGAZ +5QQUa9SJZ+Zx2bjtL3/a +=ZTPX +-END PGP SIGNATURE- Added: dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz.sha512 == --- dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz.sha512 (added) +++ dev/spark/v2.4.6-rc8-bin/SparkR_2.4.6.tar.gz.sha512 Sat May 30 00:54:32 2020 @@ -0,0 +1,3 @@ +SparkR_2.4.6.tar.gz: 3CC0F7FF FF0D6BD2 0A268A36 D9F33E42 49A66B56 4C49E834 + 368DA707 E5AC3E5E 44B0A527 798181EF 4A951AAE 6E927206 + 7246406F 7D7A38F6 BC059661 DA903604 Added: dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz == Binary file - no diff available. Propchange: dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz.asc == --- dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz.asc (added) +++ dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz.asc Sat May 30 00:54:32 2020 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- +Version: GnuPG v1 + +iQIcBAABAgAGBQJe0aKWAAoJEMQatnRCT8P7N7EP/RuobQLVGHz6CpWFDKZCHUvk +W24KFvAkhNS3S6+YzC5+GjZRM6WWQUx7n2+fwB7ntUq4B1IMZ9Tk5w3aVvmJkiOZ +cwz1sqvcxmDZ97b0e8H/7j0Uue1V65nkVVnd9lrUgZicAaBRCkYJQdEqtUmJK+fd +Wm4+9XHrssac1mG75b1DeEfjw9Ip3DhG+ey/QXZ1F7XaQcbnX36BKGi2PQhXF8No +fFDDonnHxKh/JHvGyCCe6nixO4R6un28uTdUf5WXTToYrgsrp0Dn+N0bt+O5PMw3 +PcI7ODnsLVXgVsxkYZvhSq8Dskx6H8JYVTe4bv3Ayt2OfhNbbi1VPTnCKGeoyzxo +7SKnmHLXeSqQ9AxhNRT+VxtR4GvJXNHT+MAFYwdtZQ0C8GqXqBlRfyUKkiSLaC5r +8usaVtugOQRpEVepvSiwnsJngMIYUOhitBttRoEfpO6aikMsaCWk47s1CUpkkffO +z7IqtaYwNXwUcRbk7ObUsep4hWfCGcPvwbDiRgvygOJ5LtMYM8foi3xNisw8OPU7 +V/VvO4jxPVroivHmokW9cZpaN3gV1tWNmMxrLPNJX0DCU41lVe+LZXi71Uf3sOmA +HV7fH+bkMPzHxNog1uhO/8S3PvDwOWqoVStTJ116vKGokDttcr+1bb/mroYIb9Py +4bhiZlciLcwLnfV//H0k +=FPuQ +-END PGP SIGNATURE- Added: dev
[spark] 01/01: Preparing development version 2.4.7-SNAPSHOT
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git commit f32c60e001abe564736e0d00772dfce23c3b07b5 Author: Holden Karau AuthorDate: Fri May 29 23:28:37 2020 + Preparing development version 2.4.7-SNAPSHOT --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml| 2 +- external/flume-sink/pom.xml| 2 +- external/flume/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10/pom.xml| 2 +- external/kafka-0-8-assembly/pom.xml| 2 +- external/kafka-0-8/pom.xml | 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 43 files changed, 44 insertions(+), 44 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index c913a38..b70014d 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.4.6 +Version: 2.4.7 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), diff --git a/assembly/pom.xml b/assembly/pom.xml index de59f40..712cc7f 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index b3a2265..825d771 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index d3db527..9dd26b3 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 34872fd..386782b 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index 5059ee0..8496a68 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../../pom.xml diff -
[spark] branch branch-2.4 updated (8307f1a -> f32c60e)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 8307f1a [SPARK-26095][BUILD] Disable parallelization in make-distibution.sh. add 807e0a4 Preparing Spark release v2.4.6-rc8 new f32c60e Preparing development version 2.4.7-SNAPSHOT The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] tag v2.4.6-rc8 created (now 807e0a4)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to tag v2.4.6-rc8 in repository https://gitbox.apache.org/repos/asf/spark.git. at 807e0a4 (commit) This tag includes the following new commits: new 807e0a4 Preparing Spark release v2.4.6-rc8 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/01: Preparing Spark release v2.4.6-rc8
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to tag v2.4.6-rc8 in repository https://gitbox.apache.org/repos/asf/spark.git commit 807e0a484d1de767d1f02bd8a622da6450bdf940 Author: Holden Karau AuthorDate: Fri May 29 23:28:32 2020 + Preparing Spark release v2.4.6-rc8 --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml| 2 +- external/flume-sink/pom.xml| 2 +- external/flume/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10/pom.xml| 2 +- external/kafka-0-8-assembly/pom.xml| 2 +- external/kafka-0-8/pom.xml | 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 43 files changed, 44 insertions(+), 44 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index b70014d..c913a38 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.4.7 +Version: 2.4.6 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), diff --git a/assembly/pom.xml b/assembly/pom.xml index 712cc7f..de59f40 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.4.7-SNAPSHOT +2.4.6 ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 825d771..b3a2265 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.7-SNAPSHOT +2.4.6 ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index 9dd26b3..d3db527 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.7-SNAPSHOT +2.4.6 ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 386782b..34872fd 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.7-SNAPSHOT +2.4.6 ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index 8496a68..5059ee0 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.7-SNAPSHOT +2.4.6 ../../pom.xml diff --git a/co
[spark] branch branch-2.4 updated: [SPARK-26095][BUILD] Disable parallelization in make-distibution.sh.
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 8307f1a [SPARK-26095][BUILD] Disable parallelization in make-distibution.sh. 8307f1a is described below commit 8307f1aa6c421669367f60b1bc9a9d497fb74d6f Author: Marcelo Vanzin AuthorDate: Fri Nov 16 15:57:38 2018 -0800 [SPARK-26095][BUILD] Disable parallelization in make-distibution.sh. It makes the build slower, but at least it doesn't hang. Seems that maven-shade-plugin has some issue with parallelization. Closes #23061 from vanzin/SPARK-26095. Authored-by: Marcelo Vanzin Signed-off-by: Marcelo Vanzin (cherry picked from commit d2792046a1b10a07b65fc30be573983f1237e450) Signed-off-by: Holden Karau --- dev/make-distribution.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dev/make-distribution.sh b/dev/make-distribution.sh index 78b85bd..2b02c24 100755 --- a/dev/make-distribution.sh +++ b/dev/make-distribution.sh @@ -166,7 +166,7 @@ export MAVEN_OPTS="${MAVEN_OPTS:--Xmx2g -XX:ReservedCodeCacheSize=1g}" # Store the command as an array because $MVN variable might have spaces in it. # Normal quoting tricks don't work. # See: http://mywiki.wooledge.org/BashFAQ/050 -BUILD_COMMAND=("$MVN" -T 1C clean package -DskipTests $@) +BUILD_COMMAND=("$MVN" clean package -DskipTests $@) # Actually build the jar echo -e "\nBuilding with..." - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r39852 - in /dev/spark: v2.4.6-rc1-bin/ v2.4.6-rc1-docs/ v2.4.6-rc3-bin/ v2.4.6-rc3-docs/ v2.4.6-rc6-bin/ v2.4.6-rc6-docs/ v2.4.6-rc7-bin/
Author: holden Date: Fri May 29 19:49:01 2020 New Revision: 39852 Log: Remove failed 2.4.6 RCs Removed: dev/spark/v2.4.6-rc1-bin/ dev/spark/v2.4.6-rc1-docs/ dev/spark/v2.4.6-rc3-bin/ dev/spark/v2.4.6-rc3-docs/ dev/spark/v2.4.6-rc6-bin/ dev/spark/v2.4.6-rc6-docs/ dev/spark/v2.4.6-rc7-bin/ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r39820 - /dev/spark/v2.4.6-rc7-bin/
Author: holden Date: Fri May 29 08:53:39 2020 New Revision: 39820 Log: Apache Spark v2.4.6-rc7 Added: dev/spark/v2.4.6-rc7-bin/ dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz (with props) dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz.asc dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz.sha512 dev/spark/v2.4.6-rc7-bin/pyspark-2.4.6.tar.gz (with props) dev/spark/v2.4.6-rc7-bin/pyspark-2.4.6.tar.gz.asc dev/spark/v2.4.6-rc7-bin/pyspark-2.4.6.tar.gz.sha512 dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-hadoop2.6.tgz (with props) dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-hadoop2.6.tgz.asc dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-hadoop2.6.tgz.sha512 dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-hadoop2.7.tgz (with props) dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-hadoop2.7.tgz.asc dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-hadoop2.7.tgz.sha512 dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-without-hadoop-scala-2.12.tgz (with props) dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-without-hadoop-scala-2.12.tgz.asc dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-without-hadoop-scala-2.12.tgz.sha512 dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-without-hadoop.tgz (with props) dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-without-hadoop.tgz.asc dev/spark/v2.4.6-rc7-bin/spark-2.4.6-bin-without-hadoop.tgz.sha512 dev/spark/v2.4.6-rc7-bin/spark-2.4.6.tgz (with props) dev/spark/v2.4.6-rc7-bin/spark-2.4.6.tgz.asc dev/spark/v2.4.6-rc7-bin/spark-2.4.6.tgz.sha512 Added: dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz == Binary file - no diff available. Propchange: dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz.asc == --- dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz.asc (added) +++ dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz.asc Fri May 29 08:53:39 2020 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- +Version: GnuPG v1 + +iQIcBAABAgAGBQJe0MgaAAoJEMQatnRCT8P79NEP/1zQxuI6gVcD6gsNMoYF8k+c +woyDyHnNcEBZsTL7WTPu/ut/hrCe8anXVfzzsc6LudB5cRRbvO1KXpYvcARmKQID +tnoUZLVUuFs+ejFsM+Ux2vTvrF06wPxWBKXHw71QYU4hVKnl2X8oekK1Ae2u74Yl +RCtTzjbbgxYc5NyvKBnMU2lrxij+izFunQy3UGPuvxhwS0Ip2VWId04jWGwE1hA0 +S+UoEG8aZI3R7by//AO8+lMhRl2ZOKCn0Le94EapT1R3bBCF5kIdC82Q6xR5z4Z+ +3C8lc+sNO7BJmscJeMXIs5JRHVb/twdvTnPs2HIQ0hZQUaT3GrByaE/lJkeA4rcg +qewhnCBX/S/E4spETwB9nJjFyt2caT1L06nW7ICK2zXiTmbPiB2jTWh+v7KDmdNN +8dMr0lvbvmNAzKwBFEthremEEKoqp82dImMffTlJSMGmnAoCgfBDB7OuuP1Pq7G5 +KZnY7f6BA6ylVp2gxGHCHTOYEJd3fR4fHSFv2AUOyh96HZ3yqjFP3bSn2PEMNvfH +2C2E7UG7cNgdHef6IQXRktxyvoWDN1tvhVh5AJU1hBFJ4bCD16SsJls/cvvn4AMn +Sa43Yt6YPzeKpJh2xIkhWW7xK2EnjDwT1KXx819M+qmatKANPcoZpHMqBLuXap8g +FC/l5MpURmJtItJCUfm0 +=/74N +-END PGP SIGNATURE- Added: dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz.sha512 == --- dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz.sha512 (added) +++ dev/spark/v2.4.6-rc7-bin/SparkR_2.4.6.tar.gz.sha512 Fri May 29 08:53:39 2020 @@ -0,0 +1,3 @@ +SparkR_2.4.6.tar.gz: 664DE4CC CBA415FA CBCEF3B0 2D6B83B4 D44D4AA6 B0A9A3B5 + D27DBBB8 5889858C 6FCC2394 C5B6BAE9 1B5546DF 44EB0512 + DDEDC106 52763F02 0AF834C0 C091542B Added: dev/spark/v2.4.6-rc7-bin/pyspark-2.4.6.tar.gz == Binary file - no diff available. Propchange: dev/spark/v2.4.6-rc7-bin/pyspark-2.4.6.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v2.4.6-rc7-bin/pyspark-2.4.6.tar.gz.asc == --- dev/spark/v2.4.6-rc7-bin/pyspark-2.4.6.tar.gz.asc (added) +++ dev/spark/v2.4.6-rc7-bin/pyspark-2.4.6.tar.gz.asc Fri May 29 08:53:39 2020 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- +Version: GnuPG v1 + +iQIcBAABAgAGBQJe0MTQAAoJEMQatnRCT8P7EcgP/3LOQmy+nmMLhdllrzWvYbUD +jl01lpIKVYWFhF1w9sXZ76UjzKnuj56gwsVlXUkL6k56GVQEcP6Vyu7pOCflwAZz +nQbCkFZFnajFeBJ/3qKu6iqsIDEcvDt5vSIwNw+CdIzM/Jc4evpQ2Lqxj4/eUhTf +laSj6Qilj2OiMdRpr/UIWpbNN/JWlxhov6a9vHnWCkF5DYexuVuUzMVPNC8oWxWq +adtfWS2/MzYB6Y/wYifnGEZz1MOZysTHEprvx8CiFPznb00a224dZ8W3P1emH5dX ++tQjuFcg4TDpRKPqO9nLAkASYRso0kuhwMVLHh+Xoq1GqOhMCTMMx+uUk0L8NOJJ +KNUjUaSUIR4UsWA30aEWvWV+a1xhduhWSlj7rgylPj+slxfO1aAjiBkSK8DfGz3U +3fSsX8r4l7dZOjPDg0/6EnlI6RDbReZ30MGHQ8FBKGuZ9SezHPMpqN2/ny6pOMTM +RhgXCgNbLcolp6satIHbgWz88bFv3cbBoWdQYrN7nqbqc8fXaGhjeaw50Et5IG77 +QylBnqX424xVFP7G+1CFpcuxgsXUGXhLSLqCQNchBbGVR/IXXL6skJUgyXL3xz6t +LpnTWnKm7xdFaaWOG9rp82yZVQSti2Feh36c0NwXjrk1BwjFiW0Jj9p1MEAWmlrF +G18Qlh8n7RUUqk67SQK/ +=H4tm +-END PGP SIGNATURE- Added: dev
[spark] 01/01: Preparing development version 2.4.7-SNAPSHOT
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git commit 1c36f8e8f633cce8c78bccdac215acccd2d8202b Author: Holden Karau AuthorDate: Fri May 29 07:51:23 2020 + Preparing development version 2.4.7-SNAPSHOT --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml| 2 +- external/flume-sink/pom.xml| 2 +- external/flume/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10/pom.xml| 2 +- external/kafka-0-8-assembly/pom.xml| 2 +- external/kafka-0-8/pom.xml | 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 43 files changed, 44 insertions(+), 44 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index c913a38..b70014d 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.4.6 +Version: 2.4.7 Title: R Front End for 'Apache Spark' Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>. Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), diff --git a/assembly/pom.xml b/assembly/pom.xml index de59f40..712cc7f 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index b3a2265..825d771 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index d3db527..9dd26b3 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 34872fd..386782b 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index 5059ee0..8496a68 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.4.6 +2.4.7-SNAPSHOT ../../pom.xml diff -
[spark] branch branch-2.4 updated (d53363d -> 1c36f8e)
This is an automated email from the ASF dual-hosted git repository. holden pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from d53363d Preparing development version 2.4.7-SNAPSHOT add 105de0512 Preparing Spark release v2.4.6-rc7 new 1c36f8e Preparing development version 2.4.7-SNAPSHOT The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org