[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22845 **[Test build #98074 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98074/testReport)** for PR 22845 at commit [`a10eb1a`](https://github.com/apache/spark/commit/a10eb1aa8f06fc94fa097c2ab9023a67256d30c4). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22844 **[Test build #98075 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98075/testReport)** for PR 22844 at commit [`62af4fd`](https://github.com/apache/spark/commit/62af4fd4182f9b63f529efbcb51c15535e200a5b).
[GitHub] spark pull request #22823: [SPARK-25676][SQL][TEST] Improve BenchmarkWideTab...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22823#discussion_r228409979 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/BenchmarkWideTable.scala --- @@ -1,52 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - *http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -package org.apache.spark.sql.execution.benchmark - -import org.apache.spark.benchmark.Benchmark - -/** - * Benchmark to measure performance for wide table. - * To run this: - * build/sbt "sql/test-only *benchmark.BenchmarkWideTable" - * - * Benchmarks in this file are skipped in normal builds. 
- */ -class BenchmarkWideTable extends BenchmarkWithCodegen { - - ignore("project on wide table") { -val N = 1 << 20 -val df = sparkSession.range(N) -val columns = (0 until 400).map{ i => s"id as id$i"} -val benchmark = new Benchmark("projection on wide table", N) -benchmark.addCase("wide table", numIters = 5) { iter => - df.selectExpr(columns : _*).queryExecution.toRdd.count() -} -benchmark.run() - -/** - * Here are some numbers with different split threshold: - * - * Split threshold methods Rate(M/s) Per Row(ns) - * 10 400 0.4 2279 - * 100 200 0.6 1554 - * 1k 37 0.9 1116 --- End diff -- I think we should have a PR to add this config officially. It should be useful for performance tuning.
[GitHub] spark pull request #22771: [SPARK-25773][Core]Cancel zombie tasks in a resul...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22771#discussion_r228409787 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1364,6 +1385,21 @@ private[spark] class DAGScheduler( if (job.numFinished == job.numPartitions) { markStageAsFinished(resultStage) cleanupStateForJobAndIndependentStages(job) +try { + // killAllTaskAttempts will fail if a SchedulerBackend does not implement + // killTask. + logInfo(s"Job ${job.jobId} is finished. Killing potential speculative or " + +s"zombie tasks for this job") --- End diff -- I created https://issues.apache.org/jira/browse/SPARK-25849 to improve the documentation.
[GitHub] spark issue #22771: [SPARK-25773][Core]Cancel zombie tasks in a result stage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22771 Merged build finished. Test PASSed.
[GitHub] spark issue #22771: [SPARK-25773][Core]Cancel zombie tasks in a result stage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22771 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4525/ Test PASSed.
[GitHub] spark issue #22771: [SPARK-25773][Core]Cancel zombie tasks in a result stage...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22771 **[Test build #98073 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98073/testReport)** for PR 22771 at commit [`2e03290`](https://github.com/apache/spark/commit/2e0329039435b7bc61ef0370490efe45ba8048c6).
[GitHub] spark issue #22771: [SPARK-25773][Core]Cancel zombie tasks in a result stage...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22771 retest this please
[GitHub] spark pull request #22841: [SPARK-25842][SQL] Deprecate rangeBetween APIs in...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22841
[GitHub] spark issue #22841: [SPARK-25842][SQL] Deprecate rangeBetween APIs introduce...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22841 thanks, merging to master/2.4!
[GitHub] spark pull request #22823: [SPARK-25676][SQL][TEST] Improve BenchmarkWideTab...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22823#discussion_r228407600 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -910,12 +910,14 @@ class CodegenContext { val blocks = new ArrayBuffer[String]() val blockBuilder = new StringBuilder() var length = 0 +val splitThreshold = + SQLConf.get.getConfString("spark.testing.codegen.splitThreshold", "1024").toInt --- End diff -- Personally I don't think this is a good solution: 1. The configuration name contains "testing", which is super weird as it can be used in production. 2. We should start a new discussion about whether to make it configurable. The reason should not be to make the benchmark easier.
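For readers following the split-threshold discussion, here is a minimal, self-contained sketch of how a length-based threshold can group generated code snippets into blocks. It is only an illustration modeled on the variables visible in the diff (`blocks`, `blockBuilder`, `length`, `splitThreshold`); the object name `SplitThresholdSketch`, the helper `buildCodeBlocks`, and the use of raw snippet length as the metric are assumptions, not the actual CodeGenerator implementation.

```scala
import scala.collection.mutable.ArrayBuffer

object SplitThresholdSketch {
  // Hypothetical helper (not Spark's API): groups code snippets into blocks,
  // starting a new block once the accumulated source length exceeds
  // `splitThreshold`. Raw string length is an assumed stand-in for code size.
  def buildCodeBlocks(snippets: Seq[String], splitThreshold: Int): Seq[String] = {
    val blocks = new ArrayBuffer[String]()
    val blockBuilder = new StringBuilder()
    var length = 0
    for (code <- snippets) {
      if (length > splitThreshold) {
        // The current block is large enough: emit it and start a new one.
        blocks += blockBuilder.toString()
        blockBuilder.clear()
        length = 0
      }
      blockBuilder.append(code)
      length += code.length
    }
    // Emit whatever is left in the builder as the final block.
    blocks += blockBuilder.toString()
    blocks.toSeq
  }
}
```

Under these assumptions, a smaller threshold yields more and smaller blocks (hence more generated methods), which is the trade-off the benchmark numbers in the removed BenchmarkWideTable comment were measuring.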
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22843 Merged build finished. Test PASSed.
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22843 **[Test build #98070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98070/testReport)** for PR 22843 at commit [`60c5808`](https://github.com/apache/spark/commit/60c5808ddd72f0f41cb33208268dfac3da5baa03). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22843 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98070/ Test PASSed.
[GitHub] spark pull request #22816: [SPARK-25822][PySpark]Fix a race condition when r...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22816
[GitHub] spark issue #22816: [SPARK-25822][PySpark]Fix a race condition when releasin...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/22816 Thanks! merging to master/2.4/2.3.
[GitHub] spark issue #22816: [SPARK-25822][PySpark]Fix a race condition when releasin...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/22816 LGTM.
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user functicons commented on the issue: https://github.com/apache/spark/pull/21588 Do we really want to switch to Hive 2.3? According to https://hive.apache.org/downloads.html, Hive 2.3 works with Hadoop 2.x (Hive 3.x works with Hadoop 3.x).
[GitHub] spark pull request #22624: [SPARK-23781][CORE] Merge token renewer functiona...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22624#discussion_r228400112 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/security/KubernetesHadoopDelegationTokenManager.scala --- @@ -18,45 +18,20 @@ package org.apache.spark.deploy.k8s.security import org.apache.hadoop.conf.Configuration -import org.apache.hadoop.fs.FileSystem -import org.apache.hadoop.security.{Credentials, UserGroupInformation} +import org.apache.hadoop.security.UserGroupInformation import org.apache.spark.SparkConf -import org.apache.spark.deploy.SparkHadoopUtil import org.apache.spark.deploy.security.HadoopDelegationTokenManager -import org.apache.spark.internal.Logging /** - * The KubernetesHadoopDelegationTokenManager fetches Hadoop delegation tokens - * on the behalf of the Kubernetes submission client. The new credentials - * (called Tokens when they are serialized) are stored in Secrets accessible - * to the driver and executors, when new Tokens are received they overwrite the current Secrets. + * Adds Kubernetes-specific functionality to HadoopDelegationTokenManager. */ private[spark] class KubernetesHadoopDelegationTokenManager( --- End diff -- this class doesn't really seem necessary anymore, but not a big deal
[GitHub] spark pull request #22624: [SPARK-23781][CORE] Merge token renewer functiona...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22624#discussion_r228398765 --- Diff: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala --- @@ -110,32 +209,105 @@ private[spark] class HadoopDelegationTokenManager( } /** - * Get delegation token provider for the specified service. + * List of file systems for which to obtain delegation tokens. The base implementation + * returns just the default file system in the given Hadoop configuration. */ - def getServiceDelegationTokenProvider(service: String): Option[HadoopDelegationTokenProvider] = { -delegationTokenProviders.get(service) + protected def fileSystemsToAccess(): Set[FileSystem] = { +Set(FileSystem.get(hadoopConf)) + } + + private def scheduleRenewal(delay: Long): Unit = { +val _delay = math.max(0, delay) +logInfo(s"Scheduling login from keytab in ${UIUtils.formatDuration(delay)}.") + +val renewalTask = new Runnable() { + override def run(): Unit = { +updateTokensTask() + } +} +renewalExecutor.schedule(renewalTask, _delay, TimeUnit.MILLISECONDS) } /** - * Writes delegation tokens to creds. Delegation tokens are fetched from all registered - * providers. - * - * @param hadoopConf hadoop Configuration - * @param creds Credentials that will be updated in place (overwritten) - * @return Time after which the fetched delegation tokens should be renewed. + * Periodic task to login to the KDC and create new delegation tokens. Re-schedules itself + * to fetch the next set of tokens when needed. 
*/ - def obtainDelegationTokens( - hadoopConf: Configuration, - creds: Credentials): Long = { -delegationTokenProviders.values.flatMap { provider => - if (provider.delegationTokensRequired(sparkConf, hadoopConf)) { -provider.obtainDelegationTokens(hadoopConf, sparkConf, creds) + private def updateTokensTask(): Unit = { +try { + val freshUGI = doLogin() + val creds = obtainTokensAndScheduleRenewal(freshUGI) + val tokens = SparkHadoopUtil.get.serialize(creds) + + val driver = driverRef.get() + if (driver != null) { +logInfo("Updating delegation tokens.") +driver.send(UpdateDelegationTokens(tokens)) } else { -logDebug(s"Service ${provider.serviceName} does not require a token." + - s" Check your configuration to see if security is disabled or not.") -None +// This shouldn't really happen, since the driver should register way before tokens expire. +logWarning("Delegation tokens close to expiration but no driver has registered yet.") +SparkHadoopUtil.get.addDelegationTokens(tokens, sparkConf) } -}.foldLeft(Long.MaxValue)(math.min) +} catch { + case e: Exception => +val delay = TimeUnit.SECONDS.toMillis(sparkConf.get(CREDENTIALS_RENEWAL_RETRY_WAIT)) +logWarning(s"Failed to update tokens, will try again in ${UIUtils.formatDuration(delay)}!" + + " If this happens too often tasks will fail.", e) +scheduleRenewal(delay) +} + } + + /** + * Obtain new delegation tokens from the available providers. Schedules a new task to fetch + * new tokens before the new set expires. + * + * @return Credentials containing the new tokens. + */ + private def obtainTokensAndScheduleRenewal(ugi: UserGroupInformation): Credentials = { +ugi.doAs(new PrivilegedExceptionAction[Credentials]() { + override def run(): Credentials = { +val creds = new Credentials() +val nextRenewal = obtainDelegationTokens(creds) + +// Calculate the time when new credentials should be created, based on the configured +// ratio. 
+val now = System.currentTimeMillis +val ratio = sparkConf.get(CREDENTIALS_RENEWAL_INTERVAL_RATIO) +val adjustedNextRenewal = (now + (ratio * (nextRenewal - now))).toLong + +scheduleRenewal(adjustedNextRenewal - now) +creds + } +}) + } + + private def doLogin(): UserGroupInformation = { +logInfo(s"Attempting to login to KDC using principal: $principal") +val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab) +logInfo("Successfully logged into KDC.") +ugi + } + + private def loadProviders(): Map[String, HadoopDelegationTokenProvider] = { +val providers = Seq(new HadoopFSDelegationTokenProvider(fileSystemsToAccess)) ++ +
[GitHub] spark pull request #22624: [SPARK-23781][CORE] Merge token renewer functiona...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22624#discussion_r228400837 --- Diff: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala --- @@ -17,76 +17,175 @@ package org.apache.spark.deploy.security +import java.io.File +import java.security.PrivilegedExceptionAction +import java.util.concurrent.{ScheduledExecutorService, TimeUnit} +import java.util.concurrent.atomic.AtomicReference + import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.FileSystem -import org.apache.hadoop.security.Credentials +import org.apache.hadoop.security.{Credentials, UserGroupInformation} import org.apache.spark.SparkConf +import org.apache.spark.deploy.SparkHadoopUtil import org.apache.spark.internal.Logging +import org.apache.spark.internal.config._ +import org.apache.spark.rpc.RpcEndpointRef +import org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens +import org.apache.spark.ui.UIUtils +import org.apache.spark.util.ThreadUtils /** - * Manages all the registered HadoopDelegationTokenProviders and offer APIs for other modules to - * obtain delegation tokens and their renewal time. By default [[HadoopFSDelegationTokenProvider]], - * [[HiveDelegationTokenProvider]] and [[HBaseDelegationTokenProvider]] will be loaded in if not - * explicitly disabled. + * Manager for delegation tokens in a Spark application. + * + * This manager has two modes of operation: + * + * 1. When configured with a principal and a keytab, it will make sure long-running apps can run + * without interruption while accessing secured services. It periodically logs in to the KDC with + * user-provided credentials, and contacts all the configured secure services to obtain delegation + * tokens to be distributed to the rest of the application. 
+ * + * Because the Hadoop UGI API does not expose the TTL of the TGT, a configuration controls how often + * to check that a relogin is necessary. This is done reasonably often since the check is a no-op + * when the relogin is not yet needed. The check period can be overridden in the configuration. * - * Also, each HadoopDelegationTokenProvider is controlled by - * spark.security.credentials.{service}.enabled, and will not be loaded if this config is set to - * false. For example, Hive's delegation token provider [[HiveDelegationTokenProvider]] can be - * enabled/disabled by the configuration spark.security.credentials.hive.enabled. + * New delegation tokens are created once 75% of the renewal interval of the original tokens has + * elapsed. The new tokens are sent to the Spark driver endpoint once it's registered with the AM. + * The driver is tasked with distributing the tokens to other processes that might need them. * - * @param sparkConf Spark configuration - * @param hadoopConf Hadoop configuration - * @param fileSystems Delegation tokens will be fetched for these Hadoop filesystems. + * 2. When operating without an explicit principal and keytab, token renewal will not be available. + * Starting the manager will distribute an initial set of delegation tokens to the provided Spark + * driver, but the app will not get new tokens when those expire. + * + * It can also be used just to create delegation tokens, by calling the `obtainDelegationTokens` + * method. This option does not require calling the `start` method, but leaves it up to the + * caller to distribute the tokens that were generated. 
*/ private[spark] class HadoopDelegationTokenManager( -sparkConf: SparkConf, -hadoopConf: Configuration, -fileSystems: Configuration => Set[FileSystem]) - extends Logging { +protected val sparkConf: SparkConf, +protected val hadoopConf: Configuration) extends Logging { private val deprecatedProviderEnabledConfigs = List( "spark.yarn.security.tokens.%s.enabled", "spark.yarn.security.credentials.%s.enabled") private val providerEnabledConfig = "spark.security.credentials.%s.enabled" - // Maintain all the registered delegation token providers - private val delegationTokenProviders = getDelegationTokenProviders + private val principal = sparkConf.get(PRINCIPAL).orNull + private val keytab = sparkConf.get(KEYTAB).orNull + + if (principal != null) { +require(keytab != null, "Kerberos principal specified without a keytab.") +require(new File(keytab).isFile(), s"Cannot find keytab at $keytab.") + } + + private val delegationTokenProviders = loadProviders() logDebug("Using the following builtin delegation token providers: " +
[GitHub] spark pull request #22624: [SPARK-23781][CORE] Merge token renewer functiona...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22624#discussion_r228400760 --- Diff: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala --- @@ -17,76 +17,175 @@ package org.apache.spark.deploy.security +import java.io.File +import java.security.PrivilegedExceptionAction +import java.util.concurrent.{ScheduledExecutorService, TimeUnit} +import java.util.concurrent.atomic.AtomicReference + import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.FileSystem -import org.apache.hadoop.security.Credentials +import org.apache.hadoop.security.{Credentials, UserGroupInformation} import org.apache.spark.SparkConf +import org.apache.spark.deploy.SparkHadoopUtil import org.apache.spark.internal.Logging +import org.apache.spark.internal.config._ +import org.apache.spark.rpc.RpcEndpointRef +import org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens +import org.apache.spark.ui.UIUtils +import org.apache.spark.util.ThreadUtils /** - * Manages all the registered HadoopDelegationTokenProviders and offer APIs for other modules to - * obtain delegation tokens and their renewal time. By default [[HadoopFSDelegationTokenProvider]], - * [[HiveDelegationTokenProvider]] and [[HBaseDelegationTokenProvider]] will be loaded in if not - * explicitly disabled. + * Manager for delegation tokens in a Spark application. + * + * This manager has two modes of operation: + * + * 1. When configured with a principal and a keytab, it will make sure long-running apps can run + * without interruption while accessing secured services. It periodically logs in to the KDC with + * user-provided credentials, and contacts all the configured secure services to obtain delegation + * tokens to be distributed to the rest of the application. 
+ * + * Because the Hadoop UGI API does not expose the TTL of the TGT, a configuration controls how often + * to check that a relogin is necessary. This is done reasonably often since the check is a no-op + * when the relogin is not yet needed. The check period can be overridden in the configuration. * - * Also, each HadoopDelegationTokenProvider is controlled by - * spark.security.credentials.{service}.enabled, and will not be loaded if this config is set to - * false. For example, Hive's delegation token provider [[HiveDelegationTokenProvider]] can be - * enabled/disabled by the configuration spark.security.credentials.hive.enabled. + * New delegation tokens are created once 75% of the renewal interval of the original tokens has + * elapsed. The new tokens are sent to the Spark driver endpoint once it's registered with the AM. + * The driver is tasked with distributing the tokens to other processes that might need them. * - * @param sparkConf Spark configuration - * @param hadoopConf Hadoop configuration - * @param fileSystems Delegation tokens will be fetched for these Hadoop filesystems. + * 2. When operating without an explicit principal and keytab, token renewal will not be available. + * Starting the manager will distribute an initial set of delegation tokens to the provided Spark + * driver, but the app will not get new tokens when those expire. + * + * It can also be used just to create delegation tokens, by calling the `obtainDelegationTokens` + * method. This option does not require calling the `start` method, but leaves it up to the + * caller to distribute the tokens that were generated. 
*/ private[spark] class HadoopDelegationTokenManager( -sparkConf: SparkConf, -hadoopConf: Configuration, -fileSystems: Configuration => Set[FileSystem]) - extends Logging { +protected val sparkConf: SparkConf, +protected val hadoopConf: Configuration) extends Logging { private val deprecatedProviderEnabledConfigs = List( "spark.yarn.security.tokens.%s.enabled", "spark.yarn.security.credentials.%s.enabled") private val providerEnabledConfig = "spark.security.credentials.%s.enabled" - // Maintain all the registered delegation token providers - private val delegationTokenProviders = getDelegationTokenProviders + private val principal = sparkConf.get(PRINCIPAL).orNull + private val keytab = sparkConf.get(KEYTAB).orNull + + if (principal != null) { +require(keytab != null, "Kerberos principal specified without a keytab.") +require(new File(keytab).isFile(), s"Cannot find keytab at $keytab.") + } + + private val delegationTokenProviders = loadProviders() logDebug("Using the following builtin delegation token providers: " +
[GitHub] spark pull request #22624: [SPARK-23781][CORE] Merge token renewer functiona...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22624#discussion_r228397209 --- Diff: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala --- @@ -17,76 +17,175 @@ package org.apache.spark.deploy.security +import java.io.File +import java.security.PrivilegedExceptionAction +import java.util.concurrent.{ScheduledExecutorService, TimeUnit} +import java.util.concurrent.atomic.AtomicReference + import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.FileSystem -import org.apache.hadoop.security.Credentials +import org.apache.hadoop.security.{Credentials, UserGroupInformation} import org.apache.spark.SparkConf +import org.apache.spark.deploy.SparkHadoopUtil import org.apache.spark.internal.Logging +import org.apache.spark.internal.config._ +import org.apache.spark.rpc.RpcEndpointRef +import org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens +import org.apache.spark.ui.UIUtils +import org.apache.spark.util.ThreadUtils /** - * Manages all the registered HadoopDelegationTokenProviders and offer APIs for other modules to - * obtain delegation tokens and their renewal time. By default [[HadoopFSDelegationTokenProvider]], - * [[HiveDelegationTokenProvider]] and [[HBaseDelegationTokenProvider]] will be loaded in if not - * explicitly disabled. + * Manager for delegation tokens in a Spark application. + * + * This manager has two modes of operation: + * + * 1. When configured with a principal and a keytab, it will make sure long-running apps can run + * without interruption while accessing secured services. It periodically logs in to the KDC with + * user-provided credentials, and contacts all the configured secure services to obtain delegation + * tokens to be distributed to the rest of the application. 
+ * + * Because the Hadoop UGI API does not expose the TTL of the TGT, a configuration controls how often + * to check that a relogin is necessary. This is done reasonably often since the check is a no-op + * when the relogin is not yet needed. The check period can be overridden in the configuration. * - * Also, each HadoopDelegationTokenProvider is controlled by - * spark.security.credentials.{service}.enabled, and will not be loaded if this config is set to - * false. For example, Hive's delegation token provider [[HiveDelegationTokenProvider]] can be - * enabled/disabled by the configuration spark.security.credentials.hive.enabled. + * New delegation tokens are created once 75% of the renewal interval of the original tokens has + * elapsed. The new tokens are sent to the Spark driver endpoint once it's registered with the AM. + * The driver is tasked with distributing the tokens to other processes that might need them. * - * @param sparkConf Spark configuration - * @param hadoopConf Hadoop configuration - * @param fileSystems Delegation tokens will be fetched for these Hadoop filesystems. + * 2. When operating without an explicit principal and keytab, token renewal will not be available. + * Starting the manager will distribute an initial set of delegation tokens to the provided Spark + * driver, but the app will not get new tokens when those expire. + * + * It can also be used just to create delegation tokens, by calling the `obtainDelegationTokens` + * method. This option does not require calling the `start` method, but leaves it up to the + * caller to distribute the tokens that were generated. 
*/ private[spark] class HadoopDelegationTokenManager( -sparkConf: SparkConf, -hadoopConf: Configuration, -fileSystems: Configuration => Set[FileSystem]) - extends Logging { +protected val sparkConf: SparkConf, +protected val hadoopConf: Configuration) extends Logging { private val deprecatedProviderEnabledConfigs = List( "spark.yarn.security.tokens.%s.enabled", "spark.yarn.security.credentials.%s.enabled") private val providerEnabledConfig = "spark.security.credentials.%s.enabled" - // Maintain all the registered delegation token providers - private val delegationTokenProviders = getDelegationTokenProviders + private val principal = sparkConf.get(PRINCIPAL).orNull + private val keytab = sparkConf.get(KEYTAB).orNull + + if (principal != null) { +require(keytab != null, "Kerberos principal specified without a keytab.") --- End diff -- what if the keytab is specified but not the principal? should this be the same check as in Client.scala?
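To make the question above concrete, one possible symmetric check is sketched below: it rejects a keytab without a principal as well as a principal without a keytab. This is only an assumption about what such a check could look like, not a quote from Client.scala; the object `KerberosConfigCheck`, the method `validate`, and the error message are hypothetical names chosen for illustration.

```scala
object KerberosConfigCheck {
  // Hypothetical helper: `principal` and `keytab` stand in for the optional
  // values read from the Spark configuration in the diff.
  def validate(principal: Option[String], keytab: Option[String]): Unit = {
    // Either both values are present or both are absent; anything else fails
    // with an IllegalArgumentException thrown by `require`.
    require(principal.isDefined == keytab.isDefined,
      "Both principal and keytab must be defined, or neither.")
  }
}
```

Compared with the one-sided `if (principal != null)` guard in the diff, this form also fails fast when only the keytab is set.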
[GitHub] spark pull request #22624: [SPARK-23781][CORE] Merge token renewer functiona...
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/22624#discussion_r228398489

--- Diff: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala ---
@@ -110,32 +209,105 @@ private[spark] class HadoopDelegationTokenManager(
   }
 
   /**
-   * Get delegation token provider for the specified service.
+   * List of file systems for which to obtain delegation tokens. The base implementation
+   * returns just the default file system in the given Hadoop configuration.
    */
-  def getServiceDelegationTokenProvider(service: String): Option[HadoopDelegationTokenProvider] = {
-    delegationTokenProviders.get(service)
+  protected def fileSystemsToAccess(): Set[FileSystem] = {
+    Set(FileSystem.get(hadoopConf))
+  }
+
+  private def scheduleRenewal(delay: Long): Unit = {
+    val _delay = math.max(0, delay)
+    logInfo(s"Scheduling login from keytab in ${UIUtils.formatDuration(delay)}.")
+
+    val renewalTask = new Runnable() {
+      override def run(): Unit = {
+        updateTokensTask()
+      }
+    }
+    renewalExecutor.schedule(renewalTask, _delay, TimeUnit.MILLISECONDS)
   }
 
   /**
-   * Writes delegation tokens to creds. Delegation tokens are fetched from all registered
-   * providers.
-   *
-   * @param hadoopConf hadoop Configuration
-   * @param creds Credentials that will be updated in place (overwritten)
-   * @return Time after which the fetched delegation tokens should be renewed.
+   * Periodic task to login to the KDC and create new delegation tokens. Re-schedules itself
+   * to fetch the next set of tokens when needed.
    */
-  def obtainDelegationTokens(
-      hadoopConf: Configuration,
-      creds: Credentials): Long = {
-    delegationTokenProviders.values.flatMap { provider =>
-      if (provider.delegationTokensRequired(sparkConf, hadoopConf)) {
-        provider.obtainDelegationTokens(hadoopConf, sparkConf, creds)
+  private def updateTokensTask(): Unit = {
+    try {
+      val freshUGI = doLogin()
+      val creds = obtainTokensAndScheduleRenewal(freshUGI)
+      val tokens = SparkHadoopUtil.get.serialize(creds)
+
+      val driver = driverRef.get()
+      if (driver != null) {
+        logInfo("Updating delegation tokens.")
+        driver.send(UpdateDelegationTokens(tokens))
       } else {
-        logDebug(s"Service ${provider.serviceName} does not require a token." +
-          s" Check your configuration to see if security is disabled or not.")
-        None
+        // This shouldn't really happen, since the driver should register way before tokens expire.
+        logWarning("Delegation tokens close to expiration but no driver has registered yet.")
+        SparkHadoopUtil.get.addDelegationTokens(tokens, sparkConf)
       }
-    }.foldLeft(Long.MaxValue)(math.min)
+    } catch {
+      case e: Exception =>
+        val delay = TimeUnit.SECONDS.toMillis(sparkConf.get(CREDENTIALS_RENEWAL_RETRY_WAIT))
+        logWarning(s"Failed to update tokens, will try again in ${UIUtils.formatDuration(delay)}!" +
+          " If this happens too often tasks will fail.", e)
+        scheduleRenewal(delay)
+    }
+  }
+
+  /**
+   * Obtain new delegation tokens from the available providers. Schedules a new task to fetch
+   * new tokens before the new set expires.
+   *
+   * @return Credentials containing the new tokens.
+   */
+  private def obtainTokensAndScheduleRenewal(ugi: UserGroupInformation): Credentials = {
+    ugi.doAs(new PrivilegedExceptionAction[Credentials]() {
+      override def run(): Credentials = {
+        val creds = new Credentials()
+        val nextRenewal = obtainDelegationTokens(creds)
+
+        // Calculate the time when new credentials should be created, based on the configured
+        // ratio.
+        val now = System.currentTimeMillis
+        val ratio = sparkConf.get(CREDENTIALS_RENEWAL_INTERVAL_RATIO)
+        val adjustedNextRenewal = (now + (ratio * (nextRenewal - now))).toLong
+
+        scheduleRenewal(adjustedNextRenewal - now)
--- End diff --

you're adding `now` and subtracting it off again, instead you could do

```scala
val adjustedRenewalDelay = (ratio * (nextRenewal - now)).toLong
scheduleRenewal(adjustedRenewalDelay)
```
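The simplification suggested in this review comment can be checked in isolation. The following is a minimal sketch, not code from the PR: `RenewalDelaySketch`, `delayViaAbsoluteTime`, and `delayDirect` are hypothetical names, and the ratio 0.75 and the 24-hour renewal window are assumed example values only. It computes the delay both ways; the two forms should agree to within a millisecond, with any difference coming only from floating-point truncation.

```scala
object RenewalDelaySketch {
  // Form used in the diff: build an absolute timestamp, then subtract `now` again.
  def delayViaAbsoluteTime(now: Long, nextRenewal: Long, ratio: Double): Long = {
    val adjustedNextRenewal = (now + (ratio * (nextRenewal - now))).toLong
    adjustedNextRenewal - now
  }

  // Form suggested in the review: scale the remaining interval directly.
  def delayDirect(now: Long, nextRenewal: Long, ratio: Double): Long =
    (ratio * (nextRenewal - now)).toLong

  def main(args: Array[String]): Unit = {
    val now = System.currentTimeMillis
    val nextRenewal = now + 24L * 60 * 60 * 1000 // assumption: renewal due in 24 hours
    val ratio = 0.75                              // assumption: example ratio value
    val viaAbsolute = delayViaAbsoluteTime(now, nextRenewal, ratio)
    val direct = delayDirect(now, nextRenewal, ratio)
    println(s"via absolute time: $viaAbsolute ms, direct: $direct ms")
  }
}
```

The direct form also avoids carrying a large epoch timestamp through a `Double` computation, which is where the rounding difference would otherwise come from.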
[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22845 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98071/ Test FAILed.
[GitHub] spark issue #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22844 Merged build finished. Test FAILed.
[GitHub] spark issue #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22844 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98072/ Test FAILed.
[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22845 **[Test build #98071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98071/testReport)** for PR 22845 at commit [`9ddb847`](https://github.com/apache/spark/commit/9ddb8476544fa34b15fbe15387e1b4983d4d76d4). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22844 **[Test build #98072 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98072/testReport)** for PR 22844 at commit [`937111f`](https://github.com/apache/spark/commit/937111f7f53744c8fe1a6b4fd0559643743eefae). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22845 Merged build finished. Test FAILed.
[GitHub] spark issue #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22844 **[Test build #98072 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98072/testReport)** for PR 22844 at commit [`937111f`](https://github.com/apache/spark/commit/937111f7f53744c8fe1a6b4fd0559643743eefae).
[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22845 **[Test build #98071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98071/testReport)** for PR 22845 at commit [`9ddb847`](https://github.com/apache/spark/commit/9ddb8476544fa34b15fbe15387e1b4983d4d76d4).
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/21588 Sounds like we should try this then
[GitHub] spark issue #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22844 ok to test
[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22845 ok to test
[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22845 Can one of the admins verify this patch?
[GitHub] spark issue #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use ...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/22844 cc @dongjoon-hyun, @wangyum
[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/22845 cc @dongjoon-hyun, @wangyum
[GitHub] spark pull request #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks t...
GitHub user heary-cao opened a pull request:

https://github.com/apache/spark/pull/22845

[SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use main method

## What changes were proposed in this pull request?

use spark-submit:
bin/spark-submit --class org.apache.spark.sql.execution.datasources.csv.CSVBenchmarks --jars ./core/target/spark-core_2.11-3.0.0-SNAPSHOT-tests.jar ./sql/catalyst/target/spark-sql_2.11-3.0.0-SNAPSHOT-tests.jar

Generate benchmark result:
SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.datasources.csv.CSVBenchmarks"

## How was this patch tested?

manual tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/heary-cao/spark CSVBenchmarks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22845.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #22845

commit 9ddb8476544fa34b15fbe15387e1b4983d4d76d4
Author: caoxuewen
Date: 2018-10-26T04:07:48Z

Refactor CSVBenchmarks to use main method
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22843 **[Test build #98070 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98070/testReport)** for PR 22843 at commit [`60c5808`](https://github.com/apache/spark/commit/60c5808ddd72f0f41cb33208268dfac3da5baa03).
[GitHub] spark issue #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22844 Can one of the admins verify this patch?
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22843 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4524/ Test PASSed.
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22843 Merged build finished. Test PASSed.
[GitHub] spark pull request #22844: [SPARK-25847][SQL][TEST] Refactor JSONBenchmarks ...
GitHub user heary-cao opened a pull request:

https://github.com/apache/spark/pull/22844

[SPARK-25847][SQL][TEST] Refactor JSONBenchmarks to use main method

## What changes were proposed in this pull request?

Refactor JSONBenchmarks to use main method

use spark-submit:
bin/spark-submit --class org.apache.spark.sql.execution.datasources.json.JSONBenchmarks --jars ./core/target/spark-core_2.11-3.0.0-SNAPSHOT-tests.jar ./sql/catalyst/target/spark-sql_2.11-3.0.0-SNAPSHOT-tests.jar

Generate benchmark result:
SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.datasources.json.JSONBenchmarks"

## How was this patch tested?

manual tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/heary-cao/spark JSONBenchmarks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22844.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #22844

commit 937111f7f53744c8fe1a6b4fd0559643743eefae
Author: caoxuewen
Date: 2018-10-26T03:52:31Z

Refactor JSONBenchmarks to use main method
[GitHub] spark issue #22840: [SPARK-25840][BUILD] `make-distribution.sh` should not f...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22840 Oops. My bad. I'll monitor the branch.
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22843 **[Test build #98069 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98069/testReport)** for PR 22843 at commit [`3864490`](https://github.com/apache/spark/commit/3864490f9bcb2f30e6508b4ae8a98f5faf910b47). * This patch **fails some tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22843 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98069/ Test FAILed.
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22843 Merged build finished. Test FAILed.
[GitHub] spark pull request #22823: [SPARK-25676][SQL][TEST] Improve BenchmarkWideTab...
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22823#discussion_r228400706

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/BenchmarkWideTable.scala ---
@@ -1,52 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may obtain a copy of the License at
- *
- *    http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.execution.benchmark
-
-import org.apache.spark.benchmark.Benchmark
-
-/**
- * Benchmark to measure performance for wide table.
- * To run this:
- *  build/sbt "sql/test-only *benchmark.BenchmarkWideTable"
- *
- * Benchmarks in this file are skipped in normal builds.
- */
-class BenchmarkWideTable extends BenchmarkWithCodegen {
-
-  ignore("project on wide table") {
-    val N = 1 << 20
-    val df = sparkSession.range(N)
-    val columns = (0 until 400).map{ i => s"id as id$i"}
-    val benchmark = new Benchmark("projection on wide table", N)
-    benchmark.addCase("wide table", numIters = 5) { iter =>
-      df.selectExpr(columns : _*).queryExecution.toRdd.count()
-    }
-    benchmark.run()
-
-    /**
-     * Here are some numbers with different split threshold:
-     *
-     *  Split threshold      methods       Rate(M/s)   Per Row(ns)
-     *  10                   400           0.4         2279
-     *  100                  200           0.6         1554
-     *  1k                   37            0.9         1116
--- End diff --

Hi, @davies, @cloud-fan, and @kiszk.

This benchmark was added in [Spark 2.1.0](https://github.com/apache/spark/commit/8d35a6f68d6d733212674491cbf31bed73fada0f#diff-71964129f49db97eb030a6d7320af314). The value `1k` was determined by **manually** changing the split threshold. This PR wants to [add a configuration in CodeGenerator.scala](https://github.com/apache/spark/pull/22823/files#diff-8bcc5aea39c73d4bf38aef6f6951d42cR914) for testing purposes only.

1. Is the configuration helpful for general purposes?
2. If so, can we make another PR for that first?
3. If not, is it allowed to add this testing parameter?
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22843 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4523/ Test PASSed.
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22843 **[Test build #98069 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98069/testReport)** for PR 22843 at commit [`3864490`](https://github.com/apache/spark/commit/3864490f9bcb2f30e6508b4ae8a98f5faf910b47).
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22843 Merged build finished. Test PASSed.
[GitHub] spark pull request #22843: [SPARK-16693][SPARKR] Remove methods deprecated
GitHub user felixcheung opened a pull request:

https://github.com/apache/spark/pull/22843

[SPARK-16693][SPARKR] Remove methods deprecated

## What changes were proposed in this pull request?

Remove deprecated functions, which include:
SQLContext/HiveContext stuff
sparkR.init
jsonFile
parquetFile
registerTempTable
saveAsParquetFile
unionAll
createExternalTable
dropTempTable

## How was this patch tested?

jenkins

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/felixcheung/spark rrddapi

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22843.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #22843

commit 3864490f9bcb2f30e6508b4ae8a98f5faf910b47
Author: Felix Cheung
Date: 2018-10-26T04:04:58Z

remove deprecated
[GitHub] spark pull request #22823: [SPARK-25676][SQL][TEST] Improve BenchmarkWideTab...
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22823#discussion_r228399871

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -910,12 +910,14 @@ class CodegenContext {
     val blocks = new ArrayBuffer[String]()
     val blockBuilder = new StringBuilder()
     var length = 0
+    val splitThreshold =
+      SQLConf.get.getConfString("spark.testing.codegen.splitThreshold", "1024").toInt
--- End diff --

In this case, we need advice from the right person. :)
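To make the role of the threshold in this diff concrete, here is a rough sketch in plain Scala. It is not Spark's actual `CodegenContext` code: it only groups code snippets into blocks and starts a new block once the accumulated length exceeds the threshold. The object name `SplitSketch`, the function `buildBlocks`, and the use of a JVM system property in place of `SQLConf` are assumptions made for illustration; only the property name and the default of 1024 come from the diff.

```scala
import scala.collection.mutable.ArrayBuffer

object SplitSketch {
  // Hypothetical helper: read the threshold from a JVM system property,
  // falling back to the 1024 default shown in the diff.
  def splitThreshold: Int =
    sys.props.getOrElse("spark.testing.codegen.splitThreshold", "1024").toInt

  // Group snippets into blocks; a new block starts once the running length
  // of the current block exceeds the threshold.
  def buildBlocks(snippets: Seq[String], threshold: Int): Seq[String] = {
    val blocks = new ArrayBuffer[String]()
    val blockBuilder = new StringBuilder()
    var length = 0
    for (code <- snippets) {
      if (length > threshold) {
        blocks += blockBuilder.toString()
        blockBuilder.clear()
        length = 0
      }
      blockBuilder.append(code)
      length += code.length
    }
    // Flush whatever is left in the builder as the final block.
    if (blockBuilder.nonEmpty) blocks += blockBuilder.toString()
    blocks.toSeq
  }

  def main(args: Array[String]): Unit = {
    val snippets = Seq.fill(10)("x = x + 1;\n")
    println(buildBlocks(snippets, threshold = 20).length)
    println(buildBlocks(snippets, threshold = splitThreshold).length)
  }
}
```

A smaller threshold yields more, shorter blocks, which is the trade-off the benchmark table in the related comment on this PR measures as the number of generated methods.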
[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22666 **[Test build #98068 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98068/testReport)** for PR 22666 at commit [`d876b92`](https://github.com/apache/spark/commit/d876b9270afa9b30defea6d4621bcc63dc61f3e0).
[GitHub] spark issue #22840: [SPARK-25840][BUILD] `make-distribution.sh` should not f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22840 Merged build finished. Test FAILed.
[GitHub] spark issue #22840: [SPARK-25840][BUILD] `make-distribution.sh` should not f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22840 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98051/ Test FAILed.
[GitHub] spark issue #22840: [SPARK-25840][BUILD] `make-distribution.sh` should not f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22840 **[Test build #98051 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98051/testReport)** for PR 22840 at commit [`0950bb8`](https://github.com/apache/spark/commit/0950bb86ac028655c665687398c7dcfce1853f04). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22775 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4522/ Test PASSed.
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22775 Merged build finished. Test PASSed.
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22775 **[Test build #98067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98067/testReport)** for PR 22775 at commit [`03f34d9`](https://github.com/apache/spark/commit/03f34d9d86a8087c3de4d5580e2ddf9fba8a8407).
[GitHub] spark pull request #22814: [SPARK-25819][SQL] Support parse mode option for ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22814
[GitHub] spark issue #22815: [SPARK-25821][SQL] Remove SQLContext methods deprecated ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22815 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4521/ Test PASSed.
[GitHub] spark issue #22815: [SPARK-25821][SQL] Remove SQLContext methods deprecated ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22815 Merged build finished. Test PASSed.
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22775 Merged build finished. Test PASSed.
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22775 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4520/ Test PASSed.
[GitHub] spark issue #22820: [SPARK-25828][K8S] Bumping Kubernetes-Client version to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22820 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4519/ Test PASSed.
[GitHub] spark issue #22820: [SPARK-25828][K8S] Bumping Kubernetes-Client version to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22820 Merged build finished. Test PASSed.
[GitHub] spark issue #22820: [SPARK-25828][K8S] Bumping Kubernetes-Client version to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22820 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4519/
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/21588 So, let's say we decide to only support Hive 2.3.x+, as a precursor to this. We could already eliminate a lot of the Hive tests, right? That might be useful in its own right, as they take time and are a little flaky.
[GitHub] spark pull request #22837: [MINOR][TEST][BRANCH-2.4] Regenerate golden file ...
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/22837
[GitHub] spark issue #22814: [SPARK-25819][SQL] Support parse mode option for the fun...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22814 Merged to master
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21588 Yup, it supports Hadoop 3, and has the other fixes that @wangyum mentioned.
[GitHub] spark issue #22837: [MINOR][TEST][BRANCH-2.4] Regenerate golden file `dateti...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22837 Merged to `branch-2.4`.
[GitHub] spark issue #22837: [MINOR][TEST][BRANCH-2.4] Regenerate golden file `dateti...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22837 Thank you for review and approval, @HyukjinKwon !
[GitHub] spark issue #22815: [SPARK-25821][SQL] Remove SQLContext methods deprecated ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22815 **[Test build #98065 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98065/testReport)** for PR 22815 at commit [`8199362`](https://github.com/apache/spark/commit/81993625218818a2b9444e5ba11588713eda557f).
[GitHub] spark issue #22775: [SPARK-24709][SQL][FOLLOW-UP] Make schema_of_json's inpu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22775 **[Test build #98066 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98066/testReport)** for PR 22775 at commit [`e2ca651`](https://github.com/apache/spark/commit/e2ca6517098adc093f957a6158ed760fb0826f4d).
[GitHub] spark pull request #22815: [SPARK-25821][SQL] Remove SQLContext methods depr...
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/22815#discussion_r228396856

--- Diff: R/pkg/R/SQLContext.R ---
@@ -434,6 +388,7 @@ read.orc <- function(path, ...) {
 #' Loads a Parquet file, returning the result as a SparkDataFrame.
 #'
 #' @param path path of file to read. A vector of multiple paths is allowed.
+#' @param ... additional external data source specific named properties.
--- End diff --

Oops, I missed that, sorry. I'll incorporate both changes.
[GitHub] spark pull request #22840: [SPARK-25840][BUILD] `make-distribution.sh` shoul...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22840
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/21588 does Apache Hive 2.3.2 have all the fixes we need?
[GitHub] spark issue #22837: [MINOR][TEST][BRANCH-2.4] Regenerate golden file `dateti...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22837 Could you review this, @HyukjinKwon ?
[GitHub] spark issue #22820: [SPARK-25828][K8S] Bumping Kubernetes-Client version to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22820 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4519/
[GitHub] spark issue #22840: [SPARK-25840][BUILD] `make-distribution.sh` should not f...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22840 Thank you, @srowen and @HyukjinKwon . Merged to master/branch-2.4.
[GitHub] spark issue #22814: [SPARK-25819][SQL] Support parse mode option for the fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22814 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98063/ Test PASSed.
[GitHub] spark issue #22814: [SPARK-25819][SQL] Support parse mode option for the fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22814 Merged build finished. Test PASSed.
[GitHub] spark issue #22814: [SPARK-25819][SQL] Support parse mode option for the fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22814 **[Test build #98063 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98063/testReport)** for PR 22814 at commit [`b33a5ad`](https://github.com/apache/spark/commit/b33a5ade4b3f091d5e67d3f3bdc47e87f9b37eee). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22841: [SPARK-25842][SQL] Deprecate rangeBetween APIs introduce...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22841 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98059/ Test PASSed.
[GitHub] spark issue #22820: [SPARK-25828][K8S] Bumping Kubernetes-Client version to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22820 **[Test build #98064 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98064/testReport)** for PR 22820 at commit [`728d70a`](https://github.com/apache/spark/commit/728d70af1d0917745879362abef2209a760d4f22).
[GitHub] spark issue #22841: [SPARK-25842][SQL] Deprecate rangeBetween APIs introduce...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22841 Merged build finished. Test PASSed.
[GitHub] spark issue #22841: [SPARK-25842][SQL] Deprecate rangeBetween APIs introduce...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22841 **[Test build #98059 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98059/testReport)** for PR 22841 at commit [`0a49c85`](https://github.com/apache/spark/commit/0a49c859049a376872053dcfaacba81d47070d77). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22820: [SPARK-25828][K8S] Bumping Kubernetes-Client version to ...
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/22820 retest this please
[GitHub] spark issue #22838: [SPARK-25835][K8s] Create kubernetes-tests profile and u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22838 **[Test build #4397 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4397/testReport)** for PR 22838 at commit [`9b8f6b4`](https://github.com/apache/spark/commit/9b8f6b41cdb0e3139d76a0cfd281c094bcc91469). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22841: [SPARK-25842][SQL] Deprecate rangeBetween APIs introduce...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22841 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98060/ Test PASSed.
[GitHub] spark pull request #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for class...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22790
[GitHub] spark issue #22841: [SPARK-25842][SQL] Deprecate rangeBetween APIs introduce...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22841 Merged build finished. Test PASSed.
[GitHub] spark issue #22841: [SPARK-25842][SQL] Deprecate rangeBetween APIs introduce...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22841 **[Test build #98060 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98060/testReport)** for PR 22841 at commit [`45ef16b`](https://github.com/apache/spark/commit/45ef16bac363979d1626824673a41360a3c9648a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.