[GitHub] spark issue #21150: [SPARK-24075][MESOS] Option to limit number of retries f...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/21150

ok to test
[GitHub] spark pull request #21150: [SPARK-24075][MESOS] Option to limit number of re...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/21150#discussion_r227028091

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -728,6 +729,28 @@ private[spark] class MesosClusterScheduler(
     state == MesosTaskState.TASK_LOST
   }
 
+  /**
+   * Check if the driver has exceeded the number of retries.
+   * When "spark.mesos.driver.supervise.maxRetries" is not set,
+   * the default behavior is to retry indefinitely.
+   *
+   * @param retryState Retry state of the driver
+   * @param conf Spark configuration to check for "spark.mesos.driver.supervise.maxRetries"
+   * @return true if the driver has reached the retry limit,
+   *         false if the driver can be retried
+   */
+  private[scheduler] def hasDriverExceededRetries(retryState: Option[MesosClusterRetryState],
--- End diff --

Please fix the param style:

    hasDriverExceededRetries(
        retryState: Option[MesosClusterRetryState],
        conf: ...)
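For reference, a minimal sketch of the helper with the requested parameter style. The `retries` field on MesosClusterRetryState and the exact config lookup are assumptions here, not taken from the PR:

    private[scheduler] def hasDriverExceededRetries(
        retryState: Option[MesosClusterRetryState],
        conf: SparkConf): Boolean = {
      // When spark.mesos.driver.supervise.maxRetries is unset, retry indefinitely.
      conf.getOption("spark.mesos.driver.supervise.maxRetries") match {
        case Some(max) => retryState.exists(_.retries >= max.toInt)
        case None => false
      }
    }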
[GitHub] spark issue #22146: [SPARK-24434][K8S] pod template files
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/22146

The PR works for me now as well for adding volumes to executors.
[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/22146#discussion_r214205873

--- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala ---
@@ -59,5 +66,28 @@ private[spark] object KubernetesUtils {
     }
   }
 
+  def loadPodFromTemplate(
+      kubernetesClient: KubernetesClient,
+      templateFile: File): SparkPod = {
+    try {
+      val pod = kubernetesClient.pods().load(templateFile).get()
+      pod.getSpec.getContainers.asScala.toList match {
+        case first :: rest => SparkPod(
+          new PodBuilder(pod)
+            .editSpec()
+              .withContainers(rest.asJava)
+            .endSpec()
+            .build(),
+          first)
+        case Nil => SparkPod(pod, new ContainerBuilder().build())
+      }
+    } catch {
+      case e: Exception =>
+        logError(
+          s"Encountered exception while attempting to load initial pod spec from file", e)
+        throw new SparkException("Could not load driver pod from template file.", e)
--- End diff --

This error message is misleading: this same path throws when either the executor or the driver pod fails to load from its own template. Either remove "driver" or be specific about whether it was the executor or the driver template that failed.
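A sketch of one possible fix, not the PR's eventual change: the extra podDescription parameter is an assumption for illustration, threaded into both messages so the failure names the right template.

    def loadPodFromTemplate(
        kubernetesClient: KubernetesClient,
        templateFile: File,
        podDescription: String): SparkPod = {
      try {
        val pod = kubernetesClient.pods().load(templateFile).get()
        // Treat the first container in the template as the Spark container,
        // exactly as in the diff above.
        pod.getSpec.getContainers.asScala.toList match {
          case first :: rest => SparkPod(
            new PodBuilder(pod)
              .editSpec()
                .withContainers(rest.asJava)
              .endSpec()
              .build(),
            first)
          case Nil => SparkPod(pod, new ContainerBuilder().build())
        }
      } catch {
        case e: Exception =>
          logError(s"Encountered exception while attempting to load initial " +
            s"$podDescription pod spec from file", e)
          throw new SparkException(
            s"Could not load $podDescription pod from template file.", e)
      }
    }

Call sites would then pass "driver" or "executor" accordingly.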
[GitHub] spark issue #22146: [SPARK-24434][K8S] pod template files
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/22146

I've been looking to mount additional volumes into the executor containers and just tried out the PR. It doesn't seem possible, because if you add the container in the pod template, BasicExecutorFeatureStep still adds another executor container and the result is an invalid pod spec. I think it's worth considering whether we want pod templates to be able to modify or replace the elements the code adds to the pod spec, or whether this is only for adding attributes beyond what the code already sets. The latter seems simplest, but I'm just throwing it out there that, for features like adding volumes to executors, the latter won't work.
[GitHub] spark issue #22071: [SPARK-25088][CORE][MESOS][DOCS] Update Rest Server docs...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/22071

LGTM as well
[GitHub] spark pull request #22071: [SPARK-25088][CORE][MESOS][DOCS] Update Rest Serv...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/22071#discussion_r209714362

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcher.scala ---
@@ -51,6 +51,13 @@ private[mesos] class MesosClusterDispatcher(
     conf: SparkConf)
   extends Logging {
 
+  {
+    val authKey = SecurityManager.SPARK_AUTH_SECRET_CONF
--- End diff --

Got it. My reasoning is that it could be harder for someone looking at this code to figure out why this is not allowed, since we don't mention the REST server, which is really the component requiring security to be turned off. Another benefit of having the check in MesosRestServer is that the MesosClusterDispatcher framework could technically be decoupled from MesosRestServer and receive requests some other way. So to increase flexibility, and to avoid someone forgetting why the check is here, my suggestion is to move it closer to where it's actually required; that should make this easier to maintain.
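A minimal sketch of the suggested relocation, assuming the same guard condition; the method name, placement, and message are illustrative:

    // In MesosRestServer rather than MesosClusterDispatcher: fail fast when a
    // shared secret is configured, since the REST endpoint cannot authenticate clients.
    private def checkAuthDisabled(conf: SparkConf): Unit = {
      val authKey = SecurityManager.SPARK_AUTH_SECRET_CONF
      require(conf.getOption(authKey).isEmpty,
        s"The REST submission server does not support authentication via $authKey.")
    }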
[GitHub] spark pull request #22071: [SPARK-25088][CORE][MESOS][DOCS] Update Rest Serv...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/22071#discussion_r209464703

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/MesosClusterDispatcher.scala ---
@@ -51,6 +51,13 @@ private[mesos] class MesosClusterDispatcher(
     conf: SparkConf)
   extends Logging {
 
+  {
+    val authKey = SecurityManager.SPARK_AUTH_SECRET_CONF
--- End diff --

I think it might be better to place this in the MesosRestServer code, since it's not really about the framework (MesosClusterDispatcher) but about the REST server receiving requests.
[GitHub] spark pull request #21027: [SPARK-23943][MESOS][DEPLOY] Improve observabilit...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/21027#discussion_r208468846

--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala ---
@@ -63,6 +63,8 @@ private[spark] abstract class RestSubmissionServer(
     s"$baseContext/create/*" -> submitRequestServlet,
     s"$baseContext/kill/*" -> killRequestServlet,
     s"$baseContext/status/*" -> statusRequestServlet,
+    "/health" -> new ServerStatusServlet(this),
+    "/status" -> new ServerStatusServlet(this),
--- End diff --

Also, who is the intended consumer of this information?
[GitHub] spark pull request #21027: [SPARK-23943][MESOS][DEPLOY] Improve observabilit...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/21027#discussion_r208468697

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala ---
@@ -50,6 +50,24 @@ private[spark] class MesosRestServer(
     new MesosKillRequestServlet(scheduler, masterConf)
   protected override val statusRequestServlet =
     new MesosStatusRequestServlet(scheduler, masterConf)
+
+  override def isServerHealthy(): Boolean = !scheduler.isSchedulerDriverStopped()
+
+  override def serverStatus(): ServerStatusResponse = {
+    val s = new ServerStatusResponse
+    s.schedulerDriverStopped = scheduler.isSchedulerDriverStopped()
+    s.queuedDrivers = scheduler.getQueuedDriversSize
+    s.launchedDrivers = scheduler.getLaunchedDriversSize
+    s.pendingRetryDrivers = scheduler.getPendingRetryDriversSize
+    s.success = true
+    s.message = "iamok"
--- End diff --

How about leaving this blank?
[GitHub] spark pull request #21027: [SPARK-23943][MESOS][DEPLOY] Improve observabilit...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/21027#discussion_r208468594

--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala ---
@@ -331,3 +345,15 @@ private class ErrorServlet extends RestServlet {
     sendResponse(error, response)
   }
 }
+
+private class ServerStatusServlet(server: RestSubmissionServer) extends RestServlet {
+  override def doGet(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+    val path = req.getRequestURI
+    if (!server.isServerHealthy() && path == "/health") {
--- End diff --

I would switch the order (check the path first). Also, with this logic, if the server is healthy and the request is for /health, won't it return the status instead?
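A sketch of the reordered logic. The status codes and the fall-through handling are assumptions for illustration, and it assumes ServerStatusResponse is a protocol response that the existing sendResponse helper can serialize:

    override def doGet(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
      req.getRequestURI match {
        case "/health" =>
          // /health always answers the health question, healthy or not,
          // instead of falling through to the status payload.
          resp.setStatus(
            if (server.isServerHealthy()) HttpServletResponse.SC_OK
            else HttpServletResponse.SC_SERVICE_UNAVAILABLE)
        case "/status" =>
          sendResponse(server.serverStatus(), resp)
        case _ =>
          resp.setStatus(HttpServletResponse.SC_NOT_FOUND)
      }
    }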
[GitHub] spark pull request #21027: [SPARK-23943][MESOS][DEPLOY] Improve observabilit...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/21027#discussion_r208468305

--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala ---
@@ -63,6 +63,8 @@ private[spark] abstract class RestSubmissionServer(
     s"$baseContext/create/*" -> submitRequestServlet,
     s"$baseContext/kill/*" -> killRequestServlet,
     s"$baseContext/status/*" -> statusRequestServlet,
+    "/health" -> new ServerStatusServlet(this),
--- End diff --

This impacts the REST submission server in general, too. I do like the idea of providing an endpoint to get status, but I'm not sure this is a paradigm Spark is going for; the common pattern is to poll Spark metrics to understand the status of the components. @felixcheung do you have thoughts on this?
[GitHub] spark pull request #21027: [SPARK-23943][MESOS][DEPLOY] Improve observabilit...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/21027#discussion_r208467704

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala ---
@@ -160,7 +161,10 @@ trait MesosSchedulerUtils extends Logging {
         logError("driver.run() failed", e)
         error = Some(e)
         markErr()
-      }
+      } finally {
+        logWarning("schedulerDriver stopped")
+        schedulerDriverStopped.set(true)
+    }
--- End diff --

Fix indent
[GitHub] spark issue #21006: [SPARK-22256][MESOS] - Introduce spark.mesos.driver.memo...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/21006

Besides the test comment, everything else LGTM
[GitHub] spark pull request #21006: [SPARK-22256][MESOS] - Introduce spark.mesos.driv...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/21006#discussion_r207060617

--- Diff: resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSuite.scala ---
@@ -199,6 +200,33 @@ class MesosClusterSchedulerSuite extends SparkFunSuite with LocalSparkContext wi
     })
   }
 
+  test("supports spark.mesos.driver.memoryOverhead") {
+    setScheduler()
+
+    val mem = 1000
+    val cpu = 1
+
+    val response = scheduler.submitDriver(
+      new MesosDriverDescription("d1", "jar", mem, cpu, true,
+        command,
+        Map("spark.mesos.executor.home" -> "test",
+          "spark.app.name" -> "test"),
+        "s1",
+        new Date()))
+    assert(response.success)
+
+    val offer = Utils.createOffer("o1", "s1", mem * 2, cpu)
+    scheduler.resourceOffers(driver, List(offer).asJava)
+    val tasks = Utils.verifyTaskLaunched(driver, "o1")
+    // 1384.0
+    val taskMem = tasks.head.getResourcesList
+      .asScala
+      .filter(_.getName.equals("mem"))
+      .map(_.getScalar.getValue)
+      .head
+    assert(1384.0 === taskMem)
--- End diff --

Can we test the 10% case as well?
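For illustration, a sketch of that additional test, assuming the driver overhead follows the usual max(10% of memory, 384 MB) rule; the expected value 11000 is derived from that assumption with mem = 10000, where the 10% term (1000 MB) wins over the floor:

    test("supports spark.mesos.driver.memoryOverhead with the 10% default") {
      setScheduler()

      // With 10000 MB of driver memory, the 10% default overhead (1000 MB)
      // exceeds the 384 MB floor, so the launched task should request 11000 MB.
      val mem = 10000
      val cpu = 1

      val response = scheduler.submitDriver(
        new MesosDriverDescription("d1", "jar", mem, cpu, true,
          command,
          Map("spark.mesos.executor.home" -> "test",
            "spark.app.name" -> "test"),
          "s1",
          new Date()))
      assert(response.success)

      val offer = Utils.createOffer("o1", "s1", mem * 2, cpu)
      scheduler.resourceOffers(driver, List(offer).asJava)
      val tasks = Utils.verifyTaskLaunched(driver, "o1")
      val taskMem = tasks.head.getResourcesList
        .asScala
        .filter(_.getName.equals("mem"))
        .map(_.getScalar.getValue)
        .head
      assert(11000.0 === taskMem)
    }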
[GitHub] spark issue #21006: [SPARK-22256][MESOS] - Introduce spark.mesos.driver.memo...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/21006

jenkins ok to test
[GitHub] spark pull request #20451: [SPARK-23146][WIP] Support client mode for Kubern...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/20451#discussion_r200020286

--- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterManager.scala ---
@@ -18,43 +18,68 @@ package org.apache.spark.scheduler.cluster.k8s
 
 import java.io.File
 
-import io.fabric8.kubernetes.client.Config
+import io.fabric8.kubernetes.client.{Config, KubernetesClient}
 
-import org.apache.spark.{SparkContext, SparkException}
+import org.apache.spark.{SparkContext, SparkConf}
 import org.apache.spark.deploy.k8s.{KubernetesUtils, SparkKubernetesClientFactory}
 import org.apache.spark.deploy.k8s.Config._
 import org.apache.spark.deploy.k8s.Constants._
 import org.apache.spark.internal.Logging
 import org.apache.spark.scheduler.{ExternalClusterManager, SchedulerBackend, TaskScheduler, TaskSchedulerImpl}
 import org.apache.spark.util.ThreadUtils
 
-private[spark] class KubernetesClusterManager extends ExternalClusterManager with Logging {
+trait ManagerSpecificHandlers {
+  def createKubernetesClient(sparkConf: SparkConf): KubernetesClient
+}
 
-  override def canCreate(masterURL: String): Boolean = masterURL.startsWith("k8s")
+private[spark] class KubernetesClusterManager extends ExternalClusterManager
+  with ManagerSpecificHandlers with Logging {
 
-  override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler = {
-    if (masterURL.startsWith("k8s") &&
-      sc.deployMode == "client" &&
-      !sc.conf.get(KUBERNETES_DRIVER_SUBMIT_CHECK).getOrElse(false)) {
-      throw new SparkException("Client mode is currently not supported for Kubernetes.")
+  class InClusterHandlers extends ManagerSpecificHandlers {
+    override def createKubernetesClient(sparkConf: SparkConf): KubernetesClient =
+      SparkKubernetesClientFactory.createKubernetesClient(
+        KUBERNETES_MASTER_INTERNAL_URL,
+        Some(sparkConf.get(KUBERNETES_NAMESPACE)),
+        APISERVER_AUTH_DRIVER_MOUNTED_CONF_PREFIX,
--- End diff --

Why do we need a separate conf prefix as well?
[GitHub] spark pull request #20451: [SPARK-23146][WIP] Support client mode for Kubern...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/20451#discussion_r199340795

--- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala ---
@@ -140,13 +140,6 @@ private[spark] class Client(
       throw e
     }
 
-    if (waitForAppCompletion) {
--- End diff --

Why is this not needed anymore? In cluster mode we still want the same behavior defined here, right?
[GitHub] spark pull request #20451: [SPARK-23146][WIP] Support client mode for Kubern...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/20451#discussion_r199340757

--- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala ---
@@ -88,6 +103,56 @@ private[spark] object SparkKubernetesClientFactory {
     new DefaultKubernetesClient(httpClientWithCustomDispatcher, config)
   }
 
+  def createOutClusterKubernetesClient(
+    master: String,
--- End diff --

Fix indent
[GitHub] spark pull request #20451: [SPARK-23146][WIP] Support client mode for Kubern...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/20451#discussion_r199340748

--- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala ---
@@ -88,6 +103,56 @@ private[spark] object SparkKubernetesClientFactory {
     new DefaultKubernetesClient(httpClientWithCustomDispatcher, config)
   }
 
+  def createOutClusterKubernetesClient(
+    master: String,
+    namespace: Option[String],
+    kubernetesAuthConfPrefix: String,
+    sparkConf: SparkConf,
+    maybeServiceAccountToken: Option[File],
+    maybeServiceAccountCaCert: Option[File]): KubernetesClient = {
+    val oauthTokenFileConf = s"$kubernetesAuthConfPrefix.$OAUTH_TOKEN_FILE_CONF_SUFFIX"
+    val oauthTokenConf = s"$kubernetesAuthConfPrefix.$OAUTH_TOKEN_CONF_SUFFIX"
+    val oauthTokenFile = sparkConf.getOption(oauthTokenFileConf)
+      .map(new File(_))
+      .orElse(maybeServiceAccountToken)
+    val oauthTokenValue = sparkConf.getOption(oauthTokenConf)
+    OptionRequirements.requireNandDefined(
--- End diff --

Since it's only used once, I'm not sure it warrants a separate file/method for checking Options. Also, it's not clear from the method signature what it does (especially the "requireNand" part). How about just the simple match that @squito suggested here?
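For illustration, the inline match might look something like this (a sketch of the suggestion, not @squito's exact code; the error message is invented here):

    // Reject configuring both the token file and the literal token value.
    (oauthTokenFile, oauthTokenValue) match {
      case (Some(_), Some(_)) =>
        throw new SparkException(
          s"Cannot specify an OAuth token through both a file ($oauthTokenFileConf) " +
            s"and a value ($oauthTokenConf).")
      case _ => // at most one is set, which is fine
    }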
[GitHub] spark issue #21033: [SPARK-19320][MESOS][WIP]allow specifying a hard limit o...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/21033

LGTM
[GitHub] spark pull request #21033: [SPARK-19320][MESOS][WIP]allow specifying a hard ...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/21033#discussion_r181199842

--- Diff: resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackendSuite.scala ---
@@ -165,18 +165,47 @@ class MesosCoarseGrainedSchedulerBackendSuite extends SparkFunSuite
   }
 
-  test("mesos does not acquire more than spark.mesos.gpus.max") {
-    val maxGpus = 5
-    setBackend(Map("spark.mesos.gpus.max" -> maxGpus.toString))
+  test("mesos acquires spark.mesos.executor.gpus number of gpus per executor") {
+    setBackend(Map("spark.mesos.gpus.max" -> "5",
+      "spark.mesos.executor.gpus" -> "2"))
 
     val executorMemory = backend.executorMemory(sc)
-    offerResources(List(Resources(executorMemory, 1, maxGpus + 1)))
+    offerResources(List(Resources(executorMemory, 1, 5)))
 
     val taskInfos = verifyTaskLaunched(driver, "o1")
     assert(taskInfos.length == 1)
 
     val gpus = backend.getResource(taskInfos.head.getResourcesList, "gpus")
-    assert(gpus == maxGpus)
+    assert(gpus == 2)
+  }
+
+  test("mesos declines offers that cannot satisfy spark.mesos.executor.gpus") {
+    setBackend(Map("spark.mesos.gpus.max" -> "5",
--- End diff --

I think it's worth testing setting the max to less than the number of executor GPUs as well.
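A sketch of that extra case. The expectation that the offer is declined, and the verifyDeclinedOffer/createOfferId helpers, are assumptions based on the decline test referenced in the diff:

    test("mesos declines offers when spark.mesos.gpus.max is below spark.mesos.executor.gpus") {
      setBackend(Map("spark.mesos.gpus.max" -> "1",
        "spark.mesos.executor.gpus" -> "2"))

      val executorMemory = backend.executorMemory(sc)
      // The offer has enough GPUs for one executor, but launching it would
      // exceed the overall cap, so no task should be launched.
      offerResources(List(Resources(executorMemory, 1, 5)))
      verifyDeclinedOffer(driver, createOfferId("o1"))
    }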
[GitHub] spark issue #17714: [SPARK-20428][Core]REST interface about 'v1/submissions/...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/17714

Unfortunately I'm not a committer, so we need to loop in someone who is to help merge it. @srowen do you know who's responsible for the general deploy package?
[GitHub] spark pull request #17714: [SPARK-20428][Core]REST interface about 'v1/submi...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/17714#discussion_r114089418

--- Diff: core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala ---
@@ -214,15 +214,15 @@ private[rest] abstract class KillRequestServlet extends RestServlet {
   protected override def doPost(
       request: HttpServletRequest,
       response: HttpServletResponse): Unit = {
-    val submissionId = parseSubmissionId(request.getPathInfo)
-    val responseMessage = submissionId.map(handleKill).getOrElse {
+    val submissionIds = parseSubmissionId(request.getPathInfo)
--- End diff --

I don't think parsing submission IDs out of the request path is a good idea. I would assume most use cases for batch delete arise when you have a large number of drivers to delete (otherwise you would be fine deleting a few one by one), but most URLs are length-limited. You might be better off creating a new request type for this that takes the IDs in a body.
[GitHub] spark pull request #17109: [SPARK-19740][MESOS]Add support in Spark to pass ...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/17109#discussion_r111670608

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -99,6 +99,26 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
       .toList
   }
 
+  /**
+   * Parse a list of docker parameters, each of which
+   * takes the form key=value
+   */
+  private def parseParamsSpec(params: String): List[Parameter] = {
+    params.split(",").map(_.split("=")).flatMap { spec: Array[String] =>
--- End diff --

I see, so we should split with a limit instead. @yanji84 can you fix this?
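Concretely, limiting the split to two pieces means only the first '=' separates key from value (a minimal sketch; the rest of the parsing is unchanged):

    // Before: "key=a=b".split("=") yields Array("key", "a", "b"), which fails to parse.
    // After: a limit of 2 yields Array("key", "a=b").
    val pairs: Array[Array[String]] = params.split(",").map(_.split("=", 2))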
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/17109

@srowen Appreciate the help you're doing. I think we're doing what we can to help review these patches and make sure Mesos support is still being maintained and improved over time. If you trust our judgement, and trust that we're still around to fix issues when they arise, then we really just need someone like you to help merge patches. Making sure that someone who has been contributing to this area can become a committer is an ongoing problem that we're still hoping can one day be addressed. Another parallel effort I think is very worth investigating is to decouple the cluster manager integration from Spark, which I believe is becoming more relevant now that we have more integrations coming. Long story short: if you can still help in the meantime, it will be greatly appreciated, so improvements to the Mesos integration can keep happening.
[GitHub] spark pull request #17109: [SPARK-19740][MESOS]Add support in Spark to pass ...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/17109#discussion_r106053115

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala ---
@@ -99,6 +99,26 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging {
       .toList
   }
 
+  /**
+   * Parse a list of docker parameters, each of which
+   * takes the form key=value
+   */
+  private def parseParamsSpec(params: String): List[Parameter] = {
+    params.split(",").map(_.split("=")).flatMap { spec: Array[String] =>
--- End diff --

Hmm, if a value contains an '=', we will get a parsing error.
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/17109

@srowen @mgummelt PTAL
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/17109

Hey, sorry for the late response. The code looks good to me; however, we need to add documentation for the new flag. Can you modify the Mesos configuration docs in the docs folder?
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/17109

@yanji84 can you add a test for this?
[GitHub] spark issue #17109: [SPARK-19740][MESOS]Add support in Spark to pass arbitra...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/17109

ok to test
[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13072

LGTM, @srowen can you please take a look?
[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13077

@devaraj-kavali let us know if you can still update this; otherwise I'll close it, as it's no longer being updated.
[GitHub] spark issue #12933: [Spark-15155][Mesos] Optionally ignore default role reso...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/12933

@srowen can you help review this? Besides my minor comment, overall it looks fine to me.
[GitHub] spark pull request #12933: [Spark-15155][Mesos] Optionally ignore default ro...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/12933#discussion_r91195201

--- Diff: mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala ---
@@ -50,6 +50,44 @@ trait MesosSchedulerUtils extends Logging {
   protected var mesosDriver: SchedulerDriver = null
 
+  /**
+   * Returns the configured set of roles that an offer can be selected from
+   * @param conf Spark configuration
+   */
+  protected def getAcceptedResourceRoles(conf: SparkConf): Set[String] = {
+    getAcceptedResourceRoles(
+      conf.getBoolean("spark.mesos.ignoreDefaultRoleResources", false),
+      conf.getOption("spark.mesos.role"))
+  }
+  /**
+   * Returns the configured set of roles that an offer can be selected from
+   * @param props Mesos driver description schedulerProperties map
+   */
+  protected def getAcceptedResourceRoles(props: Map[String, String]): Set[String] = {
+    getAcceptedResourceRoles(
+      props.get("spark.mesos.ignoreDefaultRoleResources") match {
+        case Some(truth) => truth.toBoolean
+        case None => false
+      },
+      props.get("spark.mesos.role"))
+  }
+  /**
+   * Internal version of getAcceptedResourceRoles
+   * @param ignoreDefaultRoleResources user specified property
+   * @param role user specified property
+   */
+  private def getAcceptedResourceRoles(
+      ignoreDefaultRoleResources: Boolean,
+      role: Option[String]) = {
+    val roles = ignoreDefaultRoleResources match {
+      case true if role.isDefined => Set(role)
+      case _ => Set(Some("*"), role)
+    }
+    val acceptedRoles = roles.flatten
+    logDebug(s"Accepting resources from role(s): ${acceptedRoles.mkString(",")}")
--- End diff --

I think we should move this log outside of this helper method, as there may be other contexts calling it in the future.
[GitHub] spark issue #14936: [SPARK-7877][MESOS] Allow configuration of framework tim...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14936

@philipphoffmann Sorry for the long delay, one last ask. Can you add a simple unit test to verify it works?
[GitHub] spark issue #15684: [SPARK-18171][MESOS] Show correct framework address in m...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/15684

LGTM, @srowen can you help on this?
[GitHub] spark issue #16092: [SPARK-18662] Move resource managers to separate directo...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/16092

Changed as suggested by @rxin
[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/16061

@rxin Makes sense. @srowen also talked about starting a discussion on better support for external cluster managers.
[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90063528

--- Diff: kubernetes/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterScheduler.scala ---
@@ -0,0 +1,236 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.kubernetes
+
+import java.io.File
+import java.util.Date
+import java.util.concurrent.atomic.AtomicLong
+
+import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient, KubernetesClient}
+import io.fabric8.kubernetes.api.model.{PodBuilder, ServiceBuilder}
+import io.fabric8.kubernetes.client.dsl.LogWatch
+import org.apache.spark.deploy.Command
+import org.apache.spark.deploy.kubernetes.ClientArguments
+import org.apache.spark.{io, _}
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config._
+
+import collection.JavaConverters._
+import org.apache.spark.util.Utils
+
+import scala.util.Random
+
+private[spark] object KubernetesClusterScheduler {
+  def defaultNameSpace = "default"
+  def defaultServiceAccountName = "default"
+}
+
+/**
+ * This is a simple extension to ClusterScheduler
+ */
+private[spark] class KubernetesClusterScheduler(conf: SparkConf)
+    extends Logging {
+  private val DEFAULT_SUPERVISE = false
+  private val DEFAULT_MEMORY = Utils.DEFAULT_DRIVER_MEM_MB // mb
+  private val DEFAULT_CORES = 1.0
+
+  logInfo("Created KubernetesClusterScheduler instance")
+
+  var client = setupKubernetesClient()
+  val driverName = s"spark-driver-${Random.alphanumeric take 5 mkString("")}".toLowerCase()
+  val svcName = s"spark-svc-${Random.alphanumeric take 5 mkString("")}".toLowerCase()
+  val nameSpace = conf.get(
+    "spark.kubernetes.namespace",
+    KubernetesClusterScheduler.defaultNameSpace)
+  val serviceAccountName = conf.get(
+    "spark.kubernetes.serviceAccountName",
+    KubernetesClusterScheduler.defaultServiceAccountName)
+
+  // Anything that should either not be passed to driver config in the cluster, or
+  // that is going to be explicitly managed as command argument to the driver pod
+  val confBlackList = scala.collection.Set(
+    "spark.master",
+    "spark.app.name",
+    "spark.submit.deployMode",
+    "spark.executor.jar",
+    "spark.dynamicAllocation.enabled",
+    "spark.shuffle.service.enabled")
+
+  def start(args: ClientArguments): Unit = {
+    startDriver(client, args)
+  }
+
+  def stop(): Unit = {
+    client.pods().inNamespace(nameSpace).withName(driverName).delete()
+    client
+      .services()
+      .inNamespace(nameSpace)
+      .withName(svcName)
+      .delete()
+  }
+
+  def startDriver(client: KubernetesClient,
+      args: ClientArguments): Unit = {
+    logInfo("Starting spark driver on kubernetes cluster")
+    val driverDescription = buildDriverDescription(args)
+
+    // image needs to support shim scripts "/opt/driver.sh" and "/opt/executor.sh"
+    val sparkImage = conf.getOption("spark.kubernetes.sparkImage").getOrElse {
+      // TODO: this needs to default to some standard Apache Spark image
+      throw new SparkException("Spark image not set. Please configure spark.kubernetes.sparkImage")
+    }
+
+    // This is the URL of the client jar.
+    val clientJarUri = args.userJar
+
+    // This is the kubernetes master we're launching on.
+    val kubernetesHost = "k8s://" + client.getMasterUrl().getHost()
+    logInfo("Using as kubernetes-master: " + kubernetesHost.toString())
+
+    val submitArgs = sca
[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90063456

--- Diff: kubernetes/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterSchedulerBackend.scala ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.kubernetes
+
+import collection.JavaConverters._
+import io.fabric8.kubernetes.api.model.PodBuilder
+import io.fabric8.kubernetes.api.model.extensions.JobBuilder
+import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient}
+import org.apache.spark.internal.config._
+import org.apache.spark.scheduler._
+import org.apache.spark.scheduler.cluster._
+import org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages._
+import org.apache.spark.{SparkConf, SparkContext, SparkException}
+import org.apache.spark.rpc.RpcEndpointAddress
+import org.apache.spark.scheduler.TaskSchedulerImpl
+import org.apache.spark.util.Utils
+
+import scala.collection.mutable
+import scala.util.Random
+import scala.concurrent.Future
+
+private[spark] class KubernetesClusterSchedulerBackend(
+  scheduler: TaskSchedulerImpl,
--- End diff --

Fix the formatting to conform with Spark style
[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90063380

--- Diff: kubernetes/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterSchedulerBackend.scala ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster.kubernetes
+
+import collection.JavaConverters._
+import io.fabric8.kubernetes.api.model.PodBuilder
+import io.fabric8.kubernetes.api.model.extensions.JobBuilder
+import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient}
+import org.apache.spark.internal.config._
+import org.apache.spark.scheduler._
+import org.apache.spark.scheduler.cluster._
+import org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages._
+import org.apache.spark.{SparkConf, SparkContext, SparkException}
+import org.apache.spark.rpc.RpcEndpointAddress
+import org.apache.spark.scheduler.TaskSchedulerImpl
+import org.apache.spark.util.Utils
+
+import scala.collection.mutable
+import scala.util.Random
+import scala.concurrent.Future
+
+private[spark] class KubernetesClusterSchedulerBackend(
+    scheduler: TaskSchedulerImpl,
+    sc: SparkContext)
+  extends CoarseGrainedSchedulerBackend(scheduler, sc.env.rpcEnv) {
+
+  val client = new DefaultKubernetesClient()
+
+  val DEFAULT_NUMBER_EXECUTORS = 2
+  val sparkExecutorName = s"spark-executor-${Random.alphanumeric take 5 mkString("")}".toLowerCase()
+
+  // TODO: do these need mutex guarding?
+  // key is executor id, value is pod name
+  var executorToPod = mutable.Map.empty[String, String] // active executors
+  var shutdownToPod = mutable.Map.empty[String, String] // pending shutdown
+  var executorID = 0
+
+  val sparkImage = conf.get("spark.kubernetes.sparkImage")
+  val clientJarUri = conf.get("spark.executor.jar")
+  val ns = conf.get(
+    "spark.kubernetes.namespace",
+    KubernetesClusterScheduler.defaultNameSpace)
+  val dynamicExecutors = Utils.isDynamicAllocationEnabled(conf)
+
+  // executor back-ends take their configuration this way
+  if (dynamicExecutors) {
+    conf.setExecutorEnv("spark.dynamicAllocation.enabled", "true")
+    conf.setExecutorEnv("spark.shuffle.service.enabled", "true")
+  }
+
+  override def start(): Unit = {
+    super.start()
+    createExecutorPods(getInitialTargetExecutorNumber(sc.getConf))
+  }
+
+  override def stop(): Unit = {
+    // Kill all executor pods indiscriminately
+    killExecutorPods(executorToPod.toVector)
+    killExecutorPods(shutdownToPod.toVector)
+    super.stop()
+  }
+
+  // Dynamic allocation interfaces
+  override def doRequestTotalExecutors(requestedTotal: Int): Future[Boolean] = {
+    logInfo(s"Received doRequestTotalExecutors: $requestedTotal")
+    val n = executorToPod.size
+    val delta = requestedTotal - n
+    if (delta > 0) {
+      logInfo(s"Adding $delta new executors")
+      createExecutorPods(delta)
+    } else if (delta < 0) {
+      val d = -delta
--- End diff --

This shouldn't happen; assert instead.
[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90062695

--- Diff: dev/make-distribution.sh ---
@@ -154,7 +154,9 @@ export MAVEN_OPTS="${MAVEN_OPTS:--Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCac
 # Store the command as an array because $MVN variable might have spaces in it.
 # Normal quoting tricks don't work.
 # See: http://mywiki.wooledge.org/BashFAQ/050
-BUILD_COMMAND=("$MVN" -T 1C clean package -DskipTests $@)
+# BUILD_COMMAND=("$MVN" -T 1C clean package -DskipTests $@)
+
+BUILD_COMMAND=("$MVN" -T 2C package -DskipTests $@)
--- End diff --

Why this change?
[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90062639

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -596,6 +599,26 @@ object SparkSubmit extends CommandLineUtils {
     }
   }
 
+    if (isKubernetesCluster) {
--- End diff --

What if we're on Kubernetes in client mode?
[GitHub] spark issue #12933: [Spark-15155][Mesos] Optionally ignore default role reso...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/12933

I ran `mvn test` inside of the mesos folder.

On Thu, Oct 13, 2016 at 3:21 AM, Chris Heller <notificati...@github.com> wrote:
> You saw the error with `./dev/run-tests`? Ok, I'll figure this out.
>
> > On Oct 13, 2016, at 12:24 AM, Timothy Chen <notificati...@github.com> wrote:
> > I just tried running it locally and I'm getting the same error. It seems
> > like with your change that test is simply declining the offer.
[GitHub] spark issue #12933: [Spark-15155][Mesos] Optionally ignore default role reso...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/12933

I just tried running it locally and I'm getting the same error. It seems like with your change that test is simply declining the offer.
[GitHub] spark pull request #13713: [SPARK-15994] [MESOS] Allow enabling Mesos fetch ...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13713#discussion_r82695978

--- Diff: docs/running-on-mesos.md ---
@@ -506,8 +506,13 @@ See the [configuration page](configuration.html) for information on Spark config
     since this configuration is just a upper limit and not a guaranteed amount.
-  </td>
-</tr>
+  </td>
+</tr>
+<tr>
+  <td><code>spark.mesos.fetchCache.enable</code></td>
+  <td>false</td>
+  <td>
+    If set to `true`, all URIs in `spark.mesos.uris` will be eligible for caching by the [Mesos fetch cache](http://mesos.apache.org/documentation/latest/fetcher/)
--- End diff --

From the implementation, you actually make all downloadable URIs (like spark.executor.uri, jarUrl, etc.) fetcher-cacheable. I think we need to be more explicit here that it's more than just spark.mesos.uris.
[GitHub] spark issue #13713: [SPARK-15994] [MESOS] Allow enabling Mesos fetch cache i...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13713

Other than the 2 comments, the changes LGTM. @mgummelt @srowen
[GitHub] spark pull request #13713: [SPARK-15994] [MESOS] Allow enabling Mesos fetch ...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13713#discussion_r82695810

--- Diff: mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackendSuite.scala ---
@@ -463,6 +463,21 @@ class MesosCoarseGrainedSchedulerBackendSuite extends SparkFunSuite
     assert(launchedTasks.head.getCommand.getUrisList.asScala(0).getValue == url)
   }
 
+  test("mesos supports setting fetcher") {
--- End diff --

s/supports setting fetcher/supports setting fetcher cache/g
[GitHub] spark issue #14936: [SPARK-7877][MESOS] Allow configuration of framework tim...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14936

Hmm, it is currently Integer.MAX because we assume the cluster scheduler is long-lived, and without setting it to a large value, Mesos will automatically terminate the framework when it disconnects. Currently no Spark jobs have it specified, so when one disconnects it's simply removed. I think we should keep the same semantics and not impose a default value for everything: leave it at 0 in the coarse-grained scheduler and default to Int.MAX in the cluster scheduler. But the user can always override it either way.
[GitHub] spark issue #14644: [SPARK-14082][MESOS] Enable GPU support with Mesos
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14644

I just tested it with a GPU instance and it works. @mgummelt @klueska any more comments? Otherwise @srowen I think we should merge, since there are no outstanding comments.
[GitHub] spark issue #9287: SPARK-11326: Split networking in standalone mode
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/9287

This has been stale for a while; we should close this if there is no update here.
[GitHub] spark issue #12933: [Spark-15155][Mesos] Optionally ignore default role reso...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/12933 @hellertime Are you able to rebase? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13077 @devaraj-kavali Are you still able to update this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13713: [SPARK-15994] [MESOS] Allow enabling Mesos fetch cache i...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13713 @drcrallen Are you still planning to update this? It's quite a useful feature, so I'm hoping this can get in. Also, since fine-grained mode is deprecated, I don't think we need to update it as well. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14936: [SPARK-7877][MESOS] Allow configuration of framew...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/14936#discussion_r82430755 --- Diff: mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -69,38 +68,51 @@ trait MesosSchedulerUtils extends Logging { conf: SparkConf, webuiUrl: Option[String] = None, checkpoint: Option[Boolean] = None, - failoverTimeout: Option[Double] = None, frameworkId: Option[String] = None): SchedulerDriver = { -val fwInfoBuilder = FrameworkInfo.newBuilder().setUser(sparkUser).setName(appName) +val fwInfo = createFrameworkInfo(sparkUser, appName, conf, webuiUrl, checkpoint, frameworkId) val credBuilder = Credential.newBuilder() -webuiUrl.foreach { url => fwInfoBuilder.setWebuiUrl(url) } -checkpoint.foreach { checkpoint => fwInfoBuilder.setCheckpoint(checkpoint) } -failoverTimeout.foreach { timeout => fwInfoBuilder.setFailoverTimeout(timeout) } -frameworkId.foreach { id => - fwInfoBuilder.setId(FrameworkID.newBuilder().setValue(id).build()) -} conf.getOption("spark.mesos.principal").foreach { principal => - fwInfoBuilder.setPrincipal(principal) credBuilder.setPrincipal(principal) } conf.getOption("spark.mesos.secret").foreach { secret => credBuilder.setSecret(secret) } -if (credBuilder.hasSecret && !fwInfoBuilder.hasPrincipal) { +if (credBuilder.hasSecret && !fwInfo.hasPrincipal) { throw new SparkException( "spark.mesos.principal must be configured when spark.mesos.secret is set") } -conf.getOption("spark.mesos.role").foreach { role => - fwInfoBuilder.setRole(role) -} if (credBuilder.hasPrincipal) { new MesosSchedulerDriver( -scheduler, fwInfoBuilder.build(), masterUrl, credBuilder.build()) +scheduler, fwInfo, masterUrl, credBuilder.build()) } else { - new MesosSchedulerDriver(scheduler, fwInfoBuilder.build(), masterUrl) + new MesosSchedulerDriver(scheduler, fwInfo, masterUrl) } } + def createFrameworkInfo( +sparkUser: String, +appName: String, +conf: SparkConf, +webuiUrl: Option[String] = None, +checkpoint: Option[Boolean] = None, +frameworkId: Option[String] = None): FrameworkInfo = { +val fwInfoBuilder = FrameworkInfo.newBuilder().setUser(sparkUser).setName(appName) +webuiUrl.foreach { url => fwInfoBuilder.setWebuiUrl(url) } +checkpoint.foreach { checkpoint => fwInfoBuilder.setCheckpoint(checkpoint) } + fwInfoBuilder.setFailoverTimeout(conf.getDouble("spark.mesos.failoverTimeout", 10)) --- End diff -- This new flag needs to get added to the documentation as well --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14936: [SPARK-7877][MESOS] Allow configuration of framew...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/14936#discussion_r82430668 --- Diff: mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -69,38 +68,51 @@ trait MesosSchedulerUtils extends Logging { conf: SparkConf, webuiUrl: Option[String] = None, checkpoint: Option[Boolean] = None, - failoverTimeout: Option[Double] = None, frameworkId: Option[String] = None): SchedulerDriver = { -val fwInfoBuilder = FrameworkInfo.newBuilder().setUser(sparkUser).setName(appName) +val fwInfo = createFrameworkInfo(sparkUser, appName, conf, webuiUrl, checkpoint, frameworkId) val credBuilder = Credential.newBuilder() -webuiUrl.foreach { url => fwInfoBuilder.setWebuiUrl(url) } -checkpoint.foreach { checkpoint => fwInfoBuilder.setCheckpoint(checkpoint) } -failoverTimeout.foreach { timeout => fwInfoBuilder.setFailoverTimeout(timeout) } -frameworkId.foreach { id => - fwInfoBuilder.setId(FrameworkID.newBuilder().setValue(id).build()) -} conf.getOption("spark.mesos.principal").foreach { principal => - fwInfoBuilder.setPrincipal(principal) credBuilder.setPrincipal(principal) } conf.getOption("spark.mesos.secret").foreach { secret => credBuilder.setSecret(secret) } -if (credBuilder.hasSecret && !fwInfoBuilder.hasPrincipal) { +if (credBuilder.hasSecret && !fwInfo.hasPrincipal) { throw new SparkException( "spark.mesos.principal must be configured when spark.mesos.secret is set") } -conf.getOption("spark.mesos.role").foreach { role => - fwInfoBuilder.setRole(role) -} if (credBuilder.hasPrincipal) { new MesosSchedulerDriver( -scheduler, fwInfoBuilder.build(), masterUrl, credBuilder.build()) +scheduler, fwInfo, masterUrl, credBuilder.build()) } else { - new MesosSchedulerDriver(scheduler, fwInfoBuilder.build(), masterUrl) + new MesosSchedulerDriver(scheduler, fwInfo, masterUrl) } } + def createFrameworkInfo( +sparkUser: String, --- End diff -- Fix the parameter indentation --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14644: [SPARK-14082][MESOS] Enable GPU support with Mesos
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14644 Ya, the default GPU requirement I have is 0 (cores per executor/node is 1). I'm still gathering feedback on the most sensible thing to do for GPUs. We can either set a configurable amount that each executor has to use, or have a max. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14644: [SPARK-14082][MESOS] Enable GPU support with Mesos
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14644 1. Good catch, my old patch had docs but I rebased and it didn't apply for some reason. Let me add it. 2, 3: we don't fail if you ask for more GPUs, since it's not a hard requirement but simply a max, just like how cpus.max works. I didn't add a required-amount setting, but we can certainly add it in the future. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14644: [SPARK-14082][MESOS] Enable GPU support with Mesos
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14644 @mgummelt @srowen Please review as well --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14644: [SPARK-14082][MESOS] Enable GPU support with Mesos
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14644 @klueska Just updated the patch and I think it's using the right semantics now, where it has a global GPU max just like cores. Can you try it out? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14644: [MESOS] Enable GPU support with Mesos
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/14644#discussion_r78348094 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -103,6 +103,7 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( private val stateLock = new ReentrantLock val extraCoresPerExecutor = conf.getInt("spark.mesos.extra.cores", 0) + val maxGpus = conf.getInt("spark.mesos.gpus.max", 0) --- End diff -- I see, in this case it's the same semantics as cpus.max, so I think using a really big number seems right to me. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14644: [MESOS] Enable GPU support with Mesos
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/14644#discussion_r78298417 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -103,6 +103,7 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( private val stateLock = new ReentrantLock val extraCoresPerExecutor = conf.getInt("spark.mesos.extra.cores", 0) + val maxGpus = conf.getInt("spark.mesos.gpus.max", 0) --- End diff -- That sounds sensible to me, since GPUs are not usually required to run a Spark job. Also, cores.max is an aggregate max, whereas gpus.max in the current patch is a per-node max. I think I will change this to work the way cores.max works, but default to 0. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14644: [MESOS] Enable GPU support with Mesos
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/14644#discussion_r78002761 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -103,6 +103,7 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( private val stateLock = new ReentrantLock val extraCoresPerExecutor = conf.getInt("spark.mesos.extra.cores", 0) + val maxGpus = conf.getInt("spark.mesos.gpus.max", 0) --- End diff -- My thought was that with only a Boolean flag, a Spark job either uses all GPUs on a host or none, so different GPU devices can't be shared by different jobs. By specifying a limit, there is at least the ability to let a job say how many GPUs it should grab per node. Thoughts? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
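A sketch of the cores.max-style accounting being discussed in this thread, assuming spark.mesos.gpus.max becomes a job-wide cap that defaults to 0 (variable and method names are illustrative, not the merged code):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
    // Job-wide GPU cap; defaulting to 0 means jobs that never ask for GPUs
    // leave them available to other frameworks on the cluster.
    val maxGpus = conf.getInt("spark.mesos.gpus.max", 0)
    var totalGpusAcquired = 0

    // Per offer: take at most what the offer holds and what the cap still
    // allows. Like cores.max, this is a maximum, not a hard requirement.
    def gpusToTake(offeredGpus: Int): Int =
      math.min(offeredGpus, math.max(0, maxGpus - totalGpusAcquired))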
[GitHub] spark issue #14644: [MESOS] Enable GPU support with Mesos
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14644 @srowen Mesos also supports node labels (which is how constraints are implemented in the Spark framework). However, GPUs are implemented as a resource, since we want to account for the number of GPUs instead of just placing a task there. As for the config name, I just picked that to begin with. I was also thinking we should consider a generic config name (spark.gpus?) since I believe it could be reused. But I wasn't sure how we'd like to account for this yet, as GPUs are quite different from CPUs (Mesos currently just handles an integer number of GPUs, with no sharing or topology information yet). Do you have suggestions? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14644: Enable GPU support with Mesos
GitHub user tnachen opened a pull request: https://github.com/apache/spark/pull/14644 Enable GPU support with Mesos ## What changes were proposed in this pull request? Enable GPU resources to be used when running coarse-grained mode with Mesos. ## How was this patch tested? Manual test with GPU. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tnachen/spark gpu_mesos Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14644.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14644 commit 163cfa49b2116612f981aa8158054e006d40b52d Author: Timothy Chen <tnac...@gmail.com> Date: 2016-05-23T23:23:51Z Enable GPU with Mesos on Spark commit 4edc6db5329a19f49af9303897ee0a2f1fc91a14 Author: Timothy Chen <tnac...@gmail.com> Date: 2016-08-15T06:39:05Z Enable GPU support with Mesos --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13713: [SPARK-15994] [MESOS] Allow enabling Mesos fetch cache i...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13713 We also fetch URIs for running drivers in cluster mode (MesosClusterScheduler.scala). I'm thinking we should allow this configuration to affect that as well. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
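As a sketch of that idea: the Mesos fetcher cache is driven by the cache flag on each CommandInfo.URI, so the dispatcher could set the same flag on the URIs it builds for drivers. The config key name and the wiring below are assumptions for illustration, not the merged change:

    import org.apache.mesos.Protos.CommandInfo
    import org.apache.spark.SparkConf

    val conf = new SparkConf()
    // Assumed key name; whatever this PR settles on should be read here too.
    val useFetcherCache = conf.getBoolean("spark.mesos.fetcherCache.enable", false)

    // A URI the dispatcher would fetch for a driver in cluster mode
    // (the URL is a placeholder).
    val driverJarUri = CommandInfo.URI.newBuilder()
      .setValue("http://example.com/app.jar")
      .setCache(useFetcherCache) // let the Mesos fetcher cache it per agent
      .build()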
[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13077 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13051: [SPARK-15271] [MESOS] Allow force pulling executor docke...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13051 @srowen Or if you could help :) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14275: [SPARK-16637] Unified containerizer
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/14275#discussion_r72003350 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala --- @@ -105,16 +105,27 @@ private[mesos] object MesosSchedulerBackendUtil extends Logging { def addDockerInfo( container: ContainerInfo.Builder, image: String, + containerizer: String, volumes: Option[List[Volume]] = None, - network: Option[ContainerInfo.DockerInfo.Network] = None, portmaps: Option[List[ContainerInfo.DockerInfo.PortMapping]] = None): Unit = { -val docker = ContainerInfo.DockerInfo.newBuilder().setImage(image) +containerizer match { --- End diff -- Can we have a sensible message/exception when we pass in an unknown containerizer? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
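A sketch of the failure mode requested above: fail fast with a descriptive exception instead of silently ignoring an unknown value. The two builder helpers are hypothetical stand-ins for the docker and unified-containerizer branches of this diff:

    import org.apache.spark.SparkException

    containerizer match {
      case "docker" => buildDockerContainerInfo(container, image) // hypothetical helper
      case "mesos"  => buildMesosContainerInfo(container, image)  // hypothetical helper
      case unknown  =>
        throw new SparkException(
          s"Unsupported Mesos containerizer '$unknown'; expected 'docker' or 'mesos'.")
    }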
[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13950#discussion_r69690304 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -502,6 +502,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli _applicationId = _taskScheduler.applicationId() _applicationAttemptId = taskScheduler.applicationAttemptId() _conf.set("spark.app.id", _applicationId) +if (_conf.getBoolean("spark.ui.reverseProxy", false)) { + System.setProperty("spark.ui.proxyBase", "/target/" + _applicationId) --- End diff -- Ah I see, sorry, I thought they would be just opaque IDs, but the worker- and app- prefixes look fine to me. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13950#discussion_r69678895 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -502,6 +502,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli _applicationId = _taskScheduler.applicationId() _applicationAttemptId = taskScheduler.applicationAttemptId() _conf.set("spark.app.id", _applicationId) +if (_conf.getBoolean("spark.ui.reverseProxy", false)) { + System.setProperty("spark.ui.proxyBase", "/target/" + _applicationId) --- End diff -- I wonder if this is better named /proxy/app/applicationId for applications, and /proxy/worker/workerId for workers, so it's clearer what the destination target is? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13950#discussion_r69676891 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -186,6 +188,67 @@ private[spark] object JettyUtils extends Logging { contextHandler } + /** Create a handler for proxying request to Workers and Application Drivers */ + def createProxyHandler( + prefix: String, + target: String): ServletContextHandler = { +val servlet = new ProxyServlet { + override def rewriteTarget(request: HttpServletRequest): String = { +val path = request.getRequestURI(); +if (!path.startsWith(prefix)) return null + +val uri = new StringBuilder(target) +if (target.endsWith("/")) uri.setLength(uri.length() - 1) +val rest = path.substring(prefix.length()) +if (!rest.isEmpty()) +{ --- End diff -- Move { to previous line --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
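For clarity, here is the flagged fragment with the brace moved up as requested; the body is assumed from context (appending the remaining request path to the target URI):

    if (!rest.isEmpty()) {
      uri.append(rest)
    }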
[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13950#discussion_r69676869 --- Diff: docs/configuration.md --- @@ -598,6 +598,20 @@ Apart from these, the following properties are also available, and may be useful + spark.ui.reverseProxy + false + To enable running Spark Master, worker and application UI behined a reverse proxy. In this mode, Spark master will reverse proxy the worker and application UIs to enable access. + + + + spark.ui.reverseProxyUrl + http://localhost:8080 + This is the URL where your proxy is running. Make sure this is a complete URL includeing scheme (http/https) and port to reach your proxy. --- End diff -- includeing -> including --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13143: [SPARK-15359] [Mesos] Mesos dispatcher should han...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13143#discussion_r68194509 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -120,14 +120,25 @@ private[mesos] trait MesosSchedulerUtils extends Logging { val ret = mesosDriver.run() logInfo("driver.run() returned with code " + ret) if (ret != null && ret.equals(Status.DRIVER_ABORTED)) { - error = Some(new SparkException("Error starting driver, DRIVER_ABORTED")) - markErr() + val ex = new SparkException("Error starting driver, DRIVER_ABORTED") + // if the driver gets aborted after the successful registration --- End diff -- Also, to simplify the code, can we just throw SparkException here? The catch will then handle all cases. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
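A sketch of that simplification, assuming the enclosing catch block already records the error and notifies waiters, so DRIVER_ABORTED flows through the same path as any other startup failure:

    val ret = mesosDriver.run()
    logInfo("driver.run() returned with code " + ret)
    if (ret != null && ret.equals(Status.DRIVER_ABORTED)) {
      // No special-casing here: the surrounding catch marks the error state.
      throw new SparkException("Error starting driver, DRIVER_ABORTED")
    }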
[GitHub] spark issue #12933: [Spark-15155][Mesos] Optionally ignore default role reso...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/12933 @hellertime ping --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13072 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13143: [SPARK-15359] [Mesos] Mesos dispatcher should handle DRI...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13143 Is it because MesosDriver actually threw an exception? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13143: [SPARK-15359] [Mesos] Mesos dispatcher should han...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13143#discussion_r67988954 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -120,14 +120,25 @@ private[mesos] trait MesosSchedulerUtils extends Logging { val ret = mesosDriver.run() logInfo("driver.run() returned with code " + ret) if (ret != null && ret.equals(Status.DRIVER_ABORTED)) { - error = Some(new SparkException("Error starting driver, DRIVER_ABORTED")) - markErr() + val ex = new SparkException("Error starting driver, DRIVER_ABORTED") + // if the driver gets aborted after the successful registration --- End diff -- s/after the successful registration/after registration/g --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13077 @devaraj-kavali Ping --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13326 jenkins please retest --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13713: [SPARK-15994] [MESOS] Allow enabling Mesos fetch cache i...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13713 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13715: [SPARK-15992] [MESOS] Refactor MesosCoarseGrained...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13715#discussion_r67988575 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -382,59 +382,97 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( for (offer <- offers) { val slaveId = offer.getSlaveId.getValue val offerId = offer.getId.getValue -val resources = remainingResources(offerId) - -if (canLaunchTask(slaveId, resources)) { - // Create a task - launchTasks = true - val taskId = newMesosTaskId() - val offerCPUs = getResource(resources, "cpus").toInt - - val taskCPUs = executorCores(offerCPUs) - val taskMemory = executorMemory(sc) - - slaves.getOrElseUpdate(slaveId, new Slave(offer.getHostname)).taskIDs.add(taskId) - - val (afterCPUResources, cpuResourcesToUse) = -partitionResources(resources, "cpus", taskCPUs) - val (resourcesLeft, memResourcesToUse) = -partitionResources(afterCPUResources.asJava, "mem", taskMemory) - - val taskBuilder = MesosTaskInfo.newBuilder() - .setTaskId(TaskID.newBuilder().setValue(taskId.toString).build()) -.setSlaveId(offer.getSlaveId) -.setCommand(createCommand(offer, taskCPUs + extraCoresPerExecutor, taskId)) -.setName("Task " + taskId) -.addAllResources(cpuResourcesToUse.asJava) -.addAllResources(memResourcesToUse.asJava) - - sc.conf.getOption("spark.mesos.executor.docker.image").foreach { image => -MesosSchedulerBackendUtil - .setupContainerBuilderDockerInfo(image, sc.conf, taskBuilder.getContainerBuilder) +val availableResources = remainingResources(offerId) +val offerMem = getResource(availableResources, "mem") +val offerCpu = getResource(availableResources, "cpus") + +// Catch offer limits +calculateUsableResources( + sc, + offerCpu.toInt, + offerMem.toInt +).flatMap( + { +// Catch "global" limits +case (taskCPUs: Int, taskMemory: Int) => + if (numExecutors() >= executorLimit) { +logTrace(s"${numExecutors()} exceeds limit of $executorLimit") +None + } else if ( +slaves.get(slaveId).map(_.taskFailures).getOrElse(0) >= MAX_SLAVE_FAILURES + ) { +logTrace(s"Slave $slaveId exceeded limit of $MAX_SLAVE_FAILURES failures") +None + } else { +Some((taskCPUs, taskMemory)) + } } - - tasks(offer.getId) ::= taskBuilder.build() - remainingResources(offerId) = resourcesLeft.asJava - totalCoresAcquired += taskCPUs - coresByTaskId(taskId) = taskCPUs +) match { + case Some((taskCPUs: Int, taskMemory: Int)) => +// Create a task +launchTasks = true +val taskId = newMesosTaskId() + +slaves.getOrElseUpdate(slaveId, new Slave(offer.getHostname)).taskIDs.add(taskId) + +val (afterCPUResources, cpuResourcesToUse) = + partitionResources(availableResources, "cpus", taskCPUs) +val (resourcesLeft, memResourcesToUse) = + partitionResources(afterCPUResources.asJava, "mem", taskMemory) + +val taskBuilder = MesosTaskInfo.newBuilder() + .setTaskId(TaskID.newBuilder().setValue(taskId.toString).build()) + .setSlaveId(offer.getSlaveId) + .setCommand(createCommand(offer, taskCPUs + extraCoresPerExecutor, taskId)) + .setName("Task " + taskId) + .addAllResources(cpuResourcesToUse.asJava) + .addAllResources(memResourcesToUse.asJava) + +sc.conf.getOption("spark.mesos.executor.docker.image").foreach { image => + MesosSchedulerBackendUtil +.setupContainerBuilderDockerInfo(image, sc.conf, taskBuilder.getContainerBuilder) +} + +tasks(offer.getId) ::= taskBuilder.build() +remainingResources(offerId) = resourcesLeft.asJava +totalCoresAcquired += taskCPUs +coresByTaskId(taskId) = taskCPUs + case None => logDebu
[GitHub] spark issue #13715: [SPARK-15992] [MESOS] Refactor MesosCoarseGrainedSchedul...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13715 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13323: [SPARK-15555] [Mesos] Driver with --supervise option can...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13323 Thanks @devaraj-kavali, this LGTM. @andrewor14 can you take a look? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13051: [SPARK-15271] [MESOS] Allow force pulling executor docke...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13051 @andrewor14 PTAL, this PR LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #12933: [Spark-15155][Mesos] Optionally ignore default role reso...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/12933 @hellertime Can you rebase and submit again to rerun the tests? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #10949: [SPARK-12832][MESOS] mesos scheduler respect agent attri...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/10949 @atongen please rebase and try again. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13077 I think instead of just moving it to finished drivers, can we also show that message on the UI? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13077 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13323: [SPARK-15555] [Mesos] Driver with --supervise option can...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13323 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13143: [SPARK-15359] [Mesos] Mesos dispatcher should handle DRI...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13143 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13051: [SPARK-15271] [MESOS] Allow force pulling executor docke...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13051 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13323: [SPARK-15555] [Mesos] Driver with --supervise option can...
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/13323 Nice catch, can you add a unit test to cover this? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers wa...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/13326#discussion_r65580076 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -188,10 +188,10 @@ private[spark] class MesosClusterScheduler( mesosDriver.killTask(task.taskId) k.success = true k.message = "Killing running driver" - } else if (removeFromQueuedDrivers(submissionId)) { --- End diff -- We should just rename and change the existing function; I don't think it's being used elsewhere? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #11887: [SPARK-13041][Mesos]add driver sandbox uri to the...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/11887#discussion_r65579886 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterPage.scala --- @@ -115,4 +142,58 @@ private[mesos] class MesosClusterPage(parent: MesosClusterUI) extends WebUIPage( } sb.toString() } + + private def getIp4(ip: Int): String = { +val buffer = ByteBuffer.allocate(4) +buffer.putInt(ip) +// we need to check about that because protocolbuf changes the order +// which by mesos api is considered to be network order (big endian). +val result = if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) { + buffer.array.toList.reverse +} else { + buffer.array.toList +} +result.map{byte => byte & 0xFF}.mkString(".") + } + + private def getListFromJson(value: JValue): List[Map[String, Any]] = { +value.values.asInstanceOf[List[Map[String, Any]]] + } + + private def getTaskDirectory(masterUri: String, driverFwId: String, slaveId: String): + Option[String] = { + --- End diff -- Remove the extra whitespace here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #11887: [SPARK-13041][Mesos]add driver sandbox uri to the...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/11887#discussion_r65579463 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterPage.scala --- @@ -68,6 +75,25 @@ private[mesos] class MesosClusterPage(parent: MesosClusterUI) extends WebUIPage( private def driverRow(state: MesosClusterSubmissionState): Seq[Node] = { val id = state.driverDescription.submissionId +val masterInfo = parent.scheduler.getSchedulerMasterInfo() +val schedulerFwId = parent.scheduler.getSchedulerFrameworkId() +val sandboxCol = if (masterInfo.isDefined && schedulerFwId.isDefined) { + + val masterUri = masterInfo.map{info => s"http://${getIp4(info.getIp)}:${info.getPort}"}.get + val directory = getTaskDirectory(masterUri, id, state.slaveId.getValue) + --- End diff -- Remove extra space --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #11887: [SPARK-13041][Mesos]add driver sandbox uri to the...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/11887#discussion_r65579454 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterPage.scala --- @@ -68,6 +75,25 @@ private[mesos] class MesosClusterPage(parent: MesosClusterUI) extends WebUIPage( private def driverRow(state: MesosClusterSubmissionState): Seq[Node] = { val id = state.driverDescription.submissionId +val masterInfo = parent.scheduler.getSchedulerMasterInfo() +val schedulerFwId = parent.scheduler.getSchedulerFrameworkId() +val sandboxCol = if (masterInfo.isDefined && schedulerFwId.isDefined) { + + val masterUri = masterInfo.map{info => s"http://${getIp4(info.getIp)}:${info.getPort}"}.get + val directory = getTaskDirectory(masterUri, id, state.slaveId.getValue) + + if(directory.isDefined) { --- End diff -- Add a space between if and ( --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #11887: [SPARK-13041][Mesos]add driver sandbox uri to the...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/11887#discussion_r65579412 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterPage.scala --- @@ -68,6 +75,25 @@ private[mesos] class MesosClusterPage(parent: MesosClusterUI) extends WebUIPage( private def driverRow(state: MesosClusterSubmissionState): Seq[Node] = { val id = state.driverDescription.submissionId +val masterInfo = parent.scheduler.getSchedulerMasterInfo() +val schedulerFwId = parent.scheduler.getSchedulerFrameworkId() +val sandboxCol = if (masterInfo.isDefined && schedulerFwId.isDefined) { + --- End diff -- Kill space --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #11887: [SPARK-13041][Mesos]add driver sandbox uri to the...
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/11887#discussion_r65579276 --- Diff: core/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterPage.scala --- @@ -68,6 +75,25 @@ private[mesos] class MesosClusterPage(parent: MesosClusterUI) extends WebUIPage( private def driverRow(state: MesosClusterSubmissionState): Seq[Node] = { val id = state.driverDescription.submissionId +val masterInfo = parent.scheduler.getSchedulerMasterInfo() +val schedulerFwId = parent.scheduler.getSchedulerFrameworkId() +val sandboxCol = if (masterInfo.isDefined && schedulerFwId.isDefined) { + + val masterUri = masterInfo.map{info => s"http://${getIp4(info.getIp)}:${info.getPort}"}.get + val directory = getTaskDirectory(masterUri, id, state.slaveId.getValue) + + if(directory.isDefined) { +val sandBoxUri = s"$masterUri" + + s"/#/slaves/${state.slaveId.getValue}" + + s"/browse?path=${directory.get}" + Sandbox --- End diff -- I think we should add a property like @mgummelt suggested to override masterUri if available. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
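One possible shape for that override, using the names from the diff above and a made-up property name purely for illustration: prefer an explicitly configured URL for sandbox links, and fall back to the address derived from MasterInfo.

    // "spark.mesos.proxy.baseURL" is a placeholder, not an agreed-upon key.
    val masterUri = conf.getOption("spark.mesos.proxy.baseURL").getOrElse {
      masterInfo.map(info => s"http://${getIp4(info.getIp)}:${info.getPort}").get
    }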