[
https://issues.apache.org/jira/browse/SPARK-11638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195304#comment-15195304
]
Radoslaw Gruchalski edited comment on SPARK-11638 at 3/15/16 4:29 PM:
----------------------------------------------------------------------
[~skonto], yes, sounds like a good idea.
Would it be worth discussing exactly what the outcome of this feature should be?
was (Author: radekg):
[~skonto], having just been made redundant, I believe I have a lot of spare
cycles as well. Yes, sounds like a good idea.
Would it be worth discussing exactly what the outcome of this feature should be?
> Run Spark on Mesos with bridge networking
> -----------------------------------------
>
> Key: SPARK-11638
> URL: https://issues.apache.org/jira/browse/SPARK-11638
> Project: Spark
> Issue Type: Improvement
> Components: Mesos, Spark Core
> Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.5.2, 1.6.0
> Reporter: Radoslaw Gruchalski
> Attachments: 1.4.0.patch, 1.4.1.patch, 1.5.0.patch, 1.5.1.patch,
> 1.5.2.patch, 1.6.0.patch, 2.3.11.patch, 2.3.4.patch
>
>
> h4. Summary
> Provides {{spark.driver.advertisedPort}},
> {{spark.fileserver.advertisedPort}}, {{spark.broadcast.advertisedPort}} and
> {{spark.replClassServer.advertisedPort}} settings to enable running Spark on
> Mesos in Docker with bridge networking, together with patches for Akka Remote
> that let the Spark driver advertise itself on an alternative host and port.
> With these settings, it is possible to run the Spark Master in a Docker
> container and have the executors running on Mesos talk back correctly to such
> a Master.
> The problem is discussed on the Mesos mailing list here:
> https://mail-archives.apache.org/mod_mbox/mesos-user/201510.mbox/%3CCACTd3c9vjAMXk=bfotj5ljzfrh5u7ix-ghppfqknvg9mkkc...@mail.gmail.com%3E
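> For illustration, with this patch in place a driver container could be
> configured along the following lines; the {{...port}} values match the
> scenario discussed below, while the {{...advertisedPort}} values stand in for
> whatever host ports Mesos happens to assign:
> {noformat}
> spark.driver.port                     6666
> spark.driver.advertisedPort           31000
> spark.fileserver.port                 6677
> spark.fileserver.advertisedPort       31001
> spark.broadcast.port                  6688
> spark.broadcast.advertisedPort        31002
> spark.replClassServer.port            23456
> spark.replClassServer.advertisedPort  31003
> {noformat}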
> h4. Running Spark on Mesos - LIBPROCESS_ADVERTISE_IP opens the door
> In order for the framework to receive offers in the bridged container, Mesos
> in the container has to register for offers using the IP address of the
> Agent. Offers are sent by the Mesos Master to the Docker container running on
> a different host, the Agent. Prior to Mesos 0.24.0, {{libprocess}} would
> advertise itself using the IP address of the container, something like
> {{172.x.x.x}} - an address the Mesos Master cannot reach, as it belongs to a
> different machine. Mesos 0.24.0 introduced two new properties for
> {{libprocess}} - {{LIBPROCESS_ADVERTISE_IP}} and
> {{LIBPROCESS_ADVERTISE_PORT}}. These allow the container to register for
> offers using the Agent's address. This was provided mainly for running Mesos
> in Docker on Mesos.
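> For example, a framework running in a bridged container would export
> something along these lines before starting (the values are placeholders):
> {noformat}
> # the Agent's externally reachable IP, not the container's 172.x.x.x address
> export LIBPROCESS_ADVERTISE_IP=10.0.0.5
> # the host port the Agent maps to the container's libprocess port
> export LIBPROCESS_ADVERTISE_PORT=31999
> {noformat}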
> h4. Spark - how does the above relate and what is being addressed here?
> Similar to Mesos, out of the box, Spark does not allow advertising its
> services on ports different from the ones it binds to. Consider the following
> scenario: Spark is running inside a Docker container on Mesos, in bridge
> networking mode. Assume port {{6666}} for {{spark.driver.port}}, {{6677}} for
> {{spark.fileserver.port}}, {{6688}} for {{spark.broadcast.port}} and
> {{23456}} for {{spark.replClassServer.port}}. If such a task is posted to
> Marathon, Mesos will assign 4 ports in the {{31000-32000}} range, mapped to
> the container ports. Starting the executors from such a container results in
> the executors not being able to communicate back to the Spark Master.
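> Concretely, the resulting mapping might look like this (the host ports are
> arbitrary examples from the Mesos-assigned range):
> {noformat}
> container port (bind)          host port (reachable)
> 6666   spark.driver            31000
> 6677   spark.fileserver        31001
> 6688   spark.broadcast         31002
> 23456  spark.replClassServer   31003
> {noformat}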
> This happens because of two things:
> The Spark driver is effectively an {{akka-remote}} system with the
> {{akka.tcp}} transport. {{akka-remote}} prior to version {{2.4}} cannot
> advertise a port different from the one it is bound to. The settings in
> question are documented here:
> https://github.com/akka/akka/blob/f8c1671903923837f22d0726a955e0893add5e9f/akka-remote/src/main/resources/reference.conf#L345-L376.
> They do not exist in Akka {{2.3.x}}, so the Spark driver will always
> advertise port {{6666}}, as this is the one {{akka-remote}} is bound to.
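> For reference, the Akka 2.4 settings in question take this shape in HOCON
> (the values here are illustrative only):
> {noformat}
> akka.remote.netty.tcp {
>   hostname = "10.0.0.5"        # address advertised to remote peers
>   port = 31000                 # port advertised to remote peers
>   bind-hostname = "172.17.0.2" # address the socket actually binds to
>   bind-port = 6666             # port the socket actually binds to
> }
> {noformat}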
> Any URIs the executors contact the Spark Master on are prepared by the Spark
> Master and handed over to the executors, and they always contain the port
> number the Master exposes the given service on. The services are:
> - {{spark.broadcast.port}}
> - {{spark.fileserver.port}}
> - {{spark.replClassServer.port}}
> All of the above ports default to {{0}} (random assignment) but can be
> specified through Spark configuration ( {{-Dspark...port}} ). However, they
> are limited in the same way as {{spark.driver.port}}: in the above example,
> an executor should contact the file server not on port {{6677}} but on the
> respective {{31xxx}} port assigned by Mesos.
> Spark currently does not allow any of that.
> h4. Taking on the problem, step 1: Spark Driver
> As mentioned above, the Spark driver is based on {{akka-remote}}. In order to
> take on the problem, the {{akka.remote.netty.tcp.bind-hostname}} and
> {{akka.remote.netty.tcp.bind-port}} settings are a must. Spark does not yet
> compile with Akka 2.4.x.
> What we want is a backport of the mentioned {{akka-remote}} settings to the
> {{2.3.x}} versions. These patches are attached to this ticket - the
> {{2.3.4.patch}} and {{2.3.11.patch}} files provide patches for the respective
> Akka versions. They add the mentioned settings and ensure they work as
> documented for Akka 2.4; in other words, they are forward compatible.
> A part of that patch also exists in the patch for Spark, in the
> {{org.apache.spark.util.AkkaUtils}} class, which is where Spark creates the
> driver and assembles the Akka configuration. That part of the patch tells
> Akka to use {{bind-hostname}} instead of {{hostname}} if
> {{spark.driver.advertisedHost}} is given, and {{bind-port}} instead of
> {{port}} if {{spark.driver.advertisedPort}} is given. In such cases,
> {{hostname}} and {{port}} are set to the advertised values, respectively.
> *Worth mentioning:* if {{spark.driver.advertisedHost}} or
> {{spark.driver.advertisedPort}} is not given, patched Spark reverts to the
> settings as they would be with an unpatched {{akka-remote}} - precisely so
> that it keeps working when no patched {{akka-remote}} is in use. Even when it
> is in use, {{akka-remote}} correctly handles undefined {{bind-hostname}} and
> {{bind-port}}, as specified for Akka 2.4.x.
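> A minimal sketch of that logic, assuming the patched {{akka-remote}} (this is
> not the literal patch; the placeholder values stand in for what Spark
> resolves at runtime):
> {noformat}
> // Bind to the container's address/port, advertise the Agent's.
> val conf     = new org.apache.spark.SparkConf()
> val bindHost = "172.17.0.2"                        // container address (placeholder)
> val bindPort = conf.getInt("spark.driver.port", 0)
> val advHost  = conf.getOption("spark.driver.advertisedHost")
> val advPort  = conf.getOption("spark.driver.advertisedPort")
>
> // When no advertised values are given, fall back to the bind values -
> // exactly the unpatched behaviour.
> val akkaConf = com.typesafe.config.ConfigFactory.parseString(s"""
>   |akka.remote.netty.tcp.hostname = "${advHost.getOrElse(bindHost)}"
>   |akka.remote.netty.tcp.port = ${advPort.getOrElse(bindPort.toString)}
>   |akka.remote.netty.tcp.bind-hostname = "$bindHost"
>   |akka.remote.netty.tcp.bind-port = $bindPort
>   |""".stripMargin)
> {noformat}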
> h5. Akka versions in Spark (attached patches only)
> - Akka 2.3.4: Spark 1.4.0, Spark 1.4.1
> - Akka 2.3.11: Spark 1.5.0, Spark 1.5.1, Spark 1.6.0-SNAPSHOT
> h4. Taking on the problem, step 2: Spark services
> The fortunate thing is that every other Spark service runs over HTTP, using
> the {{org.apache.spark.HttpServer}} class. This is where the second part of
> the Spark patch comes into play. All other changes in the patch files provide
> an alternative {{advertised...}} port for each of the following services:
> - {{spark.broadcast.port}} -> {{spark.broadcast.advertisedPort}}
> - {{spark.fileserver.port}} -> {{spark.fileserver.advertisedPort}}
> - {{spark.replClassServer.port}} -> {{spark.replClassServer.advertisedPort}}
> What we are telling Spark here is the following: if an alternative
> {{advertisedPort}} setting is given to this server instance, use that setting
> when advertising the port.
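> In code terms this is a simple fallback at the point where the service URI is
> constructed. A sketch, with hypothetical names (the real change lives in
> {{org.apache.spark.HttpServer}} and its callers):
> {noformat}
> // Prefer the advertised port over the bound port when building the URI
> // handed to executors.
> def serviceUri(conf: org.apache.spark.SparkConf, host: String,
>                boundPort: Int, svc: String): String = {
>   val port = conf.getInt(s"spark.$svc.advertisedPort", boundPort)
>   s"http://$host:$port"
> }
> // e.g. serviceUri(conf, agentIp, 6677, "fileserver") yields port 31001
> // when spark.fileserver.advertisedPort=31001 is set.
> {noformat}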
> h4. Patches
> These patches are cleared by the Technicolor IP&L Team to be contributed back
> to Spark under the Apache 2.0 License.
> All patches for versions from {{1.4.0}} to {{1.5.2}} can be applied directly
> to the respective tag in the Spark git repository. The {{1.6.0.patch}}
> applies to git sha {{18350a57004eb87cafa9504ff73affab4b818e06}}.
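> Applying a patch follows the standard git flow, for example:
> {noformat}
> git checkout v1.5.2
> git apply 1.5.2.patch
> {noformat}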
> h4. Building Akka
> To build the required Akka version:
> {noformat}
> AKKA_VERSION=2.3.4
> git clone https://github.com/akka/akka.git .
> git fetch origin
> git checkout v${AKKA_VERSION}
> git apply ...2.3.4.patch
> sbt package -Dakka.scaladoc.diagrams=false
> {noformat}
> h4. What is not supplied
> At the moment of contribution, we do not supply any unit tests. We would like
> to contribute those but we may require some assistance.
> =====
> Happy to answer any questions, and looking forward to any guidance that would
> lead to having these changes included in the master Spark version.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]