Thank you, Andrew. That makes sense to me now. I was confused by "In
yarn-cluster mode, the Spark driver runs inside an application master
process which is managed by YARN on the cluster" in
http://spark.apache.org/docs/latest/running-on-yarn.html . After
your explanation, it's clear now. Thank you.

Best,

Fang, Yan
yanfang...@gmail.com
+1 (206) 849-4108


On Mon, Jul 7, 2014 at 1:07 PM, Andrew Or <and...@databricks.com> wrote:

> @Yan, the UI should still work. As long as you look into the container
> that launches the driver, you will find the SparkUI address and port. Note
> that in yarn-cluster mode the Spark driver doesn't actually run in the
> Application Master process itself; just like the executors, it runs in a
> container that is launched by the Resource Manager after the Application
> Master requests the container resources. In contrast, in yarn-client mode
> your driver is not launched in a container but in the client process that
> launched your application (i.e. spark-submit), so the stdout of that
> program directly contains the SparkUI messages.
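[Editorial note: Andrew's tip above (look in the driver container's log for the SparkUI address and port) can be sketched as a small script. This is a minimal illustration only; the helper name `find_spark_ui` is made up here, and it assumes the log line format shown later in this thread. The log text itself can be fetched with the standard YARN CLI, e.g. `yarn logs -applicationId <appId>`.]

```python
import re

def find_spark_ui(log_text):
    """Scan driver-container log text for a line like
    '14/07/07 11:59:29 INFO ui.SparkUI: Started SparkUI at http://10.0.0.63:58750'
    and return the SparkUI URL, or None if no such line is present."""
    m = re.search(r"Started SparkUI at (http://\S+)", log_text)
    return m.group(1) if m else None

sample = "14/07/07 11:59:29 INFO ui.SparkUI: Started SparkUI at http://10.0.0.63:58750"
print(find_spark_ui(sample))  # -> http://10.0.0.63:58750
```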
>
> @Chester, I'm not sure what has gone wrong, as there are many factors at
> play here. When you go to the Resource Manager UI, does the "application
> URL" link point you to the same SparkUI address as indicated in the logs?
> If so, this is the correct behavior. However, I believe the redirect error
> has little to do with Spark itself and more to do with how you set up the
> cluster. I have actually run into this myself, but I haven't found a
> workaround. Let me know if you find anything.
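[Editorial note: the redirect Chester describes goes through the YARN web application proxy. A sketch of the relevant yarn-site.xml entries follows; the host and port values are placeholders, and whether a standalone proxy server is used (versus the proxy embedded in the ResourceManager) depends on the cluster setup.]

```xml
<!-- yarn-site.xml: addresses the AM-proxy redirect resolves against. -->
<!-- If yarn.web-proxy.address is unset, the proxy runs embedded in the RM. -->
<property>
  <name>yarn.web-proxy.address</name>
  <value>rm-host.example.com:8089</value> <!-- placeholder host:port -->
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>rm-host.example.com:8088</value> <!-- placeholder host:port -->
</property>
```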
>
>
>
>
> 2014-07-07 12:07 GMT-07:00 Chester Chen <ches...@alpinenow.com>:
>
>> As Andrew explained, the port is random rather than 4040: the Spark
>> driver is started in the Application Master, and the port is randomly
>> selected.
>>
>> But I have a similar UI issue. I am running in yarn-cluster mode against
>> my local CDH5 cluster.
>>
>> The log states:
>>
>> "14/07/07 11:59:29 INFO ui.SparkUI: Started SparkUI at
>> http://10.0.0.63:58750"
>>
>>
>>
>> but when I click the Spark UI link (either the ApplicationMaster link or
>> http://10.0.0.63:58750 directly), I get a 404 with the redirect URI
>>
>> http://localhost/proxy/application_1404443455764_0010/
>>
>> Looking at the Spark code, I notice that "proxy" is really a variable
>> that resolves to the proxy HTTP address configured in yarn-site.xml. But
>> even when I specify the value in yarn-site.xml, it still doesn't work
>> for me.
>>
>> Oddly enough, it works for my co-worker on a Pivotal HD cluster, so I am
>> still looking into what differs in the cluster setup or elsewhere.
>>
>>
>> Chester
>>
>> On Mon, Jul 7, 2014 at 11:42 AM, Andrew Or <and...@databricks.com> wrote:
>>
>>> I will assume that you are running in yarn-cluster mode. Because the
>>> driver is launched in one of the containers, it doesn't make sense to
>>> expose port 4040 on the node that hosts the container. (Imagine multiple
>>> driver containers launched on the same node; this would cause a port
>>> collision.) If you're launching Spark from a gateway node that is
>>> physically near your worker nodes, you can just launch your application
>>> in yarn-client mode, in which case the SparkUI is always started on port
>>> 4040 on the node where you ran spark-submit. The reason you sometimes
>>> see the red text is that it appears only in the driver containers, not
>>> the executor containers: the SparkUI belongs to the SparkContext, which
>>> only exists on the driver.
>>>
>>> Andrew
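[Editorial note: Andrew's yarn-client suggestion, as a command-line sketch. The class and jar names are placeholders; the `--master yarn-client` syntax is the Spark 1.x-era form used in this thread.]

```sh
# Launch in yarn-client mode so the driver, and therefore the SparkUI
# (default port 4040), runs in the local spark-submit process.
# com.example.MyStreamingApp and my-app.jar are placeholders.
spark-submit \
  --master yarn-client \
  --class com.example.MyStreamingApp \
  my-app.jar
```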
>>>
>>>
>>> 2014-07-07 11:20 GMT-07:00 Yan Fang <yanfang...@gmail.com>:
>>>
>>> Hi guys,
>>>>
>>>> Not sure if you have run into similar issues; I did not find relevant
>>>> tickets in JIRA. When I deploy Spark Streaming to YARN, I have the
>>>> following two issues:
>>>>
>>>> 1. The UI port is random; it is not the default 4040. I have to look
>>>> at the container's log to find the UI port. Is it supposed to be this
>>>> way?
>>>>
>>>> 2. Most of the time, the UI does not work. The difference between the
>>>> logs (I ran the same program) is:
>>>>
>>>> 14/07/03 11:38:50 INFO spark.HttpServer: Starting HTTP Server
>>>> 14/07/03 11:38:50 INFO server.Server: jetty-8.y.z-SNAPSHOT
>>>> 14/07/03 11:38:50 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:12026
>>>> 14/07/03 11:38:51 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
>>>> 14/07/03 11:38:51 INFO executor.Executor: Running task ID 0...
>>>>
>>>> 14/07/02 16:55:32 INFO spark.HttpServer: Starting HTTP Server
>>>> 14/07/02 16:55:32 INFO server.Server: jetty-8.y.z-SNAPSHOT
>>>> 14/07/02 16:55:32 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:14211
>>>> 14/07/02 16:55:32 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
>>>> 14/07/02 16:55:32 INFO server.Server: jetty-8.y.z-SNAPSHOT
>>>> 14/07/02 16:55:32 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:21867
>>>> 14/07/02 16:55:32 INFO ui.SparkUI: Started SparkUI at http://myNodeName:21867
>>>> 14/07/02 16:55:32 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
>>>>
>>>> When the red lines (the AmIpFilter and SparkUI lines in the second log)
>>>> appear, the UI sometimes works. Any ideas? Thank you.
>>>>
>>>> Best,
>>>>
>>>> Fang, Yan
>>>> yanfang...@gmail.com
>>>> +1 (206) 849-4108
>>>>
>>>
>>>
>>
>
