Hi Aureliano,

A Spark Application is defined by everything executed within a given SparkContext. The application's web server runs on port 4040 of the machine where the application's driver is executing. A single instance of the Spark Shell is one example of a driver. This web UI on port 4040 displays statistics about the Application, such as the stages being executed, the number of tasks per stage, and the progress of the tasks within a stage. Other Application statistics include the caching locations and cached percentages of the RDDs used within the Application (and across its stages), and the garbage collection times of completed tasks.
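If you want to confirm from a script that the application UI is reachable while your driver is running, a plain TCP check is enough. This is only a minimal sketch: the helper name `ui_is_up` is made up for illustration, and it assumes the driver runs on the local machine and uses the default port 4040.

```python
import socket


def ui_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is listening on the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: the driver runs on this machine with the default UI port.
    print("application UI (4040) up:", ui_is_up("localhost", 4040))
```

While the application is running, this should report the UI as up. Once the SparkContext is gone, the same check should fail, because the web server lives and dies with the application.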
A Spark Cluster is defined by all the Applications executing on top of the resources provisioned to your particular deployment of Spark. These resources are managed by a Spark Master, which acts as the cluster manager in standalone mode and allocates worker resources to applications (unless you're using YARN or Mesos, in which case they provide the cluster manager). The UI on port 8080 is the UI of the Spark Master, and it is accessible on whichever node is currently running the Spark Master. This UI displays cluster statistics such as the number of available worker nodes, the number of JVM executor processes per worker node, the number of running Applications using the Cluster, et cetera.

In short, shutting down a Spark Application kills the UI on port 4040: the application has terminated, so there are no running statistics left to collect about it. The UI on port 8080, however, stays up and keeps reporting cluster-wide statistics until you shut down the cluster by killing the Spark Master.

Hope that long-winded explanation made sense!

Happy Holidays!

On Fri, Dec 27, 2013 at 9:23 AM, Aureliano Buendia <[email protected]> wrote:

> Hi,
>
> I'm a bit confused about web UI access of a stand alone spark app.
>
> - When running a spark app, a web server is launched at localhost:4040.
> When the standalone app execution is finished, the web server is shut down.
> What's the use of this web server? There is no way of reviewing the data
> when the standalone app exits.
>
> - Creating SparkContext at spark://localhost:7077 creates another web UI.
> Is this web UI supposed to be used with localhost:4040, or is it a
> replacement?
>
> - Creating a context with spark://localhost:7077, and after running
> ./bin/start-all.sh, I get this warning:
>
> WARN ClusterScheduler: Initial job has not accepted any resources; check
> your cluster UI to ensure that workers are registered and have sufficient
> memory
>
