The reason you get "connection refused" when connecting to the application
UI (port 4040) is that your app has stopped, so the application UI stops as
well. To inspect your executors' logs after the fact, you might find the
Spark History Server useful
<https://spark.apache.org/docs/latest/monitoring.html#viewing-after-the-fact>
(for standalone mode).
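
The History Server can only replay an app that wrote event logs while it
was running. As a rough sketch, the small Python helper below builds the
spark-submit flags that turn event logging on. The helper name and the
hdfs:///spark-logs directory are placeholders of mine, not values from your
setup; spark.eventLog.enabled and spark.eventLog.dir are the standard Spark
properties.

```python
# Sketch: build spark-submit flags that enable event logging, so the
# History Server can rebuild the UI after the app has stopped.
# The log directory passed in below is a placeholder, not a value from
# this thread.

def event_log_flags(log_dir):
    conf = {
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": log_dir,
    }
    flags = []
    for key, value in conf.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


flags = event_log_flags("hdfs:///spark-logs")
print(" ".join(flags))
```

The History Server itself needs spark.history.fs.logDirectory pointed at the
same directory, and is started with sbin/start-history-server.sh.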

Personally, I collect the logs from my worker nodes. They generally sit
under $SPARK_HOME/work/<app-id>/<executor-number> (for standalone).
There you can find exceptions and messages from the executors assigned to
your app.
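
If there are many executors, something like the following can save some
scrolling. It is only a sketch: the function name and the marker strings are
my own choices, and it assumes the standalone layout described above, where
each executor directory holds a stdout and a stderr file.

```python
import os


def find_error_lines(work_dir, app_id, markers=("Exception", "ERROR")):
    """Return (executor, file name, line) for every matching log line."""
    hits = []
    app_dir = os.path.join(work_dir, app_id)
    for executor in sorted(os.listdir(app_dir)):
        exec_dir = os.path.join(app_dir, executor)
        if not os.path.isdir(exec_dir):
            continue
        # Standalone workers write one stderr and one stdout per executor.
        for name in ("stderr", "stdout"):
            path = os.path.join(exec_dir, name)
            if not os.path.isfile(path):
                continue
            with open(path, errors="replace") as handle:
                for line in handle:
                    if any(marker in line for marker in markers):
                        hits.append((executor, name, line.rstrip("\n")))
    return hits
```

You would call it on each worker node as
find_error_lines(os.path.join(os.environ["SPARK_HOME"], "work"), app_id),
with app_id being the ID of your own application.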

Now, about your app crashing, it might be useful to check whether it is
sized correctly. The issue you linked sounds relevant; however, I would give
some sanity checks a try first. I have solved many issues just by sizing an
app properly, so I would first check memory size, CPU allocations and so on.
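
By sanity check I mean something as simple as the sketch below: compare what
the app requests per worker with what a worker actually has. The function
and its parameters are hypothetical names of mine, and it deliberately
ignores overhead memory and whatever else runs on the node. The requested
values correspond to spark.executor.memory and spark.executor.cores.

```python
def sizing_problems(executor_memory_gb, executor_cores, executors_per_worker,
                    worker_memory_gb, worker_cores):
    """Return a list of human-readable problems; empty if the request fits."""
    needed_memory = executor_memory_gb * executors_per_worker
    needed_cores = executor_cores * executors_per_worker
    problems = []
    if needed_memory > worker_memory_gb:
        problems.append(
            f"memory: need {needed_memory} GB, worker has {worker_memory_gb} GB"
        )
    if needed_cores > worker_cores:
        problems.append(
            f"cores: need {needed_cores}, worker has {worker_cores}"
        )
    return problems
```

For example, sizing_problems(8, 4, 3, 16, 8) reports both a memory and a
core shortfall, while sizing_problems(4, 2, 3, 16, 8) returns an empty list.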

Best,

On Tue, Jul 18, 2017 at 3:30 PM, Saatvik Shah <saatvikshah1...@gmail.com>
wrote:

> Hi Riccardo,
>
> Yes, Thanks for suggesting I do that.
>
> [Stage 1:==========================================>       (12750 + 40) /
> 15000]17/07/18 13:22:28 ERROR org.apache.spark.scheduler.LiveListenerBus:
> Dropping SparkListenerEvent because no remaining room in event queue. This
> likely means one of the SparkListeners is too slow and cannot keep up with
> the rate at which tasks are being started by the scheduler.
> 17/07/18 13:22:28 WARN org.apache.spark.scheduler.LiveListenerBus:
> Dropped 1 SparkListenerEvents since Thu Jan 01 00:00:00 UTC 1970
> [Stage 1:============================================>     (13320 + 41) /
> 15000]17/07/18 13:23:28 WARN org.apache.spark.scheduler.LiveListenerBus:
> Dropped 26782 SparkListenerEvents since Tue Jul 18 13:22:28 UTC 2017
> [Stage 1:==============================================>   (13867 + 40) /
> 15000]17/07/18 13:24:28 WARN org.apache.spark.scheduler.LiveListenerBus:
> Dropped 58751 SparkListenerEvents since Tue Jul 18 13:23:28 UTC 2017
> [Stage 1:===============================================>  (14277 + 40) /
> 15000]17/07/18 13:25:10 INFO org.spark_project.jetty.server.ServerConnector:
> Stopped ServerConnector@3b7284c4{HTTP/1.1}{0.0.0.0:4040}
> 17/07/18 13:25:10 ERROR org.apache.spark.scheduler.LiveListenerBus:
> SparkListenerBus has already stopped! Dropping event
> SparkListenerExecutorMetricsUpdate(4,WrappedArray())
> And similar WARN/INFO messages continue occurring.
>
> When I try to access the UI, I get:
>
> Problem accessing /proxy/application_1500380353993_0001/. Reason:
>
>     Connection to http://10.142.0.17:4040 refused
>
> Caused by:
>
> org.apache.http.conn.HttpHostConnectException: Connection to http://10.142.0.17:4040 refused
>       at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
>       at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
>       at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:643)
>       at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
>       at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>       at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>       at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>       at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.proxyLink(WebAppProxyServlet.java:200)
>       at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:387)
>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>
>
>
> I noticed this issue talks about something similar and I guess is related:
> https://issues.apache.org/jira/browse/SPARK-18838.
>
> On Tue, Jul 18, 2017 at 2:49 AM, Riccardo Ferrari <ferra...@gmail.com>
> wrote:
>
>> Hi,
>>  can you share more details? Do you have any exceptions from the driver
>> or the executors?
>>
>> best,
>>
>> On Jul 18, 2017 02:49, "saatvikshah1994" <saatvikshah1...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a PySpark app which, when given a huge amount of data as input,
>>> sometimes throws the error explained here:
>>> https://stackoverflow.com/questions/32340639/unable-to-understand-error-sparklistenerbus-has-already-stopped-dropping-event
>>> All my code is running inside the main function, and the only slightly
>>> peculiar thing I am doing in this app is using a custom PySpark ML
>>> Transformer (modified from
>>> https://stackoverflow.com/questions/32331848/create-a-custom-transformer-in-pyspark-ml).
>>> Could this be the issue? How can I debug why this is happening?
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-UI-crashes-on-Large-Workloads-tp28873.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>
>>>
>
>
> --
> *Saatvik Shah,*
> *Masters in the School of Computer Science,*
> *Carnegie Mellon University,*
> *LinkedIn <https://www.linkedin.com/in/saatvikshah/>, Website
> <https://saatvikshah1994.github.io/>*
>
