Re: Spark webUI - application details page
Where is the history server running? Is it running on the same node as the logs directory?

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p21374.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark webUI - application details page
Hi,

I don't have any history server running. As SK already pointed out in a previous post, the history server seems to be required only in Mesos or YARN mode, not in standalone mode.

https://spark.apache.org/docs/1.1.1/monitoring.html: "If Spark is run on Mesos or YARN, it is still possible to reconstruct the UI of a finished application through Spark's history server, provided that the application's event logs exist."

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p21379.html
Re: Spark webUI - application details page
Hi,

I have a similar problem. I want to see the detailed logs of Completed Applications, so I've set in my program:

    set("spark.eventLog.enabled", "true")
    set("spark.eventLog.dir", "file:/tmp/spark-events")

But when I click on the application in the web UI, I get a page with the message:

    Application history not found (app-20150126000651-0331)
    No event logs found for application xxx$ in file:/tmp/spark-events/xxx-147211500.
    Did you specify the correct logging directory?

despite the fact that the directory exists and contains 3 files:

    APPLICATION_COMPLETE*
    EVENT_LOG_1*
    SPARK_VERSION_1.1.0*

I use Spark 1.1.0 on a standalone cluster with 3 nodes. Any suggestions to solve the problem?

Thanks.

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p21358.html
Re: Spark webUI - application details page
Perhaps you need to set this in your spark-defaults.conf so that it's already set when your slave/worker processes start.

-Joe

On 1/25/15, 6:50 PM, ilaxes ila...@hotmail.com wrote:

Hi, I have a similar problem. I want to see the detailed logs of Completed Applications, so I've set in my program: set("spark.eventLog.enabled", "true") and set("spark.eventLog.dir", "file:/tmp/spark-events"). But when I click on the application in the web UI, I get a page with the message: "Application history not found (app-20150126000651-0331) No event logs found for application xxx$ in file:/tmp/spark-events/xxx-147211500. Did you specify the correct logging directory?" despite the fact that the directory exists and contains 3 files: APPLICATION_COMPLETE*, EVENT_LOG_1*, SPARK_VERSION_1.1.0*. I use Spark 1.1.0 on a standalone cluster with 3 nodes. Any suggestions to solve the problem? Thanks.

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p21358.html
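Joe's suggestion amounts to putting the two event-log properties into spark-defaults.conf before the daemons start. A minimal sketch of what that file could contain, written and read back here with plain Python for illustration (the directory path and values are examples, and the whitespace-separated key/value parsing only mimics how Spark reads the file):

```python
import os
import tempfile

# Illustrative spark-defaults.conf contents; in practice the file lives in
# $SPARK_HOME/conf and the event-log directory must exist before jobs start.
conf_lines = [
    "spark.eventLog.enabled  true",
    "spark.eventLog.dir      file:///tmp/spark-events",
]

conf_dir = tempfile.mkdtemp()
conf_path = os.path.join(conf_dir, "spark-defaults.conf")
with open(conf_path, "w") as f:
    f.write("\n".join(conf_lines) + "\n")

# Read it back the way Spark treats the file: one whitespace-separated
# key/value pair per line, '#' lines ignored.
settings = {}
with open(conf_path) as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith("#"):
            key, value = line.split(None, 1)
            settings[key] = value.strip()

print(settings["spark.eventLog.enabled"])  # -> true
```

Because the properties come from spark-defaults.conf rather than from SparkConf.set() inside the program, the master and workers see them as soon as they start.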
Re: Spark webUI - application details page
How did you specify the HDFS path? When I put

    spark.eventLog.dir hdfs://crosby.research.intel-research.net:54310/tmp/spark-events

in my spark-defaults.conf file, I receive the following error:

    An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
    : java.io.IOException: Call to crosby.research.intel-research.net/10.212.84.53:54310 failed on local exception: java.io.EOFException

-Brad

On Thu, Aug 28, 2014 at 12:26 PM, SK skrishna...@gmail.com wrote:

I was able to recently solve this problem for standalone mode. For this mode, I did not use a history server. Instead, I set spark.eventLog.dir (in conf/spark-defaults.conf) to a directory in HDFS (basically, this directory should be in a place that is writable by the master and globally accessible to all the nodes).

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p13055.html
Re: Spark webUI - application details page
Hi All,

@Andrew Thanks for the tips. I just built the master branch of Spark last night, but am still having problems viewing history through the standalone UI. I dug into the Spark job event directories as you suggested, and I see at a minimum 'SPARK_VERSION_1.0.0' and 'EVENT_LOG_1'; for applications that call 'sc.stop()' I also see 'APPLICATION_COMPLETE'. The version and application-complete files are empty; the event log file contains the information one would need to repopulate the web UI.

The following may be helpful in debugging this:

- Each job directory (e.g. '/tmp/spark-events/testhistoryjob-1409246088110') and the files within are owned by the user who ran the job, with permissions 770. This prevents the 'spark' user from accessing the contents.
- When I make a directory and its contents accessible to the spark user, the history server (invoked as 'sbin/start-history-server.sh /tmp/spark-events') is able to display the history, but the standalone web UI still produces the following error: 'No event logs found for application HappyFunTimes in file:///tmp/spark-events/testhistoryjob-1409246088110. Did you specify the correct logging directory?'
- In case it matters, I'm running pyspark.

Do you know what may be causing this? When you attempt to reproduce locally, who do you observe owns the files in /tmp/spark-events?

best,
-Brad

On Tue, Aug 26, 2014 at 8:51 AM, SK skrishna...@gmail.com wrote:

I have already tried setting up the history server and accessing it on master-url:18080 as per the link. But the page does not list any completed applications. As I mentioned in my previous mail, I am running Spark in standalone mode on the cluster (as well as on my local machine). According to the link, it appears that the history server is required only in Mesos or YARN mode, not in standalone mode.

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12834.html
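The 770-permission problem described above can be detected mechanically. Below is a small, hypothetical helper (not part of Spark) that checks the mode bits and flags event-log directories that a different user, such as a dedicated 'spark' user, would be unable to read:

```python
import os
import stat
import tempfile

def world_accessible(path):
    """Return True if 'other' users can read (and, for directories, enter) path."""
    mode = os.stat(path).st_mode
    if stat.S_ISDIR(mode):
        # A directory needs both the r and x bits for others to list and enter it.
        return bool(mode & stat.S_IROTH) and bool(mode & stat.S_IXOTH)
    return bool(mode & stat.S_IROTH)

# Simulate the situation from the report: a job directory created with mode 770.
job_dir = tempfile.mkdtemp()
os.chmod(job_dir, 0o770)
print(world_accessible(job_dir))  # -> False: a non-owner 'spark' user is locked out

os.chmod(job_dir, 0o775)
print(world_accessible(job_dir))  # -> True
```

This only inspects permission bits, not group membership; adding the spark user to the owning group would be another way around a 770 directory.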
Re: Spark webUI - application details page
I was able to recently solve this problem for standalone mode. For this mode, I did not use a history server. Instead, I set spark.eventLog.dir (in conf/spark-defaults.conf) to a directory in HDFS (basically, this directory should be in a place that is writable by the master and globally accessible to all the nodes).

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p13055.html
Re: Spark webUI - application details page
Have a look at the history server; it looks like you have enabled the history server on your local machine and not on the remote server.

http://people.apache.org/~tdas/spark-1.0.0-rc11-docs/monitoring.html

Thanks
Best Regards

On Tue, Aug 26, 2014 at 7:01 AM, SK skrishna...@gmail.com wrote:

Hi,

I am able to access the Application details web page from the master UI page when I run Spark in standalone mode on my local machine. However, I am not able to access it when I run Spark on our private cluster. The Spark master runs on one of the nodes in the cluster. I am able to access the Spark master UI at spark://master-url:8080. It shows the listing of all the running and completed apps. When I click on the completed app and access the Application details link, the link points to master-url/app/?appId=app-idvalue, and when I view the page's html source, the href portion is blank (""). However, on my local machine, when I click the Application details link for a completed app, it correctly points to master-url/history/app-id, and when I view the page's html source, the href portion points to /history/app-id.

On the cluster, I have set spark.eventLog.enabled to true in $SPARK_HOME/conf/spark-defaults.conf on the master node as well as all the slave nodes. I am using Spark 1.0.1 on the cluster. I am not sure why I am able to access the application details for completed apps when the app runs on my local machine but not for the apps that run on our cluster, although in both cases I am using Spark 1.0.1 in standalone mode. Do I need to do any additional configuration to enable this history on the cluster?

thanks

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12792.html
Re: Spark webUI - application details page
I have already tried setting up the history server and accessing it on master-url:18080 as per the link. But the page does not list any completed applications. As I mentioned in my previous mail, I am running Spark in standalone mode on the cluster (as well as on my local machine). According to the link, it appears that the history server is required only in Mesos or YARN mode, not in standalone mode.

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12834.html
Re: Spark webUI - application details page
Hi,

I am able to access the Application details web page from the master UI page when I run Spark in standalone mode on my local machine. However, I am not able to access it when I run Spark on our private cluster. The Spark master runs on one of the nodes in the cluster. I am able to access the Spark master UI at spark://master-url:8080. It shows the listing of all the running and completed apps. When I click on the completed app and access the Application details link, the link points to master-url/app/?appId=app-idvalue, and when I view the page's html source, the href portion is blank (""). However, on my local machine, when I click the Application details link for a completed app, it correctly points to master-url/history/app-id, and when I view the page's html source, the href portion points to /history/app-id.

On the cluster, I have set spark.eventLog.enabled to true in $SPARK_HOME/conf/spark-defaults.conf on the master node as well as all the slave nodes. I am using Spark 1.0.1 on the cluster. I am not sure why I am able to access the application details for completed apps when the app runs on my local machine but not for the apps that run on our cluster, although in both cases I am using Spark 1.0.1 in standalone mode. Do I need to do any additional configuration to enable this history on the cluster?

thanks

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12792.html
Re: Spark webUI - application details page
Hi Andrew,

I'm running something close to the present master (I compiled several days ago) but am having some trouble viewing history. I set spark.eventLog.enabled to true, but continually receive the error message (via the web UI): "Application history not found... No event logs found for application ml-pipeline in file:/tmp/spark-events/ml-pipeline-1408117588599."

I tried 2 fixes:
- I manually set spark.eventLog.dir to a path beginning with file:///, believing that perhaps the problem was an invalid protocol specification.
- I inspected /tmp/spark-events manually and noticed that each job directory (and the files therein) was owned by the user who launched the job and was not world-readable. Since I run Spark from a dedicated Spark user, I set the files world-readable, but I still receive the same "Application history not found" error.

Is there a configuration step I may be missing?

-Brad

On Thu, Aug 14, 2014 at 7:33 PM, Andrew Or and...@databricks.com wrote:

Hi SK,

Not sure if I understand you correctly, but here is how the user normally uses the event logging functionality: after setting spark.eventLog.enabled and optionally spark.eventLog.dir, the user runs his/her Spark application and calls sc.stop() at the end of it. Then he/she goes to the standalone Master UI (under http://master-url:8080 by default) and clicks on the application under the Completed Applications table. This links to the Spark UI of the finished application in its completed state, under a path that looks like http://master-url:8080/history/app-id. It won't be on http://localhost:4040 anymore, because that port is now freed for new applications to bind their SparkUIs to. To access the file that stores the raw statistics, go to the file specified in spark.eventLog.dir. This is by default /tmp/spark-events, though in Spark 1.0.1 it may be in HDFS under the same path.

I could be misunderstanding what you mean by the stats being buried in the console output, because the events are not logged to the console but to a file in spark.eventLog.dir. For all of this to work, of course, you have to run Spark in standalone mode (i.e. with master set to spark://master-url:7077). In other modes, you will need to use the history server instead.

Does this make sense?

Andrew

2014-08-14 18:08 GMT-07:00 SK skrishna...@gmail.com:

More specifically, as indicated by Patrick above, in 1.0+, apps will have persistent state so that the UI can be reloaded. Is there a way to enable this feature in 1.0.1?

thanks

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12157.html
Re: Spark webUI - application details page
Hi,

Ok, I was specifying --master local. I changed that to --master spark://localhostname:7077 and am now able to see the completed applications. It provides summary stats about runtime and memory usage, which is sufficient for me at this time. However, it doesn't seem to archive the info in the application detail UI that lists detailed stats about the completed stages of the application, which would be useful for identifying bottleneck steps in a large application. I guess we need to capture the application detail UI screen before the app run completes, or find a way to extract this info by parsing the JSON log file in /tmp/spark-events.

thanks

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12187.html
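Extracting stage-level stats by parsing the JSON event log, as suggested above, might look like the following sketch. The lines here are synthetic, and the event and field names ("SparkListenerStageCompleted", "Stage Info", "Submission Time", "Completion Time") are assumptions modeled on Spark's JSON event-log format of this era; check them against an actual file under /tmp/spark-events before relying on them:

```python
import json

# Synthetic event-log lines standing in for /tmp/spark-events/<app>/EVENT_LOG_1.
# Timestamps are illustrative milliseconds.
raw_lines = [
    json.dumps({"Event": "SparkListenerStageCompleted",
                "Stage Info": {"Stage ID": 0, "Stage Name": "map",
                               "Submission Time": 1000, "Completion Time": 1800}}),
    json.dumps({"Event": "SparkListenerStageCompleted",
                "Stage Info": {"Stage ID": 1, "Stage Name": "collect",
                               "Submission Time": 1800, "Completion Time": 4200}}),
]

# Collect per-stage wall-clock durations from stage-completed events.
durations = {}
for line in raw_lines:
    event = json.loads(line)
    if event.get("Event") == "SparkListenerStageCompleted":
        info = event["Stage Info"]
        durations[info["Stage Name"]] = (
            info["Completion Time"] - info["Submission Time"])

# The slowest stage is a first guess at the bottleneck.
bottleneck = max(durations, key=durations.get)
print(bottleneck, durations[bottleneck])  # -> collect 2400
```

For a real log, replace raw_lines with the lines of the event log file; each line is one JSON object.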
Re: Spark webUI - application details page
@Brad

Your configuration looks alright to me. We parse both file:/ and file:/// the same way, so that shouldn't matter. I just tried this on the latest master and verified that it works for me. Can you dig into the directory /tmp/spark-events/ml-pipeline-1408117588599 to make sure that it's not empty? In particular, look for a file that looks like EVENT_LOG_0, then check the content of that file. The last event (on the last line) of the file should be an Application Complete event. If this is not true, it's likely that your application did not call sc.stop(), though the logs should still show up in spite of that. If all of that fails, try logging to a more accessible place by setting spark.eventLog.dir. Let me know if that helps.

@SK

You shouldn't need to capture the screen before it finishes; the whole point of the event logging functionality is that the user doesn't have to do that themselves. What happens if you click into the application detail UI? In Spark 1.0.1, if it can't find the logs it may just refresh instead of printing a more explicit message. However, from your configuration you should be able to see the detailed stage information in the UI, in addition to just the summary statistics under Completed Applications. I have listed a few debugging steps in the paragraph above, so maybe they're also applicable to you.

Let me know if that works,
Andrew

2014-08-15 11:07 GMT-07:00 SK skrishna...@gmail.com:

Hi, Ok, I was specifying --master local. I changed that to --master spark://localhostname:7077 and am now able to see the completed applications. It provides summary stats about runtime and memory usage, which is sufficient for me at this time. However, it doesn't seem to archive the info in the application detail UI that lists detailed stats about the completed stages of the application, which would be useful for identifying bottleneck steps in a large application. I guess we need to capture the application detail UI screen before the app run completes, or find a way to extract this info by parsing the JSON log file in /tmp/spark-events.

thanks

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12187.html
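The check Andrew suggests on the last event of EVENT_LOG_0 can be scripted. The sketch below builds a synthetic event log and inspects its last line; the event names ("SparkListenerApplicationStart"/"SparkListenerApplicationEnd") are assumptions about the JSON event-log format of this era and should be verified against a real log file:

```python
import json
import os
import tempfile

def last_event(log_path):
    """Return the 'Event' field of the last non-empty line of an event log."""
    with open(log_path) as f:
        lines = [ln for ln in f if ln.strip()]
    return json.loads(lines[-1]).get("Event") if lines else None

# Synthetic two-line event log standing in for EVENT_LOG_0.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "EVENT_LOG_0")
with open(log_path, "w") as f:
    f.write(json.dumps({"Event": "SparkListenerApplicationStart"}) + "\n")
    f.write(json.dumps({"Event": "SparkListenerApplicationEnd"}) + "\n")

# If the application called sc.stop(), the final event should mark completion.
ok = last_event(log_path) == "SparkListenerApplicationEnd"
print(ok)  # -> True
```

If the final event is missing on a real log, that points to the application never reaching sc.stop(), which matches Andrew's diagnosis above.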
Re: Spark webUI - application details page
Hi,

I am using Spark 1.0.1, but I am still not able to see the stats for completed apps on port 4040 - only for running apps. Is this feature supported, or is there a way to log this info to some file? I am interested in stats about the total # of executors, total runtime, and total memory used by my Spark program.

thanks

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12144.html
Re: Spark webUI - application details page
If I'm not misunderstanding you, setting event logging in SPARK_JAVA_OPTS should achieve what you want. I'm logging to HDFS, but according to the config page http://spark.apache.org/docs/latest/configuration.html a local folder should be possible as well. Example, with all other settings removed:

    SPARK_JAVA_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://idp11:9100/user/myname/logs/"

This works with the Spark shell; I haven't tested other applications though. Note that the completed applications will disappear from the list if you restart Spark completely, even though they'll still be stored in the log folder.

Best regards,
Simon

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12150.html
Re: Spark webUI - application details page
Hi all,

As Simon explained, you need to set spark.eventLog.enabled to true. I'd like to add that the use of SPARK_JAVA_OPTS to set Spark configurations is deprecated. I'm sure many of you have noticed this from the scary warning message we print out. :) The recommended and supported way of setting this is by adding the line

    spark.eventLog.enabled true

to $SPARK_HOME/conf/spark-defaults.conf. This will be picked up by spark-submit and passed to your application.

Cheers,
Andrew

2014-08-14 15:45 GMT-07:00 durin m...@simon-schaefer.net:

If I'm not misunderstanding you, setting event logging in SPARK_JAVA_OPTS should achieve what you want. I'm logging to HDFS, but according to the config page http://spark.apache.org/docs/latest/configuration.html a local folder should be possible as well. Example, with all other settings removed: SPARK_JAVA_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://idp11:9100/user/myname/logs/". This works with the Spark shell; I haven't tested other applications though. Note that the completed applications will disappear from the list if you restart Spark completely, even though they'll still be stored in the log folder.

Best regards,
Simon

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12150.html
Re: Spark webUI - application details page
I set spark.eventLog.enabled to true in $SPARK_HOME/conf/spark-defaults.conf and also configured logging to a file as well as the console in log4j.properties. But I am not able to get a log of the statistics in a file. On the console there are a lot of log messages along with the stats, so it is hard to separate the stats out. I prefer the online format that appears on localhost:4040 - it is clearer. I am running the job in standalone mode on my local machine. Is there some way to recreate the stats online after the job has completed?

thanks

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12156.html
Re: Spark webUI - application details page
Hi SK,

Not sure if I understand you correctly, but here is how the user normally uses the event logging functionality: after setting spark.eventLog.enabled and optionally spark.eventLog.dir, the user runs his/her Spark application and calls sc.stop() at the end of it. Then he/she goes to the standalone Master UI (under http://master-url:8080 by default) and clicks on the application under the Completed Applications table. This links to the Spark UI of the finished application in its completed state, under a path that looks like http://master-url:8080/history/app-id. It won't be on http://localhost:4040 anymore, because that port is now freed for new applications to bind their SparkUIs to. To access the file that stores the raw statistics, go to the file specified in spark.eventLog.dir. This is by default /tmp/spark-events, though in Spark 1.0.1 it may be in HDFS under the same path.

I could be misunderstanding what you mean by the stats being buried in the console output, because the events are not logged to the console but to a file in spark.eventLog.dir. For all of this to work, of course, you have to run Spark in standalone mode (i.e. with master set to spark://master-url:7077). In other modes, you will need to use the history server instead.

Does this make sense?

Andrew

2014-08-14 18:08 GMT-07:00 SK skrishna...@gmail.com:

More specifically, as indicated by Patrick above, in 1.0+, apps will have persistent state so that the UI can be reloaded. Is there a way to enable this feature in 1.0.1?

thanks

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p12157.html
Spark webUI - application details page
Is there a way to see the 'Application Detail UI' page (at master:4040) for completed applications? Currently, I can see that page only for running applications; I would like to see the various numbers for the application after it has completed.
Re: Spark webUI - application details page
This will be a feature in Spark 1.0, but it is not yet released. In 1.0, Spark applications can persist their state so that the UI can be reloaded after they have completed.

- Patrick

On Sun, Mar 30, 2014 at 10:30 AM, David Thomas dt5434...@gmail.com wrote:

Is there a way to see the 'Application Detail UI' page (at master:4040) for completed applications? Currently, I can see that page only for running applications; I would like to see the various numbers for the application after it has completed.