[ https://issues.apache.org/jira/browse/SPARK-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-4539:
-----------------------------
Component/s: Spark Core
> History Server counts "incomplete" applications against the
> "retainedApplications" total, fails to show eligible "completed" applications
> -----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-4539
> URL: https://issues.apache.org/jira/browse/SPARK-4539
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.2.0
> Reporter: Ryan Williams
>
> I have seen the history server return 0 or 1 applications from a
> directory that contains many complete and incomplete applications (the latter
> being application directories that are missing the {{APPLICATION_COMPLETE}}
> file).
> Without having dug too much, my theory is that HistoryServer is seeing the
> "incomplete" directories and counting them against the
> {{retainedApplications}} maximum but not displaying them.
> One supporting anecdote for this is that I loaded HS against a directory that
> had one complete application and nothing else, and HS worked as expected (I
> saw the one application in the web UI).
> I then copied ~100 other application directories in, the majority of which
> were "incomplete" (in particular, most of the ones that had the earliest
> timestamps), and still only saw the one original completed application via
> the web UI.
> Finally, I restarted the same server with {{retainedApplications}} set to
> 1000 (instead of 50; the directory at this point had ~10 completed
> applications and ~90 incomplete ones), and saw exactly the completed
> applications, leading me to believe that they were being "boxed out" of the
> maximum-50-retained-applications iteration of the history server.
> Silently failing on "incomplete" directories while still docking the count,
> if that is indeed what is happening, is a pretty confusing failure mode.
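
The suspected failure mode can be sketched as a toy model. This is not the HistoryServer code: the names AppDir, listing_suspected, and listing_expected are hypothetical, and the ordering (incomplete directories first, because they have the earliest timestamps) is an assumption taken from the report above. The sketch only shows how capping the listing before filtering out incomplete directories would hide completed applications, while filtering first would not.

```python
from dataclasses import dataclass


@dataclass
class AppDir:
    """Hypothetical stand-in for one application directory in the log dir."""
    name: str
    completed: bool  # True if the APPLICATION_COMPLETE marker file is present


def listing_suspected(dirs, retained):
    """Suspected order of operations: cap first, then hide incomplete apps."""
    capped = dirs[:retained]
    return [d.name for d in capped if d.completed]


def listing_expected(dirs, retained):
    """Expected order of operations: hide incomplete apps, then cap."""
    completed = [d for d in dirs if d.completed]
    return [d.name for d in completed[:retained]]


# Assumed layout: 90 incomplete directories with the earliest timestamps,
# followed by 10 completed ones.
dirs = [AppDir(f"app-{i:03d}", completed=False) for i in range(90)]
dirs += [AppDir(f"app-{i:03d}", completed=True) for i in range(90, 100)]

suspected_50 = listing_suspected(dirs, 50)      # cap of 50 is used up by incomplete dirs
suspected_1000 = listing_suspected(dirs, 1000)  # cap is large enough to reach completed dirs
expected_50 = listing_expected(dirs, 50)        # incomplete dirs never count against the cap

print(len(suspected_50), len(suspected_1000), len(expected_50))
```

Under these assumptions, the cap of 50 is exhausted before any completed application is reached, which would match seeing 0 applications with the default setting and all completed ones after raising {{retainedApplications}} to 1000.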
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)