Ryan Williams created SPARK-4539:
------------------------------------
Summary: History Server counts "incomplete" applications against
the "retainedApplications" total, fails to show eligible "completed"
applications
Key: SPARK-4539
URL: https://issues.apache.org/jira/browse/SPARK-4539
Project: Spark
Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Ryan Williams
I have observed the history server to return 0 or 1 applications from a
directory that contains many complete and incomplete applications (the latter
being application directories that are missing the {{APPLICATION_COMPLETE}}
file).
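For anyone trying to reproduce this, a quick way to see how many directories fall into each group is sketched below. It assumes the event log directory is on the local filesystem and that each application has its own subdirectory; the helper name {{split_by_completion}} is made up for this illustration and is not part of Spark.

```python
from pathlib import Path


def split_by_completion(log_dir):
    """Return (complete, incomplete) lists of application directory names.

    A directory counts as complete if it contains an APPLICATION_COMPLETE
    file, which is the marker described in this report.
    """
    complete, incomplete = [], []
    for entry in sorted(Path(log_dir).iterdir()):
        if not entry.is_dir():
            continue  # ignore stray files at the top level
        if (entry / "APPLICATION_COMPLETE").exists():
            complete.append(entry.name)
        else:
            incomplete.append(entry.name)
    return complete, incomplete
```

Calling {{split_by_completion("/path/to/event-logs")}} (the path is a placeholder) returns the two lists, which can then be compared against what the web UI shows.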
Without having dug too much, my theory is that HistoryServer is seeing the
"incomplete" directories and counting them against the {{retainedApplications}}
maximum but not displaying them.
One supporting anecdote for this is that I loaded HS against a directory that
had one complete application and nothing else, and HS worked as expected (I saw
the one application in the web UI).
I then copied ~100 other application directories in, the majority of which were
"incomplete" (in particular, most of the ones that had the earliest
timestamps), and still only saw the one original completed application via the
web UI.
Finally, I restarted the same server with {{retainedApplications}} set to
1000 (instead of 50; the directory at this point had ~10 completed applications
and 90 incomplete ones), and saw all of the completed applications and nothing
else, leading me to believe that they were being "boxed out" of the
maximum-50-retained-applications iteration of the history server.
Silently failing on "incomplete" directories while still docking the count, if
that is indeed what is happening, is a pretty confusing failure mode.
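To make the theory concrete, here is a minimal sketch of the suspected ordering problem. It is not taken from the HistoryServer source; {{AppDir}}, {{visible_suspected}} and {{visible_expected}} are hypothetical names, and the directory listing is invented to roughly match the numbers above (90 incomplete applications listed ahead of 10 completed ones).

```python
from dataclasses import dataclass


@dataclass
class AppDir:
    name: str
    complete: bool  # True if the directory contains APPLICATION_COMPLETE


def visible_suspected(dirs, retained):
    """Suspected behaviour: apply the cap first, then drop incomplete entries."""
    return [d for d in dirs[:retained] if d.complete]


def visible_expected(dirs, retained):
    """Expected behaviour: drop incomplete entries first, then apply the cap."""
    return [d for d in dirs if d.complete][:retained]


# Hypothetical listing: 90 incomplete applications followed by 10 completed ones.
dirs = [AppDir(f"app-{i:03d}", complete=False) for i in range(90)]
dirs += [AppDir(f"app-{i:03d}", complete=True) for i in range(90, 100)]

print(len(visible_suspected(dirs, 50)))    # 0
print(len(visible_suspected(dirs, 1000)))  # 10
print(len(visible_expected(dirs, 50)))     # 10
```

If the server really does cap before filtering, a limit of 50 shows nothing here while a limit of 1000 shows all 10 completed applications, which is in line with what I observed. Filtering before capping would show the 10 completed applications in both cases.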
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)