[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514598#comment-15514598
 ] 

Robert Kanter commented on MAPREDUCE-6718:
------------------------------------------

Two more things I found while actually trying this out:
# The percentage calculation seems to be off.  When I had 0 jobs, it logged this:
{noformat}
2016-09-22 14:40:26,650 INFO 
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Found 0 directories to 
load
2016-09-22 14:40:26,650 INFO 
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Existing job 
initialization finished. 0.0% of cache is loaded.
{noformat}
And when I had ~20 jobs, it said this:
{noformat}
2016-09-22 14:52:02,491 INFO hs.HistoryFileManager: Found 1 directories to load
2016-09-22 14:52:02,541 INFO hs.HistoryFileManager: Existing job initialization 
finished. 0.125% of cache is loaded.
{noformat}
# The timing of the messages is also less useful than intended.  The idea is 
to print progress while the JHS is loading files and appears to be stuck, but 
when I had ~20 jobs, the log showed this:
{noformat}
2016-09-22 14:51:52,303 INFO jobhistory.JobHistoryUtils: Default file system 
[hdfs://0.0.0.0:23010]
2016-09-22 14:52:02,473 INFO hs.HistoryFileManager: Initializing Existing 
Jobs...
2016-09-22 14:52:02,491 INFO hs.HistoryFileManager: Found 1 directories to load
2016-09-22 14:52:02,541 INFO hs.HistoryFileManager: Existing job initialization 
finished. 0.125% of cache is loaded.
{noformat}
The gap, only ~10 seconds in this case, falls between the "Default file 
system" message and the "Initializing Existing Jobs..." message.  So while 
the JHS is loading, all I see in the log is the "Default file system" 
message, and not the "Initializing Existing Jobs..." message.
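
To make the first point concrete, here is a minimal sketch of how the final message could report the raw job count next to the percentage, so the reader can check the arithmetic directly. It is not taken from the patch: the class name, both helper methods, and the cache capacity of 20000 are assumptions chosen for illustration only.

```java
import java.util.Locale;

// Hypothetical illustration; none of these names come from the patch.
final class CacheLoadProgress {

    // Percentage of cache capacity occupied by the loaded jobs.
    // Returns 0.0 when the capacity is not positive, to avoid dividing by zero.
    static double percentLoaded(long loadedJobs, long cacheCapacity) {
        if (cacheCapacity <= 0) {
            return 0.0;
        }
        return 100.0 * loadedJobs / cacheCapacity;
    }

    // Builds a message that shows the counts as well as the percentage,
    // so the percentage can be verified from the log line alone.
    static String progressMessage(long loadedJobs, long cacheCapacity) {
        return String.format(Locale.ROOT,
            "Existing job initialization finished. Loaded %d jobs; "
                + "%.3f%% of cache capacity (%d) is in use.",
            loadedJobs, percentLoaded(loadedJobs, cacheCapacity), cacheCapacity);
    }

    public static void main(String[] args) {
        // Assumed capacity of 20000, used only as an example value.
        System.out.println(progressMessage(0, 20000));
        System.out.println(progressMessage(25, 20000));
    }
}
```

With the counts in the message, it is easier to tell whether a small percentage reflects a bug or just a large cache relative to the number of jobs.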

> add progress log to JHS during startup
> --------------------------------------
>
>                 Key: MAPREDUCE-6718
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6718
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Minor
>              Labels: supportability
>         Attachments: mapreduce6718.001.patch, mapreduce6718.002.patch
>
>
> When the JHS starts up, it initializes the internal caches and storage via 
> the HistoryFileManager. If we have a large number of existing finished jobs 
> then we could spend minutes in this startup phase without logging progress:
> 2016-03-14 10:56:01,444 INFO 
> org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file 
> system [hdfs://hadoopcdh.itnas01.ieee.org:8020]
> 2016-03-14 10:56:11,455 INFO 
> org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Initializing Existing 
> Jobs...
> 2016-03-14 12:01:36,926 INFO 
> org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage: CachedHistoryStorage 
> Init
> This makes it really difficult to assess if things are working correctly (it 
> looks hung). We can add logs to notify users of progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

