[GitHub] spark pull request: [SPARK-7336][HistoryServer] Fix bug that appli...

vanzin Wed, 06 May 2015 10:00:17 -0700

Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/5886#issuecomment-99537802
  
    Ok, I buy that scenario. But as I mentioned before, the current solution is 
not very good: it increases the memory usage of the history server (HS) too 
much. If you have `n` log files, the HS will now need on the order of `3 * n` 
memory during polling (instead of the current `2 * n`, which is not great, but 
we shouldn't make it worse).
    
    My suggestion: before doing a `listStatus`, create an empty file in the log 
directory and retrieve its mod time. Use that as `newLastModifiedTime`. That way 
you don't need to keep another map with every single available log file. The next 
time you poll, any modifications that happened during the `listStatus` call will 
be caught, since `lastModificationTime` is guaranteed to be earlier than anything 
that happened during the `listStatus`. (And `lastModificationTime` basically 
becomes `lastPollTime`.)
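    
    A minimal sketch of that idea, using Python and the local file system as a 
stand-in for the Hadoop `FileSystem` API (so `os.scandir` plays the role of 
`listStatus`). The names `poll_marker_time`, `check_for_logs` and the 
`.poll-marker-` prefix are made up for illustration; this is not the history 
server code.

```python
import os
import tempfile


def poll_marker_time(log_dir):
    """Create an empty marker file in log_dir and return its mod time (ns).

    The timestamp comes from the file system itself, so it is comparable
    with the mod times of the log files in the same directory.
    (Hypothetical helper, not part of Spark.)
    """
    fd, marker = tempfile.mkstemp(prefix=".poll-marker-", dir=log_dir)
    os.close(fd)
    try:
        return os.stat(marker).st_mtime_ns
    finally:
        # Remove the marker so it never shows up in the listing below.
        os.remove(marker)


def check_for_logs(log_dir, last_poll_time):
    """Return (names of logs modified since last_poll_time, new poll time)."""
    # Take the marker time BEFORE listing: anything modified while the
    # listing runs has a mod time >= this value and is seen on the next poll.
    new_last_poll_time = poll_marker_time(log_dir)
    changed = sorted(
        entry.name
        for entry in os.scandir(log_dir)
        if entry.is_file() and entry.stat().st_mtime_ns >= last_poll_time
    )
    return changed, new_last_poll_time
```

    The caller keeps only the single returned timestamp between polls, rather 
than a map with an entry per log file. The `>=` comparison may reprocess a file 
whose mod time equals the poll time, which errs on the side of not missing an 
update.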
    
    That may exacerbate the issue raised in SPARK-7189, though.
    
    >  I realize that some log file doesn't get processed until a bit later
    
    That's a good question. If the mod time changes, then the file is still 
being written to, and its mod time will eventually change again. But I guess 
the same race can occur if the file is closed / renamed during an app shutdown 
while the HS is doing the listing?


