[ 
https://issues.apache.org/jira/browse/HADOOP-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122127#comment-13122127
 ] 

Nathan Roberts commented on HADOOP-7714:
----------------------------------------

Todd, you may already be well aware of this, but just in case...

Patterns like the one in the system-call trace at the end of this comment 
don't usually do what one would expect, especially if the data has to go over 
a wire. I believe this is because of how the socket buffers inside the kernel 
keep track of the data that needs to be sent: each entry is basically just a 
reference to the page cache page. Therefore, if the data has not actually left 
the box when fadvise is called, the references are still there, so the pages 
cannot be invalidated. I tried this with a small native app and a 128MB file, 
and sure enough everything except for the first few pages stayed in the page 
cache. 
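
For anyone who wants to repeat that kind of check, here is a minimal sketch of 
one way to count how many pages of a file are resident in the page cache, using 
mmap plus mincore on Linux. It is not the app mentioned above; the function 
name count_resident_pages is made up for illustration.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns how many pages in the first `len` bytes of
 * `fd` are currently resident in the page cache, or -1 on error. */
static long count_resident_pages(int fd, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;
    size_t npages = (len + (size_t)page - 1) / (size_t)page;

    /* Mapping the file does not fault pages in; it only gives mincore an
     * address range to inspect. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    unsigned char *vec = malloc(npages);
    long resident = -1;
    if (vec != NULL && mincore(map, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* low bit set means page is resident */
    }

    free(vec);
    munmap(map, len);
    return resident;
}
```

Calling this before and after the fadvise should show whether the 
POSIX_FADV_DONTNEED request actually evicted anything.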

I can't immediately think of a surefire way around this. We could call fadvise 
once at close and live with the fact that everything still buffered at that 
time won't be affected. We could do what Cristina was doing and always call 
fadvise with an offset of 0, so that we try to invalidate pages multiple times. 
Or we could call fadvise asynchronously after a second or so; delaying a bit 
might help us deal with hot blocks better as well. 

sendfile(4, 5, [131072000], 65536)      = 65536
sendfile(4, 5, [131137536], 65536)      = 65536
sendfile(4, 5, [131203072], 65536)      = 65536
sendfile(4, 5, [131268608], 65536)      = 65536
sendfile(4, 5, [131334144], 65536)      = 65536
sendfile(4, 5, [131399680], 65536)      = 65536
sendfile(4, 5, [131465216], 65536)      = 65536
sendfile(4, 5, [131530752], 65536)      = 65536
sendfile(4, 5, [131596288], 65536)      = 65536
sendfile(4, 5, [131661824], 65536)      = 65536
sendfile(4, 5, [131727360], 65536)      = 65536
sendfile(4, 5, [131792896], 65536)      = 65536
sendfile(4, 5, [131858432], 65536)      = 65536
sendfile(4, 5, [131923968], 65536)      = 65536
sendfile(4, 5, [131989504], 65536)      = 65536
sendfile(4, 5, [132055040], 65536)      = 65536
fadvise64(5, 131072000, 1048576, POSIX_FADV_DONTNEED) = 0
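
As a rough sketch of the "always use offset 0" variant of the pattern traced 
above (Linux-only; the helper name send_and_drop and its window parameter are 
made up for illustration, and this is not the code in the attached patch): 
after every window's worth of sendfile calls it asks the kernel to drop 
everything from the start of the file up to the current position, so pages 
that were still referenced by socket buffers during an earlier call get 
another chance later.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends the first `len` bytes of in_fd to out_fd in
 * `chunk`-sized sendfile calls. After each `window` bytes it issues
 * POSIX_FADV_DONTNEED for [0, current offset) rather than only for the
 * window just sent. Returns bytes sent, or -1 on a sendfile error.
 * EINTR/EAGAIN handling is omitted to keep the sketch short. */
static off_t send_and_drop(int out_fd, int in_fd, off_t len,
                           size_t chunk, off_t window)
{
    assert(chunk > 0 && window > 0);
    off_t off = 0;          /* sendfile advances this for us */
    off_t last_advise = 0;  /* offset at which we last called fadvise */

    while (off < len) {
        off_t left = len - off;
        size_t want = left < (off_t)chunk ? (size_t)left : chunk;
        ssize_t n = sendfile(out_fd, in_fd, &off, want);
        if (n < 0)
            return -1;
        if (n == 0)
            break;          /* hit end of file early */

        if (off - last_advise >= window) {
            /* Advisory only: pages still pinned by in-flight data may
             * survive this call and be retried on the next one. */
            (void)posix_fadvise(in_fd, 0, off, POSIX_FADV_DONTNEED);
            last_advise = off;
        }
    }

    /* Final pass, roughly the "call fadvise once at close" option. */
    if (off > 0)
        (void)posix_fadvise(in_fd, 0, off, POSIX_FADV_DONTNEED);
    return off;
}
```

With chunk = 65536 and window = 1048576 this issues the same 16 sendfile calls 
per fadvise as the trace, except that each fadvise starts at offset 0. 
Anything still in flight at the final call can stay cached, so this only 
narrows the problem rather than solving it.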

                
> Add support in native libs for OS buffer cache management
> ---------------------------------------------------------
>
>                 Key: HADOOP-7714
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7714
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: native
>    Affects Versions: 0.24.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-7714-20s-prelim.txt
>
>
> Especially in shared HBase/MR situations, management of the OS buffer cache 
> is important. Currently, running a big MR job will evict all of HBase's hot 
> data from cache, causing HBase performance to really suffer. However, caching 
> of the MR input/output is rarely useful, since the datasets tend to be larger 
> than cache and not re-read often enough that the cache is used. Having access 
> to the native calls {{posix_fadvise}} and {{sync_file_range}} on platforms 
> where they are supported would allow us to do a better job of managing this 
> cache.
