[ 
https://issues.apache.org/jira/browse/HBASE-20430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439985#comment-16439985
 ] 

Andrew Purtell edited comment on HBASE-20430 at 4/16/18 8:42 PM:
-----------------------------------------------------------------

I suppose this could also be handled with an enhancement to Hadoop's S3A to 
multiplex open files among the internal resource pools it already maintains. 
[~steve_l]


was (Author: apurtell):
I suppose this could also be handled with an enhancement to Hadoop's S3A to 
multiplex open files among the internal resource pools it already maintains.
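
Purely to illustrate that suggestion, here is a minimal client-side sketch of
multiplexing many logical "open" files over a bounded number of real streams.
This is not how S3A is actually structured; the PooledOpen class, the pool
size of 128, and the open-per-read strategy are all assumptions made for the
example.

{code:java}
import java.io.IOException;
import java.util.concurrent.Semaphore;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical helper: many logical "open" files, few real streams. */
public class PooledOpen {
  // Illustrative bound on concurrently open real streams.
  private static final Semaphore OPEN_PERMITS = new Semaphore(128);

  private final FileSystem fs;
  private final Path path;

  public PooledOpen(FileSystem fs, Path path) {
    this.fs = fs;
    this.path = path;
  }

  /** Positioned read that holds an open-stream permit only while reading. */
  public int pread(long position, byte[] buf, int off, int len) throws IOException {
    try {
      OPEN_PERMITS.acquire();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted waiting for an open-stream permit", e);
    }
    try (FSDataInputStream in = fs.open(path)) {
      return in.read(position, buf, off, len);
    } finally {
      OPEN_PERMITS.release();
    }
  }
}
{code}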

> Improve store file management for non-HDFS filesystems
> ------------------------------------------------------
>
>                 Key: HBASE-20430
>                 URL: https://issues.apache.org/jira/browse/HBASE-20430
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Andrew Purtell
>            Priority: Major
>
> HBase keeps a file open for every active store file so no additional round 
> trips to the NameNode are needed after the initial open. HDFS internally 
> multiplexes open files, but the Hadoop S3 filesystem implementations do not, 
> or, at least, not as well. As the amount of data under management grows, the 
> number of concurrently open connections rises with it, and we expect it will 
> eventually exhaust a limit somewhere (the client, the OS file descriptor table 
> or open file limits, or the S3 service).
>
> Initially we can simply introduce an option to close every store file after 
> the reader has finished, and measure the performance impact (see the first 
> sketch below). Use cases backed by non-HDFS filesystems already have to cope 
> with a different read performance profile. Based on experiments with the 
> S3-backed Hadoop filesystems, notably S3A, even with aggressively tuned 
> options simple reads can be very slow on blockcache misses: 15-20 seconds 
> have been observed for a Get of a single small row, for example. We expect 
> extensive use of the BucketCache to mitigate this in such deployments 
> already. It could be backed by offheap storage, but more likely by a large 
> number of cache files managed by the file engine on local SSD storage. If 
> misses are already going to be that expensive, the motivation to do more than 
> simply open store files on demand is largely absent.
>
> Still, we could employ a predictive cache. Where frequent access to a given 
> store file (or at least its store) is predicted, keep a reference to the 
> store file open; otherwise, close the file after reading and open it again on 
> demand. We can keep statistics about read frequency, write them out to HFiles 
> during compaction, and consult them when opening the region, perhaps by 
> reading all meta blocks of the region's HFiles at open time. We need to be 
> careful not to use ARC or an equivalent as the cache replacement strategy, as 
> it is encumbered. The size of the cache can be determined at startup after 
> detecting the underlying filesystem, e.g. setCacheSize(VERY_LARGE_CONSTANT) 
> if (fs instanceof DistributedFileSystem), so we lose little when still on 
> HDFS (see the second sketch below).
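
A rough sketch of the "close every store file after the reader has finished"
option from the description above. The configuration key and the
CloseAfterReadReader class are hypothetical, not existing HBase code; the
point is only that the stream is opened on demand and optionally closed after
each read.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Simplified stand-in for a store file reader, for illustration only. */
public class CloseAfterReadReader {
  // Hypothetical option name; defaults to the current keep-open behavior.
  public static final String CLOSE_AFTER_READ_KEY = "hbase.storefile.close.after.read";

  private final FileSystem fs;
  private final Path path;
  private final boolean closeAfterRead;
  private FSDataInputStream in; // opened lazily, on demand

  public CloseAfterReadReader(Configuration conf, FileSystem fs, Path path) {
    this.fs = fs;
    this.path = path;
    this.closeAfterRead = conf.getBoolean(CLOSE_AFTER_READ_KEY, false);
  }

  public synchronized int pread(long pos, byte[] buf, int off, int len) throws IOException {
    if (in == null) {
      in = fs.open(path); // pay the extra round trip only after a close
    }
    try {
      return in.read(pos, buf, off, len);
    } finally {
      if (closeAfterRead) {
        in.close();
        in = null;
      }
    }
  }

  public synchronized void close() throws IOException {
    if (in != null) {
      in.close();
      in = null;
    }
  }
}
{code}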
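
And a minimal sketch of sizing an open-reader cache from the detected
filesystem, per the setCacheSize(VERY_LARGE_CONSTANT) example in the
description. The ReaderCache class and both size constants are made up for the
example; a plain LRU (LinkedHashMap in access order) stands in for ARC, which
the description notes is encumbered.

{code:java}
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

/** Hypothetical LRU cache of open store file readers, keyed by path. */
public class ReaderCache<R extends Closeable> {
  // Illustrative sizes: effectively unbounded on HDFS, bounded elsewhere.
  private static final int VERY_LARGE_CONSTANT = Integer.MAX_VALUE;
  private static final int NON_HDFS_CACHE_SIZE = 1024;

  private final Map<Path, R> readers;

  public ReaderCache(FileSystem fs) {
    // Keep (nearly) every reader open on HDFS; evict least recently used
    // readers on other filesystems such as S3A.
    final int maxSize = (fs instanceof DistributedFileSystem)
        ? VERY_LARGE_CONSTANT : NON_HDFS_CACHE_SIZE;
    this.readers = new LinkedHashMap<Path, R>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Path, R> eldest) {
        if (size() > maxSize) {
          try {
            eldest.getValue().close(); // close the evicted reader
          } catch (IOException e) {
            // a real implementation would log this and continue
          }
          return true;
        }
        return false;
      }
    };
  }

  public synchronized R get(Path path) {
    return readers.get(path);
  }

  public synchronized void put(Path path, R reader) {
    readers.put(path, reader);
  }
}
{code}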



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
