Coming up is an enhancement that would make MSLAB even better: HBASE-8163, "MemStoreChunkPool: An improvement for JAVA GC when using MSLAB".
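As a rough illustration of the knobs involved, the sketch below sets the MSLAB properties (and the chunk-pool fraction proposed in HBASE-8163) programmatically. This is only to make the property names visible; in practice they belong in hbase-site.xml on the region servers, and the chunk-pool property name is an assumption based on the patch, so verify it against the version you actually run.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MslabSettings {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();

        // MSLAB is on by default since 0.92.0; shown here only to make the knob explicit.
        conf.setBoolean("hbase.hregion.memstore.mslab.enabled", true);

        // Size of each allocation chunk handed to a memstore (2 MB default).
        conf.setInt("hbase.hregion.memstore.mslab.chunksize", 2 * 1024 * 1024);

        // Allocations larger than this bypass MSLAB and go straight to the heap (256 KB default).
        conf.setInt("hbase.hregion.memstore.mslab.max.allocation", 256 * 1024);

        // Assumed knob from HBASE-8163: fraction of the global memstore size that may be
        // retained as reusable chunks instead of being freed and re-allocated, which is
        // what reduces old-generation churn. Property name taken from the patch.
        conf.setFloat("hbase.hregion.memstore.chunkpool.maxsize", 0.2f);

        System.out.println("MSLAB enabled: "
                + conf.getBoolean("hbase.hregion.memstore.mslab.enabled", false));
    }
}
```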
FYI

On Sat, Mar 23, 2013 at 5:31 PM, Pankaj Gupta <[email protected]> wrote:

> Thanks a lot for the explanation. It's good to know that MSLAB is stable and safe to enable (we don't have it enabled right now; we're using 0.92). This would allow us to allocate memory to HBase more freely. I really enjoyed the depth of explanation from both Enis and J-D. I was indeed mistakenly referring to the HFile as the HLog; fortunately you were still able to understand my question.
>
> Thanks,
> Pankaj

On Mar 21, 2013, at 1:28 PM, Enis Söztutar <[email protected]> wrote:

> I think the page cache is not totally useless, but as long as you can control the GC, you should prefer the block cache. Some of the reasons off the top of my head:
>
> - On a cache hit in the OS page cache, you have to go through the DataNode layer (an RPC if short-circuit reads are disabled), make a kernel jump, and read using libc read(), whereas reading a block from the block cache involves only the HBase process: no process switch and no kernel jump.
> - The read access path is optimized per HFile block. FS page boundaries and HFile block boundaries are not aligned at all.
> - There is very little control over what the page cache does or does not cache based on expected access patterns. For example, we can mark META region blocks, some column families, and HFile index blocks as always cached or cached with high priority. Also, for full table scans, we can explicitly disable block caching so as not to trash the current working set. With the OS page cache you do not have this control.
>
> Enis

On Wed, Mar 20, 2013 at 10:30 AM, Jean-Daniel Cryans <[email protected]> wrote:

> First, MSLAB has been enabled by default since 0.92.0, as it was deemed stable enough. So, unless you are on 0.90, you are already using it.
>
> Also, I'm not sure why you are referencing the HLog in your first paragraph in the context of reading from disk, because the HLogs are rarely read (only on recovery). Maybe you meant HFile?
>
> In any case, your email covers most arguments except for one: checksumming. Retrieving a block from HDFS, even when using short-circuit reads to go directly to the OS instead of passing through the DN, will take quite a bit more time than reading directly from the block cache. This is why, even if you disable block caching on a family, the index and root blocks will still be block cached, as reading those very hot blocks from disk would take way too long.
>
> Regarding your main question (how does the OS buffer help?), I don't have a good answer. It kind of depends on the amount of RAM you have and what your workload is like. As a data point, I've been successfully running with 24GB of heap (50% dedicated to the block cache) with a workload consisting mainly of small writes, short scans, and a typical random read distribution for a website. I can't remember the last time I saw a full GC, and it's been running like this for more than a year.
>
> Hope this somehow helps,
>
> J-D

On Wed, Mar 20, 2013 at 12:34 AM, Pankaj Gupta <[email protected]> wrote:

> Given that HBase has its own cache (block cache and bloom filters) and that all the table data is stored in HDFS, I'm wondering whether HBase benefits from the OS page cache at all. In the setup I'm using, the HBase region servers run on the same boxes as the HDFS DataNodes. In such a scenario, if the underlying HLog files live on the same machine, then having a healthy memory surplus may mean that the DataNode can serve the underlying files from the page cache and thus improve HBase performance. Is this really the case? (I guess the page cache should also help when the HLog file lives on a different machine, but in that case network I/O will probably drown out the speedup gained by not hitting the disk.)
>
> I'm asking because, if the page cache were useful, then not utilizing all the memory on the machine for the region server may not be that bad. The reason one would not want to give all the memory to the region server is the long garbage collection pauses that a large heap may induce. I understand that work has been done to fix the long pauses caused by memory fragmentation in the old generation under the mostly-concurrent garbage collector, by using a slab allocator for the memstore, but that feature is marked experimental and we're not ready to take risks yet. So if the page cache were useful in any way on region servers, we could give the region server process less memory with the understanding that the free memory on the machine is not completely going to waste. Hence my curiosity about the utility of the OS page cache to HBase performance.
>
> Thanks in advance,
> Pankaj
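Enis's point about per-family and per-scan control over the block cache maps directly onto the client and admin APIs of that era. The sketch below, written against the 0.92/0.94-style API, marks a hot family as in-memory, leaves block caching off for a rarely read family, and disables block caching for a one-off full scan; the table and family names are made up purely for illustration.

```java
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Scan;

public class BlockCacheControl {

    // Hypothetical table layout, purely for illustration.
    public static HTableDescriptor describeTable() {
        HTableDescriptor table = new HTableDescriptor("webdata");

        // Hot family: cache its blocks with in-memory (high) priority.
        HColumnDescriptor hot = new HColumnDescriptor("profile");
        hot.setBlockCacheEnabled(true);
        hot.setInMemory(true);

        // Cold family: don't let bulk reads of it churn the block cache.
        // Index and root blocks are still cached, as J-D notes above.
        HColumnDescriptor cold = new HColumnDescriptor("rawlogs");
        cold.setBlockCacheEnabled(false);

        table.addFamily(hot);
        table.addFamily(cold);
        return table;
    }

    // For a one-off full table scan, skip the block cache entirely so the
    // scan does not evict the current working set.
    public static Scan fullScan() {
        Scan scan = new Scan();
        scan.setCacheBlocks(false);
        return scan;
    }
}
```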
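To put J-D's data point (a 24 GB heap with half of it given to the block cache) into configuration terms: the block cache fraction is controlled by hfile.block.cache.size, while the heap itself is set via HBASE_HEAPSIZE in hbase-env.sh. The snippet below only illustrates the knob names; the 24 GB and 50% figures are J-D's example, not a recommendation.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class BlockCacheSizing {
    public static void main(String[] args) {
        // The region server heap is set outside this file, e.g. in hbase-env.sh:
        //   export HBASE_HEAPSIZE=24000   # megabytes, roughly the 24 GB in J-D's example
        Configuration conf = HBaseConfiguration.create();

        // Fraction of the region server heap reserved for the block cache
        // (0.5 mirrors the "50% dedicated to the block cache" data point).
        conf.setFloat("hfile.block.cache.size", 0.5f);

        System.out.println("Block cache fraction: "
                + conf.getFloat("hfile.block.cache.size", 0.25f));
    }
}
```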
