Forgot one thing ...

I'd like to share a design document regarding the Row Cache for HBase
2.x/3.x for (potential) further discussion:

https://docs.google.com/document/d/1Ag0cej2X0qNBb2HMVOJGiQdDRntqDW5n/edit?usp=sharing&ouid=107898024699489289958&rtpof=true&sd=true

Best regards,

Vladimir Rodionov

On Mon, Mar 16, 2026 at 11:37 AM Vladimir Rodionov
<[email protected]> wrote:
>
> Hi Xiao,
>
> Thank you very much for the detailed response and for sharing your
> experience. It is very interesting to hear that you implemented
> a RowCache-like mechanism internally and have been running it in
> production at large scale. CPU reduction under the same traffic is
> exactly the type of benefit
> I was hoping to see from logical row caching. From an architectural
> perspective, the RowCache design takes a somewhat different approach
> from HBASE-29585. The implementation is a coprocessor and therefore
> requires no changes to the HBase core code path, which allows it to be
> deployed and evaluated independently. It also uses a cache engine
> (which can easily be made pluggable) designed to store a very large
> number of small objects with minimal metadata overhead.
>
> One aspect I found important while experimenting with row-level
> caching is how the cache storage layer scales when it holds a very
> large number of small entries. In many HBase workloads rows are
> relatively small, so the cache may contain millions or even hundreds
> of millions of objects, and at that scale the per-entry metadata
> overhead becomes a significant factor. For this reason the RowCache
> implementation uses a cache engine optimized for compact storage of
> large numbers of small objects (it even builds common compression
> dictionaries and applies them to objects during compression, reducing
> the memory footprint even further). This was one of the motivations
> for not reusing BucketCache directly, since its metadata overhead was
> originally designed around block-sized entries rather than row-sized
> objects. It will be interesting to see how the different approaches
> behave as cache entry counts grow very large.
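The shared-dictionary idea mentioned above can be illustrated with a generic sketch. This uses zlib's preset-dictionary support as a stand-in; the actual Carrot Cache engine is not shown here, and the payloads are invented for illustration. When many small objects share structure, a common dictionary lets each one compress far better than it would alone:

```python
import zlib

# Many small, similar "row" payloads share structure; a common preset
# dictionary captures the shared substrings once instead of per object.
rows = [('{"user_id": %d, "country": "US", "status": "active"}' % i).encode()
        for i in range(1000)]

# Build a shared dictionary from a sample of payloads (a real cache
# engine would train this more carefully; zlib caps dictionaries at 32 KB).
zdict = b"".join(rows[:64])[-32768:]

def compress(data, d=None):
    c = zlib.compressobj(6, zdict=d) if d else zlib.compressobj(6)
    return c.compress(data) + c.flush()

no_dict = sum(len(compress(r)) for r in rows)
shared = sum(len(compress(r, zdict)) for r in rows)
print(f"without dict: {no_dict} B, with shared dict: {shared} B")

# Round-trip check: decompression needs the same dictionary.
d = zlib.decompressobj(zdict=zdict)
assert d.decompress(compress(rows[0], zdict)) == rows[0]
```

With tiny entries the per-object deflate overhead dominates, so the shared dictionary pays off most exactly where a row cache holds the most entries.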
>
> My understanding is that HBASE-29585 reuses BucketCache as the storage
> layer for row objects. While that approach has the advantage of
> integrating with existing infrastructure, it may face scalability
> challenges when the cache stores extremely large numbers of small
> entries, since BucketCache's metadata overhead becomes more
> significant at that granularity. It will be interesting to see how
> this behaves in large production deployments.
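A back-of-envelope calculation makes the granularity point concrete. The 64-byte per-entry metadata figure below is an assumption for illustration only, not a measured BucketCache number:

```python
# Fixed per-entry metadata matters little for 64 KB block-sized entries
# but dominates for small row-sized ones. META_BYTES is an illustrative
# assumption, not a measured figure from any implementation.
META_BYTES = 64

for entry_size in (64 * 1024, 4 * 1024, 512, 64):
    overhead = META_BYTES / (entry_size + META_BYTES)
    print(f"{entry_size:>6} B entry -> {overhead:5.1%} of cache memory is metadata")
```

Under these assumptions the metadata share goes from a rounding error at block granularity to half the cache at 64-byte rows, which is the scalability concern described above.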
>
> In any case, it is very encouraging to see that similar ideas are
> being explored and successfully used in production systems such as
> yours,
> OceanBase, and XEngine. Logical row caching appears to be a useful
> complement to the existing HBase caching layers.
>
> Thank you again for sharing your experience, and I look forward to
> continuing the discussion.
>
> Best regards,
> Vladimir
>
> On Sun, Mar 15, 2026 at 8:37 PM Xiao Liu <[email protected]> wrote:
> >
> > Thanks, Vladimir!
> >
> > In fact, by 2025, some of our production use cases, such as features
> > and tags, were using bulk load for offline data loading and multi-get
> > for real-time point queries to serve external users. In this scenario,
> > we found that cache utilization was not very efficient.
> >
> > We have also researched similar systems. As you mentioned, many of
> > them have implemented a RowCache. Other examples include OceanBase[1]
> > and XEngine[2], which is used by PolarDB.
> >
> > We have implemented a RowCache based on these references and deployed it in 
> > our production environment. In our benchmark tests, Get throughput improved 
> > under the same resource conditions. In actual production use, CPU 
> > utilization dropped significantly under the same traffic load. After 
> > running in production for nearly six months and handling tens of millions 
> > of requests per second, it has proven to be a viable solution and a 
> > valuable complement to HBase use cases.
> >
> > During practical implementation, we encountered several challenges, 
> > including:
> > 1. Ensuring data consistency in bulk load scenarios
> > 2. Maintaining cache consistency during automatic region balancing
> > 3. Performance under high filterRead loads
> > 4. Business scenarios where Get.addColumn is used to read data from a 
> > specific cell rather than an entire row
> > 5. Handling massive table data volumes with a 7-day expiration policy
> > ...
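Point 4 in the list above is worth expanding: if the cache unit is the whole row + column family, a column-restricted Get can still be answered by filtering the cached entry, at the cost of caching more than was asked for. A hypothetical sketch (the helper names and data are illustrative, not from either implementation):

```python
# Toy illustration of serving a column-restricted read from a cache
# whose unit is the whole row + column family.
cache = {}

def load_row(table, rowkey, cf):
    # Stand-in for the actual HBase read path: returns the full row+CF.
    return {"name": b"alice", "age": b"42", "city": b"nyc"}

def get(table, rowkey, cf, columns=None):
    key = (table, rowkey, cf)
    if key not in cache:
        cache[key] = load_row(table, rowkey, cf)  # always cache the full row+CF
    row = cache[key]
    if columns is None:
        return dict(row)
    # Serve Get.addColumn-style requests by filtering the cached entry.
    return {c: row[c] for c in columns if c in row}
```

The trade-off is memory (the whole CF is cached even for single-cell reads) versus hit rate (later reads of other columns in the same row become hits).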
> >
> > In summary, thank you very much for proposing and open-sourcing a solution; 
> > we will study it in depth. At the same time, as Duo mentioned, we are very 
> > pleased to see that HBASE-29585 in the community is also being actively 
> > advanced, and we can work together to drive the implementation of this 
> > feature in HBase.
> >
> > Best,
> > Xiao Liu
> >
> > [1] OceanBase: https://www.oceanbase.com/docs/common-oceanbase-database-cn-10000000001576547
> > [2] XEngine: https://www.alibabacloud.com/help/en/polardb/polardb-for-mysql/user-guide/x-engine-principle-analysis
> >
> >
> >
> > On 2026/01/04 21:02:28 Vladimir Rodionov wrote:
> > > Hello HBase community,
> > >
> > > I’d like to start a discussion around a feature that exists in related
> > > systems but is still missing in Apache HBase: row-level caching.
> > >
> > > Both *Cassandra* and *Google Bigtable* provide a row cache for hot rows.
> > > Bigtable recently revisited this area and reported measurable gains for
> > > single-row reads. HBase today relies almost entirely on *block cache*,
> > > which is excellent for scans and predictable access patterns, but can be
> > > inefficient for *small random reads*, *hot rows spanning multiple blocks*,
> > > and *cloud / object-store–backed deployments*.
> > >
> > > To explore this gap, I’ve been working on a *Row Cache for HBase 2.x*,
> > > implemented as a *RegionObserver coprocessor*, and I’d appreciate feedback
> > > from HBase developers and operators.
> > >
> > > *Project*:
> > >
> > > https://github.com/VladRodionov/hbase-row-cache
> > >
> > >
> > > *Background / motivation (cloud focus):*
> > >
> > > https://github.com/VladRodionov/hbase-row-cache/wiki/HBase:-Why-Block-Cache-Alone-Is-No-Longer-Enough-in-the-Cloud
> > >
> > > What This Is
> > >
> > >    - Row-level cache for HBase 2.x (coprocessor-based)
> > >    - Powered by *Carrot Cache* (mostly off-heap, GC-friendly)
> > >    - Multi-level cache (L1/L2/L3)
> > >    - Read-through caching of table : rowkey : column-family
> > >    - Cache invalidation on any mutation of the corresponding row+CF
> > >    - Designed for *read-mostly, random-access* workloads
> > >    - Can be enabled per table or per column family
> > >    - Typically used *instead of*, not alongside, block cache
> > >
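The read-through and invalidation semantics described above can be sketched in a few lines. This is a conceptual toy, not the coprocessor implementation; all names here are illustrative:

```python
class RowCache:
    """Toy read-through cache keyed by (table, rowkey, cf). Conceptual only;
    the real design is an HBase RegionObserver coprocessor."""

    def __init__(self):
        self._entries = {}
        self.hits = self.misses = 0

    def get(self, table, rowkey, cf, loader):
        key = (table, rowkey, cf)
        if key in self._entries:
            self.hits += 1
        else:
            self.misses += 1
            self._entries[key] = loader(table, rowkey, cf)  # read-through fill
        return self._entries[key]

    def invalidate(self, table, rowkey, cf):
        # Called on any mutation (put/delete/increment) of this row+CF.
        self._entries.pop((table, rowkey, cf), None)

# Usage sketch
cache = RowCache()
load = lambda t, r, cf: {"q": b"v1"}     # stand-in for the HBase read path
cache.get("t1", b"row1", "cf", load)     # miss -> loads and caches
cache.get("t1", b"row1", "cf", load)     # hit, served from cache
cache.invalidate("t1", b"row1", "cf")    # a Put to row1/cf evicts the entry
```

Invalidating on any mutation keeps the cache trivially consistent for read-mostly tables, at the cost of poor hit rates on highly mutable ones, which matches the workload targeting in the list above.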
> > > *Block Cache vs Row Cache (Conceptual)*
> > >
> > > Aspect                            | Block Cache                 | Row Cache
> > > ----------------------------------|-----------------------------|------------------------------------
> > > Cached unit                       | HFile block (e.g. 64KB)     | Row / column family
> > > Optimized for                     | Scans, sequential access    | Random small reads, hot rows
> > > Memory efficiency for small reads | Low (unused data in blocks) | High (cache only requested data)
> > > Rows spanning multiple blocks     | Multiple blocks cached      | Single cache entry
> > > Read-path CPU cost                | Decode & merge every read   | Amortized across hits
> > > Cloud / object store fit          | Necessary but expensive     | Reduces memory & I/O amplification
> > >
> > > Block cache remains essential; row cache targets a *different optimization
> > > point*.
> > >
> > > *Non-Goals (Important)*
> > >
> > >    - Not proposing removal or replacement of block cache
> > >    - Not suggesting this be merged into HBase core
> > >    - Not targeting scan-heavy or sequential workloads
> > >    - Not eliminating row reconstruction entirely
> > >    - Not optimized for write-heavy or highly mutable tables
> > >    - Not changing HBase storage or replication semantics
> > > This is an *optional optimization* for a specific class of workloads.
> > >
> > > *Why I’m Posting*
> > >
> > > This is *not a merge proposal*, but a request for discussion:
> > >
> > >    1. Do you see *row-level caching* as relevant for modern HBase deployments?
> > >    2. Are there workloads where block cache alone is insufficient today?
> > >    3. Is a coprocessor-based approach reasonable for experimentation?
> > >    4. Are there historical or architectural reasons why row cache never landed in HBase?
> > >
> > > Any feedback—positive or critical—is very welcome.
> > >
> > > Best regards,
> > >
> > > Vladimir Rodionov
> > >
