Hi all,

I’d like to start a discussion about refactoring the HBase block cache
subsystem into a more modular and pluggable architecture.

*Motivation*

The current block cache design (BlockCache, CombinedBlockCache,
BucketCache) mixes several concerns:

1. storage implementation
2. L1/L2 topology and orchestration
3. placement logic and admission behavior


This makes it difficult to:

- introduce alternative cache implementations
- evolve cache policies independently
- experiment with different topologies

In addition, at large scale the current implementation can incur noticeable
metadata overhead. For example, with 64KB blocks and a ~1.6TB cache,
BucketCache may consume on the order of 9GB of metadata, reducing
effective cache capacity.
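
As a rough sanity check on those figures, the sketch below works out the
per-block cost they imply. It assumes binary units (1.6 TiB, 64 KiB, 9 GiB),
which the numbers above do not specify, and the class and method names are
illustrative only.

```java
// Back-of-the-envelope estimate; not derived from BucketCache internals.
final class MetadataOverheadEstimate {
    /** Number of whole blocks that fit in a cache of the given size. */
    static long blockCount(long cacheBytes, long blockBytes) {
        return cacheBytes / blockBytes;
    }

    /** Average metadata bytes attributed to each cached block. */
    static long bytesPerBlock(long metadataBytes, long blocks) {
        return metadataBytes / blocks;
    }

    public static void main(String[] args) {
        long cacheBytes = 16L * (1L << 40) / 10; // ~1.6 TiB (assumed binary units)
        long blockBytes = 64L * 1024;            // 64 KiB
        long metadataBytes = 9L * (1L << 30);    // ~9 GiB

        long blocks = blockCount(cacheBytes, blockBytes);
        System.out.println("blocks: " + blocks);
        System.out.println("metadata bytes per block: "
            + bytesPerBlock(metadataBytes, blocks));
    }
}
```

Under these assumptions the figures correspond to about 26.8 million blocks
and roughly 360 bytes of metadata per cached block.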


*Proposal (high level)*
Introduce a layered internal architecture:

1. BlockCacheEngine – storage abstraction (Lru, Bucket, etc.)
2. CacheTopology – L1/L2 coordination (exclusive/inclusive)
3. CachePlacementPolicy – admission, placement, promotion decisions
4. CacheAccessService – unified entry point for read/write paths
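
To make the layering concrete, here is a minimal sketch of how the four
pieces could fit together. Only the four type names come from the list above.
Every method signature, the String/byte[] stand-ins for the real key and
block types, and the MapEngine and ExclusiveTopology helpers are assumptions
made for illustration, not the proposed API and not how CombinedBlockCache
behaves today.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class LayeredCacheSketch {
    /** Storage only: no knowledge of tiers or policy. Signatures are hypothetical. */
    interface BlockCacheEngine {
        Optional<byte[]> get(String key);
        void put(String key, byte[] block);
        boolean evict(String key);
    }

    /** Decides whether a block may enter the cache at all. */
    interface CachePlacementPolicy {
        boolean admit(String key, byte[] block);
    }

    /** Coordinates the L1 and L2 engines. */
    interface CacheTopology {
        Optional<byte[]> get(String key);
        void put(String key, byte[] block);
    }

    /** Single entry point used by the read and write paths. */
    static final class CacheAccessService {
        private final CacheTopology topology;
        private final CachePlacementPolicy policy;

        CacheAccessService(CacheTopology topology, CachePlacementPolicy policy) {
            this.topology = topology;
            this.policy = policy;
        }

        Optional<byte[]> getBlock(String key) {
            return topology.get(key);
        }

        /** Returns true if the block was admitted and stored. */
        boolean cacheBlock(String key, byte[] block) {
            if (!policy.admit(key, block)) {
                return false;
            }
            topology.put(key, block);
            return true;
        }
    }

    /** Toy engine backed by a HashMap: no eviction policy, not thread-safe. */
    static final class MapEngine implements BlockCacheEngine {
        private final Map<String, byte[]> blocks = new HashMap<>();

        @Override public Optional<byte[]> get(String key) {
            return Optional.ofNullable(blocks.get(key));
        }
        @Override public void put(String key, byte[] block) { blocks.put(key, block); }
        @Override public boolean evict(String key) { return blocks.remove(key) != null; }
    }

    /** Toy exclusive topology: a block lives in at most one tier at a time. */
    static final class ExclusiveTopology implements CacheTopology {
        private final BlockCacheEngine l1;
        private final BlockCacheEngine l2;

        ExclusiveTopology(BlockCacheEngine l1, BlockCacheEngine l2) {
            this.l1 = l1;
            this.l2 = l2;
        }

        @Override public Optional<byte[]> get(String key) {
            Optional<byte[]> hit = l1.get(key);
            if (hit.isPresent()) {
                return hit;
            }
            hit = l2.get(key);
            if (hit.isPresent()) {
                l2.evict(key);          // promote: remove from L2 ...
                l1.put(key, hit.get()); // ... and place in L1
            }
            return hit;
        }

        @Override public void put(String key, byte[] block) {
            l2.evict(key); // keep the tiers exclusive
            l1.put(key, block);
        }
    }
}
```

In this shape the engine never sees policy or tiering, so swapping the
storage implementation or the topology would not require touching callers of
CacheAccessService.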

One important addition is explicit admission control on the cache insertion
path (put), allowing better handling of:

- scan-once workloads
- compaction-generated blocks
- prefetch behavior
- block-type-aware caching (data vs index vs bloom)
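
A minimal sketch of what admission control on the put path might look like
follows. The enums, the PutContext type, and the rejectDataFrom rule are all
hypothetical and chosen only to illustrate the cases listed above; they are
not proposed defaults. ADMIT_ALL stands in for a policy that admits every
block, which is the natural starting point if behavior is to stay unchanged.

```java
import java.util.EnumSet;

final class AdmissionSketch {
    /** Block categories named in the list above. */
    enum BlockKind { DATA, INDEX, BLOOM }

    /** Hypothetical tag describing what triggered the cache insertion. */
    enum Origin { POINT_READ, SCAN, COMPACTION, PREFETCH }

    /** Hypothetical context handed to the policy on every put. */
    static final class PutContext {
        final BlockKind kind;
        final Origin origin;

        PutContext(BlockKind kind, Origin origin) {
            this.kind = kind;
            this.origin = origin;
        }
    }

    /** Consulted before a block is inserted; false means "do not cache". */
    interface AdmissionPolicy {
        boolean admit(PutContext ctx);
    }

    /** Admits everything, i.e. no admission filtering at all. */
    static final AdmissionPolicy ADMIT_ALL = ctx -> true;

    /**
     * Example rule: always admit index and bloom blocks, but reject data
     * blocks whose insertion was triggered by one of the given origins.
     */
    static AdmissionPolicy rejectDataFrom(EnumSet<Origin> origins) {
        EnumSet<Origin> rejected = EnumSet.copyOf(origins); // defensive copy
        return ctx -> ctx.kind != BlockKind.DATA || !rejected.contains(ctx.origin);
    }
}
```

For example, rejectDataFrom(EnumSet.of(Origin.SCAN, Origin.COMPACTION)) would
keep scan-once and compaction-generated data blocks out of the cache while
still admitting index and bloom blocks from any origin.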

The goal is to keep behavior unchanged initially, and introduce this
structure incrementally.


*Implementation plan*
The work is organized under an umbrella JIRA:

HBASE-30018 <https://issues.apache.org/jira/browse/HBASE-30018> (Pluggable
Block Cache Architecture)

Planned phases:

1. Introduce internal APIs (no behavior change)
2. Refactor CombinedBlockCache into an explicit topology layer
3. Adapt BucketCache to the new interfaces
4. Enable alternative cache engines (e.g., CarrotCache, Ehcache)


*Questions / feedback*
I’d appreciate feedback on:

- overall direction and layering
- separation between topology and policy
- admission control on the put path
- compatibility concerns with existing implementations
- any known pitfalls in HFileReaderImpl / write path integration

If there is general agreement, I’ll start with a small initial patch
introducing the internal interfaces with no behavior change.


Thanks,
Vladimir
