Hari Krishna Dara created HBASE-30222:
-----------------------------------------
Summary: DefaultCipherProvider re-parses HBase configuration
(hbase-default.xml/hbase-site.xml) on every instantiation
Key: HBASE-30222
URL: https://issues.apache.org/jira/browse/HBASE-30222
Project: HBase
Issue Type: Bug
Components: encryption
Affects Versions: 2.6.6, 2.5.15, 3.0.0-beta-1, 4.0.0-alpha-1
Reporter: Hari Krishna Dara
Assignee: Hari Krishna Dara
h3. Summary
{{DefaultCipherProvider}} initializes its {{conf}} field inline:
// DefaultCipherProvider.java
private Configuration conf = HBaseConfiguration.create();
Every time the provider is constructed, this parses hbase-default.xml and
hbase-site.xml off the classpath to build a fresh Configuration. That
Configuration is then immediately discarded: providers are constructed via
{{Encryption.getCipherProvider(conf)}} ->
{{ReflectionUtils.newInstance(class, conf)}},
which invokes the no-arg constructor (running the field initializer) and
*then*
calls {{setConf(conf)}} -- CipherProvider extends Configurable -- overwriting
the
field with the caller's conf. So the default-config parse performed in the
constructor is pure waste on every call.
{{Encryption.getCipherProvider(conf)}} constructs a new provider instance on
every
invocation (it does not reuse the {{getInstance()}} singleton), so the
wasteful
parse happens on every cipher resolution.
h3. Impact
Each call to {{Encryption.getCipher(conf, name)}} /
{{Encryption.getCipherProvider(conf)}} parses the full HBase configuration. A
JMH
microbenchmark of a single provider construction measured ~2.3 ms latency and
~1.1 MB of allocation, essentially all of it attributable to
{{HBaseConfiguration.create()}} -- confirmed by control benchmarks isolating
the
config parse from cipher construction and the AES operations. (Microbenchmark
on
Apple M4 Max / JDK 17; the absolute numbers are environment-dependent, but the
attribution -- ~100% of the cost is the config parse -- is the point.)
For encryption at rest as it exists today the practical impact is small,
because
cipher resolution happens only at infrequent, amortized points:
- per WAL file: AbstractProtobufLogWriter, SecureProtobufLogReader
- per HFile / encryption-context creation:
SecurityUtil.createEncryptionContext
and related paths
These are once-per-file events amortized over many cells, so the cost is real
but
negligible in normal operation. However, the inefficiency is latent: any code
path
that resolves ciphers at higher frequency (e.g. per-key or per-row) would pay
a
full config parse each time, turning a harmless cost into a significant
latency/allocation problem. Fixing it removes a foot-gun for current and
future
callers and eliminates needless GC churn.
h3. Root cause
{{DefaultCipherProvider}} has no Configuration-accepting constructor, so
construction *always* calls {{HBaseConfiguration.create()}}, even though every
production caller overwrites that value via {{setConf()}} before use. There
is no
way to suppress the parse.
The {{getInstance()}} singleton is used only by tests (TestAES,
TestCommonsAES),
both of which already call {{setConf(conf)}} immediately afterward; it is
effectively a test-only accessor but is not annotated as such.
h3. Proposed fix
Remove the inline initializer so construction does not parse config:
private Configuration conf; // set via setConf() before use
This is safe for production: {{ReflectionUtils.newInstance()}} always calls
{{setConf()}} before the provider is used, and {{getConf()}} is first
dereferenced
later (in the AES/AES256GCM constructor), by which point conf is set.
Additionally:
- Annotate {{getInstance()}} as @InterfaceAudience.LimitedPrivate(UNITTEST)
(or
otherwise document it as test-only), since tests are its only callers and
they
already set conf explicitly.
- Optionally add a fail-fast guard in {{getConf()}}
(Preconditions.checkState(conf != null, ...)) so a caller that forgets
setConf() fails loudly instead of NPE-ing inside cipher construction --
preserving the "always non-null" property the field initializer
accidentally
provided.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)