Bumping this thread, as I have addressed all the TBD items since the PR was
raised. Any input would be appreciated!

Thank you,
Hari

PS: I reformatted the original message quoted below for better readability.

On 2025/11/10 14:27:11 Hari Krishna Dara wrote:
> Dear HBase Developers,
> 
> I am pleased to announce that the key management feature for encryption at
> rest is now ready for community review. This is a significant enhancement
> to HBase's security capabilities, and I would greatly appreciate your
> feedback and insights.
> 
> Pull Request: https://github.com/apache/hbase/pull/7421
> Branch: HBASE-29368-key-management-feature
> Primary JIRA: HBASE-29368
> Design Document:
> https://docs.google.com/document/d/1ToW_rveXHXUc1F6eFNQfu5LOeMAjzgq6FcYUDbdZrSM/edit?tab=t.0
> 
> Overview
> This feature introduces a comprehensive key management system that
> extends HBase's existing encryption-at-rest capabilities. The
> implementation provides enterprise-grade key lifecycle management with
> support for key rotation, hierarchical namespace resolution for key lookup,
> key caching and improved integration with key management systems to handle
> key life cycles and external key changes.
> 
> Key Features
>
> 1. Managed Keys Infrastructure
> 
>    - Introduction of ManagedKeyProvider interface for pluggable key
>      provider implementations, along the lines of the existing KeyProvider
>      interface.
>    - The new interface can also return Data Encryption Keys (DEKs) and a
>      lot more details on the keys.
>    - Comes with the default ManagedKeyStoreKeyProvider implementation using
>      Java KeyStore, similar to the existing KeyStoreKeyProvider.
>    - Enables logical key isolation for multi-tenant scenarios through
>      custodian identifiers (future use cases) and the special default global
>      custodian.
>    - Hierarchical namespace resolution for DEKs with automatic fallback:
>      explicit CF namespace attribute → constructed table/family namespace →
>      table name → global namespace
> 
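A rough illustration of the fallback order listed above, not the code in the PR. The class name, the "table/family" format for the constructed namespace, and the "*" marker for the global namespace are all assumptions made for this sketch; a plain Map stands in for whatever store holds the keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these names come from the PR.
final class NamespaceFallbackSketch {
  // Assumed marker for the global namespace; the real constant may differ.
  static final String GLOBAL_NAMESPACE = "*";

  /** Builds the candidate namespaces in the fallback order described above. */
  static List<String> candidates(String cfNamespaceAttr, String table, String family) {
    List<String> out = new ArrayList<>();
    if (cfNamespaceAttr != null && !cfNamespaceAttr.isEmpty()) {
      out.add(cfNamespaceAttr);        // 1. explicit CF namespace attribute
    }
    out.add(table + "/" + family);     // 2. constructed table/family namespace (format assumed)
    out.add(table);                    // 3. table name
    out.add(GLOBAL_NAMESPACE);         // 4. global namespace
    return out;
  }

  /** Returns the first candidate namespace that has a key in the store, if any. */
  static Optional<String> resolve(Map<String, String> keysByNamespace,
      String cfNamespaceAttr, String table, String family) {
    for (String ns : candidates(cfNamespaceAttr, table, family)) {
      if (keysByNamespace.containsKey(ns)) {
        return Optional.of(ns);
      }
    }
    return Optional.empty();
  }
}
```

With keys registered only under a table name and the global namespace, a column family of that table resolves to the table-level key, and any other table falls through to the global one.
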
> 2. System Key (STK) Management
> 
>    - Cluster-wide system key for wrapping data encryption keys (DEKs). This
>      is equivalent to the existing master key, but better managed and
>      operation friendly.
>    - Secure storage in HDFS with support for automatic key rotation during
>      boot up.
>    - Admin API to trigger key rotation and propagation to all RegionServers
>      without needing to do a rolling restart.
>    - Preserves the current double-wrapping architecture: DEKs wrapped by
>      STK, STK sourced from external KMS
> 
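To make the "DEK wrapped by STK" idea concrete, here is a minimal sketch using only the JDK. It is an illustration under assumptions, not the PR's implementation: the class and method names are hypothetical, the JDK's "AESWrap" cipher is a stand-in for whatever wrapping HBase actually applies, and the STK is generated locally here rather than sourced from an external KMS.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical sketch: names and cipher choice are assumptions, not from the PR.
final class DekWrappingSketch {
  /** Generates a random AES key of the given size (stand-in for both STK and DEK). */
  static SecretKey newAesKey(int bits) {
    try {
      KeyGenerator gen = KeyGenerator.getInstance("AES");
      gen.init(bits);
      return gen.generateKey();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Wraps (encrypts) the DEK under the STK so only the wrapped form is stored. */
  static byte[] wrap(SecretKey stk, SecretKey dek) {
    try {
      Cipher cipher = Cipher.getInstance("AESWrap");
      cipher.init(Cipher.WRAP_MODE, stk);
      return cipher.wrap(dek);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Recovers the DEK from its wrapped form using the STK. */
  static SecretKey unwrap(SecretKey stk, byte[] wrappedDek) {
    try {
      Cipher cipher = Cipher.getInstance("AESWrap");
      cipher.init(Cipher.UNWRAP_MODE, stk);
      return (SecretKey) cipher.unwrap(wrappedDek, "AES", Cipher.SECRET_KEY);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

In this arrangement the plaintext DEK never needs to be persisted; only the wrapped bytes are, and the STK is required to recover it.
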
> 3. KeymetaAdmin API
> 
>    - enableKeyManagement(keyCust, keyNamespace) - Enable key management for
>      a custodian/namespace pair
>    - getManagedKeys(keyCust, keyNamespace) - Query key status and metadata
>    - rotateSTK() - Check for and propagate new system keys
>    - disableKeyManagement(keyCust, keyNamespace) - Disable all the keys for
>      a custodian/namespace (TBD)
>    - disableManagedKey(keyCust, keyNamespace, keyMetadataHash) - Disable a
>      specific key (TBD)
>    - rotateManagedKey(keyCust, keyNamespace) - Rotate the active key (TBD)
>    - refreshManagedKeys(keyCust, keyNamespace) - Refresh from external KMS
>      to validate all the keys. (TBD)
>    - Internal cache management operations for convenience and meeting SLAs.
>      (TBD)
> 
> 4. Persistent Key Metadata Storage
> 
>    - New system table hbase:keymeta for storing key metadata and state
>      which acts as an L2 cache.
>    - Tracks key lifecycle: ACTIVE, INACTIVE, DISABLED, FAILED states
>    - Stores wrapped DEKs and metadata for key lookup without depending on
>      external KMS.
>    - Optimized for high-priority access with in-memory column families
>    - Key metadata tracking with cryptographic hashes for integrity
>      verification
> 
> 5. Multi-Layer Caching
> 
>    - L1: In-memory Caffeine cache on RegionServers for hot key data
>    - L2: Keymeta table for persistent key metadata that is shared across
>      all RegionServers.
>    - L3: Dynamic lookup from external KMS as fallback when not found in L2.
>    - Cache invalidation mechanism for key rotation scenarios
> 
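The L1/L2/L3 lookup order above can be sketched as follows. This is a simplified illustration, not the PR's ManagedKeyDataCache: a ConcurrentHashMap stands in for the Caffeine cache, a plain Map for the keymeta table, and a Function for the external KMS. Writing an L3 result back into L2 and L1 is an assumption of this sketch, as are all names in it.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: stand-in types and names, not the PR's classes.
final class LayeredKeyLookupSketch {
  private final Map<String, String> l1 = new ConcurrentHashMap<>(); // in-memory cache stand-in
  private final Map<String, String> l2;                             // keymeta table stand-in
  private final Function<String, Optional<String>> l3;              // external KMS stand-in

  LayeredKeyLookupSketch(Map<String, String> l2, Function<String, Optional<String>> l3) {
    this.l2 = l2;
    this.l3 = l3;
  }

  /** Looks a key up in L1, then L2, then L3, filling the faster layers on the way back. */
  Optional<String> get(String keyId) {
    String value = l1.get(keyId);
    if (value != null) {
      return Optional.of(value);          // L1 hit
    }
    value = l2.get(keyId);
    if (value == null) {
      Optional<String> fromKms = l3.apply(keyId);
      if (!fromKms.isPresent()) {
        return Optional.empty();          // unknown in every layer
      }
      value = fromKms.get();
      l2.put(keyId, value);               // assumed write-back to the shared layer
    }
    l1.put(keyId, value);                 // populate the local cache
    return Optional.of(value);
  }

  /** Drops the L1 entry, e.g. after a rotation, so the next read goes to L2. */
  void invalidate(String keyId) {
    l1.remove(keyId);
  }
}
```

After the first lookup of a key, repeated reads are served from L1, and an invalidation only costs a read from L2 rather than another KMS call.
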
> 6. HBase Shell Integration
> 
>    - enable_key_management - Enable key management for a custodian and
>      namespace
>    - show_key_status - Display key status and metadata
>    - rotate_stk - Trigger system key rotation
>    - disable_key_management - Disable key management for a custodian and
>      namespace (TBD)
>    - disable_managed_key - Disable a specific key (TBD)
>    - rotate_managed_key - Rotate the active key (TBD)
>    - refresh_managed_keys - Refresh all keys for a custodian and namespace
>      (TBD)
> 
> Implementation Highlights
> 
>    - Backward Compatibility: Changes are fully compatible with existing
>      encryption-at-rest configuration
>    - Gradual, step-by-step migration: Well-defined migration path from
>      existing configuration to new configuration
>    - Performance: Minimal overhead through efficient caching and lazy key
>      loading
>    - Security: Cryptographic verification of key metadata, secure key
>      wrapping
>    - Operability: Administrative tools for key life cycle and cache
>      management
>    - Extensibility: Plugin architecture for custom key provider
>      implementations
>    - Testing: Comprehensive unit and integration test coverage
> 
> Architecture
> The implementation follows a layered architecture:
> 
>    1. Provider Layer: Pluggable ManagedKeyProvider for KMS integration
>    2. Management Layer: KeymetaAdmin API for administrative operations
>    3. Persistence Layer: KeymetaTableAccessor for metadata storage
>    4. Cache Layer: ManagedKeyDataCache and SystemKeyCache for performance
>    5. Service Layer: Coprocessor endpoints for client-server communication
> 
> Areas for Review
> I would particularly appreciate feedback on:
> 
>    1. API Design: Is the KeymetaAdmin API intuitive and complete for
>       common key management scenarios?
>    2. Security Model: Does the double-wrapping architecture (DEK wrapped
>       by STK, STK from KMS) provide appropriate security guarantees?
>    3. Performance: Are there potential bottlenecks in the caching
>       strategy or table access patterns?
>    4. Operational Aspects: Are the administrative commands sufficient for
>       the needs of operations and monitoring?
>    5. Testing Coverage: Are there additional test scenarios we should
>       cover?
>    6. Documentation: Is the design document clear? What additional
>       documentation would be helpful?
>    7. Compatibility: Any concerns about interaction with existing HBase
>       features?
> 
> Next Steps
> After incorporating community feedback, I plan to:
> 
>    1. Address any issues identified during review
>    2. Implement the work identified for future phases
>    3. Add additional documentation to the reference guide
> 
> How to Review
>
> This PR introduces changes across multiple modules. Rather
> than reviewing all 143 files, I recommend focusing on these core
> components first:
> 
> Core Architecture:
> 
>    1. Design document (linked above) - architectural overview
>    2. ManagedKeyProvider, KeymetaAdmin, ManagedKeyData interfaces
>       (hbase-common)
>    3. ManagedKeys.proto - protocol definitions
>    4. HMaster and misc. procedure changes - initialization of keymeta in a
>       predictable order
>    5. FixedFileTrailer + reader/writer changes - encode/decode additional
>       encryption key in store files
> 
> Key Implementation:
> 
>    1. KeymetaAdminImpl, KeymetaTableAccessor, ManagedKeyUtils,
>       SystemKeyManager, SystemKeyAccessor - admin operations and persistence
>    2. ManagedKeyDataCache, SystemKeyCache - caching layer
>    3. SecurityUtil - encryption context creation
> 
> Client & Shell:
> 
>    1. KeymetaAdminClient - client API
>    2. Shell commands and Ruby wrappers
> 
> Tests & Examples:
> 
>    1. TestKeymetaAdminImpl, TestManagedKeymeta - for usage patterns
>    2. key_provider_keymeta_migration_test.rb - E2E migration steps
> 
> Note: The remaining files contain secondary changes (API updates,
> test helpers, configuration constants, etc.) that can be reviewed later or
> skipped for initial feedback.
> 
> Please feel free to comment directly on the PR, or reply to this thread
> with questions, concerns, or suggestions.
> 
> Thank you for your time and expertise. Your feedback is invaluable in
> ensuring this feature meets the security and operational needs of HBase.
> 
> Best regards,
> Hari Krishna Dara
> 
