[
https://issues.apache.org/jira/browse/KUDU-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725647#comment-17725647
]
ASF subversion and git services commented on KUDU-3413:
-------------------------------------------------------
Commit 195e1b51b141a78c20e5846273624b825c4d40fb in kudu's branch
refs/heads/master from kedeng
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=195e1b51b ]
KUDU-3413 [multi-tenancy] add some renames for multi-tenancy
For the convenience of introducing the multi-tenancy feature, I have
performed the following renaming operations in this patch:
* rename server_key to encryption_key;
* rename SetServerKey() to SetEncryptionKey();
* rename GenerateEncryptedServerKey(...) to GenerateEncryptionKey(...);
* rename DecryptServerKey(...) to DecryptEncryptionKey(...);
* rename DecryptKey(...) to DecryptEncryptionKey(...).
Because no logical changes were made, no unit tests were added in this patch.
Change-Id: Ic5bf51df9e7b9ebdaeb353049274cb18fa73a43a
Reviewed-on: http://gerrit.cloudera.org:8080/19899
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
Reviewed-by: Yingchun Lai <[email protected]>
> Kudu multi-tenancy
> ------------------
>
> Key: KUDU-3413
> URL: https://issues.apache.org/jira/browse/KUDU-3413
> Project: Kudu
> Issue Type: New Feature
> Reporter: dengke
> Assignee: dengke
> Priority: Major
> Attachments: data_and_metadata.png, kudu table topology.png,
> metadata_record.png, new_fs_manager.png, tablet_rowsets.png,
> zonekey_update.png
>
>
> h1. 1. Definition
> * Tenant: A cluster user can be called a tenant. Tenants may be divided by
> project or by actual application. Each tenant is equivalent to a resource
> pool, and all users under a tenant share all resources of that pool.
> Multiple tenants share the cluster's resources.
> * User: A user of cluster resources.
> * Multi-tenancy: Tenants cannot access each other at the database level,
> and their resources are private and independent. (Note: Kudu does not have
> the concept of a database; here it can simply be understood as multiple
> tables.)
> h1. 2. Current situation
> The latest version of Kudu implements 'data at rest encryption':
> mainly cluster-level authentication and encryption, plus data storage
> encryption at the level of a single server. This meets the needs of basic
> encryption scenarios, but there is still a gap between it and the
> tenant-level encryption we are pursuing.
> h1. 3. Outline design
> In general, tenant-level encryption differs from cluster-level
> encryption in the following ways:
> * Tenant-level encryption requires data storage isolation, which means data
> of different tenants needs to be separated (a new namespace layer may be
> added to the storage topology, so that data of the same tenant is stored
> under the same namespace path, with minimal mutual impact);
> * The generation and use of tenants' keys. In a multi-tenant scenario, we
> need to replace the cluster key with a per-tenant key.
> h1. 4. Design
> h2. 4.1 Namespace
> In the storage industry, a namespace is mainly used to maintain the
> file attributes, directory tree structure, and other metadata of a file
> system, and is compatible with POSIX directory-tree and file operations. It
> is a core concept in file storage. Taking the common HDFS as an example,
> its namespace-based isolation is mainly implemented by "allowing logical
> partitioning of the disk, attaching partition files to different
> directories, and finally modifying the directory owners' permissions" to
> achieve resource isolation.
> Corresponding to the Kudu system, the current storage topology is
> relatively mature: a Kudu client's read/write requests must be processed by
> a tserver before the corresponding data can be obtained. The request does
> not involve direct manipulation of raw data; that is, the client does not
> perceive the data distribution in the storage engine at all, so there is a
> natural degree of data isolation.
> However, the data in the storage engine is intertwined, and in some
> extreme cases interaction is still possible. The most thorough solution
> would be to completely separate the read/write, compaction, and other
> processing paths of different tenants, but that requires many changes and
> may destabilize the system. Instead, we can make minimal per-tenant changes
> to achieve physical isolation of the data.
> First, we need to analyze the current storage topology: a table in
> Kudu is divided into multiple tablet partitions. Each tablet includes
> metadata (meta information) and several RowSets. A RowSet contains a
> 'MemRowSet' (the data in memory) and multiple 'DiskRowSets' (the data on
> disk). A 'DiskRowSet' contains a 'BloomFile', an 'Ad_hoc Index',
> 'BaseData', a 'DeltaMem', and several 'RedoFiles' and 'UndoFiles'
> (generally, there is only one 'UndoFile'). For more specific distribution
> information, please refer to the following figure.
> !kudu table topology.png|width=1282,height=721!
> *The simplest way to achieve physical isolation is to set different
> storage paths for the data of different tenants.* Currently, we only need
> to consider the physical isolation of 'DiskRowSet'.
> Kudu writes to disk through containers. Each container can write a
> large continuous disk space, used to write the data of a CFile (the actual
> storage form of a 'DiskRowSet'). When one CFile has been written, the
> container is returned to the 'BlockManager', after which it can be used to
> write the next CFile. When no container is available in the BlockManager, a
> new container is created for the new CFile. Each container consists of a
> *.metadata file and a *.data file. Each DiskRowSet has several blocks, and
> all the blocks corresponding to a DiskRowSet are distributed across
> multiple containers; a container may also contain data from multiple
> DiskRowSets.
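> The container lifecycle described above (acquire an idle container,
> write one CFile, return the container, create a new one only when none is
> idle) can be sketched as follows. This is a minimal illustration;
> 'Container' and 'SimpleBlockManager' are hypothetical names, not Kudu's
> actual classes:
> {code:c++}
> #include <cassert>
> #include <deque>
> #include <memory>
> #include <string>
> #include <vector>
>
> // Hypothetical sketch of the container-reuse scheme: a container writes
> // one CFile at a time; once the CFile is finished, the container goes
> // back to the manager and can serve the next CFile.
> struct Container {
>   int id;
>   std::vector<std::string> cfiles_written;  // CFiles merged into *.data
> };
>
> class SimpleBlockManager {
>  public:
>   // Hand out an idle container, or create a new one if none is available.
>   std::shared_ptr<Container> AcquireContainer() {
>     if (!idle_.empty()) {
>       auto c = idle_.front();
>       idle_.pop_front();
>       return c;
>     }
>     auto c = std::make_shared<Container>();
>     c->id = next_id_++;
>     all_.push_back(c);
>     return c;
>   }
>
>   // Called once a CFile has been fully written to the container.
>   void ReleaseContainer(std::shared_ptr<Container> c) {
>     idle_.push_back(std::move(c));
>   }
>
>   size_t num_containers() const { return all_.size(); }
>
>  private:
>   int next_id_ = 0;
>   std::deque<std::shared_ptr<Container>> idle_;
>   std::vector<std::shared_ptr<Container>> all_;
> };
> {code}
> Note how a container released after one CFile is handed out again for the
> next one, which is why a single *.data file ends up holding CFiles from
> multiple DiskRowSets.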
> It can be simply understood that one DiskRowSet corresponds to one
> CFile file (in the single-column case; if there are multiple columns, it
> corresponds to multiple CFile files). The difference is that the DiskRowSet
> is our logical organization, while the CFile is our physical storage. For
> the six parts of a DiskRowSet (BloomFile, BaseData, UndoFile, RedoFile,
> DeltaMem, and AdhocIndex, as shown in the figure above), neither does one
> CFile correspond to a whole DiskRowSet, nor does one CFile contain all six
> parts. The six parts are stored independently, each part as a separate
> CFile. As shown in the figure below, in an actual production environment we
> can only find *.data and *.metadata files; no CFile file exists on disk.
> !data_and_metadata.png|width=1298,height=395!
> This is because a large number of CFiles are merged and written into
> a *.data file by the container; the *.data file is actually a collection of
> CFiles. The CFile corresponding to each part of a DiskRowSet, and the
> mapping relationships between them, are recorded in
> tablet-meta/<tablet_id>; in that file, the mapping relationships are saved
> separately per tablet_id.
> In the current storage topology, the *.metadata file corresponds to
> the metadata of the block (the final representation of a CFile in the fs
> layer) at the lowest fs level. It is not in the same dimension as concepts
> such as CFile and BlockManager; instead, it records information about the
> block. The figure below shows a record in *.metadata.
> !metadata_record.png!
> According to the above description, we can draw the approximate
> corresponding relationship as shown in the figure below:
> !tablet_rowsets.png|width=1315,height=695!
> Based on the above logic, we know that the *.data file is the actual
> storage location of tenant data. To achieve data isolation, *.data files
> must be isolated. To achieve this goal, we can create a different
> BlockManager for each tenant, each maintaining its own *.data files. *_In
> the default scenario (no tenant name is specified), the data uses a default
> block_manager. If multi-tenant encryption is enabled, fs_manager creates a
> new tenant_block_manager based on the tenant name, and the data of the
> specified tenant is stored in the tenant_block_manager corresponding to
> that tenant name, achieving physical isolation of the data._* The modified
> schematic diagram is as follows:
> !new_fs_manager.png|width=1306,height=552!
> We add the correspondence between the tenant and the block_manager
> in fs_manager and maintain it in memory. The tenant's information needs to
> be persisted; we can consider appending it to the existing metadata, or
> adding a new metadata file that is updated in real time.
> {code:java}
> message TenantMetadataPB {
>   message TenantMeta {
>     // The name of the tenant.
>     optional string tenant_name = 1;
>     // Encrypted tenant key used to encrypt/decrypt file keys for the tenant.
>     optional string tenant_key = 2;
>     // Initialization vector for the tenant key.
>     optional string tenant_key_iv = 3;
>   }
>   repeated TenantMeta tenant_meta = 1;
>   // Tenant key version.
>   optional string tenant_key_version = 2;
> }
> {code}
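> The tenant-to-block_manager correspondence maintained in fs_manager can
> be sketched in C++ as follows. This is an illustrative sketch only:
> 'FsManager', 'GetBlockManager', and the path layout are assumptions for
> the example, not the actual Kudu API:
> {code:c++}
> #include <cassert>
> #include <map>
> #include <memory>
> #include <string>
>
> // Hypothetical block manager: each instance owns its own data root,
> // i.e. its own set of *.data files.
> struct BlockManager {
>   explicit BlockManager(std::string root) : data_root(std::move(root)) {}
>   std::string data_root;
> };
>
> class FsManager {
>  public:
>   FsManager() : default_bm_(std::make_shared<BlockManager>("data/")) {}
>
>   // Resolve the block manager for a tenant. An empty tenant name falls
>   // back to the default block_manager; otherwise a tenant_block_manager
>   // with a tenant-specific data root is created lazily and cached.
>   std::shared_ptr<BlockManager> GetBlockManager(
>       const std::string& tenant_name) {
>     if (tenant_name.empty()) return default_bm_;  // default scenario
>     auto it = tenant_bms_.find(tenant_name);
>     if (it != tenant_bms_.end()) return it->second;
>     auto bm = std::make_shared<BlockManager>("data/" + tenant_name + "/");
>     tenant_bms_.emplace(tenant_name, bm);
>     return bm;
>   }
>
>  private:
>   std::shared_ptr<BlockManager> default_bm_;
>   std::map<std::string, std::shared_ptr<BlockManager>> tenant_bms_;
> };
> {code}
> Because each tenant resolves to a block manager with its own data root,
> the *.data files of different tenants never share a container.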
> h2. 4.2 Tenant Key
> There are currently two implementations of the key:
> * When static encryption is enabled, a server_key is randomly generated by
> default;
> * When the address and cluster name of the KMS are specified, we try to
> get the server_key from the KMS.
> The server_key is mainly used for encryption and decryption of
> sensitive files. We should support work modes like 'no encryption',
> 'default cluster static encryption', 'KMS cluster static encryption', and
> 'KMS multi-tenant encryption'. In the 'KMS multi-tenant encryption' mode, a
> new tenant name needs to be added. The tenant name is used to distinguish
> different tenants and to obtain the corresponding key. If the tenant name
> is not set, this corresponds to the 'default cluster static encryption'
> mode, which means sharing the randomly generated server_key by default.
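> The four work modes above can be expressed as a simple selection
> function. This is an illustrative sketch; the enum and function names are
> assumptions for the example, not actual Kudu flags:
> {code:c++}
> #include <cassert>
> #include <string>
>
> // The four work modes described in the text.
> enum class EncryptionMode {
>   kNoEncryption,
>   kDefaultClusterStaticEncryption,  // randomly generated server_key
>   kKmsClusterStaticEncryption,      // server_key fetched from the KMS
>   kKmsMultiTenantEncryption,        // per-tenant key fetched from the KMS
> };
>
> // Pick the mode from which settings are present: no encryption at all,
> // no KMS address (random server_key), KMS without a tenant name (cluster
> // static), or KMS plus a tenant name (multi-tenant).
> EncryptionMode SelectMode(bool encryption_enabled,
>                           const std::string& kms_url,
>                           const std::string& tenant_name) {
>   if (!encryption_enabled) return EncryptionMode::kNoEncryption;
>   if (kms_url.empty()) return EncryptionMode::kDefaultClusterStaticEncryption;
>   if (tenant_name.empty()) return EncryptionMode::kKmsClusterStaticEncryption;
>   return EncryptionMode::kKmsMultiTenantEncryption;
> }
> {code}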
> In the previous cluster encryption scenario, the kms_client obtains
> the cluster's zonekey information. However, the Ranger system only has
> zonekey information and no tenant information, so we need to maintain the
> correspondence between tenant names and zonekeys. To do this, we add a
> configuration file (perhaps in JSON format) to record the mapping between
> tenant names and zonekeys. Every time a tenant name changes, we first add a
> zonekey in Ranger, then update the corresponding item in the configuration
> file, and finally use the new tenant name when creating tables.
> {code:c++}
> class RangerKMSClient {
>  public:
>   explicit RangerKMSClient(std::string kms_url)
>       : kms_url_(std::move(kms_url)) {}
>
>   Status DecryptKey(const std::string& tenant_name,
>                     const std::string& encrypted_key,
>                     const std::string& iv,
>                     const std::string& key_version,
>                     std::string* decrypted_key);
>
>   Status GenerateEncryptedServerKey(const std::string& tenant_name,
>                                     std::string* encrypted_key,
>                                     std::string* iv,
>                                     std::string* key_version);
>
>  private:
>   std::string kms_url_;
> };
>
> class DefaultKeyProvider : public KeyProvider {
>  public:
>   ~DefaultKeyProvider() override {}
>
>   Status DecryptServerKey(const std::string& encrypted_server_key,
>                           const std::string& /*iv*/,
>                           const std::string& /*key_version*/,
>                           std::string* server_key);
>
>   Status GenerateEncryptedServerKey(std::string* server_key,
>                                     std::string* iv,
>                                     std::string* key_version);
> };
> {code}
> The encryption and decryption APIs of the KMS client need the tenant
> name to be passed in, and the correspondence between tenant names and
> zonekeys is maintained in memory. On each use, we first search in memory;
> if that lookup fails, we search the configuration file and update the
> in-memory data at the same time; if that fails too, we return an error.
> Otherwise, we use the zonekey found to obtain the key.
> !zonekey_update.png|width=1273,height=754!
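> The lookup order above (memory first, then the configuration file, then
> fail) can be sketched as follows; 'ZoneKeyResolver' and its methods are
> hypothetical names for illustration:
> {code:c++}
> #include <cassert>
> #include <map>
> #include <string>
>
> // Resolves a tenant name to its zonekey: in-memory cache first, then the
> // parsed configuration file (here modeled as a map), caching any hit.
> class ZoneKeyResolver {
>  public:
>   explicit ZoneKeyResolver(std::map<std::string, std::string> config_file)
>       : config_file_(std::move(config_file)) {}
>
>   // Returns true and sets *zonekey on success; false if the tenant is
>   // unknown to both the cache and the configuration file.
>   bool Resolve(const std::string& tenant_name, std::string* zonekey) {
>     auto it = cache_.find(tenant_name);
>     if (it != cache_.end()) {  // hit in memory
>       *zonekey = it->second;
>       return true;
>     }
>     auto cf = config_file_.find(tenant_name);
>     if (cf == config_file_.end()) return false;  // fail: not configured
>     cache_.emplace(tenant_name, cf->second);     // update the memory data
>     *zonekey = cf->second;
>     return true;
>   }
>
>   size_t cache_size() const { return cache_.size(); }
>
>  private:
>   std::map<std::string, std::string> cache_;        // in-memory map
>   std::map<std::string, std::string> config_file_;  // parsed JSON config
> };
> {code}
> The resolved zonekey is then used with the KMS client to obtain the
> tenant key, as described above.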
> h1. 5. Follow-up work
> * Add a tenant name parameter;
> * Add a parameter to control the multi-tenant encryption mode;
> * Modify the use of block_manager to adapt to multi-tenant scenarios;
> * Modify the key acquisition;
> * Add multi-tenant key acquisition and sensitive data encryption;
> * Modify the key acquisition and sensitive data encryption behavior of the
> default scenario (no tenant is specified).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)