This is an automated email from the ASF dual-hosted git repository.
luzhijing pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new d8d08f36095 [docs](doris-cloud) Add storage vault and update deployment (#596)
d8d08f36095 is described below
commit d8d08f36095d2fd279df4e8571172bcc872a4549
Author: Gavin Chou <[email protected]>
AuthorDate: Sat May 4 08:55:19 2024 +0800
[docs](doris-cloud) Add storage vault and update deployment (#596)
---
.../install-fdb.md | 27 ++
.../storage-vault.md | 27 ++
.../deployment.md | 270 +++++++++---------
.../install-fdb.md | 304 +++++++++++++++++++++
.../meta-service-resource-http-api.md | 115 ++++----
.../storage-vault.md | 174 ++++++++++++
sidebars.json | 8 +-
7 files changed, 724 insertions(+), 201 deletions(-)
diff --git a/docs/separation-of-storage-and-compute/install-fdb.md b/docs/separation-of-storage-and-compute/install-fdb.md
new file mode 100644
index 00000000000..e15a31b3d12
--- /dev/null
+++ b/docs/separation-of-storage-and-compute/install-fdb.md
@@ -0,0 +1,27 @@
+---
+{
+ "title": "Install FDB",
+ "language": "en-US"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Coming soon
diff --git a/docs/separation-of-storage-and-compute/storage-vault.md b/docs/separation-of-storage-and-compute/storage-vault.md
new file mode 100644
index 00000000000..7d86117f12a
--- /dev/null
+++ b/docs/separation-of-storage-and-compute/storage-vault.md
@@ -0,0 +1,27 @@
+---
+{
+ "title": "Storage Backend - Storage Vault",
+ "language": "en-US"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Coming soon
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/deployment.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/deployment.md
index d5a315fa5f7..80bbeec87be 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/deployment.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/deployment.md
@@ -69,7 +69,7 @@ version:{doris_cloud-0.0.0-debug}
code_version:{commit=b9c1d057f07dd874ad32501ff
Recycler and Meta-service are different processes of the same program; startup arguments determine whether a process runs as Recycler or as Meta-service.
-These two processes depend on FDB; for FDB deployment see the [FDB installation section](#FDB安装)
+These two processes depend on FDB; for FDB deployment see the [FDB installation doc](../separation-of-storage-and-compute/install-fdb.md)
### Configuration
@@ -118,37 +118,113 @@ bin/stop.sh
In the separated storage-compute architecture, the node makeup of the whole warehouse is maintained by Meta-service (registration + changes). FE, BE and Meta-service interact for service discovery and authentication.
-Creating a separated cluster mainly means interacting with Meta-service over HTTP; [meta-service provides a standard http interface for resource management](../separation-of-storage-and-compute/meta_service_resource_http_api.md).
+Creating a separated cluster mainly means interacting with Meta-service over HTTP; [meta-service provides a standard http interface for resource management](../separation-of-storage-and-compute/meta-service-resource-http-api.md).
Creating a separated cluster (and its Clusters) is really just describing the machines that compose it; the steps below cover only the interactions needed for a minimal cluster.
-### Create a separated storage-compute cluster
+There are two main steps: 1. register a warehouse (FE); 2. register one or more compute clusters (BE)
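The two registration steps can be sketched as one small driver script. This is a hedged illustration only: the `META_SERVICE_ENDPOINT` variable, the `ms_call` wrapper and the dry-run flag are our own assumptions, not part of Doris.

```shell
# Illustrative driver for the two registration steps.
# DRY_RUN=1 only prints the curl commands it would run.
META_SERVICE_ENDPOINT="${META_SERVICE_ENDPOINT:-127.0.0.1:5000}"
DRY_RUN="${DRY_RUN:-1}"

ms_call() {  # $1 = meta-service API name, $2 = JSON payload
    cmd="curl -s '${META_SERVICE_ENDPOINT}/MetaService/http/$1?token=greedisgood9999' -d '$2'"
    if [ "$DRY_RUN" = "1" ]; then echo "$cmd"; else eval "$cmd"; fi
}

# Step 1: register the warehouse (instance) and its storage backend
ms_call create_instance '{"instance_id":"sample_instance","name":"sample","user_id":"u1"}'
# Step 2: register a compute cluster (BE group) under that instance
ms_call add_cluster '{"instance_id":"sample_instance"}'
```

The real payloads for both calls are shown in the sections that follow.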
-A separated cluster needs basic compute and storage resources, so creating the cluster means registering these resources with Meta-service.
+### Create the separated storage-compute cluster (FE)
-**1. Object info: fill in the Bucket in obj_info according to the Region the machines are actually in, and use a distinctive Prefix, e.g. include the business name**
+#### The cluster and its storage backend
-**2. Then add the FE machine info; usually one FE is enough. The info mainly includes:**
+The main purpose of this step is to register a separated-mode Doris warehouse with meta-service (one meta-service deployment can serve multiple distinct Doris warehouses, i.e. multiple FE-BE sets).
+It mainly describes which storage backend the warehouse uses ([Storage Vault](../separation-of-storage-and-compute/storage-vault.md)): S3 or HDFS.
-- The node's cloud_unique_id is a unique string, the unique ID and identity of each node; pick one as you like. It must equal the cloud_unique_id value in fe.conf.
+Call meta-service's create_instance interface. Main parameters:
+1. instance_id: id of the separated-mode Doris warehouse; must be a historically unique uuid, e.g. 6ADDF03D-4C71-4F43-9D84-5FC89B3514F8. **This document uses plain strings for simplicity.**
+2. name: warehouse name, fill in as needed
+3. user_id: user id, a string, fill in as needed
+4. vault: the HDFS or S3 storage backend info, e.g. HDFS properties, s3 bucket info, etc.
-- Fill in ip and edit_log_port per fe.conf; the FE cluster's cluster_name and cluster_id are fixed (RESERVED_CLUSTER_NAME_FOR_SQL_SERVER, RESERVED_CLUSTER_ID_FOR_SQL_SERVER) and must not be changed
+For more information see the [meta-service API create instance section](../separation-of-storage-and-compute/meta-service-resource-http-api.md).
+
+##### Create an HDFS-backed separated Doris
+
+Example
+
+```Shell
+curl -s "${META_SERVICE_ENDPOINT}/MetaService/http/create_instance?token=greedisgood9999" \
+ -d '{
+ "instance_id": "doris_master_asan_hdfs_multi_cluster_autoscale",
+ "name": "doris_master_asan_hdfs_multi_cluster_autoscale",
+ "user_id": "sample-user-id",
+ "vault": {
+ "hdfs_info" : {
+ "build_conf": {
+ "fs_name": "hdfs://172.21.0.44:4007",
+ "user": "hadoop",
+ "hdfs_kerberos_keytab": "/etc/emr.keytab",
+ "hdfs_kerberos_principal": "hadoop/172.30.0.178@EMR-D46OBYMH",
+ "hdfs_confs" : [
+ {
+ "key": "hadoop.security.authentication",
+ "value": "kerberos"
+ }
+ ]
+ },
+ "prefix": "doris_master_asan_hdfs_multi_cluster_autoscale-0404"
+ }
+ }
+}'
+```
+
+##### Create an S3-backed separated Doris
+
+Example (Tencent Cloud COS)
```Shell
-# create the separated storage-compute cluster
-# note: fill in the S3 info
-curl '127.0.0.1:5000/MetaService/http/create_instance?token=greedisgood9999' -d '{"instance_id":"cloud_instance0","name":"cloud_instance0","user_id":"user-id",
-"obj_info": {
- "ak": "${ak}",
- "sk": "${sk}",
- "bucket": "sample-bucket",
- "prefix": "${your_prefix}",
- "endpoint": "cos.ap-beijing.myqcloud.com",
- "external_endpoint": "cos.ap-beijing.myqcloud.com",
- "region": "ap-beijing",
- "provider": "COS"
-}}'
+curl -s "${META_SERVICE_ENDPOINT}/MetaService/http/create_instance?token=greedisgood9999" \
+ -d '{
+ "instance_id": "doris_master_asan_hdfs_multi_cluster_autoscale",
+ "name": "doris_master_asan_hdfs_multi_cluster_autoscale",
+ "user_id": "sample-user-id",
+ "vault": {
+ "obj_info": {
+ "ak": "${ak}",
+ "sk": "${sk}",
+ "bucket": "doris-build-1308700295",
+ "prefix": "${your_prefix}",
+ "endpoint": "cos.ap-beijing.myqcloud.com",
+ "external_endpoint": "cos.ap-beijing.myqcloud.com",
+ "region": "ap-beijing",
+ "provider": "COS"
+ }
+ }
+}'
+```
+
+After startup, run show storage vault on the FE; you will see built_in_storage_vault, and this vault's properties match the ones just passed in.
+
+```Shell
+mysql> show storage vault;
++------------------------+----------------+-------------------------------------------------------------------------------------------------+-----------+
+| StorageVaultName       | StorageVaultId | Propeties                                                                                       | IsDefault |
++------------------------+----------------+-------------------------------------------------------------------------------------------------+-----------+
+| built_in_storage_vault | 1              | build_conf { fs_name: "hdfs://127.0.0.1:8020" } prefix: "_1CF80628-16CF-0A46-54EE-2C4A54AB1519" | false     |
++------------------------+----------------+-------------------------------------------------------------------------------------------------+-----------+
+2 rows in set (0.00 sec)
+```
+
+**Note:**
+
+Storage Vault mode and non-Vault mode cannot both be created. If the user specifies both obj_info and vault, only a non-vault cluster is created. Vault mode must be selected by passing the vault info when creating the instance; otherwise the instance defaults to non-vault mode.
+Only Vault mode supports the corresponding vault statements.
+
+#### Add FE
+
+In separated mode, FEs are managed much like BEs, in groups, so they are also operated through interfaces such as add_cluster.
+
+Usually one FE is enough; additional FEs can be added the same way.
+
+cloud_unique_id is a unique string in the format `1:<instance_id>:<string>`; pick one as you like.
+Fill in ip and edit_log_port per the actual values in fe.conf.
+Note that the FE cluster's cluster_name and cluster_id are fixed, always
+"cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER"
+"cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER"
+
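Composing an id in the required `1:<instance_id>:<string>` format can be sketched with a one-line helper (the function name is ours, not a Doris tool):

```shell
# Compose a cloud_unique_id as 1:<instance_id>:<string>.
make_cloud_unique_id() {
    printf '1:%s:%s\n' "$1" "$2"
}

make_cloud_unique_id cloud_instance0 cloud_unique_id_sql_server00
# 1:cloud_instance0:cloud_unique_id_sql_server00
```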
+```Shell
# add FE
curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
"instance_id":"cloud_instance0",
@@ -158,7 +234,7 @@ curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
"cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER",
"nodes":[
{
- "cloud_unique_id":"cloud_unique_id_sql_server00",
+            "cloud_unique_id":"1:cloud_instance0:cloud_unique_id_sql_server00",
"ip":"172.21.16.21",
"edit_log_port":12103,
"node_type":"FE_MASTER"
@@ -170,25 +246,23 @@ curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
# after a successful creation, get it back to confirm
curl '127.0.0.1:5000/MetaService/http/get_cluster?token=greedisgood9999' -d '{
"instance_id":"cloud_instance0",
- "cloud_unique_id":"regression-cloud-unique-id-fe-1",
+ "cloud_unique_id":"1:cloud_instance0:regression-cloud-unique-id-fe-1",
"cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
"cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER"
}'
```
-### Create a Compute Cluster (BE Cluster)
-
-A compute cluster consists of multiple compute nodes and mainly carries the following key information:
+### Create a compute cluster (BE)
-1. Fill in the compute cluster's cluster_name and cluster_id as you prefer; they must be unique.
+Users can create one or more compute clusters; a compute cluster consists of any number of compute nodes.
-2. The node's cloud_unique_id is a unique string, the unique ID and identity of each node; pick one as you like. It must equal the cloud_unique_id value in be.conf.
+A compute cluster carries several key pieces of information:
-3. Fill in the IP per the actual situation; heartbeat_port is the BE's heartbeat port.
+1. cloud_unique_id is a unique string in the format `1:<instance_id>:<string>`; pick one as you like. It must equal the cloud_unique_id value in be.conf.
+2. Fill in cluster_name and cluster_id per your actual situation and preference
+3. Fill in ip per the actual situation; heartbeat_port is the BE's heartbeat port.
-Adjust the number of BE clusters and nodes to your needs; it is not fixed. Different clusters must use different cluster_name and cluster_id.
-
-Operate via [Meta-service's resource management API](../separation-of-storage-and-compute/meta_service_resource_http_api.md)
+Adjust the number of BE clusters and nodes to your needs; it is not fixed. Different clusters must use different cluster_name and cluster_id.
```Shell
# 172.19.0.11
@@ -201,7 +275,7 @@ curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
"cluster_id":"cluster_id0",
"nodes":[
{
- "cloud_unique_id":"cloud_unique_id_compute_node0",
+            "cloud_unique_id":"1:cloud_instance0:cloud_unique_id_compute_node0",
"ip":"172.21.16.21",
"heartbeat_port":9455
}
@@ -212,15 +286,23 @@ curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
# after a successful creation, get it back to confirm
curl '127.0.0.1:5000/MetaService/http/get_cluster?token=greedisgood9999' -d '{
"instance_id":"cloud_instance0",
- "cloud_unique_id":"regression-cloud-unique-id0",
+ "cloud_unique_id":"1:cloud_instance0:regression-cloud-unique-id0",
"cluster_name":"regression_test_cluster_name0",
"cluster_id":"regression_test_cluster_id0"
}'
```
-### FE/BE configuration
+### Compute cluster operations
+
+TBD
-FE/BE configuration has a few extra settings compared with Doris: the Meta-service address and cloud_unique_id (fill in the actual values from creating the separated cluster earlier)
+Add/remove nodes: FE BE
+
+Drop cluster
+
+### FE/BE configuration
+
+FE/BE configuration has a few extra settings compared with doris: the meta service address and cloud_unique_id (fill in the actual values from creating the separated cluster earlier)
fe.conf
@@ -228,23 +310,23 @@ fe.conf
# cloud HTTP data api port
cloud_http_port = 8904
meta_service_endpoint = 127.0.0.1:5000
-cloud_unique_id = cloud_unique_id_sql_server00
+cloud_unique_id = 1:cloud_instance0:cloud_unique_id_sql_server00
```
be.conf
```Shell
meta_service_endpoint = 127.0.0.1:5000
-cloud_unique_id = cloud_unique_id_compute_node0
+cloud_unique_id = 1:cloud_instance0:cloud_unique_id_compute_node0
meta_service_use_load_balancer = false
enable_file_cache = true
file_cache_path = [{"path":"/mnt/disk3/doris_cloud/file_cache","total_size":104857600,"query_limit":104857600}]
tmp_file_dirs = [{"path":"/mnt/disk3/doris_cloud/tmp","max_cache_bytes":104857600,"max_upload_bytes":104857600}]
```
-### Start/stop FE/BE
+### Start/stop FE/BE
-Starting and stopping FE/BE is the same as for Doris,
+Starting and stopping FE/BE is the same as for the integrated (non-separated) doris deployment,
```Shell
bin/start_be.sh --daemon
@@ -254,112 +336,8 @@ bin/stop_be.sh
bin/start_fe.sh --daemon
bin/stop_fe.sh
```
-:::caution Note
-**In Doris Cloud mode the FE discovers the corresponding BEs automatically; never use ALTER SYSTEM ADD or DROP BACKEND**
-:::
-
-## FDB installation
-
-Use a 7.1.x version
-
-### Install on ubuntu
-
-```Plain
-apt-get install foundationdb
-```
-
-Relevant default installation paths
-
-Config files
-
-/etc/foundationdb/fdb.cluster
-
-/etc/foundationdb/foundationdb.conf
-
-Log path (rotates automatically, but watch the usage of /)
-
-/var/log/foundationdb/
-
-### Install from rpm packages
-
-Install & usage references
-
-https://apple.github.io/foundationdb/getting-started-linux.html
-
-https://github.com/apple/foundationdb/tags
-
-### FDB notes
-
-By default FDB uses Memory as its storage engine, which only suits small data volumes. For stress testing or storing large amounts of data, switch FDB's storage engine to SSD (normally on SSD disks) as follows:
-
-1. Create the Data and Log directories and give the `foundationdb` user access to them:
-
- ```Shell
- $ chown -R foundationdb:foundationdb /mnt/disk1/foundationdb/data/ /mnt/disk1/foundationdb/log
- ```
-
-2. Change the datadir and logdir paths in `/etc/foundationdb/foundationdb.conf`:
-
- ```Shell
- ## Default parameters for individual fdbserver processes
- [fdbserver]
- logdir = /mnt/disk1/foundationdb/log
- datadir = /mnt/disk1/foundationdb/data/$ID
-
-
- [backup_agent]
- command = /usr/lib/foundationdb/backup_agent/backup_agent
- logdir = /mnt/disk1/foundationdb/log
-
- [backup_agent.1]
- ```
-
-3. Use `fdbcli` to create a database with SSD as its storage engine:
-
- ```Shell
- user@host$ fdbcli
- Using cluster file `/etc/foundationdb/fdb.cluster'.
-
- The database is unavailable; type `status' for more information.
-
- Welcome to the fdbcli. For help, type `help'.
- fdb> configure new single ssd
- Database created
- ```
-
-## Test data cleanup (wipes all data; debug environments only)
-
-### Clean up the cluster
-
-Delete the doris-meta and Storage data as usual
-
-Clear the FDB data; `${instance_id}` must be replaced with the actual value
-
-1. Clear the instance info (including instance and cluster info)
-
-2. Clear meta info
-
-3. Clear Version info
-
-4. Clear txn info
-
-5. Clear stats
-
-6. Clear job info
-
-```shell
-fdbcli --exec "writemode on;clearrange \x01\x10instance\x00\x01\x10${instance_id}\x00\x01 \x01\x10instance\x00\x01\x10${instance_id}\x00\xff\x00\x01"
-fdbcli --exec "writemode on;clearrange \x01\x10meta\x00\x01\x10${instance_id}\x00\x01 \x01\x10meta\x00\x01\x10${instance_id}\x00\xff\x00\x01"
-fdbcli --exec "writemode on;clearrange \x01\x10txn\x00\x01\x10${instance_id}\x00\x01 \x01\x10txn\x00\x01\x10${instance_id}\x00\xff\x00\x01"
-fdbcli --exec "writemode on;clearrange \x01\x10version\x00\x01\x10${instance_id}\x00\x01 \x01\x10version\x00\x01\x10${instance_id}\x00\xff\x00\x01"
-fdbcli --exec "writemode on;clearrange \x01\x10stats\x00\x01\x10${instance_id}\x00\x01 \x01\x10stats\x00\x01\x10${instance_id}\x00\xff\x00\x01"
-fdbcli --exec "writemode on;clearrange \x01\x10recycle\x00\x01\x10${instance_id}\x00\x01 \x01\x10recycle\x00\x01\x10${instance_id}\x00\xff\x00\x01"
-fdbcli --exec "writemode on;clearrange \x01\x10job\x00\x01\x10${instance_id}\x00\x01 \x01\x10job\x00\x01\x10${instance_id}\x00\xff\x00\x01"
-fdbcli --exec "writemode on;clearrange \x01\x10copy\x00\x01\x10${instance_id}\x00\x01 \x01\x10copy\x00\x01\x10${instance_id}\x00\xff\x00\x01"
-```
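Each of those clearrange commands follows one pattern: the key prefix is `\x01\x10<category>\x00\x01\x10<instance_id>\x00`, with a `\x01` lower bound and a `\xff\x00\x01` upper bound. A hedged helper that prints the command for any category (the function name is ours, not part of Doris):

```shell
# Print the fdbcli clearrange command for one metadata category.
print_clearrange() {
    # $1 = category (instance, meta, txn, ...), $2 = instance_id
    prefix="\\x01\\x10$1\\x00\\x01\\x10$2\\x00"
    printf 'fdbcli --exec "writemode on;clearrange %s\\x01 %s\\xff\\x00\\x01"\n' "$prefix" "$prefix"
}

for c in instance meta txn version stats recycle job copy; do
    print_clearrange "$c" my_instance_id
done
```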
-
-### Clean up the cluster (data other than KV)
-Per the actually configured object-storage or HDFS prefix or directory, call the corresponding storage system's API to delete that prefix or directory directly.
+**In Doris cloud mode the FE discovers the corresponding BEs automatically; do not use alter system add or drop backend to manage nodes.**
+Watch the logs after startup.
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/install-fdb.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/install-fdb.md
new file mode 100644
index 00000000000..3b280d08775
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/install-fdb.md
@@ -0,0 +1,304 @@
+---
+{
+ "title": "Install FDB",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+
+## 0. Machine requirements
+
+- Generally at least 3 machines are needed to form a double-replica fdb cluster that tolerates a single machine failure.
+
+> For development or testing only, a single machine is enough.
+
+## 1. Installation
+
+The fdb service must be installed on every machine first. Download packages from https://github.com/apple/foundationdb/releases and pick a version; the commonly used one is currently [7.1.38](https://github.com/apple/foundationdb/releases/tag/7.1.38).
+
+Usually only the centos (redhat) and ubuntu packages matter
+
+https://github.com/apple/foundationdb/releases/download/7.1.38/foundationdb-clients_7.1.38-1_amd64.deb
+
+https://github.com/apple/foundationdb/releases/download/7.1.38/foundationdb-server_7.1.38-1_amd64.deb
+
+https://github.com/apple/foundationdb/releases/download/7.1.38/foundationdb-clients-7.1.38-1.el7.x86_64.rpm
+
+https://github.com/apple/foundationdb/releases/download/7.1.38/foundationdb-server-7.1.38-1.el7.x86_64.rpm
+
+Install fdb
+
+```Shell
+# Ubuntu
+user@host$ sudo dpkg -i foundationdb-clients_7.1.38-1_amd64.deb \
+foundationdb-server_7.1.38-1_amd64.deb
+
+# CentOS
+user@host$ sudo rpm -Uvh foundationdb-clients-7.1.38-1.el7.x86_64.rpm \
+foundationdb-server-7.1.38-1.el7.x86_64.rpm
+```
+
+After installation, run fdbcli on the command line to check whether the install succeeded; on success it shows "available":
+
+```Shell
+user@host$ fdbcli
+Using cluster file `/etc/foundationdb/fdb.cluster'.
+
+The database is available.
+
+Welcome to the fdbcli. For help, type `help'.
+```
+
+A few things to note after installation:
+
+- An fdb service is started by default.
+- The default cluster file fdb.cluster is at /etc/foundationdb/fdb.cluster, and the default config file foundationdb.conf is at /etc/foundationdb/foundationdb.conf.
+- By default data and logs are kept in /var/lib/foundationdb/data/ and /var/log/foundationdb.
+- A foundationdb user and group are created by default; the data and log paths are already accessible to foundationdb.
+
+## 2. Host configuration
+
+Pick one of the three machines as the host; configure the host first, then the other machines.
+
+### 2.1 Change the fdb configuration
+
+Use a different fdb configuration per machine type
+
+The foundationdb.conf currently used for the 8-core, 32 GB RAM, one 500 GB data disk machine type is below. Mind the log and data paths; data disks are usually mounted under /mnt:
+
+> Only config templates for the 8C32G and 4C16G machine types exist so far; change the config for your machine type when deploying.
+
+```Shell
+# foundationdb.conf
+##
+## Configuration file for FoundationDB server processes
+## Full documentation is available at
+## https://apple.github.io/foundationdb/configuration.html#the-configuration-file
+
+[fdbmonitor]
+user = foundationdb
+group = foundationdb
+
+[general]
+restart-delay = 60
+## by default, restart-backoff = restart-delay-reset-interval = restart-delay
+# initial-restart-delay = 0
+# restart-backoff = 60
+# restart-delay-reset-interval = 60
+cluster-file = /etc/foundationdb/fdb.cluster
+# delete-envvars =
+# kill-on-configuration-change = true
+
+## Default parameters for individual fdbserver processes
+[fdbserver]
+command = /usr/sbin/fdbserver
+public-address = auto:$ID
+listen-address = public
+logdir = /mnt/foundationdb/log
+datadir = /mnt/foundationdb/data/$ID
+# logsize = 10MiB
+# maxlogssize = 100MiB
+# machine-id =
+# datacenter-id =
+# class =
+# memory = 8GiB
+# storage-memory = 1GiB
+# cache-memory = 2GiB
+# metrics-cluster =
+# metrics-prefix =
+
+## An individual fdbserver process with id 4500
+## Parameters set here override defaults from the [fdbserver] section
+[fdbserver.4500]
+class = stateless
+[fdbserver.4501]
+class = stateless
+
+[fdbserver.4502]
+class = storage
+
+[fdbserver.4503]
+class = storage
+
+[fdbserver.4504]
+class = log
+
+[backup_agent]
+command = /usr/lib/foundationdb/backup_agent/backup_agent
+logdir = /mnt/foundationdb/log
+
+[backup_agent.1]
+```
+
+First create the corresponding directories on the host per the datadir and logdir paths above, and give foundationdb access to them:
+
+```Shell
+chown -R foundationdb:foundationdb /mnt/foundationdb/data/ /mnt/foundationdb/log
+```
+
+Then replace the content of /etc/foundationdb/foundationdb.conf with the configuration above.
+
+### 2.2 Configure access permissions
+
+First set the permissions on the /etc/foundationdb directory:
+
+```Shell
+chmod -R 777 /etc/foundationdb
+```
+
+On the host, change the IP address in /etc/foundationdb/fdb.cluster; the default is the local address, change it to the internal network address, e.g.
+```Shell
+3OrXp9ei:[email protected]:4500 -> 3OrXp9ei:[email protected]:4500
+```
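That rewrite can also be scripted. A hedged sketch: it works on a temporary copy instead of /etc/foundationdb/fdb.cluster, the addresses are illustrative, and it assumes the default single-line `<description>:<id>@<ip>:<port>` cluster-file layout:

```shell
# Rewrite the coordinator address in a copy of the cluster file from
# the loopback address to the internal address.
internal_ip=172.21.16.37
cluster_file=$(mktemp)   # stand-in for /etc/foundationdb/fdb.cluster
echo '3OrXp9ei:[email protected]:4500' > "$cluster_file"
sed -i "s/@127\\.0\\.0\\.1:/@${internal_ip}:/" "$cluster_file"
cat "$cluster_file"
# 3OrXp9ei:[email protected]:4500
```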
+
+Then restart the fdb service
+
+```Shell
+# for service
+user@host$ sudo service foundationdb restart
+
+# for systemd
+user@host$ sudo systemctl restart foundationdb.service
+```
+
+### 2.3 Configure a new database
+
+Because the host's data and log paths changed, a new database is needed; create one in fdbcli with the ssd storage engine.
+
+```Shell
+user@host$ fdbcli
+fdb> configure new single ssd
+Database created
+```
+
+Finally check with fdbcli that it starts up normally
+
+```Shell
+user@host$ fdbcli
+Using cluster file `/etc/foundationdb/fdb.cluster'.
+
+The database is available.
+
+Welcome to the fdbcli. For help, type `help'.
+```
+
+Host configuration is now complete.
+
+## 3. Build the cluster
+
+> If deploying only one machine for testing, skip this step.
+
+For each of the remaining machines, first create the data and log directories per step 2.1.
+
+Then set the permissions on the /etc/foundationdb directory:
+
+```Shell
+chmod -R 777 /etc/foundationdb
+```
+
+Next, replace this machine's /etc/foundationdb/foundationdb.conf and /etc/foundationdb/fdb.cluster with the host's copies of those files.
+
+Then restart the fdb service on this machine
+
+```Shell
+# for service
+user@host$ sudo service foundationdb restart
+
+# for systemd
+user@host$ sudo systemctl restart foundationdb.service
+```
+
+Once all machines are done, they are all joined to the same cluster (the same fdb.cluster). Now log in to the host and configure double-replica mode:
+
+```Shell
+user@host$ fdbcli
+Using cluster file `/etc/foundationdb/fdb.cluster'.
+
+The database is available.
+
+Welcome to the fdbcli. For help, type `help'.
+fdb> configure double
+Configuration changed.
+```
+
+Then, on the host, configure the machines and ports through which fdb.cluster can be reached (the coordinators), for fault tolerance:
+
+```Shell
+user@host$ fdbcli
+Using cluster file `/etc/foundationdb/fdb.cluster'.
+
+The database is available.
+
+Welcome to the fdbcli. For help, type `help'.
+fdb> coordinators <host_ip>:4500 <replica1_ip>:4500 <replica2_ip>:4500   (list all machines)
+Coordinators changed
+```
+
+Finally use status in fdbcli to check whether the mode was configured successfully:
+
+```Shell
+[root@ip-10-100-3-91 recycler]# fdbcli
+Using cluster file `/etc/foundationdb/fdb.cluster'.
+
+The database is available.
+
+Welcome to the fdbcli. For help, type `help'.
+fdb> status
+
+Using cluster file `/etc/foundationdb/fdb.cluster'.
+
+Configuration:
+ Redundancy mode - double
+ Storage engine - ssd-2
+ Coordinators - 3
+ Usable Regions - 1
+
+Cluster:
+ FoundationDB processes - 15
+ Zones - 3
+ Machines - 3
+ Memory availability - 6.1 GB per process on machine with least available
+ Fault Tolerance - 1 machines
+ Server time - 11/11/22 04:47:30
+
+Data:
+ Replication health - Healthy
+ Moving data - 0.000 GB
+ Sum of key-value sizes - 0 MB
+ Disk space used - 944 MB
+
+Operating space:
+ Storage server - 473.9 GB free on most full server
+ Log server - 473.9 GB free on most full server
+
+Workload:
+ Read rate - 19 Hz
+ Write rate - 0 Hz
+ Transactions started - 5 Hz
+ Transactions committed - 0 Hz
+ Conflict rate - 0 Hz
+
+Backup and DR:
+ Running backups - 0
+ Running DRs - 0
+```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/meta-service-resource-http-api.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/meta-service-resource-http-api.md
index 4e2c2e67bba..25d7f757404 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/meta-service-resource-http-api.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/meta-service-resource-http-api.md
@@ -47,6 +47,17 @@ PUT /MetaService/http/v1/create_instance?token=<token> HTTP/1.1
For compatibility, the previous interfaces (without `v1/`) remain accessible.
+## Field value requirements
+
+Some field values in this document have specific value and format requirements to watch for.
+
+Field | Description | Notes
+------| ------| ------
+instance_id | id of the warehouse in the separated architecture; usually a uuid string | must be unique across history
+cloud_unique_id | a be.conf/fe.conf config in the separated architecture, also required when requesting compute-cluster creation; format `1:<instance_id>:<string>` | e.g. "1:regression_instance0:regression-cloud-unique-id-1"
+cluster_name | field passed when describing a compute cluster; must be an identifier matching the pattern `[a-zA-Z][0-9a-zA-Z_]+` | e.g. write_cluster or read_cluster0
+
+
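The cluster_name constraint can be checked before calling the API. A minimal sketch (the helper name is ours, not part of meta-service):

```shell
# Succeeds iff $1 matches the cluster_name pattern [a-zA-Z][0-9a-zA-Z_]+.
valid_cluster_name() {
    printf '%s\n' "$1" | grep -Eq '^[a-zA-Z][0-9a-zA-Z_]+$'
}

valid_cluster_name write_cluster && echo accepted
# accepted
valid_cluster_name 0bad_cluster || echo rejected
# rejected
```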
## Create Instance
### Interface description
@@ -86,25 +97,25 @@ Content-Type: text/plain
```
* Request parameters
-| Parameter | Description | Required | Notes |
-|----------------------------|------------------------|------|--------------------------------|
-| instance_id | instance_id | Yes | globally unique (including historically) |
-| name | instance alias | No | |
-| user_id | user id | Yes | |
-| obj_info | S3 connection config | Yes | |
-| obj_info.ak | S3 access key | Yes | |
-| obj_info.sk | S3 secret key | Yes | |
-| obj_info.bucket | S3 bucket name | Yes | |
-| obj_info.prefix | prefix under which data is stored on S3 | No | bucket root if unset |
-| obj_info.endpoint | S3 endpoint | Yes | |
-| obj_info.region | S3 region | Yes | |
-| obj_info.external_endpoint | S3 external endpoint | No | for oss compatibility; oss distinguishes external/internal |
-| obj_info.provider | S3 provider | Yes | |
-| obj_info.user_id | bucket user_id | No | used for ak/sk rotation, marks which obj entries need new ak/sk |
-| ram_user | ram_user info, for external bucket authorization | No | |
-| ram_user.user_id | | Yes | |
-| ram_user.ak | | Yes | |
-| ram_user.sk | | Yes | |
+| Parameter | Description | Required | Notes |
+|----------------------------|------------------------|------|--------------------------------|
+| instance_id | instance_id | Yes | globally unique (including historically), usually a uuid string |
+| name | instance alias | No | |
+| user_id | user id | Yes | |
+| obj_info | S3 connection config | Yes | |
+| obj_info.ak | S3 access key | Yes | |
+| obj_info.sk | S3 secret key | Yes | |
+| obj_info.bucket | S3 bucket name | Yes | |
+| obj_info.prefix | prefix under which data is stored on S3 | No | bucket root if unset |
+| obj_info.endpoint | S3 endpoint | Yes | |
+| obj_info.region | S3 region | Yes | |
+| obj_info.external_endpoint | S3 external endpoint | No | for oss compatibility; oss distinguishes external/internal |
+| obj_info.provider | S3 provider | Yes | |
+| obj_info.user_id | bucket user_id | No | used for ak/sk rotation, marks which obj entries need new ak/sk |
+| ram_user | ram_user info, for external bucket authorization | No | |
+| ram_user.user_id | | Yes | |
+| ram_user.ak | | Yes | |
+| ram_user.sk | | Yes | |
* Request example
@@ -279,7 +290,7 @@ Content-Type: text/plain
"type": "SQL",
"nodes": [
{
- "cloud_unique_id": "regression-cloud-unique-id-fe-1",
+ "cloud_unique_id": "1:regression_instance0:regression-cloud-unique-id-fe-1",
"ip": "127.0.0.1",
"ctime": "1669260437",
"mtime": "1669260437",
@@ -294,7 +305,7 @@ Content-Type: text/plain
"type": "COMPUTE",
"nodes": [
{
- "cloud_unique_id": "regression-cloud-unique-id0",
+ "cloud_unique_id": "1:regression_instance0:regression-cloud-unique-id0",
"ip": "127.0.0.1",
"ctime": "1669260437",
"mtime": "1669260437",
@@ -311,7 +322,7 @@ Content-Type: text/plain
"type": "COMPUTE",
"nodes": [
{
- "cloud_unique_id": "regression-cloud-unique-id0",
+ "cloud_unique_id": "1:regression_instance0:regression-cloud-unique-id0",
"ip": "127.0.0.1",
"ctime": "1669260437",
"mtime": "1669260437",
@@ -329,7 +340,7 @@ Content-Type: text/plain
"type": "COMPUTE",
"nodes": [
{
- "cloud_unique_id": "regression-cloud-unique-id0",
+ "cloud_unique_id": "1:regression_instance0:regression-cloud-unique-id0",
"ip": "127.0.0.1",
"ctime": "1669260437",
"mtime": "1669260437",
@@ -422,19 +433,19 @@ Content-Type: text/plain
```
* Request parameters
-| Parameter | Description | Required | Notes |
-|-------------------------------|--------------------|------|--------------------------------|
-| instance_id | instance_id | Yes | globally unique (including historically) |
-| cluster | cluster object | Yes | |
-| cluster.cluster_name | cluster name | Yes | the fe cluster's name is special, default RESERVED_CLUSTER_NAME_FOR_SQL_SERVER; configurable via cloud_observer_cluster_name in fe.conf |
-| cluster.cluster_id | cluster id | Yes | the fe cluster's id is special, default RESERVED_CLUSTER_ID_FOR_SQL_SERVER; configurable via cloud_observer_cluster_id in fe.conf |
-| cluster.type | type of the cluster's nodes | Yes | two types supported: "SQL" (sql service, i.e. fe) and "COMPUTE" (compute nodes, i.e. be) |
-| cluster.nodes | array of nodes in the cluster | Yes | |
-| cluster.nodes.cloud_unique_id | node cloud_unique_id | Yes | the cloud_unique_id config in fe.conf / be.conf |
-| cluster.nodes.ip | node ip | Yes | |
-| cluster.nodes.heartbeat_port | be heartbeat port | Yes | the heartbeat_service_port config in be.conf |
-| cluster.nodes.edit_log_port | fe edit log port | Yes | the edit_log_port config in fe.conf |
-| cluster.nodes.node_type | fe node type | Yes | required when cluster type is SQL; either "FE_MASTER" (this node is the master) or "FE_OBSERVER" (this node is an observer); note: a "SQL" cluster's nodes array must contain exactly one "FE_MASTER" node plus any number of "FE_OBSERVER" nodes |
+| Parameter | Description | Required | Notes |
+|-------------------------------|--------------------|------|--------------------------------|
+| instance_id | instance_id | Yes | globally unique (including historically) |
+| cluster | cluster object | Yes | |
+| cluster.cluster_name | cluster name | Yes | the fe cluster's name is special, default RESERVED_CLUSTER_NAME_FOR_SQL_SERVER; configurable via cloud_observer_cluster_name in fe.conf |
+| cluster.cluster_id | cluster id | Yes | the fe cluster's id is special, default RESERVED_CLUSTER_ID_FOR_SQL_SERVER; configurable via cloud_observer_cluster_id in fe.conf |
+| cluster.type | type of the cluster's nodes | Yes | two types supported: "SQL" (sql service, i.e. fe) and "COMPUTE" (compute nodes, i.e. be) |
+| cluster.nodes | array of nodes in the cluster | Yes | |
+| cluster.nodes.cloud_unique_id | node cloud_unique_id | Yes | the cloud_unique_id config in fe.conf / be.conf |
+| cluster.nodes.ip | node ip | Yes | |
+| cluster.nodes.heartbeat_port | be heartbeat port | Yes | the heartbeat_service_port config in be.conf |
+| cluster.nodes.edit_log_port | fe edit log port | Yes | the edit_log_port config in fe.conf |
+| cluster.nodes.node_type | fe node type | Yes | required when cluster type is SQL; either "FE_MASTER" (this node is the master) or "FE_OBSERVER" (this node is an observer); note: a "SQL" cluster's nodes array must contain exactly one "FE_MASTER" node plus any number of "FE_OBSERVER" nodes |
* Request example
@@ -451,7 +462,7 @@ Content-Type: text/plain
"type": "COMPUTE",
"nodes": [
{
- "cloud_unique_id": "cloud_unique_id_compute_node1",
+ "cloud_unique_id": "1:regression_instance0:cloud_unique_id_compute_node1",
"ip": "172.21.0.5",
"heartbeat_port": 9050
}
@@ -524,7 +535,7 @@ Content-Type: text/plain
{
"instance_id":"regression_instance0",
- "cloud_unique_id":"regression-cloud-unique-id-fe-1",
+ "cloud_unique_id":"1:regression_instance0:regression-cloud-unique-id-fe-1",
"cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
"cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER"
}
@@ -550,7 +561,7 @@ Content-Type: text/plain
"type": "COMPUTE",
"nodes": [
{
- "cloud_unique_id": "cloud_unique_id_compute_node0",
+ "cloud_unique_id": "1:regression_instance0:cloud_unique_id_compute_node0",
"ip": "172.21.16.42",
"ctime": "1662695469",
"mtime": "1662695469",
@@ -565,7 +576,7 @@ Content-Type: text/plain
```
{
"code": "NOT_FOUND",
- "msg": "fail to get cluster with instance_id: \"instance_id_deadbeef\" cloud_unique_id: \"dengxin_cloud_unique_id_compute_node0\" cluster_name: \"cluster_name\" "
+ "msg": "fail to get cluster with instance_id: \"instance_id_deadbeef\" cloud_unique_id: \"1:regression_instance0:xxx_cloud_unique_id_compute_node0\" cluster_name: \"cluster_name\" "
}
```
@@ -778,12 +789,12 @@ Content-Type: text/plain
"type": "COMPUTE",
"nodes": [
{
- "cloud_unique_id": "cloud_unique_id_compute_node2",
+ "cloud_unique_id": "1:regression_instance0:cloud_unique_id_compute_node2",
"ip": "172.21.0.50",
"heartbeat_port": 9051
},
{
- "cloud_unique_id": "cloud_unique_id_compute_node3",
+ "cloud_unique_id": "1:regression_instance0:cloud_unique_id_compute_node3",
"ip": "172.21.0.52",
"heartbeat_port": 9052
}
@@ -879,12 +890,12 @@ Content-Type: text/plain
"type": "COMPUTE",
"nodes": [
{
- "cloud_unique_id": "cloud_unique_id_compute_node2",
+ "cloud_unique_id": "1:instance_id_deadbeef_1:cloud_unique_id_compute_node2",
"ip": "172.21.0.50",
"heartbeat_port": 9051
},
{
- "cloud_unique_id": "cloud_unique_id_compute_node3",
+ "cloud_unique_id": "1:instance_id_deadbeef_1:cloud_unique_id_compute_node3",
"ip": "172.21.0.52",
"heartbeat_port": 9052
}
@@ -1014,7 +1025,7 @@ PUT /MetaService/http/get_obj_store_info?token=<token> HTTP/1.1
Content-Length: <ContentLength>
Content-Type: text/plain
-{"cloud_unique_id": "cloud_unique_id_compute_node1"}
+{"cloud_unique_id": "<cloud_unique_id>"}
```
* Request parameters
@@ -1029,7 +1040,7 @@ PUT /MetaService/http/get_obj_store_info?token=<token>
HTTP/1.1
Content-Length: <ContentLength>
Content-Type: text/plain
-{"cloud_unique_id": "cloud_unique_id_compute_node1"}
+{"cloud_unique_id": "1:regression_instance0:cloud_unique_id_compute_node1"}
```
* Response parameters
@@ -1218,7 +1229,7 @@ Content-Length: <ContentLength>
Content-Type: text/plain
{
- "cloud_unique_id": "cloud_unique_id_compute_node1",
+ "cloud_unique_id": "1:regression_instance0:cloud_unique_id_compute_node1",
"obj": {
"id": "1",
"ak": "test-ak",
@@ -1300,7 +1311,7 @@ Content-Length: <ContentLength>
Content-Type: text/plain
{
- "cloud_unique_id": "cloud_unique_id_compute_node1",
+ "cloud_unique_id": "1:regression_instance0:cloud_unique_id_compute_node1",
"obj": {
"ak": "test-ak91",
"sk": "test-sk1",
@@ -1421,7 +1432,7 @@ Content-Length: <ContentLength>
Content-Type: text/plain
{
- "cloud_unique_id":"regression-cloud-unique-id0",
+ "cloud_unique_id":"1:regression_instance0:regression-cloud-unique-id0",
"tablet_idx": [{
"table_id":113973,
"index_id":113974,
@@ -1498,7 +1509,7 @@ Content-Length: <ContentLength>
Content-Type: text/plain
{
- "cloud_unique_id": "regression-cloud-unique-id0",
+ "cloud_unique_id": "1:regression_instance0:regression-cloud-unique-id0",
"txn_id": 869414052004864
}
@@ -1555,7 +1566,7 @@ Content-Length: <ContentLength>
Content-Type: text/plain
{
- "cloud_unique_id": "regression-cloud-unique-id0",
+ "cloud_unique_id": "1:regression_instance0:regression-cloud-unique-id0",
"job" : {
"idx": {"tablet_id": 113973},
"compaction": [{"id": 113974}]
@@ -1821,7 +1832,7 @@ Content-Type: text/plain
```
curl
'127.0.0.1:5008/MetaService/http/set_cluster_status?token=greedisgood9999' -d '{
- "cloud_unique_id": "regression-cloud-unique-id-fe-0128",
+ "cloud_unique_id":
"1:regression_instance0:regression-cloud-unique-id-fe-0128",
"cluster": {
"cluster_id": "test_cluster_1_id1",
"cluster_status":"STOPPED"
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/storage-vault.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/storage-vault.md
new file mode 100644
index 00000000000..aeffcb13a47
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/separation-of-storage-and-compute/storage-vault.md
@@ -0,0 +1,174 @@
+---
+{
+ "title": "storage-vault(存储后端)",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Overview
+
+A Storage Vault is the remote shared storage used by Doris in separation-of-storage-and-compute mode. Users can configure one or more Storage Vaults, and different tables may be stored on different Storage Vaults.
+
+## Terminology
+
+vault name: Each Storage Vault has a name that is globally unique within a warehouse instance. Except for the built-in vault, the vault name is specified by the user when the Storage Vault is created.
+
+built-in vault: The remote shared storage that stores Doris system tables in separation-of-storage-and-compute mode. It must be configured when the warehouse instance is created, and it has the fixed vault name "built_in_storage_vault". The warehouse (FE) cannot start unless the built-in vault is configured.
+
+default vault: The instance-level default Storage Vault. Users can designate a Storage Vault, including the built-in vault, as the default vault. Since data in separation-of-storage-and-compute mode must reside on some remote shared storage, if a table is created without a vault_name in its table properties, its data is stored on the default vault. The default vault can be reset, but the Storage Vault used by already-created tables does not change as a result.
+
+## Usage
+
+### Create Storage Vault
+
+Creates a Storage Vault.
+
+Syntax:
+
+```SQL
+CREATE STORAGE VAULT [IF NOT EXISTS] <vault_name>
+PROPERTIES
+("key" = "value",...)
+```
+
+`<vault_name>` is the user-defined name of the storage vault; it is the identifier users reference to access the storage vault.
+
+e.g.
+
+Create an HDFS storage vault:
+
+```SQL
+CREATE STORAGE VAULT IF NOT EXISTS ssb_hdfs_vault
+ PROPERTIES (
+ "type"="hdfs", -- required
+ "fs.defaultFS"="hdfs://127.0.0.1:8020", -- required
+ "path_prefix"="prefix", -- optional -> Gavin希望是required
+ "hadoop.username"="user" -- optional
+ "hadoop.security.authentication"="kerberos" -- optional
+ "hadoop.kerberos.principal"="hadoop/127.0.0.1@XXX" -- optional
+ "hadoop.kerberos.keytab"="/etc/emr.keytab" -- optional
+ );
+```
+
+Create an S3 storage vault:
+
+```SQL
+CREATE STORAGE VAULT IF NOT EXISTS ssb_s3_vault
+ PROPERTIES (
+ "type"="S3", -- required
+ "s3.endpoint" = "bj", -- required
+ "s3.region" = "bj", -- required
+ "s3.root.path" = "/path/to/root", -- required
+ "s3.access_key" = "ak", -- required
+ "s3.secret_key" = "sk", -- required
+ "provider" = "cos", -- required
+ );
+```
+
+Note: A newly created Storage Vault may not be immediately visible to the BE cluster. It is normal for data loads into a table that uses the new Storage Vault to fail for a short period (< 1 min) after creation.
+
+Properties:
+
+| Parameter                      | Description                    | Example                         |
+| ------------------------------ | ------------------------------ | ------------------------------- |
+| type                           | Currently supports s3 and hdfs | s3 \| hdfs                      |
+| fs.defaultFS                   | HDFS vault parameter           | hdfs://127.0.0.1:8020           |
+| hadoop.username                | HDFS vault parameter           | hadoop                          |
+| hadoop.security.authentication | HDFS vault parameter           | kerberos                        |
+| hadoop.kerberos.principal      | HDFS vault parameter           | hadoop/127.0.0.1@XXX            |
+| hadoop.kerberos.keytab         | HDFS vault parameter           | /etc/emr.keytab                 |
+| dfs.client.socket-timeout      | HDFS vault parameter           | dfs.client.socket-timeout=60000 |
+
+### Show Storage Vault
+
+Syntax:
+
+```Plain
+SHOW STORAGE VAULT
+```
+
+The output has four columns: the vault name, the vault id, the vault properties, and whether the vault is the default.
+
+```SQL
+mysql> show storage vault;
++------------------------+----------------+-------------------------------------------------------------------------------------------------+-----------+
+| StorageVaultName       | StorageVaultId | Propeties                                                                                       | IsDefault |
++------------------------+----------------+-------------------------------------------------------------------------------------------------+-----------+
+| built_in_storage_vault | 1              | build_conf { fs_name: "hdfs://127.0.0.1:8020" } prefix: "_1CF80628-16CF-0A46-54EE-2C4A54AB1519" | false     |
+| hdfs_vault             | 2              | build_conf { fs_name: "hdfs://127.0.0.1:8020" } prefix: "_0717D76E-FF5E-27C8-D9E3-6162BC913D97" | false     |
++------------------------+----------------+-------------------------------------------------------------------------------------------------+-----------+
+```
+
+### Set Default Storage Vault
+
+Syntax:
+
+```SQL
+SET <vault_name> AS DEFAULT STORAGE VAULT
+```
+
+### Create a table with a specified vault name
+
+If "storage_vault_name" is specified in the table properties at creation time, the table data is stored on the Storage Vault with that vault name. Once the table is created, its storage_vault_name can no longer be modified, i.e. switching to another Storage Vault is not supported.
+
+e.g.
+
+```SQL
+CREATE TABLE IF NOT EXISTS `supplier` (
+ `s_suppkey` int(11) NOT NULL COMMENT "",
+ `s_name` varchar(26) NOT NULL COMMENT "",
+ `s_address` varchar(26) NOT NULL COMMENT "",
+ `s_city` varchar(11) NOT NULL COMMENT "",
+ `s_nation` varchar(16) NOT NULL COMMENT "",
+ `s_region` varchar(13) NOT NULL COMMENT "",
+ `s_phone` varchar(16) NOT NULL COMMENT ""
+)
+UNIQUE KEY (`s_suppkey`)
+DISTRIBUTED BY HASH(`s_suppkey`) BUCKETS 1
+PROPERTIES (
+"replication_num" = "1",
+"storage_vault_name" = "ssb_hdfs_vault"
+);
+```
+
+### Built-in storage vault
+
+When creating an instance (create instance), users can choose vault mode or non-vault mode. In vault mode, the vault that is passed in is set as the built-in storage vault. The built-in storage vault stores the data of internal tables (such as the statistics tables); in vault mode, the FE cannot start normally unless the built-in storage vault has been created.
+
+Users may also store the data of their own tables on the built-in storage vault, either by setting the built-in storage vault as the default storage vault, or by setting a table's storage_vault_name property to the built-in storage vault at table creation time.
+
+### Alter Storage Vault
+
+TBD
+
+Updates the modifiable properties of a Storage Vault's configuration.
+
+### Drop Storage Vault
+
+TBD
+
+Only a Storage Vault that is not the default vault and is not referenced by any table can be dropped.
+
+
+### Permissions
+
+TBD
diff --git a/sidebars.json b/sidebars.json
index b11444ecb82..4c1a3c6efb9 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -1456,11 +1456,13 @@
"label": "Separation of Storage and Compute",
"items": [
"separation-of-storage-and-compute/overview",
- "separation-of-storage-and-compute/use-case",
+ "separation-of-storage-and-compute/deployment",
"separation-of-storage-and-compute/compute-cluster",
"separation-of-storage-and-compute/file-cache",
- "separation-of-storage-and-compute/deployment",
-
"separation-of-storage-and-compute/meta-service-resource-http-api"
+ "separation-of-storage-and-compute/storage-vault",
+
"separation-of-storage-and-compute/meta-service-resource-http-api",
+ "separation-of-storage-and-compute/install-fdb",
+ "separation-of-storage-and-compute/use-case"
]
},
{
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]