This is an automated email from the ASF dual-hosted git repository.
ethanfeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new 91d8f955c [CELEBORN-1622][CIP-11] Adding documentation for Worker Tags
feature
91d8f955c is described below
commit 91d8f955cadff49ce5db9d97dd4158864a5def6c
Author: Sanskar Modi <[email protected]>
AuthorDate: Tue Dec 10 15:56:58 2024 +0800
[CELEBORN-1622][CIP-11] Adding documentation for Worker Tags feature
### What changes were proposed in this pull request?
Adding documentation for Worker Tags feature
### Why are the changes needed?
https://cwiki.apache.org/confluence/display/CELEBORN/CIP-11+Supporting+Tags+in+Celeborn
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
NA
Closes #2981 from s0nskar/tags_docu.
Authored-by: Sanskar Modi <[email protected]>
Signed-off-by: mingji <[email protected]>
---
docs/worker_tags.md | 171 ++++++++++++++++++++++++++++++++++++++++++++++++++++
mkdocs.yml | 1 +
2 files changed, 172 insertions(+)
diff --git a/docs/worker_tags.md b/docs/worker_tags.md
new file mode 100644
index 000000000..b2408c90d
--- /dev/null
+++ b/docs/worker_tags.md
@@ -0,0 +1,171 @@
+---
+license: |
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ https://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+---
+
+# Worker Tags
+
+Worker tags in Celeborn allow users to assign specific tags (labels) to workers
+within a cluster. These tags enable grouping workers with similar
characteristics,
+allowing applications with different priorities or users to access distinct
+groups of workers, thereby creating isolated sub-clusters.
+
+Worker tags can be applied for various purposes, including but not limited to:
+
+ - **Configuration-Based Tagging**: Workers tagged by hardware configurations
(e.g., "hdd-14t", "ssd-245g", "high-nw").
+ - **Environment Segmentation**: Workers grouped by environment names, such
as "production" or "staging".
+ - **Tenant Isolation**: Tags for different tenants to ensure resource
isolation.
+ - **Rolling Upgrades**: Tags like "v0-6-0" to manage controlled rolling
upgrades effectively.
+
+## Configuration
+
+Worker tags can be enabled by setting `celeborn.tags.enabled` to `true` in the
+`Master`. When enabled, `Master` will start selecting the workers using
`TagsManager`
+based on the tag expression provided for the application. If the worker tagging
+is disabled or the application tag expression is empty, then all the available
+workers will be selected.
+
+Worker tags are part of `SystemConfig` and can be assigned and updated
dynamically
+using [dynamic config service backends](#store-backends).
+
+### Tags Expression
+
+Tags expression for an application is specified using `celeborn.tags.tagsExpr`.
+This is a dynamic configuration that can be applied at the system, tenant, or
+tenant-user level by administrators via the dynamic configuration service.
+
+Clients can also specify custom tag expressions for applications using the
+`celeborn.tags.tagsExpr` property, if the administrator sets
+`celeborn.tags.preferClientTagsExpr` to true in the dynamic configuration.
+
+Tags expressions are defined as a comma-separated list of tags, where each tag
+is evaluated as an "AND" condition. This means only workers that match all
+specified tags will be selected.
+
+Example tag expression: `env=production,region=us-east,high-io,v0-0-6`
+This tags expression will select workers that have all the following tags:
+ - `env=production`
+ - `region=us-east`
+ - `high-io`
+ - `v0-0-6`
+
+### TagsQL
+
+TagsQL extends the default tags expression format, offering enhanced
flexibility
+for worker tag selection. TagsQL can be enabled by setting
`celeborn.tags.useTagsQL`
+to `true` in `Master`.
+
+TagsQL allows users to select workers based on tag key-value pairs and supports
+the following syntax:
+ - Match single value: `key:value`
+ - Negate single value: `key:!value`
+ - Match list of values: `key:{value1,value2}`
+ - Negate list of values: `key:!{value1,value2}`
+
+Example TagsQL expression: `env:production region:{us-east,us-west}
env:!sandbox`
+This tags expression will select the workers that have the following tags:
+ - `env=production`
+ - `region=us-east` OR `region=us-west`
+
+and will ignore the workers that have the following tags:
+ - `env=sandbox`
+
+**NOTE: TagsQL only supports tags key-value pairs separate by a equal sign
(`=`).**
+
+### Store Backends
+
+#### FileSystem Store Backend
+
+This backend reads [worker tags and configurations](#configuration) settings
from a
+user-specified dynamic config file. For more information on using the
FileSystem config store
+backend, refer to [filesystem config
service](../developers/configuration#filesystem-config-service).
+
+Here is an example of worker tags assignment and configuration via YAML file:
+
+```yaml
+- level: SYSTEM
+ config:
+ celeborn.tags.preferClientTagsExpr: false
+ celeborn.tags.tagsExpr: 'env=production'
+ tags:
+ env=production:
+ - 'host1:1111'
+ - 'host2:2222'
+ env=staging:
+ - 'host3:3333'
+ region=us-east:
+ - 'host1:1111'
+ - 'host3:3333'
+ region=us-west:
+ - 'host2:2222'
+- tenantId: tenant_01
+ level: TENANT
+ config:
+ celeborn.tags.preferClientTagsExpr: false
+ celeborn.tags.tagsExpr: 'env=production,region=us-east'
+ users:
+ - name: Jerry
+ config:
+ celeborn.tags.preferClientTagsExpr: true
+```
+
+#### Database Store Backend
+
+This backend reads [worker tags and configurations](#configuration) settings
from a
+user-specified database. For more information on using the database store
backend,
+refer to [database config
service](../developers/configuration#database-config-service).
+
+Here is an example SQL of worker tags assignment and configuration:
+
+```sql
+# SYSTEM level configuration
+INSERT INTO `celeborn_cluster_system_config` ( `id`, `cluster_id`,
`config_key`, `config_value`, `type`, `gmt_create`, `gmt_modify` )
+VALUES
+ ( 1, 1, 'celeborn.tags.preferClientTagsExpr', 'true', 'master',
'2024-02-27 22:08:30', '2024-02-27 22:08:30' ),
+ ( 2, 1, 'celeborn.tags.tagsExpr', 'env=production', 'master', '2024-02-27
22:08:30', '2024-02-27 22:08:30' ),
+
+# TENANT/TENANT_USER level configuration
+INSERT INTO `celeborn_cluster_tenant_config` ( `id`, `cluster_id`,
`tenant_id`, `level`, `name`, `config_key`, `config_value`, `type`,
`gmt_create`, `gmt_modify` )
+VALUES
+ ( 1, 1, 'tenant_01', 'TENANT', '', 'celeborn.tags.preferClientTagsExpr',
'true', 'master', '2024-02-27 22:08:30', '2024-02-27 22:08:30' ),
+ ( 2, 1, 'tenant_01', 'TENANT', '', 'celeborn.tags.tagsExpr',
'env=production,region=us-east', 'master', '2024-02-27 22:08:30', '2024-02-27
22:08:30' ),
+ ( 3, 1, 'tenant_01', 'TENANT_USER', 'Jerry',
'celeborn.tags.preferClientTagsExpr', 'true', 'master', '2024-02-27 22:08:30',
'2024-02-27 22:08:30' ),
+
+# Worker Tags Assignment
+INSERT INTO `celeborn_cluster_tags` ( `id`, `cluster_id`, `tag`, `worker_id`,
`gmt_create`, `gmt_modify` )
+VALUES
+ ( 1, 1, 'env=production', 'host1:1111', '2023-08-26 22:08:30', '2023-08-26
22:08:30' ),
+ ( 2, 1, 'env=production', 'host2:2222', '2023-08-26 22:08:30', '2023-08-26
22:08:30' ),
+ ( 3, 1, 'env=staging', 'host3:3333', '2023-08-26 22:08:30', '2023-08-26
22:08:30' ),
+ ( 4, 1, 'region=us-east', 'host1:1111', '2023-08-26 22:08:30', '2023-08-26
22:08:30' ),
+ ( 5, 1, 'region=us-east', 'host3:3333', '2023-08-26 22:08:30', '2023-08-26
22:08:30' ),
+ ( 6, 1, 'region=us-west', 'host2:2222', '2023-08-26 22:08:30', '2023-08-26
22:08:30' ),
+```
+
+## FAQ
+
+#### - What happens if no worker matches the specified tagsExpr?
+If no worker matches the specified tags expression, no workers will be selected
+for the shuffle. Depending on application configurations, it can fall back to
+Spark Shuffle.
+
+#### - Can a worker have multiple tags, and can tags be updated for a running
worker?
+Yes, a worker can have multiple tags, and they can be dynamically updated for a
+running/non-running worker via dynamic config service.
+
+#### - Are there restrictions on the tag naming format?
+Tags should be alphanumeric and can include dashes, underscores, and equal
signs.
+Avoid any special characters to ensure compatibility.
diff --git a/mkdocs.yml b/mkdocs.yml
index 9b2b7a180..c3b34c48c 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -78,6 +78,7 @@ nav:
- Upgrading: upgrading.md
- Decommissioning: decommissioning.md
- Cluster Planning: cluster_planning.md
+ - Worker Tags: worker_tags.md
- Configuration: configuration/index.md
- Migration Guide: migration.md
- Developers Doc: