This is an automated email from the ASF dual-hosted git repository.
yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 9528f0e0a47 refactor workload group en (#1651)
9528f0e0a47 is described below
commit 9528f0e0a4715e19a94a37d046a7336b1c671894
Author: wangbo <[email protected]>
AuthorDate: Fri Dec 27 21:12:32 2024 +0800
refactor workload group en (#1651)
## Versions
- [x] dev
- [ ] 3.0
- [ ] 2.1
- [ ] 2.0
## Languages
- [ ] Chinese
- [x] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
.../workload-management/workload-group.md | 737 +++++++++++++++++----
.../workload-management/workload-group.md | 6 +-
.../workload-management/workload-group.md | 6 +-
.../workload-management/workload-group.md | 6 +-
4 files changed, 622 insertions(+), 133 deletions(-)
diff --git a/docs/admin-manual/workload-management/workload-group.md
b/docs/admin-manual/workload-management/workload-group.md
index dc491a96259..50e3cb65ca3 100644
--- a/docs/admin-manual/workload-management/workload-group.md
+++ b/docs/admin-manual/workload-management/workload-group.md
@@ -24,137 +24,171 @@ specific language governing permissions and limitations
under the License.
-->
-Use Workload Groups in Doris to manage and limit resources. By employing
resource control, you can effectively limit the CPU, memory, and IO resources
used by queries and imports, and create query queues to manage the maximum
concurrency of queries in the cluster. Since Doris version 2.1, CPU resource
limitations are enforced using CGroup. Before using the Workload resource
control feature, you need to configure the CGroup environment. When setting up
Workload resource control, you must [...]
-- Soft Limit: Allows borrowing resources from other Workload Groups when there
is no resource contention, potentially exceeding the soft limit.
-- Hard Limit: Ensures that the resource allocation cannot exceed the specified
quota, regardless of resource contention.
-To use Workload resource control, you need to perform the following steps:
-1. Create a Workload Group.
-2. Add resource limitation rules to the Workload Group.
-3. Bind tenants to the Workload Group.
+Workload Group is an in-process mechanism for isolating workloads.
+It achieves resource isolation by finely partitioning or limiting resources
(CPU, IO, Memory) within the BE process.
+Its principle is illustrated in the diagram below:
-## Version Upgrade Notes
-Workload resource control has been available since Doris version 2.0. In Doris
2.0, Workload resource control did not depend on CGroup, but Doris 2.1 requires
CGroup.
+
-Upgrading from Doris 2.0 to 2.1: Since Workload resource control in version
2.1 depends on CGroup, you must first configure the CGroup environment before
upgrading to Doris 2.1.
+The currently supported isolation capabilities include:
-## Configuring the CGroup Environment
-In Doris version 2.0, CPU resource limitation was implemented based on Doris's
scheduling, which provided great flexibility but lacked precise CPU isolation.
From version 2.1, Doris uses CGroup for CPU resource limitation. Users
requiring strong resource isolation are recommended to upgrade to version 2.1
and ensure CGroup is installed on all BE nodes.
+* Managing CPU resources, with support for both cpu hard limit and cpu soft
limit;
+* Managing memory resources, with support for both memory hard limit and
memory soft limit;
+* Managing IO resources, including IO generated by reading local and remote
files.
-If you used soft limits in Workload Groups in version 2.0 and upgraded to 2.1,
you also need to configure CGroup to avoid losing soft limit functionality.
Without CGroup configured, users can use all Workload Group features except CPU
limitation.
-:::tip
-1. The Doris BE node can effectively utilize the CPU and memory resources of
the machine. It is recommended to deploy only one BE instance per machine.
Currently, the workload resource management does not support deploying multiple
BE instances on a single machine.
-2. After a machine restart, the following CGroup configurations will be
cleared. If you want the configurations to persist after a reboot, you can use
systemd to set the operation as a custom system service. This way, the creation
and authorization operations will be automatically performed each time the
machine restarts.
-3. If CGroup is used within a container, the container needs to have
permissions to operate on the host machine.
- :::
+## Version Notes
-### Verifying CGroup Installation on BE Nodes
-Check /proc/filesystems to determine if CGroup is installed:
-cat /proc/filesystems | grep cgroup
-nodev cgroup
-nodev cgroup2
-nodev cgroupfs
-Look for cgroup, cgroup2, or cgroupfs in the output, indicating CGroup
support. Further verify the CGroup version.
+- The Workload Group feature has been available since Doris 2.0. In Doris 2.0,
the Workload Group feature did not rely on CGroup, but starting with Doris
2.1, it requires CGroup.
-#### Determining CGroup Version
-For CGroup V1, multiple subsystems are mounted under /sys/fs/cgroup. The
presence of /sys/fs/cgroup/cpu indicates CGroup V1 is in use:
-```
-## CGroup V1 is in use
-ls /sys/fs/cgroup/cpu
-```
+- Upgrading from Doris 1.2 to 2.0: It is recommended to enable the Workload
Group feature only after the entire cluster has been upgraded. If only some
follower FE nodes are upgraded, queries on the upgraded follower FE nodes may
fail due to the absence of Workload Group metadata on the non-upgraded FE nodes.
-For CGroup V2, all controllers are managed in a unified hierarchy. The
presence of /sys/fs/cgroup/cgroup.controllers indicates CGroup V2 is in use:
-```
-## CGroup V2 is in use
-ls /sys/fs/cgroup/cgroup.controllers
+- Upgrading from Doris 2.0 to 2.1: Since the Workload Group feature in Doris
2.1 relies on CGroup, you need to configure the CGroup environment before
upgrading to Doris 2.1.
+
+## Configuring Workload Group
+
+### Setting Up the CGroup Environment
+Workload Group supports managing CPU, memory, and IO. CPU management relies on
the CGroup component.
+To use Workload Group for CPU resource management, you must first configure
the CGroup environment.
+
+The following are the steps for configuring the CGroup environment:
+
+1. First, verify whether CGroup is installed on the node where the BE runs.
+If the output of the command below includes cgroup, CGroup V1 is installed in
the current environment.
+If it includes cgroup2, CGroup V2 is installed. You can determine which
version is active in the next step.
+```shell
+cat /proc/filesystems | grep cgroup
+nodev cgroup
+nodev cgroup2
+nodev cgroupfs
```
-Configure CGroup based on its version when using Workload resource control in
Doris.
-### Using CGroup V1
-If using CGroup V1, you need to create a CPU management directory for Doris
under the /sys/fs/cgroup/cpu directory. You can customize the directory name.
In the following example, /sys/fs/cgroup/cpu/doris is used:
+2. The active CGroup version can be confirmed based on the path name.
+```shell
+# If this path exists, CGroup V1 is currently active.
+ls /sys/fs/cgroup/cpu/
+
+# If this path exists, CGroup V2 is currently active.
+ls /sys/fs/cgroup/cgroup.controllers
```
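The two checks above can be combined into a small detection script; this is a sketch that uses only the marker paths named in this step:

```shell
# Detect the active CGroup version via the marker paths above.
if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
    echo "cgroup v2"
elif [ -d /sys/fs/cgroup/cpu ]; then
    echo "cgroup v1"
else
    echo "cgroup not mounted"
fi
```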
-## Create cgroup dir for Doris
+
+3. Create a directory named doris under the CGroup path. The directory name
can be customized by the user.
+
+```shell
+# If using CGroup V1, create the directory under the cpu directory.
mkdir /sys/fs/cgroup/cpu/doris
-## Modify the Doris cgroup directory permissions
-chmod 770 /sys/fs/cgroup/cpu/doris
-## Grant user permissions for Doris usage
-chown -R doris:doris /sys/fs/cgroup/cpu/doris
+# If using CGroup V2, create the directory directly under the cgroup directory.
+mkdir /sys/fs/cgroup/doris
```
-### Using CGroup V2
-Due to stricter permission control in CGroup V2, write access to the
cgroup.procs file in the root directory is required to move processes between
groups:
-Grant permission to the cgroup.procs directory using the following command:
+4. Ensure that the Doris BE process has read, write, and execute permissions
for this directory.
+```shell
+# If using CGroup V1, the commands are as follows:
+# 1. Make the directory readable, writable, and executable.
+chmod 770 /sys/fs/cgroup/cpu/doris
+# 2. Change the ownership of this directory to the doris account.
+chown -R doris:doris /sys/fs/cgroup/cpu/doris
+
+# If using CGroup V2, the commands are as follows:
+# 1. Make the directory readable, writable, and executable.
+chmod 770 /sys/fs/cgroup/doris
+# 2. Change the ownership of this directory to the doris account.
+chown -R doris:doris /sys/fs/cgroup/doris
```
+
+5. If using CGroup V2, the following step is also required.
+This is because CGroup V2 has stricter permission control: write permission on
the cgroup.procs file in the root directory is needed to move processes
between groups.
+If using CGroup V1, this step is not required.
+```shell
chmod a+w /sys/fs/cgroup/cgroup.procs
```
-### Configuring CGroup for BE Nodes
-Before using Workload resource control, configure the CGroup path in the BE
configuration file be/conf/be.conf:
-```
+6. Modify the BE configuration file (be/conf/be.conf) to specify the CGroup path.
+```shell
+# If using CGroup V1:
+doris_cgroup_cpu_path = /sys/fs/cgroup/cpu/doris
+
+# If using CGroup V2:
+doris_cgroup_cpu_path = /sys/fs/cgroup/doris
```
-Restart the BE node after configuring be.conf. Check the BE.INFO log for the
"add thread {pid} to group" message to confirm successful configuration.
-## Managing Resources with Workload Groups
-After creating a Workload Group, you can add resource limitation rules. Doris
currently supports the following rules:
-- Hard or soft limits on CPU
-- Hard or soft limits on memory
-- Limits on remote or local IO
-- Query queues for managing query jobs
+7. Restart the BE. If the log (be.INFO) contains the phrase 'add thread xxx to
group', the configuration was successful.
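The log check can be scripted. The sample line below is fabricated only to show the expected message shape; the real path of be.INFO depends on your BE log directory:

```shell
# Grep be.INFO for the success message from step 7; a sample file
# stands in for the real log here, and its contents are made up.
printf 'I1227 21:12:32 doris_be add thread 12345 to group\n' > /tmp/be.INFO.sample
if grep -q "add thread .* to group" /tmp/be.INFO.sample; then
    echo "cgroup configured"
fi
```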
-### Creating Custom Workload Groups
-Use an ADMIN user to create Workload Groups and add resource rules using the
CREATE WORKLOAD GROUP statement. Since Doris 2.1, a default Workload Group
named normal is automatically created, and users are bound to it by default.
The following example creates a Workload Group g1 with CPU and memory resource
limits:
+:::tip
+1. It is recommended to deploy only one BE per machine, as the current
Workload Group feature does not support deploying multiple BE instances on a
single machine.
+2. After a machine is restarted, all configurations under the CGroup path will
be cleared.
+To persist the CGroup configuration, you can use systemd to set the operation
as a custom system service,
+so that the creation and authorization operations can be automatically
performed each time the machine restarts.
+3. If using CGroup within a container, the container must have permission to
operate on the host machine.
+ :::
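Tip 2 above can be implemented with a one-shot systemd unit. This is a sketch assuming CGroup V1 paths and a hypothetical unit name doris-cgroup.service; binary paths may differ per distribution:

```
# Hypothetical /etc/systemd/system/doris-cgroup.service
[Unit]
Description=Recreate and authorize the Doris CGroup directory after boot

[Service]
Type=oneshot
ExecStart=/usr/bin/mkdir -p /sys/fs/cgroup/cpu/doris
ExecStart=/usr/bin/chmod 770 /sys/fs/cgroup/cpu/doris
ExecStart=/usr/bin/chown -R doris:doris /sys/fs/cgroup/cpu/doris

[Install]
WantedBy=multi-user.target
```

Enable it once with `systemctl daemon-reload && systemctl enable doris-cgroup.service`; systemd then re-runs the creation and authorization steps on every boot.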
-```
-CREATE WORKLOAD GROUP IF NOT EXISTS g1
-PROPERTIES(
- "cpu_share"="1024",
- "memory_limit"="30%"
-);
-```
+#### Considerations for Using Workload Group in Containers
+Workload Group's CPU management is based on CGroup. To use Workload Group
inside a container,
+the container must be started in privileged mode so that the BE process
inside it has permission to read and write CGroup files on the host
machine.
-### Modifying Workload Group Resource Rules
-You can view the created Workload Group information by accessing the Doris
system table information_schema.workload_groups.
-To delete a Workload Group, refer to
[DROP-WORKLOAD-GROUP](../../sql-manual/sql-statements/Data-Definition-Statements/Drop/DROP-WORKLOAD-GROUP);
-The ALTER-WORKLOAD-GROUP command can be used to adjust and modify the Workload
Group configuration,
refer[ALTER-WORKLOAD-GROUP](../../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-WORKLOAD-GROUP).
+When BE runs inside a container, the CPU resource usage for Workload Group is
partitioned based on the available resources of the container.
+For example, if the host machine has 64 cores and the container is allocated 8
cores,
+and the Workload Group is configured with a 50% CPU hard limit, the actual
available CPU cores for the Workload Group will be 4 (8 cores * 50%).
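The arithmetic in the example above, as a quick sketch:

```shell
# Effective CPU cores for a Workload Group inside a container:
# the hard limit applies to the container's allocation, not the host's 64 cores.
container_cores=8
cpu_hard_limit_pct=50
echo $(( container_cores * cpu_hard_limit_pct / 100 ))   # prints 4
```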
-#### Adding or Modifying Resource Items
-Modify the memory limit for the g1 Workload Group:
-```
-ALTER WORKLOAD GROUP g1 PROPERTIES('memory_limit'='10%');
-```
+The memory and IO management functions of Workload Group are implemented
internally by Doris and do not rely on external components,
+so there is no difference in deployment between containers and physical
machines.
+
+If you want to run Doris on K8S, it is recommended to deploy it with the
Doris Operator, which handles the underlying permission issues.
-You can view the modified memory limits through the
information_schema.workload_groups system table:
+### Create Workload Group
```
-SELECT name, memory_limit FROM information_schema.workload_groups;
-+--------+--------------+
-| name | memory_limit |
-+--------+--------------+
-| normal | 30% |
-| g1 | 10% |
-+--------+--------------+
+mysql [information_schema]>create workload group if not exists g1
+ -> properties (
+ -> "cpu_share"="1024"
+ -> );
+Query OK, 0 rows affected (0.03 sec)
+
```
+You can refer to
[CREATE-WORKLOAD-GROUP](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-WORKLOAD-GROUP).
-#### Configuring Soft and Hard Limits
-Using the Workload Group feature, you can set soft and hard limits for CPU and
memory resources, while for remote and local I/O, only hard limits are
available:
-- Soft Limit: The soft limit acts as a warning threshold for the resource.
Under normal operation, users will not exceed this limit. When other Workload
Groups have lower loads, resources from those groups can be borrowed, exceeding
the soft limit.
-- Hard Limit: The hard limit is the absolute upper bound for resource usage.
Regardless of whether other Workload Groups are underloaded, the hard limit
cannot be exceeded. Hard limits are typically used to prevent resource misuse
in the system.
+The CPU limit configured at this point is a soft limit. Since version 2.1,
Doris will automatically create a group named normal, which cannot be deleted.
-| | soft limit switch and params
| soft limit switch and params
| Description |
-|-----------|----------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------|-----|
-| CPU | Switch:FE Config - enable_cpu_hard_limit = false params:Property
- cpu_share | switch:FE Config - enable_cpu_hard_limit = true params:
property - cpu_hard_limit |Only soft or hard limits can be set for different
Workload Groups simultaneously |
-| Memory | Switch:property - enable_memory_overcommit = true
params:property - memory_limit | switch:property - enable_memory_overcommit =
false params: property - memory_limit | Soft and hard limits can be set
independently for different Workload Groups |
-| local IO | None
| params: read_bytes_per_second
| Only hard limits are currently available for
local IO |
-| remote IO | None
| params: remote_read_bytes_per_second
| Only hard limits are currently available for
remote IO |
+### Workload Group Properties
-### Binding Tenants to Workload Groups
-Non-ADMIN users must first check their permissions for a Workload Group. Use
the information_schema.workload_groups system table to verify permissions. Bind
tenants to Workload Groups using user properties or session variables. Session
variables take precedence over user properties.
-```
+| Property | Data type | Default value | Value range
| Description
[...]
+|------------------------------|----------------|---------------|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
+| cpu_share | Integer | -1 | [1, 10000]
| Optional, effective under CPU soft limit mode. The valid range of
values depends on the version of CGroup being used, which is described in
detail later. cpu_share represents the weight of CPU time that the Workload
Group can acquire; the larger the value, the more CPU time it can obtain. For
example, if the user creates three Workload Groups, g-a, g-b, and g-c, with
cpu_share values of 10, 30, a [...]
+| memory_limit | Floating-point | -1 | (0%, 100%]
| Optional. Enabling memory hard limit represents the maximum
available memory percentage for the current Workload Group. The default value
means no memory limit is applied. The cumulative value of memory_limit for all
Workload Groups cannot exceed 100%, and it is typically used in conjunction
with the enable_memory_overcommit attribute. For example, if a machine has 64GB
of memory and the memory_l [...]
+| enable_memory_overcommit | Boolean | true | true, false
| Optional. Used to control whether the memory limit for the
current Workload Group is a hard limit or a soft limit, with the default set to
true. If set to false, the Workload Group will have hard memory limit, and when
the system detects that the memory usage exceeds the limit, it will immediately
cancel the tasks with the highest memory usage within the group to release the
excess memory. If set [...]
+| cpu_hard_limit | Integer | -1 | [1%, 100%]
| Optional. Effective under CPU hard limit mode, it represents the
maximum CPU percentage a Workload Group can use. Regardless of whether the
machine's CPU resources are fully utilized, the Workload Group's CPU usage
cannot exceed the cpu_hard_limit. The cumulative value of cpu_hard_limit for
all Workload Groups cannot exceed 100%. This attribute was introduced in
version 2.1 and is not supported i [...]
+| max_concurrency | Integer | 2147483647 | [0,
2147483647] | Optional. Specifies the maximum query concurrency. The
default value is the maximum value of an integer, meaning no concurrency limit.
When the number of running queries reaches the maximum concurrency, new queries
will enter a queue.
[...]
+| max_queue_size | Integer | 0 | [0,
2147483647] | Optional. Specifies the length of the query waiting
queue. When the queue is full, new queries will be rejected. The default value
is 0, which means no queuing. If the queue is full, new queries will fail
directly.
[...]
+| queue_timeout | Integer | 0 | [0,
2147483647] | Optional. Specifies the maximum waiting time for a query
in the waiting queue, in milliseconds. If the query's waiting time in the queue
exceeds this value, an exception will be thrown directly to the client. The
default value is 0, meaning no queuing; queries will immediately fail upon
entering the queue.
[...]
+| scan_thread_num | Integer | -1 | [1,
2147483647] | Optional. Specifies the number of threads used for
scanning in the current Workload Group. When this property is set to -1, it
means it is not active, and the actual scan thread num on the BE will default
to the doris_scanner_thread_pool_thread_num configuration in the BE.
[...]
+| max_remote_scan_thread_num | Integer | -1 | [1,
2147483647] | Optional. Specifies the maximum number of threads in the
scan thread pool for reading external data sources. When this property is set
to -1, the actual number of threads is determined by the BE, typically based on
the number of CPU cores.
[...]
+| min_remote_scan_thread_num | Integer | -1 | [1,
2147483647] | Optional. Specifies the minimum number of threads in the
scan thread pool for reading external data sources. When this property is set
to -1, the actual number of threads is determined by the BE, typically based on
the number of CPU cores.
[...]
+| tag | String | empty | -
| Specifies tags for the Workload Group. The cumulative resource
values of Workload Groups with the same tag cannot exceed 100%. To specify
multiple values, use commas to separate them.
[...]
+| read_bytes_per_second | Integer | -1 | [1,
9223372036854775807] | Optional. Specifies the maximum I/O throughput when
reading internal tables in Doris. The default value is -1, meaning no I/O
bandwidth limit is applied. It is important to note that this value is not tied
to individual disks but to directories. For example, if Doris is configured
with two directories to store internal table data, the maximum read I/O for
each directory will not exceed this value [...]
+| remote_read_bytes_per_second | Integer | -1 | [1,
9223372036854775807] | Optional. Specifies the maximum I/O throughput when
reading external tables in Doris. The default value is -1, meaning no I/O
bandwidth limit is applied.
[...]
+
+:::tip
+
+1. Currently, the simultaneous use of both cpu hard limit and cpu soft limit
is not supported.
+At any given time, a cluster can only have either a soft limit or a hard
limit. The method for switching between them will be described later.
+
+2. All properties are optional, but at least one property must be specified
when creating a Workload Group.
+
+3. It is important to note that the default values for CPU soft limits differ
between CGroup v1 and CGroup v2. The default CPU soft limit for CGroup v1 is
1024, with a valid range from 2 to 262144, while the default for CGroup v2 is
100, with a valid range from 1 to 10000.
+ If a value outside the range is set for the soft limit, it may cause the
CPU soft limit modification to fail in BE. If the default value of 100 from
CGroup v2 is applied in a CGroup v1 environment, it could result in this
Workload Group having the lowest priority on the machine.
+ :::
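As a combined sketch of several properties from the table above (the group name and values are illustrative, not taken from the document):

```sql
-- A group with a hard memory limit (enable_memory_overcommit = false)
-- plus a bounded query queue.
CREATE WORKLOAD GROUP IF NOT EXISTS g2
PROPERTIES (
    "memory_limit" = "20%",
    "enable_memory_overcommit" = "false",
    "max_concurrency" = "10",
    "max_queue_size" = "20",
    "queue_timeout" = "3000"
);
```

With this configuration, the group may use at most 20% of BE memory, at most 10 queries run concurrently, up to 20 more wait in the queue, and a queued query fails after 3000 ms.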
+
+## Set Workload Group for User
+Before binding a user to a specific Workload Group, make sure the user has
the necessary permissions for that Workload Group.
+The user can query the information_schema.workload_groups system
table; the result shows the Workload Groups that the current user has
permission to access.
+The following query result indicates that the current user has access to the
g1 and normal Workload Groups:
+
+```sql
SELECT name FROM information_schema.workload_groups;
+--------+
| name |
@@ -164,49 +198,504 @@ SELECT name FROM information_schema.workload_groups;
+--------+
```
-If you cannot see the g1 Workload Group, you can use the GRANT statement to
grant permissions to the user.
-When binding a Workload Group to a tenant, you can do so either by setting a
user property or specifying a session variable. When both methods are used, the
session variable takes priority over the user property:
+If the g1 Workload Group is not visible, you can use the ADMIN account to
execute the GRANT statement to authorize the user. For example:
+```
+GRANT USAGE_PRIV ON WORKLOAD GROUP 'g1' TO 'user_1'@'%';
+```
+This statement grants user_1 permission to use the Workload Group named g1.
+More details can be found in
[grant](../../sql-manual/sql-statements/Account-Management-Statements/GRANT).
+
+**Two ways to bind Workload Group to user**
+1. By setting the user property, you can bind the user to a default Workload
Group. The default is normal. It's important to note that the value here cannot
be left empty; otherwise, the statement will fail.
+```
+set property 'default_workload_group' = 'g1';
+```
+After executing this statement, the current user's queries will default to
using the 'g1' Workload Group.
+
-- Binding Workload Group using user property: Typically, administrators use
the SET-PROPERTY command to bind the default Workload Group for a tenant. In
the following example, the default Workload Group g1 is bound to the test_wlg
tenant。
+2. By specifying the Workload Group through a session variable (empty by
default):
```
-set property for 'test_wlg' 'default_workload_group' = 'g1';
+set workload_group = 'g1';
```
+When both methods are used to specify a Workload Group for the user, the
session variable takes priority over the user property.
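The precedence rule can be sketched as follows, assuming the current user already has USAGE_PRIV on both groups:

```sql
-- The user property sets the default group.
SET PROPERTY 'default_workload_group' = 'normal';
-- The session variable overrides it for the current session only.
SET workload_group = 'g1';
-- Queries issued in this session now run in g1; new sessions
-- of this user fall back to normal.
```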
-- Using Session Variables: During development, even if an administrator has
set a default Workload Group, it can be overridden in the session using the
workload_group variable. In the following example, the Workload Group for the
current session is set to g1:
+## Show Workload Group
+1. You can use the SHOW statement to view the Workload Group:
```
-SET workload_group = 'g1';
+show workload groups;
```
+More details can be found in
[SHOW-WORKLOAD-GROUPS](../../sql-manual/sql-statements/Show-Statements/SHOW-WORKLOAD-GROUPS)
-## Grouping Workload Groups
-In a multi-workload or multi-tenant environment, a Doris cluster may be split
into multiple sub-clusters, such as some nodes used for federated queries from
external storage and some nodes used for real-time queries on internal tables.
Workload Groups can tag BE nodes, and BE nodes with the same tag form a
sub-cluster. The resources of each sub-cluster are calculated independently,
and the total resource usage within each sub-cluster cannot exceed 100%. In the
following example, seven ma [...]
-In a multi-workload or multi-tenant environment, a Doris cluster may be split
into multiple sub-clusters, such as some nodes used for federated queries from
external storage and some nodes used for fact queries on internal tables. The
two sub-clusters are completely isolated in terms of data distribution and
resource usage. Within the same sub-cluster, multiple tenants need to be
created along with isolation rules for resource usage between tenants. For
complex resource isolation require [...]
+2. You can view the Workload Group through the system table:
+```
+mysql [information_schema]>select * from information_schema.workload_groups
where name='g1';
++-------+------+-----------+--------------+--------------------------+-----------------+----------------+---------------+----------------+-----------------+----------------------------+----------------------------+----------------------+-----------------------+------+-----------------------+------------------------------+
+| ID    | NAME | CPU_SHARE | MEMORY_LIMIT | ENABLE_MEMORY_OVERCOMMIT | MAX_CONCURRENCY | MAX_QUEUE_SIZE | QUEUE_TIMEOUT | CPU_HARD_LIMIT | SCAN_THREAD_NUM | MAX_REMOTE_SCAN_THREAD_NUM | MIN_REMOTE_SCAN_THREAD_NUM | MEMORY_LOW_WATERMARK | MEMORY_HIGH_WATERMARK | TAG  | READ_BYTES_PER_SECOND | REMOTE_READ_BYTES_PER_SECOND |
++-------+------+-----------+--------------+--------------------------+-----------------+----------------+---------------+----------------+-----------------+----------------------------+----------------------------+----------------------+-----------------------+------+-----------------------+------------------------------+
+| 14009 | g1   | 1024      | -1           | true                     | 2147483647      | 0              | 0             | -1             | -1              | -1                         | -1                         | 50%                  | 80%                   |      | -1                    | -1                           |
++-------+------+-----------+--------------+--------------------------+-----------------+----------------+---------------+----------------+-----------------+----------------------------+----------------------------+----------------------+-----------------------+------+-----------------------+------------------------------+
+1 row in set (0.05 sec)
+```
+
+## Alter Workload Group
+```
+mysql [information_schema]>alter workload group g1
properties('cpu_share'='2048');
+Query OK, 0 rows affected (0.00 sec)
-
+mysql [information_schema]>select cpu_share from
information_schema.workload_groups where name='g1';
++-----------+
+| cpu_share |
++-----------+
+| 2048 |
++-----------+
+1 row in set (0.02 sec)
-1. Create sub_cluster_a and sub_cluster_b Resource Groups, dividing seven
machines into two sub-clusters:
```
--- create resource group sub_cluster_a
-ALTER SYSTEM MODIFY BACKEND "192.168.88.31:9050" SET("tag.location" =
"sub_cluster_a");
-ALTER SYSTEM MODIFY BACKEND "192.168.88.32:9050" SET("tag.location" =
"sub_cluster_a");
-ALTER SYSTEM MODIFY BACKEND "192.168.88.33:9050" SET("tag.location" =
"sub_cluster_a");
--- create resource group sub_cluster_b
-ALTER SYSTEM MODIFY BACKEND "192.168.88.34:9050" SET("tag.location" =
"sub_cluster_b");
-ALTER SYSTEM MODIFY BACKEND "192.168.88.35:9050" SET("tag.location" =
"sub_cluster_b");
+More details can be found in
[ALTER-WORKLOAD-GROUP](../../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-WORKLOAD-GROUP)
+
+## Drop Workload Group
+```
+mysql [information_schema]>drop workload group g1;
+Query OK, 0 rows affected (0.01 sec)
```
-2. Create Workload Groups for memory resource isolation within sub-clusters:
+More details can be found in
[DROP-WORKLOAD-GROUP](../../sql-manual/sql-statements/Data-Definition-Statements/Drop/DROP-WORKLOAD-GROUP)
+
+## Explanation of Switching Between CPU Soft and Hard Limit Modes
+Currently, Doris does not support running both CPU soft and hard limits
simultaneously. At any given time, a Doris cluster can only operate in either
CPU soft limit or CPU hard limit mode.
+Users can switch between these two modes, and the switching method is as
follows:
+
+1. If the cluster currently uses the default CPU soft limit and you wish to
switch to the CPU hard limit, you need to modify the cpu_hard_limit parameter
of the Workload Group to a valid value.
```
--- create workload groups for sub cluster A
-CREATE WORKLOAD GROUP a_wlg_1 PROPERTIES('tag' = "sub_cluster_a",
"memory_limit" = "30");
-CREATE WORKLOAD GROUP a_wlg_2 PROPERTIES('tag' = "sub_cluster_a",
"memory_limit" = "30");
-CREATE WORKLOAD GROUP a_wlg_3 PROPERTIES('tag' = "sub_cluster_a",
"memory_limit" = "30");
+alter workload group test_group properties ( 'cpu_hard_limit'='20%' );
+```
+All Workload Groups in the cluster need to be modified, and the cumulative
value of cpu_hard_limit for all Workload Groups cannot exceed 100%.
+
+Because cpu_hard_limit does not automatically get a valid value, enabling
the switch alone without modifying this property will not bring the CPU hard
limit into effect.
--- create workload groups for sub cluster B
-CREATE WORKLOAD GROUP b_wlg_1 PROPERTIES('tag' = "sub_cluster_b",
"memory_limit" = "30");
-CREATE WORKLOAD GROUP b_wlg_2 PROPERTIES('tag' = "sub_cluster_b",
"memory_limit" = "30");
+2. Enable the CPU hard limit on all FE nodes:
```
+# 1. Modify the configuration in the fe.conf file on disk.
+experimental_enable_cpu_hard_limit = true
+
+# 2. Modify the configuration in memory.
+ADMIN SET FRONTEND CONFIG ("enable_cpu_hard_limit" = "true");
+```
+
+If the user wishes to switch from CPU hard limit back to CPU soft limit, they
need to set the value of enable_cpu_hard_limit to false on all FE nodes.
+The CPU soft limit property cpu_share will default to a valid value of 1024
(if it was not previously specified). Users can adjust the cpu_share value
based on the priority of the group.
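The switch back to soft limit mode described above can be sketched as follows (the group name is illustrative; persist the corresponding fe.conf setting as well):

```sql
-- Disable the CPU hard limit in memory on the FE
-- (also persist experimental_enable_cpu_hard_limit = false in fe.conf).
ADMIN SET FRONTEND CONFIG ("enable_cpu_hard_limit" = "false");
-- Optionally re-tune each group's soft-limit weight.
ALTER WORKLOAD GROUP test_group PROPERTIES ('cpu_share' = '1024');
```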
+
+## Testing
+### Memory hard limit
+Adhoc-type queries typically have unpredictable SQL inputs and uncertain
memory usage, which poses the risk of a few queries consuming a large amount of
memory.
+These types of workloads can be allocated to a separate group, and by using
the Workload Group's memory hard limit feature, it helps prevent sudden large
queries from consuming all memory, which could cause other queries to run out
of available memory or result in OOM (Out of Memory) errors.
+When the memory usage of this Workload Group exceeds the configured hard
limit, the system will kill queries to release memory, preventing the process
from running out of memory.
+
+**Testing environment**
+
+1 FE and 1 BE; the BE machine has 96 cores and 375GB of memory.
+
+The test dataset is ClickBench, and the testing method uses JMeter to run
query Q29 with three concurrent executions.
+
+**Test without enabling memory hard limit for Workload Group**
+
+1. Check the memory usage of the process. The fourth column in the ps command
output represents the physical memory usage of the process, in kilobytes (KB).
It shows that under the current test load, the process uses approximately 7.7GB
of memory.
+
+ ```sql
+ [ ~]$ ps -eo pid,comm,%mem,rss | grep 1407481
+ 1407481 doris_be 2.0 7896792
+ [ ~]$ ps -eo pid,comm,%mem,rss | grep 1407481
+ 1407481 doris_be 2.0 7929692
+ [ ~]$ ps -eo pid,comm,%mem,rss | grep 1407481
+ 1407481 doris_be 2.0 8101232
+ ```
+
+2. Use Doris system tables to check the current memory usage of the Workload
Group. The memory usage of the Workload Group is approximately 5.8GB.
+
+ ```sql
+ mysql [information_schema]>select MEMORY_USAGE_BYTES / 1024/ 1024 as
wg_mem_used_mb from workload_group_resource_usage where workload_group_id=11201;
+ +-------------------+
+ | wg_mem_used_mb |
+ +-------------------+
+ | 5797.524360656738 |
+ +-------------------+
+ 1 row in set (0.01 sec)
+
+ mysql [information_schema]>select MEMORY_USAGE_BYTES / 1024/ 1024 as
wg_mem_used_mb from workload_group_resource_usage where workload_group_id=11201;
+ +-------------------+
+ | wg_mem_used_mb |
+ +-------------------+
+ | 5840.246627807617 |
+ +-------------------+
+ 1 row in set (0.02 sec)
+
+ mysql [information_schema]>select MEMORY_USAGE_BYTES / 1024/ 1024 as
wg_mem_used_mb from workload_group_resource_usage where workload_group_id=11201;
+ +-------------------+
+ | wg_mem_used_mb |
+ +-------------------+
+ | 5878.394917488098 |
+ +-------------------+
+ 1 row in set (0.02 sec)
+ ```
+
+Here, we can see that the process memory usage is typically much larger than
the memory usage of a Workload Group, even if only one Workload Group is
running. This is because the Workload Group only tracks the memory used by
queries and loads. The memory used by other components within the process, such
as metadata and various caches, is not counted toward the Workload Group's
memory usage, nor is it managed by the Workload Group.
+
+**Test with the memory hard limit for Workload Group enabled**
+1. Execute the SQL command to modify the memory configuration.
+
+ ```sql
+ alter workload group g2 properties('memory_limit'='0.5%');
+ alter workload group g2 properties('enable_memory_overcommit'='false');
+ ```
+
+2. Run the same test and check the memory usage in the system table; the
memory usage is around 1.5G.
+
+ ```sql
+ mysql [information_schema]>select MEMORY_USAGE_BYTES / 1024/ 1024 as
wg_mem_used_mb from workload_group_resource_usage where workload_group_id=11201;
+ +--------------------+
+ | wg_mem_used_mb |
+ +--------------------+
+ | 1575.3877239227295 |
+ +--------------------+
+ 1 row in set (0.02 sec)
+
+ mysql [information_schema]>select MEMORY_USAGE_BYTES / 1024/ 1024 as
wg_mem_used_mb from workload_group_resource_usage where workload_group_id=11201;
+ +------------------+
+ | wg_mem_used_mb |
+ +------------------+
+ | 1668.77405834198 |
+ +------------------+
+ 1 row in set (0.01 sec)
+
+ mysql [information_schema]>select MEMORY_USAGE_BYTES / 1024/ 1024 as
wg_mem_used_mb from workload_group_resource_usage where workload_group_id=11201;
+ +--------------------+
+ | wg_mem_used_mb |
+ +--------------------+
+ | 499.96760272979736 |
+ +--------------------+
+ 1 row in set (0.01 sec)
+ ```
+
+3. Use the ps command to check the memory usage of the process; the memory
usage is around 3.8G.
+
+ ```sql
+ [ ~]$ ps -eo pid,comm,%mem,rss | grep 1407481
+ 1407481 doris_be 1.0 4071364
+ [ ~]$ ps -eo pid,comm,%mem,rss | grep 1407481
+ 1407481 doris_be 1.0 4059012
+ [ ~]$ ps -eo pid,comm,%mem,rss | grep 1407481
+ 1407481 doris_be 1.0 4057068
+ ```
+
+4. At the same time, the client will observe a significant number of query
failures caused by insufficient memory.
+
+ ```sql
+ 1724074250162,14126,1c_sql,HY000 1105,"java.sql.SQLException: errCode = 2,
detailMessage = (127.0.0.1)[MEM_LIMIT_EXCEEDED]GC wg for hard limit, wg
id:11201, name:g2, used:1.71 GB, limit:1.69 GB, backend:10.16.10.8. cancel top
memory used tracker <Query#Id=4a0689936c444ac8-a0d01a50b944f6e7> consumption
1.71 GB. details:process memory used 3.01 GB exceed soft limit 304.41 GB or sys
available memory 101.16 GB less than warning water mark 12.80 GB., Execute
again after enough memory, det [...]
+ ```
+
+From the error message, we can see that the Workload Group used 1.71 GB of
memory while its limit is 1.69 GB. The limit is calculated as follows:
1.69 GB = physical machine memory (375 GB) * mem_limit (from be.conf, default
0.9) * 0.5% (the Workload Group's memory_limit).
+This means the memory percentage configured on a Workload Group is calculated
against the memory available to the BE process, not the whole machine.
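
A quick sanity check of that formula (a sketch using the numbers from this test environment):

```python
physical_mem_gb = 375     # machine memory in the test environment
be_mem_limit = 0.9        # default mem_limit in be.conf
wg_memory_limit = 0.005   # memory_limit of 0.5% set on the Workload Group

# Effective memory hard limit for the Workload Group
wg_limit_gb = physical_mem_gb * be_mem_limit * wg_memory_limit
print(f"{wg_limit_gb:.2f} GB")  # ~1.69 GB, matching the limit in the error message
```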
+
+**Recommendations**
+
+As demonstrated in the tests above, memory hard limits can control the memory
usage of a Workload Group but do so by terminating queries to release memory.
This approach can lead to a poor user experience and, in extreme cases, may
cause all queries to fail.
+
+Therefore, in production environments, it is recommended to use memory hard
limits in conjunction with query queuing functionality. This ensures controlled
memory usage while maintaining query success rates.
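
As an illustration, queuing can be layered on top of the memory hard limit through the Workload Group's concurrency properties (the values below are illustrative, not recommendations):

```sql
-- Allow at most 10 queries to run in the group at once; queue up to 20 more,
-- each waiting at most 30 seconds (30000 ms) before failing.
alter workload group g2 properties('max_concurrency'='10');
alter workload group g2 properties('max_queue_size'='20');
alter workload group g2 properties('queue_timeout'='30000');
```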
+
+
+
+### CPU hard limit
+Doris workloads can generally be categorized into three types:
+1. Core report queries: typically run by company executives to view reports.
The load may not be very high, but availability requirements are strict. These
queries can be assigned to a group with a higher-priority soft limit, ensuring
they receive more CPU time when resources are insufficient.
+2. Adhoc queries: typically exploratory and analytical, with random SQL and
unpredictable resource consumption, and usually low priority. A CPU hard limit
set to a low value keeps them from consuming enough CPU to reduce cluster
availability.
+3. ETL queries: typically fixed SQL with stable resource consumption, though
increased upstream data can occasionally cause spikes in resource usage. A CPU
hard limit is also a good fit here.
+
+Different workloads have varying CPU consumption, and users have different
latency requirements. When the BE CPU is fully utilized, availability
decreases, and response times increase. For example, an Adhoc analysis query
may fully utilize the CPU of the entire cluster, causing core report queries to
experience higher latency, which impacts SLA. Therefore, a CPU isolation
mechanism is needed to separate different workloads and ensure cluster
availability and SLA.
+
+Workload Group supports both CPU soft limits and hard limits. It is currently
recommended to configure Workload Groups with hard limits in production
environments. This is because CPU soft limits typically only show priority
effects when the CPU is fully utilized. However, when the CPU is fully used,
internal Doris components (such as the RPC component) and the operating
system’s available CPU are reduced, leading to a significant drop in overall
cluster availability. Therefore, in produ [...]
+
+**Test environment**
+
+1 FE, 1 BE, 96-core machine.
+The dataset is clickbench, and the test SQL is q29.
+
+**Testing**
+1. Use JMeter to launch 3 concurrent queries, pushing the CPU usage of the BE
process to a relatively high level. The test machine has 96 cores, and the top
command shows the BE process's CPU usage at 7600%, meaning the process is
currently using 76 cores.
+
+ 
+
+2. Modify the CPU hard limit of the currently used Workload Group to 10%.
+
+ ```sql
+ alter workload group g2 properties('cpu_hard_limit'='10%');
+ ```
+
+3. Switch to CPU hard limit mode.
+
+ ```sql
+ ADMIN SET FRONTEND CONFIG ("enable_cpu_hard_limit" = "true");
+ ```
+
+4. Re-run the load test for queries, and you can see that the current process
can only use 9 to 10 cores, which is about 10% of the total cores.
+
+ 
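
The observed core count matches a back-of-the-envelope calculation (a sketch using this test's numbers):

```python
total_cores = 96       # cores on the test machine
cpu_hard_limit = 0.10  # cpu_hard_limit of 10% set on the Workload Group

# Maximum cores the group may use under the hard limit
max_cores = total_cores * cpu_hard_limit
print(f"{max_cores:.1f}")  # 9.6, so top shows the process using about 9-10 cores
```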
+
+It is important to note that this test is best conducted using query
workloads, as they are more likely to reflect the effect. If testing load, it
may trigger Compaction, causing the actual observed values to be higher than
the values configured in the Workload Group. Currently, Compaction workloads
are not managed under the Workload Group.
+
+5. In addition to using Linux system commands, you can also observe the
current CPU usage of the group through Doris's system tables, where the CPU
usage is around 10%.
+
+ ```sql
+ mysql [information_schema]>select CPU_USAGE_PERCENT from
workload_group_resource_usage where WORKLOAD_GROUP_ID=11201;
+ +-------------------+
+ | CPU_USAGE_PERCENT |
+ +-------------------+
+ | 9.57 |
+ +-------------------+
+ 1 row in set (0.02 sec)
+ ```
+
+**Note**
+
+1. When configuring, it's better not to set the total CPU allocation of all
groups to exactly 100%. This is mainly to ensure the availability of
low-latency scenarios, as some resources need to be reserved for other
components. However, for scenarios that are not very sensitive to latency and
aim for maximum resource utilization, setting the total CPU allocation of all
groups to 100% can be considered.
+2. Currently, the interval for synchronizing Workload Group metadata from FE
to BE is 30 seconds. Therefore, changes to Workload Group settings may take up
to 30 seconds to take effect.
+
+
+### Limit local IO
+In OLAP systems, during ETL operations or large Adhoc queries, a significant
amount of data needs to be read. To speed up the data analysis process, Doris
uses multi-threaded parallel scanning across multiple disk files, which
generates substantial disk IO that can impact other queries (such as report
analysis).
+By using Workload Groups, Doris can group offline ETL data processing and
online report queries separately, limiting the offline data processing IO
bandwidth. This helps reduce the impact of offline data processing on online
report analysis.
+
+**Test environment**
+
+1 FE, 1 BE, 96-core machine. Dataset: clickbench. Test query: q29.
+
+**Testing without enabling IO hard limits**
+1. Clear Cache.
+
+   ```shell
+   # clear OS cache
+   sync; echo 3 > /proc/sys/vm/drop_caches
+
+   # disable the BE page cache in be.conf
+   disable_storage_page_cache = true
+   ```
+
+2. Perform a full table scan on the clickbench table, and execute a single
concurrent query.
+
+ ```sql
+ set dry_run_query = true;
+ select * from hits.hits;
+ ```
+
+3. Use Doris's system table to check the scan throughput of the current
Group; it peaks at roughly 3.4 GB per second.
+
+ ```sql
+ mysql [information_schema]>select LOCAL_SCAN_BYTES_PER_SECOND / 1024 /
1024 as mb_per_sec from workload_group_resource_usage where
WORKLOAD_GROUP_ID=11201;
+ +--------------------+
+ | mb_per_sec |
+ +--------------------+
+ | 1146.6208400726318 |
+ +--------------------+
+ 1 row in set (0.03 sec)
+
+ mysql [information_schema]>select LOCAL_SCAN_BYTES_PER_SECOND / 1024 /
1024 as mb_per_sec from workload_group_resource_usage where
WORKLOAD_GROUP_ID=11201;
+ +--------------------+
+ | mb_per_sec |
+ +--------------------+
+ | 3496.2762966156006 |
+ +--------------------+
+ 1 row in set (0.04 sec)
+
+ mysql [information_schema]>select LOCAL_SCAN_BYTES_PER_SECOND / 1024 /
1024 as mb_per_sec from workload_group_resource_usage where
WORKLOAD_GROUP_ID=11201;
+ +--------------------+
+ | mb_per_sec |
+ +--------------------+
+ | 2192.7690029144287 |
+ +--------------------+
+ 1 row in set (0.02 sec)
+ ```
+
+4. Use the pidstat command to check the process IO. The first column is the
process ID, and the second column is the read IO throughput (in kb/s). It can
be seen that when IO is not restricted, the maximum throughput is 2GB per
second.
+
+ 
+
+
+**Test after enabling IO hard limit**
+1. Clear cache.
+
+   ```shell
+   # clear OS cache
+   sync; echo 3 > /proc/sys/vm/drop_caches
+
+   # disable the BE page cache in be.conf
+   disable_storage_page_cache = true
+   ```
+
+2. Modify the Workload Group configuration to limit the maximum throughput to
100M per second.
+
+ ```sql
+ alter workload group g2 properties('read_bytes_per_second'='104857600');
+ ```
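
read_bytes_per_second is specified in bytes per second; a quick check that the value above equals 100 MB per second:

```python
# Convert 100 MB/s into the bytes-per-second value used in the ALTER statement
limit_bytes = 100 * 1024 * 1024
print(limit_bytes)  # 104857600
```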
+
+3. Use Doris system tables to check that the maximum IO throughput of the
Workload Group is 98M per second.
+
+ ```sql
+ mysql [information_schema]>select LOCAL_SCAN_BYTES_PER_SECOND / 1024 /
1024 as mb_per_sec from workload_group_resource_usage where
WORKLOAD_GROUP_ID=11201;
+ +--------------------+
+ | mb_per_sec |
+ +--------------------+
+ | 97.94296646118164 |
+ +--------------------+
+ 1 row in set (0.03 sec)
+
+ mysql [information_schema]>select LOCAL_SCAN_BYTES_PER_SECOND / 1024 /
1024 as mb_per_sec from workload_group_resource_usage where
WORKLOAD_GROUP_ID=11201;
+ +--------------------+
+ | mb_per_sec |
+ +--------------------+
+ | 98.37584781646729 |
+ +--------------------+
+ 1 row in set (0.04 sec)
-## NOTE
-1. Using Workload Resource Control in Kubernetes: Workload's CPU management
relies on CGroup. If using Workload Groups in containers, start the container
in privileged mode to allow the Doris process to read and write the host's
CGroup files. When Doris runs in a container, the CPU resources allocated to
the Workload Group are based on the container's available resources.
-2. Memory and IO Management: Workload Group's memory and IO management are
implemented internally by Doris and do not depend on external components, so
there is no difference in deployment on containers or physical machines. For
Doris deployment on K8S, using the Doris Operator is recommended to abstract
away underlying permission details.
\ No newline at end of file
+ mysql [information_schema]>select LOCAL_SCAN_BYTES_PER_SECOND / 1024 /
1024 as mb_per_sec from workload_group_resource_usage where
WORKLOAD_GROUP_ID=11201;
+ +--------------------+
+ | mb_per_sec |
+ +--------------------+
+ | 98.06641292572021 |
+ +--------------------+
+ 1 row in set (0.02 sec)
+ ```
+
+4. Use the pidstat command to check that the maximum IO throughput of the
process is 131M per second.
+
+ 
+
+**Note**
+1. The LOCAL_SCAN_BYTES_PER_SECOND field in the system table represents the
summary value of the current Workload Group's statistics at the process level.
For example, if 12 file paths are configured, LOCAL_SCAN_BYTES_PER_SECOND is
the maximum IO value of these 12 file paths. If you wish to view the IO
throughput for each file path separately, you can check the detailed values in
Grafana.
+
+2. Due to the presence of the operating system and Doris's Page Cache, the IO
observed through Linux's IO monitoring scripts is typically smaller than the IO
seen in the system table.
+
+
+### Limit remote IO
+BrokerLoad and S3Load are commonly used methods for large-scale data loading.
Users can first upload data to HDFS or S3, and then use BrokerLoad or S3Load
to pull the data in parallel. To speed up loading, Doris uses multiple threads
to pull data from HDFS/S3, which can put significant pressure on HDFS/S3 and
potentially destabilize other jobs running on it.
+
+To mitigate the impact on other workloads, the Workload Group's remote IO
limit feature can be used to restrict the bandwidth used during the load
process from HDFS/S3. This helps reduce the impact on other business operations.
+
+
+**Test environment**
+
+1 FE and 1 BE are deployed on the same machine, configured with 16 cores and
64GB of memory. The test data is the clickbench dataset, and before testing, we
need to upload the dataset to S3. Considering the upload time, we will only
upload 10 million rows of data, and then use the TVF function to query the data
from S3.
+
+After the upload succeeds, you can use the following command to view the
schema information.
+
+ ```sql
+ DESC FUNCTION s3 (
+ "URI" = "https://bucketname/1kw.tsv",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="true"
+ );
+ ```
+
+**Test without restricting remote read IO**
+1. Initiate a single-threaded test to perform a full table scan on the
clickbench table.
+
+ ```sql
+ // Set the operation to only scan the data without returning results.
+ set dry_run_query = true;
+
+ SELECT * FROM s3(
+ "URI" = "https://bucketname/1kw.tsv",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="true"
+ );
+ ```
+
+2. Use the system table to check the current remote IO throughput. It shows
that the remote IO throughput for this query is 837 MB per second. Note that
the actual IO throughput here is highly dependent on the environment. If the
machine hosting the BE has limited bandwidth to the external storage, the
actual throughput may be lower.
+
+ ```sql
+ MySQL [(none)]> select cast(REMOTE_SCAN_BYTES_PER_SECOND/1024/1024 as int)
as read_mb from information_schema.workload_group_resource_usage;
+ +---------+
+ | read_mb |
+ +---------+
+ | 837 |
+ +---------+
+ 1 row in set (0.104 sec)
+
+ MySQL [(none)]> select cast(REMOTE_SCAN_BYTES_PER_SECOND/1024/1024 as int)
as read_mb from information_schema.workload_group_resource_usage;
+ +---------+
+ | read_mb |
+ +---------+
+ | 867 |
+ +---------+
+ 1 row in set (0.070 sec)
+
+ MySQL [(none)]> select cast(REMOTE_SCAN_BYTES_PER_SECOND/1024/1024 as int)
as read_mb from information_schema.workload_group_resource_usage;
+ +---------+
+ | read_mb |
+ +---------+
+ | 867 |
+ +---------+
+ 1 row in set (0.186 sec)
+ ```
+
+3. Use the sar command (sar -n DEV 1 3600) to monitor the machine's network
bandwidth. It shows a machine-level peak network bandwidth of 1033 MB per
second.
+   The first column of the output is the amount of data received per second
by a specific network interface on the machine, in KB.
+
+ 
+
+**Test limiting remote read IO**
+1. Modify the Workload Group configuration to limit remote read IO throughput
to 100M per second.
+
+ ```sql
+ alter workload group normal
properties('remote_read_bytes_per_second'='104857600');
+ ```
+
+2. Initiate a single concurrent full table scan query.
+
+ ```sql
+ set dry_run_query = true;
+
+ SELECT * FROM s3(
+ "URI" = "https://bucketname/1kw.tsv",
+ "s3.access_key"= "ak",
+ "s3.secret_key" = "sk",
+ "format" = "csv",
+ "use_path_style"="true"
+ );
+ ```
+
+3. Use the system table to check the current remote read IO throughput. At
this time, the IO throughput is around 100M, with some fluctuations. These
fluctuations are influenced by the current algorithm design, typically peaking
briefly without persisting for long periods, which is considered normal.
+
+ ```sql
+ MySQL [(none)]> select cast(REMOTE_SCAN_BYTES_PER_SECOND/1024/1024 as int)
as read_mb from information_schema.workload_group_resource_usage;
+ +---------+
+ | read_mb |
+ +---------+
+ | 56 |
+ +---------+
+ 1 row in set (0.010 sec)
+
+ MySQL [(none)]> select cast(REMOTE_SCAN_BYTES_PER_SECOND/1024/1024 as int)
as read_mb from information_schema.workload_group_resource_usage;
+ +---------+
+ | read_mb |
+ +---------+
+ | 131 |
+ +---------+
+ 1 row in set (0.009 sec)
+
+ MySQL [(none)]> select cast(REMOTE_SCAN_BYTES_PER_SECOND/1024/1024 as int)
as read_mb from information_schema.workload_group_resource_usage;
+ +---------+
+ | read_mb |
+ +---------+
+ | 111 |
+ +---------+
+ 1 row in set (0.009 sec)
+ ```
+
+4. Use the sar command (sar -n DEV 1 3600) to monitor the current network
card's received traffic; the first column is the amount of data received per
second. The maximum value observed is now 207M per second, indicating that the
read IO limit is taking effect. Since the sar command reflects machine-level
traffic, the observed value is higher than what Doris reports.
+
+ 
\ No newline at end of file
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/workload-management/workload-group.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/workload-management/workload-group.md
index 8164181c886..a73792a0257 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/workload-management/workload-group.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/workload-management/workload-group.md
@@ -146,7 +146,7 @@ Query OK, 0 rows affected (0.03 sec)
| 属性名称 | 数据类型 | 默认值 | 取值范围 | 说明
|
|------------------------------|---------|-----|-----|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| cpu_share | 整型 | -1 | [1, 10000] |
可选,CPU软限模式下生效,取值范围和使用的CGroup版本有关,下文有详细描述。cpu_share 代表了 Workload Group
可获得CPU时间的权重,值越大,可获得的CPU时间越多。例如,用户创建了 3 个 Workload Group g-a、g-b 和 g-c,cpu_share
分别为 10、30、40,某一时刻 g-a 和 g-b 正在跑任务,而 g-c 没有任务,此时 g-a 可获得 25% (10 / (10 + 30)) 的
CPU 资源,而 g-b 可获得 75% 的 CPU 资源。如果系统只有一个 Workload Group 正在运行,则不管其 cpu_share
的值为多少,它都可获取全部的 CPU 资源 。 |
-| memory_limit | 浮点 | -1 | (0%, 100%] |
可选,开启内存硬限时代表当前 Workload Group 最大可用内存百分比,默认值代表不限制内存。所有 Workload Group 的
memory_limit 累加值不可以超过 100%,通常与 enable_memory_overcommit 属性配合使用。如果一个机器的内存为
64G,Workload Group 的 memory_limit配置为50%,那么该 group 的实际物理内存=64G * 90% * 50%=
28.8G,这里的90%是 BE 进程可用内存配置的默认值。一个集群中所有 Workload Group 的 memory_limit 的累加值不能超过
100%。 |
+| memory_limit | 浮点 | -1 | (0%, 100%] |
可选,开启内存硬限时代表当前 Workload Group 最大可用内存百分比,默认值代表不限制内存。所有 Workload Group 的
memory_limit 累加值不可以超过 100%,通常与 enable_memory_overcommit 属性配合使用。如果一个机器的内存为
64G,Workload Group 的 memory_limit配置为50%,那么该 group 的实际物理内存=64G * 90% * 50%=
28.8G,这里的90%是 BE 进程可用内存配置的默认值。
|
| enable_memory_overcommit | 布尔 | true | true, false |
可选,用于控制当前 Workload Group 的内存限制是硬限还是软限,默认为 true。如果设置为 false,则该 workload group
为内存硬隔离,系统检测到 workload group 内存使用超出限制后将立即 cancel 组内内存占用最大的若干个任务,以释放超出的内存;如果设置为
true,则该 Workload Group 为内存软隔离,如果系统有空闲内存资源则该 Workload Group 在超出 memory_limit
的限制后可继续使用系统内存,在系统总内存紧张时会 cancel 组内内存占用最大的若干个任务,释放部分超出的内存以缓解系统内存压力。建议所有 workload
group 的 memory_limit 总和低于 100%,为BE进程中的其他组件保留一些内存。 |
| cpu_hard_limit | 整型 | -1 | [1%, 100%] |
可选,CPU 硬限制模式下生效,Workload Group 最大可用 CPU 百分比,不管当前机器的 CPU 资源是否被用满,Workload Group
的最大 CPU 用量都不能超过 cpu_hard_limit,所有 Workload Group 的 cpu_hard_limit 累加值不能超过
100%。2.1 版本新增属性,2.0版本不支持该功能。
|
| max_concurrency | 整型 | 2147483647 | [0, 2147483647] |
可选,最大查询并发数,默认值为整型最大值,也就是不做并发的限制。运行中的查询数量达到最大并发时,新来的查询会进入排队的逻辑。
|
@@ -192,7 +192,7 @@ SELECT name FROM information_schema.workload_groups;
更多授权操作可以参考[grant
语句](../../sql-manual/sql-statements/Account-Management-Statements/GRANT)。
**两种绑定方式**
-1. 通过设置 user property 将 user 默认绑定到 workload
group,默认为`normal`,需要注意的这里的value不能填空,否则语句会执行失败,如果不知道要设置哪些group,可以设置为`normal`,`normal`为全局默认的group。
+1. 通过设置 user property 将 user 默认绑定到 workload
group,默认为`normal`,需要注意的这里的value不能填空,否则语句会执行失败。
```
set property 'default_workload_group' = 'g1';
```
@@ -558,7 +558,7 @@ OLAP 系统在做 ETL 或者大的 Adhoc 查询时,需要读取大量的数据

**注意事项**
-1. 系统表中的 LOCAL_SCAN_BYTES_PER_SECOND 字段代表的是当前 Workload Group 在进程粒度的统计汇总值,比如配置了
12 个文件路径,那么 LOCAL_SCAN_BYTES_PER_SECOND 就是这 12 个文件路径 IO 的最大值,如果期望查看每个文件路径分别的 IO
吞吐,可以在 grafana 上或者 BE 的 bvar 监控查看明细的值。
+1. 系统表中的 LOCAL_SCAN_BYTES_PER_SECOND 字段代表的是当前 Workload Group 在进程粒度的统计汇总值,比如配置了
12 个文件路径,那么 LOCAL_SCAN_BYTES_PER_SECOND 就是这 12 个文件路径 IO 的最大值,如果期望查看每个文件路径分别的 IO
吞吐,可以在 grafana 监控查看明细的值。
2. 由于操作系统和 Doris 的 Page Cache 的存在,通过 linux 的 IO 监控脚本看到的 IO 通常要比系统表看到的要小。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/admin-manual/workload-management/workload-group.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/admin-manual/workload-management/workload-group.md
index 419ea650b49..ca5c02b911c 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/admin-manual/workload-management/workload-group.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/admin-manual/workload-management/workload-group.md
@@ -146,7 +146,7 @@ Query OK, 0 rows affected (0.03 sec)
| 属性名称 | 数据类型 | 默认值 | 取值范围 | 说明
|
|------------------------------|---------|-----|-----|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| cpu_share | 整型 | -1 | [1, 10000] |
可选,CPU软限模式下生效,取值范围和使用的CGroup版本有关,下文有详细描述。cpu_share 代表了 Workload Group
可获得CPU时间的权重,值越大,可获得的CPU时间越多。例如,用户创建了 3 个 Workload Group g-a、g-b 和 g-c,cpu_share
分别为 10、30、40,某一时刻 g-a 和 g-b 正在跑任务,而 g-c 没有任务,此时 g-a 可获得 25% (10 / (10 + 30)) 的
CPU 资源,而 g-b 可获得 75% 的 CPU 资源。如果系统只有一个 Workload Group 正在运行,则不管其 cpu_share
的值为多少,它都可获取全部的 CPU 资源 。 |
-| memory_limit | 浮点 | -1 | (0%, 100%] |
可选,开启内存硬限时代表当前 Workload Group 最大可用内存百分比,默认值代表不限制内存。所有 Workload Group 的
memory_limit 累加值不可以超过 100%,通常与 enable_memory_overcommit 属性配合使用。如果一个机器的内存为
64G,Workload Group 的 memory_limit配置为50%,那么该 group 的实际物理内存=64G * 90% * 50%=
28.8G,这里的90%是 BE 进程可用内存配置的默认值。一个集群中所有 Workload Group 的 memory_limit 的累加值不能超过
100%。 |
+| memory_limit | 浮点 | -1 | (0%, 100%] |
可选,开启内存硬限时代表当前 Workload Group 最大可用内存百分比,默认值代表不限制内存。所有 Workload Group 的
memory_limit 累加值不可以超过 100%,通常与 enable_memory_overcommit 属性配合使用。如果一个机器的内存为
64G,Workload Group 的 memory_limit配置为50%,那么该 group 的实际物理内存=64G * 90% * 50%=
28.8G,这里的90%是 BE 进程可用内存配置的默认值。
|
| enable_memory_overcommit | 布尔 | true | true, false |
可选,用于控制当前 Workload Group 的内存限制是硬限还是软限,默认为 true。如果设置为 false,则该 workload group
为内存硬隔离,系统检测到 workload group 内存使用超出限制后将立即 cancel 组内内存占用最大的若干个任务,以释放超出的内存;如果设置为
true,则该 Workload Group 为内存软隔离,如果系统有空闲内存资源则该 Workload Group 在超出 memory_limit
的限制后可继续使用系统内存,在系统总内存紧张时会 cancel 组内内存占用最大的若干个任务,释放部分超出的内存以缓解系统内存压力。建议所有 workload
group 的 memory_limit 总和低于 100%,为BE进程中的其他组件保留一些内存。 |
| cpu_hard_limit | 整型 | -1 | [1%, 100%] |
可选,CPU 硬限制模式下生效,Workload Group 最大可用 CPU 百分比,不管当前机器的 CPU 资源是否被用满,Workload Group
的最大 CPU 用量都不能超过 cpu_hard_limit,所有 Workload Group 的 cpu_hard_limit 累加值不能超过
100%。2.1 版本新增属性,2.0版本不支持该功能。
|
| max_concurrency | 整型 | 2147483647 | [0, 2147483647] |
可选,最大查询并发数,默认值为整型最大值,也就是不做并发的限制。运行中的查询数量达到最大并发时,新来的查询会进入排队的逻辑。
|
@@ -192,7 +192,7 @@ SELECT name FROM information_schema.workload_groups;
更多授权操作可以参考[grant
语句](../../sql-manual/sql-statements/Account-Management-Statements/GRANT)。
**两种绑定方式**
-1. 通过设置 user property 将 user 默认绑定到 workload
group,默认为`normal`,需要注意的这里的value不能填空,否则语句会执行失败,如果不知道要设置哪些group,可以设置为`normal`,`normal`为全局默认的group。
+1. 通过设置 user property 将 user 默认绑定到 workload
group,默认为`normal`,需要注意的这里的value不能填空,否则语句会执行失败。
```
set property 'default_workload_group' = 'g1';
```
@@ -558,7 +558,7 @@ OLAP 系统在做 ETL 或者大的 Adhoc 查询时,需要读取大量的数据

**注意事项**
-1. 系统表中的 LOCAL_SCAN_BYTES_PER_SECOND 字段代表的是当前 Workload Group 在进程粒度的统计汇总值,比如配置了
12 个文件路径,那么 LOCAL_SCAN_BYTES_PER_SECOND 就是这 12 个文件路径 IO 的最大值,如果期望查看每个文件路径分别的 IO
吞吐,可以在 grafana 上或者 BE 的 bvar 监控查看明细的值。
+1. 系统表中的 LOCAL_SCAN_BYTES_PER_SECOND 字段代表的是当前 Workload Group 在进程粒度的统计汇总值,比如配置了
12 个文件路径,那么 LOCAL_SCAN_BYTES_PER_SECOND 就是这 12 个文件路径 IO 的最大值,如果期望查看每个文件路径分别的 IO
吞吐,可以在 grafana 监控查看明细的值。
2. 由于操作系统和 Doris 的 Page Cache 的存在,通过 linux 的 IO 监控脚本看到的 IO 通常要比系统表看到的要小。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/admin-manual/workload-management/workload-group.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/admin-manual/workload-management/workload-group.md
index 419ea650b49..ca5c02b911c 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/admin-manual/workload-management/workload-group.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/admin-manual/workload-management/workload-group.md
@@ -146,7 +146,7 @@ Query OK, 0 rows affected (0.03 sec)
| 属性名称 | 数据类型 | 默认值 | 取值范围 | 说明
|
|------------------------------|---------|-----|-----|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| cpu_share | 整型 | -1 | [1, 10000] |
可选,CPU软限模式下生效,取值范围和使用的CGroup版本有关,下文有详细描述。cpu_share 代表了 Workload Group
可获得CPU时间的权重,值越大,可获得的CPU时间越多。例如,用户创建了 3 个 Workload Group g-a、g-b 和 g-c,cpu_share
分别为 10、30、40,某一时刻 g-a 和 g-b 正在跑任务,而 g-c 没有任务,此时 g-a 可获得 25% (10 / (10 + 30)) 的
CPU 资源,而 g-b 可获得 75% 的 CPU 资源。如果系统只有一个 Workload Group 正在运行,则不管其 cpu_share
的值为多少,它都可获取全部的 CPU 资源 。 |
-| memory_limit | 浮点 | -1 | (0%, 100%] |
可选,开启内存硬限时代表当前 Workload Group 最大可用内存百分比,默认值代表不限制内存。所有 Workload Group 的
memory_limit 累加值不可以超过 100%,通常与 enable_memory_overcommit 属性配合使用。如果一个机器的内存为
64G,Workload Group 的 memory_limit配置为50%,那么该 group 的实际物理内存=64G * 90% * 50%=
28.8G,这里的90%是 BE 进程可用内存配置的默认值。一个集群中所有 Workload Group 的 memory_limit 的累加值不能超过
100%。 |
+| memory_limit | 浮点 | -1 | (0%, 100%] |
可选,开启内存硬限时代表当前 Workload Group 最大可用内存百分比,默认值代表不限制内存。所有 Workload Group 的
memory_limit 累加值不可以超过 100%,通常与 enable_memory_overcommit 属性配合使用。如果一个机器的内存为
64G,Workload Group 的 memory_limit配置为50%,那么该 group 的实际物理内存=64G * 90% * 50%=
28.8G,这里的90%是 BE 进程可用内存配置的默认值。
|
| enable_memory_overcommit | 布尔 | true | true, false |
可选,用于控制当前 Workload Group 的内存限制是硬限还是软限,默认为 true。如果设置为 false,则该 workload group
为内存硬隔离,系统检测到 workload group 内存使用超出限制后将立即 cancel 组内内存占用最大的若干个任务,以释放超出的内存;如果设置为
true,则该 Workload Group 为内存软隔离,如果系统有空闲内存资源则该 Workload Group 在超出 memory_limit
的限制后可继续使用系统内存,在系统总内存紧张时会 cancel 组内内存占用最大的若干个任务,释放部分超出的内存以缓解系统内存压力。建议所有 workload
group 的 memory_limit 总和低于 100%,为BE进程中的其他组件保留一些内存。 |
| cpu_hard_limit | 整型 | -1 | [1%, 100%] |
可选,CPU 硬限制模式下生效,Workload Group 最大可用 CPU 百分比,不管当前机器的 CPU 资源是否被用满,Workload Group
的最大 CPU 用量都不能超过 cpu_hard_limit,所有 Workload Group 的 cpu_hard_limit 累加值不能超过
100%。2.1 版本新增属性,2.0版本不支持该功能。
|
| max_concurrency | 整型 | 2147483647 | [0, 2147483647] |
可选,最大查询并发数,默认值为整型最大值,也就是不做并发的限制。运行中的查询数量达到最大并发时,新来的查询会进入排队的逻辑。
|
@@ -192,7 +192,7 @@ SELECT name FROM information_schema.workload_groups;
更多授权操作可以参考[grant
语句](../../sql-manual/sql-statements/Account-Management-Statements/GRANT)。
**两种绑定方式**
-1. 通过设置 user property 将 user 默认绑定到 workload
group,默认为`normal`,需要注意的这里的value不能填空,否则语句会执行失败,如果不知道要设置哪些group,可以设置为`normal`,`normal`为全局默认的group。
+1. 通过设置 user property 将 user 默认绑定到 workload
group,默认为`normal`,需要注意的这里的value不能填空,否则语句会执行失败。
```
set property 'default_workload_group' = 'g1';
```
@@ -558,7 +558,7 @@ OLAP 系统在做 ETL 或者大的 Adhoc 查询时,需要读取大量的数据

**注意事项**
-1. 系统表中的 LOCAL_SCAN_BYTES_PER_SECOND 字段代表的是当前 Workload Group 在进程粒度的统计汇总值,比如配置了
12 个文件路径,那么 LOCAL_SCAN_BYTES_PER_SECOND 就是这 12 个文件路径 IO 的最大值,如果期望查看每个文件路径分别的 IO
吞吐,可以在 grafana 上或者 BE 的 bvar 监控查看明细的值。
+1. 系统表中的 LOCAL_SCAN_BYTES_PER_SECOND 字段代表的是当前 Workload Group 在进程粒度的统计汇总值,比如配置了
12 个文件路径,那么 LOCAL_SCAN_BYTES_PER_SECOND 就是这 12 个文件路径 IO 的最大值,如果期望查看每个文件路径分别的 IO
吞吐,可以在 grafana 监控查看明细的值。
2. 由于操作系统和 Doris 的 Page Cache 的存在,通过 linux 的 IO 监控脚本看到的 IO 通常要比系统表看到的要小。
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]