jerqi commented on code in PR #2277: URL: https://github.com/apache/incubator-uniffle/pull/2277#discussion_r1912884018
##########
docs/coordinator_guide.md:
##########
@@ -82,35 +82,36 @@ This document will introduce how to deploy Uniffle coordinators.
## Configuration
### Common settings
-| Property Name | Default | Description |
-|--------------------------------------------------------|------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| rss.coordinator.server.heartbeat.timeout | 30000 | Timeout if can't get heartbeat from shuffle server |
-| rss.coordinator.server.periodic.output.interval.times | 30 | The periodic interval times of output alive nodes. The interval sec can be calculated by (rss.coordinator.server.heartbeat.timeout/3 * rss.coordinator.server.periodic.output.interval.times). Default output interval is 5min. |
-| rss.coordinator.assignment.strategy | PARTITION_BALANCE | Strategy for assigning shuffle server, PARTITION_BALANCE should be used for workload balance |
-| rss.coordinator.app.expired | 60000 | Application expired time (ms), the heartbeat interval should be less than it |
-| rss.coordinator.shuffle.nodes.max | 9 | The max number of shuffle server when do the assignment |
-| rss.coordinator.dynamicClientConf.path | - | The path of configuration file which have default conf for rss client |
-| rss.coordinator.exclude.nodes.file.path | - | The path of configuration file which have exclude nodes |
-| rss.coordinator.exclude.nodes.check.interval.ms | 60000 | Update interval (ms) for exclude nodes |
-| rss.coordinator.access.checkers | org.apache.uniffle.coordinator.access.checker.AccessClusterLoadChecker | The access checkers will be used when the spark client use the DelegationShuffleManager, which will decide whether to use rss according to the result of the specified access checkers |
-| rss.coordinator.access.loadChecker.memory.percentage | 15.0 | The minimal percentage of available memory percentage of a server |
-| rss.coordinator.dynamicClientConf.enabled | false | whether to enable dynamic client conf, which will be fetched by spark client |
-| rss.coordinator.dynamicClientConf.path | - | The dynamic client conf of this cluster and can be stored in HADOOP FS or local |
-| rss.coordinator.dynamicClientConf.updateIntervalSec | 120 | The dynamic client conf update interval in seconds |
-| rss.coordinator.remote.storage.cluster.conf | - | Remote Storage Cluster related conf with format $clusterId,$key=$value, separated by ';' |
-| rss.rpc.server.port | - | RPC port for coordinator |
-| rss.jetty.http.port | - | Http port for coordinator |
-| rss.coordinator.remote.storage.select.strategy | APP_BALANCE | Strategy for selecting the remote path |
-| rss.coordinator.remote.storage.io.sample.schedule.time | 60000 | The time of scheduling the read and write time of the paths to obtain different HADOOP FS |
-| rss.coordinator.remote.storage.io.sample.file.size | 204800000 | The size of the file that the scheduled thread reads and writes |
-| rss.coordinator.remote.storage.io.sample.access.times | 3 | The number of times to read and write HADOOP FS files |
-| rss.coordinator.startup-silent-period.enabled | false | Enable the startup-silent-period to reject the assignment requests for avoiding partial assignments. To avoid service interruption, this mechanism is disabled by default. Especially it's recommended to use in coordinator HA mode when restarting single coordinator. |
-| rss.coordinator.startup-silent-period.duration | 20000 | The waiting duration(ms) when conf of rss.coordinator.startup-silent-period.enabled is enabled. |
-| rss.coordinator.select.partition.strategy | CONTINUOUS | There are two strategies for selecting partitions: ROUND and CONTINUOUS. ROUND will poll to allocate partitions to ShuffleServer, and CONTINUOUS will try to allocate consecutive partitions to ShuffleServer, this feature can improve performance in AQE scenarios. |
-| rss.metrics.reporter.class | - | The class of metrics reporter. |
-| rss.reconfigure.interval.sec | 5 | Reconfigure check interval. |
-| rss.coordinator.rpc.audit.log.enabled | true | When set to true, for auditing purposes, the coordinator will log audit records for every rpc request operation. |
-| rss.coordinator.rpc.audit.log.excludeList | appHeartbeat,heartbeat | Exclude record rpc audit operation list, separated by ','. |
+| Property Name | Default | Description |
+|--------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| rss.coordinator.server.heartbeat.timeout | 30000 | Timeout if can't get heartbeat from shuffle server |
+| rss.coordinator.server.periodic.output.interval.times | 30 | The periodic interval times of output alive nodes. The interval sec can be calculated by (rss.coordinator.server.heartbeat.timeout/3 * rss.coordinator.server.periodic.output.interval.times). Default output interval is 5min. |
+| rss.coordinator.assignment.strategy | PARTITION_BALANCE | Strategy for assigning shuffle server, PARTITION_BALANCE should be used for workload balance |
+| rss.coordinator.app.expired | 60000 | Application expired time (ms), the heartbeat interval should be less than it |
+| rss.coordinator.shuffle.nodes.max | 9 | The max number of shuffle server when do the assignment |
+| rss.coordinator.dynamicClientConf.path | - | The path of configuration file which have default conf for rss client |
+| rss.coordinator.exclude.nodes.file.path | - | The path of configuration file which have exclude nodes |
+| rss.coordinator.exclude.nodes.check.interval.ms | 60000 | Update interval (ms) for exclude nodes |
+| rss.coordinator.access.checkers | org.apache.uniffle.coordinator.access.checker.AccessClusterLoadChecker,org.apache.uniffle.coordinator.access.checker.AccessQuotaChecker,org.apache.uniffle.coordinator.access.checker.AccessSupportRssChecker | The access checkers will be used when the spark client use the DelegationShuffleManager, which will decide whether to use rss according to the result of the specified access checkers |
+| rss.coordinator.access.loadChecker.memory.percentage | 15.0 | The minimal percentage of available memory percentage of a server |
+| rss.coordinator.dynamicClientConf.enabled | false | whether to enable dynamic client conf, which will be fetched by spark client |
+| rss.coordinator.dynamicClientConf.path | - | The dynamic client conf of this cluster and can be stored in HADOOP FS or local |
+| rss.coordinator.dynamicClientConf.updateIntervalSec | 120 | The dynamic client conf update interval in seconds |
+| rss.coordinator.remote.storage.cluster.conf | - | Remote Storage Cluster related conf with format $clusterId,$key=$value, separated by ';' |
+| rss.rpc.server.port | - | RPC port for coordinator |
+| rss.jetty.http.port | - | Http port for coordinator |
+| rss.coordinator.remote.storage.select.strategy | APP_BALANCE | Strategy for selecting the remote path |
+| rss.coordinator.remote.storage.io.sample.schedule.time | 60000 | The time of scheduling the read and write time of the paths to obtain different HADOOP FS |
+| rss.coordinator.remote.storage.io.sample.file.size | 204800000 | The size of the file that the scheduled thread reads and writes |
+| rss.coordinator.remote.storage.io.sample.access.times | 3 | The number of times to read and write HADOOP FS files |
+| rss.coordinator.startup-silent-period.enabled | false | Enable the startup-silent-period to reject the assignment requests for avoiding partial assignments. To avoid service interruption, this mechanism is disabled by default. Especially it's recommended to use in coordinator HA mode when restarting single coordinator. |
+| rss.coordinator.startup-silent-period.duration | 20000 | The waiting duration(ms) when conf of rss.coordinator.startup-silent-period.enabled is enabled. |
+| rss.coordinator.select.partition.strategy | CONTINUOUS | There are two strategies for selecting partitions: ROUND and CONTINUOUS. ROUND will poll to allocate partitions to ShuffleServer, and CONTINUOUS will try to allocate consecutive partitions to ShuffleServer, this feature can improve performance in AQE scenarios. |
+| rss.metrics.reporter.class | - | The class of metrics reporter. |
+| rss.reconfigure.interval.sec | 5 | Reconfigure check interval. |
+| rss.coordinator.rpc.audit.log.enabled | true | When set to true, for auditing purposes, the coordinator will log audit records for every rpc request operation. |
+| rss.coordinator.rpc.audit.log.excludeList | appHeartbeat,heartbeat | Exclude record rpc audit operation list, separated by ','. |
+| rss.coordinator.unsupportedConfigs | serializer:org.apache.hadoop.io.serializer.JavaSerialization | The unsupported config list separated by ',', the key value separated by ':'. If the client configures these properties and they are set to be denied access, the client's access will be rejected. |
Review Comment:
An empty value may be better as the default.
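
To illustrate what an empty default would mean for `rss.coordinator.unsupportedConfigs`, here is a minimal sketch of parsing the documented format (entries separated by ',', key and value separated by ':'). The class `UnsupportedConfigsSketch` and its `parse` method are hypothetical names made up for this example; this is not the parser used in the PR. Under that assumption, an empty value yields an empty deny list, so no client would be rejected unless an operator opts in.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not Uniffle's actual implementation.
public class UnsupportedConfigsSketch {

  // Parses "key1:value1,key2:value2" into an ordered map of denied settings.
  static Map<String, String> parse(String raw) {
    Map<String, String> denied = new LinkedHashMap<>();
    if (raw == null || raw.trim().isEmpty()) {
      // Empty default: nothing is denied.
      return denied;
    }
    for (String entry : raw.split(",")) {
      // Split on the first ':' only, so values may themselves contain ':'.
      String[] kv = entry.trim().split(":", 2);
      if (kv.length == 2 && !kv[0].trim().isEmpty()) {
        denied.put(kv[0].trim(), kv[1].trim());
      }
    }
    return denied;
  }

  public static void main(String[] args) {
    // Empty default versus the default proposed in the diff above.
    System.out.println(parse(""));
    System.out.println(parse("serializer:org.apache.hadoop.io.serializer.JavaSerialization"));
  }
}
```

With the default proposed in the diff, the same sketch would deny any client that sets `serializer` to `org.apache.hadoop.io.serializer.JavaSerialization`, which changes behavior for existing deployments as soon as they upgrade.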
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
