This is an automated email from the ASF dual-hosted git repository.
dockerzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-inlong-website.git
The following commit(s) were added to refs/heads/master by this push:
new 8887d11 [INLONG-1808] Optimize document of DataProxy about monitor
metric (#187)
8887d11 is described below
commit 8887d11df497e688d36a3f80839afa819bedd44c
Author: 卢春亮 <[email protected]>
AuthorDate: Thu Nov 18 18:25:07 2021 +0800
[INLONG-1808] Optimize document of DataProxy about monitor metric (#187)
---
docs/modules/dataproxy/architecture.md | 253 +++++++++++++--------
.../current/modules/dataproxy/architecture.md | 229 ++++++++++++-------
2 files changed, 313 insertions(+), 169 deletions(-)
diff --git a/docs/modules/dataproxy/architecture.md
b/docs/modules/dataproxy/architecture.md
index a7d72f5..de2a89a 100644
--- a/docs/modules/dataproxy/architecture.md
+++ b/docs/modules/dataproxy/architecture.md
@@ -22,131 +22,204 @@ DataProxy supports configurable source-channel-sink, and
the configuration metho
Source configuration example and corresponding notes:
- agent1.sources.tcp-source.channels = ch-msg1 ch-msg2 ch-msg3 ch-more1
ch-more2 ch-more3 ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9 ch-msg10 ch-transfer
ch -Back
- Define the channel used in the source. Note that if the configuration
below this source uses the channel, it needs to be annotated here
+```shell
+agent1.sources.tcp-source.channels = ch-msg1 ch-msg2 ch-msg3 ch-more1 ch-more2 ch-more3 ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9 ch-msg10 ch-transfer ch-back
+Defines the channels used by the source. Any channel referenced by the configuration below this source must be declared here
- agent1.sources.tcp-source.type = org.apache.flume.source.SimpleTcpSource
- tcp resolution type definition, here provide the class name for
instantiation, SimpleTcpSource is mainly to initialize the configuration and
start port monitoring
+agent1.sources.tcp-source.type = org.apache.flume.source.SimpleTcpSource
+TCP source type. The class name is given here for instantiation; SimpleTcpSource mainly initializes the configuration and starts listening on the port
- agent1.sources.tcp-source.msg-factory-name =
org.apache.flume.source.ServerMessageFactory
- Handler used for message structure analysis, and set read stream handler
and write stream handler
+agent1.sources.tcp-source.msg-factory-name =
org.apache.flume.source.ServerMessageFactory
+Builds the handler used for message parsing, and sets the read stream handler and write stream handler
- agent1.sources.tcp-source.host = 0.0.0.0
- tcp ip binding monitoring, binding all network cards by default
+agent1.sources.tcp-source.host = 0.0.0.0
+IP address the TCP listener binds to; binds to all network interfaces by default
- agent1.sources.tcp-source.port = 46801
- tcp port binding, port 46801 is bound by default
+agent1.sources.tcp-source.port = 46801
+TCP port to bind; port 46801 is bound by default
- agent1.sources.tcp-source.highWaterMark=2621440
- The concept of netty, set the netty high water level value
+agent1.sources.tcp-source.highWaterMark=2621440
+A Netty concept: sets the Netty high water mark value
- agent1.sources.tcp-source.enableExceptionReturn=true
- The new function of v1.7 version, optional, the default is false, used to
open the exception channel, when an exception occurs, the data is written to
the exception channel to prevent other normal data transmission (the open
source version does not add this function), Details: Increase the local disk of
abnormal data landing
+agent1.sources.tcp-source.enableExceptionReturn=true
+New in v1.7, optional, default false. Enables the exception channel: when an exception occurs, the data is written to the exception channel so that it does not block the transmission of other normal data (the open source version does not include this feature). Details: abnormal data is landed on the local disk
- agent1.sources.tcp-source.max-msg-length = 524288
- Limit the size of a single package, here if the compressed package is
transmitted, it is the compressed package size, the limit is 512KB
+agent1.sources.tcp-source.max-msg-length = 524288
+Limits the size of a single packet; if a compressed packet is transmitted, this is the compressed size. The limit is 512KB
- agent1.sources.tcp-source.topic = test_token
- The default topic value, if the mapping relationship between groupId and
topic cannot be found, it will be sent to this topic
+agent1.sources.tcp-source.topic = test_token
+Default topic. If no mapping between groupId and topic can be found, the data is sent to this topic
- agent1.sources.tcp-source.attr = m=9
- The default value of m is set, where the value of m is the version of
inlong's internal TdMsg protocol
+agent1.sources.tcp-source.attr = m=9
+Sets the default value of m, where m is the version of InLong's internal TdMsg protocol
- agent1.sources.tcp-source.connections = 5000
- Concurrent connections go online, new connections will be broken when the
upper limit is exceeded
+agent1.sources.tcp-source.connections = 5000
+Upper limit on concurrent connections; new connections are closed once the limit is exceeded
- agent1.sources.tcp-source.max-threads = 64
- Netty thread pool work thread upper limit, generally recommended to choose
twice the cpu
+agent1.sources.tcp-source.max-threads = 64
+Upper limit on worker threads in the Netty thread pool; twice the number of CPU cores is generally recommended
- agent1.sources.tcp-source.receiveBufferSize = 524288
- Netty server tcp tuning parameters
+agent1.sources.tcp-source.receiveBufferSize = 524288
+Netty server tcp tuning parameters
- agent1.sources.tcp-source.sendBufferSize = 524288
- Netty server tcp tuning parameters
+agent1.sources.tcp-source.sendBufferSize = 524288
+Netty server tcp tuning parameters
- agent1.sources.tcp-source.custom-cp = true
- Whether to use the self-developed channel process, the self-developed
channel process can select the alternate channel to send when the main channel
is blocked
+agent1.sources.tcp-source.custom-cp = true
+Whether to use the self-developed channel processor, which can send through an alternate channel when the main channel is blocked
- agent1.sources.tcp-source.selector.type =
org.apache.flume.channel.FailoverChannelSelector
- This channel selector is a self-developed channel selector, which is not
much different from the official website, mainly because of the channel
master-slave selection logic
+agent1.sources.tcp-source.selector.type =
org.apache.flume.channel.FailoverChannelSelector
+A self-developed channel selector. It differs little from the official one; the main difference is the master/slave channel selection logic
- agent1.sources.tcp-source.selector.master = ch-msg5 ch-msg6 ch-msg7
ch-msg8 ch-msg9
- Specify the master channel, these channels will be preferentially selected
for data push. Those channels that are not in the master, transfer, fileMetric,
and slaMetric configuration items, but are in
- There are defined channels in channels, which are all classified as slave
channels. When the master channel is full, the slave channel will be selected.
Generally, the file channel type is recommended for the slave channel.
+agent1.sources.tcp-source.selector.master = ch-msg5 ch-msg6 ch-msg7 ch-msg8
ch-msg9
+Specifies the master channels, which are preferred for data push. Channels that are defined in channels but are not listed in the master, transfer, fileMetric, or slaMetric configuration items are all classified as slave channels. When the master channels are full, a slave channel is selected. The file channel type is generally recommended for slave channels.
- agent1.sources.tcp-source.selector.transfer = ch-msg5 ch-msg6 ch-msg7
ch-msg8 ch-msg9
- Specify the transfer channel to accept the transfer type data. The
transfer here generally refers to the data pushed to the non-tube cluster,
which is only for forwarding, and it is reserved for subsequent functions.
-
- agent1.sources.tcp-source.selector.fileMetric = ch-back
- Specify the fileMetric channel to receive the metric data reported by the
agent
+agent1.sources.tcp-source.selector.transfer = ch-msg5 ch-msg6 ch-msg7 ch-msg8
ch-msg9
+Specifies the transfer channels, which accept transfer-type data. Transfer here generally means data pushed to a non-Tube cluster; it is forwarded only, and the setting is reserved for future features.
+agent1.sources.tcp-source.selector.fileMetric = ch-back
+Specify the fileMetric channel to receive the metric data reported by the agent
+```
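The master/slave rule described for `selector.master` can be illustrated with a small sketch. It is not the FailoverChannelSelector implementation: `ChannelRoles` and its methods are hypothetical names, and the sketch only derives which channels would count as slave channels, assuming the lists are space-separated as in the example above and that the last declared channel is `ch-back`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of DataProxy or Flume. */
final class ChannelRoles {

    /** Splits a space-separated channel list such as "ch-msg1 ch-msg2". */
    static List<String> parse(String value) {
        List<String> names = new ArrayList<>();
        for (String name : value.trim().split("\\s+")) {
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /**
     * Returns the channels declared in `channels` that appear in none of the
     * dedicated lists (master, transfer, fileMetric, slaMetric), in declaration order.
     */
    static List<String> slaveChannels(String channels, String... dedicatedLists) {
        Set<String> remaining = new LinkedHashSet<>(parse(channels));
        for (String list : dedicatedLists) {
            remaining.removeAll(parse(list));
        }
        return new ArrayList<>(remaining);
    }

    public static void main(String[] args) {
        // Values copied from the source configuration example.
        String channels = "ch-msg1 ch-msg2 ch-msg3 ch-more1 ch-more2 ch-more3 "
                + "ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9 ch-msg10 ch-transfer ch-back";
        String master = "ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9";
        String transfer = "ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9";
        String fileMetric = "ch-back";
        System.out.println(slaveChannels(channels, master, transfer, fileMetric));
    }
}
```

With the example values, the remaining channels are `ch-msg1`, `ch-msg2`, `ch-msg3`, `ch-more1`, `ch-more2`, `ch-more3`, `ch-msg10`, and `ch-transfer`.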
Channel configuration examples and corresponding annotations
memory channel
- agent1.channels.ch-more1.type = memory
- memory channel type
+```shell
+agent1.channels.ch-more1.type = memory
+memory channel type
+
+agent1.channels.ch-more1.capacity = 10000000
+Memory channel queue size, the maximum number of messages that can be cached
- agent1.channels.ch-more1.capacity = 10000000
- Memory channel queue size, the maximum number of messages that can be
cached
+agent1.channels.ch-more1.keep-alive = 0
- agent1.channels.ch-more1.keep-alive = 0
-
- agent1.channels.ch-more1.transactionCapacity = 20
- The maximum number of batches are processed in atomic operations, and the
memory channel needs to be locked when used, so there will be a batch process
to increase efficiency
+agent1.channels.ch-more1.transactionCapacity = 20
+Maximum number of events processed in one atomic operation. The memory channel must be locked while in use, so batching is used to improve efficiency
+```
file channel
- agent1.channels.ch-msg5.type = file
- file channel type
+```shell
+agent1.channels.ch-msg5.type = file
+file channel type
- agent1.channels.ch-msg5.capacity = 100000000
- The maximum number of messages that can be cached in a file channel
+agent1.channels.ch-msg5.capacity = 100000000
+The maximum number of messages that can be cached in a file channel
- agent1.channels.ch-msg5.maxFileSize = 1073741824
- file channel file maximum limit, the number of bytes
+agent1.channels.ch-msg5.maxFileSize = 1073741824
+Maximum size of a file channel file, in bytes
- agent1.channels.ch-msg5.minimumRequiredSpace = 1073741824
- The minimum free space of the disk where the file channel is located.
Setting this value can prevent the disk from being full
+agent1.channels.ch-msg5.minimumRequiredSpace = 1073741824
+Minimum free space on the disk where the file channel resides. Setting this value prevents the disk from filling up
- agent1.channels.ch-msg5.checkpointDir = /data/work/file/ch-msg5/check
- file channel checkpoint path
+agent1.channels.ch-msg5.checkpointDir = /data/work/file/ch-msg5/check
+file channel checkpoint path
- agent1.channels.ch-msg5.dataDirs = /data/work/file/ch-msg5/data
- file channel data path
+agent1.channels.ch-msg5.dataDirs = /data/work/file/ch-msg5/data
+file channel data path
- agent1.channels.ch-msg5.fsyncPerTransaction = false
- Whether to synchronize the disk for each atomic operation, it is
recommended to change it to false, otherwise it will affect the performance
+agent1.channels.ch-msg5.fsyncPerTransaction = false
+Whether to sync to disk on every atomic operation. Setting it to false is recommended; otherwise performance suffers
- agent1.channels.ch-msg5.fsyncInterval = 5
- The time interval between data flush from memory to disk, in seconds
+agent1.channels.ch-msg5.fsyncInterval = 5
+The time interval between data flush from memory to disk, in seconds
+```
Sink configuration example and corresponding notes
- agent1.sinks.meta-sink-more1.channel = ch-msg1
- The upstream channel name of the sink
-
- agent1.sinks.meta-sink-more1.type = org.apache.flume.sink.MetaSink
- The sink class is implemented, where the message is implemented to push
data to the tube cluster
-
- agent1.sinks.meta-sink-more1.master-host-port-list =
- Tube cluster master node list
-
- agent1.sinks.meta-sink-more1.send_timeout = 30000
- Timeout limit when sending to tube
-
- agent1.sinks.meta-sink-more1.stat-interval-sec = 60
- Sink indicator statistics interval time, in seconds
-
- agent1.sinks.meta-sink-more1.thread-num = 8
- Sink class sends messages to the worker thread, 8 means to start 8
concurrent threads
-
- agent1.sinks.meta-sink-more1.client-id-cache = true
- agent id cache, used to check the data reported by the agent to remove
duplicates
-
- agent1.sinks.meta-sink-more1.max-survived-time = 300000
- Maximum cache time
-
- agent1.sinks.meta-sink-more1.max-survived-size = 3000000
- Maximum number of caches
+```shell
+agent1.sinks.meta-sink-more1.channel = ch-msg1
+The upstream channel name of the sink
+
+agent1.sinks.meta-sink-more1.type = org.apache.flume.sink.MetaSink
+Sink implementation class; this one pushes messages to the Tube cluster
+
+agent1.sinks.meta-sink-more1.master-host-port-list =
+Tube cluster master node list
+
+agent1.sinks.meta-sink-more1.send_timeout = 30000
+Timeout limit when sending to tube
+
+agent1.sinks.meta-sink-more1.stat-interval-sec = 60
+Sink indicator statistics interval time, in seconds
+
+agent1.sinks.meta-sink-more1.thread-num = 8
+Number of worker threads the sink uses to send messages; 8 starts 8 concurrent threads
+
+agent1.sinks.meta-sink-more1.client-id-cache = true
+Agent ID cache, used to deduplicate the data reported by the agent
+
+agent1.sinks.meta-sink-more1.max-survived-time = 300000
+Maximum cache time
+
+agent1.sinks.meta-sink-more1.max-survived-size = 3000000
+Maximum number of cached entries
+```
+
+# 4、Monitor metrics configuration instructions
+
+ DataProxy provides JMX-based monitor metrics. Users can implement code that reads the metrics and reports them to a user-defined monitoring system.
+Source and Sink modules can add a monitor metric class that is a subclass of org.apache.inlong.commons.config.metrics.MetricItemSet and register it with the MBeanServer. A user-defined plugin can read the module metrics through JMX and report the metric data to different monitoring systems.
+
+ Users can describe the configuration in the file "common.properties". For example:
+
+```shell
+metricDomains=DataProxy
+metricDomains.DataProxy.domainListeners=org.apache.inlong.dataproxy.metrics.prometheus.PrometheusMetricListener
+metricDomains.DataProxy.snapshotInterval=60000
+```
+
+ * The JMX domain name of DataProxy is "DataProxy".
+ * It is defined by the parameter "metricDomains".
+ * The listeners of a JMX domain are defined by the parameter "metricDomains.$domainName.domainListeners".
+ * The class names of the listeners are separated by spaces.
+ * Each listener class needs to implement the interface "org.apache.inlong.dataproxy.metrics.MetricListener".
+ * The snapshot interval of the listeners is defined by the parameter "metricDomains.$domainName.snapshotInterval", in milliseconds.
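As a rough illustration of how these three parameters fit together, the sketch below reads them with `java.util.Properties`. It is not DataProxy's own loading code: `MetricDomainConfig` is a hypothetical class, and the 60000 ms fallback used when `snapshotInterval` is absent is an assumption of this sketch, not a documented default.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical holder for one JMX domain's settings; not DataProxy code. */
final class MetricDomainConfig {
    final String domain;
    final List<String> listenerClassNames;
    final long snapshotIntervalMs;

    MetricDomainConfig(String domain, List<String> listenerClassNames, long snapshotIntervalMs) {
        this.domain = domain;
        this.listenerClassNames = listenerClassNames;
        this.snapshotIntervalMs = snapshotIntervalMs;
    }

    /** Reads metricDomains.$domainName.* for the given domain name. */
    static MetricDomainConfig load(Properties props, String domain) {
        String prefix = "metricDomains." + domain + ".";
        // Listener class names are separated by spaces.
        List<String> listeners = new ArrayList<>();
        for (String name : props.getProperty(prefix + "domainListeners", "").trim().split("\\s+")) {
            if (!name.isEmpty()) {
                listeners.add(name);
            }
        }
        // Assumption of this sketch: fall back to 60000 ms when the key is missing.
        long interval = Long.parseLong(props.getProperty(prefix + "snapshotInterval", "60000").trim());
        return new MetricDomainConfig(domain, listeners, interval);
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(
                "metricDomains=DataProxy\n"
                + "metricDomains.DataProxy.domainListeners="
                + "org.apache.inlong.dataproxy.metrics.prometheus.PrometheusMetricListener\n"
                + "metricDomains.DataProxy.snapshotInterval=60000\n"));
        MetricDomainConfig config = load(props, props.getProperty("metricDomains"));
        System.out.println(config.domain + " " + config.listenerClassNames + " " + config.snapshotIntervalMs);
    }
}
```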
+
+ The method prototype of org.apache.inlong.dataproxy.metrics.MetricListener is:
+
+```java
+public void snapshot(String domain, List<MetricItemValue> itemValues);
+```
+
+ MetricItemValue.dimensions contains the following dimensions (the fields of DataProxyMetricItem marked with the annotation "@Dimension"):
+
+```shell
+clusterId: DataProxy cluster ID.
+sourceId: DataProxy source component name.
+sourceDataId: DataProxy source component data ID; for a TCP source, it is the port number.
+inlongGroupId: Inlong data group ID.
+inlongStreamId: Inlong data stream ID.
+sinkId: DataProxy sink component name.
+sinkDataId: DataProxy sink component data ID; for a Pulsar sink, it is the topic name.
+```
+
+ MetricItemValue.metrics contains the following metrics (the fields of DataProxyMetricItem marked with the annotation "@CountMetric"):
+
+```shell
+readSuccessCount: Number of events successfully read from the source component.
+readSuccessSize: Body size of events successfully read from the source component.
+readFailCount: Number of events that failed to be read from the source component.
+readFailSize: Body size of events that failed to be read from the source component.
+sendCount: Number of events sent to the sink destination.
+sendSize: Body size of events sent to the sink destination.
+sendSuccessCount: Number of events successfully sent to the sink destination.
+sendSuccessSize: Body size of events successfully sent to the sink destination.
+sendFailCount: Number of events that failed to be sent to the sink destination.
+sendFailSize: Body size of events that failed to be sent to the sink destination.
+sinkDuration: Time in milliseconds between the current time and the time the event was sent to the sink destination.
+nodeDuration: Time in milliseconds between the current time and the time the event was received from the source.
+wholeDuration: Time in milliseconds between the current time and the time the event was generated.
+```
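A listener built on the prototype above might look like the following sketch. To stay self-contained it declares its own stand-ins for `MetricListener` and `MetricItemValue`; the real types live in `org.apache.inlong.dataproxy.metrics`, and the field types used here (`Map<String, String>` for dimensions, `Map<String, Long>` for metrics) are assumptions made for illustration. `LoggingMetricListener` is a hypothetical name.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Stand-in for the real MetricItemValue; field types are assumptions. */
final class MetricItemValue {
    final Map<String, String> dimensions;
    final Map<String, Long> metrics;

    MetricItemValue(Map<String, String> dimensions, Map<String, Long> metrics) {
        this.dimensions = dimensions;
        this.metrics = metrics;
    }
}

/** Stand-in mirroring the documented method prototype. */
interface MetricListener {
    void snapshot(String domain, List<MetricItemValue> itemValues);
}

/** Hypothetical listener that prints one line per metric item. */
final class LoggingMetricListener implements MetricListener {

    /** Formats the send counters of one item, with dimensions sorted by name. */
    static String format(String domain, MetricItemValue item) {
        Map<String, String> sortedDimensions = new TreeMap<>(item.dimensions);
        long succeeded = item.metrics.getOrDefault("sendSuccessCount", 0L);
        long failed = item.metrics.getOrDefault("sendFailCount", 0L);
        return domain + " " + sortedDimensions
                + " sendSuccessCount=" + succeeded + " sendFailCount=" + failed;
    }

    @Override
    public void snapshot(String domain, List<MetricItemValue> itemValues) {
        // A real listener would forward these values to a monitoring system.
        for (MetricItemValue item : itemValues) {
            System.out.println(format(domain, item));
        }
    }

    public static void main(String[] args) {
        MetricItemValue item = new MetricItemValue(
                Collections.singletonMap("sinkId", "meta-sink-more1"),
                Collections.singletonMap("sendSuccessCount", 5L));
        new LoggingMetricListener().snapshot("DataProxy", Collections.singletonList(item));
    }
}
```

A listener written against the real interface would be named in `metricDomains.DataProxy.domainListeners` by its fully qualified class name.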
+
+ The monitor metrics are registered with the MBeanServer. Users can append the following JMX parameters when starting DataProxy so that a remote server can collect the monitor metrics over RMI.
+
+```shell
+-Dcom.sun.management.jmxremote
+-Djava.rmi.server.hostname=127.0.0.1
+-Dcom.sun.management.jmxremote.port=9999
+-Dcom.sun.management.jmxremote.authenticate=false
+-Dcom.sun.management.jmxremote.ssl=false
+```
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/dataproxy/architecture.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/dataproxy/architecture.md
index 9c63de0..23fe487 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/dataproxy/architecture.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/dataproxy/architecture.md
@@ -24,127 +24,198 @@ DataProxy supports configurable source-channel-sink, the configuration method and flume's configuration
Source configuration example and corresponding notes:
- agent1.sources.tcp-source.channels = ch-msg1 ch-msg2 ch-msg3 ch-more1
ch-more2 ch-more3 ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9 ch-msg10 ch-transfer
ch-back
- Defines the channels used by the source. Any channel referenced by the configuration below this source must be declared here
+```shell
+agent1.sources.tcp-source.channels = ch-msg1 ch-msg2 ch-msg3 ch-more1 ch-more2
ch-more3 ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9 ch-msg10 ch-transfer ch-back
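+Defines the channels used by the source. Any channel referenced by the configuration below this source must be declared here
- agent1.sources.tcp-source.type = org.apache.flume.source.SimpleTcpSource
- TCP source type. The class name is given here for instantiation; SimpleTcpSource mainly initializes the configuration and starts listening on the port
+agent1.sources.tcp-source.type = org.apache.flume.source.SimpleTcpSource
+TCP source type. The class name is given here for instantiation; SimpleTcpSource mainly initializes the configuration and starts listening on the port
- agent1.sources.tcp-source.msg-factory-name = org.apache.flume.source.ServerMessageFactory
- Builds the handler used for message parsing, and sets the read stream handler and write stream handler
+agent1.sources.tcp-source.msg-factory-name = org.apache.flume.source.ServerMessageFactory
+Builds the handler used for message parsing, and sets the read stream handler and write stream handler
- agent1.sources.tcp-source.host = 0.0.0.0
- IP address the TCP listener binds to; binds to all network interfaces by default
+agent1.sources.tcp-source.host = 0.0.0.0
+IP address the TCP listener binds to; binds to all network interfaces by default
- agent1.sources.tcp-source.port = 46801
- TCP port to bind; port 46801 is bound by default
+agent1.sources.tcp-source.port = 46801
+TCP port to bind; port 46801 is bound by default
- agent1.sources.tcp-source.highWaterMark=2621440
- A netty concept: sets the netty high water mark value
+agent1.sources.tcp-source.highWaterMark=2621440
+A netty concept: sets the netty high water mark value
- agent1.sources.tcp-source.max-msg-length = 524288
- Limits the size of a single packet; if a compressed packet is transmitted, this is the compressed size. The limit is 512KB
+agent1.sources.tcp-source.max-msg-length = 524288
+Limits the size of a single packet; if a compressed packet is transmitted, this is the compressed size. The limit is 512KB
- agent1.sources.tcp-source.topic = test_token
- Default topic. If no mapping between groupId and topic can be found, the data is sent to this topic
+agent1.sources.tcp-source.topic = test_token
+Default topic. If no mapping between groupId and topic can be found, the data is sent to this topic
- agent1.sources.tcp-source.attr = m=9
- Sets the default value of m, where m is the version of inlong's internal TdMsg protocol
+agent1.sources.tcp-source.attr = m=9
+Sets the default value of m, where m is the version of inlong's internal TdMsg protocol
- agent1.sources.tcp-source.connections = 5000
- Upper limit on concurrent connections; new connections are closed once the limit is exceeded
+agent1.sources.tcp-source.connections = 5000
+Upper limit on concurrent connections; new connections are closed once the limit is exceeded
- agent1.sources.tcp-source.max-threads = 64
- Upper limit on worker threads in the netty thread pool; twice the number of CPU cores is generally recommended
+agent1.sources.tcp-source.max-threads = 64
+Upper limit on worker threads in the netty thread pool; twice the number of CPU cores is generally recommended
- agent1.sources.tcp-source.receiveBufferSize = 524288
- netty server TCP tuning parameter
+agent1.sources.tcp-source.receiveBufferSize = 524288
+netty server TCP tuning parameter
- agent1.sources.tcp-source.sendBufferSize = 524288
- netty server TCP tuning parameter
+agent1.sources.tcp-source.sendBufferSize = 524288
+netty server TCP tuning parameter
- agent1.sources.tcp-source.custom-cp = true
- Whether to use the self-developed channel process, which can send through an alternate channel when the main channel is blocked
+agent1.sources.tcp-source.custom-cp = true
+Whether to use the self-developed channel process, which can send through an alternate channel when the main channel is blocked
- agent1.sources.tcp-source.selector.type = org.apache.flume.channel.FailoverChannelSelector
- A self-developed channel selector. It differs little from the official one; the main difference is the master/slave channel selection logic
+agent1.sources.tcp-source.selector.type = org.apache.flume.channel.FailoverChannelSelector
+A self-developed channel selector. It differs little from the official one; the main difference is the master/slave channel selection logic
- agent1.sources.tcp-source.selector.master = ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9
- Specifies the master channels, which are preferred for data push. Channels that are not listed in the master, transfer, fileMetric, or slaMetric configuration items but are
- defined in channels are all classified as slave channels. When the master channels are all full, a slave channel is used. The file channel type is generally recommended for slave channels
+agent1.sources.tcp-source.selector.master = ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9
+Specifies the master channels, which are preferred for data push. Channels that are not listed in the master, transfer, fileMetric, or slaMetric configuration items but are
+defined in channels are all classified as slave channels. When the master channels are all full, a slave channel is used. The file channel type is generally recommended for slave channels
- agent1.sources.tcp-source.selector.transfer = ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9
- Specifies the transfer channels, which accept transfer-type data. Transfer here generally means data pushed to a non-tube cluster; it is forwarded only, and the setting is reserved for future features
+agent1.sources.tcp-source.selector.transfer = ch-msg5 ch-msg6 ch-msg7 ch-msg8 ch-msg9
+Specifies the transfer channels, which accept transfer-type data. Transfer here generally means data pushed to a non-tube cluster; it is forwarded only, and the setting is reserved for future features
- agent1.sources.tcp-source.selector.fileMetric = ch-back
- Specifies the fileMetric channel, which receives the metric data reported by the agent
+agent1.sources.tcp-source.selector.fileMetric = ch-back
+Specifies the fileMetric channel, which receives the metric data reported by the agent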
+```
Channel configuration examples and corresponding annotations
memory channel
- agent1.channels.ch-more1.type = memory
- memory channel type
+```shell
+agent1.channels.ch-more1.type = memory
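+memory channel type
- agent1.channels.ch-more1.capacity = 10000000
- memory channel queue size, the maximum number of messages that can be cached
+agent1.channels.ch-more1.capacity = 10000000
+memory channel queue size, the maximum number of messages that can be cached
- agent1.channels.ch-more1.keep-alive = 0
-
- agent1.channels.ch-more1.transactionCapacity = 20
- Maximum number of events processed in one atomic operation. The memory channel must be locked while in use, so batching is used to improve efficiency
+agent1.channels.ch-more1.keep-alive = 0
+
+agent1.channels.ch-more1.transactionCapacity = 20
+Maximum number of events processed in one atomic operation. The memory channel must be locked while in use, so batching is used to improve efficiency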
+```
file channel
- agent1.channels.ch-msg5.type = file
- file channel type
+```shell
+agent1.channels.ch-msg5.type = file
+file channel type
- agent1.channels.ch-msg5.capacity = 100000000
- Maximum number of messages that can be cached in a file channel
+agent1.channels.ch-msg5.capacity = 100000000
+Maximum number of messages that can be cached in a file channel
- agent1.channels.ch-msg5.maxFileSize = 1073741824
- Maximum size of a file channel file, in bytes
+agent1.channels.ch-msg5.maxFileSize = 1073741824
+Maximum size of a file channel file, in bytes
- agent1.channels.ch-msg5.minimumRequiredSpace = 1073741824
- Minimum free space on the disk where the file channel resides. Setting this value prevents the disk from filling up
+agent1.channels.ch-msg5.minimumRequiredSpace = 1073741824
+Minimum free space on the disk where the file channel resides. Setting this value prevents the disk from filling up
- agent1.channels.ch-msg5.checkpointDir = /data/work/file/ch-msg5/check
- file channel checkpoint path
+agent1.channels.ch-msg5.checkpointDir = /data/work/file/ch-msg5/check
+file channel checkpoint path
- agent1.channels.ch-msg5.dataDirs = /data/work/file/ch-msg5/data
- file channel data path
+agent1.channels.ch-msg5.dataDirs = /data/work/file/ch-msg5/data
+file channel data path
- agent1.channels.ch-msg5.fsyncPerTransaction = false
- Whether to sync to disk on every atomic operation. Setting it to false is recommended; otherwise performance suffers
+agent1.channels.ch-msg5.fsyncPerTransaction = false
+Whether to sync to disk on every atomic operation. Setting it to false is recommended; otherwise performance suffers
- agent1.channels.ch-msg5.fsyncInterval = 5
- Interval at which data is flushed from memory to disk, in seconds
+agent1.channels.ch-msg5.fsyncInterval = 5
+Interval at which data is flushed from memory to disk, in seconds
+```
Sink configuration example and corresponding notes
- agent1.sinks.meta-sink-more1.channel = ch-msg1
- The upstream channel name of the sink
+```shell
+agent1.sinks.meta-sink-more1.channel = ch-msg1
+The upstream channel name of the sink
+
+agent1.sinks.meta-sink-more1.type = org.apache.flume.sink.MetaSink
+Sink implementation class; this one pushes messages to the tube cluster
- agent1.sinks.meta-sink-more1.type = org.apache.flume.sink.MetaSink
- Sink implementation class; this one pushes messages to the tube cluster
+agent1.sinks.meta-sink-more1.master-host-port-list =
+tube cluster master node list
- agent1.sinks.meta-sink-more1.master-host-port-list =
- tube cluster master node list
+agent1.sinks.meta-sink-more1.send_timeout = 30000
+Timeout limit when sending to tube
- agent1.sinks.meta-sink-more1.send_timeout = 30000
- Timeout limit when sending to tube
+agent1.sinks.meta-sink-more1.stat-interval-sec = 60
+Sink metric statistics interval, in seconds
- agent1.sinks.meta-sink-more1.stat-interval-sec = 60
- Sink metric statistics interval, in seconds
+agent1.sinks.meta-sink-more1.thread-num = 8
+Number of worker threads the sink uses to send messages; 8 starts 8 concurrent threads
- agent1.sinks.meta-sink-more1.thread-num = 8
- Number of worker threads the sink uses to send messages; 8 starts 8 concurrent threads
+agent1.sinks.meta-sink-more1.client-id-cache = true
+Agent ID cache, used to deduplicate the data reported by the agent
- agent1.sinks.meta-sink-more1.client-id-cache = true
- Agent ID cache, used to deduplicate the data reported by the agent
+agent1.sinks.meta-sink-more1.max-survived-time = 300000
+Maximum cache time
- agent1.sinks.meta-sink-more1.max-survived-time = 300000
- Maximum cache time
+agent1.sinks.meta-sink-more1.max-survived-size = 3000000
+Maximum number of cached entries
+```
- agent1.sinks.meta-sink-more1.max-survived-size = 3000000
- Maximum number of cached entries
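+# 4、Monitor metrics configuration instructions
+
+ DataProxy provides a JMX-based monitor metric listener capability. Users can implement the MetricListener interface; once registered, the listener receives monitor metrics periodically, and the user chooses to report them to a user-defined monitoring system. Source and Sink modules can collect metric data into a subclass of org.apache.inlong.commons.config.metrics.MetricItemSet and register it with the MBeanServer. A user-defined MetricListener collects the metric data through JMX and reports it to an external monitoring system.
+
+ Users can add the following configuration to the configuration file common.properties, for example: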
+
+```shell
+metricDomains=DataProxy
+metricDomains.DataProxy.domainListeners=org.apache.inlong.dataproxy.metrics.prometheus.PrometheusMetricListener
+metricDomains.DataProxy.snapshotInterval=60000
+```
+
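+ * Unified JMX domain name: DataProxy, defined under the parameter metricDomains; user-defined Source, Sink, and other components can also report to different JMX domain names.
+ * The MetricListeners for a JMX domain are configured in the parameter metricDomains.$domainName.domainListeners; several can be configured, with class names separated by spaces.
+ * These MetricListeners need to implement the interface org.apache.inlong.dataproxy.metrics.MetricListener.
+ * Snapshot parameter: metricDomains.$domainName.snapshotInterval defines the interval between pulls of monitor metric data, in milliseconds.
+
+ The method prototype of the org.apache.inlong.dataproxy.metrics.MetricListener interface: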
+
+```java
+public void snapshot(String domain, List<MetricItemValue> itemValues);
+```
+
+ MetricItemValue.dimensions of a monitor metric item has the following dimensions (these fields of DataProxyMetricItem are defined with the annotation "@Dimension"):
+
+```shell
+clusterId: DataProxy cluster ID
+sourceId: DataProxy Source component name
+sourceDataId: DataProxy Source component data stream ID; if the Source is a TCPSource, this ID is a port number
+inlongGroupId: Inlong data ID
+inlongStreamId: Inlong data stream ID
+sinkId: DataProxy Sink component name
+sinkDataId: DataProxy Sink component data stream ID; if the Sink is a Pulsar sink component, this ID is a topic name.
+```
+
+ MetricItemValue.metrics of a monitor metric item has the following metrics (these fields of DataProxyMetricItem are defined with the annotation "@CountMetric"):
+
+```shell
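+readSuccessCount: Number of events received successfully
+readSuccessSize: Size received successfully, unit: byte
+readFailCount: Number of events that failed to be received
+readFailSize: Size that failed to be received, unit: byte
+sendCount: Number of events sent
+sendSize: Size sent, unit: byte
+sendSuccessCount: Number of events sent successfully
+sendSuccessSize: Size sent successfully, unit: byte
+sendFailCount: Number of events that failed to be sent
+sendFailSize: Size that failed to be sent, unit: byte
+sinkDuration: Time difference between the send-success callback time and the send start time, used to evaluate the processing latency and health of the target cluster, unit: millisecond
+nodeDuration: Time difference between the send-success callback time and the receive-success time, used to evaluate DataProxy's internal processing time and health, unit: millisecond
+wholeDuration: Time difference between the send-success callback time and the event generation time, unit: millisecond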
+```
+
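+ The monitor metrics are registered with the MBeanServer. Users can add JMX definitions like the following to DataProxy's startup parameters (adjust the port and authentication as needed) to collect the monitor metrics remotely.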
+
+```shell
+ -Dcom.sun.management.jmxremote
+ -Djava.rmi.server.hostname=127.0.0.1
+ -Dcom.sun.management.jmxremote.port=9999
+ -Dcom.sun.management.jmxremote.authenticate=false
+ -Dcom.sun.management.jmxremote.ssl=false
+```