qiaozhanwei opened a new issue #3370:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/3370
**Describe the feature**
JVM parameter optimization
**Is your feature request related to a problem? Please describe.**
With a 4G heap and the default young generation size, the worker has one
Minor GC and two Full GCs within 30s of starting:
S0C      S1C      S0U S1U     EC       EU       OC        OU  MC      MU      CCSC   CCSU   YGC YGCT  FGC FGCT  GCT
110720.0 110720.0 0.0 48623.7 886080.0 607260.9 3086784.0 0.0 42276.0 41076.2 5676.0 5418.5 1   0.066 2   0.305 0.370
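The sample above has the column layout that `jstat -gc` prints (capacities and usage in KB); that it was captured with jstat is an assumption. A minimal sketch that pairs each column with its value and derives two figures used in the discussion; `parse_gc_sample` is a hypothetical helper name.

```python
HEADER = "S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT"
VALUES = ("110720.0 110720.0 0.0 48623.7 886080.0 607260.9 3086784.0 0.0 "
          "42276.0 41076.2 5676.0 5418.5 1 0.066 2 0.305 0.370")


def parse_gc_sample(header: str, values: str) -> dict:
    """Pair each column name with its numeric value."""
    names = header.split()
    numbers = [float(v) for v in values.split()]
    if len(names) != len(numbers):
        raise ValueError("header and value rows have different lengths")
    return dict(zip(names, numbers))


sample = parse_gc_sample(HEADER, VALUES)

# Heap capacity = both survivor spaces + eden + old generation (KB -> MB).
heap_mb = (sample["S0C"] + sample["S1C"] + sample["EC"] + sample["OC"]) / 1024
# Share of the committed metaspace that is in use.
metaspace_used_pct = 100 * sample["MU"] / sample["MC"]

print(f"heap capacity: {heap_mb:.0f} MB")
print(f"metaspace used: {metaspace_used_pct:.1f}% of committed")
```

The computed heap capacity is 4096 MB, which agrees with the 4G heap named in the issue.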
**Describe the solution you'd like**
Reasoning:
If a full old generation had caused the Full GCs, the old generation usage
(OU) would be unlikely to be zero after a Full GC, so that cause is
improbable.
That leaves the metadata area (metaspace). Judging by the sample, MU is high,
so the size of the metaspace should be adjusted.
The metaspace mainly stores class description information and static
variables. If the application uses a lot of reflection, this area can be
increased accordingly.
256MB or 512MB is recommended. Considering the memory of users' machines, it
is set to the smaller 128MB for now:
-XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=128m
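The reasoning above can be written as a small check. This is only a sketch of the issue's heuristic, not a JVM rule: `full_gc_suspect` is a hypothetical helper, and the 21 MB value used for the untuned threshold is an assumption (the default `MetaspaceSize` is platform dependent).

```python
def full_gc_suspect(sample: dict, metaspace_size_mb: float) -> str:
    """Name the area that most plausibly triggered the Full GCs.

    sample holds jstat-style values in KB (OU, MU) plus the Full GC count FGC.
    metaspace_size_mb is the assumed -XX:MetaspaceSize threshold in MB.
    """
    if sample["FGC"] == 0:
        return "none"
    # A Full GC caused by a full old generation would rarely leave OU at zero.
    if sample["OU"] > 0:
        return "old generation not ruled out"
    # Metaspace usage above the initial threshold induces a collection.
    if sample["MU"] / 1024 > metaspace_size_mb:
        return "metaspace"
    return "unknown"


before = {"OU": 0.0, "MU": 41076.2, "FGC": 2}
print(full_gc_suspect(before, metaspace_size_mb=21))   # assumed default threshold
print(full_gc_suspect(before, metaspace_size_mb=128))  # threshold proposed in the issue
```

With about 40 MB of metaspace in use, the assumed default threshold is exceeded while the proposed 128m threshold is not, which fits the observation that the Full GCs disappear after the change.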
After restarting with this setting, there is no Full GC:
S0C      S1C      S0U S1U     EC        EU        OC        OU  MC      MU      CCSC   CCSU   YGC YGCT  FGC FGCT  GCT
209664.0 209664.0 0.0 51195.2 1677824.0 1584070.1 2097152.0 0.0 43136.0 41916.0 5760.0 5425.1 1   0.071 0   0.000 0.071
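To compare the two samples, the generation sizes can be derived from the capacity columns. A minimal sketch, assuming the values are in KB as in jstat-style output; `young_mb` is a hypothetical helper.

```python
KB_PER_MB = 1024

# Capacity columns copied from the samples before and after the change.
before = {"S0C": 110720.0, "S1C": 110720.0, "EC": 886080.0,
          "OC": 3086784.0, "FGC": 2, "GCT": 0.370}
after = {"S0C": 209664.0, "S1C": 209664.0, "EC": 1677824.0,
         "OC": 2097152.0, "FGC": 0, "GCT": 0.071}


def young_mb(sample: dict) -> float:
    """Young generation capacity: two survivor spaces plus eden, in MB."""
    return (sample["S0C"] + sample["S1C"] + sample["EC"]) / KB_PER_MB


for name, sample in (("before", before), ("after", after)):
    old_mb = sample["OC"] / KB_PER_MB
    print(f"{name}: young {young_mb(sample):.0f} MB, old {old_mb:.0f} MB, "
          f"Full GCs {sample['FGC']}, total GC time {sample['GCT']:.3f}s")
```

The second sample has a young generation of exactly 2048 MB, so it appears to have been taken with `-Xmn2g` as well as the larger metaspace; the two changes are therefore not measured separately.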
The worker starts 20 processes. Linux resident memory is 1.530g, an average
of about 78MB per process:
PID  USER PR NI VIRT    RES    SHR   S %CPU %MEM TIME+   COMMAND
7782 root 20 0  13.638g 1.530g 23908 S 0.0  1.2  0:56.66 java
Checking the JVM memory at this point shows no change.
Example:
--------------------------------------------------------------------------------------------
Start one process every 5s, with 20 tasks, each task sleeping 10s
t_ds_task_instance: 40 tasks are running
--------------------------------------------------------------------------------------------
4G heap memory
export DOLPHINSCHEDULER_OPTS="-server -Xms4g -Xmx4g -Xmn2g
-XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=128m -Xss512k -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:LargePageSizeInBytes=128m -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails -Xloggc:gc.log
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/oom"
The young generation grows by about 2MB per second.
A minor GC runs about once every 14 minutes. Survivor usage is 40~50MB, no
objects enter the old generation through dynamic age determination, and there
is no Full GC.
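The 14-minute figure can be cross-checked from the eden capacity and the allocation rate. A minimal sketch, assuming a constant allocation rate and that a minor GC starts when eden is full; `minor_gc_interval_s` is a hypothetical helper.

```python
def minor_gc_interval_s(eden_mb: float, alloc_mb_per_s: float) -> float:
    """Seconds needed to fill eden at a constant allocation rate."""
    if alloc_mb_per_s <= 0:
        raise ValueError("allocation rate must be positive")
    return eden_mb / alloc_mb_per_s


# EC from the post-tuning sample (1677824.0 KB), converted to MB.
eden_mb = 1677824.0 / 1024
interval_s = minor_gc_interval_s(eden_mb, alloc_mb_per_s=2.0)
print(f"eden {eden_mb:.1f} MB fills in {interval_s:.0f}s "
      f"({interval_s / 60:.1f} min)")
```

The estimate comes out slightly under 14 minutes, which agrees with the observed interval.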
--------------------------------------------------------------------------------------------
2G heap memory
export DOLPHINSCHEDULER_OPTS="-server -Xms2g -Xmx2g -Xmn1g
-XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=128m -Xss512k -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:LargePageSizeInBytes=128m -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails -Xloggc:gc.log
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/oom"
The young generation grows by about 2MB per second.
A minor GC runs about once every 7 minutes. Survivor usage is 40~50MB, no
objects enter the old generation through dynamic age determination, and there
is no Full GC.
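No GC sample is given for the 2G configuration, but eden can be estimated from `-Xmn`. A minimal sketch, assuming the HotSpot default `-XX:SurvivorRatio=8` (eden is eight times one survivor space), which matches the ratio of EC to S0C in the samples above; `eden_from_xmn_mb` is a hypothetical helper.

```python
def eden_from_xmn_mb(xmn_mb: float, survivor_ratio: int = 8) -> float:
    """Eden size when the young generation is eden plus two survivor spaces."""
    return xmn_mb * survivor_ratio / (survivor_ratio + 2)


ALLOC_MB_PER_S = 2.0  # allocation rate reported in the issue

for xmn_mb in (1024, 2048):  # -Xmn1g and -Xmn2g
    eden = eden_from_xmn_mb(xmn_mb)
    minutes = eden / ALLOC_MB_PER_S / 60
    print(f"-Xmn{xmn_mb}m: eden about {eden:.1f} MB, "
          f"minor GC about every {minutes:.1f} min")
```

Halving the young generation halves the estimated interval, consistent with the change from about 14 minutes to about 7 minutes.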
With JDBC result sets of 1000 rows (tested against the task instance table)
and an average of 4 task instance result sets, a young GC runs once every 30s.
Each young GC increases the old generation by about 1MB.
The young generation grows by about 30MB per second.
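The promotion rate gives a rough idea of how long the old generation lasts before CMS starts a cycle at the configured 70% occupancy. A sketch under loud assumptions: the old generation starts empty, nothing in it is reclaimed, and its capacity is 1024 MB (the 2G heap with `-Xmn1g`; the issue does not say which configuration this test used). `seconds_until_cms` is a hypothetical helper.

```python
def seconds_until_cms(old_capacity_mb: float, occupancy_fraction_pct: int,
                      promoted_mb_per_gc: float, gc_interval_s: float) -> float:
    """Time for promotions alone to fill the old generation to the CMS threshold."""
    threshold_mb = old_capacity_mb * occupancy_fraction_pct / 100
    young_gcs_needed = threshold_mb / promoted_mb_per_gc
    return young_gcs_needed * gc_interval_s


seconds = seconds_until_cms(
    old_capacity_mb=1024,        # assumed: 2G heap minus -Xmn1g
    occupancy_fraction_pct=70,   # -XX:CMSInitiatingOccupancyFraction=70
    promoted_mb_per_gc=1.0,      # old generation growth per young GC
    gc_interval_s=30.0,          # one young GC every 30s
)
print(f"about {seconds / 3600:.1f} hours until the first CMS cycle")
```

Under these assumptions the first CMS cycle would start after roughly six hours of this load.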
Master JVM optimization
--------------------------------------------------------------------------------------------
Start one process every 5s, with 20 tasks, each task sleeping 10s
t_ds_task_instance: 40 tasks are running
3~4 process instances
--------------------------------------------------------------------------------------------
The young generation grows by about 40MB per second.
A minor GC runs about once every 40s, and the old generation grows by about
1MB after each minor GC.
Start one process every 1s, with 20 tasks, each task sleeping 10s
t_ds_process_instance: 80 process instances
The young generation grows by about 500MB per second.
A minor GC runs about once every 3s, and the old generation grows by about
1MB after each minor GC.
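The two master load levels can be compared by the eden size they imply and by how fast the old generation grows. A minimal sketch using only the figures reported above; the scenario labels are hypothetical, and the master's heap flags are not listed in the issue.

```python
# (label, young-generation growth in MB/s, seconds between minor GCs,
#  old-generation growth in MB per minor GC)
scenarios = [
    ("one process every 5s", 40.0, 40.0, 1.0),
    ("one process every 1s", 500.0, 3.0, 1.0),
]

results = {}
for label, alloc_mb_per_s, gc_interval_s, promoted_mb in scenarios:
    # Memory allocated between two minor GCs approximates the eden size.
    implied_eden_mb = alloc_mb_per_s * gc_interval_s
    # Promotions per hour = minor GCs per hour * MB promoted per GC.
    old_growth_mb_per_h = 3600 / gc_interval_s * promoted_mb
    results[label] = (implied_eden_mb, old_growth_mb_per_h)
    print(f"{label}: implied eden {implied_eden_mb:.0f} MB, "
          f"old generation +{old_growth_mb_per_h:.0f} MB/h")
```

Both scenarios imply an eden of roughly 1.5 to 1.6 GB, so the two measurements are mutually consistent, while the heavier load fills the old generation more than ten times faster.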
Api Server
--------------------------------------------------------------------------------------------
export DOLPHINSCHEDULER_OPTS="-server -Xms1g -Xmx1g -Xmn500m
-XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=128m -Xss512k -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:LargePageSizeInBytes=128m -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails -Xloggc:gc.log
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/oom"
A minor GC occurs about once every 30 minutes.
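A long options string that is wrapped or edited by hand can easily lose a `+` or an `=value`. A minimal sketch that checks only the form of each token (boolean `-XX:+Name` / `-XX:-Name` or assignment `-XX:Name=value`), not whether the JVM actually knows the flag; `malformed_tokens` and the `broken` string are hypothetical illustrations.

```python
import re
import shlex

# -XX flags are either booleans (+Name / -Name) or assignments (Name=value).
XX_FLAG = re.compile(r"^-XX:(?:[+-]\w+|\w+=\S+)$")


def malformed_tokens(opts: str) -> list:
    """Return the tokens that do not look like well-formed JVM options."""
    bad = []
    for token in shlex.split(opts):
        if token.startswith("-XX:"):
            if not XX_FLAG.match(token):
                bad.append(token)
        elif not token.startswith("-"):
            # Every JVM option starts with a dash; a bare token is a split flag.
            bad.append(token)
    return bad


good = ("-server -Xms1g -Xmx1g -Xmn500m -XX:MetaspaceSize=128m "
        "-XX:+UseParNewGC -XX:LargePageSizeInBytes=128m "
        "-XX:+UseCMSInitiatingOccupancyOnly -Xloggc:gc.log "
        "-XX:HeapDumpPath=/opt/oom")
broken = "-server -XX:LargePageSize +UseCMSInitiatingOccupancyOnly"

print(malformed_tokens(good))
print(malformed_tokens(broken))
```

A check like this could run before exporting `DOLPHINSCHEDULER_OPTS`, since a malformed `-XX` option normally prevents the JVM from starting.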
Worker JVM optimization: starting configuration before tuning
export DOLPHINSCHEDULER_OPTS="-server -Xms4g -Xmx4g -Xss512k
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:LargePageSizeInBytes=128m -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails -Xloggc:gc.log
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dump.hprof"
Resulting sizes with the 4g heap:
s0 108MB
s1 108MB
eden 865MB
old 3014MB
metaspace 41MB
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]