bukeliyu opened a new issue, #10449: URL: https://github.com/apache/dolphinscheduler/issues/10449
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

When DolphinScheduler schedules on the order of 100,000 workflows per day (over a million tasks) with MySQL as the metadata database, the load on MySQL becomes excessive.

MySQL was initially provisioned with 2 cores and 4 GB of memory, and its CPU usage showed frequent spikes. After MySQL was upgraded to 8 cores and 16 GB, CPU usage quickly reached 100% (see the monitoring screenshots in the original issue).

### What you expected to happen

Investigation showed that three tables, t_ds_task_instance, t_ds_process_instance, and t_ds_relation_process_instance, hold too much data, and there is no configuration option for deleting it.

The main cause of the high MySQL CPU usage is a large number of identical queries against t_ds_relation_process_instance, of the following form:

select id, parent_process_instance_id, parent_task_instance_id, process_instance_id from t_ds_relation_process_instance where parent_process_instance_id = 667735 and parent_task_instance_id = 2454593;

This table has no suitable index, so every such query scans the whole table. Adding an index to this table would relieve much of the pressure on MySQL.

### How to reproduce

Leave workflow and task parallelism at their default values, and use MySQL as the metadata database. Schedule several hundred workflows per hour, each containing dozens of tasks. After running for a while, MySQL CPU usage is saturated.

### Anything else

_No response_

### Version

2.0.5

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
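
A minimal sketch of the index proposed under "What you expected to happen", assuming a composite index over the two columns in the query's WHERE clause. The index name `idx_parent_process_task` is hypothetical and not taken from the project's schema; whether a single-column index would be enough depends on the data distribution, which this issue does not describe.

```sql
-- Hypothetical index name. The columns are the two equality predicates
-- used by the query reported in this issue.
CREATE INDEX idx_parent_process_task
    ON t_ds_relation_process_instance (parent_process_instance_id, parent_task_instance_id);

-- Inspect the execution plan for the reported query before and after
-- creating the index.
EXPLAIN
SELECT id, parent_process_instance_id, parent_task_instance_id, process_instance_id
FROM t_ds_relation_process_instance
WHERE parent_process_instance_id = 667735
  AND parent_task_instance_id = 2454593;
```

If the optimizer uses the index, the `type` column of the `EXPLAIN` output should change from `ALL` (full table scan) to `ref`, and the `key` column should name the new index.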
