github-actions[bot] commented on issue #10449:
URL: 
https://github.com/apache/dolphinscheduler/issues/10449#issuecomment-1155958748

   ### Search before asking
   
   - [X] I have searched the 
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   We use DolphinScheduler to schedule on the order of 100,000 workflows per 
day, which produces millions of tasks. With MySQL as the metadata database, 
the load on MySQL becomes too high.
   As shown in the figure below, MySQL was initially configured with 2 cores 
and 4 GB of memory, and CPU usage was very high.
   We then upgraded MySQL to 8 cores and 16 GB, but CPU usage soon reached 100%.
   
![image](https://user-images.githubusercontent.com/23436224/173733315-ddd21919-afb9-4d9e-b36c-41447a8e9d8a.png)
   
   
   ### What you expected to happen
   
   After troubleshooting, we found that three tables, t_ds_task_instance, 
t_ds_process_instance, and t_ds_relation_process_instance, had grown too 
large, and there was no configured way to delete old data.
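   
   One way to check how large these tables have grown is to query 
information_schema. This is a minimal sketch, assuming the DolphinScheduler 
schema is the currently selected database; the `total_mb` alias is only 
illustrative.
   
   ```sql
   -- TABLE_ROWS is only an estimate for InnoDB tables; COUNT(*) is exact
   -- but can be slow on tables this large.
   SELECT table_name,
          table_rows,
          ROUND((data_length + index_length) / 1024 / 1024) AS total_mb
     FROM information_schema.tables
    WHERE table_schema = DATABASE()
      AND table_name IN ('t_ds_task_instance',
                         't_ds_process_instance',
                         't_ds_relation_process_instance');
   ```
   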
   The high MySQL CPU usage is caused by a large number of identical queries 
against the t_ds_relation_process_instance table. The statement is as 
follows:
   select id, parent_process_instance_id, parent_task_instance_id, 
process_instance_id from t_ds_relation_process_instance where 
parent_process_instance_id = 667735 and parent_task_instance_id = 2454593;
   
   The table has no suitable index for this query, so every execution scans 
the full table.
   Adding an index on the columns used in the WHERE clause 
(parent_process_instance_id and parent_task_instance_id) should greatly 
relieve the pressure on MySQL.
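   
   A possible form of that index is sketched below. The index name 
`idx_parent_process_task` is hypothetical, and the column order is an 
assumption based only on the query shown in this issue; it is not taken from 
the project's schema scripts.
   
   ```sql
   -- Hypothetical index name; the columns match the WHERE clause of the
   -- hot query. Building an index on a large table costs time and I/O,
   -- so it is worth trying on a copy or during a quiet period first.
   CREATE INDEX idx_parent_process_task
       ON t_ds_relation_process_instance
          (parent_process_instance_id, parent_task_instance_id);

   -- Compare the query plan before and after creating the index.
   EXPLAIN
   SELECT id, parent_process_instance_id, parent_task_instance_id,
          process_instance_id
     FROM t_ds_relation_process_instance
    WHERE parent_process_instance_id = 667735
      AND parent_task_instance_id = 2454593;
   ```
   
   If the index is picked up, the `type` column in the EXPLAIN output should 
change from `ALL` (full table scan) to `ref`, and `key` should show the new 
index name.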
   
![image](https://user-images.githubusercontent.com/23436224/173734401-adedb8c0-bbdc-4bc2-8b9a-0ee246e32060.png)
   
   
   
   ### How to reproduce
   
   Leave workflow and task parallelism at their default values and use MySQL 
as the metadata database.
   Schedule hundreds of workflows per hour, each with dozens of tasks.
   After running for a while, MySQL CPU usage reaches 100%.
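   
   To confirm which statement dominates once the CPU is saturated, the 
statement digest summary can be queried. This is a sketch that assumes 
performance_schema is enabled on the MySQL instance; the column aliases are 
only illustrative.
   
   ```sql
   -- Top statements by total execution time. SUM_TIMER_WAIT is reported
   -- in picoseconds, so dividing by 1e12 gives seconds.
   SELECT digest_text,
          count_star                      AS executions,
          sum_rows_examined               AS rows_examined,
          ROUND(sum_timer_wait / 1e12, 1) AS total_seconds
     FROM performance_schema.events_statements_summary_by_digest
    ORDER BY sum_timer_wait DESC
    LIMIT 10;
   ```
   
   A statement with a very high `rows_examined` relative to `executions` is 
a likely full-table-scan candidate.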
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   2.0.5
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
