p1ne commented on issue #8448: URL: https://github.com/apache/incubator-devlake/issues/8448#issuecomment-2908796317
I would correct the bottleneck statement. We run a large-scale enterprise DevLake installation with GitLab, Jira, and Sonar connections: 80+ projects with 10-100 repositories each. A full collection cycle takes approximately 2 weeks, where "full collection" does not mean "full data gathering"; it is just a sequential run of all blueprints, with incremental data collection where applicable. Even so, a large project may take 2-12 hours. RPS tweaks, increased database resources, and other optimizations helped a little, but the cycle is still long. From my point of view, the main bottleneck is the single runner, which just walks all projects one by one. Parallel runners may be more effective.
