GitHub user dosubot[bot] added a comment to the discussion: Question regarding 
scaling options.

Hi @DucTuLeAnh! I'm [Dosu](https://go.dosu.dev/dosubot) and I’m helping the 
incubator-devlake team.

You're correct: DevLake is designed as a [single-instance application](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/devops/deployment/k8s/k8s-deploy.yaml) and doesn't support horizontal scaling by increasing replicas. All blueprints are [processed sequentially by a single runner](https://github.com/apache/incubator-devlake/issues/8448#issuecomment-2908796317), which becomes the primary bottleneck at enterprise scale.

For deployments approaching 1000+ GitLab repositories, community reports describe significant challenges:
- [Full collection cycles taking ~2 
weeks](https://github.com/apache/incubator-devlake/issues/8448#issuecomment-2908796317)
 for 80+ projects
- [MySQL hitting 100% CPU at ~1,000 
projects](https://github.com/apache/incubator-devlake/issues/8260) even with 16 
cores / 64 GB RAM
- [Raw tables growing to 21+ 
GB](https://github.com/apache/incubator-devlake/issues/8523) with 2000 repos 
due to user data duplication
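
As a rough back-of-envelope reading of the first report above, the sketch below extrapolates the "~2 weeks for 80+ projects" figure to larger project counts. It assumes a single sequential runner whose cycle time grows linearly with the number of projects; that linearity is an assumption made here for illustration, not something the linked issues state, and `estimate_cycle_days` is a hypothetical helper, not part of DevLake.

```python
def estimate_cycle_days(projects: int,
                        ref_projects: int = 80,
                        ref_days: float = 14.0) -> float:
    """Linearly extrapolate a full collection cycle from one reference point.

    Assumes one sequential runner and a constant per-project cost. Real cycles
    also depend on repository size, API rate limits, and database load.
    """
    if projects < 0 or ref_projects <= 0:
        raise ValueError("projects must be >= 0 and ref_projects must be > 0")
    # With sequential processing, total time is the sum of per-project times.
    return ref_days * projects / ref_projects


if __name__ == "__main__":
    for n in (80, 500, 1000):
        print(f"{n} projects: ~{estimate_cycle_days(n):.0f} days (linear estimate)")
```

The numbers this produces are estimates under the stated assumption, not measurements. The point is only that with one sequential runner the total time is additive across projects, so per-project savings matter a great deal at this scale.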

**What helps at scale:**
1. **Database indexes:** Community-tested indexes [reduced DORA calculations from hours to minutes](https://github.com/apache/incubator-devlake/issues/8260).
2. **GitLab-specific optimization:** Set `GITLAB_SERVER_COLLECT_ALL_USERS=false` to collect members per project instead of globally ([reduced collection time by ~70%](https://github.com/apache/incubator-devlake/issues/8283)).
3. **Memory allocation:** A [minimum of 6 GB of container memory](https://github.com/apache/incubator-devlake/discussions/8576#discussioncomment-14401772) is recommended.
4. **Collection strategy:** Use incremental collection, narrow the date ranges, and consider setting `SKIP_COMMIT_STAT=true` and/or `SKIP_COMMIT_FILES=true`.
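
The three settings named in the list can be applied together. The sketch below assumes they live as `KEY=VALUE` lines in an env-style configuration file (the file format is an assumption here, so check how your deployment passes environment variables), and `apply_overrides` is a hypothetical helper written for this example, not a DevLake API. It operates on text only and touches no files.

```python
from typing import Dict

# Settings and values taken from the list above.
OVERRIDES: Dict[str, str] = {
    "GITLAB_SERVER_COLLECT_ALL_USERS": "false",
    "SKIP_COMMIT_STAT": "true",
    "SKIP_COMMIT_FILES": "true",
}


def apply_overrides(env_text: str, overrides: Dict[str, str]) -> str:
    """Replace existing KEY=VALUE assignments and append the missing ones.

    Comment lines and unrelated settings are passed through unchanged.
    """
    seen = set()
    out = []
    for line in env_text.splitlines():
        stripped = line.strip()
        key = stripped.split("=", 1)[0].strip()
        is_assignment = "=" in stripped and not stripped.startswith("#")
        if is_assignment and key in overrides:
            out.append(f"{key}={overrides[key]}")
            seen.add(key)
        else:
            out.append(line)
    # Append any override that was not already present in the file.
    out.extend(f"{k}={v}" for k, v in overrides.items() if k not in seen)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Placeholder content standing in for an existing configuration file.
    sample = "LOG_LEVEL=info\nSKIP_COMMIT_STAT=false\n"
    print(apply_overrides(sample, OVERRIDES), end="")
```

Setting the values this way keeps the change reviewable as a small diff, which helps when comparing collection times before and after.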

The honest answer is that at 1000+ repositories, you'll likely face performance 
challenges even with optimizations. Parallel runners are the [most requested 
architectural 
enhancement](https://github.com/apache/incubator-devlake/issues/8448#issuecomment-2908796317)
 for this reason.



GitHub link: 
https://github.com/apache/incubator-devlake/discussions/8802#discussioncomment-16276817
