dosubot[bot] commented on issue #8523: URL: https://github.com/apache/incubator-devlake/issues/8523#issuecomment-3148856824
This is a known issue: the GitLab plugin collects user data on a per-project basis, so when you integrate many repositories, the `_raw_gitlab_api_users` table accumulates a large number of duplicate records and can grow very large. This behavior is expected given the current design: raw tables are meant to store all occurrences for traceability, and deduplication only happens later in the tool layer, not in the raw layer. There is an open discussion and a proposed solution to collect user accounts only once per pipeline, which would reduce duplication and table size, but it hasn't been merged yet. You can follow the details and progress in [issue #8283](https://github.com/apache/incubator-devlake/issues/8283). For now, periodic cleanup or manual deduplication may be needed if storage is a concern.
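
If you go the manual deduplication route, the idea can be sketched roughly as follows. This is only an illustration, not DevLake's own cleanup procedure: it runs against an in-memory SQLite stand-in rather than a real DevLake database, and it assumes the raw table has `id`, `params`, and `data` columns, that `params` is JSON containing a `ConnectionId`, and that `data` is the GitLab user JSON with an `id` field. The helper names `make_demo_db` and `dedupe_raw_users` are hypothetical.

```python
import json
import sqlite3

TABLE = "_raw_gitlab_api_users"  # table name taken from the issue


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the raw table (column layout is an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {TABLE} (id INTEGER PRIMARY KEY, params TEXT, data TEXT)"
    )
    rows = [
        # The same user collected once per project, as described above.
        ({"ConnectionId": 1, "ProjectId": 10}, {"id": 7, "username": "alice"}),
        ({"ConnectionId": 1, "ProjectId": 11}, {"id": 7, "username": "alice"}),
        ({"ConnectionId": 1, "ProjectId": 11}, {"id": 8, "username": "bob"}),
        # Same user id under a different connection: treated as a separate record.
        ({"ConnectionId": 2, "ProjectId": 10}, {"id": 7, "username": "alice"}),
    ]
    conn.executemany(
        f"INSERT INTO {TABLE} (params, data) VALUES (?, ?)",
        [(json.dumps(p), json.dumps(d)) for p, d in rows],
    )
    conn.commit()
    return conn


def dedupe_raw_users(conn: sqlite3.Connection) -> int:
    """Keep the lowest-id row per (connection, user id); return the number of rows deleted."""
    seen = set()
    duplicate_ids = []
    query = f"SELECT id, params, data FROM {TABLE} ORDER BY id"
    for row_id, params, data in conn.execute(query).fetchall():
        key = (json.loads(params).get("ConnectionId"), json.loads(data).get("id"))
        if key in seen:
            duplicate_ids.append((row_id,))
        else:
            seen.add(key)
    conn.executemany(f"DELETE FROM {TABLE} WHERE id = ?", duplicate_ids)
    conn.commit()
    return len(duplicate_ids)


if __name__ == "__main__":
    demo = make_demo_db()
    print("rows deleted:", dedupe_raw_users(demo))
```

Since raw tables exist for traceability, deleting rows from them trades that traceability for storage. Back up the table first, and check your own schema and pipeline behavior before running anything like this against a real database.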
--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org