dosubot[bot] commented on issue #8523:
URL: 
https://github.com/apache/incubator-devlake/issues/8523#issuecomment-3148856824

   This is a known issue. The GitLab plugin collects user data on a per-project
basis, so when you integrate many repositories, the same users are fetched
repeatedly and the `_raw_gitlab_api_users` table accumulates a large number of
duplicate records. This is expected under the current design: raw tables store
every collected record for traceability, and deduplication happens later, in
the tool layer, not in the raw layer. There is a proposal to collect user
accounts only once per pipeline, which would reduce duplication and table size,
but it has not been merged yet. You can follow the discussion and progress in
[issue #8283](https://github.com/apache/incubator-devlake/issues/8283). For now,
periodic cleanup or manual deduplication of the raw table may be needed if
storage is a concern.
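
As a rough illustration of what manual deduplication could look like, here is a minimal Python sketch. It is not DevLake code: it runs against an in-memory SQLite table and assumes the raw table has `id`, `params`, and `data` columns, with `data` holding the GitLab user JSON, whose `id` field identifies the user. The helper names `build_demo_db` and `dedupe_raw_users` are hypothetical. Check the real schema first, and back up the table before deleting anything, since later stages may depend on which project a raw row belongs to.

```python
import json
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for _raw_gitlab_api_users (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE _raw_gitlab_api_users ("
        "id INTEGER PRIMARY KEY, params TEXT, data TEXT)"
    )
    # The same user (id 10) is collected once per project, as described above.
    sample = [
        ('{"ProjectId": 1}', '{"id": 10, "username": "alice"}'),
        ('{"ProjectId": 2}', '{"id": 10, "username": "alice"}'),
        ('{"ProjectId": 2}', '{"id": 11, "username": "bob"}'),
        ('{"ProjectId": 3}', '{"id": 10, "username": "alice"}'),
    ]
    conn.executemany(
        "INSERT INTO _raw_gitlab_api_users (params, data) VALUES (?, ?)", sample
    )
    conn.commit()
    return conn


def dedupe_raw_users(conn: sqlite3.Connection) -> int:
    """Keep the first row per GitLab user id, delete the rest, return rows removed."""
    rows = conn.execute(
        "SELECT id, data FROM _raw_gitlab_api_users ORDER BY id"
    ).fetchall()
    seen_user_ids = set()
    duplicate_row_ids = []
    for row_id, data in rows:
        user_id = json.loads(data)["id"]
        if user_id in seen_user_ids:
            duplicate_row_ids.append((row_id,))
        else:
            seen_user_ids.add(user_id)
    conn.executemany(
        "DELETE FROM _raw_gitlab_api_users WHERE id = ?", duplicate_row_ids
    )
    conn.commit()
    return len(duplicate_row_ids)


if __name__ == "__main__":
    demo = build_demo_db()
    removed = dedupe_raw_users(demo)
    print(f"removed {removed} duplicate raw rows")
```

On a real deployment the same idea would need to be adapted to the database DevLake is actually using, and a large table would be better handled in batches than by loading every row into memory.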
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
