dosubot[bot] commented on issue #8450: URL: https://github.com/apache/incubator-devlake/issues/8450#issuecomment-2909019004
To filter out the collection of certain GitLab "product warehouses" (repositories or projects) that take a long time to collect in Apache DevLake, you can use the configuration options provided in the GitLab plugin. The key settings to look at are:

- **dataScope**: This section lets you search for and select the specific GitLab projects or subgroups to include in or exclude from data collection. By selecting only the projects you want, you can avoid collecting from the "pure product warehouses" that are slow to process. The UI provides a Miller-column browser for navigating and filtering projects and subgroups, which makes it easier to select or deselect them as needed. [See config reference](https://github.com/apache/incubator-devlake/blob/main/config-ui/src/plugins/register/gitlab/config.tsx)
- **scopeConfig**: This section lets you define which entities to collect (such as CODE, TICKET, and CODEREVIEW) and apply transformation patterns. You can use these settings to further refine what data is collected from each selected project. [See config reference](https://github.com/apache/incubator-devlake/blob/main/config-ui/src/plugins/register/gitlab/config.tsx)
- **Incremental Collection**: If you still need to collect from some large projects but want to reduce the time, consider enabling incremental collection. It fetches only new or updated data instead of collecting everything from scratch each time. [See more about collectors and incremental collection](https://github.com/apache/incubator-devlake/blob/main/backend/DevelopmentManual.md)

**Note:** The term "pure product warehouses" is not standard in DevLake or GitLab documentation. If you mean specific types of repositories or projects, you will need to identify them by name or pattern and exclude them during the project selection step in the dataScope configuration.

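
Identifying projects "by name or pattern" can be done outside DevLake before you make the selection in the data scope step. The following is a minimal sketch, not a DevLake API: the function name `select_projects`, the project paths, and the exclusion pattern are all hypothetical, and the resulting list is only meant as a checklist of projects to keep selected in the UI.

```python
from fnmatch import fnmatchcase


def select_projects(project_paths, exclude_patterns):
    """Return the project paths that match none of the exclude patterns.

    Patterns use shell-style globs; note that '*' also matches '/'.
    """
    return [
        path
        for path in project_paths
        if not any(fnmatchcase(path, pattern) for pattern in exclude_patterns)
    ]


# Hypothetical GitLab project paths (assumption, not taken from the issue).
projects = [
    "acme/platform/api",
    "acme/products/warehouse-a",
    "acme/products/warehouse-b",
    "acme/tools/cli",
]

# Hypothetical pattern for the slow "product warehouse" projects.
exclude = ["acme/products/*"]

selected = select_projects(projects, exclude)
print(selected)
```

The projects left in `selected` are the ones to keep in the data scope; everything matched by an exclusion pattern would be deselected by hand.
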
If you need more granular filtering (for example, by project naming patterns), you may need to use transformation patterns in the scopeConfig or manually select and deselect projects during onboarding and configuration. If you have a specific example of what you mean by "pure product warehouses," I can provide more targeted configuration guidance.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org