dosubot[bot] commented on issue #8450:
URL: https://github.com/apache/incubator-devlake/issues/8450#issuecomment-2909019004

   To skip GitLab projects that take a long time to collect in Apache DevLake 
(the "product warehouses" in your question, which I take to mean repositories 
or projects), use the configuration options provided by the GitLab plugin. 
   
   The key settings to look at are:
   
   - **dataScope**: This section lets you search for and select the specific 
GitLab projects or subgroups to collect. By selecting only the projects you 
want, you avoid collecting from the "pure product warehouses" that are slow to 
process. The UI provides Miller columns for browsing and filtering projects and 
subgroups, which makes it easier to select or deselect them as needed.  
     [See config 
reference](https://github.com/apache/incubator-devlake/blob/main/config-ui/src/plugins/register/gitlab/config.tsx)
   
   - **scopeConfig**: This section lets you choose which entities to collect 
(such as CODE, TICKET, CODEREVIEW, etc.) and apply transformation patterns. 
Collecting fewer entities reduces the amount of data fetched from each selected 
project.  
     [See config 
reference](https://github.com/apache/incubator-devlake/blob/main/config-ui/src/plugins/register/gitlab/config.tsx)
   
   - **Incremental Collection**: If you still need to collect from some large 
projects but want to reduce the time, consider enabling incremental collection. 
This will only fetch new or updated data, rather than collecting everything 
from scratch each time.  
     [See more about collectors and incremental 
collection](https://github.com/apache/incubator-devlake/blob/main/backend/DevelopmentManual.md)
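   
   As a rough illustration of the idea behind incremental collection, the 
sketch below keeps only records updated since the previous run. It is a 
conceptual example, not DevLake's collector code: the `incremental_batch` 
function, the record layout, and the timestamps are all assumptions made for 
this illustration.

```python
from datetime import datetime, timezone


def incremental_batch(records, last_collected_at):
    """Keep only records updated after the previous collection finished.

    Hypothetical helper: `records` is a list of dicts with an `updated_at`
    datetime, and `last_collected_at` is None on the very first run.
    """
    if last_collected_at is None:
        # First run: nothing has been collected yet, so take everything.
        return list(records)
    return [r for r in records if r["updated_at"] > last_collected_at]


# Made-up sample data standing in for merge requests or issues.
records = [
    {"id": 1, "updated_at": datetime(2025, 5, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2025, 5, 20, tzinfo=timezone.utc)},
    {"id": 3, "updated_at": datetime(2025, 5, 25, tzinfo=timezone.utc)},
]
last_run = datetime(2025, 5, 15, tzinfo=timezone.utc)

new_ids = [r["id"] for r in incremental_batch(records, last_run)]
print(new_ids)  # [2, 3]
```

   The first run still has to fetch everything; the saving only shows up on 
later runs, when most records in a large project have not changed.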
   
   **Note:** The term "pure product warehouses" is not standard in DevLake or 
GitLab documentation. If you mean specific types of repositories or projects, 
you will need to identify them by name or pattern and exclude them during the 
project selection step in the dataScope configuration.
   
   If you need more granular filtering (for example, by project naming 
patterns), you may need to use transformation patterns in the scopeConfig or 
manually select/deselect projects during the onboarding/configuration process.
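   
   If the slow repositories share a naming convention, one way to prepare the 
selection is to filter the project list by pattern before choosing projects in 
the dataScope step. The sketch below is a standalone illustration and does not 
call any DevLake or GitLab API: the project paths, the `*/product-*` pattern, 
and the `select_projects` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical project paths, as they might appear in the project picker.
projects = [
    "acme/web-frontend",
    "acme/product-artifacts",
    "acme/product-binaries",
    "acme/api-service",
]

# Hypothetical glob patterns describing the slow repositories to skip.
exclude_patterns = ["*/product-*"]


def select_projects(paths, patterns):
    """Return the paths that match none of the exclusion patterns."""
    return [
        path
        for path in paths
        if not any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


selected = select_projects(projects, exclude_patterns)
print(selected)  # ['acme/web-frontend', 'acme/api-service']
```

   The resulting list is the set of projects to tick in the project picker; 
everything matching an exclusion pattern is simply left unselected.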
   
   If you have a specific example of what you mean by "pure product 
warehouses," I can provide more targeted configuration guidance.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
