hippojay opened a new issue, #8283:
URL: https://github.com/apache/incubator-devlake/issues/8283

   ## What and why to refactor
   The GitLab plugin collects account information for users 
(account_collector.go).  For GitLab.com this is done on a per-project basis 
and runs once for each GitLab repository collection.
   
   For on-premise instances, there is a test which allows the plugin to use the 
global /users API endpoint.  However, this results in duplicated operations for 
each repository (data scope) in the project.  E.g. for a DevLake project with 
20 data scopes, the account information is gathered, extracted and converted 
20 times, and 19 of those runs repeat the same data.
   
   User collection on a large user base (7000 users) takes 3 min 30 seconds for 
collection, extraction and conversion per stage.
   
   ## Describe the solution you'd like
   Ideally, account collection for on-premise instances should be a single 
GitLab stage that is added to the pipeline.  Alternatively, it could be added 
to the first collection stage as a subtask and left out of the other stages, 
but this makes that collection less visible.
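
   The first option could look roughly like the sketch below. It is only an 
illustration of the idea, not DevLake's actual plan-building code: `Scope`, 
`Stage`, `BuildPlan`, `CountAccountRuns` and the subtask names are all 
hypothetical stand-ins. It assumes the plan is a list of stages, and that the 
account subtasks can be moved into one dedicated stage when the global /users 
endpoint is in use.

```go
package main

import "fmt"

// Scope is a hypothetical stand-in for a DevLake data scope (one GitLab repository).
type Scope struct {
	Name string
}

// Stage is a hypothetical pipeline stage: a name plus the subtasks it runs.
type Stage struct {
	Name     string
	Subtasks []string
}

// BuildPlan returns one stage per scope. For on-premise instances (global
// /users endpoint) the account subtasks go into a single dedicated stage at
// the front of the plan; otherwise every scope keeps its own account subtasks.
func BuildPlan(scopes []Scope, onPremise bool) []Stage {
	// Subtask names are assumptions made for this sketch.
	accountTasks := []string{"collectAccounts", "extractAccounts", "convertAccounts"}
	plan := make([]Stage, 0, len(scopes)+1)
	if onPremise && len(scopes) > 0 {
		plan = append(plan, Stage{Name: "accounts", Subtasks: accountTasks})
	}
	for _, s := range scopes {
		tasks := []string{"collectRepoData"}
		if !onPremise {
			tasks = append(tasks, accountTasks...)
		}
		plan = append(plan, Stage{Name: s.Name, Subtasks: tasks})
	}
	return plan
}

// CountAccountRuns reports how many stages in the plan collect accounts.
func CountAccountRuns(plan []Stage) int {
	n := 0
	for _, st := range plan {
		for _, t := range st.Subtasks {
			if t == "collectAccounts" {
				n++
				break
			}
		}
	}
	return n
}

func main() {
	scopes := make([]Scope, 20)
	for i := range scopes {
		scopes[i] = Scope{Name: fmt.Sprintf("repo-%d", i+1)}
	}
	fmt.Println(CountAccountRuns(BuildPlan(scopes, true)))  // prints 1
	fmt.Println(CountAccountRuns(BuildPlan(scopes, false))) // prints 20
}
```

   With 20 data scopes the dedicated stage runs the account subtasks once 
instead of 20 times, and the stage stays visible in the pipeline as its own 
entry.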

