sstojak1 opened a new issue, #8066:
URL: https://github.com/apache/incubator-devlake/issues/8066

   ## What and why to refactor
   What are you trying to refactor? Why should it be refactored now?
   The [accounts_collector](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/sonarqube/tasks/accounts_collector.go) adds duplicate rows to the raw table on every run because it uses `StatefulApiCollector`. Since this collector is straightforward (it imports all accounts), we should switch to `ApiCollector`, which deletes the old data before importing new records for each project. This will prevent duplicates in the raw table.
   Why is this an issue for us:
   1. The `ExtractAccounts` job, which runs right after, takes longer because the raw table grows larger with each pipeline run.
   2. The raw table size is becoming excessive, currently exceeding 8GB.
   ## Describe the solution you'd like
   How to refactor?
   Switch from StatefulApiCollector to the simple ApiCollector.
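
To illustrate why the switch removes the duplicates, here is a minimal sketch of the two write strategies described above. It does not use the DevLake API: `rawTable`, `rawRow`, `appendCollect`, and `replaceCollect` are hypothetical names that model a raw table in memory, assuming the current collector appends on every run and the proposed one deletes a project's old rows first.

```go
package main

import "fmt"

// rawRow is a hypothetical stand-in for one record in the raw table.
type rawRow struct {
	Project string
	Account string
}

// rawTable is a hypothetical in-memory model of the raw table.
type rawTable struct {
	rows []rawRow
}

// appendCollect models the behavior reported in this issue:
// each run appends everything it fetched, so repeated runs duplicate rows.
func (t *rawTable) appendCollect(project string, accounts []string) {
	for _, a := range accounts {
		t.rows = append(t.rows, rawRow{Project: project, Account: a})
	}
}

// replaceCollect models the proposed behavior: drop the project's old rows,
// then insert the freshly fetched ones.
func (t *rawTable) replaceCollect(project string, accounts []string) {
	kept := t.rows[:0] // filter in place, keeping rows of other projects
	for _, r := range t.rows {
		if r.Project != project {
			kept = append(kept, r)
		}
	}
	t.rows = kept
	t.appendCollect(project, accounts)
}

func main() {
	accounts := []string{"alice", "bob"}
	var appended, replaced rawTable

	// Simulate three pipeline runs for the same project.
	for run := 0; run < 3; run++ {
		appended.appendCollect("proj-1", accounts)
		replaced.replaceCollect("proj-1", accounts)
	}

	fmt.Println(len(appended.rows), len(replaced.rows)) // prints "6 2"
}
```

With the append strategy the row count grows linearly with the number of runs, which matches the growth reported here; with delete-then-insert it stays bounded by the number of accounts per project.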
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
