sstojak1 opened a new issue, #8066: URL: https://github.com/apache/incubator-devlake/issues/8066
## What and why to refactor

The [accounts_collector](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/sonarqube/tasks/accounts_collector.go) adds duplicate data to the raw table on every run because it uses `StatefulApiCollector`. Since this collector is straightforward (it imports all accounts), we should switch to `ApiCollector`, which deletes the old data before importing new records for each project. This will prevent duplicates in the raw table.

Why this is an issue for us:

1. The `ExtractAccounts` job, which runs right after, takes longer because the raw table grows larger with each pipeline run.
2. The raw table size is becoming excessive, currently exceeding 8 GB.

## Describe the solution you'd like

Switch from `StatefulApiCollector` to the simple `ApiCollector`.
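To illustrate the growth described above, here is a minimal in-memory sketch of the two behaviours as this issue describes them: a collector that only appends on each run versus one that deletes a project's old rows before inserting. It does not use DevLake's helper API; `rawTable`, `appendOnly`, `replaceForProject`, and `simulate` are hypothetical names made up for this example, and the three account names are placeholder data.

```go
package main

import "fmt"

// rawRow is a hypothetical stand-in for one row of a raw table.
type rawRow struct {
	Project string
	Payload string
}

// rawTable is an in-memory stand-in for a raw table (assumption, not DevLake code).
type rawTable struct {
	rows []rawRow
}

// appendOnly models a collection run that never clears earlier rows.
func (t *rawTable) appendOnly(project string, payloads []string) {
	for _, p := range payloads {
		t.rows = append(t.rows, rawRow{Project: project, Payload: p})
	}
}

// replaceForProject models a collection run that first deletes the
// project's existing rows and then inserts the fresh ones.
func (t *rawTable) replaceForProject(project string, payloads []string) {
	kept := t.rows[:0]
	for _, r := range t.rows {
		if r.Project != project {
			kept = append(kept, r)
		}
	}
	t.rows = kept
	t.appendOnly(project, payloads)
}

// simulate runs the pipeline `runs` times against the same three accounts
// and returns the resulting raw table size.
func simulate(runs int, replace bool) int {
	accounts := []string{"alice", "bob", "carol"} // placeholder data
	t := &rawTable{}
	for i := 0; i < runs; i++ {
		if replace {
			t.replaceForProject("proj-a", accounts)
		} else {
			t.appendOnly("proj-a", accounts)
		}
	}
	return len(t.rows)
}

func main() {
	fmt.Println("append-only after 5 runs:", simulate(5, false))
	fmt.Println("delete-then-insert after 5 runs:", simulate(5, true))
}
```

With the append-only behaviour the row count grows linearly with the number of pipeline runs (15 rows after 5 runs here), while delete-then-insert stays at the number of accounts (3 rows), which is the effect we expect from the switch.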
