dmora opened a new issue, #8646:
URL: https://github.com/apache/incubator-devlake/issues/8646
## Search before asking
- [x] I searched the issues and found no similar feature request
## Use case
**Problem**: Bot accounts (dependabot, github-actions, renovate, etc.)
significantly skew DevLake metrics, particularly DORA calculations.
In our organization's deployment:
- **28% of commits** are from bots (1,763 of 6,222)
- **12.5% of deployment commits** are bot-generated (70 of 560)
- **Lead Time for Changes** is skewed by automated dependency updates
The current bot handling (PR #7845) only ignores PRs where `author_id=0`,
which doesn't cover:
- Commits authored by `*[bot]` accounts
- PRs with valid author IDs but bot-generated content (dependabot has a real
GitHub account)
- Title-pattern bot PRs ("Bump X from Y to Z")
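
To make the gap concrete, here is a minimal sketch of a filter that only checks `author_id`. The sample rows, IDs, and the `ignored_by_author_id` helper are hypothetical and are not taken from the DevLake code base; they only illustrate why a bot with a real account passes through.

```python
# Hypothetical sample rows; IDs, names, and titles are made up for illustration.
pull_requests = [
    {"title": "Fix login redirect", "author_id": 101, "author_name": "alice"},
    {"title": "Sync labels", "author_id": 0, "author_name": ""},
    {"title": "Bump lodash from 4.17.20 to 4.17.21", "author_id": 202,
     "author_name": "dependabot[bot]"},
]


def ignored_by_author_id(pr):
    """Mimics the check described above: only rows with author_id == 0 are ignored."""
    return pr["author_id"] == 0


# The dependabot PR has a valid author ID, so it is still counted.
kept = [pr["title"] for pr in pull_requests if not ignored_by_author_id(pr)]
print(kept)
```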
**Current workaround**: Users must create MySQL views or modify every
Grafana dashboard query manually, which is error-prone and doesn't persist
across DevLake upgrades.
## Describe the solution you'd like
Add a **Bot Exclusion** section to Scope Config with the following options:
### 1. Author Pattern Exclusion
```
Exclude authors matching patterns:
[ ] *[bot]
[ ] dependabot*
[ ] renovate*
[ ] github-actions*
[ ] Custom: ___________
```
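
One detail worth settling early: in common glob implementations (for example Python's `fnmatch`), `[bot]` is a character class, so `*[bot]` would match any name ending in `b`, `o`, or `t`. The sketch below assumes `*` is the only wildcard and treats brackets literally. The helper names `compile_glob` and `is_bot_author` are hypothetical, and case-insensitive matching is an assumption.

```python
import re
from typing import Iterable


def compile_glob(pattern: str) -> re.Pattern:
    """Compile a pattern in which '*' is the only wildcard; '[' and ']' stay literal."""
    parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile(".*".join(parts), re.IGNORECASE)


def is_bot_author(name: str, patterns: Iterable[str]) -> bool:
    """Return True if the whole author name matches any exclusion pattern."""
    return any(compile_glob(p).fullmatch(name) for p in patterns)


# The four built-in patterns proposed above.
AUTHOR_PATTERNS = ["*[bot]", "dependabot*", "renovate*", "github-actions*"]

for author in ["dependabot[bot]", "github-actions[bot]", "alice", "robert"]:
    print(author, is_bot_author(author, AUTHOR_PATTERNS))
```

With this translation, `robert` is not flagged even though it ends in `t`, which is the behavior the checkbox labels suggest.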
### 2. Title Pattern Exclusion (for PRs)
```
Exclude PRs with titles matching:
[ ] Bump * from * to *
[ ] Update * from * to *
[ ] Custom regex: ___________
```
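
A rough sketch of how the title patterns could be evaluated follows. It assumes each `*` maps to "one or more characters" and that matching is case-insensitive and anchored to the whole title; neither choice is specified in this proposal, and `is_bot_title` is a hypothetical helper.

```python
import re

# Assumed regex translations of the two built-in title patterns.
TITLE_PATTERNS = [
    re.compile(r"bump .+ from .+ to .+", re.IGNORECASE),
    re.compile(r"update .+ from .+ to .+", re.IGNORECASE),
]


def is_bot_title(title: str) -> bool:
    """Return True if the whole (trimmed) title matches any exclusion pattern."""
    return any(p.fullmatch(title.strip()) for p in TITLE_PATTERNS)


for title in [
    "Bump lodash from 4.17.20 to 4.17.21",
    "Bump the timeout from 5s to 10s",  # written by a human, still matches
    "Update README",
]:
    print(title, is_bot_title(title))
```

The second sample shows that title matching alone can catch human-authored PRs, so it may be safer to keep it opt-in or to combine it with an author check.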
### 3. Exclusion Scope
```
Apply exclusion to:
[x] Commits
[x] Pull Requests
[x] DORA Metrics (deployment commits)
[ ] Issues
```
### Implementation Suggestion
The filtering could be applied during the **transformation phase** (similar
to how issue type mapping works) rather than at collection, allowing users to:
1. Still collect all data for audit purposes
2. Filter at query time via domain layer tables
3. Toggle filtering without re-collecting data
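
As a sketch of the idea, the example below flags rows during a transformation step instead of deleting them, then toggles the filter at query time. The `commits` table is a simplified stand-in, not DevLake's actual schema, and the `is_bot` column, `flag_bots`, and `count_commits` are hypothetical. SQLite is used only to keep the example self-contained.

```python
import re
import sqlite3

# Simplified stand-in for a domain-layer table; the is_bot column is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commits (sha TEXT PRIMARY KEY, author_name TEXT, is_bot INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO commits (sha, author_name) VALUES (?, ?)",
    [("a1", "alice"), ("b2", "dependabot[bot]"), ("c3", "renovate[bot]"), ("d4", "bob")],
)

BOT_SUFFIX = re.compile(r".*\[bot\]", re.IGNORECASE)


def flag_bots(conn: sqlite3.Connection) -> None:
    """Transformation step: mark bot rows rather than dropping them."""
    rows = conn.execute("SELECT sha, author_name FROM commits").fetchall()
    flags = [(1 if BOT_SUFFIX.fullmatch(name) else 0, sha) for sha, name in rows]
    conn.executemany("UPDATE commits SET is_bot = ? WHERE sha = ?", flags)


def count_commits(conn: sqlite3.Connection, exclude_bots: bool) -> int:
    """Query-time toggle: the same collected data serves both views."""
    sql = "SELECT COUNT(*) FROM commits"
    if exclude_bots:
        sql += " WHERE is_bot = 0"
    return conn.execute(sql).fetchone()[0]


flag_bots(conn)
print(count_commits(conn, exclude_bots=False), count_commits(conn, exclude_bots=True))
```

Because no rows are removed, the full data set stays available for auditing, and changing the patterns only requires re-running the flagging step.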
### API Example
```json
{
"scopeConfig": {
"botExclusion": {
"enabled": true,
"authorPatterns": ["*[bot]", "dependabot*"],
"titlePatterns": ["Bump * from * to *"],
"applyTo": ["commits", "pull_requests", "cicd_deployment_commits"]
}
}
}
```
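
For illustration, a payload of this shape could be parsed and validated roughly as follows. The `BotExclusion` class, `parse_bot_exclusion`, and the set of allowed `applyTo` values (including `issues`) are assumptions made for this sketch and do not describe an existing DevLake API.

```python
import json
from dataclasses import dataclass, field
from typing import List

# Assumed target names; "issues" is inferred from the Exclusion Scope options.
ALLOWED_TARGETS = {"commits", "pull_requests", "cicd_deployment_commits", "issues"}


@dataclass
class BotExclusion:
    enabled: bool = False
    author_patterns: List[str] = field(default_factory=list)
    title_patterns: List[str] = field(default_factory=list)
    apply_to: List[str] = field(default_factory=list)


def parse_bot_exclusion(payload: str) -> BotExclusion:
    """Read scopeConfig.botExclusion, defaulting to a disabled, empty config."""
    raw = json.loads(payload).get("scopeConfig", {}).get("botExclusion", {})
    cfg = BotExclusion(
        enabled=bool(raw.get("enabled", False)),
        author_patterns=list(raw.get("authorPatterns", [])),
        title_patterns=list(raw.get("titlePatterns", [])),
        apply_to=list(raw.get("applyTo", [])),
    )
    unknown = set(cfg.apply_to) - ALLOWED_TARGETS
    if unknown:
        raise ValueError(f"unknown applyTo targets: {sorted(unknown)}")
    return cfg


EXAMPLE = """
{
  "scopeConfig": {
    "botExclusion": {
      "enabled": true,
      "authorPatterns": ["*[bot]", "dependabot*"],
      "titlePatterns": ["Bump * from * to *"],
      "applyTo": ["commits", "pull_requests", "cicd_deployment_commits"]
    }
  }
}
"""

cfg = parse_bot_exclusion(EXAMPLE)
print(cfg)
```

Defaulting to `enabled=False` when the section is missing would keep existing scope configs unchanged after an upgrade.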
## Related issues
- #7845 - fix(github): ignore bot account (partial solution for author_id=0)
- #7786 - fix(github): process bot account in pull_requests table
## Are you willing to submit a PR?
- [ ] Yes, I am willing to submit a PR
## Code of Conduct
- [x] I agree to follow this project's Code of Conduct