jcg-juan opened a new issue, #8571:
URL: https://github.com/apache/incubator-devlake/issues/8571

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   We’ve identified a data inconsistency with how DevLake ingests GitHub Pull 
Requests. Since around March 2025, all PRs that are merged in GitHub are being 
stored in DevLake as state = closed and merged = false. This results in:
   
   - Merged PRs not being reflected correctly in queries (state = 'MERGED' 
returns nothing after February).
   - Metrics such as additions/deletions for merged PRs show up as 0.
   - Dashboards and reports undercounting merged PR activity.
   
   Impact:
   - Any downstream metrics that rely on merged PRs (e.g., code change size, 
lead time, DORA metrics) are currently inaccurate.
   - Dashboards show zero additions/deletions after Feb 2025.
   
   ### What do you expect to happen
   
   - DevLake should correctly recognise merged PRs (state = 'MERGED') or, at a 
minimum, set the merged column to true when GitHub API reports it.
   - Additions/deletions should be populated for merged PRs the same way as 
open ones.
   
   ### How to reproduce
   
   Steps to Reproduce:
   - Ingest PR data from GitHub into DevLake.
   - Run the following query to check the latest PRs:
   ```
   SELECT *
   FROM public._tool_github_pull_requests
   ORDER BY github_updated_at DESC
   LIMIT 1000;
   ```
   
   - Observe that since March 2025, all PRs are either open or closed, never 
MERGED.
   - Additionally, **merged** column is always **false**, even when GitHub API 
reports merged = true.
   
   ### Anything else
   
   1. Example GitHub API response for a merged PR (PR #1 before March working 
as expected):
   ```
   curl -s -H "Accept: application/vnd.github+json" \
      -H "Authorization: Bearer <token>" \
      https://api.github.com/repos/<org>/<repo>/pulls/1 \
      | jq '{number, state, merged, additions, deletions}'
   ```
   Response:
   ```
   {
     "number": 1,
     "state": "closed",
     "merged": true,
     "additions": 203,
     "deletions": 0
   }
   
   From Devlake DB (table _tool_github_pull_requests): 
   ```
   <html><head></head><body>
   connection_id | github_id | repo_id | number | state | title | created_at | 
updated_at | closed_at | additions | deletions | changed_files | 
review_comments | comments | merged | merge_commit_sha | head_ref | base_ref | 
head_sha | base_sha | url | author | author_id | _created_at | _updated_at | 
raw_data_origin | raw_table | raw_data_id | org_id | installation_id | commits 
| is_draft
   -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | 
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | 
--
   1 | 2019299654 | 842380603 | 1 | MERGED | initial setup | 2024-08-14 
13:22:49 | 2024-08-14 13:26:32 | 2024-08-14 13:26:31 | 203 | 0 | 0 | 0 | 0 | 
false | (redacted) | setup-action | main | (redacted) | (redacted) | (redacted) 
| dev-user | 132356641 | 2025-03-30 00:09:11.131044 | 2025-03-30 
00:09:11.131044 | {"ConnectionId":1,"Name":"…"} | _raw_github_graphql_prs | 171 
| (null) | 0 | 0 | false
   
   </body></html>
   
   2. Example GitHub API response for a merged PR (PR #73 after March NOT 
working as expected):
   ```
   curl -s -H "Accept: application/vnd.github+json" \
      -H "Authorization: Bearer <token>" \
      https://api.github.com/repos/<org>/<repo>/pulls/73\
      | jq '{number, state, merged, additions, deletions}'
   ```
   Response:
   ```
   {
     "number": 73,
     "state": "closed",
     "merged": true,
     "additions": 4,
     "deletions": 1
   }
   ```
   
   From Devlake DB (table _tool_github_pull_requests): 
   <html><head></head><body>
   connection_id | github_id | repo_id | number | state | title | created_at | 
updated_at | closed_at | additions | deletions | changed_files | 
review_comments | comments | merged | merge_commit_sha | head_ref | base_ref | 
head_sha | base_sha | url | author | author_id | _created_at | _updated_at | 
raw_data_origin | raw_table | raw_data_id | org_id | installation_id | commits 
| is_draft
   -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | 
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | 
--
   2 | 2815676617 | 842380603 | 73 | closed | Cicd usage script fix: check if 
action enabled | 2025-09-10 14:09:57 | 2025-09-10 16:05:53 | 2025-09-10 
16:05:52 | 0 | 0 | 0 | 0 | 0 | false | (redacted) | feature-branch | main | 
(redacted) | (redacted) | (redacted) | dev-user | 131661649 | 2025-09-12 
00:00:49.359316 | 2025-09-12 00:00:49.359316 | {"ConnectionId":2,"Name":"…"} | 
_raw_github_api_pull_requests | 30757 | (null) | 842380603 | 0 | false
   
   </body></html>
   
   ### Version
   
   v1.0.2
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to