dosubot[bot] commented on issue #8817: URL: https://github.com/apache/incubator-devlake/issues/8817#issuecomment-4152535741
This is a confirmed bug. The PR collector has a two-stage collection process: Stage 1 lists PRs via pagination, and Stage 2 re-fetches individual details for all previously-collected PRs that aren't in `state = 'closed'` [[1]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/plugins/github/tasks/pr_collector.go#L109-L116). Your deleted/transferred PR still has a record in DevLake's DB from a prior sync, so Stage 2 tries to fetch it by number via `repos/{{ .Params.Name }}/pulls/{{ .Input.Number }}` [[2]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/plugins/github/tasks/pr_collector.go#L123), which is where the 404 comes from.

The issue is that the `CollectUnfinishedDetails` stage is **missing an `AfterResponse` handler** to gracefully skip 404s [[3]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/plugins/github/tasks/pr_collector.go#L122-L132). Without it, the default behavior retries 404s 3 times and then crashes the subtask [[4]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/helpers/pluginhelper/api/api_async_client.go#L188-L192).

The fix is a one-liner. An [`ignoreHTTPStatus404` helper](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/plugins/github/tasks/shared.go#L66-L74) already exists in `shared.go` and is used by other GitHub collectors like `cicd_job_collector.go` [[5]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/plugins/github/tasks/cicd_job_collector.go#L116) and `account_collector.go` [[6]](https://github.com/apache/incubator-devlake/blob/2c15ad880dce830c6512fa58039527cbf4745b8b/backend/plugins/github/tasks/account_collector.go#L91-L96).
The helper just needs to be added to the `CollectUnfinishedDetails` block in `pr_collector.go`:

```go
FinalizableApiCollectorCommonArgs: helper.FinalizableApiCollectorCommonArgs{
    UrlTemplate:   "repos/{{ .Params.Name }}/pulls/{{ .Input.Number }}",
    AfterResponse: ignoreHTTPStatus404,  // ← add this line
    ResponseParser: func(res *http.Response) ([]json.RawMessage, errors.Error) {
```

**Workaround until this is fixed:** you can manually delete the offending PR record from the `github_pull_requests` table in DevLake's database (where `number = 47788` and the matching `repo_id`/`connection_id`), so Stage 2 no longer tries to re-fetch it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
