moshe opened a new issue, #5864:
URL: https://github.com/apache/incubator-devlake/issues/5864

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   Hi,
   I have a pipeline that scrapes a private GitHub repository. The pipeline fails 
during the run when it tries to call `repos/ORG/REPO/pulls/comments`. The 
exact log messages:
   ```
   time="2023-08-13 20:04:59" level=info msg=" [pipeline service] [pipeline 
#119] [task #318] executing subtask collectApiPrReviewCommentsMeta"
   time="2023-08-13 20:04:59" level=info msg=" [pipeline service] [pipeline 
#119] [task #318] [collectApiPrReviewCommentsMeta] start api collection"
   time="2023-08-13 20:05:16" level=warning msg=" [pipeline service] [pipeline 
#119] [task #318] [api async client] retry #0 calling 
repos/ArmisSecurity/armis/pulls/comments
        caused by: Http DoAsync error calling [GET 
repos/ArmisSecurity/armis/pulls/comments]. Response: {
          "message": "Server Error"
        }
         (502)"
   time="2023-08-13 20:05:26" level=warning msg=" [pipeline service] [pipeline 
#119] [task #318] [api async client] retry #1 calling 
repos/ArmisSecurity/armis/pulls/comments
        caused by: Http DoAsync error calling [GET 
repos/ArmisSecurity/armis/pulls/comments]. Response: {
          "message": "Server Error"
        }
         (502)"
   time="2023-08-13 20:05:36" level=warning msg=" [pipeline service] [pipeline 
#119] [task #318] [api async client] retry #2 calling 
repos/ArmisSecurity/armis/pulls/comments
        caused by: Http DoAsync error calling [GET 
repos/ArmisSecurity/armis/pulls/comments]. Response: {
          "message": "Server Error"
        }
         (502)"
   time="2023-08-13 20:05:40" level=info msg=" [pipeline service] [pipeline 
#119] [task #318] [collectApiPrReviewCommentsMeta] finished records: 1"
   time="2023-08-13 20:05:45" level=info msg=" [pipeline service] [pipeline 
#119] [task #318] [collectApiPrReviewCommentsMeta] finished records: 2"
   time="2023-08-13 20:05:49" level=info msg=" [pipeline service] [pipeline 
#119] [task #318] [collectApiPrReviewCommentsMeta] finished records: 26"
   time="2023-08-13 20:05:50" level=warning msg=" [pipeline service] [pipeline 
#119] [task #318] [api async client] retry #0 calling 
repos/ArmisSecurity/armis/pulls/comments
        caused by: Http DoAsync error calling [GET 
repos/ArmisSecurity/armis/pulls/comments]. Response: {
          "message": "Server Error"
        }
         (502)"
   
   //// this repeats many times
   time="2023-08-13 20:05:52" level=info msg=" [pipeline service] [pipeline 
#119] [task #318] [collectApiPrReviewCommentsMeta] finished records: 31"
   
   //// errors starting again
   time="2023-08-13 20:05:52" level=warning msg=" [pipeline service] [pipeline 
#119] [task #318] [api async client] retry #0 calling 
repos/ArmisSecurity/armis/pulls/comments
        caused by: Http DoAsync error calling [GET 
repos/ArmisSecurity/armis/pulls/comments]. Response: {
          "message": "Server Error"
        }
         (502)"
   
   //// starting to get 403
   time="2023-08-13 20:05:53" level=warning msg=" [pipeline service] [pipeline 
#119] [task #318] [api async client] retry #2 calling 
repos/ArmisSecurity/armis/pulls/comments
        caused by: Http DoAsync error calling [GET 
repos/ArmisSecurity/armis/pulls/comments]. Response: {
          "documentation_url": 
"https://docs.github.com/en/free-pro-team@latest/rest/overview/resources-in-the-rest-api#secondary-rate-limits",
          "message": "You have exceeded a secondary rate limit. Please wait a 
few minutes before you try again."
        }
         (403)"
   ```
   
   From the logs, it seems DevLake keeps retrying when it gets those 502 
responses and then gets throttled by GitHub's secondary rate limit (403).
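
A minimal sketch of a retry policy that treats the two failure modes in the logs differently. This is not DevLake's actual implementation: the function name `retry_delay_seconds` and its parameters are hypothetical, and the one-minute fallback reflects my reading of GitHub's guidance for secondary rate limits (honor `Retry-After` if present, otherwise wait at least a minute).

```python
def retry_delay_seconds(status, headers, attempt, message="", base=1.0, cap=60.0):
    """Return seconds to wait before the next retry, or None if retrying is pointless.

    Hypothetical helper for illustration; not part of DevLake.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}

    # 403/429 count as throttling only when the body mentions a rate limit;
    # a plain 403 (e.g. missing permissions) will not be fixed by retrying.
    if status in (403, 429) and "rate limit" in message.lower():
        if "retry-after" in h:
            return float(h["retry-after"])
        return 60.0  # assumption: wait at least a minute when no hint is given

    # 5xx such as the 502 "Server Error": exponential backoff with a ceiling.
    if 500 <= status < 600:
        return min(cap, base * 2 ** attempt)

    return None
```

With a policy like this, a burst of 502 responses backs off progressively instead of retrying at a fixed pace, and a 403 secondary rate limit pauses the collector rather than triggering further calls.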
   
   When calling this API locally with the `gh` CLI, I can make `gh` return a 502 
when I do not specify the `Accept` header:
   ```
   ~/dev $ gh api repos/ORG/REPO/pulls/comments
   gh: Server Error (HTTP 502)
   {
     "message": "Server Error"
   }
   ```
   
   Adding `-H "Accept: application/vnd.github+json"` solves the issue:
   `gh api -H "Accept: application/vnd.github+json" 
repos/ArmisSecurity/armis/pulls/comments`
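
For reference, the same request can be sketched outside of `gh`, with the `Accept` header set explicitly. This only illustrates the header in question; `build_comments_request` and the `token` parameter are hypothetical names, and whether DevLake sets this header is exactly the open question here.

```python
import urllib.request

GITHUB_API = "https://api.github.com"


def build_comments_request(org, repo, token=None):
    """Build (but do not send) a GET request for a repository's PR review comments."""
    url = f"{GITHUB_API}/repos/{org}/{repo}/pulls/comments"
    # The header that made the 502 go away in the gh experiment above.
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # A private repository needs credentials; the token value is an assumption.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would send it; a private repository would need a valid token.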
   
   What do you think? Do we set the `Accept` header when calling the comments API?
   
   ### What do you expect to happen
   
   The pipeline should run to completion.
   
   ### How to reproduce
   
   Create a pipeline that scrapes a private GitHub repository.
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   v0.16.0
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

