dosubot[bot] commented on issue #8028:
URL: 
https://github.com/apache/incubator-devlake/issues/8028#issuecomment-2340873912

   Hello, @ClaudioMascaro! I'm here to help you with any bugs, questions, or 
contributions. Let's work together to tackle this issue.
   
   To address the issue of GraphQL API requests failing for large repositories 
with many workflow and job runs, despite setting `API_TIMEOUT` to "120s" and 
`API_RETRY` to "20", you can try the following approaches:
   
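   1. **Increase the `InputStep` Value**: The `InputStep` parameter in the 
`InitGraphQLCollector` function controls how many input items are sent in each 
batch. Increasing this value reduces the number of API calls by 
processing more items per request. Each request also becomes heavier, so if 
the failures are timeouts rather than rate limits, a smaller value may be 
worth trying as well.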
   
   2. **Optimize the `BuildQuery` Function**: Ensure that the `BuildQuery` 
function is efficiently constructing the GraphQL queries to minimize the 
payload size and complexity.
   
   3. **Adjust the `PAGE_SIZE`**: The `PAGE_SIZE` constant in the `CollectRuns` 
function controls the number of entries per page. Although the API accepts a 
maximum of 100 entries per page, sometimes the response body is too large, 
leading to request failures. Reducing the `PAGE_SIZE` can help mitigate this 
issue.
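
To make the effect of `InputStep` in point 1 concrete, here is a minimal, standalone Go sketch of the batching arithmetic. It does not use DevLake's helpers: `batchCount`, `chunk`, and the figure of 1,000 workflow runs are hypothetical names and numbers chosen only for illustration.

```go
package main

import "fmt"

// batchCount returns how many requests are needed to cover total inputs
// when each request carries at most step inputs (ceiling division).
func batchCount(total, step int) int {
	if total <= 0 || step <= 0 {
		return 0
	}
	return (total + step - 1) / step
}

// chunk splits ids into consecutive batches of at most step elements.
func chunk(ids []string, step int) [][]string {
	if step <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(ids); start += step {
		end := start + step
		if end > len(ids) {
			end = len(ids)
		}
		batches = append(batches, ids[start:end])
	}
	return batches
}

func main() {
	const totalRuns = 1000 // hypothetical number of workflow runs
	for _, step := range []int{20, 50} {
		fmt.Printf("InputStep=%d: %d requests, up to %d check suites each\n",
			step, batchCount(totalRuns, step), step)
	}

	// Small demonstration of how the inputs are grouped.
	fmt.Println(chunk([]string{"a", "b", "c", "d", "e"}, 2))
}
```

With 1,000 inputs, a step of 20 needs 50 requests and a step of 50 needs 20, but each of those 20 requests asks for two and a half times as many check suites. That is the trade-off to weigh when the failures are timeouts.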
   
   Here is a relevant code snippet with the `InputStep` value increased and 
`PAGE_SIZE` adjusted:
   
   ```go
   const PAGE_SIZE = 30 // Used by CollectRuns; adjusted to keep response bodies manageable
   
   err = apiCollector.InitGraphQLCollector(helper.GraphqlCollectorArgs{
       Input:         iterator,
       InputStep:     50, // Increased from 20 to 50
       GraphqlClient: data.GraphqlClient,
       BuildQuery: func(reqData *helper.GraphqlRequestData) (interface{}, map[string]interface{}, error) {
           query := &GraphqlQueryCheckRunWrapper{}
           if reqData == nil {
               return query, map[string]interface{}{}, nil
           }
           workflowRuns := reqData.Input.([]interface{})
           checkSuiteIds := []map[string]interface{}{}
           for _, iWorkflowRuns := range workflowRuns {
               workflowRun := iWorkflowRuns.(*SimpleWorkflowRun)
               checkSuiteIds = append(checkSuiteIds, map[string]interface{}{
                   `id`: graphql.ID(workflowRun.CheckSuiteNodeID),
               })
           }
           variables := map[string]interface{}{
               "node": checkSuiteIds,
           }
           return query, variables, nil
       },
       ResponseParser: func(queryWrapper any) (messages []json.RawMessage, err errors.Error) {
           query := queryWrapper.(*GraphqlQueryCheckRunWrapper)
           for _, node := range query.Node {
               for _, checkRun := range node.CheckSuite.CheckRuns.Nodes {
                   updatedAt := checkRun.StartedAt
                   if checkRun.CompletedAt != nil {
                       updatedAt = checkRun.CompletedAt
                   }
                   if apiCollector.GetSince() != nil && !apiCollector.GetSince().Before(*updatedAt) {
                       return messages, helper.ErrFinishCollect
                   }
                   messages = append(messages, errors.Must1(json.Marshal(node)))
               }
           }
           return
       },
       IgnoreQueryErrors: true,
   })
   if err != nil {
       return err
   }
   
   return apiCollector.Execute()
   ```
   
   Increasing `InputStep` reduces the number of GraphQL calls, and lowering 
`PAGE_SIZE` keeps each response body smaller. Together these changes may help 
you avoid rate limits and timeouts 
<sup>[[1]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/github_graphql/tasks/job_collector.go)</sup><sup>[[2]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/github/tasks/cicd_run_collector.go)</sup>.
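
Point 3 suggests lowering `PAGE_SIZE` when response bodies are too large. One way to find a workable value without guessing is to shrink the page size after each failure. The Go sketch below illustrates that pattern in isolation. It is not something this comment describes DevLake doing, and `fetchPage`, `fetchWithShrinkingPage`, `errTooLarge`, and the simulated threshold of 30 are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooLarge is a hypothetical stand-in for a request that fails
// because the response body is too large.
var errTooLarge = errors.New("response too large")

// fetchPage is a hypothetical caller-supplied function that performs
// one request with the given page size.
type fetchPage func(pageSize int) error

// fetchWithShrinkingPage calls fetch with pageSize and halves the size
// after each failure, stopping once the size would drop below minSize.
// It returns the page size that succeeded.
func fetchWithShrinkingPage(fetch fetchPage, pageSize, minSize int) (int, error) {
	if minSize < 1 {
		minSize = 1
	}
	if pageSize < minSize {
		return 0, fmt.Errorf("page size %d is below the minimum %d", pageSize, minSize)
	}
	var lastErr error
	for size := pageSize; size >= minSize; size /= 2 {
		if lastErr = fetch(size); lastErr == nil {
			return size, nil
		}
	}
	return 0, fmt.Errorf("page sizes from %d down to %d all failed: %w", pageSize, minSize, lastErr)
}

func main() {
	// Simulated endpoint: any page larger than 30 entries fails.
	simulated := func(pageSize int) error {
		if pageSize > 30 {
			return errTooLarge
		}
		return nil
	}

	size, err := fetchWithShrinkingPage(simulated, 100, 10)
	if err != nil {
		fmt.Println("giving up:", err)
		return
	}
	fmt.Println("request succeeded with page size", size)
}
```

In this simulation the sizes tried are 100, 50, and then 25, which succeeds. A real collector would also need to restart pagination consistently after changing the page size, which this sketch leaves out.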
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.