dosubot[bot] commented on issue #8028:
URL: https://github.com/apache/incubator-devlake/issues/8028#issuecomment-2340873912
Hello, @ClaudioMascaro! I'm here to help you with any bugs, questions, or
contributions. Let's work together to tackle this issue.
To address the issue of GraphQL API requests failing for large repositories
with many workflow and job runs, despite setting `API_TIMEOUT` to "120s" and
`API_RETRY` to "20", you can try the following approaches:
1. **Increase the `InputStep` Value**: The `InputStep` parameter in the
`InitGraphQLCollector` function controls the number of items processed in each
batch. Increasing this value can help reduce the number of API calls by
processing more items per request.
2. **Optimize the `BuildQuery` Function**: Make sure the `BuildQuery` function
constructs GraphQL queries that keep payload size and complexity down, for
example by requesting only the fields the plugin actually stores.
3. **Adjust the `PAGE_SIZE`**: The `PAGE_SIZE` constant in the `CollectRuns`
function controls the number of entries requested per page. Although the API
accepts up to 100 entries per page, the resulting response body can be too
large for big repositories, which leads to request failures. Reducing
`PAGE_SIZE` can mitigate this; a sketch of where the constant takes effect
follows the snippet below.
Here is a relevant code snippet with the `InputStep` value increased and
`PAGE_SIZE` adjusted:
```go
const PAGE_SIZE = 30 // Adjusted to handle large response bodies

err = apiCollector.InitGraphQLCollector(helper.GraphqlCollectorArgs{
	Input:         iterator,
	InputStep:     50, // Increased from 20 to 50
	GraphqlClient: data.GraphqlClient,
	BuildQuery: func(reqData *helper.GraphqlRequestData) (interface{}, map[string]interface{}, error) {
		query := &GraphqlQueryCheckRunWrapper{}
		if reqData == nil {
			return query, map[string]interface{}{}, nil
		}
		// Batch the check suite node IDs for this step into a single query.
		workflowRuns := reqData.Input.([]interface{})
		checkSuiteIds := []map[string]interface{}{}
		for _, iWorkflowRuns := range workflowRuns {
			workflowRun := iWorkflowRuns.(*SimpleWorkflowRun)
			checkSuiteIds = append(checkSuiteIds, map[string]interface{}{
				"id": graphql.ID(workflowRun.CheckSuiteNodeID),
			})
		}
		variables := map[string]interface{}{
			"node": checkSuiteIds,
		}
		return query, variables, nil
	},
	ResponseParser: func(queryWrapper any) (messages []json.RawMessage, err errors.Error) {
		query := queryWrapper.(*GraphqlQueryCheckRunWrapper)
		for _, node := range query.Node {
			for _, checkRun := range node.CheckSuite.CheckRuns.Nodes {
				// Prefer CompletedAt when available so the incremental
				// cutoff compares against the check run's latest activity.
				updatedAt := checkRun.StartedAt
				if checkRun.CompletedAt != nil {
					updatedAt = checkRun.CompletedAt
				}
				if apiCollector.GetSince() != nil && !apiCollector.GetSince().Before(*updatedAt) {
					return messages, helper.ErrFinishCollect
				}
				messages = append(messages, errors.Must1(json.Marshal(node)))
			}
		}
		return
	},
	IgnoreQueryErrors: true,
})
if err != nil {
	return err
}
return apiCollector.Execute()
```
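For the REST-based run collector referenced in [2] (`cicd_run_collector.go`), the smaller `PAGE_SIZE` takes effect through the collector arguments rather than the GraphQL query. The sketch below only illustrates that wiring, assuming DevLake's `helper.ApiCollectorArgs`; the URL template, query keys, and response field names are assumptions for illustration, not a verbatim copy of the plugin:

```go
// Illustrative sketch, not the plugin's actual code: where a smaller
// PAGE_SIZE would be applied in the REST collector. The URL template and
// JSON field names below are assumptions.
err = apiCollector.InitCollector(helper.ApiCollectorArgs{
	ApiClient:   data.ApiClient,
	PageSize:    PAGE_SIZE, // 30 instead of 100 keeps each response body smaller
	UrlTemplate: "repos/{{ .Params.Name }}/actions/runs",
	Query: func(reqData *helper.RequestData) (url.Values, errors.Error) {
		query := url.Values{}
		query.Set("page", strconv.Itoa(reqData.Pager.Page))
		query.Set("per_page", strconv.Itoa(reqData.Pager.Size)) // Pager.Size is driven by PageSize
		return query, nil
	},
	ResponseParser: func(res *http.Response) ([]json.RawMessage, errors.Error) {
		// GitHub wraps workflow runs in {"total_count": N, "workflow_runs": [...]}.
		body := &struct {
			WorkflowRuns []json.RawMessage `json:"workflow_runs"`
		}{}
		if err := helper.UnmarshalResponse(res, body); err != nil {
			return nil, err
		}
		return body.WorkflowRuns, nil
	},
})
```

Because `reqData.Pager.Size` is driven by `PageSize`, lowering the constant should be sufficient; nothing else in the request construction needs to change.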
Increasing the `InputStep` value reduces the total number of API calls, while
lowering `PAGE_SIZE` keeps individual response bodies smaller; together these
can help avoid hitting rate limits or timeouts
<sup>[[1]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/github_graphql/tasks/job_collector.go)</sup><sup>[[2]](https://github.com/apache/incubator-devlake/blob/main/backend/plugins/github/tasks/cicd_run_collector.go)</sup>.
*To continue talking to [Dosu](https://dosu.dev), mention @dosu.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]