Nickcw6 commented on code in PR #7820:
URL: 
https://github.com/apache/incubator-devlake/pull/7820#discussion_r1701554584


##########
backend/plugins/circleci/tasks/shared.go:
##########
@@ -121,3 +121,12 @@ func ParseCircleciPageTokenResp(res *http.Response) ([]json.RawMessage, errors.E
        err := api.UnmarshalResponse(res, &data)
        return data.Items, err
 }
+
+func ignoreDeletedBuilds(res *http.Response) errors.Error {
+       // CircleCI API will return a 404 response for a workflow/job that has been deleted
+       // due to their data retention policy. We should ignore these errors.
+       if res.StatusCode == http.StatusNotFound {
+               return api.ErrIgnoreAndContinue

Review Comment:
   With `api.ErrFinishCollect`, collection could miss workflows/jobs that are 
still available if the CI data isn't processed in exact newest -> oldest 
order. With `api.ErrIgnoreAndContinue`, those rows would still be collected 
if that happens.
   
   E.g. say there are 600 pipelines to collect and 1 has been deleted. If 
they end up out of order and the deleted record is processed at position 
300/600, the DevLake pipeline would end with 300 rows uncollected.
   
   There shouldn't be a significant number of deleted workflows & jobs that 
collection is attempted for (the associated pipelines will also have been 
deleted), so the increase in DevLake pipeline duration should be minimal and 
worth the reduced risk of missing rows.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
