klesh opened a new issue, #3822: URL: https://github.com/apache/incubator-devlake/issues/3822
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and found no similar feature requirement.

### Use case

1. Shorten overall collection time when a collection fails
2. Better user experience in terms of robustness

### Description

Many things can go wrong during data collection: network problems, server crashes, etc. When that happens, the data that was already collected has to be collected all over again, which is wasteful and bad for the user experience.

Some may say we already have the `diff-sync` mechanism for most of our important plugins. We do; however, it relies on extracted data, so it does not help when the collection itself failed.

To be more specific, take collecting 100 pages of Jira issues as an example. If the server goes down on the 50th page, users may wait for the server to come back and then start another pipeline to collect data. One would certainly wish that Apache DevLake would pick up from page 51.

It all sounds good and smooth; however, there are some catches we need to take care of:

1. We collect data in parallel, so failing on the 50th page doesn't mean we can pick up from there.
2. `diff-sync` should be replaced if we opt for the `updated_time`-based strategy.
3. The order of records in the API response must be consistent, which depends on the data source's API specification.
4. How do we know it is legitimate to pick up previous data? Based on what?

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
