DanielLeens commented on issue #10929: URL: https://github.com/apache/seatunnel/issues/10929#issuecomment-4523791942
Thanks for pointing to the exact code path.

I checked the current cursor-pagination logic in `HttpSourceReader`, and there is a real robustness gap here: in cursor mode, the reader currently only stops when the response cursor becomes empty. If an API keeps returning the same non-empty cursor, the reader can keep polling without making progress. So the no-progress loop itself looks like a genuine issue.

That said, I would be careful about making `readSize < batchSize` the only stop condition for cursor mode. Some cursor-based APIs do not guarantee that every non-terminal page is "full", so a short page is not always equivalent to end-of-data.

A safer first fix would be:

1. detect when the cursor does not advance
2. stop or fail clearly in that no-progress case
3. then evaluate whether an additional end-of-page heuristic is needed for specific APIs

If you can share one concrete response example for the terminal page (especially what the final cursor and payload look like), that will help confirm whether repeated-cursor detection is enough or whether an additional heuristic is also needed.
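
To make the proposed first fix concrete, here is a minimal sketch of repeated-cursor detection in a generic pagination loop. It is not the actual `HttpSourceReader` code: `CursorPager`, `Page`, `CursorStalledException`, and `readAll` are hypothetical names chosen for illustration, and the fetcher is assumed to take the previous cursor (or `null` for the first page) and return the page plus the next cursor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; not part of SeaTunnel.
class CursorPager {

    /** One page of a cursor-paginated response (assumed shape). */
    static final class Page {
        final List<String> records;
        final String nextCursor; // null or empty means "no more pages"

        Page(List<String> records, String nextCursor) {
            this.records = records;
            this.nextCursor = nextCursor;
        }
    }

    /** Raised when the API hands back the cursor that was just used. */
    static final class CursorStalledException extends RuntimeException {
        CursorStalledException(String cursor) {
            super("Cursor did not advance; still at: " + cursor);
        }
    }

    /** Reads pages until the cursor is empty, failing fast if it stops advancing. */
    static List<String> readAll(Function<String, Page> fetcher) {
        List<String> out = new ArrayList<>();
        String cursor = null; // null requests the first page
        while (true) {
            Page page = fetcher.apply(cursor);
            out.addAll(page.records);

            String next = page.nextCursor;
            if (next == null || next.isEmpty()) {
                return out; // existing stop condition: empty cursor
            }
            if (next.equals(cursor)) {
                // Same non-empty cursor twice in a row: no progress is possible.
                throw new CursorStalledException(next);
            }
            cursor = next; // a short page with a fresh cursor still continues
        }
    }
}
```

This sketch only catches a cursor that repeats immediately. It does not use page size as a stop signal, so a short non-terminal page is still followed. Whether to fail, as here, or to stop quietly with a warning is a policy choice. Detecting longer cycles (for example A, B, A) would require remembering previously seen cursors, at some memory cost.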
