DanielLeens commented on issue #10929:
URL: https://github.com/apache/seatunnel/issues/10929#issuecomment-4523791942

   Thanks for pointing to the exact code path.
   
   I checked the current cursor-pagination logic in `HttpSourceReader`, and 
there is a real robustness gap here: in cursor mode, the reader currently only 
stops when the response cursor becomes empty. If an API keeps returning the 
same non-empty cursor, the reader can keep polling without making progress.
   
   So the no-progress loop itself looks like a genuine issue.
   
   That said, I would be careful about making `readSize < batchSize` the only 
stop condition for cursor mode. Some cursor-based APIs do not guarantee that 
every non-terminal page is "full", so a short page is not always equivalent to 
end-of-data.
   
   A safer first fix would be:
   
   1. detect when the cursor does not advance
   2. stop or fail clearly in that no-progress case
   3. then evaluate whether an additional end-of-data heuristic (such as a 
short-page check) is needed for specific APIs
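   
   As a rough illustration of steps 1 and 2, here is a minimal sketch of repeated-cursor detection. It is not the actual `HttpSourceReader` code: `CursorProgressSketch`, `Page`, `PageFetcher`, and `readAll` are hypothetical names used only for this example, and it assumes the cursor is a plain string returned with each page.

```java
import java.util.ArrayList;
import java.util.List;

public class CursorProgressSketch {

    /** One page of a hypothetical cursor-paginated response. */
    static final class Page {
        final List<String> rows;
        final String nextCursor;

        Page(List<String> rows, String nextCursor) {
            this.rows = rows;
            this.nextCursor = nextCursor;
        }
    }

    /** Hypothetical stand-in for the HTTP call; not a SeaTunnel interface. */
    interface PageFetcher {
        Page fetch(String cursor);
    }

    /**
     * Reads pages until the cursor is empty, and fails fast if the API
     * returns the same non-empty cursor that was just sent.
     */
    static List<String> readAll(PageFetcher fetcher) {
        List<String> out = new ArrayList<>();
        String cursor = null; // null means "first request, no cursor yet"
        while (true) {
            Page page = fetcher.fetch(cursor);
            out.addAll(page.rows);

            String next = page.nextCursor;
            if (next == null || next.isEmpty()) {
                return out; // existing stop condition: empty cursor
            }
            if (next.equals(cursor)) {
                // No-progress case: the cursor did not advance.
                throw new IllegalStateException(
                        "Cursor did not advance, stopping to avoid an endless loop: " + next);
            }
            cursor = next; // a short page alone does not end the read
        }
    }
}
```

   This sketch only compares against the immediately preceding cursor and throws on a repeat. Whether the real reader should fail or stop quietly, and whether it should also catch longer cycles, is a separate design decision.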
   
   If you can share one concrete response example for the terminal page 
(especially what the final cursor and payload look like), that will help 
confirm whether repeated-cursor detection is enough or whether an additional 
heuristic is also needed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
