[GitHub] [drill] jnturton commented on pull request #2424: DRILL-8092: Add Auto Pagination to HTTP Storage Plugin

GitBox Wed, 19 Jan 2022 01:02:52 -0800


jnturton commented on pull request #2424:
URL: https://github.com/apache/drill/pull/2424#issuecomment-1016222748



   > @paul-rogers yes, of course. I'll see if I can expand the scope of this 
refactoring PR to add that in.
   
   In the paginated case, a separate batch reader is created for each page of 
data returned by the HTTP API.
   
   - This plugin's JSON batch reader uses JsonLoader from EVF, which will try to 
fill up a Drill batch for each page of API data.
   - Its CSV batch reader makes direct use of the Univocity CSV parser rather 
than going via CompliantTextBatchReader (why?) but includes its own 
batch-filling loop.
   - Its XML batch reader uses XmlReader from the format-xml module, which is 
also EVF-based and tries to fill up a Drill batch.
   
   So I think the "A natural structure is to create one Drill batch per HTTP 
page" comment is already addressed.
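
   To make the shape of that concrete, here is a minimal, self-contained 
sketch of "one reader per page, each with its own batch-filling loop". It does 
not use Drill's or EVF's actual classes: `PagedBatchSketch`, `PageReader`, 
`next` and `readAll` are hypothetical names, rows are plain strings, and a 
"batch" is just a list capped at `batchSize`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch only; none of these types are Drill or EVF APIs. */
public class PagedBatchSketch {

    /** Stand-in for a per-page batch reader. */
    static final class PageReader {
        private final Iterator<String> rows;

        PageReader(List<String> pageRows) {
            this.rows = pageRows.iterator();
        }

        /**
         * Batch-filling loop: copy rows into the batch until it is full
         * or this page is exhausted. Returns true if rows remain.
         */
        boolean next(List<String> batch, int batchSize) {
            while (batch.size() < batchSize && rows.hasNext()) {
                batch.add(rows.next());
            }
            return rows.hasNext();
        }
    }

    /** Creates a separate reader for each page and collects its batches. */
    static List<List<String>> readAll(List<List<String>> pages, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (List<String> page : pages) {
            PageReader reader = new PageReader(page); // one reader per page
            boolean more = true;
            while (more) {
                List<String> batch = new ArrayList<>();
                more = reader.next(batch, batchSize);
                if (!batch.isEmpty()) {
                    batches.add(batch);
                }
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        List<List<String>> pages =
            List.of(List.of("a", "b", "c"), List.of("d", "e"));
        // Batch size large enough for any page: one batch per page.
        System.out.println(readAll(pages, 10));
        // Smaller batch size: a page may yield more than one batch.
        System.out.println(readAll(pages, 2));
    }
}
```

   In this simplified model a batch never spans two pages, so whenever a page 
fits within the batch size the result is exactly one batch per HTTP page.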


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
