[
https://issues.apache.org/jira/browse/ARROW-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bryan Cutler resolved ARROW-14665.
----------------------------------
Fix Version/s: 8.0.0
Resolution: Fixed
Issue resolved by pull request 11667
[https://github.com/apache/arrow/pull/11667]
> [Java] JdbcToArrowUtils ResultSet iteration bug
> -----------------------------------------------
>
> Key: ARROW-14665
> URL: https://issues.apache.org/jira/browse/ARROW-14665
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java
> Affects Versions: 6.0.0
> Reporter: Zac
> Priority: Major
> Labels: pull-request-available
> Fix For: 8.0.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> When specifying a target batch size, the [iteration
> logic|https://github.com/apache/arrow/blob/ea42b9e0aa000238fff22fd48f06f3aa516b9f3f/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java#L266]
> is currently broken:
> {code:java}
> while (rs.next() && readRowCount < config.getTargetBatchSize()) {
>   compositeConsumer.consume(rs);
>   readRowCount++;
> }
> {code}
> Because rs.next() is evaluated before the row-count check, calling it moves
> the cursor forward to the next row even when we've already reached the
> target batch size.
> For example, consider setting the target batch size to 1 and querying a
> table that has three rows.
> On the first iteration, we successfully consume the first row. On the next
> iteration, rs.next() moves the cursor to row 2, but the read row count is no
> longer < the target batch size, so the loop exits and the method returns
> without consuming row 2.
> Upon calling into the method again with the same result set, rs.next() is
> called again, which moves the cursor to row 3 and successfully consumes it.
> *Problem:* row 2 is skipped!
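
The scenario above can be reproduced with a minimal sketch. It assumes a hypothetical in-memory FakeCursor in place of a real java.sql.ResultSet; the class names and the drain helper are illustrative only and are not part of Arrow. It also shows one possible reordering of the loop condition, which checks the row count first so that next() is not called once the batch is full. This is not necessarily the exact change made in pull request 11667.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class BatchIterationSketch {

    /** Hypothetical stand-in for java.sql.ResultSet: next() advances the cursor as a side effect. */
    static final class FakeCursor {
        private final Iterator<Integer> rows;
        private Integer current;

        FakeCursor(List<Integer> rows) {
            this.rows = rows.iterator();
        }

        boolean next() {
            if (!rows.hasNext()) {
                current = null;
                return false;
            }
            current = rows.next();
            return true;
        }

        int currentRow() {
            return current;
        }
    }

    /** Condition order from the report: next() runs even when the batch is already full. */
    static List<Integer> readBatchBuggy(FakeCursor rs, int targetBatchSize) {
        List<Integer> batch = new ArrayList<>();
        int readRowCount = 0;
        while (rs.next() && readRowCount < targetBatchSize) {
            batch.add(rs.currentRow());
            readRowCount++;
        }
        return batch;
    }

    /** Assumed reordering: && short-circuits, so next() is skipped once the batch is full. */
    static List<Integer> readBatchFixed(FakeCursor rs, int targetBatchSize) {
        List<Integer> batch = new ArrayList<>();
        int readRowCount = 0;
        while (readRowCount < targetBatchSize && rs.next()) {
            batch.add(rs.currentRow());
            readRowCount++;
        }
        return batch;
    }

    /** Reads batches from one cursor until an empty batch comes back; returns every consumed row. */
    static List<Integer> drain(List<Integer> rows, int targetBatchSize, boolean fixed) {
        FakeCursor rs = new FakeCursor(rows);
        List<Integer> all = new ArrayList<>();
        while (true) {
            List<Integer> batch = fixed
                ? readBatchFixed(rs, targetBatchSize)
                : readBatchBuggy(rs, targetBatchSize);
            if (batch.isEmpty()) {
                break;
            }
            all.addAll(batch);
        }
        return all;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3);
        System.out.println("buggy: " + drain(rows, 1, false)); // prints "buggy: [1, 3]"
        System.out.println("fixed: " + drain(rows, 1, true));  // prints "fixed: [1, 2, 3]"
    }
}
```

With three rows and a target batch size of 1, the original condition order yields rows 1 and 3 only, matching the report. With a larger batch size the same pattern drops the first row after each full batch.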