[
https://issues.apache.org/jira/browse/OFBIZ-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151383#comment-17151383
]
ASF subversion and git services commented on OFBIZ-11789:
---------------------------------------------------------
Commit 71ac32c89b29a56cb76952e641e006eb120b5e8b in ofbiz-framework's branch
refs/heads/trunk from Pawan Verma
[ https://gitbox.apache.org/repos/asf?p=ofbiz-framework.git;h=71ac32c ]
Implemented: EntityBatchIterator which pages through large resultsets (500 at a
time by default) and added support for EntityBatchIterator to EntityQuery via
'queryBatchIterator'(OFBIZ-11789) (#191)
Modified GenericDao#makeOffsetString to have OFFSET/FETCH style by default
Modified entityengine's mysql and postgres datasources to have LIMIT/OFFSET
style, rest can work with default OFFSET/FETCH style
Thanks: Scott for the help.
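For reference, the two pagination clause styles the commit describes look roughly like this. This is an illustrative sketch only, not the actual GenericDao#makeOffsetString code:

```java
// Hypothetical sketch of the two SQL pagination styles the commit refers to;
// this is NOT the real GenericDao#makeOffsetString implementation.
public class OffsetStyleSketch {
    // SQL:2008 OFFSET/FETCH style, the new default behavior
    static String offsetFetch(int limit, int offset) {
        return " OFFSET " + offset + " ROWS FETCH NEXT " + limit + " ROWS ONLY";
    }

    // LIMIT/OFFSET style, configured for the MySQL and PostgreSQL datasources
    static String limitOffset(int limit, int offset) {
        return " LIMIT " + limit + " OFFSET " + offset;
    }
}
```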
> Implement EntityBatchIterator for large data set queries
> --------------------------------------------------------
>
> Key: OFBIZ-11789
> URL: https://issues.apache.org/jira/browse/OFBIZ-11789
> Project: OFBiz
> Issue Type: Improvement
> Components: framework/entity
> Affects Versions: Trunk
> Reporter: Pawan Verma
> Assignee: Pawan Verma
> Priority: Minor
>
> Dev Mail Thread:
> [https://markmail.org/search/EntityBatchIterator+for+large+data+set+queries]
> While working with a large database we found that very large queries
> consume all available memory and crash OFBiz, because queryIterator()
> doesn't actually stream results; in practice it is no different from
> queryList().
> The EntityListIterator attempts to use a cursor to iterate over large
> result sets, but in reality most databases do not give us a cursor unless
> we ask for it in a very specific way; instead the driver returns the full
> result set, potentially consuming a large amount of memory. For example,
> the MySQL details are here (ResultSet section):
> [https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html]
> To work around this we've built EntityBatchIterator, which pages through
> results much like a paginated web page does. The iterator takes an
> EntityQuery; the first call to next() fetches the first 500 records and
> returns the first one, each subsequent call returns the next record, and
> once 500 records have been consumed it re-runs the query with an offset
> to fetch the next 500, and so on until there are no more records.
> The main downsides to this approach are:
> 1. As with any limit/offset paging, you must make sure the results come
> back in a consistent order, otherwise you may accidentally skip some rows
> and see others twice.
> 2. Because the results come from many queries rather than a single one,
> some of the data may change while you are paging through it, e.g. if you
> sort by lastUpdatedTxStamp you may see some rows twice as they are
> updated by other transactions (which might even be desirable in some
> cases).
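The paging behaviour described above can be sketched as a plain Java iterator. The class name and the fetch-function shape below are illustrative assumptions, not the actual EntityBatchIterator API; in OFBiz the real iterator is obtained from an EntityQuery via queryBatchIterator(), per the commit above.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

// Minimal sketch of the batching idea: page through a query's results in
// fixed-size chunks by re-running it with an increasing offset. The
// fetchPage function stands in for "run the EntityQuery with offset/limit".
public class BatchIterator<T> implements Iterator<T> {
    private final BiFunction<Integer, Integer, List<T>> fetchPage; // (offset, limit) -> one page
    private final int batchSize;
    private int offset = 0;
    private List<T> current = null;
    private int index = 0;

    public BatchIterator(BiFunction<Integer, Integer, List<T>> fetchPage, int batchSize) {
        this.fetchPage = fetchPage;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        if (current == null || index >= current.size()) {
            // A short page means the previous fetch hit the end of the results.
            if (current != null && current.size() < batchSize) {
                return false;
            }
            // Re-run the query with the next offset to fetch the next batch.
            current = fetchPage.apply(offset, batchSize);
            offset += batchSize;
            index = 0;
        }
        return index < current.size();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return current.get(index++);
    }
}
```

A caller just loops with hasNext()/next(); to avoid the ordering pitfall in downside 1, the underlying query should be ordered by a stable key such as the primary key.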
--
This message was sent by Atlassian Jira
(v8.3.4#803005)