Hello Pawan,

I just had a look into the EntityQuery.queryIterator() method and looks
like we can achieve that by using fetchSize(), fowardOnly(),
cursorScrollInsensitive(), cursorScrollSensitive() and offset() methods in
EntityQuery class. Let me know if I am missing anything.

It will be good if you can post a pseudo code or something here so that we
could get an understanding of the exact design which you have in your mind.

Kind Regards,
--
Pritam Kute


On Thu, May 21, 2020 at 7:41 PM Pawan Verma <pawan.ve...@hotwaxsystems.com>
wrote:

> Hello Devs,
>
> While working on the large database we have figured out that very large
> queries consume all memory and crash ofbiz(because queryIterator() doesn't
> really work, it's no different from queryList())
>
> The EntityListIterator attempts to use a cursor to iterate over large
> result sets but in reality most databases do not give us a cursor unless we
> ask for it in a very specific way, and instead you get back the full result
> set and potentially consume a large amount of memory.  For example, the
> MySql details are here (ResultSet section):
>
> https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html
>
> To work around this we've built the EntityBatchIterator which instead
> basically just pages through results like you would do on a webpage that
> uses pagination.  The iterator takes an EntityQuery and when next() is
> called it grabs the first 500 records and returns the first record, call
> next again and you get the second record, once you've called next 500 times
> it runs the query again with an offset and grabs the next 500 and so on
> until there are no more records.
>
> The main downsides to this approach are:
> 1. Same as when using limit/offset you want to be sure your results will
> come back in a consistent order to avoid accidentally skipping some rows
> and seeing other rows twice.
> 2. Because the results are a combination of many queries rather than a
> single query, some of the data may change while you are paging through it.
> i.e. if you were to sort by lastUpdatedTxStamp you may see some rows twice
> as they are updated by other transactions (this might be a good thing in
> some cases).
>
> So, the main proposal is to introduce a new EntityBatchIterator for large
> queries.  If we agree on the proposal I'll create a Jira and provide a
> patch for the community review. Thanks!
>
> --
> Thanks & Regards
> Pawan Verma
> Technical Consultant
> *HotWax Systems*
> *Enterprise open source experts*
> http://www.hotwaxsystems.com
>

Reply via email to