[
https://issues.apache.org/jira/browse/IGNITE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263970#comment-15263970
]
Semen Boikov commented on IGNITE-2921:
--------------------------------------
Hi Artem,
Reviewed your fix, looks good, have some minor comments:
- please change CacheWeakQueryIteratorsHolder to avoid extra AutoCloseable
creation (CacheIteratorConverter is already AutoCloseable, CacheQueryFuture
also can implement it).
- scan iterator is wrapped twice in GrdiCacheAdaper.igniteIterator: first in
CacheWeakQueryIteratorsHolder and then in GrdiCacheAdaper.igniteIterator, let's
keep only one wrapper.
- do we really always need copy nodes collection inside 'nodes(final
GridCacheContext<?, ?> cctx, @Nullable final ClusterGroup prj, @Nullable final
Integer part)'?
- please take a look at this place in PeekValueExpiryAwareIterator: we call
'unwrapBinaryIfNeeded' for key and value, then it possible that
'unwrapBinaryIfNeeded' is called one more time inside 'checkPredicate'. Can we
get rid of one call?
> ScanQueries over local partitions are not optimal
> -------------------------------------------------
>
> Key: IGNITE-2921
> URL: https://issues.apache.org/jira/browse/IGNITE-2921
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 1.5.0.final
> Reporter: Denis Magda
> Assignee: Artem Shutak
> Priority: Blocker
> Labels: community, important, performance
> Fix For: 1.6
>
> Attachments: LocalIteratorStuff.java, ScanQueryStuff.java
>
>
> Presently scan queries over local partitions are not executed optimally.
> If to run a scan query over a specific partition (by setting
> {{query.setPartition(...)}} parameter and or {{query.setLocal(true)}}) and
> start iterating over entries we will see that the Thread, that iterates over
> the data, waits for some event to happen.
> In fact the Thread waits while a system pool's thread prepares an iterator
> with entries for it and only after that iterates over the returned result
> set. The flow looks this way:
> - {{GridCacheLocalQueryFuture}} is created;
> - when {{QueryCursor.iterator().next}} is called from the app thread (the
> Thread above), {{GridCacheLocalQueryFuture.execute()}} methods puts closure
> that will prepare content for the iterator in the system pool.
> - a system Thread execute {{GridCacheQueryManager.runQuery()}} reading all
> the entries from partition and passing them back to the Thread at line 1553
> by calling {{onPageReady(...)}} method.
> The other bottleneck is that a system thread gets all the entries and passes
> them to the Thread which will lead to more garbaged Java heap especially if
> cache is {{OFFHEAP_TIRED}}.
> Run attached test ({{ScanQueryStuff}}) and you will see with Visual VM that
> most of the time the test spends executing the code from system threads.
> Finally, what have to be done:
> - if ScanQuery is supposed to be executed locally (setPartition() refers to
> local partition or setLocal is set to true) then the calling application
> thread has to iterate over the data avoiding usage of the system pool;
> - internal code mustn't read all entries from a partition initially. The
> iterator has to get one entry next after another. This will be a memory
> backpressure mechanism especially for {{OFFHEAP_TIRED}}.
> My assumption is that the fixed version has to work in a similar way to
> iteration over local entries -
> {{cache.localEntries(CachePeekMode.PRIMARY);}}. Run attached
> {{LocalIteratorStuff}} to see with Visual VM that the application thread is
> fully utilized and system threads are idle.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)