[jira] [Commented] (IGNITE-2921) ScanQueries over local partitions are not optimal

Yakov Zhdanov (JIRA) Thu, 31 Mar 2016 01:42:43 -0700

    [ 
https://issues.apache.org/jira/browse/IGNITE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219597#comment-15219597
 ]


Yakov Zhdanov commented on IGNITE-2921:
---------------------------------------

If it is true that we have several thread switches for a local query, then it 
is really weird. We need to re-approach it.

Note: there is a possible race condition - an entry can be moved between 
swap/offheap/onheap during iteration, but we still must not miss it. I am 
pretty sure there is already a special listener for that - migrated entries 
are put into a special buffer and included in the iteration.

While working on this ticket we need to add a setting 
Query.swapEventsBufferSize() which should set the size of this buffer per 
query. QueryCursor.iterator().next() should throw an exception on buffer 
overflow. Why do I think we need to expose this property? After these changes 
raw data iteration can take more time, since user logic will be executed for 
each entry. So the user needs a way to control this.
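
To make the buffer idea concrete, here is a minimal sketch in plain Java. It is 
not Ignite code: the class BufferedScanIterator, the onEntryMigrated() callback 
and the bufSize parameter are hypothetical names that stand in for the 
migration listener and the proposed swapEventsBufferSize setting. It only 
illustrates the behavior described above: migrated entries go into a bounded 
buffer, are returned by the iteration, and next() fails once the buffer has 
overflowed.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;

/**
 * Hypothetical sketch, not Ignite code: iterates over a base data set and also
 * returns entries that a migration listener reported while the scan was running.
 */
final class BufferedScanIterator<T> implements Iterator<T> {
    private final Iterator<T> base;
    private final Queue<T> migrated = new ArrayDeque<>();
    private final int bufSize;   // stands in for the proposed swapEventsBufferSize
    private boolean overflowed;

    BufferedScanIterator(Iterator<T> base, int bufSize) {
        this.base = base;
        this.bufSize = bufSize;
    }

    /** Assumed to be called by a listener when an entry moves between tiers. */
    synchronized void onEntryMigrated(T entry) {
        if (migrated.size() >= bufSize)
            overflowed = true;   // remember the overflow, report it to the consumer
        else
            migrated.add(entry);
    }

    @Override public synchronized boolean hasNext() {
        // Report true after an overflow so that next() gets a chance to throw.
        return overflowed || !migrated.isEmpty() || base.hasNext();
    }

    @Override public synchronized T next() {
        if (overflowed)
            throw new IllegalStateException(
                "Migrated entries buffer overflow [bufSize=" + bufSize + ']');

        // Drain migrated entries first so the buffer stays small.
        if (!migrated.isEmpty())
            return migrated.poll();

        return base.next();      // throws NoSuchElementException when exhausted
    }
}
```

The sketch does not deduplicate: an entry that was already returned from the 
base iteration and then migrated would be returned twice. Whether the real 
listener has to handle that is outside of what this comment states.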

I would make only these minimal changes now, since once the disk store is 
ready we will have to return to scan queries again.

> ScanQueries over local partitions are not optimal
> -------------------------------------------------
>
>                 Key: IGNITE-2921
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2921
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 1.5.0.final
>            Reporter: Denis Magda
>            Priority: Blocker
>              Labels: community, important
>             Fix For: 1.6
>
>         Attachments: LocalIteratorStuff.java, ScanQueryStuff.java
>
>
> Presently scan queries over local partitions are not executed optimally. 
> If we run a scan query over a specific partition (by setting 
> {{query.setPartition(...)}} and/or {{query.setLocal(true)}}) and start 
> iterating over the entries, we will see that the Thread that iterates over 
> the data waits for some event to happen.
> In fact the Thread waits while a system pool thread prepares an iterator 
> with entries for it, and only after that iterates over the returned result 
> set. The flow looks this way:
> - {{GridCacheLocalQueryFuture}} is created;
> - when {{QueryCursor.iterator().next}} is called from the app thread (the 
> Thread above), the {{GridCacheLocalQueryFuture.execute()}} method submits to 
> the system pool a closure that will prepare content for the iterator;
> - a system thread executes {{GridCacheQueryManager.runQuery()}}, reading all 
> the entries from the partition and passing them back to the Thread at line 
> 1553 by calling the {{onPageReady(...)}} method.
> The other bottleneck is that a system thread reads all the entries and 
> passes them to the Thread, which produces more garbage on the Java heap, 
> especially if the cache is {{OFFHEAP_TIERED}}.
> Run the attached test ({{ScanQueryStuff}}) and you will see in VisualVM that 
> the test spends most of its time executing code in system threads.
> Finally, what has to be done:
> - if a ScanQuery is supposed to be executed locally (setPartition() refers 
> to a local partition or setLocal is set to true), then the calling 
> application thread has to iterate over the data, avoiding usage of the 
> system pool;
> - internal code mustn't read all entries from a partition up front. The 
> iterator has to fetch entries one at a time. This will act as a memory 
> backpressure mechanism, especially for {{OFFHEAP_TIERED}}.
> My assumption is that the fixed version has to work in a similar way to 
> iteration over local entries - 
> {{cache.localEntries(CachePeekMode.PRIMARY);}}. Run the attached 
> {{LocalIteratorStuff}} to see in VisualVM that the application thread is 
> fully utilized and system threads are idle. 
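
The difference between the current flow and the requested one can be sketched 
in plain Java. This is only an illustration under stated assumptions, not the 
Ignite implementation: LocalScanSketch, eagerScan() and lazyScan() are 
hypothetical names, and an Iterable stands in for a local partition. eagerScan() 
mimics the flow from the description (a pool thread reads the whole partition 
while the caller blocks), lazyScan() mimics the proposed one (the calling thread 
pulls entries one at a time).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch, not Ignite code. */
final class LocalScanSketch {
    /**
     * Eager variant: a pool thread materializes the whole partition into a list
     * and the calling thread blocks until that list is ready.
     */
    static <T> Iterator<T> eagerScan(Iterable<T> partition, ExecutorService sysPool) {
        Future<List<T>> page = sysPool.submit(() -> {
            List<T> all = new ArrayList<>();

            for (T e : partition)
                all.add(e);      // every entry is copied before the caller sees any

            return all;
        });

        try {
            return page.get().iterator();   // thread switch: caller waits here
        }
        catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Scan failed", e);
        }
    }

    /**
     * Lazy variant: the calling thread iterates the partition directly, so an
     * entry is read only when next() asks for it.
     */
    static <T> Iterator<T> lazyScan(Iterable<T> partition) {
        return partition.iterator();
    }
}
```

With the lazy variant the number of entries held in memory is bounded by what 
the consumer has pulled so far, which is the backpressure effect the 
description asks for.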



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
