[ 
https://issues.apache.org/jira/browse/AURORA-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15727059#comment-15727059
 ] 

Mehrdad Nurolahzade commented on AURORA-1847:
---------------------------------------------

Obviously this would no longer be a problem if/when our move to {{DBTaskStore}}
is finalized. We plan to revisit the impediments to that move soon.

In the meantime, this can be a band-aid that improves the performance of
loading the scheduler landing page by almost three orders of magnitude for us
(results from my quick & dirty fix):
{code}
Benchmark                                       (numTasks)   Mode  Cnt       Score       Error  Units
TaskStoreBenchmarks.DBFetchTasksBenchmark.run        10000  thrpt    5  239816.089 ± 21423.880  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run        50000  thrpt    5  317320.217 ± 27734.522  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       100000  thrpt    5  316582.626 ± 66012.270  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run       10000  thrpt    5  544172.191 ± 46109.756  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run       50000  thrpt    5  344869.887 ± 35948.155  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run      100000  thrpt    5  345617.654 ± 51053.176  ops/s
{code}
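
For illustration, the idea suggested in the issue description (return the key
set of the existing {{job}} secondary index instead of scanning every task) can
be sketched roughly as follows. This is a minimal, self-contained sketch and
not the actual Aurora code: the class {{JobIndexSketch}}, its {{Task}} type, the
string job keys, and all method names are hypothetical stand-ins.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified in-memory task store; NOT Aurora's MemTaskStore.
public class JobIndexSketch {
  // Hypothetical task: an ID plus the key of the job it belongs to.
  static final class Task {
    final String id;
    final String jobKey;

    Task(String id, String jobKey) {
      this.id = id;
      this.jobKey = jobKey;
    }
  }

  private final Map<String, Task> tasksById = new HashMap<>();
  // Secondary index: job key -> IDs of the tasks currently in that job.
  private final Map<String, Set<String>> jobIndex = new HashMap<>();

  void save(Task task) {
    Task previous = tasksById.put(task.id, task);
    if (previous != null) {
      removeFromIndex(previous);
    }
    jobIndex.computeIfAbsent(task.jobKey, k -> new HashSet<>()).add(task.id);
  }

  void delete(String taskId) {
    Task removed = tasksById.remove(taskId);
    if (removed != null) {
      removeFromIndex(removed);
    }
  }

  private void removeFromIndex(Task task) {
    Set<String> ids = jobIndex.get(task.jobKey);
    if (ids != null) {
      ids.remove(task.id);
      // Drop empty entries so the index key set only contains live jobs.
      if (ids.isEmpty()) {
        jobIndex.remove(task.jobKey);
      }
    }
  }

  // Sequential scan: visits every task, so cost grows with the task count.
  Set<String> getJobKeysByScan() {
    Set<String> keys = new HashSet<>();
    for (Task task : tasksById.values()) {
      keys.add(task.jobKey);
    }
    return keys;
  }

  // Index-based: copies only the index's key set, so cost grows with the
  // number of jobs rather than the number of tasks.
  Set<String> getJobKeysFromIndex() {
    return Collections.unmodifiableSet(new HashSet<>(jobIndex.keySet()));
  }
}
```

The two methods only agree if empty index entries are pruned when the last task
of a job is removed, which this sketch does in {{removeFromIndex}}; whether the
real index behaves that way would need to be checked against the actual store.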

> Eliminate sequential scan in MemTaskStore.getJobKeys()
> ------------------------------------------------------
>
>                 Key: AURORA-1847
>                 URL: https://issues.apache.org/jira/browse/AURORA-1847
>             Project: Aurora
>          Issue Type: Story
>          Components: Efficiency, UI
>            Reporter: Mehrdad Nurolahzade
>            Priority: Minor
>              Labels: newbie
>
> The existing {{TaskStoreBenchmarks}} shows {{DBTaskStore}} is roughly three
> to four orders of magnitude faster than {{MemTaskStore}} when it comes to
> {{getJobKeys()}}:
> {code}
> Benchmark                                       (numTasks)   Mode  Cnt       Score       Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run        10000  thrpt    5  320271.082 ± 30842.727  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run        50000  thrpt    5  334805.551 ± 20435.139  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       100000  thrpt    5  317395.890 ± 45302.180  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run       10000  thrpt    5     624.944 ±    54.038  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run       50000  thrpt    5      91.335 ±     9.241  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run      100000  thrpt    5      27.712 ±     8.128  ops/s
> {code}
> If the scheduler is configured to run with {{MemTaskStore}}, every hit on the
> scheduler page ({{/scheduler}}) causes a call to
> {{MemTaskStore.getJobKeys()}}.
> The implementation of this method is currently very inefficient: it performs
> a sequential scan of the task store and then maps each task to its job key.
> Both the scan and the mapping can be eliminated by simply returning the key
> set of the existing secondary index {{job}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
