[ https://issues.apache.org/jira/browse/AURORA-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15726960#comment-15726960 ]
Zameer Manji commented on AURORA-1847:
--------------------------------------

Could this be resolved by moving to {{DBTaskStore}}, or does that have too many drawbacks?

> Eliminate sequential scan in MemTaskStore.getJobKeys()
> ------------------------------------------------------
>
>                 Key: AURORA-1847
>                 URL: https://issues.apache.org/jira/browse/AURORA-1847
>             Project: Aurora
>          Issue Type: Story
>          Components: Efficiency, UI
>            Reporter: Mehrdad Nurolahzade
>            Priority: Minor
>              Labels: newbie
>
> The existing {{TaskStoreBenchmarks}} show that {{DBTaskStore}} is almost two
> orders of magnitude faster than {{MemTaskStore}} when it comes to
> {{getJobKeys()}}:
> {code}
> Benchmark                                       (numTasks)   Mode  Cnt      Score      Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run        10000  thrpt    5  78430.531 ± 3255.027  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run        50000  thrpt    5  50774.988 ± 8986.951  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       100000  thrpt    5   2480.074 ± 9833.122  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run       10000  thrpt    5   1189.568 ±  108.146  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run       50000  thrpt    5    124.990 ±   27.605  ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run      100000  thrpt    5     35.724 ±   15.101  ops/s
> {code}
> If the scheduler is configured to run with {{MemTaskStore}}, every hit on
> the scheduler page ({{/scheduler}}) causes a call to
> {{MemTaskStore.getJobKeys()}}.
> The implementation of this method is currently very inefficient: it performs
> a sequential scan of the task store and then maps each task to its
> job key. Both the scan and the mapping can be eliminated by
> simply returning the key set of the existing secondary index {{job}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
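
As a rough illustration of the change proposed in the issue description, the sketch below contrasts a sequential scan with returning the key set of a secondary index. It is not Aurora's actual code: {{JobKeyIndexSketch}}, {{Task}}, {{Store}}, and all field and method names are hypothetical, and job keys are simplified to plain strings.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class JobKeyIndexSketch {
    // Hypothetical stand-in for a stored task; not Aurora's real task type.
    static final class Task {
        final String id;
        final String jobKey;

        Task(String id, String jobKey) {
            this.id = id;
            this.jobKey = jobKey;
        }
    }

    // Hypothetical in-memory store with a primary map and a secondary index by job.
    static final class Store {
        private final Map<String, Task> tasksById = new HashMap<>();
        private final Map<String, Set<String>> taskIdsByJob = new HashMap<>();

        void save(Task task) {
            Task previous = tasksById.put(task.id, task);
            if (previous != null) {
                removeFromIndex(previous);
            }
            taskIdsByJob.computeIfAbsent(task.jobKey, k -> new HashSet<>()).add(task.id);
        }

        void delete(String taskId) {
            Task removed = tasksById.remove(taskId);
            if (removed != null) {
                removeFromIndex(removed);
            }
        }

        // Drop empty index entries so the index key set only names jobs that still have tasks.
        private void removeFromIndex(Task task) {
            Set<String> ids = taskIdsByJob.get(task.jobKey);
            if (ids != null) {
                ids.remove(task.id);
                if (ids.isEmpty()) {
                    taskIdsByJob.remove(task.jobKey);
                }
            }
        }

        // Sequential scan: visits every task and maps it to its job key, O(number of tasks).
        Set<String> getJobKeysByScan() {
            Set<String> keys = new HashSet<>();
            for (Task task : tasksById.values()) {
                keys.add(task.jobKey);
            }
            return keys;
        }

        // Index-based: copies the key set of the secondary index, O(number of jobs).
        Set<String> getJobKeysFromIndex() {
            return new HashSet<>(taskIdsByJob.keySet());
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.save(new Task("task-1", "role/env/job-a"));
        store.save(new Task("task-2", "role/env/job-a"));
        store.save(new Task("task-3", "role/env/job-b"));
        // Both approaches should agree on the set of job keys.
        System.out.println(store.getJobKeysByScan().equals(store.getJobKeysFromIndex()));
    }
}
```

The two methods only agree if the index drops a job's entry once its last task is removed. Whether the real {{job}} index behaves that way is an assumption that would need to be checked against the actual {{MemTaskStore}} code.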