[
https://issues.apache.org/jira/browse/ASTERIXDB-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chen Luo updated ASTERIXDB-2453:
--------------------------------
Description:
The current constant merge policy has a very high merge cost (O(n*n), where n
is the number of records), and is thus seldom used in practice. However, it
still has a desirable property that read cost is always bounded. From the
user's perspective, this policy is also easy to tune - only a single parameter
of the number of components.
To improve the write cost of the constant merge policy, we will adopt the idea
of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This policy
significantly improves the merge cost to O(K*n^(1+1/K)), where K is the maximum
number of components, and n is the total number of records (or flushes).
Another desirable property is that this policy only has write cost O(n log n)
(similar to the current prefix policy) when n is relatively small (the number
of flushes < 4^K).
was:
The current constant merge policy has a very high merge cost (O(n*n), where n
is the number of records), and is thus seldom used in practice. However, it
still has a desirable property that read cost is always bounded. From the
user's perspective, this policy is also easy to tune - only a single parameter
of the number of components.
To improve the write cost of the constant merge policy, we will adopt the idea
of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This policy
significantly improves the merge cost to O(K*n^{1+1/K}), where K is the bound
of number of components, and n is the total number of records. Another
desirable property is that this policy only has write cost O(n log n) (similar
to the current prefix policy) when n is relatively small (the number of flushes
< 4^K).
> Improve the Constant Merge Policy
> ---------------------------------
>
> Key: ASTERIXDB-2453
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2453
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: STO - Storage
> Reporter: Chen Luo
> Assignee: Chen Luo
> Priority: Major
>
> The current constant merge policy has a very high merge cost (O(n*n), where n
> is the number of records), and is thus seldom used in practice. However, it
> still has a desirable property that read cost is always bounded. From the
> user's perspective, this policy is also easy to tune - only a single
> parameter of the number of components.
> To improve the write cost of the constant merge policy, we will adopt the
> idea of Binomial policy proposed by https://arxiv.org/abs/1407.3008. This
> policy significantly improves the merge cost to O(K*n^(1+1/K)), where K is
> the maximum number of components, and n is the total number of records (or
> flushes). Another desirable property is that this policy only has write cost
> O(n log n) (similar to the current prefix policy) when n is relatively small
> (the number of flushes < 4^K).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)