[
https://issues.apache.org/jira/browse/HDFS-15409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141523#comment-17141523
]
Danil Lipovoy commented on HDFS-15409:
--------------------------------------
Actually, this test does not depend on clientShortCircuitNum, because it shows the
distribution of all blocks by the last digit. We can group the digits however we want.
For example, with the current approach and clientShortCircuitNum = 3 we get:
clientShortCircuitNum[0] = digits 0, 3, 6, 9 = 140883 + 152024 + 182202 + 152268 =
627 377 = 38% of blocks
clientShortCircuitNum[1] = digits 1, 4, 7 = 212124 + 141270 + 152417 = 505 811 = 31%
of blocks
clientShortCircuitNum[2] = digits 2, 5, 8 = 152218 + 157903 + 209427 = 519 548 = 31%
of blocks
Here we get a slight imbalance (10 cannot be divided by 3 without a
remainder).
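The arithmetic above can be reproduced with a short sketch. It assumes, as the grouping above implies, that the cache index is the last digit modulo clientShortCircuitNum; the names `blocks_by_last_digit` and `cache_totals` are hypothetical and exist only for this illustration.

```python
# Block counts per last digit, taken from the test results quoted above.
blocks_by_last_digit = {
    0: 140883, 1: 212124, 2: 152218, 3: 152024, 4: 141270,
    5: 157903, 6: 182202, 7: 152417, 8: 209427, 9: 152268,
}


def cache_totals(counts, num_caches):
    """Sum block counts per cache, assuming index = last digit % num_caches."""
    totals = [0] * num_caches
    for digit, count in counts.items():
        totals[digit % num_caches] += count
    return totals


totals = cache_totals(blocks_by_last_digit, 3)
grand_total = sum(totals)
for index, total in enumerate(totals):
    # Print the absolute count and the share of all blocks for each cache.
    print(f"clientShortCircuitNum[{index}]: {total} blocks ({total / grand_total:.0%})")
```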
So we could do something smarter for the case when clientShortCircuitNum = 3.
# Take the block ID modulo 100 (instead of 10).
# If the last two digits are between 0 and 32, put the block into
clientShortCircuitNum[0] -> 33% of blocks
# If the last two digits are between 33 and 65, put the block into
clientShortCircuitNum[1] -> 33% of blocks
# If the last two digits are between 66 and 99, put the block into
clientShortCircuitNum[2] -> 34% of blocks
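A minimal sketch of this three-way split, assuming non-negative block IDs and uniformly distributed last two digits. The function name `cache_index_for_three` is hypothetical and not part of any patch. Counting the values in each range shows how many of the 100 possible remainders land in each cache.

```python
def cache_index_for_three(block_id):
    """Map a block ID to one of three caches by its last two digits."""
    last_two = block_id % 100
    if last_two <= 32:
        return 0  # remainders 0..32
    if last_two <= 65:
        return 1  # remainders 33..65
    return 2      # remainders 66..99


# Count how many of the 100 possible remainders fall into each cache.
bucket_sizes = [0, 0, 0]
for last_two in range(100):
    bucket_sizes[cache_index_for_three(last_two)] += 1
print(bucket_sizes)  # [33, 33, 34]
```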
We can use similar logic when clientShortCircuitNum = 4:
# Take the block ID modulo 100.
# If the last two digits are between 0 and 24, put the block into
clientShortCircuitNum[0] -> 25% of blocks
# If the last two digits are between 25 and 49, put the block into
clientShortCircuitNum[1] -> 25% of blocks
# If the last two digits are between 50 and 74, put the block into
clientShortCircuitNum[2] -> 25% of blocks
# If the last two digits are between 75 and 99, put the block into
clientShortCircuitNum[3] -> 25% of blocks
When clientShortCircuitNum = 2 or 5 we can keep the current approach, because it
splits the blocks quite well (10 is divisible by 2 and by 5 without a remainder).
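The range-based idea could perhaps be generalized into a single formula instead of per-value rules. The sketch below is only an assumption about how that might look, not something proposed in the issue: `cache_index` and `bucket_sizes` are hypothetical names, block IDs are assumed non-negative, and for clientShortCircuitNum = 3 the boundaries come out one remainder off from the ranges listed above (34/33/33 instead of 33/33/34).

```python
def cache_index(block_id, num_caches):
    """Scale the last two digits (0..99) into the range 0..num_caches-1."""
    last_two = block_id % 100
    return last_two * num_caches // 100


def bucket_sizes(num_caches):
    """Count how many of the 100 possible remainders map to each cache."""
    sizes = [0] * num_caches
    for last_two in range(100):
        sizes[cache_index(last_two, num_caches)] += 1
    return sizes


for num_caches in (2, 3, 4, 5):
    print(num_caches, bucket_sizes(num_caches))
```

Unlike the current approach, this maps contiguous ranges of remainders to a cache, so for 2 and 5 caches the split is equally even but individual blocks land in different caches.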
What do you think?
> Optimization Strategy for choosing ShortCircuitCache
> -----------------------------------------------------
>
> Key: HDFS-15409
> URL: https://issues.apache.org/jira/browse/HDFS-15409
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Lisheng Sun
> Assignee: Lisheng Sun
> Priority: Major
>
> When clientShortCircuitNum is 10, the probability of falling into each
> ShortCircuitCache is the same, while for other values of
> clientShortCircuitNum the probabilities differ.
> For example, if clientShortCircuitNum is 3 and a lot of the block IDs of SSR are
> ***1, ***4, ***7, they will all fall into a single ShortCircuitCache.
> Since block IDs in a real environment are completely unpredictable, I think we
> need to design a strategy for allocating blocks to a specific ShortCircuitCache.
> This should improve performance even more.