[ https://issues.apache.org/jira/browse/CASSANDRA-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381053#comment-14381053 ]

Benedict edited comment on CASSANDRA-8993 at 3/25/15 11:51 PM:
---------------------------------------------------------------

bq.  that it should hopefully clarify the reason.

Unfortunately not, at least for me. I'm afraid I don't really follow the 
Downsampling logic, and the comment actually confused me further. I tried 
printing out the various static method states in Downsampling to visualise 
them better, and it still doesn't make much sense to me, so I'll outline my 
confusion and we can decide either to help me understand or perhaps to get 
[~jbellis] (the original reviewer, iirc) to step in. I should note that, 
either way, I expect the test you've introduced to be sound for solving this 
problem, so if this takes too long we can roll a version without further 
ceremony. However, I think it may be an unnecessarily lengthy test on startup 
for old-format files, which could be onerous. Of course, the fact that I 
don't understand means my expectation could also be wrong, negating the 
point of review.

bq. Unfortunately, the first entry to be dropped is the entry at index 
(BASE_SAMPLING_LEVEL - 1), so we need to check a full set of 
BASE_SAMPLING_LEVEL entries.

If I print out the "original indices" and "effective intervals", it seems 
that at the first downsampling level (64) alternating summary entries are 
dropped (and again for each further level), up to dropping all but every 
128th entry (beginning with zero), so the first half of the comment doesn't 
seem to match the behaviour I see, at least not in a way I understand. If 
the behaviour matches what I see, and not the comment, then it seems we 
could safely check just the first two "expected" intervals and, if they 
mismatch, conclude we've downsampled and need to rebuild. This would 
translate into always checking just 2 * minIndexInterval records in the 
index, or 1/64th of the data.
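
To make the proposal concrete, here's a rough sketch of the check I have in 
mind, written against hypothetical stand-in arrays (summaryOffsets and 
indexOffsets are illustrative placeholders, not the real SSTable structures):

{code:java}
// Sketch only: if downsampling always widens the gap between the earliest
// surviving summary entries, then comparing the first two intervals against
// minIndexInterval is enough to detect it, and we never read more than
// ~2 * minIndexInterval index records.
static boolean firstTwoIntervalsMatch(long[] summaryOffsets, long[] indexOffsets, int minIndexInterval)
{
    for (int i = 0; i < 2 && i + 1 < summaryOffsets.length; i++)
    {
        // count the index entries falling between summary entries i and i + 1
        int entries = 0;
        for (long offset : indexOffsets)
            if (offset >= summaryOffsets[i] && offset < summaryOffsets[i + 1])
                entries++;
        if (entries != minIndexInterval)
            return false; // spacing is wrong: the summary was downsampled, rebuild it
    }
    return true; // first two intervals match: no downsampling detected
}
{code}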

Further confusion about Downsampling as a whole stems from the fact that a 
-1 index is permitted into getEffectiveIndexIntervalAfterIndex without 
explanation, and from the fact that every effective interval is the same 
despite there being multiple avenues for calculating it (it would be much 
clearer if it were just minIndexInterval * (BASE_SAMPLING_LEVEL / 
samplingLevel)).
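
For illustration, the single expression I'd expect to suffice (my reading of 
the intent, not the actual Downsampling code):

{code:java}
static final int BASE_SAMPLING_LEVEL = 128;

// e.g. minIndexInterval = 128, samplingLevel = 64 -> 256 index entries
// between consecutive surviving summary entries
static int effectiveInterval(int minIndexInterval, int samplingLevel)
{
    return minIndexInterval * (BASE_SAMPLING_LEVEL / samplingLevel);
}
{code}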

I apologise if I'm just being a bit dimwitted.


> EffectiveIndexInterval calculation is incorrect
> -----------------------------------------------
>
>                 Key: CASSANDRA-8993
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8993
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Blocker
>             Fix For: 2.1.4
>
>         Attachments: 8993-2.1-v2.txt, 8993-2.1.txt, 8993.txt
>
>
> I'm not familiar enough with the calculation itself to understand why this is 
> happening, but see discussion on CASSANDRA-8851 for the background. I've 
> introduced a test case to look for this during downsampling, but it seems to 
> pass just fine, so it may be an artefact of upgrading.
> The problem was, unfortunately, not manifesting directly because it would 
> simply result in a failed lookup. This was only exposed when early opening 
> used firstKeyBeyond, which does not use the effective interval, and provided 
> the result to getPosition().
> I propose a simple fix that ensures a bug here cannot break correctness. 
> Perhaps [~thobbs] can follow up with an investigation as to how it actually 
> went wrong?


