[ 
https://issues.apache.org/jira/browse/CASSANDRA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364440#comment-16364440
 ] 

Jordan West edited comment on CASSANDRA-14055 at 2/14/18 5:22 PM:
------------------------------------------------------------------

[~jasobrown], sorry for the trunk issues. The way {{TableMetadata}} is 
accessed/stored was changed and the test will need to be modified as a result. 
Will post a separate patch for trunk. 

[~lboutros], In my testing, the primary issue I saw was that files were removed 
from the SASI {{View}} that shouldn't be. The test writes 5 sstables (with 
sequence numbers 1-4 & 6) and during the test a compaction typically happens 
(that generates a sstable with generation 5 from sstables 1-4). The final SASI 
{{View}} when the queries are performed should contain either (1-4, 6) or (5, 
6)*. The test fails by returning 8 keys instead of 10 when the SASI {{View}} 
ends up containing only sstable 5 or by returning 0 keys instead of 10 when the 
SASI {{View}} ends up empty. 

The issue occurs when index redistribution completes. Depending on the 
interleaving* of events (the memtable flush, compaction, and redistribution), 
redistribution re-opens sstable 6, and sometimes re-opens sstable 5. This 
results in an {{SSTableListChangedNotification}}, which in turn results in the 
creation of a new {{View}},  where {{added=[6]}} (or {{added=[5,6]}}) and 
{{removed=[6]}} (or {{removed=[5,6]}}). The SASI {{View}} was written assuming 
these two sets were disjoint, which is why any reader in {{oldSSTables}} caused 
the index to be closed. This is incorrect in both cases because sstables 5 and 
6 are indeed the active data files (5 contains keys 0-8, and 6 contains keys 9 
& 10). 

Regarding the ref counting, we want to maintain one reference to sstables 5 & 6 
via their SSTableIndex instance but we’ve created a second reference and one 
needs to be closed. This is ensured by the 
{{newView.containsKey(sstable.descriptor)}} part of the conditional (so we are 
still indeed calling {{#release()}} on one instance). As I am writing this, 
however, I am realizing we want to keep a reference to the newer index, which 
references the newer SSTable instance and my patch does the opposite — keeping 
the old instance. I will post an updated patch along with my trunk patch for 
internal review, but the gist is to change the order we iterate over the old 
view and new indexes to favor new index instances.

NOTE: I've ignored https://issues.apache.org/jira/browse/CASSANDRA-14207 above

*I've found a few other interleavings by using another machine, but the general 
issue is the same.


was (Author: jrwest):
[~jasobrown], sorry for the trunk issues. The way {{TableMetadata}} is 
accessed/stored was changed and the test will need to be modified as a result. 
Will post a separate patch for trunk. 

[~lboutros], In my testing, the primary issue I saw was that files were removed 
from the SASI {{View}} that shouldn't be. The test writes 5 sstables (with 
sequence numbers 1-4 & 6) and during the test a compaction typically happens 
(that generates a sstable with generation 5 from sstables 1-4). The final SASI 
{{View}} when the queries are performed should contain either (1-4, 6) or (5, 
6)*. The test fails by returning 8 keys instead of 10 when the SASI {{View}} 
ends up containing only sstable 5 or by returning 0 keys instead of 10 when the 
SASI {{View}} ends up empty. 

The issue occurs when index redistribution completes. Depending on the 
interleaving* of events (the memtable flush, compaction, and redistribution), 
redistribution re-opens sstable 6, and sometimes re-opens sstable 5. This 
results in an {{SSTableListChangedNotification}}, which in turn results in the 
creation of a new {{View}},  where {{added=6}} (or {{added=[5,6]}}) and 
{{removed=6}} (or {{removed=[5,6]}}). The SASI {{View}} was written assuming 
these two sets were disjoint, which is why any reader in {{oldSSTables}} caused 
the index to be closed. This is incorrect in both cases because sstables 5 and 
6 are indeed the active data files (5 contains keys 0-8, and 6 contains keys 9 
& 10). 

Regarding the ref counting, we want to maintain one reference to sstables 5 & 6 
via their SSTableIndex instance but we’ve created a second reference and one 
needs to be closed. This is ensured by the 
{{newView.containsKey(sstable.descriptor)}} part of the conditional (so we are 
still indeed calling {{#release()}} on one instance). As I am writing this, 
however, I am realizing we want to keep a reference to the newer index, which 
references the newer SSTable instance and my patch does the opposite — keeping 
the old instance. I will post an updated patch along with my trunk patch for 
internal review, but the gist is to change the order we iterate over the old 
view and new indexes to favor new index instances.

NOTE: I've ignored https://issues.apache.org/jira/browse/CASSANDRA-14207 above

*I've found a few other interleavings by using another machine, but the general 
issue is the same.

> Index redistribution breaks SASI index
> --------------------------------------
>
>                 Key: CASSANDRA-14055
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14055
>             Project: Cassandra
>          Issue Type: Bug
>          Components: sasi
>            Reporter: Ludovic Boutros
>            Assignee: Ludovic Boutros
>            Priority: Major
>              Labels: patch
>             Fix For: 3.11.x, 4.x
>
>         Attachments: CASSANDRA-14055-jrwest.patch, CASSANDRA-14055.patch, 
> CASSANDRA-14055.patch, CASSANDRA-14055.patch
>
>
> During index redistribution process, a new view is created.
> During this creation, old indexes should be released.
> But, new indexes are "attached" to the same SSTable as the old indexes.
> This leads to the deletion of the last SASI index file and breaks the index.
> The issue is in this function : 
> [https://github.com/apache/cassandra/blob/9ee44db49b13d4b4c91c9d6332ce06a6e2abf944/src/java/org/apache/cassandra/index/sasi/conf/view/View.java#L62]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to