maedhroz commented on code in PR #2460:
URL: https://github.com/apache/cassandra/pull/2460#discussion_r1253762041
##########
src/java/org/apache/cassandra/index/sai/StorageAttachedIndexGroup.java:
##########
@@ -250,15 +250,15 @@ public void handleNotification(INotification
notification, Object sender)
SSTableAddedNotification notice = (SSTableAddedNotification)
notification;
// Avoid validation for index files just written following
Memtable flush.
- boolean validate = !notice.memtable().isPresent();
+ IndexValidation validate = notice.memtable().isPresent() ?
IndexValidation.NONE : IndexValidation.CHECKSUM;
Review Comment:
One of the general things I looked at while reviewing here was whether the
notification order of the different `INotificationConsumer` things matters.
`StorageAttachedIndexGroup` and `SecondaryIndexManager` in particular are the
interesting ones. Basically...do we checksum the things we should and not the
things we shouldn't, given we happen to have SIM registered before the index
group? Here's what I've come up with:
1.) When we stream entire SSTables w/ their attached index components (or
locally import new SSTables w/ index components already built), SIM handles the
notification, but the index build it triggers should be a no-op. Even if it
built something it wouldn't register as a full rebuild and no checksumming
would be done. Then when `StorageAttachedIndexGroup` handles the notification
here, it does the checksum, effectively protecting us from corruption
introduced by a streaming source. This all seems reasonable.
2.) When we don't stream an entire SSTable during repair or partial SSTable
bootstrapping (or locally import new SSTables w/ index components not already
built), SIM handles the notification and blocks to build SSTable-attached index
bits. Then (again, this is an accident of our ordering)
`StorageAttachedIndexGroup` handles the notification. It sees that there isn't
a Memtable and validates the checksum. _This isn't the end of the world, but it
would seem unnecessary, given we just build the indexes from the streamed
SSTables._ If we knew that all of the added SSTables didn't come with SAI
components, we'd bypass checksumming, right?
I'm not sure how exactly we want to address this, or whether we would want
to address it in this patch (although that might be nice, since we have it
loaded into our working memories), but perhaps we could agree that it's an
issue (or not, and that I'm missing something) and go from there?
The two things that come to mind to "fix" this are a.) having more
information in the notification or b.) having the index group handle
notifications *before* SIM does. (SIM only handles `SSTableAddedNotification`
FWIW).
CC @mike-tr-adamson @adelapena
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]