mike-tr-adamson commented on code in PR #2460:
URL: https://github.com/apache/cassandra/pull/2460#discussion_r1254536224
##########
src/java/org/apache/cassandra/index/sai/StorageAttachedIndexGroup.java:
##########
@@ -250,15 +250,15 @@ public void handleNotification(INotification
notification, Object sender)
SSTableAddedNotification notice = (SSTableAddedNotification)
notification;
// Avoid validation for index files just written following
Memtable flush.
- boolean validate = !notice.memtable().isPresent();
+ IndexValidation validate = notice.memtable().isPresent() ?
IndexValidation.NONE : IndexValidation.CHECKSUM;
Review Comment:
We have been looking at this in some detail today to try and establish the
best way to correct it.
From our perspective, the `SSTableAddedNotification` is handled in 2 places.
It is handled by the `SecondaryIndexManager` that with invoke a `fullRebuild =
false` index build on the added SSTables, and it is handled by the
`StorageAttachedIndexGroup` that will validate and register the SSTables.
The problem with the current implementation is:
- The invoked index build will rebuild the per-column index components for
`fullRebuild = false` even if the components are present and valid. We do,
though, only build the per-SSTable components if they are invalid, in this case.
- The SAIG handler will validate the components even though they have been
already validated by the build.
What should be noted here is that the SIM will only invoke an index build if
a memtable is not present in the notification, and the SAIG currently only
validates the index components if the memtable isn't present but still
registers the components in either case.
We feel that the following changes could be made to improve this:
1. The index build should only build the per-column indexes if
`fullRebuild=true`, the index doesn't exist, or the index is invalid. In either
case, the SSTables must be registered (on build completion or immediately).
2. The SAIG should only register the SSTables if the memtable is present and
shouldn't validate the components because this is the result of a memtable
flush.
This will require some rework of the index builder, to make sure that the
SSTables are always registered correctly and, as such, we would prefer that
this was done on a separate ticket from this one. Which phase that ticket goes
into is up for discussion. The downsides of the current implementation are that
we can rebuild indexes that already have valid components and we can end up
validating components twice.
WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]