blambov commented on code in PR #2689:
URL: https://github.com/apache/cassandra/pull/2689#discussion_r1329957899


##########
src/java/org/apache/cassandra/io/sstable/format/SSTableWriter.java:
##########
@@ -102,9 +108,38 @@ protected SSTableWriter(Builder<?, ?> builder, 
LifecycleNewTracker lifecycleNewT
         this.mmappedRegionsCache = builder.getMmappedRegionsCache();
         this.lifecycleNewTracker = lifecycleNewTracker;
 
+        // We need to ensure that no sstable components exist before the 
lifecycle transaction starts tracking it.
+        // Otherwise, it means that we either want to overwrite some existing 
sstable, which is not allowed, or some
+        // sstable files were created before the sstable is registered in the 
lifecycle transaction, which may lead
+        // to a race such that the sstable is listed as completed due to the 
lack of the transaction file before
+        // anything is actually written to it.
+        Supplier<List<Component>> existingComponents = Suppliers.memoize(() -> 
components.stream()

Review Comment:
   Nit:
   I think I gave you the wrong impression what I was looking for here. We are 
normally working with assertions enabled, so optimizing for the other case is 
not something I consider important; I was wondering about the amount of work we 
do in the case where the assertion is checked but does not fire, and this 
change is making that case worse.
   
   To me it is preferable to not materialize the resulting set, e.g. by doing:
   ```
           Set<Component> existingComponents = Sets.filter(components, c -> 
descriptor.fileFor(c).exists());
   
           assert existingComponents.isEmpty() : String.format("Cannot create a 
new SSTable in directory %s as a files for components %s already exist there", 
                                                               
descriptor.directory, 
                                                               
existingComponents);
   ```
   where we create only the filter function and set wrapper at the expense of 
having to walk and filter the set twice when the assertion fires.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to