ctubbsii commented on a change in pull request #2213:
URL: https://github.com/apache/accumulo/pull/2213#discussion_r676918934



##########
File path: 
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/CompactableImpl.java
##########
@@ -214,6 +202,9 @@ public CompactableImpl(Tablet tablet, CompactionManager 
manager,
       }
       return Set.copyOf(servicesIds);
     }, 2, TimeUnit.SECONDS);
+
+    this.fileMgr = new 
TabletCompactionFileManager(Collections.unmodifiableSet(runningJobs),
+        tablet.getExtent(), extCompactingFiles, extSelInfo);

Review comment:
       I guess it's technically correct if all operations on 
`TabletCompactionFileManager` instances are guarded with a synchronized block 
as well. However, my concern is that's very fragile and hard to enforce. It 
also implies a high degree of coordination between the two separate classes in 
one sense (for synchronization), whereas in another sense (mutable data 
access), that degree of coordination is implied by the protective use of 
`Collections.unmodifiableSet` to not exist. If they were coordinating to that 
degree (such as if `TabletCompactionFileManager` were an inner-class of 
`CompactableImpl`), I wouldn't think it necessary to do that protective 
wrapping.
   
   Additionally, it feels weird that methods on the 
`TabletCompactionFileManager` object would require synchronizing on 
`CompactableImpl.this`. This feels like `CompactableImpl`'s internal 
implementation details are leaking into the second object.
   
   I just think it might be more clear if, as long as we're separating these 
into two distinct classes, we further ensure they are decoupled, so you don't 
have to make assumptions about how the set of running jobs is shared between 
them. As is, the concurrency model seems very fragile. Some ideas:
   
   1. Synchronize on the `runningJobs` object instead of 
`CompactableImpl.this`, or via some other very explicit lock object,
   2. Don't share `runningJobs` with `TabletCompactionFileManager` at all, 
keeping all synchronization and access to `runningJobs` internal to 
`CompactableImpl`; expose what you need through `CompactableImpl` API methods
   
   With lambdas, it seems we should be able to get very explicit, and less 
error prone synchronization, all contained in `CompactableImpl`, without 
leaking anything out:
   
   ```java
   public interface CompactableImpl {
     // ...
     public synchronized <T> T synchronizedDo(Function<Set<CompactionJob>, T> 
operation) {
       return operation.apply(runningJobs);
     }
     // ...
   }
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to