cshannon commented on PR #3417: URL: https://github.com/apache/accumulo/pull/3417#issuecomment-1556214256
I just pushed a new commit to fix the failures and also started a new full IT run. The test failures turned out to be due to a class cast exception in FileManager trying to convert StoredTabletFile to TabletFile when using a Scan server because SnapshotTablet was [returning](https://github.com/apache/accumulo/blob/4d933f1a285fb1a5c323bf58062fb05d3110dd91/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/SnapshotTablet.java#L72) StoredTabletFiles when reserving files instead of TabletFile as getDataFiles() is used by [reserveFilesForScan()](https://github.com/apache/accumulo/blob/4d933f1a285fb1a5c323bf58062fb05d3110dd91/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/SnapshotTablet.java#L103). This was missed due to the suppressed warning [annotation](https://github.com/apache/accumulo/blob/4d933f1a285fb1a5c323bf58062fb05d3110dd91/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/SnapshotTablet.java#L100) for unchecked and raw types so the compiler did not pick this up. I came up with 3 ways to fix this which I explain below. I'm not entirely sure it was the best one to use so any feedback is welcome. 1. **Simplest option:** Each time [reserveFilesForScan() ](https://github.com/apache/accumulo/blob/4d933f1a285fb1a5c323bf58062fb05d3110dd91/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/SnapshotTablet.java#L102)is called in `SnapshotTablet` we could just convert the map to a new map of TabletFiles instead of StoredTabletFiles and return that. We [currently](https://github.com/apache/accumulo/blob/4d933f1a285fb1a5c323bf58062fb05d3110dd91/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/DatafileManager.java#L98) iterate already and create a new map in DataFileManager so this would not be new. It's the simplest option with the least amount of changes but I didn't do that for now as the **downside** is it requires creating a new map and having to iterate every time we call that method so I wanted to avoid that but I could easily just do that instead. Doing this would simply require changing [reserveFilesForScan() ](https://github.com/apache/accumulo/bl ob/4d933f1a285fb1a5c323bf58062fb05d3110dd91/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/SnapshotTablet.java#L102) in SnapshotTablet to something like the following with no other changes: ```java @Override public Pair<Long,Map<TabletFile,DataFileValue>> reserveFilesForScan() { return new Pair<>(0L, getDatafiles().entrySet().stream().collect(Collectors.toMap(e -> e.getKey().getTabletFile(), Map.Entry::getValue, (tf1, tf2) -> tf1, TreeMap::new))); } ``` 2. **This is the option I chose for now:** In my latest commit I updated `FileManager` to just use `StoredTabletFile` instead of TabletFile and also updated the reserveFilesForScan() in `TabletBase` to just return StoredTabletFile. This has the benefit of not having to create a new map each time reserveFilesForScan() was called for `SnapshotTablet`. The downside is more API changes and now TabletFile can't be passed into FileManager anymore, only StoredTabletFile. I think this should be fine as FileManager is only used from the ScanDataSource but not sure if it could be a problem in the future. 3. The last option was a modification of option 2 which was to update FileManager to take AbstractTabletFile instead of StoredTabletFile so either TabletFile or StoredTabletFile could be used. This is certainly more flexible and keeps compatibility with TabletFile and I almost did this but I didn't like that by using the Abstract type it could lead to issues with reserved reader caching because in the future an update could be made such that both that both StoredTabletFile and TabletFile are passed as the API would allow that between different calls for opening files. It could cause problems if they pointed to the same file as they are no longer considered equal so reserved cached readers wouldn't be re-used. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
