cshannon commented on code in PR #4568:
URL: https://github.com/apache/accumulo/pull/4568#discussion_r1604939498
##########
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletMetadata.java:
##########
@@ -156,8 +156,13 @@ private TabletMetadata(Builder tmBuilder) {
this.compacted = tmBuilder.compacted.build();
this.userCompactionsRequested = tmBuilder.userCompactionsRequested.build();
this.unSplittableMetadata = tmBuilder.unSplittableMetadata;
-    this.fileSize =
-        Suppliers.memoize(() -> files.values().stream().mapToLong(DataFileValue::getSize).sum());
+    this.fileSize = Suppliers.memoize(() -> {
Review Comment:
Is the performance difference for the sum calculation because the stream is
iterating twice over the values (once to map the size and then again to sum), or
simply the overhead of the stream itself? I think it would be good to add a
comment here explaining why this change was made, so that no one changes it
back.
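To make the question concrete, here is a hedged sketch (not the PR's actual code; `FileValue` is a hypothetical stand-in for `DataFileValue`) comparing the two approaches. Note that `mapToLong(...).sum()` is a single pass over the values, so any measured difference would come from the stream pipeline machinery rather than double iteration:

```java
import java.util.List;

public class SumComparison {
  // Hypothetical stand-in for DataFileValue.getSize(); the real class lives in Accumulo.
  record FileValue(long size) {}

  // Stream version: map + sum is one pass over the values, but each
  // element flows through the stream pipeline.
  static long streamSum(List<FileValue> files) {
    return files.stream().mapToLong(FileValue::size).sum();
  }

  // Plain loop version: the same single pass, without pipeline overhead.
  static long loopSum(List<FileValue> files) {
    long sum = 0;
    for (FileValue f : files) {
      sum += f.size();
    }
    return sum;
  }

  public static void main(String[] args) {
    List<FileValue> files = List.of(new FileValue(10), new FileValue(20), new FileValue(30));
    long a = streamSum(files);
    long b = loopSum(files);
    if (a != 60 || b != 60) {
      throw new AssertionError("sums disagree: " + a + " vs " + b);
    }
    System.out.println("both sums = " + a);
  }
}
```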
##########
server/base/src/main/java/org/apache/accumulo/server/manager/state/TabletManagementIterator.java:
##########
@@ -66,35 +68,53 @@
 * TabletManagement objects for each Tablet that needs some type of action performed on it by the
 * Manager.
*/
-public class TabletManagementIterator extends SkippingIterator {
+public class TabletManagementIterator extends WholeRowIterator {
   private static final Logger LOG = LoggerFactory.getLogger(TabletManagementIterator.class);
public static final String TABLET_GOAL_STATE_PARAMS_OPTION = "tgsParams";
private CompactionJobGenerator compactionGenerator;
private TabletBalancer balancer;
+ private final SplitConfig splitConfig = new SplitConfig();
+
+ private static class SplitConfig {
+ TableId tableId;
+ long splitThreshold;
+ long maxEndRowSize;
+ int maxFilesToOpen;
+
+    void update(TableId tableId, Configuration tableConfig) {
+      if (!tableId.equals(this.tableId)) {
+        this.tableId = tableId;
+        splitThreshold = ConfigurationTypeHelper
+            .getFixedMemoryAsBytes(tableConfig.get(Property.TABLE_SPLIT_THRESHOLD.getKey()));
+        maxEndRowSize = ConfigurationTypeHelper
+            .getFixedMemoryAsBytes(tableConfig.get(Property.TABLE_MAX_END_ROW_SIZE.getKey()));
+        maxFilesToOpen = (int) ConfigurationTypeHelper
+            .getFixedMemoryAsBytes(tableConfig.get(Property.SPLIT_MAXOPEN.getKey()));
+      }
+    }
+  }
   private static boolean shouldReturnDueToSplit(final TabletMetadata tm,
-      final Configuration tableConfig) {
+      final Configuration tableConfig, SplitConfig splitConfig) {
-    final long splitThreshold = ConfigurationTypeHelper
-        .getFixedMemoryAsBytes(tableConfig.get(Property.TABLE_SPLIT_THRESHOLD.getKey()));
-    final long maxEndRowSize = ConfigurationTypeHelper
-        .getFixedMemoryAsBytes(tableConfig.get(Property.TABLE_MAX_END_ROW_SIZE.getKey()));
-    final int maxFilesToOpen = (int) ConfigurationTypeHelper
-        .getFixedMemoryAsBytes(tableConfig.get(Property.SPLIT_MAXOPEN.getKey()));
+    // should see the same table many times in a row, so this should only read config the first time
+    // seen
+    splitConfig.update(tm.getTableId(), tableConfig);
Review Comment:
This is a nice way to save time converting properties over and over.
Another nice benefit (and I'm not sure whether this was a problem before or
whether it mattered) is that this also essentially grabs a snapshot of the
properties per iteration for the table. So you would be guaranteed that the
properties do not change midway through iterating over the same table if there
were concurrent updates.
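A minimal sketch of the caching pattern being discussed, under assumed stand-in types (a `String` table id and a `Map` for the table configuration; the PR's real `TableId`/`Configuration` types are not reproduced here). Config is re-parsed only when the table id changes, so every tablet of one table sees the same snapshot even if the underlying properties are updated concurrently:

```java
import java.util.HashMap;
import java.util.Map;

public class CachedConfig {
  // Hypothetical stand-ins for the PR's TableId / Configuration.
  private String cachedTableId;
  private long splitThreshold;

  // Re-reads config only when the table changes; rows for the same table
  // arrive consecutively, so this usually hits the cache.
  void update(String tableId, Map<String, String> tableConfig) {
    if (!tableId.equals(cachedTableId)) {
      cachedTableId = tableId;
      splitThreshold = Long.parseLong(tableConfig.get("table.split.threshold"));
    }
  }

  long splitThreshold() {
    return splitThreshold;
  }

  public static void main(String[] args) {
    CachedConfig cfg = new CachedConfig();
    Map<String, String> props = new HashMap<>();
    props.put("table.split.threshold", "1000");
    cfg.update("t1", props);
    // A concurrent-style change to the raw properties...
    props.put("table.split.threshold", "2000");
    cfg.update("t1", props); // same table: the cached snapshot is kept
    System.out.println(cfg.splitThreshold()); // prints 1000
    cfg.update("t2", props); // new table: config is re-read
    System.out.println(cfg.splitThreshold()); // prints 2000
  }
}
```

The snapshot behavior falls out of the `tableId` equality check: until a different table id arrives, the cached values are never refreshed.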
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]