ctubbsii commented on a change in pull request #2152:
URL: https://github.com/apache/accumulo/pull/2152#discussion_r647929478
##########
File path:
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletMetadata.java
##########
@@ -429,6 +434,27 @@ private void setLocationOnce(String val, String qual,
LocationType lt) {
location = new Location(val, qual, lt);
}
+ static Iterable<TabletMetadata> convert(BatchScanner input,
EnumSet<ColumnType> fetchedColumns,
+ boolean buildKeyValueMap) {
+
+ IteratorSetting iterSetting = new IteratorSetting(100,
WholeRowIterator.class);
+ input.addScanIterator(iterSetting);
+
+ Supplier<Iterator<TabletMetadata>> iterFactory = () -> {
+ return Iterators.transform(input.iterator(), entry -> {
+ try {
+ return convertRow(
+ WholeRowIterator.decodeRow(entry.getKey(),
entry.getValue()).entrySet().iterator(),
+ fetchedColumns, buildKeyValueMap);
+ } catch (IOException e) {
+ throw new RuntimeException(e);
+ }
+ });
+ };
+
+ return () -> iterFactory.get();
Review comment:
You don't need the Supplier as an intermediate. You can create an
Iterable directly. You also don't need the variable, because you can just
return it right away.
Also, there is an UncheckedIOException to wrap IOException as a RTE.
```suggestion
return () -> Iterators.transform(input.iterator(), entry -> {
try {
return convertRow(
WholeRowIterator.decodeRow(entry.getKey(),
entry.getValue()).entrySet().iterator(),
fetchedColumns, buildKeyValueMap);
} catch (IOException e) {
throw new UncheckedIOException(e);
}
});
```
##########
File path:
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -144,8 +164,24 @@ private TabletsMetadata buildNonRoot(AccumuloClient
client) {
}
}
+ private void configureColumns(ScannerBase scanner) {
+ for (Text fam : families) {
+ scanner.fetchColumnFamily(fam);
+ }
+
+ for (ColumnFQ col : qualifiers) {
+ col.fetch(scanner);
+ }
+
Review comment:
```suggestion
families.forEach(scanner::fetchColumnFamily);
qualifiers.forEach(col -> col.fetch(scanner));
```
##########
File path:
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -306,6 +355,13 @@ public Options readConsistency(ReadConsistency
readConsistency) {
*/
Options forTablet(KeyExtent extent);
+ /**
+ * Get the tablet metadata for the given extents. This will find tablets
based on end row, so
+ * its possible the prev rows could differ for the tablets returned. If
this matters then it
Review comment:
"its" -> "it's" ("it is")
```suggestion
* it's possible the prev rows could differ for the tablets returned. If
this matters, then it
```
##########
File path:
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -104,6 +115,25 @@ public TabletsMetadata build() {
}
}
+ private TabletsMetadata buildExtents(AccumuloClient client) {
+
+ try {
+ BatchScanner scanner = client.createBatchScanner(level.metaTable(),
Authorizations.EMPTY);
+
+ List<Range> ranges =
+ extents.stream().map(e ->
e.toMetaRange()).collect(Collectors.toList());
Review comment:
3 possible changes here. The first is `KeyExtent::toMetaRange` for the
map function. The other two changes are attempts to make this line shorter for
readability. Could use `var` here, or static import `toList()`. I did both
below, but it's up to you what you're comfortable with. I think it's worth
trying to get it to not wrap, though.
```suggestion
var ranges =
extents.stream().map(KeyExtent::toMetaRange).collect(toList());
```
##########
File path:
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -233,6 +269,19 @@ public Options forTablet(KeyExtent extent) {
return this;
}
+ @Override
+ public Options forTablets(Collection<KeyExtent> extents) {
+ if (!extents.stream().map(e -> DataLevel.of(e.tableId()))
+ .allMatch(dl -> dl == DataLevel.USER)) {
Review comment:
Instead of `! allMatch`, you can just use `anyMatch`:
```suggestion
if (extents.stream().map(e -> DataLevel.of(e.tableId()))
.anyMatch(dl -> dl == DataLevel.USER)) {
```
You can also split up the two steps in the map function to use method
references. Sometimes this is easier to understand:
```suggestion
if (extents.stream().map(KeyExtent::tableId).map(DataLevel::of)
.anyMatch(dl -> dl == DataLevel.USER)) {
```
##########
File path:
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -87,13 +91,20 @@
private TableId tableId;
private ReadConsistency readConsistency = ReadConsistency.IMMEDIATE;
private AccumuloClient _client;
+ private Collection<KeyExtent> extents;
Builder(AccumuloClient client) {
this._client = client;
}
@Override
public TabletsMetadata build() {
+ if (extents != null) {
+ Preconditions.checkState(
+ range == null && table == null && level == DataLevel.USER &&
!checkConsistency);
+ return buildExtents(_client);
+ }
Review comment:
This is doing a lot of state checking, but it's not clear what it's
checking for. A comment could go a long way here.
##########
File path:
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletsMetadata.java
##########
@@ -144,8 +164,24 @@ private TabletsMetadata buildNonRoot(AccumuloClient
client) {
}
}
+ private void configureColumns(ScannerBase scanner) {
+ for (Text fam : families) {
+ scanner.fetchColumnFamily(fam);
+ }
+
+ for (ColumnFQ col : qualifiers) {
+ col.fetch(scanner);
+ }
+
+ if (families.isEmpty() && qualifiers.isEmpty()) {
+ fetchedCols = EnumSet.allOf(ColumnType.class);
+ }
+ }
+
@Override
public Options checkConsistency() {
+ Preconditions.checkState(extents == null,
Review comment:
I like to statically import Preconditions.checkState (or
Preconditions.checkArgument). It can help with readability.
##########
File path:
server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/CompactionFinalizer.java
##########
@@ -148,16 +150,22 @@ private void processPending() {
List<ExternalCompactionId> statusesToDelete = new ArrayList<>();
+ Map<KeyExtent,TabletMetadata> tabletsMetadata = new HashMap<>();
+
+ List<KeyExtent> extents =
+ batch.stream().map(ecfs ->
ecfs.getExtent()).collect(Collectors.toList());
+
+ try (TabletsMetadata tablets =
context.getAmple().readTablets().forTablets(extents)
+ .fetch(ColumnType.LOCATION, ColumnType.PREV_ROW,
ColumnType.ECOMP).build()) {
+ tablets.forEach(tm -> tabletsMetadata.put(tm.getExtent(), tm));
+ }
Review comment:
Can stream directly to a map using a collector (I did a static import of
`Function.identity()` and `Collectors.toMap()` to shorten the line for
readability):
```suggestion
Map<KeyExtent,TabletMetadata> tabletsMetadata;
var extents =
batch.stream().map(ExternalCompactionFinalState::getExtent).collect(toList());
try (TabletsMetadata tablets =
context.getAmple().readTablets().forTablets(extents)
.fetch(ColumnType.LOCATION, ColumnType.PREV_ROW,
ColumnType.ECOMP).build()) {
tabletsMetadata =
tablets.stream().collect(toMap(TabletMetadata::getExtent, identity()));
}
```
##########
File path:
core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletMetadata.java
##########
@@ -429,6 +434,27 @@ private void setLocationOnce(String val, String qual,
LocationType lt) {
location = new Location(val, qual, lt);
}
+ static Iterable<TabletMetadata> convert(BatchScanner input,
EnumSet<ColumnType> fetchedColumns,
+ boolean buildKeyValueMap) {
+
+ IteratorSetting iterSetting = new IteratorSetting(100,
WholeRowIterator.class);
+ input.addScanIterator(iterSetting);
Review comment:
It seems like a bad idea to put these inside this method, as they mutate
BatchScanner, which could be re-used by the caller. This requires a significant
amount of coordination between the caller and this method's implementation,
such that it probably doesn't make sense to have this method. Since there's
only one caller, it could be inline'd... or at least made local and private for
the one caller.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]