ctubbsii commented on a change in pull request #2181:
URL: https://github.com/apache/accumulo/pull/2181#discussion_r665638698
##########
File path:
server/tserver/src/main/java/org/apache/accumulo/tserver/log/RecoveryLogsIterator.java
##########
@@ -56,15 +56,20 @@
   private final List<Scanner> scanners;
   private final Iterator<Entry<Key,Value>> iter;
 
+  public RecoveryLogsIterator(ServerContext context, List<Path> recoveryLogDirs,
+      boolean checkFirstKEy) throws IOException {
+    this(context, recoveryLogDirs, null, null, checkFirstKEy);
+  }
+
   /**
    * Scans the files in each recoveryLogDir over the range [start,end].
    */
-  RecoveryLogsIterator(ServerContext context, List<Path> recoveryLogDirs, LogFileKey start,
+  public RecoveryLogsIterator(ServerContext context, List<Path> recoveryLogDirs, LogFileKey start,
       LogFileKey end, boolean checkFirstKey) throws IOException {
     List<Iterator<Entry<Key,Value>>> iterators = new ArrayList<>(recoveryLogDirs.size());
     scanners = new ArrayList<>();
-    Range range = LogFileKey.toRange(start, end);
+    Range range = start != null ? LogFileKey.toRange(start, end) : null;
Review comment:
This is more readable without the negations:
```suggestion
Range range = start == null ? null : LogFileKey.toRange(start, end);
```
##########
File path:
server/tserver/src/main/java/org/apache/accumulo/tserver/logger/LogReader.java
##########
@@ -35,11 +36,13 @@
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.TableId;
import org.apache.accumulo.core.dataImpl.KeyExtent;
+import org.apache.accumulo.server.ServerContext;
+import org.apache.accumulo.server.fs.VolumeManager;
import org.apache.accumulo.server.fs.VolumeManagerImpl;
import org.apache.accumulo.start.spi.KeywordExecutable;
import org.apache.accumulo.tserver.log.DfsLogger;
import org.apache.accumulo.tserver.log.DfsLogger.LogHeaderIncompleteException;
-import org.apache.accumulo.tserver.log.RecoveryLogReader;
Review comment:
Is the old RecoveryLogReader not used after this? Can it be deleted?
##########
File path:
server/tserver/src/main/java/org/apache/accumulo/tserver/log/RecoveryLogsIterator.java
##########
@@ -56,15 +56,20 @@
   private final List<Scanner> scanners;
   private final Iterator<Entry<Key,Value>> iter;
 
+  public RecoveryLogsIterator(ServerContext context, List<Path> recoveryLogDirs,
+      boolean checkFirstKEy) throws IOException {
+    this(context, recoveryLogDirs, null, null, checkFirstKEy);
+  }
Review comment:
It seems like this new constructor was added for convenience, but it
doesn't have a javadoc to explain how it differs from the other constructor. If
your goal is to make it crystal clear when callers need these additional
parameters and when they don't, then instead of constructors you could use
static factory methods with descriptive names in the API.
So, instead of the following in the calling code:
```java
var iter = new RecoveryLogsIterator(context, dirs, start, stop, false);
var iter2 = new RecoveryLogsIterator(context, dirs, true);
```
it could look like:
```java
var iter = RecoveryLogsIterator.scanRange(context, dirs, start, stop, false);
var iter2 = RecoveryLogsIterator.scanAll(context, dirs); // don't think you need the boolean for this case
```
Then, you can make the constructor private (or package-private, if you need
it visible for testing), and the static factory methods can do some
additional parameter validation to ensure that when you specify a range,
neither endpoint is null.
These are just general design tips, though. For this, since the new
constructor is only ever used once, I'd probably just remove it and pass in
nulls on the old constructor for the one time it is called. Minimally, if
you're keeping the new constructor, it should have a javadoc (and you should
double check to see if you actually need to pass the boolean, since I don't
think it matters when the first key is null).
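The factory-method pattern described above can be sketched in miniature. This is a self-contained illustration only: the `scanRange`/`scanAll` names come from the suggestion above, but the `String` stand-ins for `LogFileKey` and `Path`, and the `describe()` helper, are hypothetical, not the Accumulo API.

```java
import java.util.List;
import java.util.Objects;

// Simplified sketch of the static-factory design the review suggests.
final class RecoveryLogsSketch {
    private final List<String> dirs;
    private final String start; // null means an unbounded scan
    private final String end;

    // Private constructor: every caller must go through a named factory.
    private RecoveryLogsSketch(List<String> dirs, String start, String end) {
        this.dirs = dirs;
        this.start = start;
        this.end = end;
    }

    /** Scans the given range; both endpoints must be supplied. */
    static RecoveryLogsSketch scanRange(List<String> dirs, String start, String end) {
        Objects.requireNonNull(start, "start required for a range scan");
        Objects.requireNonNull(end, "end required for a range scan");
        return new RecoveryLogsSketch(dirs, start, end);
    }

    /** Scans everything; no range endpoints needed. */
    static RecoveryLogsSketch scanAll(List<String> dirs) {
        return new RecoveryLogsSketch(dirs, null, null);
    }

    /** Illustrative helper so the two modes are observable. */
    String describe() {
        return start == null ? "all" : "range[" + start + "," + end + "]";
    }
}
```

Because the constructor is private, the null-checks in `scanRange` cannot be bypassed, and the call sites read as intent (`scanAll` vs. `scanRange`) rather than as a boolean-and-nulls puzzle.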
##########
File path:
server/tserver/src/main/java/org/apache/accumulo/tserver/logger/LogReader.java
##########
@@ -123,12 +127,16 @@ public void execute(String[] args) throws Exception {
     Set<Integer> tabletIds = new HashSet<>();
 
     for (String file : opts.files) {
-
       Path path = new Path(file);
       LogFileKey key = new LogFileKey();
       LogFileValue value = new LogFileValue();
 
       if (fs.getFileStatus(path).isFile()) {
+        if (file.endsWith(".rf")) {
+          log.error("Can not read from a single rfile. Please pass in a directory for recovery logs.");
+          continue;
+        }
         // read log entries from a simple hdfs file
         try (final FSDataInputStream fsinput = fs.open(path);
             DataInputStream input = DfsLogger.getDecryptingStream(fsinput, siteConfig)) {
Review comment:
Given this error message and new behavior, it's not clear what this
condition (single file, but not ending in ".rf") is supposed to do. Is this for
reading the older sequence files? A comment could go a long way here. The only
comment it has is "read log entries from a simple hdfs file". Clearly, that
can't be read from a simple RFile in HDFS, so that comment could be updated to
make it clear what files it *can* read (i.e. what does "simple hdfs file"
mean?).
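The branching the review is asking to have documented can be sketched as a small classifier. Everything here is hypothetical: the class and case names are stand-ins, and the reading of "simple hdfs file" as an unsorted write-ahead log is an assumption about LogReader's intent, not something stated in the thread.

```java
// Hypothetical sketch of the dispatch LogReader performs, with the kind of
// clarifying comments the review asks for; names are illustrative only.
final class WalInputKind {
    static String classify(boolean isFile, String name) {
        if (isFile) {
            if (name.endsWith(".rf")) {
                // A single sorted RFile can't be replayed on its own; the
                // caller should pass the whole sorted-recovery directory.
                return "reject-single-rfile";
            }
            // Assumed meaning of "simple hdfs file": a plain (unsorted)
            // write-ahead log written directly to HDFS.
            return "plain-wal-file";
        }
        // A directory of sorted recovery logs.
        return "recovery-dir";
    }
}
```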
##########
File path:
server/tserver/src/main/java/org/apache/accumulo/tserver/logger/LogReader.java
##########
@@ -123,12 +127,16 @@ public void execute(String[] args) throws Exception {
     Set<Integer> tabletIds = new HashSet<>();
 
     for (String file : opts.files) {
-
       Path path = new Path(file);
       LogFileKey key = new LogFileKey();
       LogFileValue value = new LogFileValue();
 
       if (fs.getFileStatus(path).isFile()) {
+        if (file.endsWith(".rf")) {
Review comment:
There's probably a constant for the RFile extension somewhere that could
be used... not that it would change.
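The constant idea can be sketched as follows; `RFILE_SUFFIX` is a local stand-in, since the thread doesn't confirm which constant (if any) Accumulo core exposes for the RFile extension.

```java
// RFILE_SUFFIX is a hypothetical local constant; if Accumulo core already
// defines the RFile extension somewhere, that should be reused instead.
final class RecoveryFileCheck {
    static final String RFILE_SUFFIX = ".rf";

    static boolean isRFile(String fileName) {
        return fileName.endsWith(RFILE_SUFFIX);
    }
}
```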
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]