simonbence commented on a change in pull request #4655:
URL: https://github.com/apache/nifi/pull/4655#discussion_r522080154
##########
File path:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
.addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
.build();
+ static final PropertyDescriptor REREAD_ON_NUL = new
PropertyDescriptor.Builder()
+ .name("reread-on-nul")
+ .displayName("Reread when NUL encountered")
+ .description("If this option is set to 'true', when a NUL
character is read, the processor will yield and try to read the same part again
later. "
+ + "The purpose of this flag is to allow users to handle cases
where reading a file may return temporary NUL values. "
+ + "NFS for example may send file contents out of order. In
this case the missing parts are temporarily replaced by NUL values. "
Review comment:
Minor: temporarily filled up?
##########
File path:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -610,7 +625,13 @@ public void onTrigger(final ProcessContext context, final
ProcessSession session
}
for (String tailFile : states.keySet()) {
- processTailFile(context, session, tailFile);
+ try {
+ processTailFile(context, session, tailFile);
+ } catch (NulCharacterEncounteredException e) {
+ getLogger().warn("NUL character encountered in " + tailFile +
" and '" + REREAD_ON_NUL.getDisplayName() + "' is set to 'true', yielding.");
+ context.yield();
+ return;
Review comment:
This will prevent the other tail files later in the collection from
processing. Is this intentional? I would expect the following test to be green:
```
@Test
public void testMultipleFilesWhenOneContainsNul() throws Exception {
// given
runner.setProperty(TailFile.BASE_DIRECTORY, "target");
runner.setProperty(TailFile.REREAD_ON_NUL, "true");
runner.setProperty(TailFile.FILENAME, "log(ging)?.txt");
runner.setProperty(TailFile.MODE, TailFile.MODE_MULTIFILE);
final File myOtherFile = new File("target/logging.txt");
if(myOtherFile.exists()) {
myOtherFile.delete();
}
final RandomAccessFile myOtherRaf = new
RandomAccessFile(myOtherFile, "rw");
runner.setProperty(TailFile.START_POSITION,
TailFile.START_CURRENT_FILE.getValue());
// The order might be not deterministic
raf.write("first_line_without_nul\n".getBytes());
myOtherRaf.write("another_line_with_nul\0\n".getBytes());
// when
runner.run();
// then
final List<MockFlowFile> flowFiles =
runner.getFlowFilesForRelationship(TailFile.REL_SUCCESS);
Assert.assertEquals(1, flowFiles.size());
}
```
##########
File path:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
.addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
.build();
+ static final PropertyDescriptor REREAD_ON_NUL = new
PropertyDescriptor.Builder()
+ .name("reread-on-nul")
+ .displayName("Reread when NUL encountered")
+ .description("If this option is set to 'true', when a NUL
character is read, the processor will yield and try to read the same part again
later. "
+ + "The purpose of this flag is to allow users to handle cases
where reading a file may return temporary NUL values. "
+ + "NFS for example may send file contents out of order. In
this case the missing parts are temporarily replaced by NUL values. "
+ + "CAUTION! If the file contains legitimate NUL values,
setting this flag causes this processor to get stuck indefinitely. "
+ + "For this reason users should refrain from using this
feature if they can help it and try to avoid having the target file on a file
system where reads are unreliable.")
+ .required(false)
+ .allowableValues("true", "false")
+ .defaultValue("false")
Review comment:
I think, as it has "defaultValue" it could be required
##########
File path:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -220,6 +221,19 @@
.addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
.build();
+ static final PropertyDescriptor REREAD_ON_NUL = new
PropertyDescriptor.Builder()
+ .name("reread-on-nul")
+ .displayName("Reread when NUL encountered")
+ .description("If this option is set to 'true', when a NUL
character is read, the processor will yield and try to read the same part again
later. "
+ + "The purpose of this flag is to allow users to handle cases
where reading a file may return temporary NUL values. "
+ + "NFS for example may send file contents out of order. In
this case the missing parts are temporarily replaced by NUL values. "
+ + "CAUTION! If the file contains legitimate NUL values,
setting this flag causes this processor to get stuck indefinitely. "
Review comment:
Minor: setting this flag causes the processor?
##########
File path:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -1365,4 +1406,16 @@ public String toString() {
return map;
}
}
+
+ static class NulCharacterEncounteredException extends RuntimeException {
Review comment:
Just a possible performance improvement: as I understand, this kind of
exception should not leave the boundaries of the onTrigger and basically used
for flow control. Thus, stack trace will not be used for debug or other
purposes. Overriding #fillInStackTrace can prevent collecting the stack trace,
which is a costly operation and here looks unnecessary.
##########
File path:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
##########
@@ -760,15 +781,26 @@ private void processTailFile(final ProcessContext
context, final ProcessSession
final FileChannel fileReader = reader;
final AtomicLong positionHolder = new AtomicLong(position);
+ Boolean reReadOnNul = context.getProperty(REREAD_ON_NUL).asBoolean();
Review comment:
Minor: just for consistency purposes: this could be final as the ones
above.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]