ijokarumawak commented on a change in pull request #3375: NIFI-5979 : enhanced
ReplaceText processor with "Number of Occurrences" and "Occurrence offset"
configurations
URL: https://github.com/apache/nifi/pull/3375#discussion_r270275110
##########
File path:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceText.java
##########
@@ -586,52 +672,76 @@ public void process(final InputStream in, final OutputStream out) throws IOExcep
try (final LineDemarcator demarcator = new LineDemarcator(in, charset, maxBufferSize, 8192);
final BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(out, charset))) {
- String oneLine;
- final StringBuffer sb = new StringBuffer();
+// final StringBuffer sb = new StringBuffer();
Matcher matcher = null;
-
- while (null != (oneLine = demarcator.nextLine())) {
- additionalAttrs.clear();
- if (matcher == null) {
- matcher = searchPattern.matcher(oneLine);
+ String precedingLine = demarcator.nextLine();
+ String succeedingLine;
+ boolean firstLine = true;
+ while (null != (succeedingLine = demarcator.nextLine())) {
+ if(firstLine && evaluateMode.equalsIgnoreCase(FIRST_LINE)){
+ replaceRegexInLine(bw, precedingLine, matcher, searchPattern, context, flowFile);
+ firstLine = false;
+ } else if(firstLine && evaluateMode.equalsIgnoreCase(EXCEPT_FIRST_LINE)) {
+ firstLine = false;
+ bw.write(precedingLine);
+ } else if(evaluateMode.equalsIgnoreCase(LINE_BY_LINE)
+ || evaluateMode.equalsIgnoreCase(EXCEPT_LAST_LINE)
+ || (!firstLine && evaluateMode.equalsIgnoreCase(EXCEPT_FIRST_LINE))) {
+ replaceRegexInLine(bw, precedingLine, matcher, searchPattern, context, flowFile);
} else {
- matcher.reset(oneLine);
+ bw.write(precedingLine);
}
+ precedingLine = succeedingLine;
+ }
- int matches = 0;
- sb.setLength(0);
+ if (evaluateMode.equalsIgnoreCase(EXCEPT_LAST_LINE) || (!firstLine && evaluateMode.equalsIgnoreCase(FIRST_LINE))) {
Review comment:
With a FlowFile that contains only one line, `Except-First-Line` replaced the
line. I expected the original line to be kept. For example, if the processor
is used on CSV content with a header, a FlowFile with only one line has no
data records. We should not replace the content in that case.
I suggest adding the following condition here:
```java
|| (firstLine && evaluateMode.equalsIgnoreCase(EXCEPT_FIRST_LINE))
```
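To make the effect of that extra condition concrete, here is a minimal standalone sketch of the last-line decision. It assumes that the `if` branch in the hunk writes the final line unchanged and the `else` branch applies the replacement; the hunk is cut off before either body is shown, so that is an assumption. The class `LastLineDecision`, the method `keepLastLineUnchanged`, and the string values of the mode constants are hypothetical and are not taken from `ReplaceText.java`.

```java
// Hypothetical sketch, not the actual ReplaceText code.
class LastLineDecision {
    // Assumed values; only the name Except-First-Line appears in the review comment.
    static final String FIRST_LINE = "First-Line";
    static final String EXCEPT_FIRST_LINE = "Except-First-Line";
    static final String EXCEPT_LAST_LINE = "Except-Last-Line";
    static final String LINE_BY_LINE = "Line-by-Line";

    /**
     * Returns true when the final line should be written as-is (no replacement).
     * For the two first-line modes, firstLine is still true after the loop only
     * when the loop body never ran, i.e. the FlowFile holds a single line.
     */
    static boolean keepLastLineUnchanged(String evaluateMode, boolean firstLine) {
        return evaluateMode.equalsIgnoreCase(EXCEPT_LAST_LINE)
                || (!firstLine && evaluateMode.equalsIgnoreCase(FIRST_LINE))
                // Suggested addition: a lone line counts as the first line, so keep it.
                || (firstLine && evaluateMode.equalsIgnoreCase(EXCEPT_FIRST_LINE));
    }

    public static void main(String[] args) {
        // Single-line FlowFile in Except-First-Line mode: the line is kept.
        System.out.println(keepLastLineUnchanged(EXCEPT_FIRST_LINE, true));
        // Multi-line FlowFile in Except-First-Line mode: the last line is replaced.
        System.out.println(keepLastLineUnchanged(EXCEPT_FIRST_LINE, false));
    }
}
```

Without the third clause, the single-line `Except-First-Line` case falls through to the replacement branch, which matches the behavior described above.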