ijokarumawak commented on a change in pull request #3483: NIFI-6275 ListHDFS
now ignores scheme and authority when uses "Full P…
URL: https://github.com/apache/nifi/pull/3483#discussion_r319786069
##########
File path:
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java
##########
@@ -527,7 +527,7 @@ private PathFilter createPathFilter(final ProcessContext context) {
return path -> {
final boolean accepted;
if (FILTER_FULL_PATH_VALUE.getValue().equals(filterMode)) {
- accepted = filePattern.matcher(path.toString()).matches();
+ accepted = filePattern.matcher(Path.getPathWithoutSchemeAndAuthority(path).toString()).matches();
Review comment:
If this improvement could break existing user flows, then I'd like to
discuss other approaches that make it opt-in.
We can provide a different UX depending on the approach:
1. Current approach: if the regex in an existing flow contains the scheme or
authority, that flow will no longer list files as before. Users may wonder what
went wrong, and may not notice the change if they don't read the docs.
2. Add a new 'Filter without Scheme and Authority' property:
- A. If we leave its default value blank and implement a custom
validation that requires it when the filter regex is not empty, then existing
ListHDFS processors become invalid. That gives users a chance to review their
configuration.
- B. If we use `false` as the default value, existing flows work as is,
and this improvement becomes opt-in. This is the safest approach, but a con is
that people may forget to enable the option.
3. Add a new 'Full Path (without scheme and authority)' filter mode, or add
the new one and rename the existing one's display name to 'Full Path (include
scheme and authority)': this guarantees existing flows work as is, while
providing an easy configuration UX for new setups.
I personally prefer option 3 above. What do you think?
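To make the difference between option 1 and option 3 concrete, here is a rough, self-contained sketch. It does not use the NiFi or Hadoop APIs: the `FilterMode` enum, the `accepts` helper, and the `hdfs://namenode:8020` address are all hypothetical, and `java.net.URI#getPath` merely stands in for `Path.getPathWithoutSchemeAndAuthority`. It only illustrates why a regex written against the full URI stops matching once scheme and authority are stripped, and how a separate filter mode would keep both behaviors available.

```java
import java.net.URI;
import java.util.regex.Pattern;

public class FilterModeSketch {

    // Hypothetical filter modes mirroring option 3; these are not NiFi's actual names.
    enum FilterMode {
        FULL_PATH_WITH_SCHEME_AND_AUTHORITY,
        FULL_PATH_WITHOUT_SCHEME_AND_AUTHORITY
    }

    // Stand-in for Path.getPathWithoutSchemeAndAuthority: keeps only the path component.
    static String pathOnly(String fullPath) {
        return URI.create(fullPath).getPath();
    }

    // Matches the regex against the whole candidate string, as Matcher.matches() does.
    static boolean accepts(FilterMode mode, Pattern filePattern, String fullPath) {
        String candidate = mode == FilterMode.FULL_PATH_WITHOUT_SCHEME_AND_AUTHORITY
                ? pathOnly(fullPath)
                : fullPath;
        return filePattern.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String file = "hdfs://namenode:8020/data/in/a.txt"; // assumed example path
        Pattern legacyRegex = Pattern.compile("hdfs://namenode:8020/data/in/.*\\.txt");
        Pattern plainRegex = Pattern.compile("/data/in/.*\\.txt");

        // An existing flow whose regex includes scheme and authority keeps working
        // only in the mode that matches against the full URI.
        System.out.println(accepts(FilterMode.FULL_PATH_WITH_SCHEME_AND_AUTHORITY, legacyRegex, file));
        System.out.println(accepts(FilterMode.FULL_PATH_WITHOUT_SCHEME_AND_AUTHORITY, legacyRegex, file));

        // A new flow can use a plain path regex with the new mode.
        System.out.println(accepts(FilterMode.FULL_PATH_WITHOUT_SCHEME_AND_AUTHORITY, plainRegex, file));
    }
}
```

With the current approach (option 1), every flow would effectively run in the second mode, so the second call above is the case that would silently stop listing files.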