I am reading the ListHDFS code. I can't tell if the description is wrong,
the code is wrong, or I'm missing something.
Description: The path is set to the absolute path of the file's directory
on HDFS. For example, if the Directory property is set to /tmp then files
picked up from /tmp will have the path attribute set to "./". If the
Recurse Subdirectories property is set to true and a file is picked up from
/tmp/abc/1/2/3, then the path attribute will be set to "/tmp/abc/1/2/3".
Code:
    attributes.put(CoreAttributes.PATH.key(),
            getAbsolutePath(status.getPath().getParent()));

    private String getAbsolutePath(final Path path) {
        final Path parent = path.getParent();
        final String prefix = (parent == null || parent.getName().equals(""))
                ? "" : getAbsolutePath(parent);
        return prefix + "/" + path.getName();
    }
I don't understand how it would ever return "./". It looks a lot like the
path is determined independently of the Directory property.
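
To check my reading, here is a minimal, self-contained sketch of that logic.
SimplePath is a hypothetical stand-in I wrote for org.apache.hadoop.fs.Path;
it assumes getName() returns the last path component (an empty string for the
root) and getParent() returns null for the root. The getAbsolutePath body is
copied from the ListHDFS code above, and pathAttributeFor is my own helper
name, not something in NiFi.

```java
public class AbsolutePathSketch {

    /**
     * Hypothetical stand-in for org.apache.hadoop.fs.Path (an assumption,
     * not the real class). Expects a normalized absolute path such as
     * "/" or "/tmp/abc" with no trailing slash.
     */
    public static final class SimplePath {
        private final String full;

        public SimplePath(final String full) {
            this.full = full;
        }

        /** Last path component; empty string for the root "/". */
        public String getName() {
            return full.substring(full.lastIndexOf('/') + 1);
        }

        /** Parent directory, or null when this path is the root. */
        public SimplePath getParent() {
            if (full.equals("/")) {
                return null;
            }
            final int idx = full.lastIndexOf('/');
            return new SimplePath(idx == 0 ? "/" : full.substring(0, idx));
        }
    }

    /** Same recursion as the ListHDFS excerpt, using the stand-in type. */
    public static String getAbsolutePath(final SimplePath path) {
        final SimplePath parent = path.getParent();
        final String prefix = (parent == null || parent.getName().equals(""))
                ? "" : getAbsolutePath(parent);
        return prefix + "/" + path.getName();
    }

    /** Mirrors getAbsolutePath(status.getPath().getParent()) for a file. */
    public static String pathAttributeFor(final String file) {
        return getAbsolutePath(new SimplePath(file).getParent());
    }

    public static void main(final String[] args) {
        // Note that the Directory property is never consulted here.
        System.out.println(pathAttributeFor("/tmp/file.txt"));
        System.out.println(pathAttributeFor("/tmp/abc/1/2/3/file.txt"));
    }
}
```

If those assumptions about Path hold, a file picked up directly from /tmp
gets path "/tmp" rather than "./", whatever the Directory property is set to.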
On Wed, Nov 25, 2015 at 2:01 PM, Mark Payne <[email protected]> wrote:
> I certainly cannot argue with that, either.
>
> > On Nov 25, 2015, at 1:59 PM, Joe Witt <[email protected]> wrote:
> >
> > It sounds like ListFile kept logic similar to GetFile, and I can
> > understand that approach.
> >
> > However, I do believe it makes more sense to follow the behavior of
> > ListHDFS where the path would be absolute.
> >
> > Thanks
> > Joe
> >
> > On Wed, Nov 25, 2015 at 1:56 PM, Tony Kurc <[email protected]> wrote:
> >> All,
> >> Joe and I commented on NIFI-631 that it didn't "just work" when wiring the
> >> processors together. ListFile was populating the attributes as
> >> described in CoreAttributes.java
> >> [1] (path being relative to the input directory, and absolute being the
> >> full path). FetchFile was using ${path}/${filename} as the default, which
> >> wouldn't grab the directory. I'm puzzled as to what the correct behavior
> >> should be. The description of path said it is relative ... relative to
> >> what? ListHDFS appears to state path is absolute [2] [3], and I expect we
> >> should have consistent behavior between ListHDFS and ListFile.
> >>
> >> So, I guess I'm not sure what guidance to give on a review of NIFI-631.
> >> Should the default of FetchFile be changed to ${absolute.path}/${filename}
> >> (which may be inconsistent with other List/Fetch processor combos), or
> >> should ListFile be changed to have path be absolute?
> >>
> >> [1] https://github.com/apache/nifi/blob/master/nifi-commons/nifi-utils/src/main/java/org/apache/nifi/flowfile/attributes/CoreAttributes.java
> >> [2] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java#L79
> >> [3] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java#L442
>
>