[ https://issues.apache.org/jira/browse/NIFI-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15804699#comment-15804699 ]
ASF GitHub Bot commented on NIFI-2859:
--------------------------------------

Github user pvillard31 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1383#discussion_r94956675

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java ---
    @@ -176,7 +176,7 @@ private HDFSListing deserialize(final String serializedState) throws JsonParseEx
             // Build a sorted map to determine the latest possible entries
             for (final FileStatus status : statuses) {
    -            if (status.getPath().getName().endsWith("_COPYING_")) {
    +            if (status.getPath().getName().endsWith("_COPYING_") || status.getPath().getName().startsWith(".")) {
    --- End diff --

    @bbende Yes you're right!

    Otherwise there is the following in GetHDFS:

    ````java
        public static final PropertyDescriptor IGNORE_DOTTED_FILES = new PropertyDescriptor.Builder()
                .name("Ignore Dotted Files")
                .description("If true, files whose names begin with a dot (\".\") will be ignored")
                .required(true)
                .allowableValues("true", "false")
                .defaultValue("true")
                .build();
    ````

    But the filter property is much better. I'll update the PR.


> List + Fetch HDFS processors are reading part files from HDFS
> -------------------------------------------------------------
>
>                 Key: NIFI-2859
>                 URL: https://issues.apache.org/jira/browse/NIFI-2859
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.0.0
>            Reporter: Mahesh Nayak
>            Assignee: Pierre Villard
>
> Create the following ProcessGroups
> GetFile --> PutHdfs --> PutFile
> ListHDFS --> FetchHdfs --> putFile
> 2. Now start both the processGroups
> 3. Write lots of files into HDFS so that ListHDFS keeps listing and FetchHdfs
> fetches.
> 4. An exception is thrown because the processor reads the part file from the
> putHdfs folder
> {code:none}
> java.io.FileNotFoundException: File does not exist: /tmp/HDFSProcessorsTest_visjJMcHORUwigw/.ycnVSpBOzEaoTWk_7f37d5af-d4a4-4521-b60d-c3c11ae19669
>       at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
>       at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1860)
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1831)
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1744)
> {code}
> Note that eventually the file is copied to the output successfully, but at
> the same time there are some files in the failure/comms failure relationship

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
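
For reference, the name check proposed in the diff can be tried in isolation. The following is only a minimal sketch using plain strings instead of Hadoop `FileStatus` objects, so it runs without a cluster. The class `ListingNameFilter`, its methods, and the sample file names are hypothetical and not part of NiFi. It is not the filter property mentioned in the comment, only the two hard-coded conditions from the changed line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not NiFi code: mirrors the two conditions in the diff.
class ListingNameFilter {

    // In-progress marker checked by the existing ListHDFS condition (taken from the diff).
    static final String COPYING_SUFFIX = "_COPYING_";

    // True when a file name should be left out of the listing:
    // either it still carries the copy marker, or it is a dotted file
    // such as the one in the stack trace of the issue description.
    static boolean shouldSkip(final String fileName) {
        return fileName.endsWith(COPYING_SUFFIX) || fileName.startsWith(".");
    }

    // Returns the names that pass the check, preserving input order.
    static List<String> listable(final List<String> fileNames) {
        final List<String> result = new ArrayList<>();
        for (final String name : fileNames) {
            if (!shouldSkip(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        // Sample names are made up for illustration.
        final List<String> names = Arrays.asList("data.csv", ".data.csv", "data.csv._COPYING_");
        System.out.println(listable(names)); // prints [data.csv]
    }
}
```

Hard-coding the check keeps the listing loop simple, but a configurable filter, as preferred in the comment above, would let users decide whether dotted files should be listed at all.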