pgyori commented on a change in pull request #4273:
URL: https://github.com/apache/nifi/pull/4273#discussion_r426788175



##########
File path: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/azure/storage/FetchAzureDataLakeStorage.java
##########
@@ -67,6 +67,10 @@ public void onTrigger(ProcessContext context, ProcessSession session) throws Pro
             final DataLakeDirectoryClient directoryClient = dataLakeFileSystemClient.getDirectoryClient(directory);
             final DataLakeFileClient fileClient = directoryClient.getFileClient(fileName);
 
+            if (fileClient.getProperties().isDirectory()) {

Review comment:
       Unfortunately, things get quite complicated in that case. If we call 
isDirectory() after session.write(), the flowfile has already been overwritten 
with empty content, so after the ProcessException is thrown, the flowfile that 
goes to the error output is empty instead of carrying the content of the 
original input flowfile. To avoid losing that content, we would need to copy 
and store it before calling session.write(), and, if isDirectory() returns 
true, write it back to the original flowfile before throwing the exception. 
For large input flowfiles, this would result in higher memory consumption.
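
To make the ordering argument concrete, here is a minimal, self-contained sketch. It does not use the NiFi or Azure APIs: `FakeFlowFile`, `Variant`, `fetch` and `contentAfterFailure` are hypothetical stand-ins invented for illustration, and the sketch assumes that reading a directory path yields zero bytes. It only models "write replaces the content" and compares the three orderings discussed above.

```java
import java.nio.charset.StandardCharsets;

class DirectoryCheckOrderSketch {

    // Hypothetical stand-in for a flowfile: just a replaceable content buffer.
    static final class FakeFlowFile {
        byte[] content;

        FakeFlowFile(String text) {
            this.content = text.getBytes(StandardCharsets.UTF_8);
        }

        String text() {
            return new String(content, StandardCharsets.UTF_8);
        }
    }

    enum Variant { CHECK_BEFORE_WRITE, CHECK_AFTER_WRITE, CHECK_AFTER_WRITE_WITH_BACKUP }

    // Models the fetch: "writing" replaces the flowfile content with the remote bytes.
    static void fetch(FakeFlowFile flowFile, boolean remoteIsDirectory,
                      byte[] remoteBytes, Variant variant) {
        switch (variant) {
            case CHECK_BEFORE_WRITE:
                // Fail before touching the flowfile, so its content is untouched.
                if (remoteIsDirectory) {
                    throw new IllegalStateException("path is a directory");
                }
                flowFile.content = remoteBytes.clone();
                break;
            case CHECK_AFTER_WRITE:
                // The content is already replaced when the check fails.
                flowFile.content = remoteBytes.clone();
                if (remoteIsDirectory) {
                    throw new IllegalStateException("path is a directory");
                }
                break;
            case CHECK_AFTER_WRITE_WITH_BACKUP:
                // Extra copy: memory grows with the size of the input content.
                byte[] backup = flowFile.content.clone();
                flowFile.content = remoteBytes.clone();
                if (remoteIsDirectory) {
                    flowFile.content = backup; // restore before failing
                    throw new IllegalStateException("path is a directory");
                }
                break;
            default:
                throw new IllegalArgumentException("unknown variant");
        }
    }

    // Runs a failing fetch against a "directory" and returns what the flowfile holds afterwards.
    static String contentAfterFailure(Variant variant) {
        FakeFlowFile flowFile = new FakeFlowFile("original input");
        try {
            fetch(flowFile, true, new byte[0], variant); // assumption: a directory reads as empty
        } catch (IllegalStateException expected) {
            // In the real processor this is where the flowfile would go to the error output.
        }
        return flowFile.text();
    }

    public static void main(String[] args) {
        for (Variant v : Variant.values()) {
            System.out.println(v + " -> \"" + contentAfterFailure(v) + "\"");
        }
    }
}
```

In this model only the check-before-write variant keeps the original content without an extra copy, which is the trade-off the comment describes.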





