[ 
https://issues.apache.org/jira/browse/NIFI-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487180#comment-16487180
 ] 

ASF GitHub Bot commented on NIFI-5228:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2733#discussion_r190229672
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ListFile.java ---
    @@ -255,43 +262,47 @@ public void onScheduled(final ProcessContext context) {
             final Path absPath = filePath.toAbsolutePath();
             final String absPathString = absPath.getParent().toString() + File.separator;
     
    +        final DateFormat formatter = new SimpleDateFormat(FILE_MODIFY_DATE_ATTR_FORMAT, Locale.US);
    --- End diff --
    
    We recommend against using ThreadLocal in NiFi processors. Each time a
    Processor runs, it may do so on a different thread. A large deployment can
    have hundreds of threads, and those threads live for the life of the
    instance, so cleanup is awkward. The pattern we commonly follow is a simple
    object pool: poll from a BlockingQueue, create a new instance if the queue
    is empty, and put the instance back when done. I did consider that here,
    but decided the added code complexity was not worth it, given the low cost
    of creating the DateFormat.
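
    For readers unfamiliar with the pool pattern described in the comment, a
    minimal sketch follows. It is not code from the pull request: the class
    name DateFormatPool, the capacity parameter, and the PATTERN value are
    assumptions made for illustration (PATTERN stands in for ListFile's
    FILE_MODIFY_DATE_ATTR_FORMAT, whose value is not shown in the diff).

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper illustrating "poll, create if necessary, put back".
final class DateFormatPool {
    // Assumed pattern; stands in for FILE_MODIFY_DATE_ATTR_FORMAT.
    private static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ssZ";

    // Bounded queue of idle formatters. SimpleDateFormat is not thread-safe,
    // so each instance is used by at most one thread at a time.
    private final BlockingQueue<DateFormat> idle;

    DateFormatPool(final int capacity) {
        this.idle = new LinkedBlockingQueue<>(capacity);
    }

    public String format(final Date date) {
        // Non-blocking poll: returns null when no idle formatter is available.
        DateFormat formatter = idle.poll();
        if (formatter == null) {
            formatter = new SimpleDateFormat(PATTERN, Locale.US);
        }
        try {
            return formatter.format(date);
        } finally {
            // offer() does not block; if the pool is already full, the
            // formatter is simply dropped and garbage collected.
            idle.offer(formatter);
        }
    }

    // Number of formatters currently waiting in the pool.
    public int idleCount() {
        return idle.size();
    }
}
```

    Unlike a ThreadLocal, the pool holds at most `capacity` idle instances no
    matter how many threads call format(), which is the property the comment
    is after. As the comment notes, this extra code may not pay for itself
    when the pooled object is cheap to create.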


> Allow user to choose whether or not to add File Attributes as FlowFile 
> Attributes when using ListFile
> -----------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-5228
>                 URL: https://issues.apache.org/jira/browse/NIFI-5228
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 1.7.0
>
>
> The FetchFile processor adds several FlowFile attributes such as the file's 
> owner, last accessed time, creation time, etc. While these certainly can be 
> useful pieces of information and do serve a purpose, they can be expensive to 
> determine in some configurations. In my use case, I have an Azure File Store 
> mounted to an Ubuntu system with CIFS using SMB 3.0. The remote directory 
> that I am listing has 7,000-8,000 files and takes about 3 minutes to perform 
> the listing with ListFile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
