[jira] [Commented] (NIFI-11557) Eliminate use of Files.walkFileTree for any performance-critical parts of application
[ https://issues.apache.org/jira/browse/NIFI-11557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725565#comment-17725565 ]

ASF subversion and git services commented on NIFI-11557:

Commit a84a7cb60aa3bbae900ade5d1f2413b71dabdf38 in nifi's branch refs/heads/support/nifi-1.x from Matt Burgess
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=a84a7cb60a ]

NIFI-11557: Fixed error with Java 11 code

> Eliminate use of Files.walkFileTree for any performance-critical parts of application
> -------------------------------------------------------------------------------------
>
>                 Key: NIFI-11557
>                 URL: https://issues.apache.org/jira/browse/NIFI-11557
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>              Labels: content-repo, content-repository, performance, slowness, startup
>             Fix For: 2.0.0, 1.22.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The FileSystemRepository (content repo implementation) as well as ListFile
> both make use of the {{Files.walkFileTree}} method. Recently, I worked with a
> user who had horribly long startup times. Thread dumps show that the time was
> almost entirely in the FileSystemRepository's {{initializeRepository}} method
> as it is walking the file tree in order to determine which archive files can
> be cleaned up next. This is done during startup and again periodically in
> background threads.
>
> I made a small modification locally to instead use the standard synchronous
> IO methods (the {{File.listFiles}} method). I used GenerateFlowFile to generate
> 1-byte FlowFiles and set {{nifi.content.claim.max.appendable.size=1 B}} in
> nifi.properties in order to generate a huge number of files - about 1.2
> million files in the content repository - and restarted a few times.
> Additionally, I added some log lines to show how long this part of the startup
> process took.
>
> With the existing code, startup took 210 seconds (3.5 mins). With the new
> implementation, it took 6.7 seconds. This appears to be due to the fact that
> when using NIO.2 for every file, it does an individual disk access to obtain
> File attributes, while when using the {{File.listFiles}} method the File
> objects that are returned already have the necessary attributes. As a result,
> the NIO.2 approach makes millions of disk accesses that are unnecessary. As
> the number of files in the repository grows, the discrepancy also grows.
>
> We need to eliminate any use of {{Files.walkFileTree}} for any
> performance-critical parts of the codebase.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
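For readers unfamiliar with the two APIs named in the issue description, the following is a minimal sketch of the two traversal styles being compared. It is not the NiFi code: the class name `FileCountComparison` and both method names are hypothetical, and the sketch only counts regular files. It makes no attempt to reproduce the timings reported in the issue.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration only; not part of NiFi.
final class FileCountComparison {

    // NIO.2 style: walkFileTree hands the visitor a BasicFileAttributes
    // object for every file it visits, whether or not the caller needs it.
    static long countWithWalkFileTree(Path root) throws IOException {
        AtomicLong count = new AtomicLong();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                count.incrementAndGet();
                return FileVisitResult.CONTINUE;
            }
        });
        return count.get();
    }

    // java.io style: recurse manually over the arrays returned by File.listFiles.
    static long countWithListFiles(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return 0L; // not a directory, or an I/O error occurred
        }
        long count = 0L;
        for (File child : children) {
            if (child.isDirectory()) {
                count += countWithListFiles(child);
            } else {
                count++;
            }
        }
        return count;
    }
}
```

Both methods should return the same count for the same directory tree; any performance difference would have to be measured on a real repository, as was done in the issue.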
[jira] [Commented] (NIFI-11557) Eliminate use of Files.walkFileTree for any performance-critical parts of application
[ https://issues.apache.org/jira/browse/NIFI-11557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725535#comment-17725535 ]

ASF subversion and git services commented on NIFI-11557:

Commit a12c9ca9c72e8004afaf2f91088141ffd67ac437 in nifi's branch refs/heads/main from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=a12c9ca9c7 ]

NIFI-11557: Avoid using the expensive and unnecessary Files.walkFileTree on startup and initialization of Content Repository. Also performed some code cleanup: IntelliJ flagged many warnings in the class, mostly around methods that are no longer used and potential NullPointerExceptions, so those were cleaned up. Additionally, removed the nifi property for max flowfiles per claim - this property was never implemented. It was referenced, but the way in which it was used curiously had nothing to do with what the property was intended for or how it was documented. Instead, it was used to limit the max number of claims that could remain writable. As a result, it was removed.

NIFI-11557: Added an additional system test and updated GitHub Actions to include surefire-report in order to help diagnose a problem that occurred in one of the last system-test runs in GitHub. Could not replicate the problem locally.

Signed-off-by: Matthew Burgess

This closes #7265

> Eliminate use of Files.walkFileTree for any performance-critical parts of application
> -------------------------------------------------------------------------------------
>
>                 Key: NIFI-11557
>                 URL: https://issues.apache.org/jira/browse/NIFI-11557
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>              Labels: content-repo, content-repository, performance, slowness, startup
>             Fix For: 1.latest, 2.latest
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Commented] (NIFI-11557) Eliminate use of Files.walkFileTree for any performance-critical parts of application
[ https://issues.apache.org/jira/browse/NIFI-11557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725534#comment-17725534 ]

ASF subversion and git services commented on NIFI-11557:

Commit 82a55ebdd455f8429c577cea653f66b07db11f50 in nifi's branch refs/heads/support/nifi-1.x from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=82a55ebdd4 ]

NIFI-11557: Avoid using the expensive and unnecessary Files.walkFileTree on startup and initialization of Content Repository. Also performed some code cleanup: IntelliJ flagged many warnings in the class, mostly around methods that are no longer used and potential NullPointerExceptions, so those were cleaned up. Additionally, removed the nifi property for max flowfiles per claim - this property was never implemented. It was referenced, but the way in which it was used curiously had nothing to do with what the property was intended for or how it was documented. Instead, it was used to limit the max number of claims that could remain writable. As a result, it was removed.

NIFI-11557: Added an additional system test and updated GitHub Actions to include surefire-report in order to help diagnose a problem that occurred in one of the last system-test runs in GitHub. Could not replicate the problem locally.

> Eliminate use of Files.walkFileTree for any performance-critical parts of application
> -------------------------------------------------------------------------------------
>
>                 Key: NIFI-11557
>                 URL: https://issues.apache.org/jira/browse/NIFI-11557
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>              Labels: content-repo, content-repository, performance, slowness, startup
>             Fix For: 1.latest, 2.latest
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
[jira] [Commented] (NIFI-11557) Eliminate use of Files.walkFileTree for any performance-critical parts of application
[ https://issues.apache.org/jira/browse/NIFI-11557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723950#comment-17723950 ]

Mark Payne commented on NIFI-11557:
-----------------------------------

Looking further into this, I found that the logic we currently have for scanning through the content repo serves two purposes:

1. To count how many files are archived
2. To determine the timestamp of the oldest archived file.

The timestamp of the oldest archived file was meant to be used for performance gains: if it shows that no files need to be cleaned up due to time constraints, the background scan can be skipped. Interestingly, this code was buggy - while it checked the last modified time of each file, it then compared it to the 'oldestTimestamp', but 'oldestTimestamp' was initialized to 0, which means that it would always remain 0. As a result, this code was very expensive and unneeded.

We only really need to count the number of files archived. This can be achieved MUCH more efficiently by simply performing a {{File.listFiles}} call on each archive directory. This will drastically improve startup performance in cases where there are millions of files archived.

> Eliminate use of Files.walkFileTree for any performance-critical parts of application
> -------------------------------------------------------------------------------------
>
>                 Key: NIFI-11557
>                 URL: https://issues.apache.org/jira/browse/NIFI-11557
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 1.latest, 2.latest
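The two points in Mark Payne's comment (counting archived files with a {{File.listFiles}} call per archive directory, and the 'oldestTimestamp' value that could never move off 0) can be illustrated with a small sketch. This is a reconstruction for illustration, not the committed change: the class `ArchiveCounter`, its method names, and the assumed layout of section directories that each contain an "archive" subdirectory are all assumptions.

```java
import java.io.File;

// Hypothetical sketch; names and directory layout are assumptions, not NiFi code.
final class ArchiveCounter {

    // Assumed name of the per-section archive subdirectory.
    static final String ARCHIVE_DIR_NAME = "archive";

    // Counts archived files with one listFiles call per archive directory.
    static long countArchivedFiles(File containerDir) {
        File[] sections = containerDir.listFiles(File::isDirectory);
        if (sections == null) {
            return 0L; // container missing or unreadable
        }
        long total = 0L;
        for (File section : sections) {
            File[] archived = new File(section, ARCHIVE_DIR_NAME).listFiles();
            if (archived != null) {
                total += archived.length;
            }
        }
        return total;
    }

    // Shape of the bug described in the comment: starting the running
    // minimum at 0 means no positive last-modified time can replace it.
    static long oldestTimestampBuggy(long[] lastModifiedTimes) {
        long oldestTimestamp = 0L;
        for (long time : lastModifiedTimes) {
            if (time < oldestTimestamp) {
                oldestTimestamp = time;
            }
        }
        return oldestTimestamp;
    }

    // A running minimum has to start at the largest possible value.
    // Returns Long.MAX_VALUE when the input is empty.
    static long oldestTimestampFixed(long[] lastModifiedTimes) {
        long oldestTimestamp = Long.MAX_VALUE;
        for (long time : lastModifiedTimes) {
            if (time < oldestTimestamp) {
                oldestTimestamp = time;
            }
        }
        return oldestTimestamp;
    }
}
```

Since the comment concludes that only the count is needed, the timestamp methods are included purely to show why the old value was always 0.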