[ https://issues.apache.org/jira/browse/NIFI-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140845#comment-15140845 ]
Joseph Witt commented on NIFI-1484:
-----------------------------------
you must have been tired ;-)
{quote}
<file name="/Users/jwitt/apachenifi/nifi.git/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestAbstractListProcessor.java">
<error line="25" column="8" severity="warning" message="Unused import - java.util.AbstractList." source="com.puppycrawl.tools.checkstyle.checks.imports.UnusedImportsCheck"/>
<error line="33" column="8" severity="warning" message="Unused import - org.apache.activemq.broker.region.AbstractTempRegion." source="com.puppycrawl.tools.checkstyle.checks.imports.UnusedImportsCheck"/>
</file>
{quote}
> ListFile holds unbounded list of files with matching time stamps
> ----------------------------------------------------------------
>
> Key: NIFI-1484
> URL: https://issues.apache.org/jira/browse/NIFI-1484
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core UI, Extensions
> Affects Versions: 0.4.0, 0.5.0
> Reporter: Joseph Witt
> Assignee: Aldrin Piri
> Fix For: 0.5.0
>
>
> ListFile appears to hold an unbounded set of filenames that match the last
> timestamp. While this is understandable as a way to handle the edge case of
> new data arriving at the same time, it presents two problems. First, we hold
> all of this information in state management, which could put considerable
> pressure on both the local and remote stores. Second, we also hold it in
> memory before we persist it.
> Also, the entire state listing appears to show up in the UI without
> pagination or any limit on the number of entries. This seems like a problem
> for the client side as well. The server side should probably restrict this.
> Finally, it seems like saving the filenames seen at a given timestamp is
> only necessary if we're assuming the listing we do is 'as-of' RIGHT NOW.
> What if, instead, we did the listing based on a last modified time of
> RIGHT NOW minus 1 millisecond, or something like that? Then we should not
> have to worry at all about keeping a listing of names for the timestamp.
> The reason I think this is important is that it is not at all uncommon for a
> directory with large quantities of files to have many files with the same
> timestamp, due to a copy operation not preserving original file attributes.
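
For illustration, the "RIGHT NOW minus 1 millisecond" idea from the description could be sketched roughly as follows. This is a minimal sketch and not the actual ListFile or AbstractListProcessor code: the class name LaggedListing, the LAG_MILLIS constant, and the in-memory map standing in for a directory listing are all hypothetical. It assumes that any file with a last modified time at or before the cutoff is already visible when the listing runs, so in practice the lag would probably need to be at least as large as the filesystem's timestamp granularity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: list only files modified strictly after the previous
// cutoff and no later than (now - lag). The only state kept between runs is
// a single timestamp, not a set of filenames.
class LaggedListing {
    // Assumed lag; the issue suggests "1 millisecond or something like that".
    static final long LAG_MILLIS = 1L;

    // Cutoff used by the previous listing; nothing has been listed yet.
    private long lastCutoff = Long.MIN_VALUE;

    // modifiedByName stands in for a directory listing (name -> last modified millis).
    public List<String> list(Map<String, Long> modifiedByName, long nowMillis) {
        long cutoff = nowMillis - LAG_MILLIS;
        List<String> result = new ArrayList<>();
        // TreeMap only makes the output order deterministic (sorted by name).
        for (Map.Entry<String, Long> e : new TreeMap<>(modifiedByName).entrySet()) {
            long modified = e.getValue();
            if (modified > lastCutoff && modified <= cutoff) {
                result.add(e.getKey());
            }
        }
        // Advance the window; files stamped after the cutoff wait for the next run.
        lastCutoff = Math.max(lastCutoff, cutoff);
        return result;
    }
}
```

With this shape, files that share a timestamp are either all at or before the cutoff (and listed together) or all after it (and deferred together), so no per-timestamp list of names has to be held in memory or in state management. It would still miss a file that shows up later carrying a last modified time older than the previous cutoff.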