tpalfy commented on a change in pull request #5413:
URL: https://github.com/apache/nifi/pull/5413#discussion_r736547653
##########
File path:
nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/storage/ListGCSBucket.java
##########
@@ -157,6 +164,45 @@
@WritesAttribute(attribute = URI_ATTR, description = URI_DESC)
})
public class ListGCSBucket extends AbstractGCSProcessor {
+ public static final AllowableValue BY_TIMESTAMPS = new AllowableValue("timestamps", "Tracking Timestamps",
+ "This strategy tracks the latest timestamp of listed entity to determine new/updated entities." +
+ " Since it only tracks few timestamps, it can manage listing state efficiently." +
+ " However, any newly added, or updated entity having timestamp older than the tracked latest timestamp can not be picked by this strategy." +
+ " Also may miss files when multiple subdirectories are being written at the same time while listing is running.");
Review comment:
I see no significant difference between the two phrasings. I copied the
original from AbstractListProcessor and I think it's fine.
##########
File path:
nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/storage/ListGCSBucket.java
##########
@@ -157,6 +164,45 @@
@WritesAttribute(attribute = URI_ATTR, description = URI_DESC)
})
public class ListGCSBucket extends AbstractGCSProcessor {
+ public static final AllowableValue BY_TIMESTAMPS = new AllowableValue("timestamps", "Tracking Timestamps",
+ "This strategy tracks the latest timestamp of listed entity to determine new/updated entities." +
+ " Since it only tracks few timestamps, it can manage listing state efficiently." +
+ " However, any newly added, or updated entity having timestamp older than the tracked latest timestamp can not be picked by this strategy." +
+ " Also may miss files when multiple subdirectories are being written at the same time while listing is running.");
+
+ public static final AllowableValue BY_ENTITIES = new AllowableValue("entities", "Tracking Entities",
+ "This strategy tracks information of all the listed entities within the latest 'Entity Tracking Time Window' to determine new/updated entities." +
+ " This strategy can pick entities having old timestamp that can be missed with 'Tracing Timestamps'." +
+ " Works even when multiple subdirectories are being written at the same time while listing is running." +
+ " However additional DistributedMapCache controller service is required and more JVM heap memory is used." +
Review comment:
I see no significant difference between the two phrasings. I copied the
original from AbstractListProcessor and I think it's fine.
##########
File path:
nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/storage/ListGCSBucket.java
##########
@@ -157,6 +164,45 @@
@WritesAttribute(attribute = URI_ATTR, description = URI_DESC)
})
public class ListGCSBucket extends AbstractGCSProcessor {
+ public static final AllowableValue BY_TIMESTAMPS = new AllowableValue("timestamps", "Tracking Timestamps",
+ "This strategy tracks the latest timestamp of listed entity to determine new/updated entities." +
+ " Since it only tracks few timestamps, it can manage listing state efficiently." +
+ " However, any newly added, or updated entity having timestamp older than the tracked latest timestamp can not be picked by this strategy." +
+ " Also may miss files when multiple subdirectories are being written at the same time while listing is running.");
+
+ public static final AllowableValue BY_ENTITIES = new AllowableValue("entities", "Tracking Entities",
+ "This strategy tracks information of all the listed entities within the latest 'Entity Tracking Time Window' to determine new/updated entities." +
+ " This strategy can pick entities having old timestamp that can be missed with 'Tracing Timestamps'." +
+ " Works even when multiple subdirectories are being written at the same time while listing is running." +
+ " However additional DistributedMapCache controller service is required and more JVM heap memory is used." +
+ " See the description of 'Entity Tracking Time Window' property for further details on how it works.");
Review comment:
I see no significant difference between the two phrasings. I copied the
original from AbstractListProcessor and I think it's fine.