Re: [PR] NIFI-12441 Added No Tracking listing strategy to ListS3 [nifi]

2024-01-16 Thread via GitHub


exceptionfactory closed pull request #8088: NIFI-12441 Added No Tracking 
listing strategy to ListS3
URL: https://github.com/apache/nifi/pull/8088


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12441 Added No Tracking listing strategy to ListS3 [nifi]

2024-01-15 Thread via GitHub


juldrixx commented on PR #8088:
URL: https://github.com/apache/nifi/pull/8088#issuecomment-1892841189

   > @juldrixx There appears to be a test failure in the `ci-workflow` for the new listing strategy:
   > 
   > ```
   > org.apache.nifi.processors.aws.s3.TestListS3.testNoTrackingList -- Time elapsed: 0.020 s <<< FAILURE
   > ```
   
   Sorry, I didn't see that the unit test results had changed when I rebased. 
It should be fine now.





Re: [PR] NIFI-12441 Added No Tracking listing strategy to ListS3 [nifi]

2024-01-15 Thread via GitHub


exceptionfactory commented on PR #8088:
URL: https://github.com/apache/nifi/pull/8088#issuecomment-1892798911

   @juldrixx There appears to be a test failure in the `ci-workflow` for the new listing strategy:
   
   ```
   org.apache.nifi.processors.aws.s3.TestListS3.testNoTrackingList -- Time elapsed: 0.020 s <<< FAILURE
   ```





Re: [PR] NIFI-12441 Added No Tracking listing strategy to ListS3 [nifi]

2023-12-22 Thread via GitHub


exceptionfactory commented on code in PR #8088:
URL: https://github.com/apache/nifi/pull/8088#discussion_r1435309260


##
nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/ListS3.java:
##
@@ -451,11 +456,92 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             listByTrackingTimestamps(context, session);
         } else if (BY_ENTITIES.equals(listingStrategy)) {
             listByTrackingEntities(context, session);
+        } else if (NO_TRACKING.equals(listingStrategy)) {
+            listNoTracking(context, session);
         } else {
             throw new ProcessException("Unknown listing strategy: " + listingStrategy);
         }
     }
 
+    private void listNoTracking(ProcessContext context, ProcessSession session) {
+        final AmazonS3 client = getClient(context);
+
+        S3BucketLister bucketLister = getS3BucketLister(context, client);
+
+        final long startNanos = System.nanoTime();
+        final long minAgeMilliseconds = context.getProperty(MIN_AGE).asTimePeriod(TimeUnit.MILLISECONDS);
+        final Long maxAgeMilliseconds = context.getProperty(MAX_AGE) != null ? context.getProperty(MAX_AGE).asTimePeriod(TimeUnit.MILLISECONDS) : null;
+        final long listingTimestamp = System.currentTimeMillis();
+
+        final String bucket = context.getProperty(BUCKET_WITHOUT_DEFAULT_VALUE).evaluateAttributeExpressions().getValue();
+        final int batchSize = context.getProperty(BATCH_SIZE).asInteger();
+
+        int listCount = 0;
+        int totalListCount = 0;
+
+        getLogger().trace("Start listing, listingTimestamp={}", new Object[]{listingTimestamp});
+
+        final S3ObjectWriter writer;
+        final RecordSetWriterFactory writerFactory = context.getProperty(RECORD_WRITER).asControllerService(RecordSetWriterFactory.class);
+        if (writerFactory == null) {
+            writer = new AttributeObjectWriter(session);
+        } else {
+            writer = new RecordObjectWriter(session, writerFactory, getLogger(), context.getProperty(S3_REGION).getValue());
+        }
+
+        try {
+            writer.beginListing();
+
+            do {
+                VersionListing versionListing = bucketLister.listVersions();
+                for (S3VersionSummary versionSummary : versionListing.getVersionSummaries()) {
+                    long lastModified = versionSummary.getLastModified().getTime();
+                    if ((maxAgeMilliseconds != null && (lastModified < (listingTimestamp - maxAgeMilliseconds)))
+                            || lastModified > (listingTimestamp - minAgeMilliseconds)) {
+                        continue;
+                    }
+
+                    getLogger().trace("Listed key={}, lastModified={}", new Object[]{versionSummary.getKey(), lastModified});

Review Comment:
   ```suggestion
   getLogger().trace("Listed key={}, lastModified={}", versionSummary.getKey(), lastModified);
   ```
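
The age check that precedes this trace call in the diff can be read in isolation. The following is a minimal sketch, not the processor's code: the class `AgeFilterSketch` and the method `isSkipped` are hypothetical names introduced here, and only the comparison itself is taken from the diff. It shows that an object is skipped when it is older than the maximum age (if one is set) or newer than the minimum age.

```java
public final class AgeFilterSketch {

    // Hypothetical helper mirroring the condition in the diff:
    // skip when the object is too old (only if a maximum age is set) or too new.
    static boolean isSkipped(long lastModified, long listingTimestamp,
                             long minAgeMilliseconds, Long maxAgeMilliseconds) {
        final boolean tooOld = maxAgeMilliseconds != null
                && lastModified < (listingTimestamp - maxAgeMilliseconds);
        final boolean tooNew = lastModified > (listingTimestamp - minAgeMilliseconds);
        return tooOld || tooNew;
    }

    public static void main(String[] args) {
        final long now = 100_000L;      // illustrative listing timestamp
        final long minAge = 1_000L;     // illustrative minimum age
        final Long maxAge = 10_000L;    // illustrative maximum age

        System.out.println(isSkipped(99_500L, now, minAge, maxAge)); // modified within the minimum age
        System.out.println(isSkipped(95_000L, now, minAge, maxAge)); // inside the accepted window
        System.out.println(isSkipped(80_000L, now, minAge, maxAge)); // older than the maximum age
        System.out.println(isSkipped(80_000L, now, minAge, null));   // no maximum age configured
    }
}
```

Keeping the maximum age as a nullable `Long`, as the diff does, lets "no upper bound" be expressed without a sentinel value.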



##
nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/ListS3.java:
##
@@ -451,11 +456,92 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             listByTrackingTimestamps(context, session);
         } else if (BY_ENTITIES.equals(listingStrategy)) {
             listByTrackingEntities(context, session);
+        } else if (NO_TRACKING.equals(listingStrategy)) {
+            listNoTracking(context, session);
         } else {
             throw new ProcessException("Unknown listing strategy: " + listingStrategy);
         }
     }
 
+    private void listNoTracking(ProcessContext context, ProcessSession session) {
+        final AmazonS3 client = getClient(context);
+
+        S3BucketLister bucketLister = getS3BucketLister(context, client);
+
+        final long startNanos = System.nanoTime();
+        final long minAgeMilliseconds = context.getProperty(MIN_AGE).asTimePeriod(TimeUnit.MILLISECONDS);
+        final Long maxAgeMilliseconds = context.getProperty(MAX_AGE) != null ? context.getProperty(MAX_AGE).asTimePeriod(TimeUnit.MILLISECONDS) : null;
+        final long listingTimestamp = System.currentTimeMillis();
+
+        final String bucket = context.getProperty(BUCKET_WITHOUT_DEFAULT_VALUE).evaluateAttributeExpressions().getValue();
+        final int batchSize = context.getProperty(BATCH_SIZE).asInteger();
+
+        int listCount = 0;
+        int totalListCount = 0;
+
+        getLogger().trace("Start listing, listingTimestamp={}", new Object[]{listingTimestamp});

Review Comment:
   The `Object[]` wrapper is not necessary for log arguments. Although not all components have been updated to reflect the recommended usage, new code should follow the pattern.
   ```suggestion
   getLogger().trace("Start listing, listingTimestamp={}", listingTimestamp);
   ```
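
To illustrate why the wrapper is redundant, here is a minimal sketch assuming the logger method takes a trailing `Object...` parameter, as the suggestion implies. The class `VarargsLogSketch` and the method `render` are hypothetical stand-ins written for this example and are not part of NiFi; they only show that Java passes the same argument array in both call styles.

```java
public final class VarargsLogSketch {

    // Hypothetical stand-in for a logging method with a (String, Object...) signature.
    // It substitutes each "{}" placeholder with the next argument, in order.
    static String render(String message, Object... arguments) {
        final StringBuilder rendered = new StringBuilder();
        int argumentIndex = 0;
        int start = 0;
        int placeholder;
        while ((placeholder = message.indexOf("{}", start)) >= 0 && argumentIndex < arguments.length) {
            rendered.append(message, start, placeholder).append(arguments[argumentIndex++]);
            start = placeholder + 2;
        }
        rendered.append(message.substring(start));
        return rendered.toString();
    }

    public static void main(String[] args) {
        final long listingTimestamp = 42L; // illustrative value

        // Explicit array: the caller builds the Object[] by hand.
        final String wrapped = render("Start listing, listingTimestamp={}", new Object[]{listingTimestamp});

        // Varargs: the compiler builds the same Object[] automatically.
        final String direct = render("Start listing, listingTimestamp={}", listingTimestamp);

        System.out.println(wrapped.equals(direct));
    }
}
```

Both calls reach the method with a one-element array containing the boxed timestamp, so dropping the wrapper changes readability only, not behavior.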



##

[PR] NIFI-12441 Added No Tracking listing strategy to ListS3 [nifi]

2023-11-30 Thread via GitHub


juldrixx opened a new pull request, #8088:
URL: https://github.com/apache/nifi/pull/8088

   # Summary
   
   [NIFI-12441](https://issues.apache.org/jira/browse/NIFI-12441)
   
   # Tracking
   
   Please complete the following tracking steps prior to pull request creation.
   
   ### Issue Tracking
   
   - [X] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue 
created
   
   ### Pull Request Tracking
   
   - [X] Pull Request title starts with Apache NiFi Jira issue number, such as 
`NIFI-0`
   - [X] Pull Request commit message starts with Apache NiFi Jira issue number, such as `NIFI-0`
   
   ### Pull Request Formatting
   
   - [X] Pull Request based on current revision of the `main` branch
   - [X] Pull Request refers to a feature branch with one commit containing 
changes
   
   # Verification
   
   Please indicate the verification steps performed prior to pull request 
creation.
   
   ### Build
   
   - [X] Build completed using `mvn clean install -P contrib-check`
     - [X] JDK 21
   
   ### Licensing
   
   - [X] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html)
   - [X] New dependencies are documented in applicable `LICENSE` and `NOTICE` files
   
   ### Documentation
   
   - [X] Documentation formatting appears as expected in rendered files
   

